From 51bd39860ff829475aef611a3234309e37e090d9 Mon Sep 17 00:00:00 2001 From: Venkat Yekkirala Date: Mon, 24 Jul 2006 23:26:30 -0700 Subject: [MLSXFRM]: Granular IPSec associations for use in MLS environments The current approach to labeling Security Associations for SELinux purposes uses a one-to-one mapping between xfrm policy rules and security associations. This doesn't address the needs of real world MLS (Multi-level System, traditional Bell-LaPadula) environments where a single xfrm policy rule (pertaining to a range, classified to secret for example) might need to map to multiple Security Associations (one each for classified, secret, top secret and all the compartments applicable to these security levels). This patch set addresses the above problem by allowing for the mapping of a single xfrm policy rule to multiple security associations, with each association used in the security context it is defined for. It also includes the security context to be used in IKE negotiation in the acquire messages sent to the IKE daemon so that a unique SA can be negotiated for each unique security context. A couple of bug fixes are also included; checks to make sure the SAs used by a packet match policy (security context-wise) on the inbound and also that the bundle used for the outbound matches the security context of the flow. This patch set also makes the use of the SELinux sid in flow cache lookups seemless by including the sid in the flow key itself. Also, open requests as well as connection-oriented child sockets are labeled automatically to be at the same level as the peer to allow for use of appropriately labeled IPSec associations. Description of changes: A "sid" member has been added to the flow cache key resulting in the sid being available at all needed locations and the flow cache lookups automatically using the sid. The flow sid is derived from the socket on the outbound and the SAs (unlabeled where an SA was not used) on the inbound. Outbound case: 1. Find policy for the socket. 2. OLD: Find an SA that matches the policy. NEW: Find an SA that matches BOTH the policy and the flow/socket. This is necessary since not every SA that matches the policy can be used for the flow/socket. Consider policy range Secret-TS, and SAs each for Secret and TS. We don't want a TS socket to use the Secret SA. Hence the additional check for the SA Vs. flow/socket. 3. NEW: When looking thru bundles for a policy, make sure the flow/socket can use the bundle. If a bundle is not found, create one, calling for IKE if necessary. If using IKE, include the security context in the acquire message to the IKE daemon. Inbound case: 1. OLD: Find policy for the socket. NEW: Find policy for the incoming packet based on the sid of the SA(s) it used or the unlabeled sid if no SAs were used. (Consider a case where a socket is "authorized" for two policies (unclassified-confidential, secret-top_secret). If the packet has come in using a secret SA, we really ought to be using the latter policy (secret-top_secret).) 2. OLD: BUG: No check to see if the SAs used by the packet agree with the policy sec_ctx-wise. (It was indicated in selinux_xfrm_sock_rcv_skb() that this was being accomplished by (x->id.spi == tmpl->id.spi || !tmpl->id.spi) in xfrm_state_ok, but it turns out tmpl->id.spi would normally be zero (unless xfrm policy rules specify one at the template level, which they usually don't). NEW: The socket is checked for access to the SAs used (based on the sid of the SAs) in selinux_xfrm_sock_rcv_skb(). Forward case: This would be Step 1 from the Inbound case, followed by Steps 2 and 3 from the Outbound case. Outstanding items/issues: - Timewait acknowledgements and such are generated in the current/upstream implementation using a NULL socket resulting in the any_socket sid (SYSTEM_HIGH) to be used. This problem is not addressed by this patch set. This patch: Add new flask definitions to SELinux Adds a new avperm "polmatch" to arbitrate flow/state access to a xfrm policy rule. Signed-off-by: Venkat Yekkirala Signed-off-by: David S. Miller diff --git a/security/selinux/include/av_perm_to_string.h b/security/selinux/include/av_perm_to_string.h index 7c9b583..09fc8a2 100644 --- a/security/selinux/include/av_perm_to_string.h +++ b/security/selinux/include/av_perm_to_string.h @@ -241,6 +241,7 @@ S_(SECCLASS_ASSOCIATION, ASSOCIATION__SENDTO, "sendto") S_(SECCLASS_ASSOCIATION, ASSOCIATION__RECVFROM, "recvfrom") S_(SECCLASS_ASSOCIATION, ASSOCIATION__SETCONTEXT, "setcontext") + S_(SECCLASS_ASSOCIATION, ASSOCIATION__POLMATCH, "polmatch") S_(SECCLASS_PACKET, PACKET__SEND, "send") S_(SECCLASS_PACKET, PACKET__RECV, "recv") S_(SECCLASS_PACKET, PACKET__RELABELTO, "relabelto") diff --git a/security/selinux/include/av_permissions.h b/security/selinux/include/av_permissions.h index 69fd4b4..81f4f52 100644 --- a/security/selinux/include/av_permissions.h +++ b/security/selinux/include/av_permissions.h @@ -911,6 +911,7 @@ #define ASSOCIATION__SENDTO 0x00000001UL #define ASSOCIATION__RECVFROM 0x00000002UL #define ASSOCIATION__SETCONTEXT 0x00000004UL +#define ASSOCIATION__POLMATCH 0x00000008UL #define NETLINK_KOBJECT_UEVENT_SOCKET__IOCTL 0x00000001UL #define NETLINK_KOBJECT_UEVENT_SOCKET__READ 0x00000002UL -- cgit v0.10.2 From 08554d6b33e60aa8ee40bbef94505941c0eefef2 Mon Sep 17 00:00:00 2001 From: Venkat Yekkirala Date: Mon, 24 Jul 2006 23:27:16 -0700 Subject: [MLSXFRM]: Define new SELinux service routine This defines a routine that combines the Type Enforcement portion of one sid with the MLS portion from the other sid to arrive at a new sid. This would be used to define a sid for a security association that is to be negotiated by IKE as well as for determing the sid for open requests and connection-oriented child sockets. Signed-off-by: Venkat Yekkirala Signed-off-by: David S. Miller diff --git a/security/selinux/include/security.h b/security/selinux/include/security.h index 063af47..911954a 100644 --- a/security/selinux/include/security.h +++ b/security/selinux/include/security.h @@ -78,6 +78,8 @@ int security_node_sid(u16 domain, void *addr, u32 addrlen, int security_validate_transition(u32 oldsid, u32 newsid, u32 tasksid, u16 tclass); +int security_sid_mls_copy(u32 sid, u32 mls_sid, u32 *new_sid); + #define SECURITY_FS_USE_XATTR 1 /* use xattr */ #define SECURITY_FS_USE_TRANS 2 /* use transition SIDs, e.g. devpts/tmpfs */ #define SECURITY_FS_USE_TASK 3 /* use task SIDs, e.g. pipefs/sockfs */ diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c index 7bc5b64..e15f7e0 100644 --- a/security/selinux/ss/mls.c +++ b/security/selinux/ss/mls.c @@ -212,26 +212,6 @@ int mls_context_isvalid(struct policydb *p, struct context *c) } /* - * Copies the MLS range from `src' into `dst'. - */ -static inline int mls_copy_context(struct context *dst, - struct context *src) -{ - int l, rc = 0; - - /* Copy the MLS range from the source context */ - for (l = 0; l < 2; l++) { - dst->range.level[l].sens = src->range.level[l].sens; - rc = ebitmap_cpy(&dst->range.level[l].cat, - &src->range.level[l].cat); - if (rc) - break; - } - - return rc; -} - -/* * Set the MLS fields in the security context structure * `context' based on the string representation in * the string `*scontext'. Update `*scontext' to diff --git a/security/selinux/ss/mls.h b/security/selinux/ss/mls.h index fbb42f0..90c5e88 100644 --- a/security/selinux/ss/mls.h +++ b/security/selinux/ss/mls.h @@ -17,6 +17,26 @@ #include "context.h" #include "policydb.h" +/* + * Copies the MLS range from `src' into `dst'. + */ +static inline int mls_copy_context(struct context *dst, + struct context *src) +{ + int l, rc = 0; + + /* Copy the MLS range from the source context */ + for (l = 0; l < 2; l++) { + dst->range.level[l].sens = src->range.level[l].sens; + rc = ebitmap_cpy(&dst->range.level[l].cat, + &src->range.level[l].cat); + if (rc) + break; + } + + return rc; +} + int mls_compute_context_len(struct context *context); void mls_sid_to_context(struct context *context, char **scontext); int mls_context_isvalid(struct policydb *p, struct context *c); diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c index 85e4298..b00ec69 100644 --- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -1817,6 +1817,75 @@ out: return rc; } +/* + * security_sid_mls_copy() - computes a new sid based on the given + * sid and the mls portion of mls_sid. + */ +int security_sid_mls_copy(u32 sid, u32 mls_sid, u32 *new_sid) +{ + struct context *context1; + struct context *context2; + struct context newcon; + char *s; + u32 len; + int rc = 0; + + if (!ss_initialized) { + *new_sid = sid; + goto out; + } + + context_init(&newcon); + + POLICY_RDLOCK; + context1 = sidtab_search(&sidtab, sid); + if (!context1) { + printk(KERN_ERR "security_sid_mls_copy: unrecognized SID " + "%d\n", sid); + rc = -EINVAL; + goto out_unlock; + } + + context2 = sidtab_search(&sidtab, mls_sid); + if (!context2) { + printk(KERN_ERR "security_sid_mls_copy: unrecognized SID " + "%d\n", mls_sid); + rc = -EINVAL; + goto out_unlock; + } + + newcon.user = context1->user; + newcon.role = context1->role; + newcon.type = context1->type; + rc = mls_copy_context(&newcon, context2); + if (rc) + goto out_unlock; + + + /* Check the validity of the new context. */ + if (!policydb_context_isvalid(&policydb, &newcon)) { + rc = convert_context_handle_invalid_context(&newcon); + if (rc) + goto bad; + } + + rc = sidtab_context_to_sid(&sidtab, &newcon, new_sid); + goto out_unlock; + +bad: + if (!context_struct_to_string(&newcon, &s, &len)) { + audit_log(current->audit_context, GFP_ATOMIC, AUDIT_SELINUX_ERR, + "security_sid_mls_copy: invalid context %s", s); + kfree(s); + } + +out_unlock: + POLICY_RDUNLOCK; + context_destroy(&newcon); +out: + return rc; +} + struct selinux_audit_rule { u32 au_seqno; struct context au_ctxt; -- cgit v0.10.2 From 892c141e62982272b9c738b5520ad0e5e1ad7b42 Mon Sep 17 00:00:00 2001 From: Venkat Yekkirala Date: Fri, 4 Aug 2006 23:08:56 -0700 Subject: [MLSXFRM]: Add security sid to sock This adds security for IP sockets at the sock level. Security at the sock level is needed to enforce the SELinux security policy for security associations even when a sock is orphaned (such as in the TCP LAST_ACK state). This will also be used to enforce SELinux controls over data arriving at or leaving a child socket while it's still waiting to be accepted. Signed-off-by: Venkat Yekkirala Signed-off-by: David S. Miller diff --git a/include/linux/security.h b/include/linux/security.h index 6bc2aad..4d7fb59 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -812,6 +812,8 @@ struct swap_info_struct; * which is used to copy security attributes between local stream sockets. * @sk_free_security: * Deallocate security structure. + * @sk_clone_security: + * Clone/copy security structure. * @sk_getsid: * Retrieve the LSM-specific sid for the sock to enable caching of network * authorizations. @@ -1332,6 +1334,7 @@ struct security_operations { int (*socket_getpeersec_dgram) (struct socket *sock, struct sk_buff *skb, u32 *secid); int (*sk_alloc_security) (struct sock *sk, int family, gfp_t priority); void (*sk_free_security) (struct sock *sk); + void (*sk_clone_security) (const struct sock *sk, struct sock *newsk); unsigned int (*sk_getsid) (struct sock *sk, struct flowi *fl, u8 dir); #endif /* CONFIG_SECURITY_NETWORK */ @@ -2885,6 +2888,11 @@ static inline void security_sk_free(struct sock *sk) return security_ops->sk_free_security(sk); } +static inline void security_sk_clone(const struct sock *sk, struct sock *newsk) +{ + return security_ops->sk_clone_security(sk, newsk); +} + static inline unsigned int security_sk_sid(struct sock *sk, struct flowi *fl, u8 dir) { return security_ops->sk_getsid(sk, fl, dir); @@ -3011,6 +3019,10 @@ static inline void security_sk_free(struct sock *sk) { } +static inline void security_sk_clone(const struct sock *sk, struct sock *newsk) +{ +} + static inline unsigned int security_sk_sid(struct sock *sk, struct flowi *fl, u8 dir) { return 0; diff --git a/include/net/sock.h b/include/net/sock.h index 324b3ea..91cdceb 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -972,6 +972,19 @@ static inline void sock_graft(struct sock *sk, struct socket *parent) write_unlock_bh(&sk->sk_callback_lock); } +static inline void sock_copy(struct sock *nsk, const struct sock *osk) +{ +#ifdef CONFIG_SECURITY_NETWORK + void *sptr = nsk->sk_security; +#endif + + memcpy(nsk, osk, osk->sk_prot->obj_size); +#ifdef CONFIG_SECURITY_NETWORK + nsk->sk_security = sptr; + security_sk_clone(osk, nsk); +#endif +} + extern int sock_i_uid(struct sock *sk); extern unsigned long sock_i_ino(struct sock *sk); diff --git a/net/core/sock.c b/net/core/sock.c index 51fcfbc..b67d868 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -911,7 +911,7 @@ struct sock *sk_clone(const struct sock *sk, const gfp_t priority) if (newsk != NULL) { struct sk_filter *filter; - memcpy(newsk, sk, sk->sk_prot->obj_size); + sock_copy(newsk, sk); /* SANITY */ sk_node_init(&newsk->sk_node); diff --git a/security/dummy.c b/security/dummy.c index 58c6d39..bd3bc5f 100644 --- a/security/dummy.c +++ b/security/dummy.c @@ -805,6 +805,10 @@ static inline void dummy_sk_free_security (struct sock *sk) { } +static inline void dummy_sk_clone_security (const struct sock *sk, struct sock *newsk) +{ +} + static unsigned int dummy_sk_getsid(struct sock *sk, struct flowi *fl, u8 dir) { return 0; @@ -1060,6 +1064,7 @@ void security_fixup_ops (struct security_operations *ops) set_to_dummy_if_null(ops, socket_getpeersec_dgram); set_to_dummy_if_null(ops, sk_alloc_security); set_to_dummy_if_null(ops, sk_free_security); + set_to_dummy_if_null(ops, sk_clone_security); set_to_dummy_if_null(ops, sk_getsid); #endif /* CONFIG_SECURITY_NETWORK */ #ifdef CONFIG_SECURITY_NETWORK_XFRM diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 5d1b8c7..d67abf7 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -269,15 +269,13 @@ static int sk_alloc_security(struct sock *sk, int family, gfp_t priority) { struct sk_security_struct *ssec; - if (family != PF_UNIX) - return 0; - ssec = kzalloc(sizeof(*ssec), priority); if (!ssec) return -ENOMEM; ssec->sk = sk; ssec->peer_sid = SECINITSID_UNLABELED; + ssec->sid = SECINITSID_UNLABELED; sk->sk_security = ssec; return 0; @@ -287,9 +285,6 @@ static void sk_free_security(struct sock *sk) { struct sk_security_struct *ssec = sk->sk_security; - if (sk->sk_family != PF_UNIX) - return; - sk->sk_security = NULL; kfree(ssec); } @@ -3068,6 +3063,7 @@ static void selinux_socket_post_create(struct socket *sock, int family, { struct inode_security_struct *isec; struct task_security_struct *tsec; + struct sk_security_struct *sksec; u32 newsid; isec = SOCK_INODE(sock)->i_security; @@ -3078,6 +3074,11 @@ static void selinux_socket_post_create(struct socket *sock, int family, isec->sid = kern ? SECINITSID_KERNEL : newsid; isec->initialized = 1; + if (sock->sk) { + sksec = sock->sk->sk_security; + sksec->sid = isec->sid; + } + return; } @@ -3551,22 +3552,24 @@ static void selinux_sk_free_security(struct sock *sk) sk_free_security(sk); } -static unsigned int selinux_sk_getsid_security(struct sock *sk, struct flowi *fl, u8 dir) +static void selinux_sk_clone_security(const struct sock *sk, struct sock *newsk) { - struct inode_security_struct *isec; - u32 sock_sid = SECINITSID_ANY_SOCKET; + struct sk_security_struct *ssec = sk->sk_security; + struct sk_security_struct *newssec = newsk->sk_security; + newssec->sid = ssec->sid; + newssec->peer_sid = ssec->peer_sid; +} + +static unsigned int selinux_sk_getsid_security(struct sock *sk, struct flowi *fl, u8 dir) +{ if (!sk) return selinux_no_sk_sid(fl); + else { + struct sk_security_struct *sksec = sk->sk_security; - read_lock_bh(&sk->sk_callback_lock); - isec = get_sock_isec(sk); - - if (isec) - sock_sid = isec->sid; - - read_unlock_bh(&sk->sk_callback_lock); - return sock_sid; + return sksec->sid; + } } static int selinux_nlmsg_perm(struct sock *sk, struct sk_buff *skb) @@ -4618,6 +4621,7 @@ static struct security_operations selinux_ops = { .socket_getpeersec_dgram = selinux_socket_getpeersec_dgram, .sk_alloc_security = selinux_sk_alloc_security, .sk_free_security = selinux_sk_free_security, + .sk_clone_security = selinux_sk_clone_security, .sk_getsid = selinux_sk_getsid_security, #ifdef CONFIG_SECURITY_NETWORK_XFRM diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h index 9401788..79b9e0a 100644 --- a/security/selinux/include/objsec.h +++ b/security/selinux/include/objsec.h @@ -99,6 +99,7 @@ struct netif_security_struct { struct sk_security_struct { struct sock *sk; /* back pointer to sk object */ + u32 sid; /* SID of this object */ u32 peer_sid; /* SID of peer */ }; -- cgit v0.10.2 From b6340fcd761acf9249b3acbc95c4dc555d9beb07 Mon Sep 17 00:00:00 2001 From: Venkat Yekkirala Date: Mon, 24 Jul 2006 23:28:37 -0700 Subject: [MLSXFRM]: Add security sid to flowi This adds security to flow key for labeling of flows as also to allow for making flow cache lookups based on the security label seemless. Signed-off-by: Venkat Yekkirala Signed-off-by: David S. Miller diff --git a/Documentation/networking/secid.txt b/Documentation/networking/secid.txt new file mode 100644 index 0000000..95ea067 --- /dev/null +++ b/Documentation/networking/secid.txt @@ -0,0 +1,14 @@ +flowi structure: + +The secid member in the flow structure is used in LSMs (e.g. SELinux) to indicate +the label of the flow. This label of the flow is currently used in selecting +matching labeled xfrm(s). + +If this is an outbound flow, the label is derived from the socket, if any, or +the incoming packet this flow is being generated as a response to (e.g. tcp +resets, timewait ack, etc.). It is also conceivable that the label could be +derived from other sources such as process context, device, etc., in special +cases, as may be appropriate. + +If this is an inbound flow, the label is derived from the IPSec security +associations, if any, used by the packet. diff --git a/include/net/flow.h b/include/net/flow.h index 04d89f7..1cee5a8 100644 --- a/include/net/flow.h +++ b/include/net/flow.h @@ -78,6 +78,7 @@ struct flowi { #define fl_icmp_type uli_u.icmpt.type #define fl_icmp_code uli_u.icmpt.code #define fl_ipsec_spi uli_u.spi + __u32 secid; /* used by xfrm; see secid.txt */ } __attribute__((__aligned__(BITS_PER_LONG/8))); #define FLOW_DIR_IN 0 -- cgit v0.10.2 From e0d1caa7b0d5f02e4f34aa09c695d04251310c6c Mon Sep 17 00:00:00 2001 From: Venkat Yekkirala Date: Mon, 24 Jul 2006 23:29:07 -0700 Subject: [MLSXFRM]: Flow based matching of xfrm policy and state This implements a seemless mechanism for xfrm policy selection and state matching based on the flow sid. This also includes the necessary SELinux enforcement pieces. Signed-off-by: Venkat Yekkirala Signed-off-by: David S. Miller diff --git a/include/linux/security.h b/include/linux/security.h index 4d7fb59..2c4921d 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -31,6 +31,7 @@ #include #include #include +#include struct ctl_table; @@ -825,9 +826,8 @@ struct swap_info_struct; * used by the XFRM system. * @sec_ctx contains the security context information being provided by * the user-level policy update program (e.g., setkey). - * Allocate a security structure to the xp->security field. - * The security field is initialized to NULL when the xfrm_policy is - * allocated. + * Allocate a security structure to the xp->security field; the security + * field is initialized to NULL when the xfrm_policy is allocated. * Return 0 if operation was successful (memory to allocate, legal context) * @xfrm_policy_clone_security: * @old contains an existing xfrm_policy in the SPD. @@ -846,9 +846,14 @@ struct swap_info_struct; * Database by the XFRM system. * @sec_ctx contains the security context information being provided by * the user-level SA generation program (e.g., setkey or racoon). - * Allocate a security structure to the x->security field. The - * security field is initialized to NULL when the xfrm_state is - * allocated. + * @polsec contains the security context information associated with a xfrm + * policy rule from which to take the base context. polsec must be NULL + * when sec_ctx is specified. + * @secid contains the secid from which to take the mls portion of the context. + * Allocate a security structure to the x->security field; the security + * field is initialized to NULL when the xfrm_state is allocated. Set the + * context to correspond to either sec_ctx or polsec, with the mls portion + * taken from secid in the latter case. * Return 0 if operation was successful (memory to allocate, legal context). * @xfrm_state_free_security: * @x contains the xfrm_state. @@ -859,13 +864,26 @@ struct swap_info_struct; * @xfrm_policy_lookup: * @xp contains the xfrm_policy for which the access control is being * checked. - * @sk_sid contains the sock security label that is used to authorize + * @fl_secid contains the flow security label that is used to authorize * access to the policy xp. * @dir contains the direction of the flow (input or output). - * Check permission when a sock selects a xfrm_policy for processing + * Check permission when a flow selects a xfrm_policy for processing * XFRMs on a packet. The hook is called when selecting either a * per-socket policy or a generic xfrm policy. * Return 0 if permission is granted. + * @xfrm_state_pol_flow_match: + * @x contains the state to match. + * @xp contains the policy to check for a match. + * @fl contains the flow to check for a match. + * Return 1 if there is a match. + * @xfrm_flow_state_match: + * @fl contains the flow key to match. + * @xfrm points to the xfrm_state to match. + * Return 1 if there is a match. + * @xfrm_decode_session: + * @skb points to skb to decode. + * @fl points to the flow key to set. + * Return 0 if successful decoding. * * Security hooks affecting all Key Management operations * @@ -1343,10 +1361,16 @@ struct security_operations { int (*xfrm_policy_clone_security) (struct xfrm_policy *old, struct xfrm_policy *new); void (*xfrm_policy_free_security) (struct xfrm_policy *xp); int (*xfrm_policy_delete_security) (struct xfrm_policy *xp); - int (*xfrm_state_alloc_security) (struct xfrm_state *x, struct xfrm_user_sec_ctx *sec_ctx); + int (*xfrm_state_alloc_security) (struct xfrm_state *x, + struct xfrm_user_sec_ctx *sec_ctx, struct xfrm_sec_ctx *polsec, + u32 secid); void (*xfrm_state_free_security) (struct xfrm_state *x); int (*xfrm_state_delete_security) (struct xfrm_state *x); - int (*xfrm_policy_lookup)(struct xfrm_policy *xp, u32 sk_sid, u8 dir); + int (*xfrm_policy_lookup)(struct xfrm_policy *xp, u32 fl_secid, u8 dir); + int (*xfrm_state_pol_flow_match)(struct xfrm_state *x, + struct xfrm_policy *xp, struct flowi *fl); + int (*xfrm_flow_state_match)(struct flowi *fl, struct xfrm_state *xfrm); + int (*xfrm_decode_session)(struct sk_buff *skb, struct flowi *fl); #endif /* CONFIG_SECURITY_NETWORK_XFRM */ /* key management security hooks */ @@ -3050,9 +3074,18 @@ static inline int security_xfrm_policy_delete(struct xfrm_policy *xp) return security_ops->xfrm_policy_delete_security(xp); } -static inline int security_xfrm_state_alloc(struct xfrm_state *x, struct xfrm_user_sec_ctx *sec_ctx) +static inline int security_xfrm_state_alloc(struct xfrm_state *x, + struct xfrm_user_sec_ctx *sec_ctx) +{ + return security_ops->xfrm_state_alloc_security(x, sec_ctx, NULL, 0); +} + +static inline int security_xfrm_state_alloc_acquire(struct xfrm_state *x, + struct xfrm_sec_ctx *polsec, u32 secid) { - return security_ops->xfrm_state_alloc_security(x, sec_ctx); + if (!polsec) + return 0; + return security_ops->xfrm_state_alloc_security(x, NULL, polsec, secid); } static inline int security_xfrm_state_delete(struct xfrm_state *x) @@ -3065,9 +3098,25 @@ static inline void security_xfrm_state_free(struct xfrm_state *x) security_ops->xfrm_state_free_security(x); } -static inline int security_xfrm_policy_lookup(struct xfrm_policy *xp, u32 sk_sid, u8 dir) +static inline int security_xfrm_policy_lookup(struct xfrm_policy *xp, u32 fl_secid, u8 dir) +{ + return security_ops->xfrm_policy_lookup(xp, fl_secid, dir); +} + +static inline int security_xfrm_state_pol_flow_match(struct xfrm_state *x, + struct xfrm_policy *xp, struct flowi *fl) +{ + return security_ops->xfrm_state_pol_flow_match(x, xp, fl); +} + +static inline int security_xfrm_flow_state_match(struct flowi *fl, struct xfrm_state *xfrm) +{ + return security_ops->xfrm_flow_state_match(fl, xfrm); +} + +static inline int security_xfrm_decode_session(struct sk_buff *skb, struct flowi *fl) { - return security_ops->xfrm_policy_lookup(xp, sk_sid, dir); + return security_ops->xfrm_decode_session(skb, fl); } #else /* CONFIG_SECURITY_NETWORK_XFRM */ static inline int security_xfrm_policy_alloc(struct xfrm_policy *xp, struct xfrm_user_sec_ctx *sec_ctx) @@ -3089,7 +3138,14 @@ static inline int security_xfrm_policy_delete(struct xfrm_policy *xp) return 0; } -static inline int security_xfrm_state_alloc(struct xfrm_state *x, struct xfrm_user_sec_ctx *sec_ctx) +static inline int security_xfrm_state_alloc(struct xfrm_state *x, + struct xfrm_user_sec_ctx *sec_ctx) +{ + return 0; +} + +static inline int security_xfrm_state_alloc_acquire(struct xfrm_state *x, + struct xfrm_sec_ctx *polsec, u32 secid) { return 0; } @@ -3103,10 +3159,28 @@ static inline int security_xfrm_state_delete(struct xfrm_state *x) return 0; } -static inline int security_xfrm_policy_lookup(struct xfrm_policy *xp, u32 sk_sid, u8 dir) +static inline int security_xfrm_policy_lookup(struct xfrm_policy *xp, u32 fl_secid, u8 dir) { return 0; } + +static inline int security_xfrm_state_pol_flow_match(struct xfrm_state *x, + struct xfrm_policy *xp, struct flowi *fl) +{ + return 1; +} + +static inline int security_xfrm_flow_state_match(struct flowi *fl, + struct xfrm_state *xfrm) +{ + return 1; +} + +static inline int security_xfrm_decode_session(struct sk_buff *skb, struct flowi *fl) +{ + return 0; +} + #endif /* CONFIG_SECURITY_NETWORK_XFRM */ #ifdef CONFIG_KEYS diff --git a/include/net/flow.h b/include/net/flow.h index 1cee5a8..21d988b 100644 --- a/include/net/flow.h +++ b/include/net/flow.h @@ -86,10 +86,10 @@ struct flowi { #define FLOW_DIR_FWD 2 struct sock; -typedef void (*flow_resolve_t)(struct flowi *key, u32 sk_sid, u16 family, u8 dir, +typedef void (*flow_resolve_t)(struct flowi *key, u16 family, u8 dir, void **objp, atomic_t **obj_refp); -extern void *flow_cache_lookup(struct flowi *key, u32 sk_sid, u16 family, u8 dir, +extern void *flow_cache_lookup(struct flowi *key, u16 family, u8 dir, flow_resolve_t resolver); extern void flow_cache_flush(void); extern atomic_t flow_cache_genid; diff --git a/net/core/flow.c b/net/core/flow.c index 2191af5..6452411 100644 --- a/net/core/flow.c +++ b/net/core/flow.c @@ -32,7 +32,6 @@ struct flow_cache_entry { u8 dir; struct flowi key; u32 genid; - u32 sk_sid; void *object; atomic_t *object_ref; }; @@ -165,7 +164,7 @@ static int flow_key_compare(struct flowi *key1, struct flowi *key2) return 0; } -void *flow_cache_lookup(struct flowi *key, u32 sk_sid, u16 family, u8 dir, +void *flow_cache_lookup(struct flowi *key, u16 family, u8 dir, flow_resolve_t resolver) { struct flow_cache_entry *fle, **head; @@ -189,7 +188,6 @@ void *flow_cache_lookup(struct flowi *key, u32 sk_sid, u16 family, u8 dir, for (fle = *head; fle; fle = fle->next) { if (fle->family == family && fle->dir == dir && - fle->sk_sid == sk_sid && flow_key_compare(key, &fle->key) == 0) { if (fle->genid == atomic_read(&flow_cache_genid)) { void *ret = fle->object; @@ -214,7 +212,6 @@ void *flow_cache_lookup(struct flowi *key, u32 sk_sid, u16 family, u8 dir, *head = fle; fle->family = family; fle->dir = dir; - fle->sk_sid = sk_sid; memcpy(&fle->key, key, sizeof(*key)); fle->object = NULL; flow_count(cpu)++; @@ -226,7 +223,7 @@ nocache: void *obj; atomic_t *obj_ref; - resolver(key, sk_sid, family, dir, &obj, &obj_ref); + resolver(key, family, dir, &obj, &obj_ref); if (fle) { fle->genid = atomic_read(&flow_cache_genid); diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 3da67ca..79405da 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -597,7 +597,7 @@ EXPORT_SYMBOL(xfrm_policy_walk); /* Find policy to apply to this flow. */ -static void xfrm_policy_lookup(struct flowi *fl, u32 sk_sid, u16 family, u8 dir, +static void xfrm_policy_lookup(struct flowi *fl, u16 family, u8 dir, void **objp, atomic_t **obj_refp) { struct xfrm_policy *pol; @@ -613,7 +613,7 @@ static void xfrm_policy_lookup(struct flowi *fl, u32 sk_sid, u16 family, u8 dir, match = xfrm_selector_match(sel, fl, family); if (match) { - if (!security_xfrm_policy_lookup(pol, sk_sid, dir)) { + if (!security_xfrm_policy_lookup(pol, fl->secid, dir)) { xfrm_pol_hold(pol); break; } @@ -641,7 +641,7 @@ static inline int policy_to_flow_dir(int dir) }; } -static struct xfrm_policy *xfrm_sk_policy_lookup(struct sock *sk, int dir, struct flowi *fl, u32 sk_sid) +static struct xfrm_policy *xfrm_sk_policy_lookup(struct sock *sk, int dir, struct flowi *fl) { struct xfrm_policy *pol; @@ -652,7 +652,7 @@ static struct xfrm_policy *xfrm_sk_policy_lookup(struct sock *sk, int dir, struc int err = 0; if (match) - err = security_xfrm_policy_lookup(pol, sk_sid, policy_to_flow_dir(dir)); + err = security_xfrm_policy_lookup(pol, fl->secid, policy_to_flow_dir(dir)); if (match && !err) xfrm_pol_hold(pol); @@ -862,19 +862,20 @@ int xfrm_lookup(struct dst_entry **dst_p, struct flowi *fl, u32 genid; u16 family; u8 dir = policy_to_flow_dir(XFRM_POLICY_OUT); - u32 sk_sid = security_sk_sid(sk, fl, dir); + + fl->secid = security_sk_sid(sk, fl, dir); restart: genid = atomic_read(&flow_cache_genid); policy = NULL; if (sk && sk->sk_policy[1]) - policy = xfrm_sk_policy_lookup(sk, XFRM_POLICY_OUT, fl, sk_sid); + policy = xfrm_sk_policy_lookup(sk, XFRM_POLICY_OUT, fl); if (!policy) { /* To accelerate a bit... */ if ((dst_orig->flags & DST_NOXFRM) || !xfrm_policy_list[XFRM_POLICY_OUT]) return 0; - policy = flow_cache_lookup(fl, sk_sid, dst_orig->ops->family, + policy = flow_cache_lookup(fl, dst_orig->ops->family, dir, xfrm_policy_lookup); } @@ -1032,13 +1033,15 @@ int xfrm_decode_session(struct sk_buff *skb, struct flowi *fl, unsigned short family) { struct xfrm_policy_afinfo *afinfo = xfrm_policy_get_afinfo(family); + int err; if (unlikely(afinfo == NULL)) return -EAFNOSUPPORT; afinfo->decode_session(skb, fl); + err = security_xfrm_decode_session(skb, fl); xfrm_policy_put_afinfo(afinfo); - return 0; + return err; } EXPORT_SYMBOL(xfrm_decode_session); @@ -1058,14 +1061,11 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, struct xfrm_policy *pol; struct flowi fl; u8 fl_dir = policy_to_flow_dir(dir); - u32 sk_sid; if (xfrm_decode_session(skb, &fl, family) < 0) return 0; nf_nat_decode_session(skb, &fl, family); - sk_sid = security_sk_sid(sk, &fl, fl_dir); - /* First, check used SA against their selectors. */ if (skb->sp) { int i; @@ -1079,10 +1079,10 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, pol = NULL; if (sk && sk->sk_policy[dir]) - pol = xfrm_sk_policy_lookup(sk, dir, &fl, sk_sid); + pol = xfrm_sk_policy_lookup(sk, dir, &fl); if (!pol) - pol = flow_cache_lookup(&fl, sk_sid, family, fl_dir, + pol = flow_cache_lookup(&fl, family, fl_dir, xfrm_policy_lookup); if (!pol) @@ -1298,6 +1298,8 @@ int xfrm_bundle_ok(struct xfrm_dst *first, struct flowi *fl, int family) if (fl && !xfrm_selector_match(&dst->xfrm->sel, fl, family)) return 0; + if (fl && !security_xfrm_flow_state_match(fl, dst->xfrm)) + return 0; if (dst->xfrm->km.state != XFRM_STATE_VALID) return 0; diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 0021aad..be02bd9 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -367,7 +367,7 @@ xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t *saddr, */ if (x->km.state == XFRM_STATE_VALID) { if (!xfrm_selector_match(&x->sel, fl, family) || - !xfrm_sec_ctx_match(pol->security, x->security)) + !security_xfrm_state_pol_flow_match(x, pol, fl)) continue; if (!best || best->km.dying > x->km.dying || @@ -379,7 +379,7 @@ xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t *saddr, } else if (x->km.state == XFRM_STATE_ERROR || x->km.state == XFRM_STATE_EXPIRED) { if (xfrm_selector_match(&x->sel, fl, family) && - xfrm_sec_ctx_match(pol->security, x->security)) + security_xfrm_state_pol_flow_match(x, pol, fl)) error = -ESRCH; } } @@ -403,6 +403,14 @@ xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t *saddr, * to current session. */ xfrm_init_tempsel(x, fl, tmpl, daddr, saddr, family); + error = security_xfrm_state_alloc_acquire(x, pol->security, fl->secid); + if (error) { + x->km.state = XFRM_STATE_DEAD; + xfrm_state_put(x); + x = NULL; + goto out; + } + if (km_query(x, tmpl, pol) == 0) { x->km.state = XFRM_STATE_ACQ; list_add_tail(&x->bydst, xfrm_state_bydst+h); diff --git a/security/dummy.c b/security/dummy.c index bd3bc5f..c1f1065 100644 --- a/security/dummy.c +++ b/security/dummy.c @@ -835,7 +835,8 @@ static int dummy_xfrm_policy_delete_security(struct xfrm_policy *xp) return 0; } -static int dummy_xfrm_state_alloc_security(struct xfrm_state *x, struct xfrm_user_sec_ctx *sec_ctx) +static int dummy_xfrm_state_alloc_security(struct xfrm_state *x, + struct xfrm_user_sec_ctx *sec_ctx, struct xfrm_sec_ctx *pol, u32 secid) { return 0; } @@ -853,6 +854,23 @@ static int dummy_xfrm_policy_lookup(struct xfrm_policy *xp, u32 sk_sid, u8 dir) { return 0; } + +static int dummy_xfrm_state_pol_flow_match(struct xfrm_state *x, + struct xfrm_policy *xp, struct flowi *fl) +{ + return 1; +} + +static int dummy_xfrm_flow_state_match(struct flowi *fl, struct xfrm_state *xfrm) +{ + return 1; +} + +static int dummy_xfrm_decode_session(struct sk_buff *skb, struct flowi *fl) +{ + return 0; +} + #endif /* CONFIG_SECURITY_NETWORK_XFRM */ static int dummy_register_security (const char *name, struct security_operations *ops) { @@ -1076,6 +1094,9 @@ void security_fixup_ops (struct security_operations *ops) set_to_dummy_if_null(ops, xfrm_state_free_security); set_to_dummy_if_null(ops, xfrm_state_delete_security); set_to_dummy_if_null(ops, xfrm_policy_lookup); + set_to_dummy_if_null(ops, xfrm_state_pol_flow_match); + set_to_dummy_if_null(ops, xfrm_flow_state_match); + set_to_dummy_if_null(ops, xfrm_decode_session); #endif /* CONFIG_SECURITY_NETWORK_XFRM */ #ifdef CONFIG_KEYS set_to_dummy_if_null(ops, key_alloc); diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index d67abf7..5c189da 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -3468,7 +3468,7 @@ static int selinux_socket_sock_rcv_skb(struct sock *sk, struct sk_buff *skb) if (err) goto out; - err = selinux_xfrm_sock_rcv_skb(sock_sid, skb); + err = selinux_xfrm_sock_rcv_skb(sock_sid, skb, &ad); out: return err; } @@ -3720,7 +3720,7 @@ static unsigned int selinux_ip_postroute_last(unsigned int hooknum, if (err) goto out; - err = selinux_xfrm_postroute_last(isec->sid, skb); + err = selinux_xfrm_postroute_last(isec->sid, skb, &ad); out: return err ? NF_DROP : NF_ACCEPT; } @@ -4633,6 +4633,9 @@ static struct security_operations selinux_ops = { .xfrm_state_free_security = selinux_xfrm_state_free, .xfrm_state_delete_security = selinux_xfrm_state_delete, .xfrm_policy_lookup = selinux_xfrm_policy_lookup, + .xfrm_state_pol_flow_match = selinux_xfrm_state_pol_flow_match, + .xfrm_flow_state_match = selinux_xfrm_flow_state_match, + .xfrm_decode_session = selinux_xfrm_decode_session, #endif #ifdef CONFIG_KEYS diff --git a/security/selinux/include/xfrm.h b/security/selinux/include/xfrm.h index c96498a..f51a3e8 100644 --- a/security/selinux/include/xfrm.h +++ b/security/selinux/include/xfrm.h @@ -2,6 +2,7 @@ * SELinux support for the XFRM LSM hooks * * Author : Trent Jaeger, + * Updated : Venkat Yekkirala, */ #ifndef _SELINUX_XFRM_H_ #define _SELINUX_XFRM_H_ @@ -10,10 +11,16 @@ int selinux_xfrm_policy_alloc(struct xfrm_policy *xp, struct xfrm_user_sec_ctx * int selinux_xfrm_policy_clone(struct xfrm_policy *old, struct xfrm_policy *new); void selinux_xfrm_policy_free(struct xfrm_policy *xp); int selinux_xfrm_policy_delete(struct xfrm_policy *xp); -int selinux_xfrm_state_alloc(struct xfrm_state *x, struct xfrm_user_sec_ctx *sec_ctx); +int selinux_xfrm_state_alloc(struct xfrm_state *x, + struct xfrm_user_sec_ctx *sec_ctx, struct xfrm_sec_ctx *pol, u32 secid); void selinux_xfrm_state_free(struct xfrm_state *x); int selinux_xfrm_state_delete(struct xfrm_state *x); -int selinux_xfrm_policy_lookup(struct xfrm_policy *xp, u32 sk_sid, u8 dir); +int selinux_xfrm_policy_lookup(struct xfrm_policy *xp, u32 fl_secid, u8 dir); +int selinux_xfrm_state_pol_flow_match(struct xfrm_state *x, + struct xfrm_policy *xp, struct flowi *fl); +int selinux_xfrm_flow_state_match(struct flowi *fl, struct xfrm_state *xfrm); +int selinux_xfrm_decode_session(struct sk_buff *skb, struct flowi *fl); + /* * Extract the security blob from the sock (it's actually on the socket) @@ -39,17 +46,21 @@ static inline u32 selinux_no_sk_sid(struct flowi *fl) } #ifdef CONFIG_SECURITY_NETWORK_XFRM -int selinux_xfrm_sock_rcv_skb(u32 sid, struct sk_buff *skb); -int selinux_xfrm_postroute_last(u32 isec_sid, struct sk_buff *skb); +int selinux_xfrm_sock_rcv_skb(u32 sid, struct sk_buff *skb, + struct avc_audit_data *ad); +int selinux_xfrm_postroute_last(u32 isec_sid, struct sk_buff *skb, + struct avc_audit_data *ad); u32 selinux_socket_getpeer_stream(struct sock *sk); u32 selinux_socket_getpeer_dgram(struct sk_buff *skb); #else -static inline int selinux_xfrm_sock_rcv_skb(u32 isec_sid, struct sk_buff *skb) +static inline int selinux_xfrm_sock_rcv_skb(u32 isec_sid, struct sk_buff *skb, + struct avc_audit_data *ad) { return 0; } -static inline int selinux_xfrm_postroute_last(u32 isec_sid, struct sk_buff *skb) +static inline int selinux_xfrm_postroute_last(u32 isec_sid, struct sk_buff *skb, + struct avc_audit_data *ad) { return 0; } diff --git a/security/selinux/xfrm.c b/security/selinux/xfrm.c index 6c985ce..a502b05 100644 --- a/security/selinux/xfrm.c +++ b/security/selinux/xfrm.c @@ -6,7 +6,12 @@ * Authors: Serge Hallyn * Trent Jaeger * + * Updated: Venkat Yekkirala + * + * Granular IPSec Associations for use in MLS environments. + * * Copyright (C) 2005 International Business Machines Corporation + * Copyright (C) 2006 Trusted Computer Solutions, Inc. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2, @@ -67,10 +72,10 @@ static inline int selinux_authorizable_xfrm(struct xfrm_state *x) } /* - * LSM hook implementation that authorizes that a socket can be used - * with the corresponding xfrm_sec_ctx and direction. + * LSM hook implementation that authorizes that a flow can use + * a xfrm policy rule. */ -int selinux_xfrm_policy_lookup(struct xfrm_policy *xp, u32 sk_sid, u8 dir) +int selinux_xfrm_policy_lookup(struct xfrm_policy *xp, u32 fl_secid, u8 dir) { int rc = 0; u32 sel_sid = SECINITSID_UNLABELED; @@ -84,27 +89,129 @@ int selinux_xfrm_policy_lookup(struct xfrm_policy *xp, u32 sk_sid, u8 dir) sel_sid = ctx->ctx_sid; } - rc = avc_has_perm(sk_sid, sel_sid, SECCLASS_ASSOCIATION, - ((dir == FLOW_DIR_IN) ? ASSOCIATION__RECVFROM : - ((dir == FLOW_DIR_OUT) ? ASSOCIATION__SENDTO : - (ASSOCIATION__SENDTO | ASSOCIATION__RECVFROM))), + rc = avc_has_perm(fl_secid, sel_sid, SECCLASS_ASSOCIATION, + ASSOCIATION__POLMATCH, NULL); return rc; } /* + * LSM hook implementation that authorizes that a state matches + * the given policy, flow combo. + */ + +int selinux_xfrm_state_pol_flow_match(struct xfrm_state *x, struct xfrm_policy *xp, + struct flowi *fl) +{ + u32 state_sid; + u32 pol_sid; + int err; + + if (x->security) + state_sid = x->security->ctx_sid; + else + state_sid = SECINITSID_UNLABELED; + + if (xp->security) + pol_sid = xp->security->ctx_sid; + else + pol_sid = SECINITSID_UNLABELED; + + err = avc_has_perm(state_sid, pol_sid, SECCLASS_ASSOCIATION, + ASSOCIATION__POLMATCH, + NULL); + + if (err) + return 0; + + return selinux_xfrm_flow_state_match(fl, x); +} + +/* + * LSM hook implementation that authorizes that a particular outgoing flow + * can use a given security association. + */ + +int selinux_xfrm_flow_state_match(struct flowi *fl, struct xfrm_state *xfrm) +{ + int rc = 0; + u32 sel_sid = SECINITSID_UNLABELED; + struct xfrm_sec_ctx *ctx; + + /* Context sid is either set to label or ANY_ASSOC */ + if ((ctx = xfrm->security)) { + if (!selinux_authorizable_ctx(ctx)) + return 0; + + sel_sid = ctx->ctx_sid; + } + + rc = avc_has_perm(fl->secid, sel_sid, SECCLASS_ASSOCIATION, + ASSOCIATION__SENDTO, + NULL)? 0:1; + + return rc; +} + +/* + * LSM hook implementation that determines the sid for the session. + */ + +int selinux_xfrm_decode_session(struct sk_buff *skb, struct flowi *fl) +{ + struct sec_path *sp; + + fl->secid = SECSID_NULL; + + if (skb == NULL) + return 0; + + sp = skb->sp; + if (sp) { + int i, sid_set = 0; + + for (i = sp->len-1; i >= 0; i--) { + struct xfrm_state *x = sp->xvec[i]; + if (selinux_authorizable_xfrm(x)) { + struct xfrm_sec_ctx *ctx = x->security; + + if (!sid_set) { + fl->secid = ctx->ctx_sid; + sid_set = 1; + } + else if (fl->secid != ctx->ctx_sid) + return -EINVAL; + } + } + } + + return 0; +} + +/* * Security blob allocation for xfrm_policy and xfrm_state * CTX does not have a meaningful value on input */ -static int selinux_xfrm_sec_ctx_alloc(struct xfrm_sec_ctx **ctxp, struct xfrm_user_sec_ctx *uctx) +static int selinux_xfrm_sec_ctx_alloc(struct xfrm_sec_ctx **ctxp, + struct xfrm_user_sec_ctx *uctx, struct xfrm_sec_ctx *pol, u32 sid) { int rc = 0; struct task_security_struct *tsec = current->security; - struct xfrm_sec_ctx *ctx; + struct xfrm_sec_ctx *ctx = NULL; + char *ctx_str = NULL; + u32 str_len; + u32 ctx_sid; + + BUG_ON(uctx && pol); + + if (pol) + goto from_policy; BUG_ON(!uctx); - BUG_ON(uctx->ctx_doi != XFRM_SC_ALG_SELINUX); + + if (uctx->ctx_doi != XFRM_SC_ALG_SELINUX) + return -EINVAL; if (uctx->ctx_len >= PAGE_SIZE) return -ENOMEM; @@ -141,9 +248,41 @@ static int selinux_xfrm_sec_ctx_alloc(struct xfrm_sec_ctx **ctxp, struct xfrm_us return rc; +from_policy: + BUG_ON(!pol); + rc = security_sid_mls_copy(pol->ctx_sid, sid, &ctx_sid); + if (rc) + goto out; + + rc = security_sid_to_context(ctx_sid, &ctx_str, &str_len); + if (rc) + goto out; + + *ctxp = ctx = kmalloc(sizeof(*ctx) + + str_len, + GFP_ATOMIC); + + if (!ctx) { + rc = -ENOMEM; + goto out; + } + + + ctx->ctx_doi = XFRM_SC_DOI_LSM; + ctx->ctx_alg = XFRM_SC_ALG_SELINUX; + ctx->ctx_sid = ctx_sid; + ctx->ctx_len = str_len; + memcpy(ctx->ctx_str, + ctx_str, + str_len); + + goto out2; + out: *ctxp = NULL; kfree(ctx); +out2: + kfree(ctx_str); return rc; } @@ -157,7 +296,7 @@ int selinux_xfrm_policy_alloc(struct xfrm_policy *xp, struct xfrm_user_sec_ctx * BUG_ON(!xp); - err = selinux_xfrm_sec_ctx_alloc(&xp->security, uctx); + err = selinux_xfrm_sec_ctx_alloc(&xp->security, uctx, NULL, 0); return err; } @@ -217,13 +356,14 @@ int selinux_xfrm_policy_delete(struct xfrm_policy *xp) * LSM hook implementation that allocs and transfers sec_ctx spec to * xfrm_state. */ -int selinux_xfrm_state_alloc(struct xfrm_state *x, struct xfrm_user_sec_ctx *uctx) +int selinux_xfrm_state_alloc(struct xfrm_state *x, struct xfrm_user_sec_ctx *uctx, + struct xfrm_sec_ctx *pol, u32 secid) { int err; BUG_ON(!x); - err = selinux_xfrm_sec_ctx_alloc(&x->security, uctx); + err = selinux_xfrm_sec_ctx_alloc(&x->security, uctx, pol, secid); return err; } @@ -329,38 +469,30 @@ int selinux_xfrm_state_delete(struct xfrm_state *x) * we need to check for unlabelled access since this may not have * gone thru the IPSec process. */ -int selinux_xfrm_sock_rcv_skb(u32 isec_sid, struct sk_buff *skb) +int selinux_xfrm_sock_rcv_skb(u32 isec_sid, struct sk_buff *skb, + struct avc_audit_data *ad) { int i, rc = 0; struct sec_path *sp; + u32 sel_sid = SECINITSID_UNLABELED; sp = skb->sp; if (sp) { - /* - * __xfrm_policy_check does not approve unless xfrm_policy_ok - * says that spi's match for policy and the socket. - * - * Only need to verify the existence of an authorizable sp. - */ for (i = 0; i < sp->len; i++) { struct xfrm_state *x = sp->xvec[i]; - if (x && selinux_authorizable_xfrm(x)) - goto accept; + if (x && selinux_authorizable_xfrm(x)) { + struct xfrm_sec_ctx *ctx = x->security; + sel_sid = ctx->ctx_sid; + break; + } } } - /* check SELinux sock for unlabelled access */ - rc = avc_has_perm(isec_sid, SECINITSID_UNLABELED, SECCLASS_ASSOCIATION, - ASSOCIATION__RECVFROM, NULL); - if (rc) - goto drop; - -accept: - return 0; + rc = avc_has_perm(isec_sid, sel_sid, SECCLASS_ASSOCIATION, + ASSOCIATION__RECVFROM, ad); -drop: return rc; } @@ -371,7 +503,8 @@ drop: * If we do have a authorizable security association, then it has already been * checked in xfrm_policy_lookup hook. */ -int selinux_xfrm_postroute_last(u32 isec_sid, struct sk_buff *skb) +int selinux_xfrm_postroute_last(u32 isec_sid, struct sk_buff *skb, + struct avc_audit_data *ad) { struct dst_entry *dst; int rc = 0; @@ -391,7 +524,7 @@ int selinux_xfrm_postroute_last(u32 isec_sid, struct sk_buff *skb) } rc = avc_has_perm(isec_sid, SECINITSID_UNLABELED, SECCLASS_ASSOCIATION, - ASSOCIATION__SENDTO, NULL); + ASSOCIATION__SENDTO, ad); out: return rc; } -- cgit v0.10.2 From 0d681623d30c6565e8b62889f3aa3f4d4662c3e8 Mon Sep 17 00:00:00 2001 From: Serge Hallyn Date: Mon, 24 Jul 2006 23:30:44 -0700 Subject: [MLSXFRM]: Add security context to acquire messages using netlink This includes the security context of a security association created for use by IKE in the acquire messages sent to IKE daemons using netlink/xfrm_user. This would allow the daemons to include the security context in the negotiation, so that the resultant association is unique to that security context. Signed-off-by: Serge Hallyn Signed-off-by: David S. Miller diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index fa79ddc..dac8db1 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -911,25 +911,38 @@ rtattr_failure: return -1; } -static int copy_to_user_sec_ctx(struct xfrm_policy *xp, struct sk_buff *skb) +static int copy_sec_ctx(struct xfrm_sec_ctx *s, struct sk_buff *skb) { - if (xp->security) { - int ctx_size = sizeof(struct xfrm_sec_ctx) + - xp->security->ctx_len; - struct rtattr *rt = __RTA_PUT(skb, XFRMA_SEC_CTX, ctx_size); - struct xfrm_user_sec_ctx *uctx = RTA_DATA(rt); + int ctx_size = sizeof(struct xfrm_sec_ctx) + s->ctx_len; + struct rtattr *rt = __RTA_PUT(skb, XFRMA_SEC_CTX, ctx_size); + struct xfrm_user_sec_ctx *uctx = RTA_DATA(rt); + + uctx->exttype = XFRMA_SEC_CTX; + uctx->len = ctx_size; + uctx->ctx_doi = s->ctx_doi; + uctx->ctx_alg = s->ctx_alg; + uctx->ctx_len = s->ctx_len; + memcpy(uctx + 1, s->ctx_str, s->ctx_len); + return 0; - uctx->exttype = XFRMA_SEC_CTX; - uctx->len = ctx_size; - uctx->ctx_doi = xp->security->ctx_doi; - uctx->ctx_alg = xp->security->ctx_alg; - uctx->ctx_len = xp->security->ctx_len; - memcpy(uctx + 1, xp->security->ctx_str, xp->security->ctx_len); + rtattr_failure: + return -1; +} + +static inline int copy_to_user_state_sec_ctx(struct xfrm_state *x, struct sk_buff *skb) +{ + if (x->security) { + return copy_sec_ctx(x->security, skb); } return 0; +} - rtattr_failure: - return -1; +static inline int copy_to_user_sec_ctx(struct xfrm_policy *xp, struct sk_buff *skb) +{ + if (xp->security) { + return copy_sec_ctx(xp->security, skb); + } + return 0; } static int dump_one_policy(struct xfrm_policy *xp, int dir, int count, void *ptr) @@ -1710,7 +1723,7 @@ static int build_acquire(struct sk_buff *skb, struct xfrm_state *x, if (copy_to_user_tmpl(xp, skb) < 0) goto nlmsg_failure; - if (copy_to_user_sec_ctx(xp, skb)) + if (copy_to_user_state_sec_ctx(x, skb)) goto nlmsg_failure; nlh->nlmsg_len = skb->tail - b; -- cgit v0.10.2 From 4e2ba18eae7f370c7c3ed96eaca747cc9b39f917 Mon Sep 17 00:00:00 2001 From: Venkat Yekkirala Date: Mon, 24 Jul 2006 23:31:14 -0700 Subject: [MLSXFRM]: Add security context to acquire messages using PF_KEY This includes the security context of a security association created for use by IKE in the acquire messages sent to IKE daemons using PF_KEY. This would allow the daemons to include the security context in the negotiation, so that the resultant association is unique to that security context. Signed-off-by: Venkat Yekkirala Signed-off-by: David S. Miller diff --git a/net/key/af_key.c b/net/key/af_key.c index 3a95b2e..a065e1a 100644 --- a/net/key/af_key.c +++ b/net/key/af_key.c @@ -2708,6 +2708,9 @@ static int pfkey_send_acquire(struct xfrm_state *x, struct xfrm_tmpl *t, struct #endif int sockaddr_size; int size; + struct sadb_x_sec_ctx *sec_ctx; + struct xfrm_sec_ctx *xfrm_ctx; + int ctx_size = 0; sockaddr_size = pfkey_sockaddr_size(x->props.family); if (!sockaddr_size) @@ -2723,6 +2726,11 @@ static int pfkey_send_acquire(struct xfrm_state *x, struct xfrm_tmpl *t, struct else if (x->id.proto == IPPROTO_ESP) size += count_esp_combs(t); + if ((xfrm_ctx = x->security)) { + ctx_size = PFKEY_ALIGN8(xfrm_ctx->ctx_len); + size += sizeof(struct sadb_x_sec_ctx) + ctx_size; + } + skb = alloc_skb(size + 16, GFP_ATOMIC); if (skb == NULL) return -ENOMEM; @@ -2818,6 +2826,20 @@ static int pfkey_send_acquire(struct xfrm_state *x, struct xfrm_tmpl *t, struct else if (x->id.proto == IPPROTO_ESP) dump_esp_combs(skb, t); + /* security context */ + if (xfrm_ctx) { + sec_ctx = (struct sadb_x_sec_ctx *) skb_put(skb, + sizeof(struct sadb_x_sec_ctx) + ctx_size); + sec_ctx->sadb_x_sec_len = + (sizeof(struct sadb_x_sec_ctx) + ctx_size) / sizeof(uint64_t); + sec_ctx->sadb_x_sec_exttype = SADB_X_EXT_SEC_CTX; + sec_ctx->sadb_x_ctx_doi = xfrm_ctx->ctx_doi; + sec_ctx->sadb_x_ctx_alg = xfrm_ctx->ctx_alg; + sec_ctx->sadb_x_ctx_len = xfrm_ctx->ctx_len; + memcpy(sec_ctx + 1, xfrm_ctx->ctx_str, + xfrm_ctx->ctx_len); + } + return pfkey_broadcast(skb, GFP_ATOMIC, BROADCAST_REGISTERED, NULL); } -- cgit v0.10.2 From beb8d13bed80f8388f1a9a107d07ddd342e627e8 Mon Sep 17 00:00:00 2001 From: Venkat Yekkirala Date: Fri, 4 Aug 2006 23:12:42 -0700 Subject: [MLSXFRM]: Add flow labeling This labels the flows that could utilize IPSec xfrms at the points the flows are defined so that IPSec policy and SAs at the right label can be used. The following protos are currently not handled, but they should continue to be able to use single-labeled IPSec like they currently do. ipmr ip_gre ipip igmp sit sctp ip6_tunnel (IPv6 over IPv6 tunnel device) decnet Signed-off-by: Venkat Yekkirala Signed-off-by: David S. Miller diff --git a/include/linux/security.h b/include/linux/security.h index 2c4921d..f3909d1 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -32,6 +32,7 @@ #include #include #include +#include struct ctl_table; @@ -815,8 +816,8 @@ struct swap_info_struct; * Deallocate security structure. * @sk_clone_security: * Clone/copy security structure. - * @sk_getsid: - * Retrieve the LSM-specific sid for the sock to enable caching of network + * @sk_getsecid: + * Retrieve the LSM-specific secid for the sock to enable caching of network * authorizations. * * Security hooks for XFRM operations. @@ -882,8 +883,9 @@ struct swap_info_struct; * Return 1 if there is a match. * @xfrm_decode_session: * @skb points to skb to decode. - * @fl points to the flow key to set. - * Return 0 if successful decoding. + * @secid points to the flow key secid to set. + * @ckall says if all xfrms used should be checked for same secid. + * Return 0 if ckall is zero or all xfrms used have the same secid. * * Security hooks affecting all Key Management operations * @@ -1353,7 +1355,7 @@ struct security_operations { int (*sk_alloc_security) (struct sock *sk, int family, gfp_t priority); void (*sk_free_security) (struct sock *sk); void (*sk_clone_security) (const struct sock *sk, struct sock *newsk); - unsigned int (*sk_getsid) (struct sock *sk, struct flowi *fl, u8 dir); + void (*sk_getsecid) (struct sock *sk, u32 *secid); #endif /* CONFIG_SECURITY_NETWORK */ #ifdef CONFIG_SECURITY_NETWORK_XFRM @@ -1370,7 +1372,7 @@ struct security_operations { int (*xfrm_state_pol_flow_match)(struct xfrm_state *x, struct xfrm_policy *xp, struct flowi *fl); int (*xfrm_flow_state_match)(struct flowi *fl, struct xfrm_state *xfrm); - int (*xfrm_decode_session)(struct sk_buff *skb, struct flowi *fl); + int (*xfrm_decode_session)(struct sk_buff *skb, u32 *secid, int ckall); #endif /* CONFIG_SECURITY_NETWORK_XFRM */ /* key management security hooks */ @@ -2917,9 +2919,9 @@ static inline void security_sk_clone(const struct sock *sk, struct sock *newsk) return security_ops->sk_clone_security(sk, newsk); } -static inline unsigned int security_sk_sid(struct sock *sk, struct flowi *fl, u8 dir) +static inline void security_sk_classify_flow(struct sock *sk, struct flowi *fl) { - return security_ops->sk_getsid(sk, fl, dir); + security_ops->sk_getsecid(sk, &fl->secid); } #else /* CONFIG_SECURITY_NETWORK */ static inline int security_unix_stream_connect(struct socket * sock, @@ -3047,9 +3049,8 @@ static inline void security_sk_clone(const struct sock *sk, struct sock *newsk) { } -static inline unsigned int security_sk_sid(struct sock *sk, struct flowi *fl, u8 dir) +static inline void security_sk_classify_flow(struct sock *sk, struct flowi *fl) { - return 0; } #endif /* CONFIG_SECURITY_NETWORK */ @@ -3114,9 +3115,16 @@ static inline int security_xfrm_flow_state_match(struct flowi *fl, struct xfrm_s return security_ops->xfrm_flow_state_match(fl, xfrm); } -static inline int security_xfrm_decode_session(struct sk_buff *skb, struct flowi *fl) +static inline int security_xfrm_decode_session(struct sk_buff *skb, u32 *secid) +{ + return security_ops->xfrm_decode_session(skb, secid, 1); +} + +static inline void security_skb_classify_flow(struct sk_buff *skb, struct flowi *fl) { - return security_ops->xfrm_decode_session(skb, fl); + int rc = security_ops->xfrm_decode_session(skb, &fl->secid, 0); + + BUG_ON(rc); } #else /* CONFIG_SECURITY_NETWORK_XFRM */ static inline int security_xfrm_policy_alloc(struct xfrm_policy *xp, struct xfrm_user_sec_ctx *sec_ctx) @@ -3176,11 +3184,15 @@ static inline int security_xfrm_flow_state_match(struct flowi *fl, return 1; } -static inline int security_xfrm_decode_session(struct sk_buff *skb, struct flowi *fl) +static inline int security_xfrm_decode_session(struct sk_buff *skb, u32 *secid) { return 0; } +static inline void security_skb_classify_flow(struct sk_buff *skb, struct flowi *fl) +{ +} + #endif /* CONFIG_SECURITY_NETWORK_XFRM */ #ifdef CONFIG_KEYS diff --git a/include/net/route.h b/include/net/route.h index c4a0686..7f93ac0 100644 --- a/include/net/route.h +++ b/include/net/route.h @@ -32,6 +32,7 @@ #include #include #include +#include #ifndef __KERNEL__ #warning This file is not supposed to be used outside of kernel. @@ -166,6 +167,7 @@ static inline int ip_route_connect(struct rtable **rp, u32 dst, ip_rt_put(*rp); *rp = NULL; } + security_sk_classify_flow(sk, &fl); return ip_route_output_flow(rp, &fl, sk, 0); } @@ -182,6 +184,7 @@ static inline int ip_route_newports(struct rtable **rp, u8 protocol, fl.proto = protocol; ip_rt_put(*rp); *rp = NULL; + security_sk_classify_flow(sk, &fl); return ip_route_output_flow(rp, &fl, sk, 0); } return 0; diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c index 7f56f7e..3864980 100644 --- a/net/dccp/ipv4.c +++ b/net/dccp/ipv4.c @@ -678,6 +678,7 @@ static struct dst_entry* dccp_v4_route_skb(struct sock *sk, } }; + security_skb_classify_flow(skb, &fl); if (ip_route_output_flow(&rt, &fl, sk, 0)) { IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES); return NULL; diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c index 610c722..53d255c 100644 --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -201,6 +201,7 @@ static int dccp_v6_connect(struct sock *sk, struct sockaddr *uaddr, fl.oif = sk->sk_bound_dev_if; fl.fl_ip_dport = usin->sin6_port; fl.fl_ip_sport = inet->sport; + security_sk_classify_flow(sk, &fl); if (np->opt != NULL && np->opt->srcrt != NULL) { const struct rt0_hdr *rt0 = (struct rt0_hdr *)np->opt->srcrt; @@ -322,6 +323,7 @@ static void dccp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt, fl.oif = sk->sk_bound_dev_if; fl.fl_ip_dport = inet->dport; fl.fl_ip_sport = inet->sport; + security_sk_classify_flow(sk, &fl); err = ip6_dst_lookup(sk, &dst, &fl); if (err) { @@ -422,6 +424,7 @@ static int dccp_v6_send_response(struct sock *sk, struct request_sock *req, fl.oif = ireq6->iif; fl.fl_ip_dport = inet_rsk(req)->rmt_port; fl.fl_ip_sport = inet_sk(sk)->sport; + security_sk_classify_flow(sk, &fl); if (dst == NULL) { opt = np->opt; @@ -566,6 +569,7 @@ static void dccp_v6_ctl_send_reset(struct sk_buff *rxskb) fl.oif = inet6_iif(rxskb); fl.fl_ip_dport = dh->dccph_dport; fl.fl_ip_sport = dh->dccph_sport; + security_skb_classify_flow(rxskb, &fl); /* sk = NULL, but it is safe for now. RST socket required. */ if (!ip6_dst_lookup(NULL, &skb->dst, &fl)) { @@ -622,6 +626,7 @@ static void dccp_v6_reqsk_send_ack(struct sk_buff *rxskb, fl.oif = inet6_iif(rxskb); fl.fl_ip_dport = dh->dccph_dport; fl.fl_ip_sport = dh->dccph_sport; + security_skb_classify_flow(rxskb, &fl); if (!ip6_dst_lookup(NULL, &skb->dst, &fl)) { if (xfrm_lookup(&skb->dst, &fl, NULL, 0) >= 0) { @@ -842,6 +847,7 @@ static struct sock *dccp_v6_request_recv_sock(struct sock *sk, fl.oif = sk->sk_bound_dev_if; fl.fl_ip_dport = inet_rsk(req)->rmt_port; fl.fl_ip_sport = inet_sk(sk)->sport; + security_sk_classify_flow(sk, &fl); if (ip6_dst_lookup(sk, &dst, &fl)) goto out; diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index c84a320..fc40da3 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -1074,6 +1074,7 @@ int inet_sk_rebuild_header(struct sock *sk) }, }; + security_sk_classify_flow(sk, &fl); err = ip_route_output_flow(&rt, &fl, sk, 0); } if (!err) diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index 4c86ac3..6ad797c 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -406,6 +406,7 @@ static void icmp_reply(struct icmp_bxm *icmp_param, struct sk_buff *skb) .saddr = rt->rt_spec_dst, .tos = RT_TOS(skb->nh.iph->tos) } }, .proto = IPPROTO_ICMP }; + security_skb_classify_flow(skb, &fl); if (ip_route_output_key(&rt, &fl)) goto out_unlock; } @@ -560,6 +561,7 @@ void icmp_send(struct sk_buff *skb_in, int type, int code, u32 info) } } }; + security_skb_classify_flow(skb_in, &fl); if (ip_route_output_key(&rt, &fl)) goto out_unlock; } diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index e50a1bf..772b4ea 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -327,6 +327,7 @@ struct dst_entry* inet_csk_route_req(struct sock *sk, { .sport = inet_sk(sk)->sport, .dport = ireq->rmt_port } } }; + security_sk_classify_flow(sk, &fl); if (ip_route_output_flow(&rt, &fl, sk, 0)) { IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES); return NULL; diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index a2ede16..308bdea 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -328,6 +328,7 @@ int ip_queue_xmit(struct sk_buff *skb, int ipfragok) * keep trying until route appears or the connection times * itself out. */ + security_sk_classify_flow(sk, &fl); if (ip_route_output_flow(&rt, &fl, sk, 0)) goto no_route; } @@ -1366,6 +1367,7 @@ void ip_send_reply(struct sock *sk, struct sk_buff *skb, struct ip_reply_arg *ar { .sport = skb->h.th->dest, .dport = skb->h.th->source } }, .proto = sk->sk_protocol }; + security_skb_classify_flow(skb, &fl); if (ip_route_output_key(&rt, &fl)) return; } diff --git a/net/ipv4/netfilter/ipt_REJECT.c b/net/ipv4/netfilter/ipt_REJECT.c index 269bc20..7f905bf 100644 --- a/net/ipv4/netfilter/ipt_REJECT.c +++ b/net/ipv4/netfilter/ipt_REJECT.c @@ -90,6 +90,7 @@ static inline struct rtable *route_reverse(struct sk_buff *skb, fl.proto = IPPROTO_TCP; fl.fl_ip_sport = tcph->dest; fl.fl_ip_dport = tcph->source; + security_skb_classify_flow(skb, &fl); xfrm_lookup((struct dst_entry **)&rt, &fl, NULL, 0); diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c index 62b2762..fe44cb5 100644 --- a/net/ipv4/raw.c +++ b/net/ipv4/raw.c @@ -484,6 +484,7 @@ static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, if (!inet->hdrincl) raw_probe_proto_opt(&fl, msg); + security_sk_classify_flow(sk, &fl); err = ip_route_output_flow(&rt, &fl, sk, !(msg->msg_flags&MSG_DONTWAIT)); } if (err) diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index e20be33..307dc3c 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -259,6 +259,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb, .uli_u = { .ports = { .sport = skb->h.th->dest, .dport = skb->h.th->source } } }; + security_sk_classify_flow(sk, &fl); if (ip_route_output_key(&rt, &fl)) { reqsk_free(req); goto out; diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index f136cec..a4d005e 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -603,6 +603,7 @@ int udp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, .uli_u = { .ports = { .sport = inet->sport, .dport = dport } } }; + security_sk_classify_flow(sk, &fl); err = ip_route_output_flow(&rt, &fl, sk, !(msg->msg_flags&MSG_DONTWAIT)); if (err) goto out; diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index ac85e9c..82a1b1a 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -637,6 +637,7 @@ int inet6_sk_rebuild_header(struct sock *sk) fl.oif = sk->sk_bound_dev_if; fl.fl_ip_dport = inet->dport; fl.fl_ip_sport = inet->sport; + security_sk_classify_flow(sk, &fl); if (np->opt && np->opt->srcrt) { struct rt0_hdr *rt0 = (struct rt0_hdr *) np->opt->srcrt; diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c index 3b55b4c..c73508e 100644 --- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -156,6 +156,8 @@ ipv4_connected: if (!fl.oif && (addr_type&IPV6_ADDR_MULTICAST)) fl.oif = np->mcast_oif; + security_sk_classify_flow(sk, &fl); + if (flowlabel) { if (flowlabel->opt && flowlabel->opt->srcrt) { struct rt0_hdr *rt0 = (struct rt0_hdr *) flowlabel->opt->srcrt; diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c index 356a8a7..dbfce08 100644 --- a/net/ipv6/icmp.c +++ b/net/ipv6/icmp.c @@ -358,6 +358,7 @@ void icmpv6_send(struct sk_buff *skb, int type, int code, __u32 info, fl.oif = iif; fl.fl_icmp_type = type; fl.fl_icmp_code = code; + security_skb_classify_flow(skb, &fl); if (icmpv6_xmit_lock()) return; @@ -472,6 +473,7 @@ static void icmpv6_echo_reply(struct sk_buff *skb) ipv6_addr_copy(&fl.fl6_src, saddr); fl.oif = skb->dev->ifindex; fl.fl_icmp_type = ICMPV6_ECHO_REPLY; + security_skb_classify_flow(skb, &fl); if (icmpv6_xmit_lock()) return; diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c index bf49107..7a51a25 100644 --- a/net/ipv6/inet6_connection_sock.c +++ b/net/ipv6/inet6_connection_sock.c @@ -157,6 +157,7 @@ int inet6_csk_xmit(struct sk_buff *skb, int ipfragok) fl.oif = sk->sk_bound_dev_if; fl.fl_ip_sport = inet->sport; fl.fl_ip_dport = inet->dport; + security_sk_classify_flow(sk, &fl); if (np->opt && np->opt->srcrt) { struct rt0_hdr *rt0 = (struct rt0_hdr *)np->opt->srcrt; diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c index b50055b..67cfc38 100644 --- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -419,6 +419,7 @@ static inline void ndisc_flow_init(struct flowi *fl, u8 type, fl->proto = IPPROTO_ICMPV6; fl->fl_icmp_type = type; fl->fl_icmp_code = 0; + security_sk_classify_flow(ndisc_socket->sk, fl); } static void ndisc_send_na(struct net_device *dev, struct neighbour *neigh, diff --git a/net/ipv6/netfilter/ip6t_REJECT.c b/net/ipv6/netfilter/ip6t_REJECT.c index 8629ba1..c4eba1a 100644 --- a/net/ipv6/netfilter/ip6t_REJECT.c +++ b/net/ipv6/netfilter/ip6t_REJECT.c @@ -96,6 +96,7 @@ static void send_reset(struct sk_buff *oldskb) ipv6_addr_copy(&fl.fl6_dst, &oip6h->saddr); fl.fl_ip_sport = otcph.dest; fl.fl_ip_dport = otcph.source; + security_skb_classify_flow(oldskb, &fl); dst = ip6_route_output(NULL, &fl); if (dst == NULL) return; diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c index 15b862d..d5040e1 100644 --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -759,6 +759,7 @@ static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk, if (!fl.oif && ipv6_addr_is_multicast(&fl.fl6_dst)) fl.oif = np->mcast_oif; + security_sk_classify_flow(sk, &fl); err = ip6_dst_lookup(sk, &dst, &fl); if (err) diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 802a1a6..46922e5 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -251,6 +251,8 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr, final_p = &final; } + security_sk_classify_flow(sk, &fl); + err = ip6_dst_lookup(sk, &dst, &fl); if (err) goto failure; @@ -374,6 +376,7 @@ static void tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt, fl.oif = sk->sk_bound_dev_if; fl.fl_ip_dport = inet->dport; fl.fl_ip_sport = inet->sport; + security_skb_classify_flow(skb, &fl); if ((err = ip6_dst_lookup(sk, &dst, &fl))) { sk->sk_err_soft = -err; @@ -467,6 +470,7 @@ static int tcp_v6_send_synack(struct sock *sk, struct request_sock *req, fl.oif = treq->iif; fl.fl_ip_dport = inet_rsk(req)->rmt_port; fl.fl_ip_sport = inet_sk(sk)->sport; + security_sk_classify_flow(sk, &fl); if (dst == NULL) { opt = np->opt; @@ -625,6 +629,7 @@ static void tcp_v6_send_reset(struct sk_buff *skb) fl.oif = inet6_iif(skb); fl.fl_ip_dport = t1->dest; fl.fl_ip_sport = t1->source; + security_skb_classify_flow(skb, &fl); /* sk = NULL, but it is safe for now. RST socket required. */ if (!ip6_dst_lookup(NULL, &buff->dst, &fl)) { @@ -691,6 +696,7 @@ static void tcp_v6_send_ack(struct sk_buff *skb, u32 seq, u32 ack, u32 win, u32 fl.oif = inet6_iif(skb); fl.fl_ip_dport = t1->dest; fl.fl_ip_sport = t1->source; + security_skb_classify_flow(skb, &fl); if (!ip6_dst_lookup(NULL, &buff->dst, &fl)) { if (xfrm_lookup(&buff->dst, &fl, NULL, 0) >= 0) { @@ -923,6 +929,7 @@ static struct sock * tcp_v6_syn_recv_sock(struct sock *sk, struct sk_buff *skb, fl.oif = sk->sk_bound_dev_if; fl.fl_ip_dport = inet_rsk(req)->rmt_port; fl.fl_ip_sport = inet_sk(sk)->sport; + security_sk_classify_flow(sk, &fl); if (ip6_dst_lookup(sk, &dst, &fl)) goto out; diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 3d54f24..82c7c9c 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -782,6 +782,8 @@ do_udp_sendmsg: connected = 0; } + security_sk_classify_flow(sk, fl); + err = ip6_sk_dst_lookup(sk, &dst, fl); if (err) goto out; diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 79405da..32c963c 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -863,7 +863,6 @@ int xfrm_lookup(struct dst_entry **dst_p, struct flowi *fl, u16 family; u8 dir = policy_to_flow_dir(XFRM_POLICY_OUT); - fl->secid = security_sk_sid(sk, fl, dir); restart: genid = atomic_read(&flow_cache_genid); policy = NULL; @@ -1039,7 +1038,7 @@ xfrm_decode_session(struct sk_buff *skb, struct flowi *fl, unsigned short family return -EAFNOSUPPORT; afinfo->decode_session(skb, fl); - err = security_xfrm_decode_session(skb, fl); + err = security_xfrm_decode_session(skb, &fl->secid); xfrm_policy_put_afinfo(afinfo); return err; } diff --git a/security/dummy.c b/security/dummy.c index c1f1065..c0ff6b9 100644 --- a/security/dummy.c +++ b/security/dummy.c @@ -809,9 +809,8 @@ static inline void dummy_sk_clone_security (const struct sock *sk, struct sock * { } -static unsigned int dummy_sk_getsid(struct sock *sk, struct flowi *fl, u8 dir) +static inline void dummy_sk_getsecid(struct sock *sk, u32 *secid) { - return 0; } #endif /* CONFIG_SECURITY_NETWORK */ @@ -866,7 +865,7 @@ static int dummy_xfrm_flow_state_match(struct flowi *fl, struct xfrm_state *xfrm return 1; } -static int dummy_xfrm_decode_session(struct sk_buff *skb, struct flowi *fl) +static int dummy_xfrm_decode_session(struct sk_buff *skb, u32 *fl, int ckall) { return 0; } @@ -1083,7 +1082,7 @@ void security_fixup_ops (struct security_operations *ops) set_to_dummy_if_null(ops, sk_alloc_security); set_to_dummy_if_null(ops, sk_free_security); set_to_dummy_if_null(ops, sk_clone_security); - set_to_dummy_if_null(ops, sk_getsid); + set_to_dummy_if_null(ops, sk_getsecid); #endif /* CONFIG_SECURITY_NETWORK */ #ifdef CONFIG_SECURITY_NETWORK_XFRM set_to_dummy_if_null(ops, xfrm_policy_alloc_security); diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 5c189da..4e5989d 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -3561,14 +3561,14 @@ static void selinux_sk_clone_security(const struct sock *sk, struct sock *newsk) newssec->peer_sid = ssec->peer_sid; } -static unsigned int selinux_sk_getsid_security(struct sock *sk, struct flowi *fl, u8 dir) +static void selinux_sk_getsecid(struct sock *sk, u32 *secid) { if (!sk) - return selinux_no_sk_sid(fl); + *secid = SECINITSID_ANY_SOCKET; else { struct sk_security_struct *sksec = sk->sk_security; - return sksec->sid; + *secid = sksec->sid; } } @@ -4622,7 +4622,7 @@ static struct security_operations selinux_ops = { .sk_alloc_security = selinux_sk_alloc_security, .sk_free_security = selinux_sk_free_security, .sk_clone_security = selinux_sk_clone_security, - .sk_getsid = selinux_sk_getsid_security, + .sk_getsecid = selinux_sk_getsecid, #ifdef CONFIG_SECURITY_NETWORK_XFRM .xfrm_policy_alloc_security = selinux_xfrm_policy_alloc, diff --git a/security/selinux/include/xfrm.h b/security/selinux/include/xfrm.h index f51a3e8..8e45c1d 100644 --- a/security/selinux/include/xfrm.h +++ b/security/selinux/include/xfrm.h @@ -19,7 +19,7 @@ int selinux_xfrm_policy_lookup(struct xfrm_policy *xp, u32 fl_secid, u8 dir); int selinux_xfrm_state_pol_flow_match(struct xfrm_state *x, struct xfrm_policy *xp, struct flowi *fl); int selinux_xfrm_flow_state_match(struct flowi *fl, struct xfrm_state *xfrm); -int selinux_xfrm_decode_session(struct sk_buff *skb, struct flowi *fl); +int selinux_xfrm_decode_session(struct sk_buff *skb, u32 *fl, int ckall); /* @@ -33,18 +33,6 @@ static inline struct inode_security_struct *get_sock_isec(struct sock *sk) return SOCK_INODE(sk->sk_socket)->i_security; } - -static inline u32 selinux_no_sk_sid(struct flowi *fl) -{ - /* NOTE: no sock occurs on ICMP reply, forwards, ... */ - /* icmp_reply: authorize as kernel packet */ - if (fl && fl->proto == IPPROTO_ICMP) { - return SECINITSID_KERNEL; - } - - return SECINITSID_ANY_SOCKET; -} - #ifdef CONFIG_SECURITY_NETWORK_XFRM int selinux_xfrm_sock_rcv_skb(u32 sid, struct sk_buff *skb, struct avc_audit_data *ad); diff --git a/security/selinux/xfrm.c b/security/selinux/xfrm.c index a502b05..c750ef7 100644 --- a/security/selinux/xfrm.c +++ b/security/selinux/xfrm.c @@ -158,11 +158,11 @@ int selinux_xfrm_flow_state_match(struct flowi *fl, struct xfrm_state *xfrm) * LSM hook implementation that determines the sid for the session. */ -int selinux_xfrm_decode_session(struct sk_buff *skb, struct flowi *fl) +int selinux_xfrm_decode_session(struct sk_buff *skb, u32 *sid, int ckall) { struct sec_path *sp; - fl->secid = SECSID_NULL; + *sid = SECSID_NULL; if (skb == NULL) return 0; @@ -177,10 +177,13 @@ int selinux_xfrm_decode_session(struct sk_buff *skb, struct flowi *fl) struct xfrm_sec_ctx *ctx = x->security; if (!sid_set) { - fl->secid = ctx->ctx_sid; + *sid = ctx->ctx_sid; sid_set = 1; + + if (!ckall) + break; } - else if (fl->secid != ctx->ctx_sid) + else if (*sid != ctx->ctx_sid) return -EINVAL; } } -- cgit v0.10.2 From cb969f072b6d67770b559617f14e767f47e77ece Mon Sep 17 00:00:00 2001 From: Venkat Yekkirala Date: Mon, 24 Jul 2006 23:32:20 -0700 Subject: [MLSXFRM]: Default labeling of socket specific IPSec policies This defaults the label of socket-specific IPSec policies to be the same as the socket they are set on. Signed-off-by: Venkat Yekkirala Signed-off-by: David S. Miller diff --git a/include/linux/security.h b/include/linux/security.h index f3909d1..8e3dc6c 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -827,8 +827,10 @@ struct swap_info_struct; * used by the XFRM system. * @sec_ctx contains the security context information being provided by * the user-level policy update program (e.g., setkey). + * @sk refers to the sock from which to derive the security context. * Allocate a security structure to the xp->security field; the security - * field is initialized to NULL when the xfrm_policy is allocated. + * field is initialized to NULL when the xfrm_policy is allocated. Only + * one of sec_ctx or sock can be specified. * Return 0 if operation was successful (memory to allocate, legal context) * @xfrm_policy_clone_security: * @old contains an existing xfrm_policy in the SPD. @@ -1359,7 +1361,8 @@ struct security_operations { #endif /* CONFIG_SECURITY_NETWORK */ #ifdef CONFIG_SECURITY_NETWORK_XFRM - int (*xfrm_policy_alloc_security) (struct xfrm_policy *xp, struct xfrm_user_sec_ctx *sec_ctx); + int (*xfrm_policy_alloc_security) (struct xfrm_policy *xp, + struct xfrm_user_sec_ctx *sec_ctx, struct sock *sk); int (*xfrm_policy_clone_security) (struct xfrm_policy *old, struct xfrm_policy *new); void (*xfrm_policy_free_security) (struct xfrm_policy *xp); int (*xfrm_policy_delete_security) (struct xfrm_policy *xp); @@ -3057,7 +3060,12 @@ static inline void security_sk_classify_flow(struct sock *sk, struct flowi *fl) #ifdef CONFIG_SECURITY_NETWORK_XFRM static inline int security_xfrm_policy_alloc(struct xfrm_policy *xp, struct xfrm_user_sec_ctx *sec_ctx) { - return security_ops->xfrm_policy_alloc_security(xp, sec_ctx); + return security_ops->xfrm_policy_alloc_security(xp, sec_ctx, NULL); +} + +static inline int security_xfrm_sock_policy_alloc(struct xfrm_policy *xp, struct sock *sk) +{ + return security_ops->xfrm_policy_alloc_security(xp, NULL, sk); } static inline int security_xfrm_policy_clone(struct xfrm_policy *old, struct xfrm_policy *new) @@ -3132,6 +3140,11 @@ static inline int security_xfrm_policy_alloc(struct xfrm_policy *xp, struct xfrm return 0; } +static inline int security_xfrm_sock_policy_alloc(struct xfrm_policy *xp, struct sock *sk) +{ + return 0; +} + static inline int security_xfrm_policy_clone(struct xfrm_policy *old, struct xfrm_policy *new) { return 0; diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 3ecd9fa..00bf86e 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -362,7 +362,7 @@ struct xfrm_mgr char *id; int (*notify)(struct xfrm_state *x, struct km_event *c); int (*acquire)(struct xfrm_state *x, struct xfrm_tmpl *, struct xfrm_policy *xp, int dir); - struct xfrm_policy *(*compile_policy)(u16 family, int opt, u8 *data, int len, int *dir); + struct xfrm_policy *(*compile_policy)(struct sock *sk, int opt, u8 *data, int len, int *dir); int (*new_mapping)(struct xfrm_state *x, xfrm_address_t *ipaddr, u16 sport); int (*notify_policy)(struct xfrm_policy *x, int dir, struct km_event *c); }; diff --git a/net/key/af_key.c b/net/key/af_key.c index a065e1a..797c744 100644 --- a/net/key/af_key.c +++ b/net/key/af_key.c @@ -2843,14 +2843,14 @@ static int pfkey_send_acquire(struct xfrm_state *x, struct xfrm_tmpl *t, struct return pfkey_broadcast(skb, GFP_ATOMIC, BROADCAST_REGISTERED, NULL); } -static struct xfrm_policy *pfkey_compile_policy(u16 family, int opt, +static struct xfrm_policy *pfkey_compile_policy(struct sock *sk, int opt, u8 *data, int len, int *dir) { struct xfrm_policy *xp; struct sadb_x_policy *pol = (struct sadb_x_policy*)data; struct sadb_x_sec_ctx *sec_ctx; - switch (family) { + switch (sk->sk_family) { case AF_INET: if (opt != IP_IPSEC_POLICY) { *dir = -EOPNOTSUPP; @@ -2891,7 +2891,7 @@ static struct xfrm_policy *pfkey_compile_policy(u16 family, int opt, xp->lft.hard_byte_limit = XFRM_INF; xp->lft.soft_packet_limit = XFRM_INF; xp->lft.hard_packet_limit = XFRM_INF; - xp->family = family; + xp->family = sk->sk_family; xp->xfrm_nr = 0; if (pol->sadb_x_policy_type == IPSEC_POLICY_IPSEC && @@ -2907,8 +2907,10 @@ static struct xfrm_policy *pfkey_compile_policy(u16 family, int opt, p += pol->sadb_x_policy_len*8; sec_ctx = (struct sadb_x_sec_ctx *)p; if (len < pol->sadb_x_policy_len*8 + - sec_ctx->sadb_x_sec_len) + sec_ctx->sadb_x_sec_len) { + *dir = -EINVAL; goto out; + } if ((*dir = verify_sec_ctx_len(p))) goto out; uctx = pfkey_sadb2xfrm_user_sec_ctx(sec_ctx); @@ -2918,6 +2920,11 @@ static struct xfrm_policy *pfkey_compile_policy(u16 family, int opt, if (*dir) goto out; } + else { + *dir = security_xfrm_sock_policy_alloc(xp, sk); + if (*dir) + goto out; + } *dir = pol->sadb_x_policy_dir-1; return xp; diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index be02bd9..1c79608 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -1026,7 +1026,7 @@ int xfrm_user_policy(struct sock *sk, int optname, u8 __user *optval, int optlen err = -EINVAL; read_lock(&xfrm_km_lock); list_for_each_entry(km, &xfrm_km_list, list) { - pol = km->compile_policy(sk->sk_family, optname, data, + pol = km->compile_policy(sk, optname, data, optlen, &err); if (err >= 0) break; diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index dac8db1..f70e158 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -1757,7 +1757,7 @@ static int xfrm_send_acquire(struct xfrm_state *x, struct xfrm_tmpl *xt, /* User gives us xfrm_user_policy_info followed by an array of 0 * or more templates. */ -static struct xfrm_policy *xfrm_compile_policy(u16 family, int opt, +static struct xfrm_policy *xfrm_compile_policy(struct sock *sk, int opt, u8 *data, int len, int *dir) { struct xfrm_userpolicy_info *p = (struct xfrm_userpolicy_info *)data; @@ -1765,7 +1765,7 @@ static struct xfrm_policy *xfrm_compile_policy(u16 family, int opt, struct xfrm_policy *xp; int nr; - switch (family) { + switch (sk->sk_family) { case AF_INET: if (opt != IP_XFRM_POLICY) { *dir = -EOPNOTSUPP; @@ -1807,6 +1807,15 @@ static struct xfrm_policy *xfrm_compile_policy(u16 family, int opt, copy_from_user_policy(xp, p); copy_templates(xp, ut, nr); + if (!xp->security) { + int err = security_xfrm_sock_policy_alloc(xp, sk); + if (err) { + kfree(xp); + *dir = err; + return NULL; + } + } + *dir = p->dir; return xp; diff --git a/security/dummy.c b/security/dummy.c index c0ff6b9..66cc064 100644 --- a/security/dummy.c +++ b/security/dummy.c @@ -815,7 +815,8 @@ static inline void dummy_sk_getsecid(struct sock *sk, u32 *secid) #endif /* CONFIG_SECURITY_NETWORK */ #ifdef CONFIG_SECURITY_NETWORK_XFRM -static int dummy_xfrm_policy_alloc_security(struct xfrm_policy *xp, struct xfrm_user_sec_ctx *sec_ctx) +static int dummy_xfrm_policy_alloc_security(struct xfrm_policy *xp, + struct xfrm_user_sec_ctx *sec_ctx, struct sock *sk) { return 0; } diff --git a/security/selinux/include/xfrm.h b/security/selinux/include/xfrm.h index 8e45c1d..1822c73 100644 --- a/security/selinux/include/xfrm.h +++ b/security/selinux/include/xfrm.h @@ -7,7 +7,8 @@ #ifndef _SELINUX_XFRM_H_ #define _SELINUX_XFRM_H_ -int selinux_xfrm_policy_alloc(struct xfrm_policy *xp, struct xfrm_user_sec_ctx *sec_ctx); +int selinux_xfrm_policy_alloc(struct xfrm_policy *xp, + struct xfrm_user_sec_ctx *sec_ctx, struct sock *sk); int selinux_xfrm_policy_clone(struct xfrm_policy *old, struct xfrm_policy *new); void selinux_xfrm_policy_free(struct xfrm_policy *xp); int selinux_xfrm_policy_delete(struct xfrm_policy *xp); diff --git a/security/selinux/xfrm.c b/security/selinux/xfrm.c index c750ef7..d3690f9 100644 --- a/security/selinux/xfrm.c +++ b/security/selinux/xfrm.c @@ -208,10 +208,8 @@ static int selinux_xfrm_sec_ctx_alloc(struct xfrm_sec_ctx **ctxp, BUG_ON(uctx && pol); - if (pol) - goto from_policy; - - BUG_ON(!uctx); + if (!uctx) + goto not_from_user; if (uctx->ctx_doi != XFRM_SC_ALG_SELINUX) return -EINVAL; @@ -251,11 +249,14 @@ static int selinux_xfrm_sec_ctx_alloc(struct xfrm_sec_ctx **ctxp, return rc; -from_policy: - BUG_ON(!pol); - rc = security_sid_mls_copy(pol->ctx_sid, sid, &ctx_sid); - if (rc) - goto out; +not_from_user: + if (pol) { + rc = security_sid_mls_copy(pol->ctx_sid, sid, &ctx_sid); + if (rc) + goto out; + } + else + ctx_sid = sid; rc = security_sid_to_context(ctx_sid, &ctx_str, &str_len); if (rc) @@ -293,13 +294,23 @@ out2: * LSM hook implementation that allocs and transfers uctx spec to * xfrm_policy. */ -int selinux_xfrm_policy_alloc(struct xfrm_policy *xp, struct xfrm_user_sec_ctx *uctx) +int selinux_xfrm_policy_alloc(struct xfrm_policy *xp, + struct xfrm_user_sec_ctx *uctx, struct sock *sk) { int err; + u32 sid; BUG_ON(!xp); + BUG_ON(uctx && sk); + + if (sk) { + struct sk_security_struct *ssec = sk->sk_security; + sid = ssec->sid; + } + else + sid = SECSID_NULL; - err = selinux_xfrm_sec_ctx_alloc(&xp->security, uctx, NULL, 0); + err = selinux_xfrm_sec_ctx_alloc(&xp->security, uctx, NULL, sid); return err; } -- cgit v0.10.2 From 4237c75c0a35535d7f9f2bfeeb4b4df1e068a0bf Mon Sep 17 00:00:00 2001 From: Venkat Yekkirala Date: Mon, 24 Jul 2006 23:32:50 -0700 Subject: [MLSXFRM]: Auto-labeling of child sockets This automatically labels the TCP, Unix stream, and dccp child sockets as well as openreqs to be at the same MLS level as the peer. This will result in the selection of appropriately labeled IPSec Security Associations. This also uses the sock's sid (as opposed to the isec sid) in SELinux enforcement of secmark in rcv_skb and postroute_last hooks. Signed-off-by: Venkat Yekkirala Signed-off-by: David S. Miller diff --git a/include/linux/security.h b/include/linux/security.h index 8e3dc6c..bb4c80f 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -90,6 +90,7 @@ extern int cap_netlink_recv(struct sk_buff *skb, int cap); struct nfsctl_arg; struct sched_param; struct swap_info_struct; +struct request_sock; /* bprm_apply_creds unsafe reasons */ #define LSM_UNSAFE_SHARE 1 @@ -819,6 +820,14 @@ struct swap_info_struct; * @sk_getsecid: * Retrieve the LSM-specific secid for the sock to enable caching of network * authorizations. + * @sock_graft: + * Sets the socket's isec sid to the sock's sid. + * @inet_conn_request: + * Sets the openreq's sid to socket's sid with MLS portion taken from peer sid. + * @inet_csk_clone: + * Sets the new child socket's sid to the openreq sid. + * @req_classify_flow: + * Sets the flow's sid to the openreq sid. * * Security hooks for XFRM operations. * @@ -1358,6 +1367,11 @@ struct security_operations { void (*sk_free_security) (struct sock *sk); void (*sk_clone_security) (const struct sock *sk, struct sock *newsk); void (*sk_getsecid) (struct sock *sk, u32 *secid); + void (*sock_graft)(struct sock* sk, struct socket *parent); + int (*inet_conn_request)(struct sock *sk, struct sk_buff *skb, + struct request_sock *req); + void (*inet_csk_clone)(struct sock *newsk, const struct request_sock *req); + void (*req_classify_flow)(const struct request_sock *req, struct flowi *fl); #endif /* CONFIG_SECURITY_NETWORK */ #ifdef CONFIG_SECURITY_NETWORK_XFRM @@ -2926,6 +2940,28 @@ static inline void security_sk_classify_flow(struct sock *sk, struct flowi *fl) { security_ops->sk_getsecid(sk, &fl->secid); } + +static inline void security_req_classify_flow(const struct request_sock *req, struct flowi *fl) +{ + security_ops->req_classify_flow(req, fl); +} + +static inline void security_sock_graft(struct sock* sk, struct socket *parent) +{ + security_ops->sock_graft(sk, parent); +} + +static inline int security_inet_conn_request(struct sock *sk, + struct sk_buff *skb, struct request_sock *req) +{ + return security_ops->inet_conn_request(sk, skb, req); +} + +static inline void security_inet_csk_clone(struct sock *newsk, + const struct request_sock *req) +{ + security_ops->inet_csk_clone(newsk, req); +} #else /* CONFIG_SECURITY_NETWORK */ static inline int security_unix_stream_connect(struct socket * sock, struct socket * other, @@ -3055,6 +3091,25 @@ static inline void security_sk_clone(const struct sock *sk, struct sock *newsk) static inline void security_sk_classify_flow(struct sock *sk, struct flowi *fl) { } + +static inline void security_req_classify_flow(const struct request_sock *req, struct flowi *fl) +{ +} + +static inline void security_sock_graft(struct sock* sk, struct socket *parent) +{ +} + +static inline int security_inet_conn_request(struct sock *sk, + struct sk_buff *skb, struct request_sock *req) +{ + return 0; +} + +static inline void security_inet_csk_clone(struct sock *newsk, + const struct request_sock *req) +{ +} #endif /* CONFIG_SECURITY_NETWORK */ #ifdef CONFIG_SECURITY_NETWORK_XFRM diff --git a/include/net/request_sock.h b/include/net/request_sock.h index c5d7f92..8e165ca 100644 --- a/include/net/request_sock.h +++ b/include/net/request_sock.h @@ -53,6 +53,7 @@ struct request_sock { unsigned long expires; struct request_sock_ops *rsk_ops; struct sock *sk; + u32 secid; }; static inline struct request_sock *reqsk_alloc(struct request_sock_ops *ops) diff --git a/include/net/sock.h b/include/net/sock.h index 91cdceb..337ebec 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -969,6 +969,7 @@ static inline void sock_graft(struct sock *sk, struct socket *parent) sk->sk_sleep = &parent->wait; parent->sk = sk; sk->sk_socket = parent; + security_sock_graft(sk, parent); write_unlock_bh(&sk->sk_callback_lock); } diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c index 3864980..171d363 100644 --- a/net/dccp/ipv4.c +++ b/net/dccp/ipv4.c @@ -501,6 +501,9 @@ int dccp_v4_conn_request(struct sock *sk, struct sk_buff *skb) dccp_openreq_init(req, &dp, skb); + if (security_inet_conn_request(sk, skb, req)) + goto drop_and_free; + ireq = inet_rsk(req); ireq->loc_addr = daddr; ireq->rmt_addr = saddr; diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c index 53d255c..231bc7c 100644 --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -424,7 +424,7 @@ static int dccp_v6_send_response(struct sock *sk, struct request_sock *req, fl.oif = ireq6->iif; fl.fl_ip_dport = inet_rsk(req)->rmt_port; fl.fl_ip_sport = inet_sk(sk)->sport; - security_sk_classify_flow(sk, &fl); + security_req_classify_flow(req, &fl); if (dst == NULL) { opt = np->opt; @@ -626,7 +626,7 @@ static void dccp_v6_reqsk_send_ack(struct sk_buff *rxskb, fl.oif = inet6_iif(rxskb); fl.fl_ip_dport = dh->dccph_dport; fl.fl_ip_sport = dh->dccph_sport; - security_skb_classify_flow(rxskb, &fl); + security_req_classify_flow(req, &fl); if (!ip6_dst_lookup(NULL, &skb->dst, &fl)) { if (xfrm_lookup(&skb->dst, &fl, NULL, 0) >= 0) { @@ -709,6 +709,9 @@ static int dccp_v6_conn_request(struct sock *sk, struct sk_buff *skb) dccp_openreq_init(req, &dp, skb); + if (security_inet_conn_request(sk, skb, req)) + goto drop_and_free; + ireq6 = inet6_rsk(req); ireq = inet_rsk(req); ipv6_addr_copy(&ireq6->rmt_addr, &skb->nh.ipv6h->saddr); diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c index 772b4ea..0720439 100644 --- a/net/ipv4/inet_connection_sock.c +++ b/net/ipv4/inet_connection_sock.c @@ -327,7 +327,7 @@ struct dst_entry* inet_csk_route_req(struct sock *sk, { .sport = inet_sk(sk)->sport, .dport = ireq->rmt_port } } }; - security_sk_classify_flow(sk, &fl); + security_req_classify_flow(req, &fl); if (ip_route_output_flow(&rt, &fl, sk, 0)) { IP_INC_STATS_BH(IPSTATS_MIB_OUTNOROUTES); return NULL; @@ -510,6 +510,8 @@ struct sock *inet_csk_clone(struct sock *sk, const struct request_sock *req, /* Deinitialize accept_queue to trap illegal accesses. */ memset(&newicsk->icsk_accept_queue, 0, sizeof(newicsk->icsk_accept_queue)); + + security_inet_csk_clone(newsk, req); } return newsk; } diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index 307dc3c..661e0a4 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -214,6 +214,10 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb, if (!req) goto out; + if (security_inet_conn_request(sk, skb, req)) { + reqsk_free(req); + goto out; + } ireq = inet_rsk(req); treq = tcp_rsk(req); treq->rcv_isn = htonl(skb->h.th->seq) - 1; @@ -259,7 +263,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb, .uli_u = { .ports = { .sport = skb->h.th->dest, .dport = skb->h.th->source } } }; - security_sk_classify_flow(sk, &fl); + security_req_classify_flow(req, &fl); if (ip_route_output_key(&rt, &fl)) { reqsk_free(req); goto out; diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 4b04c3e..43f6740 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -798,6 +798,9 @@ int tcp_v4_conn_request(struct sock *sk, struct sk_buff *skb) tcp_openreq_init(req, &tmp_opt, skb); + if (security_inet_conn_request(sk, skb, req)) + goto drop_and_free; + ireq = inet_rsk(req); ireq->loc_addr = daddr; ireq->rmt_addr = saddr; diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 46922e5..302786a 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -470,7 +470,7 @@ static int tcp_v6_send_synack(struct sock *sk, struct request_sock *req, fl.oif = treq->iif; fl.fl_ip_dport = inet_rsk(req)->rmt_port; fl.fl_ip_sport = inet_sk(sk)->sport; - security_sk_classify_flow(sk, &fl); + security_req_classify_flow(req, &fl); if (dst == NULL) { opt = np->opt; @@ -826,6 +826,8 @@ static int tcp_v6_conn_request(struct sock *sk, struct sk_buff *skb) tcp_rsk(req)->snt_isn = isn; + security_inet_conn_request(sk, skb, req); + if (tcp_v6_send_synack(sk, req, NULL)) goto drop; @@ -929,7 +931,7 @@ static struct sock * tcp_v6_syn_recv_sock(struct sock *sk, struct sk_buff *skb, fl.oif = sk->sk_bound_dev_if; fl.fl_ip_dport = inet_rsk(req)->rmt_port; fl.fl_ip_sport = inet_sk(sk)->sport; - security_sk_classify_flow(sk, &fl); + security_req_classify_flow(req, &fl); if (ip6_dst_lookup(sk, &dst, &fl)) goto out; diff --git a/security/dummy.c b/security/dummy.c index 66cc064..1c45f8e 100644 --- a/security/dummy.c +++ b/security/dummy.c @@ -812,6 +812,26 @@ static inline void dummy_sk_clone_security (const struct sock *sk, struct sock * static inline void dummy_sk_getsecid(struct sock *sk, u32 *secid) { } + +static inline void dummy_sock_graft(struct sock* sk, struct socket *parent) +{ +} + +static inline int dummy_inet_conn_request(struct sock *sk, + struct sk_buff *skb, struct request_sock *req) +{ + return 0; +} + +static inline void dummy_inet_csk_clone(struct sock *newsk, + const struct request_sock *req) +{ +} + +static inline void dummy_req_classify_flow(const struct request_sock *req, + struct flowi *fl) +{ +} #endif /* CONFIG_SECURITY_NETWORK */ #ifdef CONFIG_SECURITY_NETWORK_XFRM @@ -1084,6 +1104,10 @@ void security_fixup_ops (struct security_operations *ops) set_to_dummy_if_null(ops, sk_free_security); set_to_dummy_if_null(ops, sk_clone_security); set_to_dummy_if_null(ops, sk_getsecid); + set_to_dummy_if_null(ops, sock_graft); + set_to_dummy_if_null(ops, inet_conn_request); + set_to_dummy_if_null(ops, inet_csk_clone); + set_to_dummy_if_null(ops, req_classify_flow); #endif /* CONFIG_SECURITY_NETWORK */ #ifdef CONFIG_SECURITY_NETWORK_XFRM set_to_dummy_if_null(ops, xfrm_policy_alloc_security); diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 4e5989d..1dc935f7 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -3328,8 +3328,9 @@ static int selinux_socket_unix_stream_connect(struct socket *sock, /* server child socket */ ssec = newsk->sk_security; ssec->peer_sid = isec->sid; - - return 0; + err = security_sid_mls_copy(other_isec->sid, ssec->peer_sid, &ssec->sid); + + return err; } static int selinux_socket_unix_may_send(struct socket *sock, @@ -3355,11 +3356,29 @@ static int selinux_socket_unix_may_send(struct socket *sock, } static int selinux_sock_rcv_skb_compat(struct sock *sk, struct sk_buff *skb, - struct avc_audit_data *ad, u32 sock_sid, u16 sock_class, - u16 family, char *addrp, int len) + struct avc_audit_data *ad, u16 family, char *addrp, int len) { int err = 0; u32 netif_perm, node_perm, node_sid, if_sid, recv_perm = 0; + struct socket *sock; + u16 sock_class = 0; + u32 sock_sid = 0; + + read_lock_bh(&sk->sk_callback_lock); + sock = sk->sk_socket; + if (sock) { + struct inode *inode; + inode = SOCK_INODE(sock); + if (inode) { + struct inode_security_struct *isec; + isec = inode->i_security; + sock_sid = isec->sid; + sock_class = isec->sclass; + } + } + read_unlock_bh(&sk->sk_callback_lock); + if (!sock_sid) + goto out; if (!skb->dev) goto out; @@ -3419,12 +3438,10 @@ out: static int selinux_socket_sock_rcv_skb(struct sock *sk, struct sk_buff *skb) { u16 family; - u16 sock_class = 0; char *addrp; int len, err = 0; - u32 sock_sid = 0; - struct socket *sock; struct avc_audit_data ad; + struct sk_security_struct *sksec = sk->sk_security; family = sk->sk_family; if (family != PF_INET && family != PF_INET6) @@ -3434,22 +3451,6 @@ static int selinux_socket_sock_rcv_skb(struct sock *sk, struct sk_buff *skb) if (family == PF_INET6 && skb->protocol == ntohs(ETH_P_IP)) family = PF_INET; - read_lock_bh(&sk->sk_callback_lock); - sock = sk->sk_socket; - if (sock) { - struct inode *inode; - inode = SOCK_INODE(sock); - if (inode) { - struct inode_security_struct *isec; - isec = inode->i_security; - sock_sid = isec->sid; - sock_class = isec->sclass; - } - } - read_unlock_bh(&sk->sk_callback_lock); - if (!sock_sid) - goto out; - AVC_AUDIT_DATA_INIT(&ad, NET); ad.u.net.netif = skb->dev ? skb->dev->name : "[unknown]"; ad.u.net.family = family; @@ -3459,16 +3460,15 @@ static int selinux_socket_sock_rcv_skb(struct sock *sk, struct sk_buff *skb) goto out; if (selinux_compat_net) - err = selinux_sock_rcv_skb_compat(sk, skb, &ad, sock_sid, - sock_class, family, + err = selinux_sock_rcv_skb_compat(sk, skb, &ad, family, addrp, len); else - err = avc_has_perm(sock_sid, skb->secmark, SECCLASS_PACKET, + err = avc_has_perm(sksec->sid, skb->secmark, SECCLASS_PACKET, PACKET__RECV, &ad); if (err) goto out; - err = selinux_xfrm_sock_rcv_skb(sock_sid, skb, &ad); + err = selinux_xfrm_sock_rcv_skb(sksec->sid, skb, &ad); out: return err; } @@ -3572,6 +3572,49 @@ static void selinux_sk_getsecid(struct sock *sk, u32 *secid) } } +void selinux_sock_graft(struct sock* sk, struct socket *parent) +{ + struct inode_security_struct *isec = SOCK_INODE(parent)->i_security; + struct sk_security_struct *sksec = sk->sk_security; + + isec->sid = sksec->sid; +} + +int selinux_inet_conn_request(struct sock *sk, struct sk_buff *skb, + struct request_sock *req) +{ + struct sk_security_struct *sksec = sk->sk_security; + int err; + u32 newsid = 0; + u32 peersid; + + err = selinux_xfrm_decode_session(skb, &peersid, 0); + BUG_ON(err); + + err = security_sid_mls_copy(sksec->sid, peersid, &newsid); + if (err) + return err; + + req->secid = newsid; + return 0; +} + +void selinux_inet_csk_clone(struct sock *newsk, const struct request_sock *req) +{ + struct sk_security_struct *newsksec = newsk->sk_security; + + newsksec->sid = req->secid; + /* NOTE: Ideally, we should also get the isec->sid for the + new socket in sync, but we don't have the isec available yet. + So we will wait until sock_graft to do it, by which + time it will have been created and available. */ +} + +void selinux_req_classify_flow(const struct request_sock *req, struct flowi *fl) +{ + fl->secid = req->secid; +} + static int selinux_nlmsg_perm(struct sock *sk, struct sk_buff *skb) { int err = 0; @@ -3611,12 +3654,24 @@ out: #ifdef CONFIG_NETFILTER static int selinux_ip_postroute_last_compat(struct sock *sk, struct net_device *dev, - struct inode_security_struct *isec, struct avc_audit_data *ad, u16 family, char *addrp, int len) { - int err; + int err = 0; u32 netif_perm, node_perm, node_sid, if_sid, send_perm = 0; + struct socket *sock; + struct inode *inode; + struct inode_security_struct *isec; + + sock = sk->sk_socket; + if (!sock) + goto out; + + inode = SOCK_INODE(sock); + if (!inode) + goto out; + + isec = inode->i_security; err = sel_netif_sids(dev, &if_sid, NULL); if (err) @@ -3681,26 +3736,16 @@ static unsigned int selinux_ip_postroute_last(unsigned int hooknum, char *addrp; int len, err = 0; struct sock *sk; - struct socket *sock; - struct inode *inode; struct sk_buff *skb = *pskb; - struct inode_security_struct *isec; struct avc_audit_data ad; struct net_device *dev = (struct net_device *)out; + struct sk_security_struct *sksec; sk = skb->sk; if (!sk) goto out; - sock = sk->sk_socket; - if (!sock) - goto out; - - inode = SOCK_INODE(sock); - if (!inode) - goto out; - - isec = inode->i_security; + sksec = sk->sk_security; AVC_AUDIT_DATA_INIT(&ad, NET); ad.u.net.netif = dev->name; @@ -3711,16 +3756,16 @@ static unsigned int selinux_ip_postroute_last(unsigned int hooknum, goto out; if (selinux_compat_net) - err = selinux_ip_postroute_last_compat(sk, dev, isec, &ad, + err = selinux_ip_postroute_last_compat(sk, dev, &ad, family, addrp, len); else - err = avc_has_perm(isec->sid, skb->secmark, SECCLASS_PACKET, + err = avc_has_perm(sksec->sid, skb->secmark, SECCLASS_PACKET, PACKET__SEND, &ad); if (err) goto out; - err = selinux_xfrm_postroute_last(isec->sid, skb, &ad); + err = selinux_xfrm_postroute_last(sksec->sid, skb, &ad); out: return err ? NF_DROP : NF_ACCEPT; } @@ -4623,6 +4668,10 @@ static struct security_operations selinux_ops = { .sk_free_security = selinux_sk_free_security, .sk_clone_security = selinux_sk_clone_security, .sk_getsecid = selinux_sk_getsecid, + .sock_graft = selinux_sock_graft, + .inet_conn_request = selinux_inet_conn_request, + .inet_csk_clone = selinux_inet_csk_clone, + .req_classify_flow = selinux_req_classify_flow, #ifdef CONFIG_SECURITY_NETWORK_XFRM .xfrm_policy_alloc_security = selinux_xfrm_policy_alloc, diff --git a/security/selinux/xfrm.c b/security/selinux/xfrm.c index d3690f9..3e742b8 100644 --- a/security/selinux/xfrm.c +++ b/security/selinux/xfrm.c @@ -271,7 +271,6 @@ not_from_user: goto out; } - ctx->ctx_doi = XFRM_SC_DOI_LSM; ctx->ctx_alg = XFRM_SC_ALG_SELINUX; ctx->ctx_sid = ctx_sid; -- cgit v0.10.2 From a51c64f1e5c2876eab2a32955acd9e8015c91c15 Mon Sep 17 00:00:00 2001 From: Venkat Yekkirala Date: Thu, 27 Jul 2006 22:01:34 -0700 Subject: [MLSXFRM]: Fix build with SECURITY_NETWORK_XFRM disabled. The following patch will fix the build problem (encountered by Andrew Morton) when SECURITY_NETWORK_XFRM is not enabled. As compared to git-net-selinux_xfrm_decode_session-build-fix.patch in -mm, this patch sets the return parameter sid to SECSID_NULL in selinux_xfrm_decode_session() and handles this value in the caller selinux_inet_conn_request() appropriately. Signed-off-by: Venkat Yekkirala Acked-by: James Morris Signed-off-by: David S. Miller diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 1dc935f7..33028b3 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -3591,6 +3591,11 @@ int selinux_inet_conn_request(struct sock *sk, struct sk_buff *skb, err = selinux_xfrm_decode_session(skb, &peersid, 0); BUG_ON(err); + if (peersid == SECSID_NULL) { + req->secid = sksec->sid; + return 0; + } + err = security_sid_mls_copy(sksec->sid, peersid, &newsid); if (err) return err; diff --git a/security/selinux/include/xfrm.h b/security/selinux/include/xfrm.h index 1822c73..81eb598 100644 --- a/security/selinux/include/xfrm.h +++ b/security/selinux/include/xfrm.h @@ -20,7 +20,6 @@ int selinux_xfrm_policy_lookup(struct xfrm_policy *xp, u32 fl_secid, u8 dir); int selinux_xfrm_state_pol_flow_match(struct xfrm_state *x, struct xfrm_policy *xp, struct flowi *fl); int selinux_xfrm_flow_state_match(struct flowi *fl, struct xfrm_state *xfrm); -int selinux_xfrm_decode_session(struct sk_buff *skb, u32 *fl, int ckall); /* @@ -41,6 +40,7 @@ int selinux_xfrm_postroute_last(u32 isec_sid, struct sk_buff *skb, struct avc_audit_data *ad); u32 selinux_socket_getpeer_stream(struct sock *sk); u32 selinux_socket_getpeer_dgram(struct sk_buff *skb); +int selinux_xfrm_decode_session(struct sk_buff *skb, u32 *sid, int ckall); #else static inline int selinux_xfrm_sock_rcv_skb(u32 isec_sid, struct sk_buff *skb, struct avc_audit_data *ad) @@ -63,6 +63,11 @@ static inline int selinux_socket_getpeer_dgram(struct sk_buff *skb) { return SECSID_NULL; } +static inline int selinux_xfrm_decode_session(struct sk_buff *skb, u32 *sid, int ckall) +{ + *sid = SECSID_NULL; + return 0; +} #endif #endif /* _SELINUX_XFRM_H_ */ -- cgit v0.10.2 From 8802f616f6de8576805f32e47602816f141118f2 Mon Sep 17 00:00:00 2001 From: Paul Moore Date: Thu, 3 Aug 2006 16:45:49 -0700 Subject: [NetLabel]: documentation Documentation for the NetLabel system, this includes a basic overview of how NetLabel works, how LSM developers can integrate it into their favorite LSM, as well as documentation on the CIPSO related sysctl variables. Also, due to the difficulty of finding expired IETF drafts, I am including the IETF CIPSO draft that is the basis of the NetLabel CIPSO implementation. Signed-off-by: Paul Moore Signed-off-by: David S. Miller diff --git a/CREDITS b/CREDITS index 0fe904e..cc3453a 100644 --- a/CREDITS +++ b/CREDITS @@ -2384,6 +2384,13 @@ N: Thomas Molina E: tmolina@cablespeed.com D: bug fixes, documentation, minor hackery +N: Paul Moore +E: paul.moore@hp.com +D: NetLabel author +S: Hewlett-Packard +S: 110 Spit Brook Road +S: Nashua, NH 03062 + N: James Morris E: jmorris@namei.org W: http://namei.org/ diff --git a/Documentation/00-INDEX b/Documentation/00-INDEX index 5f7f7d7..02457ec 100644 --- a/Documentation/00-INDEX +++ b/Documentation/00-INDEX @@ -184,6 +184,8 @@ mtrr.txt - how to use PPro Memory Type Range Registers to increase performance. nbd.txt - info on a TCP implementation of a network block device. +netlabel/ + - directory with information on the NetLabel subsystem. networking/ - directory with info on various aspects of networking with Linux. nfsroot.txt diff --git a/Documentation/netlabel/00-INDEX b/Documentation/netlabel/00-INDEX new file mode 100644 index 0000000..837bf35 --- /dev/null +++ b/Documentation/netlabel/00-INDEX @@ -0,0 +1,10 @@ +00-INDEX + - this file. +cipso_ipv4.txt + - documentation on the IPv4 CIPSO protocol engine. +draft-ietf-cipso-ipsecurity-01.txt + - IETF draft of the CIPSO protocol, dated 16 July 1992. +introduction.txt + - NetLabel introduction, READ THIS FIRST. +lsm_interface.txt + - documentation on the NetLabel kernel security module API. diff --git a/Documentation/netlabel/cipso_ipv4.txt b/Documentation/netlabel/cipso_ipv4.txt new file mode 100644 index 0000000..93dacb1 --- /dev/null +++ b/Documentation/netlabel/cipso_ipv4.txt @@ -0,0 +1,48 @@ +NetLabel CIPSO/IPv4 Protocol Engine +============================================================================== +Paul Moore, paul.moore@hp.com + +May 17, 2006 + + * Overview + +The NetLabel CIPSO/IPv4 protocol engine is based on the IETF Commercial IP +Security Option (CIPSO) draft from July 16, 1992. A copy of this draft can be +found in this directory, consult '00-INDEX' for the filename. While the IETF +draft never made it to an RFC standard it has become a de-facto standard for +labeled networking and is used in many trusted operating systems. + + * Outbound Packet Processing + +The CIPSO/IPv4 protocol engine applies the CIPSO IP option to packets by +adding the CIPSO label to the socket. This causes all packets leaving the +system through the socket to have the CIPSO IP option applied. The socket's +CIPSO label can be changed at any point in time, however, it is recommended +that it is set upon the socket's creation. The LSM can set the socket's CIPSO +label by using the NetLabel security module API; if the NetLabel "domain" is +configured to use CIPSO for packet labeling then a CIPSO IP option will be +generated and attached to the socket. + + * Inbound Packet Processing + +The CIPSO/IPv4 protocol engine validates every CIPSO IP option it finds at the +IP layer without any special handling required by the LSM. However, in order +to decode and translate the CIPSO label on the packet the LSM must use the +NetLabel security module API to extract the security attributes of the packet. +This is typically done at the socket layer using the 'socket_sock_rcv_skb()' +LSM hook. + + * Label Translation + +The CIPSO/IPv4 protocol engine contains a mechanism to translate CIPSO security +attributes such as sensitivity level and category to values which are +appropriate for the host. These mappings are defined as part of a CIPSO +Domain Of Interpretation (DOI) definition and are configured through the +NetLabel user space communication layer. Each DOI definition can have a +different security attribute mapping table. + + * Label Translation Cache + +The NetLabel system provides a framework for caching security attribute +mappings from the network labels to the corresponding LSM identifiers. The +CIPSO/IPv4 protocol engine supports this caching mechanism. diff --git a/Documentation/netlabel/draft-ietf-cipso-ipsecurity-01.txt b/Documentation/netlabel/draft-ietf-cipso-ipsecurity-01.txt new file mode 100644 index 0000000..256c2c9 --- /dev/null +++ b/Documentation/netlabel/draft-ietf-cipso-ipsecurity-01.txt @@ -0,0 +1,791 @@ +IETF CIPSO Working Group +16 July, 1992 + + + + COMMERCIAL IP SECURITY OPTION (CIPSO 2.2) + + + +1. Status + +This Internet Draft provides the high level specification for a Commercial +IP Security Option (CIPSO). This draft reflects the version as approved by +the CIPSO IETF Working Group. Distribution of this memo is unlimited. + +This document is an Internet Draft. Internet Drafts are working documents +of the Internet Engineering Task Force (IETF), its Areas, and its Working +Groups. Note that other groups may also distribute working documents as +Internet Drafts. + +Internet Drafts are draft documents valid for a maximum of six months. +Internet Drafts may be updated, replaced, or obsoleted by other documents +at any time. It is not appropriate to use Internet Drafts as reference +material or to cite them other than as a "working draft" or "work in +progress." + +Please check the I-D abstract listing contained in each Internet Draft +directory to learn the current status of this or any other Internet Draft. + + + + +2. Background + +Currently the Internet Protocol includes two security options. One of +these options is the DoD Basic Security Option (BSO) (Type 130) which allows +IP datagrams to be labeled with security classifications. This option +provides sixteen security classifications and a variable number of handling +restrictions. To handle additional security information, such as security +categories or compartments, another security option (Type 133) exists and +is referred to as the DoD Extended Security Option (ESO). The values for +the fixed fields within these two options are administered by the Defense +Information Systems Agency (DISA). + +Computer vendors are now building commercial operating systems with +mandatory access controls and multi-level security. These systems are +no longer built specifically for a particular group in the defense or +intelligence communities. They are generally available commercial systems +for use in a variety of government and civil sector environments. + +The small number of ESO format codes can not support all the possible +applications of a commercial security option. The BSO and ESO were +designed to only support the United States DoD. CIPSO has been designed +to support multiple security policies. This Internet Draft provides the +format and procedures required to support a Mandatory Access Control +security policy. Support for additional security policies shall be +defined in future RFCs. + + + + +Internet Draft, Expires 15 Jan 93 [PAGE 1] + + + +CIPSO INTERNET DRAFT 16 July, 1992 + + + + +3. CIPSO Format + +Option type: 134 (Class 0, Number 6, Copy on Fragmentation) +Option length: Variable + +This option permits security related information to be passed between +systems within a single Domain of Interpretation (DOI). A DOI is a +collection of systems which agree on the meaning of particular values +in the security option. An authority that has been assigned a DOI +identifier will define a mapping between appropriate CIPSO field values +and their human readable equivalent. This authority will distribute that +mapping to hosts within the authority's domain. These mappings may be +sensitive, therefore a DOI authority is not required to make these +mappings available to anyone other than the systems that are included in +the DOI. + +This option MUST be copied on fragmentation. This option appears at most +once in a datagram. All multi-octet fields in the option are defined to be +transmitted in network byte order. The format of this option is as follows: + ++----------+----------+------//------+-----------//---------+ +| 10000110 | LLLLLLLL | DDDDDDDDDDDD | TTTTTTTTTTTTTTTTTTTT | ++----------+----------+------//------+-----------//---------+ + + TYPE=134 OPTION DOMAIN OF TAGS + LENGTH INTERPRETATION + + + Figure 1. CIPSO Format + + +3.1 Type + +This field is 1 octet in length. Its value is 134. + + +3.2 Length + +This field is 1 octet in length. It is the total length of the option +including the type and length fields. With the current IP header length +restriction of 40 octets the value of this field MUST not exceed 40. + + +3.3 Domain of Interpretation Identifier + +This field is an unsigned 32 bit integer. The value 0 is reserved and MUST +not appear as the DOI identifier in any CIPSO option. Implementations +should assume that the DOI identifier field is not aligned on any particular +byte boundary. + +To conserve space in the protocol, security levels and categories are +represented by numbers rather than their ASCII equivalent. This requires +a mapping table within CIPSO hosts to map these numbers to their +corresponding ASCII representations. Non-related groups of systems may + + + +Internet Draft, Expires 15 Jan 93 [PAGE 2] + + + +CIPSO INTERNET DRAFT 16 July, 1992 + + + +have their own unique mappings. For example, one group of systems may +use the number 5 to represent Unclassified while another group may use the +number 1 to represent that same security level. The DOI identifier is used +to identify which mapping was used for the values within the option. + + +3.4 Tag Types + +A common format for passing security related information is necessary +for interoperability. CIPSO uses sets of "tags" to contain the security +information relevant to the data in the IP packet. Each tag begins with +a tag type identifier followed by the length of the tag and ends with the +actual security information to be passed. All multi-octet fields in a tag +are defined to be transmitted in network byte order. Like the DOI +identifier field in the CIPSO header, implementations should assume that +all tags, as well as fields within a tag, are not aligned on any particular +octet boundary. The tag types defined in this document contain alignment +bytes to assist alignment of some information, however alignment can not +be guaranteed if CIPSO is not the first IP option. + +CIPSO tag types 0 through 127 are reserved for defining standard tag +formats. Their definitions will be published in RFCs. Tag types whose +identifiers are greater than 127 are defined by the DOI authority and may +only be meaningful in certain Domains of Interpretation. For these tag +types, implementations will require the DOI identifier as well as the tag +number to determine the security policy and the format associated with the +tag. Use of tag types above 127 are restricted to closed networks where +interoperability with other networks will not be an issue. Implementations +that support a tag type greater than 127 MUST support at least one DOI that +requires only tag types 1 to 127. + +Tag type 0 is reserved. Tag types 1, 2, and 5 are defined in this +Internet Draft. Types 3 and 4 are reserved for work in progress. +The standard format for all current and future CIPSO tags is shown below: + ++----------+----------+--------//--------+ +| TTTTTTTT | LLLLLLLL | IIIIIIIIIIIIIIII | ++----------+----------+--------//--------+ + TAG TAG TAG + TYPE LENGTH INFORMATION + + Figure 2: Standard Tag Format + +In the three tag types described in this document, the length and count +restrictions are based on the current IP limitation of 40 octets for all +IP options. If the IP header is later expanded, then the length and count +restrictions specified in this document may increase to use the full area +provided for IP options. + + +3.4.1 Tag Type Classes + +Tag classes consist of tag types that have common processing requirements +and support the same security policy. The three tags defined in this +Internet Draft belong to the Mandatory Access Control (MAC) Sensitivity + + + +Internet Draft, Expires 15 Jan 93 [PAGE 3] + + + +CIPSO INTERNET DRAFT 16 July, 1992 + + + +class and support the MAC Sensitivity security policy. + + +3.4.2 Tag Type 1 + +This is referred to as the "bit-mapped" tag type. Tag type 1 is included +in the MAC Sensitivity tag type class. The format of this tag type is as +follows: + ++----------+----------+----------+----------+--------//---------+ +| 00000001 | LLLLLLLL | 00000000 | LLLLLLLL | CCCCCCCCCCCCCCCCC | ++----------+----------+----------+----------+--------//---------+ + + TAG TAG ALIGNMENT SENSITIVITY BIT MAP OF + TYPE LENGTH OCTET LEVEL CATEGORIES + + Figure 3. Tag Type 1 Format + + +3.4.2.1 Tag Type + +This field is 1 octet in length and has a value of 1. + + +3.4.2.2 Tag Length + +This field is 1 octet in length. It is the total length of the tag type +including the type and length fields. With the current IP header length +restriction of 40 bytes the value within this field is between 4 and 34. + + +3.4.2.3 Alignment Octet + +This field is 1 octet in length and always has the value of 0. Its purpose +is to align the category bitmap field on an even octet boundary. This will +speed many implementations including router implementations. + + +3.4.2.4 Sensitivity Level + +This field is 1 octet in length. Its value is from 0 to 255. The values +are ordered with 0 being the minimum value and 255 representing the maximum +value. + + +3.4.2.5 Bit Map of Categories + +The length of this field is variable and ranges from 0 to 30 octets. This +provides representation of categories 0 to 239. The ordering of the bits +is left to right or MSB to LSB. For example category 0 is represented by +the most significant bit of the first byte and category 15 is represented +by the least significant bit of the second byte. Figure 4 graphically +shows this ordering. Bit N is binary 1 if category N is part of the label +for the datagram, and bit N is binary 0 if category N is not part of the +label. Except for the optimized tag 1 format described in the next section, + + + +Internet Draft, Expires 15 Jan 93 [PAGE 4] + + + +CIPSO INTERNET DRAFT 16 July, 1992 + + + +minimal encoding SHOULD be used resulting in no trailing zero octets in the +category bitmap. + + octet 0 octet 1 octet 2 octet 3 octet 4 octet 5 + XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX . . . +bit 01234567 89111111 11112222 22222233 33333333 44444444 +number 012345 67890123 45678901 23456789 01234567 + + Figure 4. Ordering of Bits in Tag 1 Bit Map + + +3.4.2.6 Optimized Tag 1 Format + +Routers work most efficiently when processing fixed length fields. To +support these routers there is an optimized form of tag type 1. The format +does not change. The only change is to the category bitmap which is set to +a constant length of 10 octets. Trailing octets required to fill out the 10 +octets are zero filled. Ten octets, allowing for 80 categories, was chosen +because it makes the total length of the CIPSO option 20 octets. If CIPSO +is the only option then the option will be full word aligned and additional +filler octets will not be required. + + +3.4.3 Tag Type 2 + +This is referred to as the "enumerated" tag type. It is used to describe +large but sparsely populated sets of categories. Tag type 2 is in the MAC +Sensitivity tag type class. The format of this tag type is as follows: + ++----------+----------+----------+----------+-------------//-------------+ +| 00000010 | LLLLLLLL | 00000000 | LLLLLLLL | CCCCCCCCCCCCCCCCCCCCCCCCCC | ++----------+----------+----------+----------+-------------//-------------+ + + TAG TAG ALIGNMENT SENSITIVITY ENUMERATED + TYPE LENGTH OCTET LEVEL CATEGORIES + + Figure 5. Tag Type 2 Format + + +3.4.3.1 Tag Type + +This field is one octet in length and has a value of 2. + + +3.4.3.2 Tag Length + +This field is 1 octet in length. It is the total length of the tag type +including the type and length fields. With the current IP header length +restriction of 40 bytes the value within this field is between 4 and 34. + + +3.4.3.3 Alignment Octet + +This field is 1 octet in length and always has the value of 0. Its purpose +is to align the category field on an even octet boundary. This will + + + +Internet Draft, Expires 15 Jan 93 [PAGE 5] + + + +CIPSO INTERNET DRAFT 16 July, 1992 + + + +speed many implementations including router implementations. + + +3.4.3.4 Sensitivity Level + +This field is 1 octet in length. Its value is from 0 to 255. The values +are ordered with 0 being the minimum value and 255 representing the +maximum value. + + +3.4.3.5 Enumerated Categories + +In this tag, categories are represented by their actual value rather than +by their position within a bit field. The length of each category is 2 +octets. Up to 15 categories may be represented by this tag. Valid values +for categories are 0 to 65534. Category 65535 is not a valid category +value. The categories MUST be listed in ascending order within the tag. + + +3.4.4 Tag Type 5 + +This is referred to as the "range" tag type. It is used to represent +labels where all categories in a range, or set of ranges, are included +in the sensitivity label. Tag type 5 is in the MAC Sensitivity tag type +class. The format of this tag type is as follows: + ++----------+----------+----------+----------+------------//-------------+ +| 00000101 | LLLLLLLL | 00000000 | LLLLLLLL | Top/Bottom | Top/Bottom | ++----------+----------+----------+----------+------------//-------------+ + + TAG TAG ALIGNMENT SENSITIVITY CATEGORY RANGES + TYPE LENGTH OCTET LEVEL + + Figure 6. Tag Type 5 Format + + +3.4.4.1 Tag Type + +This field is one octet in length and has a value of 5. + + +3.4.4.2 Tag Length + +This field is 1 octet in length. It is the total length of the tag type +including the type and length fields. With the current IP header length +restriction of 40 bytes the value within this field is between 4 and 34. + + +3.4.4.3 Alignment Octet + +This field is 1 octet in length and always has the value of 0. Its purpose +is to align the category range field on an even octet boundary. This will +speed many implementations including router implementations. + + + + + +Internet Draft, Expires 15 Jan 93 [PAGE 6] + + + +CIPSO INTERNET DRAFT 16 July, 1992 + + + +3.4.4.4 Sensitivity Level + +This field is 1 octet in length. Its value is from 0 to 255. The values +are ordered with 0 being the minimum value and 255 representing the maximum +value. + + +3.4.4.5 Category Ranges + +A category range is a 4 octet field comprised of the 2 octet index of the +highest numbered category followed by the 2 octet index of the lowest +numbered category. These range endpoints are inclusive within the range of +categories. All categories within a range are included in the sensitivity +label. This tag may contain a maximum of 7 category pairs. The bottom +category endpoint for the last pair in the tag MAY be omitted and SHOULD be +assumed to be 0. The ranges MUST be non-overlapping and be listed in +descending order. Valid values for categories are 0 to 65534. Category +65535 is not a valid category value. + + +3.4.5 Minimum Requirements + +A CIPSO implementation MUST be capable of generating at least tag type 1 in +the non-optimized form. In addition, a CIPSO implementation MUST be able +to receive any valid tag type 1 even those using the optimized tag type 1 +format. + + +4. Configuration Parameters + +The configuration parameters defined below are required for all CIPSO hosts, +gateways, and routers that support multiple sensitivity labels. A CIPSO +host is defined to be the origination or destination system for an IP +datagram. A CIPSO gateway provides IP routing services between two or more +IP networks and may be required to perform label translations between +networks. A CIPSO gateway may be an enhanced CIPSO host or it may just +provide gateway services with no end system CIPSO capabilities. A CIPSO +router is a dedicated IP router that routes IP datagrams between two or more +IP networks. + +An implementation of CIPSO on a host MUST have the capability to reject a +datagram for reasons that the information contained can not be adequately +protected by the receiving host or if acceptance may result in violation of +the host or network security policy. In addition, a CIPSO gateway or router +MUST be able to reject datagrams going to networks that can not provide +adequate protection or may violate the network's security policy. To +provide this capability the following minimal set of configuration +parameters are required for CIPSO implementations: + +HOST_LABEL_MAX - This parameter contains the maximum sensitivity label that +a CIPSO host is authorized to handle. All datagrams that have a label +greater than this maximum MUST be rejected by the CIPSO host. This +parameter does not apply to CIPSO gateways or routers. This parameter need +not be defined explicitly as it can be implicitly derived from the +PORT_LABEL_MAX parameters for the associated interfaces. + + + +Internet Draft, Expires 15 Jan 93 [PAGE 7] + + + +CIPSO INTERNET DRAFT 16 July, 1992 + + + + +HOST_LABEL_MIN - This parameter contains the minimum sensitivity label that +a CIPSO host is authorized to handle. All datagrams that have a label less +than this minimum MUST be rejected by the CIPSO host. This parameter does +not apply to CIPSO gateways or routers. This parameter need not be defined +explicitly as it can be implicitly derived from the PORT_LABEL_MIN +parameters for the associated interfaces. + +PORT_LABEL_MAX - This parameter contains the maximum sensitivity label for +all datagrams that may exit a particular network interface port. All +outgoing datagrams that have a label greater than this maximum MUST be +rejected by the CIPSO system. The label within this parameter MUST be +less than or equal to the label within the HOST_LABEL_MAX parameter. This +parameter does not apply to CIPSO hosts that support only one network port. + +PORT_LABEL_MIN - This parameter contains the minimum sensitivity label for +all datagrams that may exit a particular network interface port. All +outgoing datagrams that have a label less than this minimum MUST be +rejected by the CIPSO system. The label within this parameter MUST be +greater than or equal to the label within the HOST_LABEL_MIN parameter. +This parameter does not apply to CIPSO hosts that support only one network +port. + +PORT_DOI - This parameter is used to assign a DOI identifier value to a +particular network interface port. All CIPSO labels within datagrams +going out this port MUST use the specified DOI identifier. All CIPSO +hosts and gateways MUST support either this parameter, the NET_DOI +parameter, or the HOST_DOI parameter. + +NET_DOI - This parameter is used to assign a DOI identifier value to a +particular IP network address. All CIPSO labels within datagrams destined +for the particular IP network MUST use the specified DOI identifier. All +CIPSO hosts and gateways MUST support either this parameter, the PORT_DOI +parameter, or the HOST_DOI parameter. + +HOST_DOI - This parameter is used to assign a DOI identifier value to a +particular IP host address. All CIPSO labels within datagrams destined for +the particular IP host will use the specified DOI identifier. All CIPSO +hosts and gateways MUST support either this parameter, the PORT_DOI +parameter, or the NET_DOI parameter. + +This list represents the minimal set of configuration parameters required +to be compliant. Implementors are encouraged to add to this list to +provide enhanced functionality and control. For example, many security +policies may require both incoming and outgoing datagrams be checked against +the port and host label ranges. + + +4.1 Port Range Parameters + +The labels represented by the PORT_LABEL_MAX and PORT_LABEL_MIN parameters +MAY be in CIPSO or local format. Some CIPSO systems, such as routers, may +want to have the range parameters expressed in CIPSO format so that incoming +labels do not have to be converted to a local format before being compared +against the range. If multiple DOIs are supported by one of these CIPSO + + + +Internet Draft, Expires 15 Jan 93 [PAGE 8] + + + +CIPSO INTERNET DRAFT 16 July, 1992 + + + +systems then multiple port range parameters would be needed, one set for +each DOI supported on a particular port. + +The port range will usually represent the total set of labels that may +exist on the logical network accessed through the corresponding network +interface. It may, however, represent a subset of these labels that are +allowed to enter the CIPSO system. + + +4.2 Single Label CIPSO Hosts + +CIPSO implementations that support only one label are not required to +support the parameters described above. These limited implementations are +only required to support a NET_LABEL parameter. This parameter contains +the CIPSO label that may be inserted in datagrams that exit the host. In +addition, the host MUST reject any incoming datagram that has a label which +is not equivalent to the NET_LABEL parameter. + + +5. Handling Procedures + +This section describes the processing requirements for incoming and +outgoing IP datagrams. Just providing the correct CIPSO label format +is not enough. Assumptions will be made by one system on how a +receiving system will handle the CIPSO label. Wrong assumptions may +lead to non-interoperability or even a security incident. The +requirements described below represent the minimal set needed for +interoperability and that provide users some level of confidence. +Many other requirements could be added to increase user confidence, +however at the risk of restricting creativity and limiting vendor +participation. + + +5.1 Input Procedures + +All datagrams received through a network port MUST have a security label +associated with them, either contained in the datagram or assigned to the +receiving port. Without this label the host, gateway, or router will not +have the information it needs to make security decisions. This security +label will be obtained from the CIPSO if the option is present in the +datagram. See section 4.1.2 for handling procedures for unlabeled +datagrams. This label will be compared against the PORT (if appropriate) +and HOST configuration parameters defined in section 3. + +If any field within the CIPSO option, such as the DOI identifier, is not +recognized the IP datagram is discarded and an ICMP "parameter problem" +(type 12) is generated and returned. The ICMP code field is set to "bad +parameter" (code 0) and the pointer is set to the start of the CIPSO field +that is unrecognized. + +If the contents of the CIPSO are valid but the security label is +outside of the configured host or port label range, the datagram is +discarded and an ICMP "destination unreachable" (type 3) is generated +and returned. The code field of the ICMP is set to "communication with +destination network administratively prohibited" (code 9) or to + + + +Internet Draft, Expires 15 Jan 93 [PAGE 9] + + + +CIPSO INTERNET DRAFT 16 July, 1992 + + + +"communication with destination host administratively prohibited" +(code 10). The value of the code field used is dependent upon whether +the originator of the ICMP message is acting as a CIPSO host or a CIPSO +gateway. The recipient of the ICMP message MUST be able to handle either +value. The same procedure is performed if a CIPSO can not be added to an +IP packet because it is too large to fit in the IP options area. + +If the error is triggered by receipt of an ICMP message, the message +is discarded and no response is permitted (consistent with general ICMP +processing rules). + + +5.1.1 Unrecognized tag types + +The default condition for any CIPSO implementation is that an +unrecognized tag type MUST be treated as a "parameter problem" and +handled as described in section 4.1. A CIPSO implementation MAY allow +the system administrator to identify tag types that may safely be +ignored. This capability is an allowable enhancement, not a +requirement. + + +5.1.2 Unlabeled Packets + +A network port may be configured to not require a CIPSO label for all +incoming datagrams. For this configuration a CIPSO label must be +assigned to that network port and associated with all unlabeled IP +datagrams. This capability might be used for single level networks or +networks that have CIPSO and non-CIPSO hosts and the non-CIPSO hosts +all operate at the same label. + +If a CIPSO option is required and none is found, the datagram is +discarded and an ICMP "parameter problem" (type 12) is generated and +returned to the originator of the datagram. The code field of the ICMP +is set to "option missing" (code 1) and the ICMP pointer is set to 134 +(the value of the option type for the missing CIPSO option). + + +5.2 Output Procedures + +A CIPSO option MUST appear only once in a datagram. Only one tag type +from the MAC Sensitivity class MAY be included in a CIPSO option. Given +the current set of defined tag types, this means that CIPSO labels at +first will contain only one tag. + +All datagrams leaving a CIPSO system MUST meet the following condition: + + PORT_LABEL_MIN <= CIPSO label <= PORT_LABEL_MAX + +If this condition is not satisfied the datagram MUST be discarded. +If the CIPSO system only supports one port, the HOST_LABEL_MIN and the +HOST_LABEL_MAX parameters MAY be substituted for the PORT parameters in +the above condition. + +The DOI identifier to be used for all outgoing datagrams is configured by + + + +Internet Draft, Expires 15 Jan 93 [PAGE 10] + + + +CIPSO INTERNET DRAFT 16 July, 1992 + + + +the administrator. If port level DOI identifier assignment is used, then +the PORT_DOI configuration parameter MUST contain the DOI identifier to +use. If network level DOI assignment is used, then the NET_DOI parameter +MUST contain the DOI identifier to use. And if host level DOI assignment +is employed, then the HOST_DOI parameter MUST contain the DOI identifier +to use. A CIPSO implementation need only support one level of DOI +assignment. + + +5.3 DOI Processing Requirements + +A CIPSO implementation MUST support at least one DOI and SHOULD support +multiple DOIs. System and network administrators are cautioned to +ensure that at least one DOI is common within an IP network to allow for +broadcasting of IP datagrams. + +CIPSO gateways MUST be capable of translating a CIPSO option from one +DOI to another when forwarding datagrams between networks. For +efficiency purposes this capability is only a desired feature for CIPSO +routers. + + +5.4 Label of ICMP Messages + +The CIPSO label to be used on all outgoing ICMP messages MUST be equivalent +to the label of the datagram that caused the ICMP message. If the ICMP was +generated due to a problem associated with the original CIPSO label then the +following responses are allowed: + + a. Use the CIPSO label of the original IP datagram + b. Drop the original datagram with no return message generated + +In most cases these options will have the same effect. If you can not +interpret the label or if it is outside the label range of your host or +interface then an ICMP message with the same label will probably not be +able to exit the system. + + +6. Assignment of DOI Identifier Numbers = + +Requests for assignment of a DOI identifier number should be addressed to +the Internet Assigned Numbers Authority (IANA). + + +7. Acknowledgements + +Much of the material in this RFC is based on (and copied from) work +done by Gary Winiger of Sun Microsystems and published as Commercial +IP Security Option at the INTEROP 89, Commercial IPSO Workshop. + + +8. Author's Address + +To submit mail for distribution to members of the IETF CIPSO Working +Group, send mail to: cipso@wdl1.wdl.loral.com. + + + +Internet Draft, Expires 15 Jan 93 [PAGE 11] + + + +CIPSO INTERNET DRAFT 16 July, 1992 + + + + +To be added to or deleted from this distribution, send mail to: +cipso-request@wdl1.wdl.loral.com. + + +9. References + +RFC 1038, "Draft Revised IP Security Option", M. St. Johns, IETF, January +1988. + +RFC 1108, "U.S. Department of Defense Security Options +for the Internet Protocol", Stephen Kent, IAB, 1 March, 1991. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Internet Draft, Expires 15 Jan 93 [PAGE 12] + + + diff --git a/Documentation/netlabel/introduction.txt b/Documentation/netlabel/introduction.txt new file mode 100644 index 0000000..a4ffba1 --- /dev/null +++ b/Documentation/netlabel/introduction.txt @@ -0,0 +1,46 @@ +NetLabel Introduction +============================================================================== +Paul Moore, paul.moore@hp.com + +August 2, 2006 + + * Overview + +NetLabel is a mechanism which can be used by kernel security modules to attach +security attributes to outgoing network packets generated from user space +applications and read security attributes from incoming network packets. It +is composed of three main components, the protocol engines, the communication +layer, and the kernel security module API. + + * Protocol Engines + +The protocol engines are responsible for both applying and retrieving the +network packet's security attributes. If any translation between the network +security attributes and those on the host are required then the protocol +engine will handle those tasks as well. Other kernel subsystems should +refrain from calling the protocol engines directly, instead they should use +the NetLabel kernel security module API described below. + +Detailed information about each NetLabel protocol engine can be found in this +directory, consult '00-INDEX' for filenames. + + * Communication Layer + +The communication layer exists to allow NetLabel configuration and monitoring +from user space. The NetLabel communication layer uses a message based +protocol built on top of the Generic NETLINK transport mechanism. The exact +formatting of these NetLabel messages as well as the Generic NETLINK family +names can be found in the the 'net/netlabel/' directory as comments in the +header files as well as in 'include/net/netlabel.h'. + + * Security Module API + +The purpose of the NetLabel security module API is to provide a protocol +independent interface to the underlying NetLabel protocol engines. In addition +to protocol independence, the security module API is designed to be completely +LSM independent which should allow multiple LSMs to leverage the same code +base. + +Detailed information about the NetLabel security module API can be found in the +'include/net/netlabel.h' header file as well as the 'lsm_interface.txt' file +found in this directory. diff --git a/Documentation/netlabel/lsm_interface.txt b/Documentation/netlabel/lsm_interface.txt new file mode 100644 index 0000000..98dd9f7 --- /dev/null +++ b/Documentation/netlabel/lsm_interface.txt @@ -0,0 +1,47 @@ +NetLabel Linux Security Module Interface +============================================================================== +Paul Moore, paul.moore@hp.com + +May 17, 2006 + + * Overview + +NetLabel is a mechanism which can set and retrieve security attributes from +network packets. It is intended to be used by LSM developers who want to make +use of a common code base for several different packet labeling protocols. +The NetLabel security module API is defined in 'include/net/netlabel.h' but a +brief overview is given below. + + * NetLabel Security Attributes + +Since NetLabel supports multiple different packet labeling protocols and LSMs +it uses the concept of security attributes to refer to the packet's security +labels. The NetLabel security attributes are defined by the +'netlbl_lsm_secattr' structure in the NetLabel header file. Internally the +NetLabel subsystem converts the security attributes to and from the correct +low-level packet label depending on the NetLabel build time and run time +configuration. It is up to the LSM developer to translate the NetLabel +security attributes into whatever security identifiers are in use for their +particular LSM. + + * NetLabel LSM Protocol Operations + +These are the functions which allow the LSM developer to manipulate the labels +on outgoing packets as well as read the labels on incoming packets. Functions +exist to operate both on sockets as well as the sk_buffs directly. These high +level functions are translated into low level protocol operations based on how +the administrator has configured the NetLabel subsystem. + + * NetLabel Label Mapping Cache Operations + +Depending on the exact configuration, translation between the network packet +label and the internal LSM security identifier can be time consuming. The +NetLabel label mapping cache is a caching mechanism which can be used to +sidestep much of this overhead once a mapping has been established. Once the +LSM has received a packet, used NetLabel to decode it's security attributes, +and translated the security attributes into a LSM internal identifier the LSM +can use the NetLabel caching functions to associate the LSM internal +identifier with the network packet's label. This means that in the future +when a incoming packet matches a cached value not only are the internal +NetLabel translation mechanisms bypassed but the LSM translation mechanisms are +bypassed as well which should result in a significant reduction in overhead. diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index 90ed781..307cd4e 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -375,6 +375,41 @@ tcp_slow_start_after_idle - BOOLEAN be timed out after an idle period. Default: 1 +CIPSOv4 Variables: + +cipso_cache_enable - BOOLEAN + If set, enable additions to and lookups from the CIPSO label mapping + cache. If unset, additions are ignored and lookups always result in a + miss. However, regardless of the setting the cache is still + invalidated when required when means you can safely toggle this on and + off and the cache will always be "safe". + Default: 1 + +cipso_cache_bucket_size - INTEGER + The CIPSO label cache consists of a fixed size hash table with each + hash bucket containing a number of cache entries. This variable limits + the number of entries in each hash bucket; the larger the value the + more CIPSO label mappings that can be cached. When the number of + entries in a given hash bucket reaches this limit adding new entries + causes the oldest entry in the bucket to be removed to make room. + Default: 10 + +cipso_rbm_optfmt - BOOLEAN + Enable the "Optimized Tag 1 Format" as defined in section 3.4.2.6 of + the CIPSO draft specification (see Documentation/netlabel for details). + This means that when set the CIPSO tag will be padded with empty + categories in order to make the packet data 32-bit aligned. + Default: 0 + +cipso_rbm_structvalid - BOOLEAN + If set, do a very strict check of the CIPSO option when + ip_options_compile() is called. If unset, relax the checks done during + ip_options_compile(). Either way is "safe" as errors are caught else + where in the CIPSO processing code but setting this to 0 (False) should + result in less work (i.e. it should be faster) but could cause problems + with other implementations that require strict checking. + Default: 0 + IP Variables: ip_local_port_range - 2 INTEGERS -- cgit v0.10.2 From 11a03f78fbf15a866ba3bf6359a75cdfd1ced703 Mon Sep 17 00:00:00 2001 From: Paul Moore Date: Thu, 3 Aug 2006 16:46:20 -0700 Subject: [NetLabel]: core network changes Changes to the core network stack to support the NetLabel subsystem. This includes changes to the IPv4 option handling to support CIPSO labels. Signed-off-by: Paul Moore Signed-off-by: David S. Miller diff --git a/include/linux/ip.h b/include/linux/ip.h index 4b55cf1..2f46001 100644 --- a/include/linux/ip.h +++ b/include/linux/ip.h @@ -57,6 +57,7 @@ #define IPOPT_SEC (2 |IPOPT_CONTROL|IPOPT_COPY) #define IPOPT_LSRR (3 |IPOPT_CONTROL|IPOPT_COPY) #define IPOPT_TIMESTAMP (4 |IPOPT_MEASUREMENT) +#define IPOPT_CIPSO (6 |IPOPT_CONTROL|IPOPT_COPY) #define IPOPT_RR (7 |IPOPT_CONTROL) #define IPOPT_SID (8 |IPOPT_CONTROL|IPOPT_COPY) #define IPOPT_SSRR (9 |IPOPT_CONTROL|IPOPT_COPY) diff --git a/include/net/cipso_ipv4.h b/include/net/cipso_ipv4.h new file mode 100644 index 0000000..c7175e7 --- /dev/null +++ b/include/net/cipso_ipv4.h @@ -0,0 +1,250 @@ +/* + * CIPSO - Commercial IP Security Option + * + * This is an implementation of the CIPSO 2.2 protocol as specified in + * draft-ietf-cipso-ipsecurity-01.txt with additional tag types as found in + * FIPS-188, copies of both documents can be found in the Documentation + * directory. While CIPSO never became a full IETF RFC standard many vendors + * have chosen to adopt the protocol and over the years it has become a + * de-facto standard for labeled networking. + * + * Author: Paul Moore + * + */ + +/* + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef _CIPSO_IPV4_H +#define _CIPSO_IPV4_H + +#include +#include +#include +#include + +/* known doi values */ +#define CIPSO_V4_DOI_UNKNOWN 0x00000000 + +/* tag types */ +#define CIPSO_V4_TAG_INVALID 0 +#define CIPSO_V4_TAG_RBITMAP 1 +#define CIPSO_V4_TAG_ENUM 2 +#define CIPSO_V4_TAG_RANGE 5 +#define CIPSO_V4_TAG_PBITMAP 6 +#define CIPSO_V4_TAG_FREEFORM 7 + +/* doi mapping types */ +#define CIPSO_V4_MAP_UNKNOWN 0 +#define CIPSO_V4_MAP_STD 1 +#define CIPSO_V4_MAP_PASS 2 + +/* limits */ +#define CIPSO_V4_MAX_REM_LVLS 256 +#define CIPSO_V4_INV_LVL 0x80000000 +#define CIPSO_V4_MAX_LOC_LVLS (CIPSO_V4_INV_LVL - 1) +#define CIPSO_V4_MAX_REM_CATS 65536 +#define CIPSO_V4_INV_CAT 0x80000000 +#define CIPSO_V4_MAX_LOC_CATS (CIPSO_V4_INV_CAT - 1) + +/* + * CIPSO DOI definitions + */ + +/* DOI definition struct */ +#define CIPSO_V4_TAG_MAXCNT 5 +struct cipso_v4_doi { + u32 doi; + u32 type; + union { + struct cipso_v4_std_map_tbl *std; + } map; + u8 tags[CIPSO_V4_TAG_MAXCNT]; + + u32 valid; + struct list_head list; + struct rcu_head rcu; + struct list_head dom_list; +}; + +/* Standard CIPSO mapping table */ +/* NOTE: the highest order bit (i.e. 0x80000000) is an 'invalid' flag, if the + * bit is set then consider that value as unspecified, meaning the + * mapping for that particular level/category is invalid */ +struct cipso_v4_std_map_tbl { + struct { + u32 *cipso; + u32 *local; + u32 cipso_size; + u32 local_size; + } lvl; + struct { + u32 *cipso; + u32 *local; + u32 cipso_size; + u32 local_size; + } cat; +}; + +/* + * Sysctl Variables + */ + +#ifdef CONFIG_NETLABEL +extern int cipso_v4_cache_enabled; +extern int cipso_v4_cache_bucketsize; +extern int cipso_v4_rbm_optfmt; +extern int cipso_v4_rbm_strictvalid; +#endif + +/* + * Helper Functions + */ + +#define CIPSO_V4_OPTEXIST(x) (IPCB(x)->opt.cipso != 0) +#define CIPSO_V4_OPTPTR(x) ((x)->nh.raw + IPCB(x)->opt.cipso) + +/* + * DOI List Functions + */ + +#ifdef CONFIG_NETLABEL +int cipso_v4_doi_add(struct cipso_v4_doi *doi_def); +int cipso_v4_doi_remove(u32 doi, void (*callback) (struct rcu_head * head)); +struct cipso_v4_doi *cipso_v4_doi_getdef(u32 doi); +struct sk_buff *cipso_v4_doi_dump_all(size_t headroom); +struct sk_buff *cipso_v4_doi_dump(u32 doi, size_t headroom); +int cipso_v4_doi_domhsh_add(struct cipso_v4_doi *doi_def, const char *domain); +int cipso_v4_doi_domhsh_remove(struct cipso_v4_doi *doi_def, + const char *domain); +#else +static inline int cipso_v4_doi_add(struct cipso_v4_doi *doi_def) +{ + return -ENOSYS; +} + +static inline int cipso_v4_doi_remove(u32 doi, + void (*callback) (struct rcu_head * head)) +{ + return 0; +} + +static inline struct cipso_v4_doi *cipso_v4_doi_getdef(u32 doi) +{ + return NULL; +} + +static inline struct sk_buff *cipso_v4_doi_dump_all(size_t headroom) +{ + return NULL; +} + +static inline struct sk_buff *cipso_v4_doi_dump(u32 doi, size_t headroom) +{ + return NULL; +} + +static inline int cipso_v4_doi_domhsh_add(struct cipso_v4_doi *doi_def, + const char *domain) +{ + return -ENOSYS; +} + +static inline int cipso_v4_doi_domhsh_remove(struct cipso_v4_doi *doi_def, + const char *domain) +{ + return 0; +} +#endif /* CONFIG_NETLABEL */ + +/* + * Label Mapping Cache Functions + */ + +#ifdef CONFIG_NETLABEL +void cipso_v4_cache_invalidate(void); +int cipso_v4_cache_add(const struct sk_buff *skb, + const struct netlbl_lsm_secattr *secattr); +#else +static inline void cipso_v4_cache_invalidate(void) +{ + return; +} + +static inline int cipso_v4_cache_add(const struct sk_buff *skb, + const struct netlbl_lsm_secattr *secattr) +{ + return 0; +} +#endif /* CONFIG_NETLABEL */ + +/* + * Protocol Handling Functions + */ + +#ifdef CONFIG_NETLABEL +void cipso_v4_error(struct sk_buff *skb, int error, u32 gateway); +int cipso_v4_socket_setopt(struct socket *sock, + unsigned char *opt, + u32 opt_len); +int cipso_v4_socket_setattr(const struct socket *sock, + const struct cipso_v4_doi *doi_def, + const struct netlbl_lsm_secattr *secattr); +int cipso_v4_socket_getopt(const struct socket *sock, + unsigned char **opt, + u32 *opt_len); +int cipso_v4_socket_getattr(const struct socket *sock, + struct netlbl_lsm_secattr *secattr); +int cipso_v4_skbuff_getattr(const struct sk_buff *skb, + struct netlbl_lsm_secattr *secattr); +int cipso_v4_validate(unsigned char **option); +#else +static inline void cipso_v4_error(struct sk_buff *skb, + int error, + u32 gateway) +{ + return; +} + +static inline int cipso_v4_socket_setattr(const struct socket *sock, + const struct cipso_v4_doi *doi_def, + const struct netlbl_lsm_secattr *secattr) +{ + return -ENOSYS; +} + +static inline int cipso_v4_socket_getattr(const struct socket *sock, + struct netlbl_lsm_secattr *secattr) +{ + return -ENOSYS; +} + +static inline int cipso_v4_skbuff_getattr(const struct sk_buff *skb, + struct netlbl_lsm_secattr *secattr) +{ + return -ENOSYS; +} + +static inline int cipso_v4_validate(unsigned char **option) +{ + return -ENOSYS; +} +#endif /* CONFIG_NETLABEL */ + +#endif /* _CIPSO_IPV4_H */ diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h index 1f4a9a6..f4caad5 100644 --- a/include/net/inet_sock.h +++ b/include/net/inet_sock.h @@ -51,7 +51,7 @@ struct ip_options { ts_needtime:1, ts_needaddr:1; unsigned char router_alert; - unsigned char __pad1; + unsigned char cipso; unsigned char __pad2; unsigned char __data[0]; }; diff --git a/include/net/netlabel.h b/include/net/netlabel.h new file mode 100644 index 0000000..7cae730 --- /dev/null +++ b/include/net/netlabel.h @@ -0,0 +1,291 @@ +/* + * NetLabel System + * + * The NetLabel system manages static and dynamic label mappings for network + * protocols such as CIPSO and RIPSO. + * + * Author: Paul Moore + * + */ + +/* + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef _NETLABEL_H +#define _NETLABEL_H + +#include +#include +#include + +/* + * NetLabel - A management interface for maintaining network packet label + * mapping tables for explicit packet labling protocols. + * + * Network protocols such as CIPSO and RIPSO require a label translation layer + * to convert the label on the packet into something meaningful on the host + * machine. In the current Linux implementation these mapping tables live + * inside the kernel; NetLabel provides a mechanism for user space applications + * to manage these mapping tables. + * + * NetLabel makes use of the Generic NETLINK mechanism as a transport layer to + * send messages between kernel and user space. The general format of a + * NetLabel message is shown below: + * + * +-----------------+-------------------+--------- --- -- - + * | struct nlmsghdr | struct genlmsghdr | payload + * +-----------------+-------------------+--------- --- -- - + * + * The 'nlmsghdr' and 'genlmsghdr' structs should be dealt with like normal. + * The payload is dependent on the subsystem specified in the + * 'nlmsghdr->nlmsg_type' and should be defined below, supporting functions + * should be defined in the corresponding net/netlabel/netlabel_.h|c + * file. All of the fields in the NetLabel payload are NETLINK attributes, the + * length of each field is the length of the NETLINK attribute payload, see + * include/net/netlink.h for more information on NETLINK attributes. + * + */ + +/* + * NetLabel NETLINK protocol + */ + +#define NETLBL_PROTO_VERSION 1 + +/* NetLabel NETLINK types/families */ +#define NETLBL_NLTYPE_NONE 0 +#define NETLBL_NLTYPE_MGMT 1 +#define NETLBL_NLTYPE_MGMT_NAME "NLBL_MGMT" +#define NETLBL_NLTYPE_RIPSO 2 +#define NETLBL_NLTYPE_RIPSO_NAME "NLBL_RIPSO" +#define NETLBL_NLTYPE_CIPSOV4 3 +#define NETLBL_NLTYPE_CIPSOV4_NAME "NLBL_CIPSOv4" +#define NETLBL_NLTYPE_CIPSOV6 4 +#define NETLBL_NLTYPE_CIPSOV6_NAME "NLBL_CIPSOv6" +#define NETLBL_NLTYPE_UNLABELED 5 +#define NETLBL_NLTYPE_UNLABELED_NAME "NLBL_UNLBL" + +/* NetLabel return codes */ +#define NETLBL_E_OK 0 + +/* + * Helper functions + */ + +#define NETLBL_LEN_U8 nla_total_size(sizeof(u8)) +#define NETLBL_LEN_U16 nla_total_size(sizeof(u16)) +#define NETLBL_LEN_U32 nla_total_size(sizeof(u32)) + +/** + * netlbl_netlink_alloc_skb - Allocate a NETLINK message buffer + * @head: the amount of headroom in bytes + * @body: the desired size (minus headroom) in bytes + * @gfp_flags: the alloc flags to pass to alloc_skb() + * + * Description: + * Allocate a NETLINK message buffer based on the sizes given in @head and + * @body. If @head is greater than zero skb_reserve() is called to reserve + * @head bytes at the start of the buffer. Returns a valid sk_buff pointer on + * success, NULL on failure. + * + */ +static inline struct sk_buff *netlbl_netlink_alloc_skb(size_t head, + size_t body, + int gfp_flags) +{ + struct sk_buff *skb; + + skb = alloc_skb(NLMSG_ALIGN(head + body), gfp_flags); + if (skb == NULL) + return NULL; + if (head > 0) { + skb_reserve(skb, head); + if (skb_tailroom(skb) < body) { + kfree_skb(skb); + return NULL; + } + } + + return skb; +} + +/* + * NetLabel - Kernel API for accessing the network packet label mappings. + * + * The following functions are provided for use by other kernel modules, + * specifically kernel LSM modules, to provide a consistent, transparent API + * for dealing with explicit packet labeling protocols such as CIPSO and + * RIPSO. The functions defined here are implemented in the + * net/netlabel/netlabel_kapi.c file. + * + */ + +/* Domain mapping definition struct */ +struct netlbl_dom_map; + +/* Domain mapping operations */ +int netlbl_domhsh_remove(const char *domain); + +/* LSM security attributes */ +struct netlbl_lsm_cache { + void (*free) (const void *data); + void *data; +}; +struct netlbl_lsm_secattr { + char *domain; + + u32 mls_lvl; + u32 mls_lvl_vld; + unsigned char *mls_cat; + size_t mls_cat_len; + + struct netlbl_lsm_cache cache; +}; + +/* + * LSM security attribute operations + */ + + +/** + * netlbl_secattr_init - Initialize a netlbl_lsm_secattr struct + * @secattr: the struct to initialize + * + * Description: + * Initialize an already allocated netlbl_lsm_secattr struct. Returns zero on + * success, negative values on error. + * + */ +static inline int netlbl_secattr_init(struct netlbl_lsm_secattr *secattr) +{ + memset(secattr, 0, sizeof(*secattr)); + return 0; +} + +/** + * netlbl_secattr_destroy - Clears a netlbl_lsm_secattr struct + * @secattr: the struct to clear + * @clear_cache: cache clear flag + * + * Description: + * Destroys the @secattr struct, including freeing all of the internal buffers. + * If @clear_cache is true then free the cache fields, otherwise leave them + * intact. The struct must be reset with a call to netlbl_secattr_init() + * before reuse. + * + */ +static inline void netlbl_secattr_destroy(struct netlbl_lsm_secattr *secattr, + u32 clear_cache) +{ + if (clear_cache && secattr->cache.data != NULL && secattr->cache.free) + secattr->cache.free(secattr->cache.data); + kfree(secattr->domain); + kfree(secattr->mls_cat); +} + +/** + * netlbl_secattr_alloc - Allocate and initialize a netlbl_lsm_secattr struct + * @flags: the memory allocation flags + * + * Description: + * Allocate and initialize a netlbl_lsm_secattr struct. Returns a valid + * pointer on success, or NULL on failure. + * + */ +static inline struct netlbl_lsm_secattr *netlbl_secattr_alloc(int flags) +{ + return kzalloc(sizeof(struct netlbl_lsm_secattr), flags); +} + +/** + * netlbl_secattr_free - Frees a netlbl_lsm_secattr struct + * @secattr: the struct to free + * @clear_cache: cache clear flag + * + * Description: + * Frees @secattr including all of the internal buffers. If @clear_cache is + * true then free the cache fields, otherwise leave them intact. + * + */ +static inline void netlbl_secattr_free(struct netlbl_lsm_secattr *secattr, + u32 clear_cache) +{ + netlbl_secattr_destroy(secattr, clear_cache); + kfree(secattr); +} + +/* + * LSM protocol operations + */ + +#ifdef CONFIG_NETLABEL +int netlbl_socket_setattr(const struct socket *sock, + const struct netlbl_lsm_secattr *secattr); +int netlbl_socket_getattr(const struct socket *sock, + struct netlbl_lsm_secattr *secattr); +int netlbl_skbuff_getattr(const struct sk_buff *skb, + struct netlbl_lsm_secattr *secattr); +void netlbl_skbuff_err(struct sk_buff *skb, int error); +#else +static inline int netlbl_socket_setattr(const struct socket *sock, + const struct netlbl_lsm_secattr *secattr) +{ + return -ENOSYS; +} + +static inline int netlbl_socket_getattr(const struct socket *sock, + struct netlbl_lsm_secattr *secattr) +{ + return -ENOSYS; +} + +static inline int netlbl_skbuff_getattr(const struct sk_buff *skb, + struct netlbl_lsm_secattr *secattr) +{ + return -ENOSYS; +} + +static inline void netlbl_skbuff_err(struct sk_buff *skb, int error) +{ + return; +} +#endif /* CONFIG_NETLABEL */ + +/* + * LSM label mapping cache operations + */ + +#ifdef CONFIG_NETLABEL +void netlbl_cache_invalidate(void); +int netlbl_cache_add(const struct sk_buff *skb, + const struct netlbl_lsm_secattr *secattr); +#else +static inline void netlbl_cache_invalidate(void) +{ + return; +} + +static inline int netlbl_cache_add(const struct sk_buff *skb, + const struct netlbl_lsm_secattr *secattr) +{ + return 0; +} +#endif /* CONFIG_NETLABEL */ + +#endif /* _NETLABEL_H */ diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c index 2b98943..008e69d 100644 --- a/net/ipv4/ah4.c +++ b/net/ipv4/ah4.c @@ -35,7 +35,7 @@ static int ip_clear_mutable_options(struct iphdr *iph, u32 *daddr) switch (*optptr) { case IPOPT_SEC: case 0x85: /* Some "Extended Security" crap. */ - case 0x86: /* Another "Commercial Security" crap. */ + case IPOPT_CIPSO: case IPOPT_RA: case 0x80|21: /* RFC1770 */ break; diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c index 406056e..e0a93b4 100644 --- a/net/ipv4/ip_options.c +++ b/net/ipv4/ip_options.c @@ -24,6 +24,7 @@ #include #include #include +#include /* * Write options to IP header, record destination address to @@ -194,6 +195,13 @@ int ip_options_echo(struct ip_options * dopt, struct sk_buff * skb) dopt->is_strictroute = sopt->is_strictroute; } } + if (sopt->cipso) { + optlen = sptr[sopt->cipso+1]; + dopt->cipso = dopt->optlen+sizeof(struct iphdr); + memcpy(dptr, sptr+sopt->cipso, optlen); + dptr += optlen; + dopt->optlen += optlen; + } while (dopt->optlen & 3) { *dptr++ = IPOPT_END; dopt->optlen++; @@ -434,6 +442,17 @@ int ip_options_compile(struct ip_options * opt, struct sk_buff * skb) if (optptr[2] == 0 && optptr[3] == 0) opt->router_alert = optptr - iph; break; + case IPOPT_CIPSO: + if (opt->cipso) { + pp_ptr = optptr; + goto error; + } + opt->cipso = optptr - iph; + if (cipso_v4_validate(&optptr)) { + pp_ptr = optptr; + goto error; + } + break; case IPOPT_SEC: case IPOPT_SID: default: -- cgit v0.10.2 From 446fda4f26822b2d42ab3396aafcedf38a9ff2b6 Mon Sep 17 00:00:00 2001 From: Paul Moore Date: Thu, 3 Aug 2006 16:48:06 -0700 Subject: [NetLabel]: CIPSOv4 engine Add support for the Commercial IP Security Option (CIPSO) to the IPv4 network stack. CIPSO has become a de-facto standard for trusted/labeled networking amongst existing Trusted Operating Systems such as Trusted Solaris, HP-UX CMW, etc. This implementation is designed to be used with the NetLabel subsystem to provide explicit packet labeling to LSM developers. The CIPSO/IPv4 packet labeling works by the LSM calling a NetLabel API function which attaches a CIPSO label (IPv4 option) to a given socket; this in turn attaches the CIPSO label to every packet leaving the socket without any extra processing on the outbound side. On the inbound side the individual packet's sk_buff is examined through a call to a NetLabel API function to determine if a CIPSO/IPv4 label is present and if so the security attributes of the CIPSO label are returned to the caller of the NetLabel API function. Signed-off-by: Paul Moore Signed-off-by: David S. Miller diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h index e4b1a4d..af61d92 100644 --- a/include/linux/sysctl.h +++ b/include/linux/sysctl.h @@ -411,6 +411,10 @@ enum NET_IPV4_TCP_WORKAROUND_SIGNED_WINDOWS=115, NET_TCP_DMA_COPYBREAK=116, NET_TCP_SLOW_START_AFTER_IDLE=117, + NET_CIPSOV4_CACHE_ENABLE=118, + NET_CIPSOV4_CACHE_BUCKET_SIZE=119, + NET_CIPSOV4_RBM_OPTFMT=120, + NET_CIPSOV4_RBM_STRICTVALID=121, }; enum { diff --git a/net/ipv4/Makefile b/net/ipv4/Makefile index 4878fc5..f66049e 100644 --- a/net/ipv4/Makefile +++ b/net/ipv4/Makefile @@ -47,6 +47,7 @@ obj-$(CONFIG_TCP_CONG_VEGAS) += tcp_vegas.o obj-$(CONFIG_TCP_CONG_VENO) += tcp_veno.o obj-$(CONFIG_TCP_CONG_SCALABLE) += tcp_scalable.o obj-$(CONFIG_TCP_CONG_LP) += tcp_lp.o +obj-$(CONFIG_NETLABEL) += cipso_ipv4.o obj-$(CONFIG_XFRM) += xfrm4_policy.o xfrm4_state.o xfrm4_input.o \ xfrm4_output.o diff --git a/net/ipv4/cipso_ipv4.c b/net/ipv4/cipso_ipv4.c new file mode 100644 index 0000000..b82a101 --- /dev/null +++ b/net/ipv4/cipso_ipv4.c @@ -0,0 +1,1607 @@ +/* + * CIPSO - Commercial IP Security Option + * + * This is an implementation of the CIPSO 2.2 protocol as specified in + * draft-ietf-cipso-ipsecurity-01.txt with additional tag types as found in + * FIPS-188, copies of both documents can be found in the Documentation + * directory. While CIPSO never became a full IETF RFC standard many vendors + * have chosen to adopt the protocol and over the years it has become a + * de-facto standard for labeled networking. + * + * Author: Paul Moore + * + */ + +/* + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +struct cipso_v4_domhsh_entry { + char *domain; + u32 valid; + struct list_head list; + struct rcu_head rcu; +}; + +/* List of available DOI definitions */ +/* XXX - Updates should be minimal so having a single lock for the + * cipso_v4_doi_list and the cipso_v4_doi_list->dom_list should be + * okay. */ +/* XXX - This currently assumes a minimal number of different DOIs in use, + * if in practice there are a lot of different DOIs this list should + * probably be turned into a hash table or something similar so we + * can do quick lookups. */ +DEFINE_SPINLOCK(cipso_v4_doi_list_lock); +static struct list_head cipso_v4_doi_list = LIST_HEAD_INIT(cipso_v4_doi_list); + +/* Label mapping cache */ +int cipso_v4_cache_enabled = 1; +int cipso_v4_cache_bucketsize = 10; +#define CIPSO_V4_CACHE_BUCKETBITS 7 +#define CIPSO_V4_CACHE_BUCKETS (1 << CIPSO_V4_CACHE_BUCKETBITS) +#define CIPSO_V4_CACHE_REORDERLIMIT 10 +struct cipso_v4_map_cache_bkt { + spinlock_t lock; + u32 size; + struct list_head list; +}; +struct cipso_v4_map_cache_entry { + u32 hash; + unsigned char *key; + size_t key_len; + + struct netlbl_lsm_cache lsm_data; + + u32 activity; + struct list_head list; +}; +static struct cipso_v4_map_cache_bkt *cipso_v4_cache = NULL; + +/* Restricted bitmap (tag #1) flags */ +int cipso_v4_rbm_optfmt = 0; +int cipso_v4_rbm_strictvalid = 1; + +/* + * Helper Functions + */ + +/** + * cipso_v4_bitmap_walk - Walk a bitmap looking for a bit + * @bitmap: the bitmap + * @bitmap_len: length in bits + * @offset: starting offset + * @state: if non-zero, look for a set (1) bit else look for a cleared (0) bit + * + * Description: + * Starting at @offset, walk the bitmap from left to right until either the + * desired bit is found or we reach the end. Return the bit offset, -1 if + * not found, or -2 if error. + */ +static int cipso_v4_bitmap_walk(const unsigned char *bitmap, + u32 bitmap_len, + u32 offset, + u8 state) +{ + u32 bit_spot; + u32 byte_offset; + unsigned char bitmask; + unsigned char byte; + + /* gcc always rounds to zero when doing integer division */ + byte_offset = offset / 8; + byte = bitmap[byte_offset]; + bit_spot = offset; + bitmask = 0x80 >> (offset % 8); + + while (bit_spot < bitmap_len) { + if ((state && (byte & bitmask) == bitmask) || + (state == 0 && (byte & bitmask) == 0)) + return bit_spot; + + bit_spot++; + bitmask >>= 1; + if (bitmask == 0) { + byte = bitmap[++byte_offset]; + bitmask = 0x80; + } + } + + return -1; +} + +/** + * cipso_v4_bitmap_setbit - Sets a single bit in a bitmap + * @bitmap: the bitmap + * @bit: the bit + * @state: if non-zero, set the bit (1) else clear the bit (0) + * + * Description: + * Set a single bit in the bitmask. Returns zero on success, negative values + * on error. + */ +static void cipso_v4_bitmap_setbit(unsigned char *bitmap, + u32 bit, + u8 state) +{ + u32 byte_spot; + u8 bitmask; + + /* gcc always rounds to zero when doing integer division */ + byte_spot = bit / 8; + bitmask = 0x80 >> (bit % 8); + if (state) + bitmap[byte_spot] |= bitmask; + else + bitmap[byte_spot] &= ~bitmask; +} + +/** + * cipso_v4_doi_domhsh_free - Frees a domain list entry + * @entry: the entry's RCU field + * + * Description: + * This function is designed to be used as a callback to the call_rcu() + * function so that the memory allocated to a domain list entry can be released + * safely. + * + */ +static void cipso_v4_doi_domhsh_free(struct rcu_head *entry) +{ + struct cipso_v4_domhsh_entry *ptr; + + ptr = container_of(entry, struct cipso_v4_domhsh_entry, rcu); + kfree(ptr->domain); + kfree(ptr); +} + +/** + * cipso_v4_cache_entry_free - Frees a cache entry + * @entry: the entry to free + * + * Description: + * This function frees the memory associated with a cache entry. + * + */ +static void cipso_v4_cache_entry_free(struct cipso_v4_map_cache_entry *entry) +{ + if (entry->lsm_data.free) + entry->lsm_data.free(entry->lsm_data.data); + kfree(entry->key); + kfree(entry); +} + +/** + * cipso_v4_map_cache_hash - Hashing function for the CIPSO cache + * @key: the hash key + * @key_len: the length of the key in bytes + * + * Description: + * The CIPSO tag hashing function. Returns a 32-bit hash value. + * + */ +static u32 cipso_v4_map_cache_hash(const unsigned char *key, u32 key_len) +{ + return jhash(key, key_len, 0); +} + +/* + * Label Mapping Cache Functions + */ + +/** + * cipso_v4_cache_init - Initialize the CIPSO cache + * + * Description: + * Initializes the CIPSO label mapping cache, this function should be called + * before any of the other functions defined in this file. Returns zero on + * success, negative values on error. + * + */ +static int cipso_v4_cache_init(void) +{ + u32 iter; + + cipso_v4_cache = kcalloc(CIPSO_V4_CACHE_BUCKETS, + sizeof(struct cipso_v4_map_cache_bkt), + GFP_KERNEL); + if (cipso_v4_cache == NULL) + return -ENOMEM; + + for (iter = 0; iter < CIPSO_V4_CACHE_BUCKETS; iter++) { + spin_lock_init(&cipso_v4_cache[iter].lock); + cipso_v4_cache[iter].size = 0; + INIT_LIST_HEAD(&cipso_v4_cache[iter].list); + } + + return 0; +} + +/** + * cipso_v4_cache_invalidate - Invalidates the current CIPSO cache + * + * Description: + * Invalidates and frees any entries in the CIPSO cache. Returns zero on + * success and negative values on failure. + * + */ +void cipso_v4_cache_invalidate(void) +{ + struct cipso_v4_map_cache_entry *entry, *tmp_entry; + u32 iter; + + for (iter = 0; iter < CIPSO_V4_CACHE_BUCKETS; iter++) { + spin_lock(&cipso_v4_cache[iter].lock); + list_for_each_entry_safe(entry, + tmp_entry, + &cipso_v4_cache[iter].list, list) { + list_del(&entry->list); + cipso_v4_cache_entry_free(entry); + } + cipso_v4_cache[iter].size = 0; + spin_unlock(&cipso_v4_cache[iter].lock); + } + + return; +} + +/** + * cipso_v4_cache_check - Check the CIPSO cache for a label mapping + * @key: the buffer to check + * @key_len: buffer length in bytes + * @secattr: the security attribute struct to use + * + * Description: + * This function checks the cache to see if a label mapping already exists for + * the given key. If there is a match then the cache is adjusted and the + * @secattr struct is populated with the correct LSM security attributes. The + * cache is adjusted in the following manner if the entry is not already the + * first in the cache bucket: + * + * 1. The cache entry's activity counter is incremented + * 2. The previous (higher ranking) entry's activity counter is decremented + * 3. If the difference between the two activity counters is geater than + * CIPSO_V4_CACHE_REORDERLIMIT the two entries are swapped + * + * Returns zero on success, -ENOENT for a cache miss, and other negative values + * on error. + * + */ +static int cipso_v4_cache_check(const unsigned char *key, + u32 key_len, + struct netlbl_lsm_secattr *secattr) +{ + u32 bkt; + struct cipso_v4_map_cache_entry *entry; + struct cipso_v4_map_cache_entry *prev_entry = NULL; + u32 hash; + + if (!cipso_v4_cache_enabled) + return -ENOENT; + + hash = cipso_v4_map_cache_hash(key, key_len); + bkt = hash & (CIPSO_V4_CACHE_BUCKETBITS - 1); + spin_lock(&cipso_v4_cache[bkt].lock); + list_for_each_entry(entry, &cipso_v4_cache[bkt].list, list) { + if (entry->hash == hash && + entry->key_len == key_len && + memcmp(entry->key, key, key_len) == 0) { + entry->activity += 1; + secattr->cache.free = entry->lsm_data.free; + secattr->cache.data = entry->lsm_data.data; + if (prev_entry == NULL) { + spin_unlock(&cipso_v4_cache[bkt].lock); + return 0; + } + + if (prev_entry->activity > 0) + prev_entry->activity -= 1; + if (entry->activity > prev_entry->activity && + entry->activity - prev_entry->activity > + CIPSO_V4_CACHE_REORDERLIMIT) { + __list_del(entry->list.prev, entry->list.next); + __list_add(&entry->list, + prev_entry->list.prev, + &prev_entry->list); + } + + spin_unlock(&cipso_v4_cache[bkt].lock); + return 0; + } + prev_entry = entry; + } + spin_unlock(&cipso_v4_cache[bkt].lock); + + return -ENOENT; +} + +/** + * cipso_v4_cache_add - Add an entry to the CIPSO cache + * @skb: the packet + * @secattr: the packet's security attributes + * + * Description: + * Add a new entry into the CIPSO label mapping cache. Add the new entry to + * head of the cache bucket's list, if the cache bucket is out of room remove + * the last entry in the list first. It is important to note that there is + * currently no checking for duplicate keys. Returns zero on success, + * negative values on failure. + * + */ +int cipso_v4_cache_add(const struct sk_buff *skb, + const struct netlbl_lsm_secattr *secattr) +{ + int ret_val = -EPERM; + u32 bkt; + struct cipso_v4_map_cache_entry *entry = NULL; + struct cipso_v4_map_cache_entry *old_entry = NULL; + unsigned char *cipso_ptr; + u32 cipso_ptr_len; + + if (!cipso_v4_cache_enabled || cipso_v4_cache_bucketsize <= 0) + return 0; + + cipso_ptr = CIPSO_V4_OPTPTR(skb); + cipso_ptr_len = cipso_ptr[1]; + + entry = kzalloc(sizeof(*entry), GFP_ATOMIC); + if (entry == NULL) + return -ENOMEM; + entry->key = kmalloc(cipso_ptr_len, GFP_ATOMIC); + if (entry->key == NULL) { + ret_val = -ENOMEM; + goto cache_add_failure; + } + memcpy(entry->key, cipso_ptr, cipso_ptr_len); + entry->key_len = cipso_ptr_len; + entry->hash = cipso_v4_map_cache_hash(cipso_ptr, cipso_ptr_len); + entry->lsm_data.free = secattr->cache.free; + entry->lsm_data.data = secattr->cache.data; + + bkt = entry->hash & (CIPSO_V4_CACHE_BUCKETBITS - 1); + spin_lock(&cipso_v4_cache[bkt].lock); + if (cipso_v4_cache[bkt].size < cipso_v4_cache_bucketsize) { + list_add(&entry->list, &cipso_v4_cache[bkt].list); + cipso_v4_cache[bkt].size += 1; + } else { + old_entry = list_entry(cipso_v4_cache[bkt].list.prev, + struct cipso_v4_map_cache_entry, list); + list_del(&old_entry->list); + list_add(&entry->list, &cipso_v4_cache[bkt].list); + cipso_v4_cache_entry_free(old_entry); + } + spin_unlock(&cipso_v4_cache[bkt].lock); + + return 0; + +cache_add_failure: + if (entry) + cipso_v4_cache_entry_free(entry); + return ret_val; +} + +/* + * DOI List Functions + */ + +/** + * cipso_v4_doi_search - Searches for a DOI definition + * @doi: the DOI to search for + * + * Description: + * Search the DOI definition list for a DOI definition with a DOI value that + * matches @doi. The caller is responsibile for calling rcu_read_[un]lock(). + * Returns a pointer to the DOI definition on success and NULL on failure. + */ +static struct cipso_v4_doi *cipso_v4_doi_search(u32 doi) +{ + struct cipso_v4_doi *iter; + + list_for_each_entry_rcu(iter, &cipso_v4_doi_list, list) + if (iter->doi == doi && iter->valid) + return iter; + return NULL; +} + +/** + * cipso_v4_doi_add - Add a new DOI to the CIPSO protocol engine + * @doi_def: the DOI structure + * + * Description: + * The caller defines a new DOI for use by the CIPSO engine and calls this + * function to add it to the list of acceptable domains. The caller must + * ensure that the mapping table specified in @doi_def->map meets all of the + * requirements of the mapping type (see cipso_ipv4.h for details). Returns + * zero on success and non-zero on failure. + * + */ +int cipso_v4_doi_add(struct cipso_v4_doi *doi_def) +{ + if (doi_def == NULL || doi_def->doi == CIPSO_V4_DOI_UNKNOWN) + return -EINVAL; + + doi_def->valid = 1; + INIT_RCU_HEAD(&doi_def->rcu); + INIT_LIST_HEAD(&doi_def->dom_list); + + rcu_read_lock(); + if (cipso_v4_doi_search(doi_def->doi) != NULL) + goto doi_add_failure_rlock; + spin_lock(&cipso_v4_doi_list_lock); + if (cipso_v4_doi_search(doi_def->doi) != NULL) + goto doi_add_failure_slock; + list_add_tail_rcu(&doi_def->list, &cipso_v4_doi_list); + spin_unlock(&cipso_v4_doi_list_lock); + rcu_read_unlock(); + + return 0; + +doi_add_failure_slock: + spin_unlock(&cipso_v4_doi_list_lock); +doi_add_failure_rlock: + rcu_read_unlock(); + return -EEXIST; +} + +/** + * cipso_v4_doi_remove - Remove an existing DOI from the CIPSO protocol engine + * @doi: the DOI value + * @callback: the DOI cleanup/free callback + * + * Description: + * Removes a DOI definition from the CIPSO engine, @callback is called to + * free any memory. The NetLabel routines will be called to release their own + * LSM domain mappings as well as our own domain list. Returns zero on + * success and negative values on failure. + * + */ +int cipso_v4_doi_remove(u32 doi, void (*callback) (struct rcu_head * head)) +{ + struct cipso_v4_doi *doi_def; + struct cipso_v4_domhsh_entry *dom_iter; + + rcu_read_lock(); + if (cipso_v4_doi_search(doi) != NULL) { + spin_lock(&cipso_v4_doi_list_lock); + doi_def = cipso_v4_doi_search(doi); + if (doi_def == NULL) { + spin_unlock(&cipso_v4_doi_list_lock); + rcu_read_unlock(); + return -ENOENT; + } + doi_def->valid = 0; + list_del_rcu(&doi_def->list); + spin_unlock(&cipso_v4_doi_list_lock); + list_for_each_entry_rcu(dom_iter, &doi_def->dom_list, list) + if (dom_iter->valid) + netlbl_domhsh_remove(dom_iter->domain); + cipso_v4_cache_invalidate(); + rcu_read_unlock(); + + call_rcu(&doi_def->rcu, callback); + return 0; + } + rcu_read_unlock(); + + return -ENOENT; +} + +/** + * cipso_v4_doi_getdef - Returns a pointer to a valid DOI definition + * @doi: the DOI value + * + * Description: + * Searches for a valid DOI definition and if one is found it is returned to + * the caller. Otherwise NULL is returned. The caller must ensure that + * rcu_read_lock() is held while accessing the returned definition. + * + */ +struct cipso_v4_doi *cipso_v4_doi_getdef(u32 doi) +{ + return cipso_v4_doi_search(doi); +} + +/** + * cipso_v4_doi_dump_all - Dump all the CIPSO DOI definitions into a sk_buff + * @headroom: the amount of headroom to allocate for the sk_buff + * + * Description: + * Dump a list of all the configured DOI values into a sk_buff. The returned + * sk_buff has room at the front of the sk_buff for @headroom bytes. See + * net/netlabel/netlabel_cipso_v4.h for the LISTALL message format. This + * function may fail if another process is changing the DOI list at the same + * time. Returns a pointer to a sk_buff on success, NULL on error. + * + */ +struct sk_buff *cipso_v4_doi_dump_all(size_t headroom) +{ + struct sk_buff *skb = NULL; + struct cipso_v4_doi *iter; + u32 doi_cnt = 0; + ssize_t buf_len; + + buf_len = NETLBL_LEN_U32; + rcu_read_lock(); + list_for_each_entry_rcu(iter, &cipso_v4_doi_list, list) + if (iter->valid) { + doi_cnt += 1; + buf_len += 2 * NETLBL_LEN_U32; + } + + skb = netlbl_netlink_alloc_skb(headroom, buf_len, GFP_ATOMIC); + if (skb == NULL) + goto doi_dump_all_failure; + + if (nla_put_u32(skb, NLA_U32, doi_cnt) != 0) + goto doi_dump_all_failure; + buf_len -= NETLBL_LEN_U32; + list_for_each_entry_rcu(iter, &cipso_v4_doi_list, list) + if (iter->valid) { + if (buf_len < 2 * NETLBL_LEN_U32) + goto doi_dump_all_failure; + if (nla_put_u32(skb, NLA_U32, iter->doi) != 0) + goto doi_dump_all_failure; + if (nla_put_u32(skb, NLA_U32, iter->type) != 0) + goto doi_dump_all_failure; + buf_len -= 2 * NETLBL_LEN_U32; + } + rcu_read_unlock(); + + return skb; + +doi_dump_all_failure: + rcu_read_unlock(); + kfree(skb); + return NULL; +} + +/** + * cipso_v4_doi_dump - Dump a CIPSO DOI definition into a sk_buff + * @doi: the DOI value + * @headroom: the amount of headroom to allocate for the sk_buff + * + * Description: + * Lookup the DOI definition matching @doi and dump it's contents into a + * sk_buff. The returned sk_buff has room at the front of the sk_buff for + * @headroom bytes. See net/netlabel/netlabel_cipso_v4.h for the LIST message + * format. This function may fail if another process is changing the DOI list + * at the same time. Returns a pointer to a sk_buff on success, NULL on error. + * + */ +struct sk_buff *cipso_v4_doi_dump(u32 doi, size_t headroom) +{ + struct sk_buff *skb = NULL; + struct cipso_v4_doi *iter; + u32 tag_cnt = 0; + u32 lvl_cnt = 0; + u32 cat_cnt = 0; + ssize_t buf_len; + ssize_t tmp; + + rcu_read_lock(); + iter = cipso_v4_doi_getdef(doi); + if (iter == NULL) + goto doi_dump_failure; + buf_len = NETLBL_LEN_U32; + switch (iter->type) { + case CIPSO_V4_MAP_PASS: + buf_len += NETLBL_LEN_U32; + while(tag_cnt < CIPSO_V4_TAG_MAXCNT && + iter->tags[tag_cnt] != CIPSO_V4_TAG_INVALID) { + tag_cnt += 1; + buf_len += NETLBL_LEN_U8; + } + break; + case CIPSO_V4_MAP_STD: + buf_len += 3 * NETLBL_LEN_U32; + while (tag_cnt < CIPSO_V4_TAG_MAXCNT && + iter->tags[tag_cnt] != CIPSO_V4_TAG_INVALID) { + tag_cnt += 1; + buf_len += NETLBL_LEN_U8; + } + for (tmp = 0; tmp < iter->map.std->lvl.local_size; tmp++) + if (iter->map.std->lvl.local[tmp] != + CIPSO_V4_INV_LVL) { + lvl_cnt += 1; + buf_len += NETLBL_LEN_U32 + NETLBL_LEN_U8; + } + for (tmp = 0; tmp < iter->map.std->cat.local_size; tmp++) + if (iter->map.std->cat.local[tmp] != + CIPSO_V4_INV_CAT) { + cat_cnt += 1; + buf_len += NETLBL_LEN_U32 + NETLBL_LEN_U16; + } + break; + } + + skb = netlbl_netlink_alloc_skb(headroom, buf_len, GFP_ATOMIC); + if (skb == NULL) + goto doi_dump_failure; + + if (nla_put_u32(skb, NLA_U32, iter->type) != 0) + goto doi_dump_failure; + buf_len -= NETLBL_LEN_U32; + if (iter != cipso_v4_doi_getdef(doi)) + goto doi_dump_failure; + switch (iter->type) { + case CIPSO_V4_MAP_PASS: + if (nla_put_u32(skb, NLA_U32, tag_cnt) != 0) + goto doi_dump_failure; + buf_len -= NETLBL_LEN_U32; + for (tmp = 0; + tmp < CIPSO_V4_TAG_MAXCNT && + iter->tags[tmp] != CIPSO_V4_TAG_INVALID; + tmp++) { + if (buf_len < NETLBL_LEN_U8) + goto doi_dump_failure; + if (nla_put_u8(skb, NLA_U8, iter->tags[tmp]) != 0) + goto doi_dump_failure; + buf_len -= NETLBL_LEN_U8; + } + break; + case CIPSO_V4_MAP_STD: + if (nla_put_u32(skb, NLA_U32, tag_cnt) != 0) + goto doi_dump_failure; + if (nla_put_u32(skb, NLA_U32, lvl_cnt) != 0) + goto doi_dump_failure; + if (nla_put_u32(skb, NLA_U32, cat_cnt) != 0) + goto doi_dump_failure; + buf_len -= 3 * NETLBL_LEN_U32; + for (tmp = 0; + tmp < CIPSO_V4_TAG_MAXCNT && + iter->tags[tmp] != CIPSO_V4_TAG_INVALID; + tmp++) { + if (buf_len < NETLBL_LEN_U8) + goto doi_dump_failure; + if (nla_put_u8(skb, NLA_U8, iter->tags[tmp]) != 0) + goto doi_dump_failure; + buf_len -= NETLBL_LEN_U8; + } + for (tmp = 0; tmp < iter->map.std->lvl.local_size; tmp++) + if (iter->map.std->lvl.local[tmp] != + CIPSO_V4_INV_LVL) { + if (buf_len < NETLBL_LEN_U32 + NETLBL_LEN_U8) + goto doi_dump_failure; + if (nla_put_u32(skb, NLA_U32, tmp) != 0) + goto doi_dump_failure; + if (nla_put_u8(skb, + NLA_U8, + iter->map.std->lvl.local[tmp]) != 0) + goto doi_dump_failure; + buf_len -= NETLBL_LEN_U32 + NETLBL_LEN_U8; + } + for (tmp = 0; tmp < iter->map.std->cat.local_size; tmp++) + if (iter->map.std->cat.local[tmp] != + CIPSO_V4_INV_CAT) { + if (buf_len < NETLBL_LEN_U32 + NETLBL_LEN_U16) + goto doi_dump_failure; + if (nla_put_u32(skb, NLA_U32, tmp) != 0) + goto doi_dump_failure; + if (nla_put_u16(skb, + NLA_U16, + iter->map.std->cat.local[tmp]) != 0) + goto doi_dump_failure; + buf_len -= NETLBL_LEN_U32 + NETLBL_LEN_U16; + } + break; + } + rcu_read_unlock(); + + return skb; + +doi_dump_failure: + rcu_read_unlock(); + kfree(skb); + return NULL; +} + +/** + * cipso_v4_doi_domhsh_add - Adds a domain entry to a DOI definition + * @doi_def: the DOI definition + * @domain: the domain to add + * + * Description: + * Adds the @domain to the the DOI specified by @doi_def, this function + * should only be called by external functions (i.e. NetLabel). This function + * does allocate memory. Returns zero on success, negative values on failure. + * + */ +int cipso_v4_doi_domhsh_add(struct cipso_v4_doi *doi_def, const char *domain) +{ + struct cipso_v4_domhsh_entry *iter; + struct cipso_v4_domhsh_entry *new_dom; + + new_dom = kzalloc(sizeof(*new_dom), GFP_KERNEL); + if (new_dom == NULL) + return -ENOMEM; + if (domain) { + new_dom->domain = kstrdup(domain, GFP_KERNEL); + if (new_dom->domain == NULL) { + kfree(new_dom); + return -ENOMEM; + } + } + new_dom->valid = 1; + INIT_RCU_HEAD(&new_dom->rcu); + + rcu_read_lock(); + spin_lock(&cipso_v4_doi_list_lock); + list_for_each_entry_rcu(iter, &doi_def->dom_list, list) + if (iter->valid && + ((domain != NULL && iter->domain != NULL && + strcmp(iter->domain, domain) == 0) || + (domain == NULL && iter->domain == NULL))) { + spin_unlock(&cipso_v4_doi_list_lock); + rcu_read_unlock(); + kfree(new_dom->domain); + kfree(new_dom); + return -EEXIST; + } + list_add_tail_rcu(&new_dom->list, &doi_def->dom_list); + spin_unlock(&cipso_v4_doi_list_lock); + rcu_read_unlock(); + + return 0; +} + +/** + * cipso_v4_doi_domhsh_remove - Removes a domain entry from a DOI definition + * @doi_def: the DOI definition + * @domain: the domain to remove + * + * Description: + * Removes the @domain from the DOI specified by @doi_def, this function + * should only be called by external functions (i.e. NetLabel). Returns zero + * on success and negative values on error. + * + */ +int cipso_v4_doi_domhsh_remove(struct cipso_v4_doi *doi_def, + const char *domain) +{ + struct cipso_v4_domhsh_entry *iter; + + rcu_read_lock(); + spin_lock(&cipso_v4_doi_list_lock); + list_for_each_entry_rcu(iter, &doi_def->dom_list, list) + if (iter->valid && + ((domain != NULL && iter->domain != NULL && + strcmp(iter->domain, domain) == 0) || + (domain == NULL && iter->domain == NULL))) { + iter->valid = 0; + list_del_rcu(&iter->list); + spin_unlock(&cipso_v4_doi_list_lock); + rcu_read_unlock(); + call_rcu(&iter->rcu, cipso_v4_doi_domhsh_free); + + return 0; + } + spin_unlock(&cipso_v4_doi_list_lock); + rcu_read_unlock(); + + return -ENOENT; +} + +/* + * Label Mapping Functions + */ + +/** + * cipso_v4_map_lvl_valid - Checks to see if the given level is understood + * @doi_def: the DOI definition + * @level: the level to check + * + * Description: + * Checks the given level against the given DOI definition and returns a + * negative value if the level does not have a valid mapping and a zero value + * if the level is defined by the DOI. + * + */ +static int cipso_v4_map_lvl_valid(const struct cipso_v4_doi *doi_def, u8 level) +{ + switch (doi_def->type) { + case CIPSO_V4_MAP_PASS: + return 0; + case CIPSO_V4_MAP_STD: + if (doi_def->map.std->lvl.cipso[level] < CIPSO_V4_INV_LVL) + return 0; + break; + } + + return -EFAULT; +} + +/** + * cipso_v4_map_lvl_hton - Perform a level mapping from the host to the network + * @doi_def: the DOI definition + * @host_lvl: the host MLS level + * @net_lvl: the network/CIPSO MLS level + * + * Description: + * Perform a label mapping to translate a local MLS level to the correct + * CIPSO level using the given DOI definition. Returns zero on success, + * negative values otherwise. + * + */ +static int cipso_v4_map_lvl_hton(const struct cipso_v4_doi *doi_def, + u32 host_lvl, + u32 *net_lvl) +{ + switch (doi_def->type) { + case CIPSO_V4_MAP_PASS: + *net_lvl = host_lvl; + return 0; + case CIPSO_V4_MAP_STD: + if (host_lvl < doi_def->map.std->lvl.local_size) { + *net_lvl = doi_def->map.std->lvl.local[host_lvl]; + return 0; + } + break; + } + + return -EINVAL; +} + +/** + * cipso_v4_map_lvl_ntoh - Perform a level mapping from the network to the host + * @doi_def: the DOI definition + * @net_lvl: the network/CIPSO MLS level + * @host_lvl: the host MLS level + * + * Description: + * Perform a label mapping to translate a CIPSO level to the correct local MLS + * level using the given DOI definition. Returns zero on success, negative + * values otherwise. + * + */ +static int cipso_v4_map_lvl_ntoh(const struct cipso_v4_doi *doi_def, + u32 net_lvl, + u32 *host_lvl) +{ + struct cipso_v4_std_map_tbl *map_tbl; + + switch (doi_def->type) { + case CIPSO_V4_MAP_PASS: + *host_lvl = net_lvl; + return 0; + case CIPSO_V4_MAP_STD: + map_tbl = doi_def->map.std; + if (net_lvl < map_tbl->lvl.cipso_size && + map_tbl->lvl.cipso[net_lvl] < CIPSO_V4_INV_LVL) { + *host_lvl = doi_def->map.std->lvl.cipso[net_lvl]; + return 0; + } + break; + } + + return -EINVAL; +} + +/** + * cipso_v4_map_cat_rbm_valid - Checks to see if the category bitmap is valid + * @doi_def: the DOI definition + * @bitmap: category bitmap + * @bitmap_len: bitmap length in bytes + * + * Description: + * Checks the given category bitmap against the given DOI definition and + * returns a negative value if any of the categories in the bitmap do not have + * a valid mapping and a zero value if all of the categories are valid. + * + */ +static int cipso_v4_map_cat_rbm_valid(const struct cipso_v4_doi *doi_def, + const unsigned char *bitmap, + u32 bitmap_len) +{ + int cat = -1; + u32 bitmap_len_bits = bitmap_len * 8; + u32 cipso_cat_size = doi_def->map.std->cat.cipso_size; + u32 *cipso_array = doi_def->map.std->cat.cipso; + + switch (doi_def->type) { + case CIPSO_V4_MAP_PASS: + return 0; + case CIPSO_V4_MAP_STD: + for (;;) { + cat = cipso_v4_bitmap_walk(bitmap, + bitmap_len_bits, + cat + 1, + 1); + if (cat < 0) + break; + if (cat >= cipso_cat_size || + cipso_array[cat] >= CIPSO_V4_INV_CAT) + return -EFAULT; + } + + if (cat == -1) + return 0; + break; + } + + return -EFAULT; +} + +/** + * cipso_v4_map_cat_rbm_hton - Perform a category mapping from host to network + * @doi_def: the DOI definition + * @host_cat: the category bitmap in host format + * @host_cat_len: the length of the host's category bitmap in bytes + * @net_cat: the zero'd out category bitmap in network/CIPSO format + * @net_cat_len: the length of the CIPSO bitmap in bytes + * + * Description: + * Perform a label mapping to translate a local MLS category bitmap to the + * correct CIPSO bitmap using the given DOI definition. Returns the minimum + * size in bytes of the network bitmap on success, negative values otherwise. + * + */ +static int cipso_v4_map_cat_rbm_hton(const struct cipso_v4_doi *doi_def, + const unsigned char *host_cat, + u32 host_cat_len, + unsigned char *net_cat, + u32 net_cat_len) +{ + int host_spot = -1; + u32 net_spot; + u32 net_spot_max = 0; + u32 host_clen_bits = host_cat_len * 8; + u32 net_clen_bits = net_cat_len * 8; + u32 host_cat_size = doi_def->map.std->cat.local_size; + u32 *host_cat_array = doi_def->map.std->cat.local; + + switch (doi_def->type) { + case CIPSO_V4_MAP_PASS: + net_spot_max = host_cat_len - 1; + while (net_spot_max > 0 && host_cat[net_spot_max] == 0) + net_spot_max--; + if (net_spot_max > net_cat_len) + return -EINVAL; + memcpy(net_cat, host_cat, net_spot_max); + return net_spot_max; + case CIPSO_V4_MAP_STD: + for (;;) { + host_spot = cipso_v4_bitmap_walk(host_cat, + host_clen_bits, + host_spot + 1, + 1); + if (host_spot < 0) + break; + if (host_spot >= host_cat_size) + return -EPERM; + + net_spot = host_cat_array[host_spot]; + if (net_spot >= net_clen_bits) + return -ENOSPC; + cipso_v4_bitmap_setbit(net_cat, net_spot, 1); + + if (net_spot > net_spot_max) + net_spot_max = net_spot; + } + + if (host_spot == -2) + return -EFAULT; + + if (++net_spot_max % 8) + return net_spot_max / 8 + 1; + return net_spot_max / 8; + } + + return -EINVAL; +} + +/** + * cipso_v4_map_cat_rbm_ntoh - Perform a category mapping from network to host + * @doi_def: the DOI definition + * @net_cat: the category bitmap in network/CIPSO format + * @net_cat_len: the length of the CIPSO bitmap in bytes + * @host_cat: the zero'd out category bitmap in host format + * @host_cat_len: the length of the host's category bitmap in bytes + * + * Description: + * Perform a label mapping to translate a CIPSO bitmap to the correct local + * MLS category bitmap using the given DOI definition. Returns the minimum + * size in bytes of the host bitmap on success, negative values otherwise. + * + */ +static int cipso_v4_map_cat_rbm_ntoh(const struct cipso_v4_doi *doi_def, + const unsigned char *net_cat, + u32 net_cat_len, + unsigned char *host_cat, + u32 host_cat_len) +{ + u32 host_spot; + u32 host_spot_max = 0; + int net_spot = -1; + u32 net_clen_bits = net_cat_len * 8; + u32 host_clen_bits = host_cat_len * 8; + u32 net_cat_size = doi_def->map.std->cat.cipso_size; + u32 *net_cat_array = doi_def->map.std->cat.cipso; + + switch (doi_def->type) { + case CIPSO_V4_MAP_PASS: + if (net_cat_len > host_cat_len) + return -EINVAL; + memcpy(host_cat, net_cat, net_cat_len); + return net_cat_len; + case CIPSO_V4_MAP_STD: + for (;;) { + net_spot = cipso_v4_bitmap_walk(net_cat, + net_clen_bits, + net_spot + 1, + 1); + if (net_spot < 0) + break; + if (net_spot >= net_cat_size || + net_cat_array[net_spot] >= CIPSO_V4_INV_CAT) + return -EPERM; + + host_spot = net_cat_array[net_spot]; + if (host_spot >= host_clen_bits) + return -ENOSPC; + cipso_v4_bitmap_setbit(host_cat, host_spot, 1); + + if (host_spot > host_spot_max) + host_spot_max = host_spot; + } + + if (net_spot == -2) + return -EFAULT; + + if (++host_spot_max % 8) + return host_spot_max / 8 + 1; + return host_spot_max / 8; + } + + return -EINVAL; +} + +/* + * Protocol Handling Functions + */ + +#define CIPSO_V4_HDR_LEN 6 + +/** + * cipso_v4_gentag_hdr - Generate a CIPSO option header + * @doi_def: the DOI definition + * @len: the total tag length in bytes + * @buf: the CIPSO option buffer + * + * Description: + * Write a CIPSO header into the beginning of @buffer. Return zero on success, + * negative values on failure. + * + */ +static int cipso_v4_gentag_hdr(const struct cipso_v4_doi *doi_def, + u32 len, + unsigned char *buf) +{ + if (CIPSO_V4_HDR_LEN + len > 40) + return -ENOSPC; + + buf[0] = IPOPT_CIPSO; + buf[1] = CIPSO_V4_HDR_LEN + len; + *(u32 *)&buf[2] = htonl(doi_def->doi); + + return 0; +} + +#define CIPSO_V4_TAG1_CAT_LEN 30 + +/** + * cipso_v4_gentag_rbm - Generate a CIPSO restricted bitmap tag (type #1) + * @doi_def: the DOI definition + * @secattr: the security attributes + * @buffer: the option buffer + * @buffer_len: length of buffer in bytes + * + * Description: + * Generate a CIPSO option using the restricted bitmap tag, tag type #1. The + * actual buffer length may be larger than the indicated size due to + * translation between host and network category bitmaps. Returns zero on + * success, negative values on failure. + * + */ +static int cipso_v4_gentag_rbm(const struct cipso_v4_doi *doi_def, + const struct netlbl_lsm_secattr *secattr, + unsigned char **buffer, + u32 *buffer_len) +{ + int ret_val = -EPERM; + unsigned char *buf = NULL; + u32 buf_len; + u32 level; + + if (secattr->mls_cat) { + buf = kzalloc(CIPSO_V4_HDR_LEN + 4 + CIPSO_V4_TAG1_CAT_LEN, + GFP_ATOMIC); + if (buf == NULL) + return -ENOMEM; + + ret_val = cipso_v4_map_cat_rbm_hton(doi_def, + secattr->mls_cat, + secattr->mls_cat_len, + &buf[CIPSO_V4_HDR_LEN + 4], + CIPSO_V4_TAG1_CAT_LEN); + if (ret_val < 0) + goto gentag_failure; + + /* This will send packets using the "optimized" format when + * possibile as specified in section 3.4.2.6 of the + * CIPSO draft. */ + if (cipso_v4_rbm_optfmt && (ret_val > 0 && ret_val < 10)) + ret_val = 10; + + buf_len = 4 + ret_val; + } else { + buf = kzalloc(CIPSO_V4_HDR_LEN + 4, GFP_ATOMIC); + if (buf == NULL) + return -ENOMEM; + buf_len = 4; + } + + ret_val = cipso_v4_map_lvl_hton(doi_def, secattr->mls_lvl, &level); + if (ret_val != 0) + goto gentag_failure; + + ret_val = cipso_v4_gentag_hdr(doi_def, buf_len, buf); + if (ret_val != 0) + goto gentag_failure; + + buf[CIPSO_V4_HDR_LEN] = 0x01; + buf[CIPSO_V4_HDR_LEN + 1] = buf_len; + buf[CIPSO_V4_HDR_LEN + 3] = level; + + *buffer = buf; + *buffer_len = CIPSO_V4_HDR_LEN + buf_len; + + return 0; + +gentag_failure: + kfree(buf); + return ret_val; +} + +/** + * cipso_v4_parsetag_rbm - Parse a CIPSO restricted bitmap tag + * @doi_def: the DOI definition + * @tag: the CIPSO tag + * @secattr: the security attributes + * + * Description: + * Parse a CIPSO restricted bitmap tag (tag type #1) and return the security + * attributes in @secattr. Return zero on success, negatives values on + * failure. + * + */ +static int cipso_v4_parsetag_rbm(const struct cipso_v4_doi *doi_def, + const unsigned char *tag, + struct netlbl_lsm_secattr *secattr) +{ + int ret_val; + u8 tag_len = tag[1]; + u32 level; + + ret_val = cipso_v4_map_lvl_ntoh(doi_def, tag[3], &level); + if (ret_val != 0) + return ret_val; + secattr->mls_lvl = level; + secattr->mls_lvl_vld = 1; + + if (tag_len > 4) { + switch (doi_def->type) { + case CIPSO_V4_MAP_PASS: + secattr->mls_cat_len = tag_len - 4; + break; + case CIPSO_V4_MAP_STD: + secattr->mls_cat_len = + doi_def->map.std->cat.local_size; + break; + } + secattr->mls_cat = kzalloc(secattr->mls_cat_len, GFP_ATOMIC); + if (secattr->mls_cat == NULL) + return -ENOMEM; + + ret_val = cipso_v4_map_cat_rbm_ntoh(doi_def, + &tag[4], + tag_len - 4, + secattr->mls_cat, + secattr->mls_cat_len); + if (ret_val < 0) { + kfree(secattr->mls_cat); + return ret_val; + } + secattr->mls_cat_len = ret_val; + } + + return 0; +} + +/** + * cipso_v4_validate - Validate a CIPSO option + * @option: the start of the option, on error it is set to point to the error + * + * Description: + * This routine is called to validate a CIPSO option, it checks all of the + * fields to ensure that they are at least valid, see the draft snippet below + * for details. If the option is valid then a zero value is returned and + * the value of @option is unchanged. If the option is invalid then a + * non-zero value is returned and @option is adjusted to point to the + * offending portion of the option. From the IETF draft ... + * + * "If any field within the CIPSO options, such as the DOI identifier, is not + * recognized the IP datagram is discarded and an ICMP 'parameter problem' + * (type 12) is generated and returned. The ICMP code field is set to 'bad + * parameter' (code 0) and the pointer is set to the start of the CIPSO field + * that is unrecognized." + * + */ +int cipso_v4_validate(unsigned char **option) +{ + unsigned char *opt = *option; + unsigned char *tag; + unsigned char opt_iter; + unsigned char err_offset = 0; + u8 opt_len; + u8 tag_len; + struct cipso_v4_doi *doi_def = NULL; + u32 tag_iter; + + /* caller already checks for length values that are too large */ + opt_len = opt[1]; + if (opt_len < 8) { + err_offset = 1; + goto validate_return; + } + + rcu_read_lock(); + doi_def = cipso_v4_doi_getdef(ntohl(*((u32 *)&opt[2]))); + if (doi_def == NULL) { + err_offset = 2; + goto validate_return_locked; + } + + opt_iter = 6; + tag = opt + opt_iter; + while (opt_iter < opt_len) { + for (tag_iter = 0; doi_def->tags[tag_iter] != tag[0];) + if (doi_def->tags[tag_iter] == CIPSO_V4_TAG_INVALID || + ++tag_iter == CIPSO_V4_TAG_MAXCNT) { + err_offset = opt_iter; + goto validate_return_locked; + } + + tag_len = tag[1]; + if (tag_len > (opt_len - opt_iter)) { + err_offset = opt_iter + 1; + goto validate_return_locked; + } + + switch (tag[0]) { + case CIPSO_V4_TAG_RBITMAP: + if (tag_len < 4) { + err_offset = opt_iter + 1; + goto validate_return_locked; + } + + /* We are already going to do all the verification + * necessary at the socket layer so from our point of + * view it is safe to turn these checks off (and less + * work), however, the CIPSO draft says we should do + * all the CIPSO validations here but it doesn't + * really specify _exactly_ what we need to validate + * ... so, just make it a sysctl tunable. */ + if (cipso_v4_rbm_strictvalid) { + if (cipso_v4_map_lvl_valid(doi_def, + tag[3]) < 0) { + err_offset = opt_iter + 3; + goto validate_return_locked; + } + if (tag_len > 4 && + cipso_v4_map_cat_rbm_valid(doi_def, + &tag[4], + tag_len - 4) < 0) { + err_offset = opt_iter + 4; + goto validate_return_locked; + } + } + break; + default: + err_offset = opt_iter; + goto validate_return_locked; + } + + tag += tag_len; + opt_iter += tag_len; + } + +validate_return_locked: + rcu_read_unlock(); +validate_return: + *option = opt + err_offset; + return err_offset; +} + +/** + * cipso_v4_error - Send the correct reponse for a bad packet + * @skb: the packet + * @error: the error code + * @gateway: CIPSO gateway flag + * + * Description: + * Based on the error code given in @error, send an ICMP error message back to + * the originating host. From the IETF draft ... + * + * "If the contents of the CIPSO [option] are valid but the security label is + * outside of the configured host or port label range, the datagram is + * discarded and an ICMP 'destination unreachable' (type 3) is generated and + * returned. The code field of the ICMP is set to 'communication with + * destination network administratively prohibited' (code 9) or to + * 'communication with destination host administratively prohibited' + * (code 10). The value of the code is dependent on whether the originator + * of the ICMP message is acting as a CIPSO host or a CIPSO gateway. The + * recipient of the ICMP message MUST be able to handle either value. The + * same procedure is performed if a CIPSO [option] can not be added to an + * IP packet because it is too large to fit in the IP options area." + * + * "If the error is triggered by receipt of an ICMP message, the message is + * discarded and no response is permitted (consistent with general ICMP + * processing rules)." + * + */ +void cipso_v4_error(struct sk_buff *skb, int error, u32 gateway) +{ + if (skb->nh.iph->protocol == IPPROTO_ICMP || error != -EACCES) + return; + + if (gateway) + icmp_send(skb, ICMP_DEST_UNREACH, ICMP_NET_ANO, 0); + else + icmp_send(skb, ICMP_DEST_UNREACH, ICMP_HOST_ANO, 0); +} + +/** + * cipso_v4_socket_setattr - Add a CIPSO option to a socket + * @sock: the socket + * @doi_def: the CIPSO DOI to use + * @secattr: the specific security attributes of the socket + * + * Description: + * Set the CIPSO option on the given socket using the DOI definition and + * security attributes passed to the function. This function requires + * exclusive access to @sock->sk, which means it either needs to be in the + * process of being created or locked via lock_sock(sock->sk). Returns zero on + * success and negative values on failure. + * + */ +int cipso_v4_socket_setattr(const struct socket *sock, + const struct cipso_v4_doi *doi_def, + const struct netlbl_lsm_secattr *secattr) +{ + int ret_val = -EPERM; + u32 iter; + unsigned char *buf = NULL; + u32 buf_len = 0; + u32 opt_len; + struct ip_options *opt = NULL; + struct sock *sk; + struct inet_sock *sk_inet; + struct inet_connection_sock *sk_conn; + + /* In the case of sock_create_lite(), the sock->sk field is not + * defined yet but it is not a problem as the only users of these + * "lite" PF_INET sockets are functions which do an accept() call + * afterwards so we will label the socket as part of the accept(). */ + sk = sock->sk; + if (sk == NULL) + return 0; + + /* XXX - This code assumes only one tag per CIPSO option which isn't + * really a good assumption to make but since we only support the MAC + * tags right now it is a safe assumption. */ + iter = 0; + do { + switch (doi_def->tags[iter]) { + case CIPSO_V4_TAG_RBITMAP: + ret_val = cipso_v4_gentag_rbm(doi_def, + secattr, + &buf, + &buf_len); + break; + default: + ret_val = -EPERM; + goto socket_setattr_failure; + } + + iter++; + } while (ret_val != 0 && + iter < CIPSO_V4_TAG_MAXCNT && + doi_def->tags[iter] != CIPSO_V4_TAG_INVALID); + if (ret_val != 0) + goto socket_setattr_failure; + + /* We can't use ip_options_get() directly because it makes a call to + * ip_options_get_alloc() which allocates memory with GFP_KERNEL and + * we can't block here. */ + opt_len = (buf_len + 3) & ~3; + opt = kzalloc(sizeof(*opt) + opt_len, GFP_ATOMIC); + if (opt == NULL) { + ret_val = -ENOMEM; + goto socket_setattr_failure; + } + memcpy(opt->__data, buf, buf_len); + opt->optlen = opt_len; + opt->is_data = 1; + kfree(buf); + buf = NULL; + ret_val = ip_options_compile(opt, NULL); + if (ret_val != 0) + goto socket_setattr_failure; + + sk_inet = inet_sk(sk); + if (sk_inet->is_icsk) { + sk_conn = inet_csk(sk); + if (sk_inet->opt) + sk_conn->icsk_ext_hdr_len -= sk_inet->opt->optlen; + sk_conn->icsk_ext_hdr_len += opt->optlen; + sk_conn->icsk_sync_mss(sk, sk_conn->icsk_pmtu_cookie); + } + opt = xchg(&sk_inet->opt, opt); + kfree(opt); + + return 0; + +socket_setattr_failure: + kfree(buf); + kfree(opt); + return ret_val; +} + +/** + * cipso_v4_socket_getattr - Get the security attributes from a socket + * @sock: the socket + * @secattr: the security attributes + * + * Description: + * Query @sock to see if there is a CIPSO option attached to the socket and if + * there is return the CIPSO security attributes in @secattr. Returns zero on + * success and negative values on failure. + * + */ +int cipso_v4_socket_getattr(const struct socket *sock, + struct netlbl_lsm_secattr *secattr) +{ + int ret_val = -ENOMSG; + struct sock *sk; + struct inet_sock *sk_inet; + unsigned char *cipso_ptr; + u32 doi; + struct cipso_v4_doi *doi_def; + + sk = sock->sk; + lock_sock(sk); + sk_inet = inet_sk(sk); + if (sk_inet->opt == NULL || sk_inet->opt->cipso == 0) + goto socket_getattr_return; + cipso_ptr = sk_inet->opt->__data + sk_inet->opt->cipso - + sizeof(struct iphdr); + ret_val = cipso_v4_cache_check(cipso_ptr, cipso_ptr[1], secattr); + if (ret_val == 0) + goto socket_getattr_return; + + doi = ntohl(*(u32 *)&cipso_ptr[2]); + rcu_read_lock(); + doi_def = cipso_v4_doi_getdef(doi); + if (doi_def == NULL) { + rcu_read_unlock(); + goto socket_getattr_return; + } + switch (cipso_ptr[6]) { + case CIPSO_V4_TAG_RBITMAP: + ret_val = cipso_v4_parsetag_rbm(doi_def, + &cipso_ptr[6], + secattr); + break; + } + rcu_read_unlock(); + +socket_getattr_return: + release_sock(sk); + return ret_val; +} + +/** + * cipso_v4_skbuff_getattr - Get the security attributes from the CIPSO option + * @skb: the packet + * @secattr: the security attributes + * + * Description: + * Parse the given packet's CIPSO option and return the security attributes. + * Returns zero on success and negative values on failure. + * + */ +int cipso_v4_skbuff_getattr(const struct sk_buff *skb, + struct netlbl_lsm_secattr *secattr) +{ + int ret_val = -ENOMSG; + unsigned char *cipso_ptr; + u32 doi; + struct cipso_v4_doi *doi_def; + + if (!CIPSO_V4_OPTEXIST(skb)) + return -ENOMSG; + cipso_ptr = CIPSO_V4_OPTPTR(skb); + if (cipso_v4_cache_check(cipso_ptr, cipso_ptr[1], secattr) == 0) + return 0; + + doi = ntohl(*(u32 *)&cipso_ptr[2]); + rcu_read_lock(); + doi_def = cipso_v4_doi_getdef(doi); + if (doi_def == NULL) + goto skbuff_getattr_return; + switch (cipso_ptr[6]) { + case CIPSO_V4_TAG_RBITMAP: + ret_val = cipso_v4_parsetag_rbm(doi_def, + &cipso_ptr[6], + secattr); + break; + } + +skbuff_getattr_return: + rcu_read_unlock(); + return ret_val; +} + +/* + * Setup Functions + */ + +/** + * cipso_v4_init - Initialize the CIPSO module + * + * Description: + * Initialize the CIPSO module and prepare it for use. Returns zero on success + * and negative values on failure. + * + */ +static int __init cipso_v4_init(void) +{ + int ret_val; + + ret_val = cipso_v4_cache_init(); + if (ret_val != 0) + panic("Failed to initialize the CIPSO/IPv4 cache (%d)\n", + ret_val); + + return 0; +} + +subsys_initcall(cipso_v4_init); diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index 70cea9d..19b2071 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -17,6 +17,7 @@ #include #include #include +#include /* From af_inet.c */ extern int sysctl_ip_nonlocal_bind; @@ -697,6 +698,40 @@ ctl_table ipv4_table[] = { .mode = 0644, .proc_handler = &proc_dointvec }, +#ifdef CONFIG_NETLABEL + { + .ctl_name = NET_CIPSOV4_CACHE_ENABLE, + .procname = "cipso_cache_enable", + .data = &cipso_v4_cache_enabled, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = &proc_dointvec, + }, + { + .ctl_name = NET_CIPSOV4_CACHE_BUCKET_SIZE, + .procname = "cipso_cache_bucket_size", + .data = &cipso_v4_cache_bucketsize, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = &proc_dointvec, + }, + { + .ctl_name = NET_CIPSOV4_RBM_OPTFMT, + .procname = "cipso_rbm_optfmt", + .data = &cipso_v4_rbm_optfmt, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = &proc_dointvec, + }, + { + .ctl_name = NET_CIPSOV4_RBM_STRICTVALID, + .procname = "cipso_rbm_strictvalid", + .data = &cipso_v4_rbm_strictvalid, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = &proc_dointvec, + }, +#endif /* CONFIG_NETLABEL */ { .ctl_name = 0 } }; -- cgit v0.10.2 From d15c345fe3b8dfda0fa5a1d2143a35fffa746a43 Mon Sep 17 00:00:00 2001 From: Paul Moore Date: Thu, 3 Aug 2006 16:48:37 -0700 Subject: [NetLabel]: core NetLabel subsystem Add a new kernel subsystem, NetLabel, to provide explicit packet labeling services (CIPSO, RIPSO, etc.) to LSM developers. NetLabel is designed to work in conjunction with a LSM to intercept and decode security labels on incoming network packets as well as ensure that outgoing network packets are labeled according to the security mechanism employed by the LSM. The NetLabel subsystem is configured through a Generic NETLINK interface described in the header files included in this patch. Signed-off-by: Paul Moore Signed-off-by: David S. Miller diff --git a/net/Makefile b/net/Makefile index 065796f..ad4d14f 100644 --- a/net/Makefile +++ b/net/Makefile @@ -46,6 +46,7 @@ obj-$(CONFIG_IP_DCCP) += dccp/ obj-$(CONFIG_IP_SCTP) += sctp/ obj-$(CONFIG_IEEE80211) += ieee80211/ obj-$(CONFIG_TIPC) += tipc/ +obj-$(CONFIG_NETLABEL) += netlabel/ ifeq ($(CONFIG_NET),y) obj-$(CONFIG_SYSCTL) += sysctl_net.o diff --git a/net/netlabel/Kconfig b/net/netlabel/Kconfig new file mode 100644 index 0000000..fe23cb7 --- /dev/null +++ b/net/netlabel/Kconfig @@ -0,0 +1,14 @@ +# +# NetLabel configuration +# + +config NETLABEL + bool "NetLabel subsystem support" + depends on NET && SECURITY + default n + ---help--- + NetLabel provides support for explicit network packet labeling + protocols such as CIPSO and RIPSO. For more information see + Documentation/netlabel. + + If you are unsure, say N. diff --git a/net/netlabel/Makefile b/net/netlabel/Makefile new file mode 100644 index 0000000..8af18c0 --- /dev/null +++ b/net/netlabel/Makefile @@ -0,0 +1,16 @@ +# +# Makefile for the NetLabel subsystem. +# +# Feb 9, 2006, Paul Moore +# + +# base objects +obj-y := netlabel_user.o netlabel_kapi.o netlabel_domainhash.o + +# management objects +obj-y += netlabel_mgmt.o + +# protocol modules +obj-y += netlabel_unlabeled.o +obj-y += netlabel_cipso_v4.o + diff --git a/net/netlabel/netlabel_cipso_v4.h b/net/netlabel/netlabel_cipso_v4.h new file mode 100644 index 0000000..4c6ff4b --- /dev/null +++ b/net/netlabel/netlabel_cipso_v4.h @@ -0,0 +1,217 @@ +/* + * NetLabel CIPSO/IPv4 Support + * + * This file defines the CIPSO/IPv4 functions for the NetLabel system. The + * NetLabel system manages static and dynamic label mappings for network + * protocols such as CIPSO and RIPSO. + * + * Author: Paul Moore + * + */ + +/* + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef _NETLABEL_CIPSO_V4 +#define _NETLABEL_CIPSO_V4 + +#include + +/* + * The following NetLabel payloads are supported by the CIPSO subsystem, all + * of which are preceeded by the nlmsghdr struct. + * + * o ACK: + * Sent by the kernel in response to an applications message, applications + * should never send this message. + * + * +----------------------+-----------------------+ + * | seq number (32 bits) | return code (32 bits) | + * +----------------------+-----------------------+ + * + * seq number: the sequence number of the original message, taken from the + * nlmsghdr structure + * return code: return value, based on errno values + * + * o ADD: + * Sent by an application to add a new DOI mapping table, after completion + * of the task the kernel should ACK this message. + * + * +---------------+--------------------+---------------------+ + * | DOI (32 bits) | map type (32 bits) | tag count (32 bits) | ... + * +---------------+--------------------+---------------------+ + * + * +-----------------+ + * | tag #X (8 bits) | ... repeated + * +-----------------+ + * + * +-------------- ---- --- -- - + * | mapping data + * +-------------- ---- --- -- - + * + * DOI: the DOI value + * map type: the mapping table type (defined in the cipso_ipv4.h header + * as CIPSO_V4_MAP_*) + * tag count: the number of tags, must be greater than zero + * tag: the CIPSO tag for the DOI, tags listed first are given + * higher priorirty when sending packets + * mapping data: specific to the map type (see below) + * + * CIPSO_V4_MAP_STD + * + * +------------------+-----------------------+----------------------+ + * | levels (32 bits) | max l level (32 bits) | max r level (8 bits) | ... + * +------------------+-----------------------+----------------------+ + * + * +----------------------+---------------------+---------------------+ + * | categories (32 bits) | max l cat (32 bits) | max r cat (16 bits) | ... + * +----------------------+---------------------+---------------------+ + * + * +--------------------------+-------------------------+ + * | local level #X (32 bits) | CIPSO level #X (8 bits) | ... repeated + * +--------------------------+-------------------------+ + * + * +-----------------------------+-----------------------------+ + * | local category #X (32 bits) | CIPSO category #X (16 bits) | ... repeated + * +-----------------------------+-----------------------------+ + * + * levels: the number of level mappings + * max l level: the highest local level + * max r level: the highest remote/CIPSO level + * categories: the number of category mappings + * max l cat: the highest local category + * max r cat: the highest remote/CIPSO category + * local level: the local part of a level mapping + * CIPSO level: the remote/CIPSO part of a level mapping + * local category: the local part of a category mapping + * CIPSO category: the remote/CIPSO part of a category mapping + * + * CIPSO_V4_MAP_PASS + * + * No mapping data is needed for this map type. + * + * o REMOVE: + * Sent by an application to remove a specific DOI mapping table from the + * CIPSO V4 system. The kernel should ACK this message. + * + * +---------------+ + * | DOI (32 bits) | + * +---------------+ + * + * DOI: the DOI value + * + * o LIST: + * Sent by an application to list the details of a DOI definition. The + * kernel should send an ACK on error or a response as indicated below. The + * application generated message format is shown below. + * + * +---------------+ + * | DOI (32 bits) | + * +---------------+ + * + * DOI: the DOI value + * + * The valid response message format depends on the type of the DOI mapping, + * the known formats are shown below. + * + * +--------------------+ + * | map type (32 bits) | ... + * +--------------------+ + * + * map type: the DOI mapping table type (defined in the cipso_ipv4.h + * header as CIPSO_V4_MAP_*) + * + * (map type == CIPSO_V4_MAP_STD) + * + * +----------------+------------------+----------------------+ + * | tags (32 bits) | levels (32 bits) | categories (32 bits) | ... + * +----------------+------------------+----------------------+ + * + * +-----------------+ + * | tag #X (8 bits) | ... repeated + * +-----------------+ + * + * +--------------------------+-------------------------+ + * | local level #X (32 bits) | CIPSO level #X (8 bits) | ... repeated + * +--------------------------+-------------------------+ + * + * +-----------------------------+-----------------------------+ + * | local category #X (32 bits) | CIPSO category #X (16 bits) | ... repeated + * +-----------------------------+-----------------------------+ + * + * tags: the number of CIPSO tag types + * levels: the number of level mappings + * categories: the number of category mappings + * tag: the tag number, tags listed first are given higher + * priority when sending packets + * local level: the local part of a level mapping + * CIPSO level: the remote/CIPSO part of a level mapping + * local category: the local part of a category mapping + * CIPSO category: the remote/CIPSO part of a category mapping + * + * (map type == CIPSO_V4_MAP_PASS) + * + * +----------------+ + * | tags (32 bits) | ... + * +----------------+ + * + * +-----------------+ + * | tag #X (8 bits) | ... repeated + * +-----------------+ + * + * tags: the number of CIPSO tag types + * tag: the tag number, tags listed first are given higher + * priority when sending packets + * + * o LISTALL: + * This message is sent by an application to list the valid DOIs on the + * system. There is no payload and the kernel should respond with an ACK + * or the following message. + * + * +---------------------+------------------+-----------------------+ + * | DOI count (32 bits) | DOI #X (32 bits) | map type #X (32 bits) | + * +---------------------+------------------+-----------------------+ + * + * +-----------------------+ + * | map type #X (32 bits) | ... + * +-----------------------+ + * + * DOI count: the number of DOIs + * DOI: the DOI value + * map type: the DOI mapping table type (defined in the cipso_ipv4.h + * header as CIPSO_V4_MAP_*) + * + */ + +/* NetLabel CIPSOv4 commands */ +enum { + NLBL_CIPSOV4_C_UNSPEC, + NLBL_CIPSOV4_C_ACK, + NLBL_CIPSOV4_C_ADD, + NLBL_CIPSOV4_C_REMOVE, + NLBL_CIPSOV4_C_LIST, + NLBL_CIPSOV4_C_LISTALL, + __NLBL_CIPSOV4_C_MAX, +}; +#define NLBL_CIPSOV4_C_MAX (__NLBL_CIPSOV4_C_MAX - 1) + +/* NetLabel protocol functions */ +int netlbl_cipsov4_genl_init(void); + +#endif diff --git a/net/netlabel/netlabel_domainhash.c b/net/netlabel/netlabel_domainhash.c new file mode 100644 index 0000000..5bb3fad --- /dev/null +++ b/net/netlabel/netlabel_domainhash.c @@ -0,0 +1,513 @@ +/* + * NetLabel Domain Hash Table + * + * This file manages the domain hash table that NetLabel uses to determine + * which network labeling protocol to use for a given domain. The NetLabel + * system manages static and dynamic label mappings for network protocols such + * as CIPSO and RIPSO. + * + * Author: Paul Moore + * + */ + +/* + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "netlabel_mgmt.h" +#include "netlabel_domainhash.h" + +struct netlbl_domhsh_tbl { + struct list_head *tbl; + u32 size; +}; + +/* Domain hash table */ +/* XXX - updates should be so rare that having one spinlock for the entire + * hash table should be okay */ +DEFINE_SPINLOCK(netlbl_domhsh_lock); +static struct netlbl_domhsh_tbl *netlbl_domhsh = NULL; + +/* Default domain mapping */ +DEFINE_SPINLOCK(netlbl_domhsh_def_lock); +static struct netlbl_dom_map *netlbl_domhsh_def = NULL; + +/* + * Domain Hash Table Helper Functions + */ + +/** + * netlbl_domhsh_free_entry - Frees a domain hash table entry + * @entry: the entry's RCU field + * + * Description: + * This function is designed to be used as a callback to the call_rcu() + * function so that the memory allocated to a hash table entry can be released + * safely. + * + */ +static void netlbl_domhsh_free_entry(struct rcu_head *entry) +{ + struct netlbl_dom_map *ptr; + + ptr = container_of(entry, struct netlbl_dom_map, rcu); + kfree(ptr->domain); + kfree(ptr); +} + +/** + * netlbl_domhsh_hash - Hashing function for the domain hash table + * @domain: the domain name to hash + * + * Description: + * This is the hashing function for the domain hash table, it returns the + * correct bucket number for the domain. The caller is responsibile for + * calling the rcu_read_[un]lock() functions. + * + */ +static u32 netlbl_domhsh_hash(const char *key) +{ + u32 iter; + u32 val; + u32 len; + + /* This is taken (with slight modification) from + * security/selinux/ss/symtab.c:symhash() */ + + for (iter = 0, val = 0, len = strlen(key); iter < len; iter++) + val = (val << 4 | (val >> (8 * sizeof(u32) - 4))) ^ key[iter]; + return val & (rcu_dereference(netlbl_domhsh)->size - 1); +} + +/** + * netlbl_domhsh_search - Search for a domain entry + * @domain: the domain + * @def: return default if no match is found + * + * Description: + * Searches the domain hash table and returns a pointer to the hash table + * entry if found, otherwise NULL is returned. If @def is non-zero and a + * match is not found in the domain hash table the default mapping is returned + * if it exists. The caller is responsibile for the rcu hash table locks + * (i.e. the caller much call rcu_read_[un]lock()). + * + */ +static struct netlbl_dom_map *netlbl_domhsh_search(const char *domain, u32 def) +{ + u32 bkt; + struct netlbl_dom_map *iter; + + if (domain != NULL) { + bkt = netlbl_domhsh_hash(domain); + list_for_each_entry_rcu(iter, &netlbl_domhsh->tbl[bkt], list) + if (iter->valid && strcmp(iter->domain, domain) == 0) + return iter; + } + + if (def != 0) { + iter = rcu_dereference(netlbl_domhsh_def); + if (iter != NULL && iter->valid) + return iter; + } + + return NULL; +} + +/* + * Domain Hash Table Functions + */ + +/** + * netlbl_domhsh_init - Init for the domain hash + * @size: the number of bits to use for the hash buckets + * + * Description: + * Initializes the domain hash table, should be called only by + * netlbl_user_init() during initialization. Returns zero on success, non-zero + * values on error. + * + */ +int netlbl_domhsh_init(u32 size) +{ + u32 iter; + struct netlbl_domhsh_tbl *hsh_tbl; + + if (size == 0) + return -EINVAL; + + hsh_tbl = kmalloc(sizeof(*hsh_tbl), GFP_KERNEL); + if (hsh_tbl == NULL) + return -ENOMEM; + hsh_tbl->size = 1 << size; + hsh_tbl->tbl = kcalloc(hsh_tbl->size, + sizeof(struct list_head), + GFP_KERNEL); + if (hsh_tbl->tbl == NULL) { + kfree(hsh_tbl); + return -ENOMEM; + } + for (iter = 0; iter < hsh_tbl->size; iter++) + INIT_LIST_HEAD(&hsh_tbl->tbl[iter]); + + rcu_read_lock(); + spin_lock(&netlbl_domhsh_lock); + rcu_assign_pointer(netlbl_domhsh, hsh_tbl); + spin_unlock(&netlbl_domhsh_lock); + rcu_read_unlock(); + + return 0; +} + +/** + * netlbl_domhsh_add - Adds a entry to the domain hash table + * @entry: the entry to add + * + * Description: + * Adds a new entry to the domain hash table and handles any updates to the + * lower level protocol handler (i.e. CIPSO). Returns zero on success, + * negative on failure. + * + */ +int netlbl_domhsh_add(struct netlbl_dom_map *entry) +{ + int ret_val; + u32 bkt; + + switch (entry->type) { + case NETLBL_NLTYPE_UNLABELED: + ret_val = 0; + break; + case NETLBL_NLTYPE_CIPSOV4: + ret_val = cipso_v4_doi_domhsh_add(entry->type_def.cipsov4, + entry->domain); + break; + default: + return -EINVAL; + } + if (ret_val != 0) + return ret_val; + + entry->valid = 1; + INIT_RCU_HEAD(&entry->rcu); + + ret_val = 0; + rcu_read_lock(); + if (entry->domain != NULL) { + bkt = netlbl_domhsh_hash(entry->domain); + spin_lock(&netlbl_domhsh_lock); + if (netlbl_domhsh_search(entry->domain, 0) == NULL) + list_add_tail_rcu(&entry->list, + &netlbl_domhsh->tbl[bkt]); + else + ret_val = -EEXIST; + spin_unlock(&netlbl_domhsh_lock); + } else if (entry->domain == NULL) { + INIT_LIST_HEAD(&entry->list); + spin_lock(&netlbl_domhsh_def_lock); + if (rcu_dereference(netlbl_domhsh_def) == NULL) + rcu_assign_pointer(netlbl_domhsh_def, entry); + else + ret_val = -EEXIST; + spin_unlock(&netlbl_domhsh_def_lock); + } else + ret_val = -EINVAL; + rcu_read_unlock(); + + if (ret_val != 0) { + switch (entry->type) { + case NETLBL_NLTYPE_CIPSOV4: + if (cipso_v4_doi_domhsh_remove(entry->type_def.cipsov4, + entry->domain) != 0) + BUG(); + break; + } + } + + return ret_val; +} + +/** + * netlbl_domhsh_add_default - Adds the default entry to the domain hash table + * @entry: the entry to add + * + * Description: + * Adds a new default entry to the domain hash table and handles any updates + * to the lower level protocol handler (i.e. CIPSO). Returns zero on success, + * negative on failure. + * + */ +int netlbl_domhsh_add_default(struct netlbl_dom_map *entry) +{ + return netlbl_domhsh_add(entry); +} + +/** + * netlbl_domhsh_remove - Removes an entry from the domain hash table + * @domain: the domain to remove + * + * Description: + * Removes an entry from the domain hash table and handles any updates to the + * lower level protocol handler (i.e. CIPSO). Returns zero on success, + * negative on failure. + * + */ +int netlbl_domhsh_remove(const char *domain) +{ + int ret_val = -ENOENT; + struct netlbl_dom_map *entry; + + rcu_read_lock(); + if (domain != NULL) + entry = netlbl_domhsh_search(domain, 0); + else + entry = netlbl_domhsh_search(domain, 1); + if (entry == NULL) + goto remove_return; + switch (entry->type) { + case NETLBL_NLTYPE_UNLABELED: + break; + case NETLBL_NLTYPE_CIPSOV4: + ret_val = cipso_v4_doi_domhsh_remove(entry->type_def.cipsov4, + entry->domain); + if (ret_val != 0) + goto remove_return; + break; + } + ret_val = 0; + if (entry != rcu_dereference(netlbl_domhsh_def)) { + spin_lock(&netlbl_domhsh_lock); + if (entry->valid) { + entry->valid = 0; + list_del_rcu(&entry->list); + } else + ret_val = -ENOENT; + spin_unlock(&netlbl_domhsh_lock); + } else { + spin_lock(&netlbl_domhsh_def_lock); + if (entry->valid) { + entry->valid = 0; + rcu_assign_pointer(netlbl_domhsh_def, NULL); + } else + ret_val = -ENOENT; + spin_unlock(&netlbl_domhsh_def_lock); + } + if (ret_val == 0) + call_rcu(&entry->rcu, netlbl_domhsh_free_entry); + +remove_return: + rcu_read_unlock(); + return ret_val; +} + +/** + * netlbl_domhsh_remove_default - Removes the default entry from the table + * + * Description: + * Removes/resets the default entry for the domain hash table and handles any + * updates to the lower level protocol handler (i.e. CIPSO). Returns zero on + * success, non-zero on failure. + * + */ +int netlbl_domhsh_remove_default(void) +{ + return netlbl_domhsh_remove(NULL); +} + +/** + * netlbl_domhsh_getentry - Get an entry from the domain hash table + * @domain: the domain name to search for + * + * Description: + * Look through the domain hash table searching for an entry to match @domain, + * return a pointer to a copy of the entry or NULL. The caller is responsibile + * for ensuring that rcu_read_[un]lock() is called. + * + */ +struct netlbl_dom_map *netlbl_domhsh_getentry(const char *domain) +{ + return netlbl_domhsh_search(domain, 1); +} + +/** + * netlbl_domhsh_dump - Dump the domain hash table into a sk_buff + * + * Description: + * Dump the domain hash table into a buffer suitable for returning to an + * application in response to a NetLabel management DOMAIN message. This + * function may fail if another process is growing the hash table at the same + * time. The returned sk_buff has room at the front of the sk_buff for + * @headroom bytes. See netlabel.h for the DOMAIN message format. Returns a + * pointer to a sk_buff on success, NULL on error. + * + */ +struct sk_buff *netlbl_domhsh_dump(size_t headroom) +{ + struct sk_buff *skb = NULL; + ssize_t buf_len; + u32 bkt_iter; + u32 dom_cnt = 0; + struct netlbl_domhsh_tbl *hsh_tbl; + struct netlbl_dom_map *list_iter; + ssize_t tmp_len; + + buf_len = NETLBL_LEN_U32; + rcu_read_lock(); + hsh_tbl = rcu_dereference(netlbl_domhsh); + for (bkt_iter = 0; bkt_iter < hsh_tbl->size; bkt_iter++) + list_for_each_entry_rcu(list_iter, + &hsh_tbl->tbl[bkt_iter], list) { + buf_len += NETLBL_LEN_U32 + + nla_total_size(strlen(list_iter->domain) + 1); + switch (list_iter->type) { + case NETLBL_NLTYPE_UNLABELED: + break; + case NETLBL_NLTYPE_CIPSOV4: + buf_len += 2 * NETLBL_LEN_U32; + break; + } + dom_cnt++; + } + + skb = netlbl_netlink_alloc_skb(headroom, buf_len, GFP_ATOMIC); + if (skb == NULL) + goto dump_failure; + + if (nla_put_u32(skb, NLA_U32, dom_cnt) != 0) + goto dump_failure; + buf_len -= NETLBL_LEN_U32; + hsh_tbl = rcu_dereference(netlbl_domhsh); + for (bkt_iter = 0; bkt_iter < hsh_tbl->size; bkt_iter++) + list_for_each_entry_rcu(list_iter, + &hsh_tbl->tbl[bkt_iter], list) { + tmp_len = nla_total_size(strlen(list_iter->domain) + + 1); + if (buf_len < NETLBL_LEN_U32 + tmp_len) + goto dump_failure; + if (nla_put_string(skb, + NLA_STRING, + list_iter->domain) != 0) + goto dump_failure; + if (nla_put_u32(skb, NLA_U32, list_iter->type) != 0) + goto dump_failure; + buf_len -= NETLBL_LEN_U32 + tmp_len; + switch (list_iter->type) { + case NETLBL_NLTYPE_UNLABELED: + break; + case NETLBL_NLTYPE_CIPSOV4: + if (buf_len < 2 * NETLBL_LEN_U32) + goto dump_failure; + if (nla_put_u32(skb, + NLA_U32, + list_iter->type_def.cipsov4->type) != 0) + goto dump_failure; + if (nla_put_u32(skb, + NLA_U32, + list_iter->type_def.cipsov4->doi) != 0) + goto dump_failure; + buf_len -= 2 * NETLBL_LEN_U32; + break; + } + } + rcu_read_unlock(); + + return skb; + +dump_failure: + rcu_read_unlock(); + kfree_skb(skb); + return NULL; +} + +/** + * netlbl_domhsh_dump_default - Dump the default domain mapping into a sk_buff + * + * Description: + * Dump the default domain mapping into a buffer suitable for returning to an + * application in response to a NetLabel management DEFDOMAIN message. This + * function may fail if another process is changing the default domain mapping + * at the same time. The returned sk_buff has room at the front of the + * skb_buff for @headroom bytes. See netlabel.h for the DEFDOMAIN message + * format. Returns a pointer to a sk_buff on success, NULL on error. + * + */ +struct sk_buff *netlbl_domhsh_dump_default(size_t headroom) +{ + struct sk_buff *skb; + ssize_t buf_len; + struct netlbl_dom_map *entry; + + buf_len = NETLBL_LEN_U32; + rcu_read_lock(); + entry = rcu_dereference(netlbl_domhsh_def); + if (entry != NULL) + switch (entry->type) { + case NETLBL_NLTYPE_UNLABELED: + break; + case NETLBL_NLTYPE_CIPSOV4: + buf_len += 2 * NETLBL_LEN_U32; + break; + } + + skb = netlbl_netlink_alloc_skb(headroom, buf_len, GFP_ATOMIC); + if (skb == NULL) + goto dump_default_failure; + + if (entry != rcu_dereference(netlbl_domhsh_def)) + goto dump_default_failure; + if (entry != NULL) { + if (nla_put_u32(skb, NLA_U32, entry->type) != 0) + goto dump_default_failure; + buf_len -= NETLBL_LEN_U32; + switch (entry->type) { + case NETLBL_NLTYPE_UNLABELED: + break; + case NETLBL_NLTYPE_CIPSOV4: + if (buf_len < 2 * NETLBL_LEN_U32) + goto dump_default_failure; + if (nla_put_u32(skb, + NLA_U32, + entry->type_def.cipsov4->type) != 0) + goto dump_default_failure; + if (nla_put_u32(skb, + NLA_U32, + entry->type_def.cipsov4->doi) != 0) + goto dump_default_failure; + buf_len -= 2 * NETLBL_LEN_U32; + break; + } + } else + nla_put_u32(skb, NLA_U32, NETLBL_NLTYPE_NONE); + rcu_read_unlock(); + + return skb; + +dump_default_failure: + rcu_read_unlock(); + kfree_skb(skb); + return NULL; +} diff --git a/net/netlabel/netlabel_domainhash.h b/net/netlabel/netlabel_domainhash.h new file mode 100644 index 0000000..9217863 --- /dev/null +++ b/net/netlabel/netlabel_domainhash.h @@ -0,0 +1,63 @@ +/* + * NetLabel Domain Hash Table + * + * This file manages the domain hash table that NetLabel uses to determine + * which network labeling protocol to use for a given domain. The NetLabel + * system manages static and dynamic label mappings for network protocols such + * as CIPSO and RIPSO. + * + * Author: Paul Moore + * + */ + +/* + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef _NETLABEL_DOMAINHASH_H +#define _NETLABEL_DOMAINHASH_H + +/* Domain hash table size */ +/* XXX - currently this number is an uneducated guess */ +#define NETLBL_DOMHSH_BITSIZE 7 + +/* Domain mapping definition struct */ +struct netlbl_dom_map { + char *domain; + u32 type; + union { + struct cipso_v4_doi *cipsov4; + } type_def; + + u32 valid; + struct list_head list; + struct rcu_head rcu; +}; + +/* init function */ +int netlbl_domhsh_init(u32 size); + +/* Manipulate the domain hash table */ +int netlbl_domhsh_add(struct netlbl_dom_map *entry); +int netlbl_domhsh_add_default(struct netlbl_dom_map *entry); +int netlbl_domhsh_remove_default(void); +struct netlbl_dom_map *netlbl_domhsh_getentry(const char *domain); +struct sk_buff *netlbl_domhsh_dump(size_t headroom); +struct sk_buff *netlbl_domhsh_dump_default(size_t headroom); + +#endif diff --git a/net/netlabel/netlabel_kapi.c b/net/netlabel/netlabel_kapi.c new file mode 100644 index 0000000..0fd8aaa --- /dev/null +++ b/net/netlabel/netlabel_kapi.c @@ -0,0 +1,231 @@ +/* + * NetLabel Kernel API + * + * This file defines the kernel API for the NetLabel system. The NetLabel + * system manages static and dynamic label mappings for network protocols such + * as CIPSO and RIPSO. + * + * Author: Paul Moore + * + */ + +/* + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include +#include +#include +#include +#include +#include + +#include "netlabel_domainhash.h" +#include "netlabel_unlabeled.h" +#include "netlabel_user.h" + +/* + * LSM Functions + */ + +/** + * netlbl_socket_setattr - Label a socket using the correct protocol + * @sock: the socket to label + * @secattr: the security attributes + * + * Description: + * Attach the correct label to the given socket using the security attributes + * specified in @secattr. This function requires exclusive access to + * @sock->sk, which means it either needs to be in the process of being + * created or locked via lock_sock(sock->sk). Returns zero on success, + * negative values on failure. + * + */ +int netlbl_socket_setattr(const struct socket *sock, + const struct netlbl_lsm_secattr *secattr) +{ + int ret_val = -ENOENT; + struct netlbl_dom_map *dom_entry; + + rcu_read_lock(); + dom_entry = netlbl_domhsh_getentry(secattr->domain); + if (dom_entry == NULL) + goto socket_setattr_return; + switch (dom_entry->type) { + case NETLBL_NLTYPE_CIPSOV4: + ret_val = cipso_v4_socket_setattr(sock, + dom_entry->type_def.cipsov4, + secattr); + break; + case NETLBL_NLTYPE_UNLABELED: + ret_val = 0; + break; + default: + ret_val = -ENOENT; + } + +socket_setattr_return: + rcu_read_unlock(); + return ret_val; +} + +/** + * netlbl_socket_getattr - Determine the security attributes of a socket + * @sock: the socket + * @secattr: the security attributes + * + * Description: + * Examines the given socket to see any NetLabel style labeling has been + * applied to the socket, if so it parses the socket label and returns the + * security attributes in @secattr. Returns zero on success, negative values + * on failure. + * + */ +int netlbl_socket_getattr(const struct socket *sock, + struct netlbl_lsm_secattr *secattr) +{ + int ret_val; + + ret_val = cipso_v4_socket_getattr(sock, secattr); + if (ret_val == 0) + return 0; + + return netlbl_unlabel_getattr(secattr); +} + +/** + * netlbl_skbuff_getattr - Determine the security attributes of a packet + * @skb: the packet + * @secattr: the security attributes + * + * Description: + * Examines the given packet to see if a recognized form of packet labeling + * is present, if so it parses the packet label and returns the security + * attributes in @secattr. Returns zero on success, negative values on + * failure. + * + */ +int netlbl_skbuff_getattr(const struct sk_buff *skb, + struct netlbl_lsm_secattr *secattr) +{ + int ret_val; + + ret_val = cipso_v4_skbuff_getattr(skb, secattr); + if (ret_val == 0) + return 0; + + return netlbl_unlabel_getattr(secattr); +} + +/** + * netlbl_skbuff_err - Handle a LSM error on a sk_buff + * @skb: the packet + * @error: the error code + * + * Description: + * Deal with a LSM problem when handling the packet in @skb, typically this is + * a permission denied problem (-EACCES). The correct action is determined + * according to the packet's labeling protocol. + * + */ +void netlbl_skbuff_err(struct sk_buff *skb, int error) +{ + if (CIPSO_V4_OPTEXIST(skb)) + cipso_v4_error(skb, error, 0); +} + +/** + * netlbl_cache_invalidate - Invalidate all of the NetLabel protocol caches + * + * Description: + * For all of the NetLabel protocols that support some form of label mapping + * cache, invalidate the cache. Returns zero on success, negative values on + * error. + * + */ +void netlbl_cache_invalidate(void) +{ + cipso_v4_cache_invalidate(); +} + +/** + * netlbl_cache_add - Add an entry to a NetLabel protocol cache + * @skb: the packet + * @secattr: the packet's security attributes + * + * Description: + * Add the LSM security attributes for the given packet to the underlying + * NetLabel protocol's label mapping cache. Returns zero on success, negative + * values on error. + * + */ +int netlbl_cache_add(const struct sk_buff *skb, + const struct netlbl_lsm_secattr *secattr) +{ + if (secattr->cache.data == NULL) + return -ENOMSG; + + if (CIPSO_V4_OPTEXIST(skb)) + return cipso_v4_cache_add(skb, secattr); + + return -ENOMSG; +} + +/* + * Setup Functions + */ + +/** + * netlbl_init - Initialize NetLabel + * + * Description: + * Perform the required NetLabel initialization before first use. + * + */ +static int __init netlbl_init(void) +{ + int ret_val; + + printk(KERN_INFO "NetLabel: Initializing\n"); + printk(KERN_INFO "NetLabel: domain hash size = %u\n", + (1 << NETLBL_DOMHSH_BITSIZE)); + printk(KERN_INFO "NetLabel: protocols =" + " UNLABELED" + " CIPSOv4" + "\n"); + + ret_val = netlbl_domhsh_init(NETLBL_DOMHSH_BITSIZE); + if (ret_val != 0) + goto init_failure; + + ret_val = netlbl_netlink_init(); + if (ret_val != 0) + goto init_failure; + + ret_val = netlbl_unlabel_defconf(); + if (ret_val != 0) + goto init_failure; + printk(KERN_INFO "NetLabel: unlabeled traffic allowed by default\n"); + + return 0; + +init_failure: + panic("NetLabel: failed to initialize properly (%d)\n", ret_val); +} + +subsys_initcall(netlbl_init); diff --git a/net/netlabel/netlabel_mgmt.c b/net/netlabel/netlabel_mgmt.c new file mode 100644 index 0000000..85bc11a --- /dev/null +++ b/net/netlabel/netlabel_mgmt.c @@ -0,0 +1,624 @@ +/* + * NetLabel Management Support + * + * This file defines the management functions for the NetLabel system. The + * NetLabel system manages static and dynamic label mappings for network + * protocols such as CIPSO and RIPSO. + * + * Author: Paul Moore + * + */ + +/* + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "netlabel_domainhash.h" +#include "netlabel_user.h" +#include "netlabel_mgmt.h" + +/* NetLabel Generic NETLINK CIPSOv4 family */ +static struct genl_family netlbl_mgmt_gnl_family = { + .id = GENL_ID_GENERATE, + .hdrsize = 0, + .name = NETLBL_NLTYPE_MGMT_NAME, + .version = NETLBL_PROTO_VERSION, + .maxattr = 0, +}; + + +/* + * NetLabel Command Handlers + */ + +/** + * netlbl_mgmt_add - Handle an ADD message + * @skb: the NETLINK buffer + * @info: the Generic NETLINK info block + * + * Description: + * Process a user generated ADD message and add the domains from the message + * to the hash table. See netlabel.h for a description of the message format. + * Returns zero on success, negative values on failure. + * + */ +static int netlbl_mgmt_add(struct sk_buff *skb, struct genl_info *info) +{ + int ret_val = -EINVAL; + struct nlattr *msg_ptr = netlbl_netlink_payload_data(skb); + int msg_len = netlbl_netlink_payload_len(skb); + u32 count; + struct netlbl_dom_map *entry = NULL; + u32 iter; + u32 tmp_val; + int tmp_size; + + ret_val = netlbl_netlink_cap_check(skb, CAP_NET_ADMIN); + if (ret_val != 0) + goto add_failure; + + if (msg_len < NETLBL_LEN_U32) + goto add_failure; + count = netlbl_getinc_u32(&msg_ptr, &msg_len); + + for (iter = 0; iter < count && msg_len > 0; iter++, entry = NULL) { + if (msg_len <= 0) { + ret_val = -EINVAL; + goto add_failure; + } + entry = kzalloc(sizeof(*entry), GFP_KERNEL); + if (entry == NULL) { + ret_val = -ENOMEM; + goto add_failure; + } + tmp_size = nla_len(msg_ptr); + if (tmp_size <= 0 || tmp_size > msg_len) { + ret_val = -EINVAL; + goto add_failure; + } + entry->domain = kmalloc(tmp_size, GFP_KERNEL); + if (entry->domain == NULL) { + ret_val = -ENOMEM; + goto add_failure; + } + nla_strlcpy(entry->domain, msg_ptr, tmp_size); + entry->domain[tmp_size - 1] = '\0'; + msg_ptr = nla_next(msg_ptr, &msg_len); + + if (msg_len < NETLBL_LEN_U32) { + ret_val = -EINVAL; + goto add_failure; + } + tmp_val = netlbl_getinc_u32(&msg_ptr, &msg_len); + entry->type = tmp_val; + switch (tmp_val) { + case NETLBL_NLTYPE_UNLABELED: + ret_val = netlbl_domhsh_add(entry); + break; + case NETLBL_NLTYPE_CIPSOV4: + if (msg_len < NETLBL_LEN_U32) { + ret_val = -EINVAL; + goto add_failure; + } + tmp_val = netlbl_getinc_u32(&msg_ptr, &msg_len); + /* We should be holding a rcu_read_lock() here + * while we hold the result but since the entry + * will always be deleted when the CIPSO DOI + * is deleted we aren't going to keep the lock. */ + rcu_read_lock(); + entry->type_def.cipsov4 = cipso_v4_doi_getdef(tmp_val); + if (entry->type_def.cipsov4 == NULL) { + rcu_read_unlock(); + ret_val = -EINVAL; + goto add_failure; + } + ret_val = netlbl_domhsh_add(entry); + rcu_read_unlock(); + break; + default: + ret_val = -EINVAL; + } + if (ret_val != 0) + goto add_failure; + } + + netlbl_netlink_send_ack(info, + netlbl_mgmt_gnl_family.id, + NLBL_MGMT_C_ACK, + NETLBL_E_OK); + return 0; + +add_failure: + if (entry) + kfree(entry->domain); + kfree(entry); + netlbl_netlink_send_ack(info, + netlbl_mgmt_gnl_family.id, + NLBL_MGMT_C_ACK, + -ret_val); + return ret_val; +} + +/** + * netlbl_mgmt_remove - Handle a REMOVE message + * @skb: the NETLINK buffer + * @info: the Generic NETLINK info block + * + * Description: + * Process a user generated REMOVE message and remove the specified domain + * mappings. Returns zero on success, negative values on failure. + * + */ +static int netlbl_mgmt_remove(struct sk_buff *skb, struct genl_info *info) +{ + int ret_val = -EINVAL; + struct nlattr *msg_ptr = netlbl_netlink_payload_data(skb); + int msg_len = netlbl_netlink_payload_len(skb); + u32 count; + u32 iter; + int tmp_size; + unsigned char *domain; + + ret_val = netlbl_netlink_cap_check(skb, CAP_NET_ADMIN); + if (ret_val != 0) + goto remove_return; + + if (msg_len < NETLBL_LEN_U32) + goto remove_return; + count = netlbl_getinc_u32(&msg_ptr, &msg_len); + + for (iter = 0; iter < count && msg_len > 0; iter++) { + if (msg_len <= 0) { + ret_val = -EINVAL; + goto remove_return; + } + tmp_size = nla_len(msg_ptr); + domain = nla_data(msg_ptr); + if (tmp_size <= 0 || tmp_size > msg_len || + domain[tmp_size - 1] != '\0') { + ret_val = -EINVAL; + goto remove_return; + } + ret_val = netlbl_domhsh_remove(domain); + if (ret_val != 0) + goto remove_return; + msg_ptr = nla_next(msg_ptr, &msg_len); + } + + ret_val = 0; + +remove_return: + netlbl_netlink_send_ack(info, + netlbl_mgmt_gnl_family.id, + NLBL_MGMT_C_ACK, + -ret_val); + return ret_val; +} + +/** + * netlbl_mgmt_list - Handle a LIST message + * @skb: the NETLINK buffer + * @info: the Generic NETLINK info block + * + * Description: + * Process a user generated LIST message and dumps the domain hash table in a + * form suitable for use in a kernel generated LIST message. Returns zero on + * success, negative values on failure. + * + */ +static int netlbl_mgmt_list(struct sk_buff *skb, struct genl_info *info) +{ + int ret_val = -ENOMEM; + struct sk_buff *ans_skb; + + ans_skb = netlbl_domhsh_dump(NLMSG_SPACE(GENL_HDRLEN)); + if (ans_skb == NULL) + goto list_failure; + netlbl_netlink_hdr_push(ans_skb, + info->snd_pid, + 0, + netlbl_mgmt_gnl_family.id, + NLBL_MGMT_C_LIST); + + ret_val = netlbl_netlink_snd(ans_skb, info->snd_pid); + if (ret_val != 0) + goto list_failure; + + return 0; + +list_failure: + netlbl_netlink_send_ack(info, + netlbl_mgmt_gnl_family.id, + NLBL_MGMT_C_ACK, + -ret_val); + return ret_val; +} + +/** + * netlbl_mgmt_adddef - Handle an ADDDEF message + * @skb: the NETLINK buffer + * @info: the Generic NETLINK info block + * + * Description: + * Process a user generated ADDDEF message and respond accordingly. Returns + * zero on success, negative values on failure. + * + */ +static int netlbl_mgmt_adddef(struct sk_buff *skb, struct genl_info *info) +{ + int ret_val = -EINVAL; + struct nlattr *msg_ptr = netlbl_netlink_payload_data(skb); + int msg_len = netlbl_netlink_payload_len(skb); + struct netlbl_dom_map *entry = NULL; + u32 tmp_val; + + ret_val = netlbl_netlink_cap_check(skb, CAP_NET_ADMIN); + if (ret_val != 0) + goto adddef_failure; + + if (msg_len < NETLBL_LEN_U32) + goto adddef_failure; + tmp_val = netlbl_getinc_u32(&msg_ptr, &msg_len); + + entry = kzalloc(sizeof(*entry), GFP_KERNEL); + if (entry == NULL) { + ret_val = -ENOMEM; + goto adddef_failure; + } + + entry->type = tmp_val; + switch (entry->type) { + case NETLBL_NLTYPE_UNLABELED: + ret_val = netlbl_domhsh_add_default(entry); + break; + case NETLBL_NLTYPE_CIPSOV4: + if (msg_len < NETLBL_LEN_U32) { + ret_val = -EINVAL; + goto adddef_failure; + } + tmp_val = netlbl_getinc_u32(&msg_ptr, &msg_len); + /* We should be holding a rcu_read_lock here while we + * hold the result but since the entry will always be + * deleted when the CIPSO DOI is deleted we are going + * to skip the lock. */ + rcu_read_lock(); + entry->type_def.cipsov4 = cipso_v4_doi_getdef(tmp_val); + if (entry->type_def.cipsov4 == NULL) { + rcu_read_unlock(); + ret_val = -EINVAL; + goto adddef_failure; + } + ret_val = netlbl_domhsh_add_default(entry); + rcu_read_unlock(); + break; + default: + ret_val = -EINVAL; + } + if (ret_val != 0) + goto adddef_failure; + + netlbl_netlink_send_ack(info, + netlbl_mgmt_gnl_family.id, + NLBL_MGMT_C_ACK, + NETLBL_E_OK); + return 0; + +adddef_failure: + kfree(entry); + netlbl_netlink_send_ack(info, + netlbl_mgmt_gnl_family.id, + NLBL_MGMT_C_ACK, + -ret_val); + return ret_val; +} + +/** + * netlbl_mgmt_removedef - Handle a REMOVEDEF message + * @skb: the NETLINK buffer + * @info: the Generic NETLINK info block + * + * Description: + * Process a user generated REMOVEDEF message and remove the default domain + * mapping. Returns zero on success, negative values on failure. + * + */ +static int netlbl_mgmt_removedef(struct sk_buff *skb, struct genl_info *info) +{ + int ret_val; + + ret_val = netlbl_netlink_cap_check(skb, CAP_NET_ADMIN); + if (ret_val != 0) + goto removedef_return; + + ret_val = netlbl_domhsh_remove_default(); + +removedef_return: + netlbl_netlink_send_ack(info, + netlbl_mgmt_gnl_family.id, + NLBL_MGMT_C_ACK, + -ret_val); + return ret_val; +} + +/** + * netlbl_mgmt_listdef - Handle a LISTDEF message + * @skb: the NETLINK buffer + * @info: the Generic NETLINK info block + * + * Description: + * Process a user generated LISTDEF message and dumps the default domain + * mapping in a form suitable for use in a kernel generated LISTDEF message. + * Returns zero on success, negative values on failure. + * + */ +static int netlbl_mgmt_listdef(struct sk_buff *skb, struct genl_info *info) +{ + int ret_val = -ENOMEM; + struct sk_buff *ans_skb; + + ans_skb = netlbl_domhsh_dump_default(NLMSG_SPACE(GENL_HDRLEN)); + if (ans_skb == NULL) + goto listdef_failure; + netlbl_netlink_hdr_push(ans_skb, + info->snd_pid, + 0, + netlbl_mgmt_gnl_family.id, + NLBL_MGMT_C_LISTDEF); + + ret_val = netlbl_netlink_snd(ans_skb, info->snd_pid); + if (ret_val != 0) + goto listdef_failure; + + return 0; + +listdef_failure: + netlbl_netlink_send_ack(info, + netlbl_mgmt_gnl_family.id, + NLBL_MGMT_C_ACK, + -ret_val); + return ret_val; +} + +/** + * netlbl_mgmt_modules - Handle a MODULES message + * @skb: the NETLINK buffer + * @info: the Generic NETLINK info block + * + * Description: + * Process a user generated MODULES message and respond accordingly. + * + */ +static int netlbl_mgmt_modules(struct sk_buff *skb, struct genl_info *info) +{ + int ret_val = -ENOMEM; + size_t data_size; + u32 mod_count; + struct sk_buff *ans_skb = NULL; + + /* unlabeled + cipsov4 */ + mod_count = 2; + + data_size = GENL_HDRLEN + NETLBL_LEN_U32 + mod_count * NETLBL_LEN_U32; + ans_skb = netlbl_netlink_alloc_skb(0, data_size, GFP_KERNEL); + if (ans_skb == NULL) + goto modules_failure; + + if (netlbl_netlink_hdr_put(ans_skb, + info->snd_pid, + 0, + netlbl_mgmt_gnl_family.id, + NLBL_MGMT_C_MODULES) == NULL) + goto modules_failure; + + ret_val = nla_put_u32(ans_skb, NLA_U32, mod_count); + if (ret_val != 0) + goto modules_failure; + ret_val = nla_put_u32(ans_skb, NLA_U32, NETLBL_NLTYPE_UNLABELED); + if (ret_val != 0) + goto modules_failure; + ret_val = nla_put_u32(ans_skb, NLA_U32, NETLBL_NLTYPE_CIPSOV4); + if (ret_val != 0) + goto modules_failure; + + ret_val = netlbl_netlink_snd(ans_skb, info->snd_pid); + if (ret_val != 0) + goto modules_failure; + + return 0; + +modules_failure: + kfree_skb(ans_skb); + netlbl_netlink_send_ack(info, + netlbl_mgmt_gnl_family.id, + NLBL_MGMT_C_ACK, + -ret_val); + return ret_val; +} + +/** + * netlbl_mgmt_version - Handle a VERSION message + * @skb: the NETLINK buffer + * @info: the Generic NETLINK info block + * + * Description: + * Process a user generated VERSION message and respond accordingly. Returns + * zero on success, negative values on failure. + * + */ +static int netlbl_mgmt_version(struct sk_buff *skb, struct genl_info *info) +{ + int ret_val = -ENOMEM; + struct sk_buff *ans_skb = NULL; + + ans_skb = netlbl_netlink_alloc_skb(0, + GENL_HDRLEN + NETLBL_LEN_U32, + GFP_KERNEL); + if (ans_skb == NULL) + goto version_failure; + if (netlbl_netlink_hdr_put(ans_skb, + info->snd_pid, + 0, + netlbl_mgmt_gnl_family.id, + NLBL_MGMT_C_VERSION) == NULL) + goto version_failure; + + ret_val = nla_put_u32(ans_skb, NLA_U32, NETLBL_PROTO_VERSION); + if (ret_val != 0) + goto version_failure; + + ret_val = netlbl_netlink_snd(ans_skb, info->snd_pid); + if (ret_val != 0) + goto version_failure; + + return 0; + +version_failure: + kfree_skb(ans_skb); + netlbl_netlink_send_ack(info, + netlbl_mgmt_gnl_family.id, + NLBL_MGMT_C_ACK, + -ret_val); + return ret_val; +} + + +/* + * NetLabel Generic NETLINK Command Definitions + */ + +static struct genl_ops netlbl_mgmt_genl_c_add = { + .cmd = NLBL_MGMT_C_ADD, + .flags = 0, + .doit = netlbl_mgmt_add, + .dumpit = NULL, +}; + +static struct genl_ops netlbl_mgmt_genl_c_remove = { + .cmd = NLBL_MGMT_C_REMOVE, + .flags = 0, + .doit = netlbl_mgmt_remove, + .dumpit = NULL, +}; + +static struct genl_ops netlbl_mgmt_genl_c_list = { + .cmd = NLBL_MGMT_C_LIST, + .flags = 0, + .doit = netlbl_mgmt_list, + .dumpit = NULL, +}; + +static struct genl_ops netlbl_mgmt_genl_c_adddef = { + .cmd = NLBL_MGMT_C_ADDDEF, + .flags = 0, + .doit = netlbl_mgmt_adddef, + .dumpit = NULL, +}; + +static struct genl_ops netlbl_mgmt_genl_c_removedef = { + .cmd = NLBL_MGMT_C_REMOVEDEF, + .flags = 0, + .doit = netlbl_mgmt_removedef, + .dumpit = NULL, +}; + +static struct genl_ops netlbl_mgmt_genl_c_listdef = { + .cmd = NLBL_MGMT_C_LISTDEF, + .flags = 0, + .doit = netlbl_mgmt_listdef, + .dumpit = NULL, +}; + +static struct genl_ops netlbl_mgmt_genl_c_modules = { + .cmd = NLBL_MGMT_C_MODULES, + .flags = 0, + .doit = netlbl_mgmt_modules, + .dumpit = NULL, +}; + +static struct genl_ops netlbl_mgmt_genl_c_version = { + .cmd = NLBL_MGMT_C_VERSION, + .flags = 0, + .doit = netlbl_mgmt_version, + .dumpit = NULL, +}; + +/* + * NetLabel Generic NETLINK Protocol Functions + */ + +/** + * netlbl_mgmt_genl_init - Register the NetLabel management component + * + * Description: + * Register the NetLabel management component with the Generic NETLINK + * mechanism. Returns zero on success, negative values on failure. + * + */ +int netlbl_mgmt_genl_init(void) +{ + int ret_val; + + ret_val = genl_register_family(&netlbl_mgmt_gnl_family); + if (ret_val != 0) + return ret_val; + + ret_val = genl_register_ops(&netlbl_mgmt_gnl_family, + &netlbl_mgmt_genl_c_add); + if (ret_val != 0) + return ret_val; + ret_val = genl_register_ops(&netlbl_mgmt_gnl_family, + &netlbl_mgmt_genl_c_remove); + if (ret_val != 0) + return ret_val; + ret_val = genl_register_ops(&netlbl_mgmt_gnl_family, + &netlbl_mgmt_genl_c_list); + if (ret_val != 0) + return ret_val; + ret_val = genl_register_ops(&netlbl_mgmt_gnl_family, + &netlbl_mgmt_genl_c_adddef); + if (ret_val != 0) + return ret_val; + ret_val = genl_register_ops(&netlbl_mgmt_gnl_family, + &netlbl_mgmt_genl_c_removedef); + if (ret_val != 0) + return ret_val; + ret_val = genl_register_ops(&netlbl_mgmt_gnl_family, + &netlbl_mgmt_genl_c_listdef); + if (ret_val != 0) + return ret_val; + ret_val = genl_register_ops(&netlbl_mgmt_gnl_family, + &netlbl_mgmt_genl_c_modules); + if (ret_val != 0) + return ret_val; + ret_val = genl_register_ops(&netlbl_mgmt_gnl_family, + &netlbl_mgmt_genl_c_version); + if (ret_val != 0) + return ret_val; + + return 0; +} diff --git a/net/netlabel/netlabel_mgmt.h b/net/netlabel/netlabel_mgmt.h new file mode 100644 index 0000000..fd6c6ac --- /dev/null +++ b/net/netlabel/netlabel_mgmt.h @@ -0,0 +1,246 @@ +/* + * NetLabel Management Support + * + * This file defines the management functions for the NetLabel system. The + * NetLabel system manages static and dynamic label mappings for network + * protocols such as CIPSO and RIPSO. + * + * Author: Paul Moore + * + */ + +/* + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef _NETLABEL_MGMT_H +#define _NETLABEL_MGMT_H + +#include + +/* + * The following NetLabel payloads are supported by the management interface, + * all of which are preceeded by the nlmsghdr struct. + * + * o ACK: + * Sent by the kernel in response to an applications message, applications + * should never send this message. + * + * +----------------------+-----------------------+ + * | seq number (32 bits) | return code (32 bits) | + * +----------------------+-----------------------+ + * + * seq number: the sequence number of the original message, taken from the + * nlmsghdr structure + * return code: return value, based on errno values + * + * o ADD: + * Sent by an application to add a domain mapping to the NetLabel system. + * The kernel should respond with an ACK. + * + * +-------------------+ + * | domains (32 bits) | ... + * +-------------------+ + * + * domains: the number of domains in the message + * + * +--------------------------+-------------------------+ + * | domain string (variable) | protocol type (32 bits) | ... + * +--------------------------+-------------------------+ + * + * +-------------- ---- --- -- - + * | mapping data ... repeated + * +-------------- ---- --- -- - + * + * domain string: the domain string, NULL terminated + * protocol type: the protocol type (defined by NETLBL_NLTYPE_*) + * mapping data: specific to the map type (see below) + * + * NETLBL_NLTYPE_UNLABELED + * + * No mapping data for this protocol type. + * + * NETLBL_NLTYPE_CIPSOV4 + * + * +---------------+ + * | doi (32 bits) | + * +---------------+ + * + * doi: the CIPSO DOI value + * + * o REMOVE: + * Sent by an application to remove a domain mapping from the NetLabel + * system. The kernel should ACK this message. + * + * +-------------------+ + * | domains (32 bits) | ... + * +-------------------+ + * + * domains: the number of domains in the message + * + * +--------------------------+ + * | domain string (variable) | ... + * +--------------------------+ + * + * domain string: the domain string, NULL terminated + * + * o LIST: + * This message can be sent either from an application or by the kernel in + * response to an application generated LIST message. When sent by an + * application there is no payload. The kernel should respond to a LIST + * message either with a LIST message on success or an ACK message on + * failure. + * + * +-------------------+ + * | domains (32 bits) | ... + * +-------------------+ + * + * domains: the number of domains in the message + * + * +--------------------------+ + * | domain string (variable) | ... + * +--------------------------+ + * + * +-------------------------+-------------- ---- --- -- - + * | protocol type (32 bits) | mapping data ... repeated + * +-------------------------+-------------- ---- --- -- - + * + * domain string: the domain string, NULL terminated + * protocol type: the protocol type (defined by NETLBL_NLTYPE_*) + * mapping data: specific to the map type (see below) + * + * NETLBL_NLTYPE_UNLABELED + * + * No mapping data for this protocol type. + * + * NETLBL_NLTYPE_CIPSOV4 + * + * +----------------+---------------+ + * | type (32 bits) | doi (32 bits) | + * +----------------+---------------+ + * + * type: the CIPSO mapping table type (defined in the cipso_ipv4.h header + * as CIPSO_V4_MAP_*) + * doi: the CIPSO DOI value + * + * o ADDDEF: + * Sent by an application to set the default domain mapping for the NetLabel + * system. The kernel should respond with an ACK. + * + * +-------------------------+-------------- ---- --- -- - + * | protocol type (32 bits) | mapping data ... repeated + * +-------------------------+-------------- ---- --- -- - + * + * protocol type: the protocol type (defined by NETLBL_NLTYPE_*) + * mapping data: specific to the map type (see below) + * + * NETLBL_NLTYPE_UNLABELED + * + * No mapping data for this protocol type. + * + * NETLBL_NLTYPE_CIPSOV4 + * + * +---------------+ + * | doi (32 bits) | + * +---------------+ + * + * doi: the CIPSO DOI value + * + * o REMOVEDEF: + * Sent by an application to remove the default domain mapping from the + * NetLabel system, there is no payload. The kernel should ACK this message. + * + * o LISTDEF: + * This message can be sent either from an application or by the kernel in + * response to an application generated LISTDEF message. When sent by an + * application there is no payload. The kernel should respond to a + * LISTDEF message either with a LISTDEF message on success or an ACK message + * on failure. + * + * +-------------------------+-------------- ---- --- -- - + * | protocol type (32 bits) | mapping data ... repeated + * +-------------------------+-------------- ---- --- -- - + * + * protocol type: the protocol type (defined by NETLBL_NLTYPE_*) + * mapping data: specific to the map type (see below) + * + * NETLBL_NLTYPE_UNLABELED + * + * No mapping data for this protocol type. + * + * NETLBL_NLTYPE_CIPSOV4 + * + * +----------------+---------------+ + * | type (32 bits) | doi (32 bits) | + * +----------------+---------------+ + * + * type: the CIPSO mapping table type (defined in the cipso_ipv4.h header + * as CIPSO_V4_MAP_*) + * doi: the CIPSO DOI value + * + * o MODULES: + * Sent by an application to request a list of configured NetLabel modules + * in the kernel. When sent by an application there is no payload. + * + * +-------------------+ + * | modules (32 bits) | ... + * +-------------------+ + * + * modules: the number of modules in the message, if this is an application + * generated message and the value is zero then return a list of + * the configured modules + * + * +------------------+ + * | module (32 bits) | ... repeated + * +------------------+ + * + * module: the module number as defined by NETLBL_NLTYPE_* + * + * o VERSION: + * Sent by an application to request the NetLabel version string. When sent + * by an application there is no payload. This message type is also used by + * the kernel to respond to an VERSION request. + * + * +-------------------+ + * | version (32 bits) | + * +-------------------+ + * + * version: the protocol version number + * + */ + +/* NetLabel Management commands */ +enum { + NLBL_MGMT_C_UNSPEC, + NLBL_MGMT_C_ACK, + NLBL_MGMT_C_ADD, + NLBL_MGMT_C_REMOVE, + NLBL_MGMT_C_LIST, + NLBL_MGMT_C_ADDDEF, + NLBL_MGMT_C_REMOVEDEF, + NLBL_MGMT_C_LISTDEF, + NLBL_MGMT_C_MODULES, + NLBL_MGMT_C_VERSION, + __NLBL_MGMT_C_MAX, +}; +#define NLBL_MGMT_C_MAX (__NLBL_MGMT_C_MAX - 1) + +/* NetLabel protocol functions */ +int netlbl_mgmt_genl_init(void); + +#endif diff --git a/net/netlabel/netlabel_unlabeled.h b/net/netlabel/netlabel_unlabeled.h new file mode 100644 index 0000000..f300e54 --- /dev/null +++ b/net/netlabel/netlabel_unlabeled.h @@ -0,0 +1,98 @@ +/* + * NetLabel Unlabeled Support + * + * This file defines functions for dealing with unlabeled packets for the + * NetLabel system. The NetLabel system manages static and dynamic label + * mappings for network protocols such as CIPSO and RIPSO. + * + * Author: Paul Moore + * + */ + +/* + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef _NETLABEL_UNLABELED_H +#define _NETLABEL_UNLABELED_H + +#include + +/* + * The following NetLabel payloads are supported by the Unlabeled subsystem. + * + * o ACK: + * Sent by the kernel in response to an applications message, applications + * should never send this message. + * + * +----------------------+-----------------------+ + * | seq number (32 bits) | return code (32 bits) | + * +----------------------+-----------------------+ + * + * seq number: the sequence number of the original message, taken from the + * nlmsghdr structure + * return code: return value, based on errno values + * + * o ACCEPT + * This message is sent from an application to specify if the kernel should + * allow unlabled packets to pass if they do not match any of the static + * mappings defined in the unlabeled module. + * + * +-----------------+ + * | allow (32 bits) | + * +-----------------+ + * + * allow: if true (1) then allow the packets to pass, if false (0) then + * reject the packets + * + * o LIST + * This message can be sent either from an application or by the kernel in + * response to an application generated LIST message. When sent by an + * application there is no payload. The kernel should respond to a LIST + * message either with a LIST message on success or an ACK message on + * failure. + * + * +-----------------------+ + * | accept flag (32 bits) | + * +-----------------------+ + * + * accept flag: if true (1) then unlabeled packets are allowed to pass, + * if false (0) then unlabeled packets are rejected + * + */ + +/* NetLabel Unlabeled commands */ +enum { + NLBL_UNLABEL_C_UNSPEC, + NLBL_UNLABEL_C_ACK, + NLBL_UNLABEL_C_ACCEPT, + NLBL_UNLABEL_C_LIST, + __NLBL_UNLABEL_C_MAX, +}; +#define NLBL_UNLABEL_C_MAX (__NLBL_UNLABEL_C_MAX - 1) + +/* NetLabel protocol functions */ +int netlbl_unlabel_genl_init(void); + +/* Process Unlabeled incoming network packets */ +int netlbl_unlabel_getattr(struct netlbl_lsm_secattr *secattr); + +/* Set the default configuration to allow Unlabeled packets */ +int netlbl_unlabel_defconf(void); + +#endif diff --git a/net/netlabel/netlabel_user.c b/net/netlabel/netlabel_user.c new file mode 100644 index 0000000..8002222 --- /dev/null +++ b/net/netlabel/netlabel_user.c @@ -0,0 +1,158 @@ +/* + * NetLabel NETLINK Interface + * + * This file defines the NETLINK interface for the NetLabel system. The + * NetLabel system manages static and dynamic label mappings for network + * protocols such as CIPSO and RIPSO. + * + * Author: Paul Moore + * + */ + +/* + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "netlabel_mgmt.h" +#include "netlabel_unlabeled.h" +#include "netlabel_cipso_v4.h" +#include "netlabel_user.h" + +/* + * NetLabel NETLINK Setup Functions + */ + +/** + * netlbl_netlink_init - Initialize the NETLINK communication channel + * + * Description: + * Call out to the NetLabel components so they can register their families and + * commands with the Generic NETLINK mechanism. Returns zero on success and + * non-zero on failure. + * + */ +int netlbl_netlink_init(void) +{ + int ret_val; + + ret_val = netlbl_mgmt_genl_init(); + if (ret_val != 0) + return ret_val; + + ret_val = netlbl_cipsov4_genl_init(); + if (ret_val != 0) + return ret_val; + + ret_val = netlbl_unlabel_genl_init(); + if (ret_val != 0) + return ret_val; + + return 0; +} + +/* + * NetLabel Common Protocol Functions + */ + +/** + * netlbl_netlink_send_ack - Send an ACK message + * @info: the generic NETLINK information + * @genl_family: the generic NETLINK family ID value + * @ack_cmd: the generic NETLINK family ACK command value + * @ret_code: return code to use + * + * Description: + * This function sends an ACK message to the sender of the NETLINK message + * specified by @info. + * + */ +void netlbl_netlink_send_ack(const struct genl_info *info, + u32 genl_family, + u8 ack_cmd, + u32 ret_code) +{ + size_t data_size; + struct sk_buff *skb; + + data_size = GENL_HDRLEN + 2 * NETLBL_LEN_U32; + skb = netlbl_netlink_alloc_skb(0, data_size, GFP_KERNEL); + if (skb == NULL) + return; + + if (netlbl_netlink_hdr_put(skb, + info->snd_pid, + 0, + genl_family, + ack_cmd) == NULL) + goto send_ack_failure; + + if (nla_put_u32(skb, NLA_U32, info->snd_seq) != 0) + goto send_ack_failure; + if (nla_put_u32(skb, NLA_U32, ret_code) != 0) + goto send_ack_failure; + + netlbl_netlink_snd(skb, info->snd_pid); + return; + +send_ack_failure: + kfree_skb(skb); +} + +/* + * NETLINK I/O Functions + */ + +/** + * netlbl_netlink_snd - Send a NetLabel message + * @skb: NetLabel message + * @pid: destination PID + * + * Description: + * Sends a unicast NetLabel message over the NETLINK socket. + * + */ +int netlbl_netlink_snd(struct sk_buff *skb, u32 pid) +{ + return genlmsg_unicast(skb, pid); +} + +/** + * netlbl_netlink_snd - Send a NetLabel message + * @skb: NetLabel message + * @pid: sending PID + * @group: multicast group id + * + * Description: + * Sends a multicast NetLabel message over the NETLINK socket to all members + * of @group except @pid. + * + */ +int netlbl_netlink_snd_multicast(struct sk_buff *skb, u32 pid, u32 group) +{ + return genlmsg_multicast(skb, pid, group); +} diff --git a/net/netlabel/netlabel_user.h b/net/netlabel/netlabel_user.h new file mode 100644 index 0000000..ccf237b --- /dev/null +++ b/net/netlabel/netlabel_user.h @@ -0,0 +1,214 @@ +/* + * NetLabel NETLINK Interface + * + * This file defines the NETLINK interface for the NetLabel system. The + * NetLabel system manages static and dynamic label mappings for network + * protocols such as CIPSO and RIPSO. + * + * Author: Paul Moore + * + */ + +/* + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef _NETLABEL_USER_H +#define _NETLABEL_USER_H + +#include +#include +#include +#include +#include + +/* NetLabel NETLINK helper functions */ + +/** + * netlbl_netlink_cap_check - Check the NETLINK msg capabilities + * @skb: the NETLINK buffer + * @req_cap: the required capability + * + * Description: + * Check the NETLINK buffer's capabilities against the required capabilities. + * Returns zero on success, negative values on failure. + * + */ +static inline int netlbl_netlink_cap_check(const struct sk_buff *skb, + kernel_cap_t req_cap) +{ + if (cap_raised(NETLINK_CB(skb).eff_cap, req_cap)) + return 0; + return -EPERM; +} + +/** + * netlbl_getinc_u8 - Read a u8 value from a nlattr stream and move on + * @nla: the attribute + * @rem_len: remaining length + * + * Description: + * Return a u8 value pointed to by @nla and advance it to the next attribute. + * + */ +static inline u8 netlbl_getinc_u8(struct nlattr **nla, int *rem_len) +{ + u8 val = nla_get_u8(*nla); + *nla = nla_next(*nla, rem_len); + return val; +} + +/** + * netlbl_getinc_u16 - Read a u16 value from a nlattr stream and move on + * @nla: the attribute + * @rem_len: remaining length + * + * Description: + * Return a u16 value pointed to by @nla and advance it to the next attribute. + * + */ +static inline u16 netlbl_getinc_u16(struct nlattr **nla, int *rem_len) +{ + u16 val = nla_get_u16(*nla); + *nla = nla_next(*nla, rem_len); + return val; +} + +/** + * netlbl_getinc_u32 - Read a u32 value from a nlattr stream and move on + * @nla: the attribute + * @rem_len: remaining length + * + * Description: + * Return a u32 value pointed to by @nla and advance it to the next attribute. + * + */ +static inline u32 netlbl_getinc_u32(struct nlattr **nla, int *rem_len) +{ + u32 val = nla_get_u32(*nla); + *nla = nla_next(*nla, rem_len); + return val; +} + +/** + * netlbl_netlink_hdr_put - Write the NETLINK buffers into a sk_buff + * @skb: the packet + * @pid: the PID of the receipient + * @seq: the sequence number + * @type: the generic NETLINK message family type + * @cmd: command + * + * Description: + * Write both a NETLINK nlmsghdr structure and a Generic NETLINK genlmsghdr + * struct to the packet. Returns a pointer to the start of the payload buffer + * on success or NULL on failure. + * + */ +static inline void *netlbl_netlink_hdr_put(struct sk_buff *skb, + u32 pid, + u32 seq, + int type, + u8 cmd) +{ + return genlmsg_put(skb, + pid, + seq, + type, + 0, + 0, + cmd, + NETLBL_PROTO_VERSION); +} + +/** + * netlbl_netlink_hdr_push - Write the NETLINK buffers into a sk_buff + * @skb: the packet + * @pid: the PID of the receipient + * @seq: the sequence number + * @type: the generic NETLINK message family type + * @cmd: command + * + * Description: + * Write both a NETLINK nlmsghdr structure and a Generic NETLINK genlmsghdr + * struct to the packet. + * + */ +static inline void netlbl_netlink_hdr_push(struct sk_buff *skb, + u32 pid, + u32 seq, + int type, + u8 cmd) + +{ + struct nlmsghdr *nlh; + struct genlmsghdr *hdr; + + nlh = (struct nlmsghdr *)skb_push(skb, NLMSG_SPACE(GENL_HDRLEN)); + nlh->nlmsg_type = type; + nlh->nlmsg_len = skb->len; + nlh->nlmsg_flags = 0; + nlh->nlmsg_pid = pid; + nlh->nlmsg_seq = seq; + + hdr = nlmsg_data(nlh); + hdr->cmd = cmd; + hdr->version = NETLBL_PROTO_VERSION; + hdr->reserved = 0; +} + +/** + * netlbl_netlink_payload_len - Return the length of the payload + * @skb: the NETLINK buffer + * + * Description: + * This function returns the length of the NetLabel payload. + * + */ +static inline u32 netlbl_netlink_payload_len(const struct sk_buff *skb) +{ + return nlmsg_len((struct nlmsghdr *)skb->data) - GENL_HDRLEN; +} + +/** + * netlbl_netlink_payload_data - Returns a pointer to the start of the payload + * @skb: the NETLINK buffer + * + * Description: + * This function returns a pointer to the start of the NetLabel payload. + * + */ +static inline void *netlbl_netlink_payload_data(const struct sk_buff *skb) +{ + return (unsigned char *)nlmsg_data((struct nlmsghdr *)skb->data) + + GENL_HDRLEN; +} + +/* NetLabel common protocol functions */ + +void netlbl_netlink_send_ack(const struct genl_info *info, + u32 genl_family, + u8 ack_cmd, + u32 ret_code); + +/* NetLabel NETLINK I/O functions */ + +int netlbl_netlink_init(void); +int netlbl_netlink_snd(struct sk_buff *skb, u32 pid); +int netlbl_netlink_snd_multicast(struct sk_buff *skb, u32 pid, u32 group); + +#endif -- cgit v0.10.2 From 96cb8e3313c7a12e026c1ed510522ae6f6023875 Mon Sep 17 00:00:00 2001 From: Paul Moore Date: Thu, 3 Aug 2006 16:48:59 -0700 Subject: [NetLabel]: CIPSOv4 and Unlabeled packet integration Add CIPSO/IPv4 and unlabeled packet management to the NetLabel subsystem. The CIPSO/IPv4 changes allow the configuration of CIPSO/IPv4 within the overall NetLabel framework. The unlabeled packet changes allows NetLabel to pass unlabeled packets without error. Signed-off-by: Paul Moore Signed-off-by: David S. Miller diff --git a/net/netlabel/netlabel_cipso_v4.c b/net/netlabel/netlabel_cipso_v4.c new file mode 100644 index 0000000..a4f40ad --- /dev/null +++ b/net/netlabel/netlabel_cipso_v4.c @@ -0,0 +1,542 @@ +/* + * NetLabel CIPSO/IPv4 Support + * + * This file defines the CIPSO/IPv4 functions for the NetLabel system. The + * NetLabel system manages static and dynamic label mappings for network + * protocols such as CIPSO and RIPSO. + * + * Author: Paul Moore + * + */ + +/* + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "netlabel_user.h" +#include "netlabel_cipso_v4.h" + +/* NetLabel Generic NETLINK CIPSOv4 family */ +static struct genl_family netlbl_cipsov4_gnl_family = { + .id = GENL_ID_GENERATE, + .hdrsize = 0, + .name = NETLBL_NLTYPE_CIPSOV4_NAME, + .version = NETLBL_PROTO_VERSION, + .maxattr = 0, +}; + + +/* + * Helper Functions + */ + +/** + * netlbl_cipsov4_doi_free - Frees a CIPSO V4 DOI definition + * @entry: the entry's RCU field + * + * Description: + * This function is designed to be used as a callback to the call_rcu() + * function so that the memory allocated to the DOI definition can be released + * safely. + * + */ +static void netlbl_cipsov4_doi_free(struct rcu_head *entry) +{ + struct cipso_v4_doi *ptr; + + ptr = container_of(entry, struct cipso_v4_doi, rcu); + switch (ptr->type) { + case CIPSO_V4_MAP_STD: + kfree(ptr->map.std->lvl.cipso); + kfree(ptr->map.std->lvl.local); + kfree(ptr->map.std->cat.cipso); + kfree(ptr->map.std->cat.local); + break; + } + kfree(ptr); +} + + +/* + * NetLabel Command Handlers + */ + +/** + * netlbl_cipsov4_add_std - Adds a CIPSO V4 DOI definition + * @doi: the DOI value + * @msg: the ADD message data + * @msg_size: the size of the ADD message buffer + * + * Description: + * Create a new CIPSO_V4_MAP_STD DOI definition based on the given ADD message + * and add it to the CIPSO V4 engine. Return zero on success and non-zero on + * error. + * + */ +static int netlbl_cipsov4_add_std(u32 doi, struct nlattr *msg, size_t msg_size) +{ + int ret_val = -EINVAL; + int msg_len = msg_size; + u32 num_tags; + u32 num_lvls; + u32 num_cats; + struct cipso_v4_doi *doi_def = NULL; + u32 iter; + u32 tmp_val_a; + u32 tmp_val_b; + + if (msg_len < NETLBL_LEN_U32) + goto add_std_failure; + num_tags = netlbl_getinc_u32(&msg, &msg_len); + if (num_tags == 0 || num_tags > CIPSO_V4_TAG_MAXCNT) + goto add_std_failure; + + doi_def = kmalloc(sizeof(*doi_def), GFP_KERNEL); + if (doi_def == NULL) { + ret_val = -ENOMEM; + goto add_std_failure; + } + doi_def->map.std = kzalloc(sizeof(*doi_def->map.std), GFP_KERNEL); + if (doi_def->map.std == NULL) { + ret_val = -ENOMEM; + goto add_std_failure; + } + doi_def->type = CIPSO_V4_MAP_STD; + + for (iter = 0; iter < num_tags; iter++) { + if (msg_len < NETLBL_LEN_U8) + goto add_std_failure; + doi_def->tags[iter] = netlbl_getinc_u8(&msg, &msg_len); + switch (doi_def->tags[iter]) { + case CIPSO_V4_TAG_RBITMAP: + break; + default: + goto add_std_failure; + } + } + if (iter < CIPSO_V4_TAG_MAXCNT) + doi_def->tags[iter] = CIPSO_V4_TAG_INVALID; + + if (msg_len < 6 * NETLBL_LEN_U32) + goto add_std_failure; + + num_lvls = netlbl_getinc_u32(&msg, &msg_len); + if (num_lvls == 0) + goto add_std_failure; + doi_def->map.std->lvl.local_size = netlbl_getinc_u32(&msg, &msg_len); + if (doi_def->map.std->lvl.local_size > CIPSO_V4_MAX_LOC_LVLS) + goto add_std_failure; + doi_def->map.std->lvl.local = kcalloc(doi_def->map.std->lvl.local_size, + sizeof(u32), + GFP_KERNEL); + if (doi_def->map.std->lvl.local == NULL) { + ret_val = -ENOMEM; + goto add_std_failure; + } + doi_def->map.std->lvl.cipso_size = netlbl_getinc_u8(&msg, &msg_len); + if (doi_def->map.std->lvl.cipso_size > CIPSO_V4_MAX_REM_LVLS) + goto add_std_failure; + doi_def->map.std->lvl.cipso = kcalloc(doi_def->map.std->lvl.cipso_size, + sizeof(u32), + GFP_KERNEL); + if (doi_def->map.std->lvl.cipso == NULL) { + ret_val = -ENOMEM; + goto add_std_failure; + } + + num_cats = netlbl_getinc_u32(&msg, &msg_len); + doi_def->map.std->cat.local_size = netlbl_getinc_u32(&msg, &msg_len); + if (doi_def->map.std->cat.local_size > CIPSO_V4_MAX_LOC_CATS) + goto add_std_failure; + doi_def->map.std->cat.local = kcalloc(doi_def->map.std->cat.local_size, + sizeof(u32), + GFP_KERNEL); + if (doi_def->map.std->cat.local == NULL) { + ret_val = -ENOMEM; + goto add_std_failure; + } + doi_def->map.std->cat.cipso_size = netlbl_getinc_u16(&msg, &msg_len); + if (doi_def->map.std->cat.cipso_size > CIPSO_V4_MAX_REM_CATS) + goto add_std_failure; + doi_def->map.std->cat.cipso = kcalloc(doi_def->map.std->cat.cipso_size, + sizeof(u32), + GFP_KERNEL); + if (doi_def->map.std->cat.cipso == NULL) { + ret_val = -ENOMEM; + goto add_std_failure; + } + + if (msg_len < + num_lvls * (NETLBL_LEN_U32 + NETLBL_LEN_U8) + + num_cats * (NETLBL_LEN_U32 + NETLBL_LEN_U16)) + goto add_std_failure; + + for (iter = 0; iter < doi_def->map.std->lvl.cipso_size; iter++) + doi_def->map.std->lvl.cipso[iter] = CIPSO_V4_INV_LVL; + for (iter = 0; iter < doi_def->map.std->lvl.local_size; iter++) + doi_def->map.std->lvl.local[iter] = CIPSO_V4_INV_LVL; + for (iter = 0; iter < doi_def->map.std->cat.cipso_size; iter++) + doi_def->map.std->cat.cipso[iter] = CIPSO_V4_INV_CAT; + for (iter = 0; iter < doi_def->map.std->cat.local_size; iter++) + doi_def->map.std->cat.local[iter] = CIPSO_V4_INV_CAT; + + for (iter = 0; iter < num_lvls; iter++) { + tmp_val_a = netlbl_getinc_u32(&msg, &msg_len); + tmp_val_b = netlbl_getinc_u8(&msg, &msg_len); + + if (tmp_val_a >= doi_def->map.std->lvl.local_size || + tmp_val_b >= doi_def->map.std->lvl.cipso_size) + goto add_std_failure; + + doi_def->map.std->lvl.cipso[tmp_val_b] = tmp_val_a; + doi_def->map.std->lvl.local[tmp_val_a] = tmp_val_b; + } + + for (iter = 0; iter < num_cats; iter++) { + tmp_val_a = netlbl_getinc_u32(&msg, &msg_len); + tmp_val_b = netlbl_getinc_u16(&msg, &msg_len); + + if (tmp_val_a >= doi_def->map.std->cat.local_size || + tmp_val_b >= doi_def->map.std->cat.cipso_size) + goto add_std_failure; + + doi_def->map.std->cat.cipso[tmp_val_b] = tmp_val_a; + doi_def->map.std->cat.local[tmp_val_a] = tmp_val_b; + } + + doi_def->doi = doi; + ret_val = cipso_v4_doi_add(doi_def); + if (ret_val != 0) + goto add_std_failure; + return 0; + +add_std_failure: + if (doi_def) + netlbl_cipsov4_doi_free(&doi_def->rcu); + return ret_val; +} + +/** + * netlbl_cipsov4_add_pass - Adds a CIPSO V4 DOI definition + * @doi: the DOI value + * @msg: the ADD message data + * @msg_size: the size of the ADD message buffer + * + * Description: + * Create a new CIPSO_V4_MAP_PASS DOI definition based on the given ADD message + * and add it to the CIPSO V4 engine. Return zero on success and non-zero on + * error. + * + */ +static int netlbl_cipsov4_add_pass(u32 doi, + struct nlattr *msg, + size_t msg_size) +{ + int ret_val = -EINVAL; + int msg_len = msg_size; + u32 num_tags; + struct cipso_v4_doi *doi_def = NULL; + u32 iter; + + if (msg_len < NETLBL_LEN_U32) + goto add_pass_failure; + num_tags = netlbl_getinc_u32(&msg, &msg_len); + if (num_tags == 0 || num_tags > CIPSO_V4_TAG_MAXCNT) + goto add_pass_failure; + + doi_def = kmalloc(sizeof(*doi_def), GFP_KERNEL); + if (doi_def == NULL) { + ret_val = -ENOMEM; + goto add_pass_failure; + } + doi_def->type = CIPSO_V4_MAP_PASS; + + for (iter = 0; iter < num_tags; iter++) { + if (msg_len < NETLBL_LEN_U8) + goto add_pass_failure; + doi_def->tags[iter] = netlbl_getinc_u8(&msg, &msg_len); + switch (doi_def->tags[iter]) { + case CIPSO_V4_TAG_RBITMAP: + break; + default: + goto add_pass_failure; + } + } + if (iter < CIPSO_V4_TAG_MAXCNT) + doi_def->tags[iter] = CIPSO_V4_TAG_INVALID; + + doi_def->doi = doi; + ret_val = cipso_v4_doi_add(doi_def); + if (ret_val != 0) + goto add_pass_failure; + return 0; + +add_pass_failure: + if (doi_def) + netlbl_cipsov4_doi_free(&doi_def->rcu); + return ret_val; +} + +/** + * netlbl_cipsov4_add - Handle an ADD message + * @skb: the NETLINK buffer + * @info: the Generic NETLINK info block + * + * Description: + * Create a new DOI definition based on the given ADD message and add it to the + * CIPSO V4 engine. Returns zero on success, negative values on failure. + * + */ +static int netlbl_cipsov4_add(struct sk_buff *skb, struct genl_info *info) + +{ + int ret_val = -EINVAL; + u32 doi; + u32 map_type; + int msg_len = netlbl_netlink_payload_len(skb); + struct nlattr *msg = netlbl_netlink_payload_data(skb); + + ret_val = netlbl_netlink_cap_check(skb, CAP_NET_ADMIN); + if (ret_val != 0) + goto add_return; + + if (msg_len < 2 * NETLBL_LEN_U32) + goto add_return; + + doi = netlbl_getinc_u32(&msg, &msg_len); + map_type = netlbl_getinc_u32(&msg, &msg_len); + switch (map_type) { + case CIPSO_V4_MAP_STD: + ret_val = netlbl_cipsov4_add_std(doi, msg, msg_len); + break; + case CIPSO_V4_MAP_PASS: + ret_val = netlbl_cipsov4_add_pass(doi, msg, msg_len); + break; + } + +add_return: + netlbl_netlink_send_ack(info, + netlbl_cipsov4_gnl_family.id, + NLBL_CIPSOV4_C_ACK, + -ret_val); + return ret_val; +} + +/** + * netlbl_cipsov4_list - Handle a LIST message + * @skb: the NETLINK buffer + * @info: the Generic NETLINK info block + * + * Description: + * Process a user generated LIST message and respond accordingly. Returns + * zero on success and negative values on error. + * + */ +static int netlbl_cipsov4_list(struct sk_buff *skb, struct genl_info *info) +{ + int ret_val = -EINVAL; + u32 doi; + struct nlattr *msg = netlbl_netlink_payload_data(skb); + struct sk_buff *ans_skb; + + if (netlbl_netlink_payload_len(skb) != NETLBL_LEN_U32) + goto list_failure; + + doi = nla_get_u32(msg); + ans_skb = cipso_v4_doi_dump(doi, NLMSG_SPACE(GENL_HDRLEN)); + if (ans_skb == NULL) { + ret_val = -ENOMEM; + goto list_failure; + } + netlbl_netlink_hdr_push(ans_skb, + info->snd_pid, + 0, + netlbl_cipsov4_gnl_family.id, + NLBL_CIPSOV4_C_LIST); + + ret_val = netlbl_netlink_snd(ans_skb, info->snd_pid); + if (ret_val != 0) + goto list_failure; + + return 0; + +list_failure: + netlbl_netlink_send_ack(info, + netlbl_cipsov4_gnl_family.id, + NLBL_CIPSOV4_C_ACK, + -ret_val); + return ret_val; +} + +/** + * netlbl_cipsov4_listall - Handle a LISTALL message + * @skb: the NETLINK buffer + * @info: the Generic NETLINK info block + * + * Description: + * Process a user generated LISTALL message and respond accordingly. Returns + * zero on success and negative values on error. + * + */ +static int netlbl_cipsov4_listall(struct sk_buff *skb, struct genl_info *info) +{ + int ret_val = -EINVAL; + struct sk_buff *ans_skb; + + ans_skb = cipso_v4_doi_dump_all(NLMSG_SPACE(GENL_HDRLEN)); + if (ans_skb == NULL) { + ret_val = -ENOMEM; + goto listall_failure; + } + netlbl_netlink_hdr_push(ans_skb, + info->snd_pid, + 0, + netlbl_cipsov4_gnl_family.id, + NLBL_CIPSOV4_C_LISTALL); + + ret_val = netlbl_netlink_snd(ans_skb, info->snd_pid); + if (ret_val != 0) + goto listall_failure; + + return 0; + +listall_failure: + netlbl_netlink_send_ack(info, + netlbl_cipsov4_gnl_family.id, + NLBL_CIPSOV4_C_ACK, + -ret_val); + return ret_val; +} + +/** + * netlbl_cipsov4_remove - Handle a REMOVE message + * @skb: the NETLINK buffer + * @info: the Generic NETLINK info block + * + * Description: + * Process a user generated REMOVE message and respond accordingly. Returns + * zero on success, negative values on failure. + * + */ +static int netlbl_cipsov4_remove(struct sk_buff *skb, struct genl_info *info) +{ + int ret_val; + u32 doi; + struct nlattr *msg = netlbl_netlink_payload_data(skb); + + ret_val = netlbl_netlink_cap_check(skb, CAP_NET_ADMIN); + if (ret_val != 0) + goto remove_return; + + if (netlbl_netlink_payload_len(skb) != NETLBL_LEN_U32) { + ret_val = -EINVAL; + goto remove_return; + } + + doi = nla_get_u32(msg); + ret_val = cipso_v4_doi_remove(doi, netlbl_cipsov4_doi_free); + +remove_return: + netlbl_netlink_send_ack(info, + netlbl_cipsov4_gnl_family.id, + NLBL_CIPSOV4_C_ACK, + -ret_val); + return ret_val; +} + +/* + * NetLabel Generic NETLINK Command Definitions + */ + +static struct genl_ops netlbl_cipsov4_genl_c_add = { + .cmd = NLBL_CIPSOV4_C_ADD, + .flags = 0, + .doit = netlbl_cipsov4_add, + .dumpit = NULL, +}; + +static struct genl_ops netlbl_cipsov4_genl_c_remove = { + .cmd = NLBL_CIPSOV4_C_REMOVE, + .flags = 0, + .doit = netlbl_cipsov4_remove, + .dumpit = NULL, +}; + +static struct genl_ops netlbl_cipsov4_genl_c_list = { + .cmd = NLBL_CIPSOV4_C_LIST, + .flags = 0, + .doit = netlbl_cipsov4_list, + .dumpit = NULL, +}; + +static struct genl_ops netlbl_cipsov4_genl_c_listall = { + .cmd = NLBL_CIPSOV4_C_LISTALL, + .flags = 0, + .doit = netlbl_cipsov4_listall, + .dumpit = NULL, +}; + +/* + * NetLabel Generic NETLINK Protocol Functions + */ + +/** + * netlbl_cipsov4_genl_init - Register the CIPSOv4 NetLabel component + * + * Description: + * Register the CIPSOv4 packet NetLabel component with the Generic NETLINK + * mechanism. Returns zero on success, negative values on failure. + * + */ +int netlbl_cipsov4_genl_init(void) +{ + int ret_val; + + ret_val = genl_register_family(&netlbl_cipsov4_gnl_family); + if (ret_val != 0) + return ret_val; + + ret_val = genl_register_ops(&netlbl_cipsov4_gnl_family, + &netlbl_cipsov4_genl_c_add); + if (ret_val != 0) + return ret_val; + ret_val = genl_register_ops(&netlbl_cipsov4_gnl_family, + &netlbl_cipsov4_genl_c_remove); + if (ret_val != 0) + return ret_val; + ret_val = genl_register_ops(&netlbl_cipsov4_gnl_family, + &netlbl_cipsov4_genl_c_list); + if (ret_val != 0) + return ret_val; + ret_val = genl_register_ops(&netlbl_cipsov4_gnl_family, + &netlbl_cipsov4_genl_c_listall); + if (ret_val != 0) + return ret_val; + + return 0; +} diff --git a/net/netlabel/netlabel_unlabeled.c b/net/netlabel/netlabel_unlabeled.c new file mode 100644 index 0000000..785f496 --- /dev/null +++ b/net/netlabel/netlabel_unlabeled.c @@ -0,0 +1,253 @@ +/* + * NetLabel Unlabeled Support + * + * This file defines functions for dealing with unlabeled packets for the + * NetLabel system. The NetLabel system manages static and dynamic label + * mappings for network protocols such as CIPSO and RIPSO. + * + * Author: Paul Moore + * + */ + +/* + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include "netlabel_user.h" +#include "netlabel_domainhash.h" +#include "netlabel_unlabeled.h" + +/* Accept unlabeled packets flag */ +static atomic_t netlabel_unlabel_accept_flg = ATOMIC_INIT(0); + +/* NetLabel Generic NETLINK CIPSOv4 family */ +static struct genl_family netlbl_unlabel_gnl_family = { + .id = GENL_ID_GENERATE, + .hdrsize = 0, + .name = NETLBL_NLTYPE_UNLABELED_NAME, + .version = NETLBL_PROTO_VERSION, + .maxattr = 0, +}; + + +/* + * NetLabel Command Handlers + */ + +/** + * netlbl_unlabel_accept - Handle an ACCEPT message + * @skb: the NETLINK buffer + * @info: the Generic NETLINK info block + * + * Description: + * Process a user generated ACCEPT message and set the accept flag accordingly. + * Returns zero on success, negative values on failure. + * + */ +static int netlbl_unlabel_accept(struct sk_buff *skb, struct genl_info *info) +{ + int ret_val; + struct nlattr *data = netlbl_netlink_payload_data(skb); + u32 value; + + ret_val = netlbl_netlink_cap_check(skb, CAP_NET_ADMIN); + if (ret_val != 0) + return ret_val; + + if (netlbl_netlink_payload_len(skb) == NETLBL_LEN_U32) { + value = nla_get_u32(data); + if (value == 1 || value == 0) { + atomic_set(&netlabel_unlabel_accept_flg, value); + netlbl_netlink_send_ack(info, + netlbl_unlabel_gnl_family.id, + NLBL_UNLABEL_C_ACK, + NETLBL_E_OK); + return 0; + } + } + + netlbl_netlink_send_ack(info, + netlbl_unlabel_gnl_family.id, + NLBL_UNLABEL_C_ACK, + EINVAL); + return -EINVAL; +} + +/** + * netlbl_unlabel_list - Handle a LIST message + * @skb: the NETLINK buffer + * @info: the Generic NETLINK info block + * + * Description: + * Process a user generated LIST message and respond with the current status. + * Returns zero on success, negative values on failure. + * + */ +static int netlbl_unlabel_list(struct sk_buff *skb, struct genl_info *info) +{ + int ret_val = -ENOMEM; + struct sk_buff *ans_skb; + + ans_skb = netlbl_netlink_alloc_skb(0, + GENL_HDRLEN + NETLBL_LEN_U32, + GFP_KERNEL); + if (ans_skb == NULL) + goto list_failure; + + if (netlbl_netlink_hdr_put(ans_skb, + info->snd_pid, + 0, + netlbl_unlabel_gnl_family.id, + NLBL_UNLABEL_C_LIST) == NULL) + goto list_failure; + + ret_val = nla_put_u32(ans_skb, + NLA_U32, + atomic_read(&netlabel_unlabel_accept_flg)); + if (ret_val != 0) + goto list_failure; + + ret_val = netlbl_netlink_snd(ans_skb, info->snd_pid); + if (ret_val != 0) + goto list_failure; + + return 0; + +list_failure: + netlbl_netlink_send_ack(info, + netlbl_unlabel_gnl_family.id, + NLBL_UNLABEL_C_ACK, + -ret_val); + return ret_val; +} + + +/* + * NetLabel Generic NETLINK Command Definitions + */ + +static struct genl_ops netlbl_unlabel_genl_c_accept = { + .cmd = NLBL_UNLABEL_C_ACCEPT, + .flags = 0, + .doit = netlbl_unlabel_accept, + .dumpit = NULL, +}; + +static struct genl_ops netlbl_unlabel_genl_c_list = { + .cmd = NLBL_UNLABEL_C_LIST, + .flags = 0, + .doit = netlbl_unlabel_list, + .dumpit = NULL, +}; + + +/* + * NetLabel Generic NETLINK Protocol Functions + */ + +/** + * netlbl_unlabel_genl_init - Register the Unlabeled NetLabel component + * + * Description: + * Register the unlabeled packet NetLabel component with the Generic NETLINK + * mechanism. Returns zero on success, negative values on failure. + * + */ +int netlbl_unlabel_genl_init(void) +{ + int ret_val; + + ret_val = genl_register_family(&netlbl_unlabel_gnl_family); + if (ret_val != 0) + return ret_val; + + ret_val = genl_register_ops(&netlbl_unlabel_gnl_family, + &netlbl_unlabel_genl_c_accept); + if (ret_val != 0) + return ret_val; + + ret_val = genl_register_ops(&netlbl_unlabel_gnl_family, + &netlbl_unlabel_genl_c_list); + if (ret_val != 0) + return ret_val; + + return 0; +} + +/* + * NetLabel KAPI Hooks + */ + +/** + * netlbl_unlabel_getattr - Get the security attributes for an unlabled packet + * @secattr: the security attributes + * + * Description: + * Determine the security attributes, if any, for an unlabled packet and return + * them in @secattr. Returns zero on success and negative values on failure. + * + */ +int netlbl_unlabel_getattr(struct netlbl_lsm_secattr *secattr) +{ + if (atomic_read(&netlabel_unlabel_accept_flg) == 1) { + memset(secattr, 0, sizeof(*secattr)); + return 0; + } + + return -ENOMSG; +} + +/** + * netlbl_unlabel_defconf - Set the default config to allow unlabeled packets + * + * Description: + * Set the default NetLabel configuration to allow incoming unlabeled packets + * and to send unlabeled network traffic by default. + * + */ +int netlbl_unlabel_defconf(void) +{ + int ret_val; + struct netlbl_dom_map *entry; + + entry = kzalloc(sizeof(*entry), GFP_KERNEL); + if (entry == NULL) + return -ENOMEM; + entry->type = NETLBL_NLTYPE_UNLABELED; + ret_val = netlbl_domhsh_add_default(entry); + if (ret_val != 0) + return ret_val; + + atomic_set(&netlabel_unlabel_accept_flg, 1); + + return 0; +} -- cgit v0.10.2 From 7420ed23a4f77480b5b7b3245e5da30dd24b7575 Mon Sep 17 00:00:00 2001 From: Venkat Yekkirala Date: Fri, 4 Aug 2006 23:17:57 -0700 Subject: [NetLabel]: SELinux support Add NetLabel support to the SELinux LSM and modify the socket_post_create() LSM hook to return an error code. The most significant part of this patch is the addition of NetLabel hooks into the following SELinux LSM hooks: * selinux_file_permission() * selinux_socket_sendmsg() * selinux_socket_post_create() * selinux_socket_sock_rcv_skb() * selinux_socket_getpeersec_stream() * selinux_socket_getpeersec_dgram() * selinux_sock_graft() * selinux_inet_conn_request() The basic reasoning behind this patch is that outgoing packets are "NetLabel'd" by labeling their socket and the NetLabel security attributes are checked via the additional hook in selinux_socket_sock_rcv_skb(). NetLabel itself is only a labeling mechanism, similar to filesystem extended attributes, it is up to the SELinux enforcement mechanism to perform the actual access checks. In addition to the changes outlined above this patch also includes some changes to the extended bitmap (ebitmap) and multi-level security (mls) code to import and export SELinux TE/MLS attributes into and out of NetLabel. Signed-off-by: Paul Moore Signed-off-by: David S. Miller diff --git a/include/linux/security.h b/include/linux/security.h index bb4c80f..9f56fb8 100644 --- a/include/linux/security.h +++ b/include/linux/security.h @@ -1341,8 +1341,8 @@ struct security_operations { int (*unix_may_send) (struct socket * sock, struct socket * other); int (*socket_create) (int family, int type, int protocol, int kern); - void (*socket_post_create) (struct socket * sock, int family, - int type, int protocol, int kern); + int (*socket_post_create) (struct socket * sock, int family, + int type, int protocol, int kern); int (*socket_bind) (struct socket * sock, struct sockaddr * address, int addrlen); int (*socket_connect) (struct socket * sock, @@ -2824,13 +2824,13 @@ static inline int security_socket_create (int family, int type, return security_ops->socket_create(family, type, protocol, kern); } -static inline void security_socket_post_create(struct socket * sock, - int family, - int type, - int protocol, int kern) +static inline int security_socket_post_create(struct socket * sock, + int family, + int type, + int protocol, int kern) { - security_ops->socket_post_create(sock, family, type, - protocol, kern); + return security_ops->socket_post_create(sock, family, type, + protocol, kern); } static inline int security_socket_bind(struct socket * sock, @@ -2982,11 +2982,12 @@ static inline int security_socket_create (int family, int type, return 0; } -static inline void security_socket_post_create(struct socket * sock, - int family, - int type, - int protocol, int kern) +static inline int security_socket_post_create(struct socket * sock, + int family, + int type, + int protocol, int kern) { + return 0; } static inline int security_socket_bind(struct socket * sock, diff --git a/net/socket.c b/net/socket.c index 6d261bf..6756e57 100644 --- a/net/socket.c +++ b/net/socket.c @@ -973,11 +973,18 @@ int sock_create_lite(int family, int type, int protocol, struct socket **res) goto out; } - security_socket_post_create(sock, family, type, protocol, 1); sock->type = type; + err = security_socket_post_create(sock, family, type, protocol, 1); + if (err) + goto out_release; + out: *res = sock; return err; +out_release: + sock_release(sock); + sock = NULL; + goto out; } /* No kernel lock held - perfect */ @@ -1214,7 +1221,9 @@ static int __sock_create(int family, int type, int protocol, struct socket **res */ module_put(net_families[family]->owner); *res = sock; - security_socket_post_create(sock, family, type, protocol, kern); + err = security_socket_post_create(sock, family, type, protocol, kern); + if (err) + goto out_release; out: net_family_read_unlock(); diff --git a/security/dummy.c b/security/dummy.c index 1c45f8e..aeee705 100644 --- a/security/dummy.c +++ b/security/dummy.c @@ -709,10 +709,10 @@ static int dummy_socket_create (int family, int type, return 0; } -static void dummy_socket_post_create (struct socket *sock, int family, int type, - int protocol, int kern) +static int dummy_socket_post_create (struct socket *sock, int family, int type, + int protocol, int kern) { - return; + return 0; } static int dummy_socket_bind (struct socket *sock, struct sockaddr *address, diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 33028b3..2a6bbb9 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -12,6 +12,8 @@ * Copyright (C) 2003 Red Hat, Inc., James Morris * Copyright (C) 2004-2005 Trusted Computer Solutions, Inc. * + * Copyright (C) 2006 Hewlett-Packard Development Company, L.P. + * Paul Moore, * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2, @@ -74,6 +76,7 @@ #include "objsec.h" #include "netif.h" #include "xfrm.h" +#include "selinux_netlabel.h" #define XATTR_SELINUX_SUFFIX "selinux" #define XATTR_NAME_SELINUX XATTR_SECURITY_PREFIX XATTR_SELINUX_SUFFIX @@ -2395,6 +2398,7 @@ static int selinux_inode_listsecurity(struct inode *inode, char *buffer, size_t static int selinux_file_permission(struct file *file, int mask) { + int rc; struct inode *inode = file->f_dentry->d_inode; if (!mask) { @@ -2406,8 +2410,12 @@ static int selinux_file_permission(struct file *file, int mask) if ((file->f_flags & O_APPEND) && (mask & MAY_WRITE)) mask |= MAY_APPEND; - return file_has_perm(current, file, - file_mask_to_av(inode->i_mode, mask)); + rc = file_has_perm(current, file, + file_mask_to_av(inode->i_mode, mask)); + if (rc) + return rc; + + return selinux_netlbl_inode_permission(inode, mask); } static int selinux_file_alloc_security(struct file *file) @@ -3058,9 +3066,10 @@ out: return err; } -static void selinux_socket_post_create(struct socket *sock, int family, - int type, int protocol, int kern) +static int selinux_socket_post_create(struct socket *sock, int family, + int type, int protocol, int kern) { + int err = 0; struct inode_security_struct *isec; struct task_security_struct *tsec; struct sk_security_struct *sksec; @@ -3077,9 +3086,12 @@ static void selinux_socket_post_create(struct socket *sock, int family, if (sock->sk) { sksec = sock->sk->sk_security; sksec->sid = isec->sid; + err = selinux_netlbl_socket_post_create(sock, + family, + isec->sid); } - return; + return err; } /* Range of port numbers used to automatically bind. @@ -3260,7 +3272,13 @@ static int selinux_socket_accept(struct socket *sock, struct socket *newsock) static int selinux_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size) { - return socket_has_perm(current, sock, SOCKET__WRITE); + int rc; + + rc = socket_has_perm(current, sock, SOCKET__WRITE); + if (rc) + return rc; + + return selinux_netlbl_inode_permission(SOCK_INODE(sock), MAY_WRITE); } static int selinux_socket_recvmsg(struct socket *sock, struct msghdr *msg, @@ -3468,6 +3486,10 @@ static int selinux_socket_sock_rcv_skb(struct sock *sk, struct sk_buff *skb) if (err) goto out; + err = selinux_netlbl_sock_rcv_skb(sksec, skb, &ad); + if (err) + goto out; + err = selinux_xfrm_sock_rcv_skb(sksec->sid, skb, &ad); out: return err; @@ -3491,8 +3513,9 @@ static int selinux_socket_getpeersec_stream(struct socket *sock, char __user *op peer_sid = ssec->peer_sid; } else if (isec->sclass == SECCLASS_TCP_SOCKET) { - peer_sid = selinux_socket_getpeer_stream(sock->sk); - + peer_sid = selinux_netlbl_socket_getpeersec_stream(sock); + if (peer_sid == SECSID_NULL) + peer_sid = selinux_socket_getpeer_stream(sock->sk); if (peer_sid == SECSID_NULL) { err = -ENOPROTOOPT; goto out; @@ -3532,8 +3555,11 @@ static int selinux_socket_getpeersec_dgram(struct socket *sock, struct sk_buff * if (sock && (sock->sk->sk_family == PF_UNIX)) selinux_get_inode_sid(SOCK_INODE(sock), &peer_secid); - else if (skb) - peer_secid = selinux_socket_getpeer_dgram(skb); + else if (skb) { + peer_secid = selinux_netlbl_socket_getpeersec_dgram(skb); + if (peer_secid == SECSID_NULL) + peer_secid = selinux_socket_getpeer_dgram(skb); + } if (peer_secid == SECSID_NULL) err = -EINVAL; @@ -3578,6 +3604,8 @@ void selinux_sock_graft(struct sock* sk, struct socket *parent) struct sk_security_struct *sksec = sk->sk_security; isec->sid = sksec->sid; + + selinux_netlbl_sock_graft(sk, parent); } int selinux_inet_conn_request(struct sock *sk, struct sk_buff *skb, @@ -3585,9 +3613,15 @@ int selinux_inet_conn_request(struct sock *sk, struct sk_buff *skb, { struct sk_security_struct *sksec = sk->sk_security; int err; - u32 newsid = 0; + u32 newsid; u32 peersid; + newsid = selinux_netlbl_inet_conn_request(skb, sksec->sid); + if (newsid != SECSID_NULL) { + req->secid = newsid; + return 0; + } + err = selinux_xfrm_decode_session(skb, &peersid, 0); BUG_ON(err); diff --git a/security/selinux/include/objsec.h b/security/selinux/include/objsec.h index 79b9e0a..0a39bfd 100644 --- a/security/selinux/include/objsec.h +++ b/security/selinux/include/objsec.h @@ -101,6 +101,14 @@ struct sk_security_struct { struct sock *sk; /* back pointer to sk object */ u32 sid; /* SID of this object */ u32 peer_sid; /* SID of peer */ +#ifdef CONFIG_NETLABEL + u16 sclass; /* sock security class */ + enum { /* NetLabel state */ + NLBL_UNSET = 0, + NLBL_REQUIRE, + NLBL_LABELED, + } nlbl_state; +#endif }; struct key_security_struct { diff --git a/security/selinux/include/selinux_netlabel.h b/security/selinux/include/selinux_netlabel.h new file mode 100644 index 0000000..88c463e --- /dev/null +++ b/security/selinux/include/selinux_netlabel.h @@ -0,0 +1,125 @@ +/* + * SELinux interface to the NetLabel subsystem + * + * Author : Paul Moore + * + */ + +/* + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See + * the GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +#ifndef _SELINUX_NETLABEL_H_ +#define _SELINUX_NETLABEL_H_ + +#ifdef CONFIG_NETLABEL +void selinux_netlbl_cache_invalidate(void); +int selinux_netlbl_socket_post_create(struct socket *sock, + int sock_family, + u32 sid); +void selinux_netlbl_sock_graft(struct sock *sk, struct socket *sock); +u32 selinux_netlbl_inet_conn_request(struct sk_buff *skb, u32 sock_sid); +int selinux_netlbl_sock_rcv_skb(struct sk_security_struct *sksec, + struct sk_buff *skb, + struct avc_audit_data *ad); +u32 selinux_netlbl_socket_getpeersec_stream(struct socket *sock); +u32 selinux_netlbl_socket_getpeersec_dgram(struct sk_buff *skb); + +int __selinux_netlbl_inode_permission(struct inode *inode, int mask); +/** + * selinux_netlbl_inode_permission - Verify the socket is NetLabel labeled + * @inode: the file descriptor's inode + * @mask: the permission mask + * + * Description: + * Looks at a file's inode and if it is marked as a socket protected by + * NetLabel then verify that the socket has been labeled, if not try to label + * the socket now with the inode's SID. Returns zero on success, negative + * values on failure. + * + */ +static inline int selinux_netlbl_inode_permission(struct inode *inode, + int mask) +{ + int rc = 0; + struct inode_security_struct *isec; + struct sk_security_struct *sksec; + + if (!S_ISSOCK(inode->i_mode)) + return 0; + + isec = inode->i_security; + sksec = SOCKET_I(inode)->sk->sk_security; + down(&isec->sem); + if (unlikely(sksec->nlbl_state == NLBL_REQUIRE && + (mask & (MAY_WRITE | MAY_APPEND)))) + rc = __selinux_netlbl_inode_permission(inode, mask); + up(&isec->sem); + + return rc; +} +#else +static inline void selinux_netlbl_cache_invalidate(void) +{ + return; +} + +static inline int selinux_netlbl_socket_post_create(struct socket *sock, + int sock_family, + u32 sid) +{ + return 0; +} + +static inline void selinux_netlbl_sock_graft(struct sock *sk, + struct socket *sock) +{ + return; +} + +static inline u32 selinux_netlbl_inet_conn_request(struct sk_buff *skb, + u32 sock_sid) +{ + return SECSID_NULL; +} + +static inline int selinux_netlbl_sock_rcv_skb(struct sk_security_struct *sksec, + struct sk_buff *skb, + struct avc_audit_data *ad) +{ + return 0; +} + +static inline u32 selinux_netlbl_socket_getpeersec_stream(struct socket *sock) +{ + return SECSID_NULL; +} + +static inline u32 selinux_netlbl_socket_getpeersec_dgram(struct sk_buff *skb) +{ + return SECSID_NULL; +} + +static inline int selinux_netlbl_inode_permission(struct inode *inode, + int mask) +{ + return 0; +} +#endif /* CONFIG_NETLABEL */ + +#endif diff --git a/security/selinux/ss/ebitmap.c b/security/selinux/ss/ebitmap.c index 47024a6..4b915eb 100644 --- a/security/selinux/ss/ebitmap.c +++ b/security/selinux/ss/ebitmap.c @@ -3,6 +3,14 @@ * * Author : Stephen Smalley, */ +/* + * Updated: Hewlett-Packard + * + * Added ebitmap_export() and ebitmap_import() + * + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + */ + #include #include #include @@ -59,6 +67,142 @@ int ebitmap_cpy(struct ebitmap *dst, struct ebitmap *src) return 0; } +/** + * ebitmap_export - Export an ebitmap to a unsigned char bitmap string + * @src: the ebitmap to export + * @dst: the resulting bitmap string + * @dst_len: length of dst in bytes + * + * Description: + * Allocate a buffer at least src->highbit bits long and export the extensible + * bitmap into the buffer. The bitmap string will be in little endian format, + * i.e. LSB first. The value returned in dst_len may not the true size of the + * buffer as the length of the buffer is rounded up to a multiple of MAPTYPE. + * The caller must free the buffer when finished. Returns zero on success, + * negative values on failure. + * + */ +int ebitmap_export(const struct ebitmap *src, + unsigned char **dst, + size_t *dst_len) +{ + size_t bitmap_len; + unsigned char *bitmap; + struct ebitmap_node *iter_node; + MAPTYPE node_val; + size_t bitmap_byte; + unsigned char bitmask; + + bitmap_len = src->highbit / 8; + if (src->highbit % 7) + bitmap_len += 1; + if (bitmap_len == 0) + return -EINVAL; + + bitmap = kzalloc((bitmap_len & ~(sizeof(MAPTYPE) - 1)) + + sizeof(MAPTYPE), + GFP_ATOMIC); + if (bitmap == NULL) + return -ENOMEM; + + iter_node = src->node; + do { + bitmap_byte = iter_node->startbit / 8; + bitmask = 0x80; + node_val = iter_node->map; + do { + if (bitmask == 0) { + bitmap_byte++; + bitmask = 0x80; + } + if (node_val & (MAPTYPE)0x01) + bitmap[bitmap_byte] |= bitmask; + node_val >>= 1; + bitmask >>= 1; + } while (node_val > 0); + iter_node = iter_node->next; + } while (iter_node); + + *dst = bitmap; + *dst_len = bitmap_len; + return 0; +} + +/** + * ebitmap_import - Import an unsigned char bitmap string into an ebitmap + * @src: the bitmap string + * @src_len: the bitmap length in bytes + * @dst: the empty ebitmap + * + * Description: + * This function takes a little endian bitmap string in src and imports it into + * the ebitmap pointed to by dst. Returns zero on success, negative values on + * failure. + * + */ +int ebitmap_import(const unsigned char *src, + size_t src_len, + struct ebitmap *dst) +{ + size_t src_off = 0; + struct ebitmap_node *node_new; + struct ebitmap_node *node_last = NULL; + size_t iter; + size_t iter_bit; + size_t iter_limit; + unsigned char src_byte; + + do { + iter_limit = src_len - src_off; + if (iter_limit >= sizeof(MAPTYPE)) { + if (*(MAPTYPE *)&src[src_off] == 0) { + src_off += sizeof(MAPTYPE); + continue; + } + iter_limit = sizeof(MAPTYPE); + } else { + iter = src_off; + src_byte = 0; + do { + src_byte |= src[iter++]; + } while (iter < src_len && src_byte == 0); + if (src_byte == 0) + break; + } + + node_new = kzalloc(sizeof(*node_new), GFP_ATOMIC); + if (unlikely(node_new == NULL)) { + ebitmap_destroy(dst); + return -ENOMEM; + } + node_new->startbit = src_off * 8; + iter = 0; + do { + src_byte = src[src_off++]; + iter_bit = iter++ * 8; + while (src_byte != 0) { + if (src_byte & 0x80) + node_new->map |= MAPBIT << iter_bit; + iter_bit++; + src_byte <<= 1; + } + } while (iter < iter_limit); + + if (node_last != NULL) + node_last->next = node_new; + else + dst->node = node_new; + node_last = node_new; + } while (src_off < src_len); + + if (likely(node_last != NULL)) + dst->highbit = node_last->startbit + MAPSIZE; + else + ebitmap_init(dst); + + return 0; +} + int ebitmap_contains(struct ebitmap *e1, struct ebitmap *e2) { struct ebitmap_node *n1, *n2; diff --git a/security/selinux/ss/ebitmap.h b/security/selinux/ss/ebitmap.h index 8bf4105..da2d465 100644 --- a/security/selinux/ss/ebitmap.h +++ b/security/selinux/ss/ebitmap.h @@ -69,6 +69,12 @@ static inline int ebitmap_node_get_bit(struct ebitmap_node * n, int ebitmap_cmp(struct ebitmap *e1, struct ebitmap *e2); int ebitmap_cpy(struct ebitmap *dst, struct ebitmap *src); +int ebitmap_export(const struct ebitmap *src, + unsigned char **dst, + size_t *dst_len); +int ebitmap_import(const unsigned char *src, + size_t src_len, + struct ebitmap *dst); int ebitmap_contains(struct ebitmap *e1, struct ebitmap *e2); int ebitmap_get_bit(struct ebitmap *e, unsigned long bit); int ebitmap_set_bit(struct ebitmap *e, unsigned long bit, int value); diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c index e15f7e0..119bd60 100644 --- a/security/selinux/ss/mls.c +++ b/security/selinux/ss/mls.c @@ -10,6 +10,13 @@ * * Copyright (C) 2004-2006 Trusted Computer Solutions, Inc. */ +/* + * Updated: Hewlett-Packard + * + * Added support to import/export the MLS label + * + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + */ #include #include @@ -565,3 +572,152 @@ int mls_compute_sid(struct context *scontext, return -EINVAL; } +/** + * mls_export_lvl - Export the MLS sensitivity levels + * @context: the security context + * @low: the low sensitivity level + * @high: the high sensitivity level + * + * Description: + * Given the security context copy the low MLS sensitivity level into lvl_low + * and the high sensitivity level in lvl_high. The MLS levels are only + * exported if the pointers are not NULL, if they are NULL then that level is + * not exported. + * + */ +void mls_export_lvl(const struct context *context, u32 *low, u32 *high) +{ + if (!selinux_mls_enabled) + return; + + if (low != NULL) + *low = context->range.level[0].sens - 1; + if (high != NULL) + *high = context->range.level[1].sens - 1; +} + +/** + * mls_import_lvl - Import the MLS sensitivity levels + * @context: the security context + * @low: the low sensitivity level + * @high: the high sensitivity level + * + * Description: + * Given the security context and the two sensitivty levels, set the MLS levels + * in the context according the two given as parameters. Returns zero on + * success, negative values on failure. + * + */ +void mls_import_lvl(struct context *context, u32 low, u32 high) +{ + if (!selinux_mls_enabled) + return; + + context->range.level[0].sens = low + 1; + context->range.level[1].sens = high + 1; +} + +/** + * mls_export_cat - Export the MLS categories + * @context: the security context + * @low: the low category + * @low_len: length of the cat_low bitmap in bytes + * @high: the high category + * @high_len: length of the cat_high bitmap in bytes + * + * Description: + * Given the security context export the low MLS category bitmap into cat_low + * and the high category bitmap into cat_high. The MLS categories are only + * exported if the pointers are not NULL, if they are NULL then that level is + * not exported. The caller is responsibile for freeing the memory when + * finished. Returns zero on success, negative values on failure. + * + */ +int mls_export_cat(const struct context *context, + unsigned char **low, + size_t *low_len, + unsigned char **high, + size_t *high_len) +{ + int rc = -EPERM; + + if (!selinux_mls_enabled) + return 0; + + if (low != NULL) { + rc = ebitmap_export(&context->range.level[0].cat, + low, + low_len); + if (rc != 0) + goto export_cat_failure; + } + if (high != NULL) { + rc = ebitmap_export(&context->range.level[1].cat, + high, + high_len); + if (rc != 0) + goto export_cat_failure; + } + + return 0; + +export_cat_failure: + if (low != NULL) + kfree(*low); + if (high != NULL) + kfree(*high); + return rc; +} + +/** + * mls_import_cat - Import the MLS categories + * @context: the security context + * @low: the low category + * @low_len: length of the cat_low bitmap in bytes + * @high: the high category + * @high_len: length of the cat_high bitmap in bytes + * + * Description: + * Given the security context and the two category bitmap strings import the + * categories into the security context. The MLS categories are only imported + * if the pointers are not NULL, if they are NULL they are skipped. Returns + * zero on success, negative values on failure. + * + */ +int mls_import_cat(struct context *context, + const unsigned char *low, + size_t low_len, + const unsigned char *high, + size_t high_len) +{ + int rc = -EPERM; + + if (!selinux_mls_enabled) + return 0; + + if (low != NULL) { + rc = ebitmap_import(low, + low_len, + &context->range.level[0].cat); + if (rc != 0) + goto import_cat_failure; + } + if (high != NULL) { + if (high == low) + rc = ebitmap_cpy(&context->range.level[1].cat, + &context->range.level[0].cat); + else + rc = ebitmap_import(high, + high_len, + &context->range.level[1].cat); + if (rc != 0) + goto import_cat_failure; + } + + return 0; + +import_cat_failure: + ebitmap_destroy(&context->range.level[0].cat); + ebitmap_destroy(&context->range.level[1].cat); + return rc; +} diff --git a/security/selinux/ss/mls.h b/security/selinux/ss/mls.h index 90c5e88..df6032c 100644 --- a/security/selinux/ss/mls.h +++ b/security/selinux/ss/mls.h @@ -10,6 +10,13 @@ * * Copyright (C) 2004-2006 Trusted Computer Solutions, Inc. */ +/* + * Updated: Hewlett-Packard + * + * Added support to import/export the MLS label + * + * (c) Copyright Hewlett-Packard Development Company, L.P., 2006 + */ #ifndef _SS_MLS_H_ #define _SS_MLS_H_ @@ -62,5 +69,19 @@ int mls_compute_sid(struct context *scontext, int mls_setup_user_range(struct context *fromcon, struct user_datum *user, struct context *usercon); +void mls_export_lvl(const struct context *context, u32 *low, u32 *high); +void mls_import_lvl(struct context *context, u32 low, u32 high); + +int mls_export_cat(const struct context *context, + unsigned char **low, + size_t *low_len, + unsigned char **high, + size_t *high_len); +int mls_import_cat(struct context *context, + const unsigned char *low, + size_t low_len, + const unsigned char *high, + size_t high_len); + #endif /* _SS_MLS_H */ diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c index b00ec69..910afa1 100644 --- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -13,6 +13,11 @@ * * Added conditional policy language extensions * + * Updated: Hewlett-Packard + * + * Added support for NetLabel + * + * Copyright (C) 2006 Hewlett-Packard Development Company, L.P. * Copyright (C) 2004-2006 Trusted Computer Solutions, Inc. * Copyright (C) 2003 - 2004 Tresys Technology, LLC * Copyright (C) 2003 Red Hat, Inc., James Morris @@ -29,6 +34,8 @@ #include #include #include +#include +#include #include "flask.h" #include "avc.h" @@ -40,6 +47,8 @@ #include "services.h" #include "conditional.h" #include "mls.h" +#include "objsec.h" +#include "selinux_netlabel.h" extern void selnl_notify_policyload(u32 seqno); unsigned int policydb_loaded_version; @@ -1241,6 +1250,7 @@ int security_load_policy(void *data, size_t len) selinux_complete_init(); avc_ss_reset(seqno); selnl_notify_policyload(seqno); + selinux_netlbl_cache_invalidate(); return 0; } @@ -1295,6 +1305,7 @@ int security_load_policy(void *data, size_t len) avc_ss_reset(seqno); selnl_notify_policyload(seqno); + selinux_netlbl_cache_invalidate(); return 0; @@ -2133,3 +2144,480 @@ void selinux_audit_set_callback(int (*callback)(void)) { aurule_callback = callback; } + +#ifdef CONFIG_NETLABEL +/* + * This is the structure we store inside the NetLabel cache block. + */ +#define NETLBL_CACHE(x) ((struct netlbl_cache *)(x)) +#define NETLBL_CACHE_T_NONE 0 +#define NETLBL_CACHE_T_SID 1 +#define NETLBL_CACHE_T_MLS 2 +struct netlbl_cache { + u32 type; + union { + u32 sid; + struct mls_range mls_label; + } data; +}; + +/** + * selinux_netlbl_cache_free - Free the NetLabel cached data + * @data: the data to free + * + * Description: + * This function is intended to be used as the free() callback inside the + * netlbl_lsm_cache structure. + * + */ +static void selinux_netlbl_cache_free(const void *data) +{ + struct netlbl_cache *cache = NETLBL_CACHE(data); + switch (cache->type) { + case NETLBL_CACHE_T_MLS: + ebitmap_destroy(&cache->data.mls_label.level[0].cat); + break; + } + kfree(data); +} + +/** + * selinux_netlbl_cache_add - Add an entry to the NetLabel cache + * @skb: the packet + * @ctx: the SELinux context + * + * Description: + * Attempt to cache the context in @ctx, which was derived from the packet in + * @skb, in the NetLabel subsystem cache. + * + */ +static void selinux_netlbl_cache_add(struct sk_buff *skb, struct context *ctx) +{ + struct netlbl_cache *cache = NULL; + struct netlbl_lsm_secattr secattr; + + netlbl_secattr_init(&secattr); + + cache = kzalloc(sizeof(*cache), GFP_ATOMIC); + if (cache == NULL) + goto netlbl_cache_add_failure; + secattr.cache.free = selinux_netlbl_cache_free; + secattr.cache.data = (void *)cache; + + cache->type = NETLBL_CACHE_T_MLS; + if (ebitmap_cpy(&cache->data.mls_label.level[0].cat, + &ctx->range.level[0].cat) != 0) + goto netlbl_cache_add_failure; + cache->data.mls_label.level[1].cat.highbit = + cache->data.mls_label.level[0].cat.highbit; + cache->data.mls_label.level[1].cat.node = + cache->data.mls_label.level[0].cat.node; + cache->data.mls_label.level[0].sens = ctx->range.level[0].sens; + cache->data.mls_label.level[1].sens = ctx->range.level[0].sens; + + if (netlbl_cache_add(skb, &secattr) != 0) + goto netlbl_cache_add_failure; + + return; + +netlbl_cache_add_failure: + netlbl_secattr_destroy(&secattr, 1); +} + +/** + * selinux_netlbl_cache_invalidate - Invalidate the NetLabel cache + * + * Description: + * Invalidate the NetLabel security attribute mapping cache. + * + */ +void selinux_netlbl_cache_invalidate(void) +{ + netlbl_cache_invalidate(); +} + +/** + * selinux_netlbl_secattr_to_sid - Convert a NetLabel secattr to a SELinux SID + * @skb: the network packet + * @secattr: the NetLabel packet security attributes + * @base_sid: the SELinux SID to use as a context for MLS only attributes + * @sid: the SELinux SID + * + * Description: + * Convert the given NetLabel packet security attributes in @secattr into a + * SELinux SID. If the @secattr field does not contain a full SELinux + * SID/context then use the context in @base_sid as the foundation. If @skb + * is not NULL attempt to cache as much data as possibile. Returns zero on + * success, negative values on failure. + * + */ +static int selinux_netlbl_secattr_to_sid(struct sk_buff *skb, + struct netlbl_lsm_secattr *secattr, + u32 base_sid, + u32 *sid) +{ + int rc = -EIDRM; + struct context *ctx; + struct context ctx_new; + struct netlbl_cache *cache; + + POLICY_RDLOCK; + + if (secattr->cache.data) { + cache = NETLBL_CACHE(secattr->cache.data); + switch (cache->type) { + case NETLBL_CACHE_T_SID: + *sid = cache->data.sid; + rc = 0; + break; + case NETLBL_CACHE_T_MLS: + ctx = sidtab_search(&sidtab, base_sid); + if (ctx == NULL) + goto netlbl_secattr_to_sid_return; + + ctx_new.user = ctx->user; + ctx_new.role = ctx->role; + ctx_new.type = ctx->type; + ctx_new.range.level[0].sens = + cache->data.mls_label.level[0].sens; + ctx_new.range.level[0].cat.highbit = + cache->data.mls_label.level[0].cat.highbit; + ctx_new.range.level[0].cat.node = + cache->data.mls_label.level[0].cat.node; + ctx_new.range.level[1].sens = + cache->data.mls_label.level[1].sens; + ctx_new.range.level[1].cat.highbit = + cache->data.mls_label.level[1].cat.highbit; + ctx_new.range.level[1].cat.node = + cache->data.mls_label.level[1].cat.node; + + rc = sidtab_context_to_sid(&sidtab, &ctx_new, sid); + break; + default: + goto netlbl_secattr_to_sid_return; + } + } else if (secattr->mls_lvl_vld) { + ctx = sidtab_search(&sidtab, base_sid); + if (ctx == NULL) + goto netlbl_secattr_to_sid_return; + + ctx_new.user = ctx->user; + ctx_new.role = ctx->role; + ctx_new.type = ctx->type; + mls_import_lvl(&ctx_new, secattr->mls_lvl, secattr->mls_lvl); + if (secattr->mls_cat) { + if (mls_import_cat(&ctx_new, + secattr->mls_cat, + secattr->mls_cat_len, + NULL, + 0) != 0) + goto netlbl_secattr_to_sid_return; + ctx_new.range.level[1].cat.highbit = + ctx_new.range.level[0].cat.highbit; + ctx_new.range.level[1].cat.node = + ctx_new.range.level[0].cat.node; + } else { + ebitmap_init(&ctx_new.range.level[0].cat); + ebitmap_init(&ctx_new.range.level[1].cat); + } + if (mls_context_isvalid(&policydb, &ctx_new) != 1) + goto netlbl_secattr_to_sid_return_cleanup; + + rc = sidtab_context_to_sid(&sidtab, &ctx_new, sid); + if (rc != 0) + goto netlbl_secattr_to_sid_return_cleanup; + + if (skb != NULL) + selinux_netlbl_cache_add(skb, &ctx_new); + ebitmap_destroy(&ctx_new.range.level[0].cat); + } else { + *sid = SECINITSID_UNLABELED; + rc = 0; + } + +netlbl_secattr_to_sid_return: + POLICY_RDUNLOCK; + return rc; +netlbl_secattr_to_sid_return_cleanup: + ebitmap_destroy(&ctx_new.range.level[0].cat); + goto netlbl_secattr_to_sid_return; +} + +/** + * selinux_netlbl_skbuff_getsid - Get the sid of a packet using NetLabel + * @skb: the packet + * @base_sid: the SELinux SID to use as a context for MLS only attributes + * @sid: the SID + * + * Description: + * Call the NetLabel mechanism to get the security attributes of the given + * packet and use those attributes to determine the correct context/SID to + * assign to the packet. Returns zero on success, negative values on failure. + * + */ +static int selinux_netlbl_skbuff_getsid(struct sk_buff *skb, + u32 base_sid, + u32 *sid) +{ + int rc; + struct netlbl_lsm_secattr secattr; + + netlbl_secattr_init(&secattr); + rc = netlbl_skbuff_getattr(skb, &secattr); + if (rc == 0) + rc = selinux_netlbl_secattr_to_sid(skb, + &secattr, + base_sid, + sid); + netlbl_secattr_destroy(&secattr, 0); + + return rc; +} + +/** + * selinux_netlbl_socket_setsid - Label a socket using the NetLabel mechanism + * @sock: the socket to label + * @sid: the SID to use + * + * Description: + * Attempt to label a socket using the NetLabel mechanism using the given + * SID. Returns zero values on success, negative values on failure. + * + */ +static int selinux_netlbl_socket_setsid(struct socket *sock, u32 sid) +{ + int rc = -ENOENT; + struct sk_security_struct *sksec = sock->sk->sk_security; + struct netlbl_lsm_secattr secattr; + struct context *ctx; + + if (!ss_initialized) + return 0; + + POLICY_RDLOCK; + + ctx = sidtab_search(&sidtab, sid); + if (ctx == NULL) + goto netlbl_socket_setsid_return; + + netlbl_secattr_init(&secattr); + secattr.domain = kstrdup(policydb.p_type_val_to_name[ctx->type - 1], + GFP_ATOMIC); + mls_export_lvl(ctx, &secattr.mls_lvl, NULL); + secattr.mls_lvl_vld = 1; + mls_export_cat(ctx, + &secattr.mls_cat, + &secattr.mls_cat_len, + NULL, + NULL); + + rc = netlbl_socket_setattr(sock, &secattr); + if (rc == 0) + sksec->nlbl_state = NLBL_LABELED; + + netlbl_secattr_destroy(&secattr, 0); + +netlbl_socket_setsid_return: + POLICY_RDUNLOCK; + return rc; +} + +/** + * selinux_netlbl_socket_post_create - Label a socket using NetLabel + * @sock: the socket to label + * @sock_family: the socket family + * @sid: the SID to use + * + * Description: + * Attempt to label a socket using the NetLabel mechanism using the given + * SID. Returns zero values on success, negative values on failure. + * + */ +int selinux_netlbl_socket_post_create(struct socket *sock, + int sock_family, + u32 sid) +{ + struct inode_security_struct *isec = SOCK_INODE(sock)->i_security; + struct sk_security_struct *sksec = sock->sk->sk_security; + + if (sock_family != PF_INET) + return 0; + + sksec->sclass = isec->sclass; + sksec->nlbl_state = NLBL_REQUIRE; + return selinux_netlbl_socket_setsid(sock, sid); +} + +/** + * selinux_netlbl_sock_graft - Netlabel the new socket + * @sk: the new connection + * @sock: the new socket + * + * Description: + * The connection represented by @sk is being grafted onto @sock so set the + * socket's NetLabel to match the SID of @sk. + * + */ +void selinux_netlbl_sock_graft(struct sock *sk, struct socket *sock) +{ + struct inode_security_struct *isec = SOCK_INODE(sock)->i_security; + struct sk_security_struct *sksec = sk->sk_security; + + if (sk->sk_family != PF_INET) + return; + + sksec->nlbl_state = NLBL_REQUIRE; + sksec->peer_sid = sksec->sid; + sksec->sclass = isec->sclass; + + /* Try to set the NetLabel on the socket to save time later, if we fail + * here we will pick up the pieces in later calls to + * selinux_netlbl_inode_permission(). */ + selinux_netlbl_socket_setsid(sock, sksec->sid); +} + +/** + * selinux_netlbl_inet_conn_request - Handle a new connection request + * @skb: the packet + * @sock_sid: the SID of the parent socket + * + * Description: + * If present, use the security attributes of the packet in @skb and the + * parent sock's SID to arrive at a SID for the new child sock. Returns the + * SID of the connection or SECSID_NULL on failure. + * + */ +u32 selinux_netlbl_inet_conn_request(struct sk_buff *skb, u32 sock_sid) +{ + int rc; + u32 peer_sid; + + rc = selinux_netlbl_skbuff_getsid(skb, sock_sid, &peer_sid); + if (rc != 0) + return SECSID_NULL; + + if (peer_sid == SECINITSID_UNLABELED) + return SECSID_NULL; + + return peer_sid; +} + +/** + * __selinux_netlbl_inode_permission - Label a socket using NetLabel + * @inode: the file descriptor's inode + * @mask: the permission mask + * + * Description: + * Try to label a socket with the inode's SID using NetLabel. Returns zero on + * success, negative values on failure. + * + */ +int __selinux_netlbl_inode_permission(struct inode *inode, int mask) +{ + int rc; + struct socket *sock = SOCKET_I(inode); + struct sk_security_struct *sksec = sock->sk->sk_security; + + lock_sock(sock->sk); + rc = selinux_netlbl_socket_setsid(sock, sksec->sid); + release_sock(sock->sk); + + return rc; +} + +/** + * selinux_netlbl_sock_rcv_skb - Do an inbound access check using NetLabel + * @sksec: the sock's sk_security_struct + * @skb: the packet + * @ad: the audit data + * + * Description: + * Fetch the NetLabel security attributes from @skb and perform an access check + * against the receiving socket. Returns zero on success, negative values on + * error. + * + */ +int selinux_netlbl_sock_rcv_skb(struct sk_security_struct *sksec, + struct sk_buff *skb, + struct avc_audit_data *ad) +{ + int rc; + u32 netlbl_sid; + u32 recv_perm; + + rc = selinux_netlbl_skbuff_getsid(skb, sksec->sid, &netlbl_sid); + if (rc != 0) + return rc; + + if (netlbl_sid == SECINITSID_UNLABELED) + return 0; + + switch (sksec->sclass) { + case SECCLASS_UDP_SOCKET: + recv_perm = UDP_SOCKET__RECV_MSG; + break; + case SECCLASS_TCP_SOCKET: + recv_perm = TCP_SOCKET__RECV_MSG; + break; + default: + recv_perm = RAWIP_SOCKET__RECV_MSG; + } + + rc = avc_has_perm(sksec->sid, + netlbl_sid, + sksec->sclass, + recv_perm, + ad); + if (rc == 0) + return 0; + + netlbl_skbuff_err(skb, rc); + return rc; +} + +/** + * selinux_netlbl_socket_peersid - Return the peer SID of a connected socket + * @sock: the socket + * + * Description: + * Examine @sock to find the connected peer's SID. Returns the SID on success + * or SECSID_NULL on error. + * + */ +u32 selinux_netlbl_socket_getpeersec_stream(struct socket *sock) +{ + struct sk_security_struct *sksec = sock->sk->sk_security; + + if (sksec->peer_sid == SECINITSID_UNLABELED) + return SECSID_NULL; + + return sksec->peer_sid; +} + +/** + * selinux_netlbl_socket_getpeersec_dgram - Return the SID of a NetLabel packet + * @skb: the packet + * + * Description: + * Examine @skb to find the SID assigned to it by NetLabel. Returns the SID on + * success, SECSID_NULL on error. + * + */ +u32 selinux_netlbl_socket_getpeersec_dgram(struct sk_buff *skb) +{ + int peer_sid; + struct sock *sk = skb->sk; + struct inode_security_struct *isec; + + if (sk == NULL || sk->sk_socket == NULL) + return SECSID_NULL; + + isec = SOCK_INODE(sk->sk_socket)->i_security; + if (selinux_netlbl_skbuff_getsid(skb, isec->sid, &peer_sid) != 0) + return SECSID_NULL; + if (peer_sid == SECINITSID_UNLABELED) + return SECSID_NULL; + + return peer_sid; +} +#endif /* CONFIG_NETLABEL */ -- cgit v0.10.2 From 8161327311fe4da1684ed08015e141feb9a0a737 Mon Sep 17 00:00:00 2001 From: Paul Moore Date: Thu, 3 Aug 2006 16:50:39 -0700 Subject: [NetLabel]: tie NetLabel into the Kconfig system Modify the net/Kconfig file to enable selecting the NetLabel Kconfig options. Signed-off-by: Paul Moore Signed-off-by: David S. Miller diff --git a/net/Kconfig b/net/Kconfig index 4959a4e..eb855b7 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -249,6 +249,8 @@ source "net/ieee80211/Kconfig" config WIRELESS_EXT bool +source "net/netlabel/Kconfig" + endif # if NET endmenu # Networking -- cgit v0.10.2 From 5d0bbeeb144f631150881712607345c532e38e7e Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Fri, 4 Aug 2006 03:37:36 -0700 Subject: [IPV6]: Remove ndiscs rt6_lock dependency (Ab)using rt6_lock wouldn't work anymore if rt6_lock is converted into a per table lock. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv6/route.c b/net/ipv6/route.c index d9baca0..ce1f49b 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -747,8 +747,6 @@ static void ip6_rt_update_pmtu(struct dst_entry *dst, u32 mtu) } } -/* Protected by rt6_lock. */ -static struct dst_entry *ndisc_dst_gc_list; static int ipv6_get_mtu(struct net_device *dev); static inline unsigned int ipv6_advmss(unsigned int mtu) @@ -769,6 +767,9 @@ static inline unsigned int ipv6_advmss(unsigned int mtu) return mtu; } +static struct dst_entry *ndisc_dst_gc_list; +DEFINE_SPINLOCK(ndisc_lock); + struct dst_entry *ndisc_dst_alloc(struct net_device *dev, struct neighbour *neigh, struct in6_addr *addr, @@ -809,10 +810,10 @@ struct dst_entry *ndisc_dst_alloc(struct net_device *dev, rt->rt6i_dst.plen = 128; #endif - write_lock_bh(&rt6_lock); + spin_lock_bh(&ndisc_lock); rt->u.dst.next = ndisc_dst_gc_list; ndisc_dst_gc_list = &rt->u.dst; - write_unlock_bh(&rt6_lock); + spin_unlock_bh(&ndisc_lock); fib6_force_start_gc(); @@ -826,8 +827,11 @@ int ndisc_dst_gc(int *more) int freed; next = NULL; + freed = 0; + + spin_lock_bh(&ndisc_lock); pprev = &ndisc_dst_gc_list; - freed = 0; + while ((dst = *pprev) != NULL) { if (!atomic_read(&dst->__refcnt)) { *pprev = dst->next; @@ -839,6 +843,8 @@ int ndisc_dst_gc(int *more) } } + spin_unlock_bh(&ndisc_lock); + return freed; } -- cgit v0.10.2 From c71099acce933455123ee505cc75964610a209ad Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Fri, 4 Aug 2006 23:20:06 -0700 Subject: [IPV6]: Multiple Routing Tables Adds the framework to support multiple IPv6 routing tables. Currently all automatically generated routes are put into the same table. This could be changed at a later point after considering the produced locking overhead. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h index a66e9de..8184115 100644 --- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -51,6 +51,8 @@ struct rt6key int plen; }; +struct fib6_table; + struct rt6_info { union { @@ -71,6 +73,7 @@ struct rt6_info u32 rt6i_flags; u32 rt6i_metric; atomic_t rt6i_ref; + struct fib6_table *rt6i_table; struct rt6key rt6i_dst; struct rt6key rt6i_src; @@ -143,12 +146,43 @@ struct rt6_statistics { typedef void (*f_pnode)(struct fib6_node *fn, void *); -extern struct fib6_node ip6_routing_table; +struct fib6_table { + struct hlist_node tb6_hlist; + u32 tb6_id; + rwlock_t tb6_lock; + struct fib6_node tb6_root; +}; + +#define RT6_TABLE_UNSPEC RT_TABLE_UNSPEC +#define RT6_TABLE_MAIN RT_TABLE_MAIN +#define RT6_TABLE_LOCAL RT6_TABLE_MAIN +#define RT6_TABLE_DFLT RT6_TABLE_MAIN +#define RT6_TABLE_INFO RT6_TABLE_MAIN +#define RT6_TABLE_PREFIX RT6_TABLE_MAIN + +#ifdef CONFIG_IPV6_MULTIPLE_TABLES +#define FIB6_TABLE_MIN 1 +#define FIB6_TABLE_MAX RT_TABLE_MAX +#else +#define FIB6_TABLE_MIN RT_TABLE_MAIN +#define FIB6_TABLE_MAX FIB6_TABLE_MIN +#endif + +#define RT6_F_STRICT 1 +#define RT6_F_HAS_SADDR 2 + +typedef struct rt6_info *(*pol_lookup_t)(struct fib6_table *, + struct flowi *, int); /* * exported functions */ +extern struct fib6_table * fib6_get_table(u32 id); +extern struct fib6_table * fib6_new_table(u32 id); +extern struct dst_entry * fib6_rule_lookup(struct flowi *fl, int flags, + pol_lookup_t lookup); + extern struct fib6_node *fib6_lookup(struct fib6_node *root, struct in6_addr *daddr, struct in6_addr *saddr); @@ -161,6 +195,9 @@ extern void fib6_clean_tree(struct fib6_node *root, int (*func)(struct rt6_info *, void *arg), int prune, void *arg); +extern void fib6_clean_all(int (*func)(struct rt6_info *, void *arg), + int prune, void *arg); + extern int fib6_walk(struct fib6_walker_t *w); extern int fib6_walk_continue(struct fib6_walker_t *w); diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h index 96b0e66..d49c8c9 100644 --- a/include/net/ip6_route.h +++ b/include/net/ip6_route.h @@ -58,7 +58,8 @@ extern int ipv6_route_ioctl(unsigned int cmd, void __user *arg); extern int ip6_route_add(struct in6_rtmsg *rtmsg, struct nlmsghdr *, void *rtattr, - struct netlink_skb_parms *req); + struct netlink_skb_parms *req, + u32 table_id); extern int ip6_ins_rt(struct rt6_info *, struct nlmsghdr *, void *rtattr, diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig index 0ba06c0..159c63d 100644 --- a/net/ipv6/Kconfig +++ b/net/ipv6/Kconfig @@ -136,3 +136,9 @@ config IPV6_TUNNEL If unsure, say N. +config IPV6_MULTIPLE_TABLES + bool "IPv6: Multiple Routing Tables" + depends on IPV6 && EXPERIMENTAL + ---help--- + Support multiple routing tables. + diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index c7852b3..318767f 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -1525,7 +1525,7 @@ addrconf_prefix_route(struct in6_addr *pfx, int plen, struct net_device *dev, if (dev->type == ARPHRD_SIT && (dev->flags&IFF_POINTOPOINT)) rtmsg.rtmsg_flags |= RTF_NONEXTHOP; - ip6_route_add(&rtmsg, NULL, NULL, NULL); + ip6_route_add(&rtmsg, NULL, NULL, NULL, RT6_TABLE_PREFIX); } /* Create "default" multicast route to the interface */ @@ -1542,7 +1542,7 @@ static void addrconf_add_mroute(struct net_device *dev) rtmsg.rtmsg_ifindex = dev->ifindex; rtmsg.rtmsg_flags = RTF_UP; rtmsg.rtmsg_type = RTMSG_NEWROUTE; - ip6_route_add(&rtmsg, NULL, NULL, NULL); + ip6_route_add(&rtmsg, NULL, NULL, NULL, RT6_TABLE_LOCAL); } static void sit_route_add(struct net_device *dev) @@ -1559,7 +1559,7 @@ static void sit_route_add(struct net_device *dev) rtmsg.rtmsg_flags = RTF_UP|RTF_NONEXTHOP; rtmsg.rtmsg_ifindex = dev->ifindex; - ip6_route_add(&rtmsg, NULL, NULL, NULL); + ip6_route_add(&rtmsg, NULL, NULL, NULL, RT6_TABLE_MAIN); } static void addrconf_add_lroute(struct net_device *dev) diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 7642212..fcd7da8 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -26,6 +26,7 @@ #include #include #include +#include #ifdef CONFIG_PROC_FS #include @@ -147,6 +148,126 @@ static __inline__ void rt6_release(struct rt6_info *rt) dst_free(&rt->u.dst); } +static struct fib6_table fib6_main_tbl = { + .tb6_id = RT6_TABLE_MAIN, + .tb6_lock = RW_LOCK_UNLOCKED, + .tb6_root = { + .leaf = &ip6_null_entry, + .fn_flags = RTN_ROOT | RTN_TL_ROOT | RTN_RTINFO, + }, +}; + +#ifdef CONFIG_IPV6_MULTIPLE_TABLES + +#define FIB_TABLE_HASHSZ 256 +static struct hlist_head fib_table_hash[FIB_TABLE_HASHSZ]; + +static struct fib6_table *fib6_alloc_table(u32 id) +{ + struct fib6_table *table; + + table = kzalloc(sizeof(*table), GFP_ATOMIC); + if (table != NULL) { + table->tb6_id = id; + table->tb6_lock = RW_LOCK_UNLOCKED; + table->tb6_root.leaf = &ip6_null_entry; + table->tb6_root.fn_flags = RTN_ROOT | RTN_TL_ROOT | RTN_RTINFO; + } + + return table; +} + +static void fib6_link_table(struct fib6_table *tb) +{ + unsigned int h; + + h = tb->tb6_id & (FIB_TABLE_HASHSZ - 1); + + /* + * No protection necessary, this is the only list mutatation + * operation, tables never disappear once they exist. + */ + hlist_add_head_rcu(&tb->tb6_hlist, &fib_table_hash[h]); +} + +struct fib6_table *fib6_new_table(u32 id) +{ + struct fib6_table *tb; + + if (id == 0) + id = RT6_TABLE_MAIN; + tb = fib6_get_table(id); + if (tb) + return tb; + + tb = fib6_alloc_table(id); + if (tb != NULL) + fib6_link_table(tb); + + return tb; +} + +struct fib6_table *fib6_get_table(u32 id) +{ + struct fib6_table *tb; + struct hlist_node *node; + unsigned int h; + + if (id == 0) + id = RT6_TABLE_MAIN; + h = id & (FIB_TABLE_HASHSZ - 1); + rcu_read_lock(); + hlist_for_each_entry_rcu(tb, node, &fib_table_hash[h], tb6_hlist) { + if (tb->tb6_id == id) { + rcu_read_unlock(); + return tb; + } + } + rcu_read_unlock(); + + return NULL; +} + +struct dst_entry *fib6_rule_lookup(struct flowi *fl, int flags, + pol_lookup_t lookup) +{ + /* + * TODO: Add rule lookup + */ + struct fib6_table *table = fib6_get_table(RT6_TABLE_MAIN); + + return (struct dst_entry *) lookup(table, fl, flags); +} + +static void __init fib6_tables_init(void) +{ + fib6_link_table(&fib6_main_tbl); +} + +#else + +struct fib6_table *fib6_new_table(u32 id) +{ + return fib6_get_table(id); +} + +struct fib6_table *fib6_get_table(u32 id) +{ + return &fib6_main_tbl; +} + +struct dst_entry *fib6_rule_lookup(struct flowi *fl, int flags, + pol_lookup_t lookup) +{ + return (struct dst_entry *) lookup(&fib6_main_tbl, fl, flags); +} + +static void __init fib6_tables_init(void) +{ +} + +#endif + /* * Routing Table @@ -1064,6 +1185,22 @@ void fib6_clean_tree(struct fib6_node *root, fib6_walk(&c.w); } +void fib6_clean_all(int (*func)(struct rt6_info *, void *arg), + int prune, void *arg) +{ + int i; + struct fib6_table *table; + + for (i = FIB6_TABLE_MIN; i <= FIB6_TABLE_MAX; i++) { + table = fib6_get_table(i); + if (table != NULL) { + write_lock_bh(&table->tb6_lock); + fib6_clean_tree(&table->tb6_root, func, prune, arg); + write_unlock_bh(&table->tb6_lock); + } + } +} + static int fib6_prune_clone(struct rt6_info *rt, void *arg) { if (rt->rt6i_flags & RTF_CACHE) { @@ -1142,11 +1279,8 @@ void fib6_run_gc(unsigned long dummy) } gc_args.more = 0; - - write_lock_bh(&rt6_lock); ndisc_dst_gc(&gc_args.more); - fib6_clean_tree(&ip6_routing_table, fib6_age, 0, NULL); - write_unlock_bh(&rt6_lock); + fib6_clean_all(fib6_age, 0, NULL); if (gc_args.more) mod_timer(&ip6_fib_timer, jiffies + ip6_rt_gc_interval); @@ -1165,6 +1299,8 @@ void __init fib6_init(void) NULL, NULL); if (!fib6_node_kmem) panic("cannot create fib6_nodes cache"); + + fib6_tables_init(); } void fib6_gc_cleanup(void) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index ce1f49b..73efdad 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -140,16 +140,6 @@ struct rt6_info ip6_null_entry = { .rt6i_ref = ATOMIC_INIT(1), }; -struct fib6_node ip6_routing_table = { - .leaf = &ip6_null_entry, - .fn_flags = RTN_ROOT | RTN_TL_ROOT | RTN_RTINFO, -}; - -/* Protects all the ip6 fib */ - -DEFINE_RWLOCK(rt6_lock); - - /* allocate dst with ip6_dst_ops */ static __inline__ struct rt6_info *ip6_dst_alloc(void) { @@ -188,8 +178,14 @@ static __inline__ int rt6_check_expired(const struct rt6_info *rt) time_after(jiffies, rt->rt6i_expires)); } +static inline int rt6_need_strict(struct in6_addr *daddr) +{ + return (ipv6_addr_type(daddr) & + (IPV6_ADDR_MULTICAST | IPV6_ADDR_LINKLOCAL)); +} + /* - * Route lookup. Any rt6_lock is implied. + * Route lookup. Any table->tb6_lock is implied. */ static __inline__ struct rt6_info *rt6_device_match(struct rt6_info *rt, @@ -441,27 +437,66 @@ int rt6_route_rcv(struct net_device *dev, u8 *opt, int len, } #endif -struct rt6_info *rt6_lookup(struct in6_addr *daddr, struct in6_addr *saddr, - int oif, int strict) +#define BACKTRACK() \ +if (rt == &ip6_null_entry && flags & RT6_F_STRICT) { \ + while ((fn = fn->parent) != NULL) { \ + if (fn->fn_flags & RTN_TL_ROOT) { \ + dst_hold(&rt->u.dst); \ + goto out; \ + } \ + if (fn->fn_flags & RTN_RTINFO) \ + goto restart; \ + } \ +} + +static struct rt6_info *ip6_pol_route_lookup(struct fib6_table *table, + struct flowi *fl, int flags) { struct fib6_node *fn; struct rt6_info *rt; - read_lock_bh(&rt6_lock); - fn = fib6_lookup(&ip6_routing_table, daddr, saddr); - rt = rt6_device_match(fn->leaf, oif, strict); + read_lock_bh(&table->tb6_lock); + fn = fib6_lookup(&table->tb6_root, &fl->fl6_dst, &fl->fl6_src); +restart: + rt = fn->leaf; + rt = rt6_device_match(rt, fl->oif, flags & RT6_F_STRICT); + BACKTRACK(); dst_hold(&rt->u.dst); - rt->u.dst.__use++; - read_unlock_bh(&rt6_lock); +out: + read_unlock_bh(&table->tb6_lock); rt->u.dst.lastuse = jiffies; - if (rt->u.dst.error == 0) - return rt; - dst_release(&rt->u.dst); + rt->u.dst.__use++; + + return rt; + +} + +struct rt6_info *rt6_lookup(struct in6_addr *daddr, struct in6_addr *saddr, + int oif, int strict) +{ + struct flowi fl = { + .oif = oif, + .nl_u = { + .ip6_u = { + .daddr = *daddr, + /* TODO: saddr */ + }, + }, + }; + struct dst_entry *dst; + int flags = strict ? RT6_F_STRICT : 0; + + dst = fib6_rule_lookup(&fl, flags, ip6_pol_route_lookup); + if (dst->error == 0) + return (struct rt6_info *) dst; + + dst_release(dst); + return NULL; } -/* ip6_ins_rt is called with FREE rt6_lock. +/* ip6_ins_rt is called with FREE table->tb6_lock. It takes new route entry, the addition fails by any reason the route is freed. In any case, if caller does not hold it, it may be destroyed. @@ -471,10 +506,12 @@ int ip6_ins_rt(struct rt6_info *rt, struct nlmsghdr *nlh, void *_rtattr, struct netlink_skb_parms *req) { int err; + struct fib6_table *table; - write_lock_bh(&rt6_lock); - err = fib6_add(&ip6_routing_table, rt, nlh, _rtattr, req); - write_unlock_bh(&rt6_lock); + table = rt->rt6i_table; + write_lock_bh(&table->tb6_lock); + err = fib6_add(&table->tb6_root, rt, nlh, _rtattr, req); + write_unlock_bh(&table->tb6_lock); return err; } @@ -532,51 +569,40 @@ static struct rt6_info *rt6_alloc_clone(struct rt6_info *ort, struct in6_addr *d return rt; } -#define BACKTRACK() \ -if (rt == &ip6_null_entry) { \ - while ((fn = fn->parent) != NULL) { \ - if (fn->fn_flags & RTN_ROOT) { \ - goto out; \ - } \ - if (fn->fn_flags & RTN_RTINFO) \ - goto restart; \ - } \ -} - - -void ip6_route_input(struct sk_buff *skb) +struct rt6_info *ip6_pol_route_input(struct fib6_table *table, struct flowi *fl, + int flags) { struct fib6_node *fn; struct rt6_info *rt, *nrt; - int strict; + int strict = 0; int attempts = 3; int err; int reachable = RT6_SELECT_F_REACHABLE; - strict = ipv6_addr_type(&skb->nh.ipv6h->daddr) & (IPV6_ADDR_MULTICAST|IPV6_ADDR_LINKLOCAL) ? RT6_SELECT_F_IFACE : 0; + if (flags & RT6_F_STRICT) + strict = RT6_SELECT_F_IFACE; relookup: - read_lock_bh(&rt6_lock); + read_lock_bh(&table->tb6_lock); restart_2: - fn = fib6_lookup(&ip6_routing_table, &skb->nh.ipv6h->daddr, - &skb->nh.ipv6h->saddr); + fn = fib6_lookup(&table->tb6_root, &fl->fl6_dst, &fl->fl6_src); restart: - rt = rt6_select(&fn->leaf, skb->dev->ifindex, strict | reachable); + rt = rt6_select(&fn->leaf, fl->iif, strict | reachable); BACKTRACK(); if (rt == &ip6_null_entry || rt->rt6i_flags & RTF_CACHE) goto out; dst_hold(&rt->u.dst); - read_unlock_bh(&rt6_lock); + read_unlock_bh(&table->tb6_lock); if (!rt->rt6i_nexthop && !(rt->rt6i_flags & RTF_NONEXTHOP)) - nrt = rt6_alloc_cow(rt, &skb->nh.ipv6h->daddr, &skb->nh.ipv6h->saddr); + nrt = rt6_alloc_cow(rt, &fl->fl6_dst, &fl->fl6_src); else { #if CLONE_OFFLINK_ROUTE - nrt = rt6_alloc_clone(rt, &skb->nh.ipv6h->daddr); + nrt = rt6_alloc_clone(rt, &fl->fl6_dst); #else goto out2; #endif @@ -587,7 +613,7 @@ restart: dst_hold(&rt->u.dst); if (nrt) { - err = ip6_ins_rt(nrt, NULL, NULL, &NETLINK_CB(skb)); + err = ip6_ins_rt(nrt, NULL, NULL, NULL); if (!err) goto out2; } @@ -596,7 +622,7 @@ restart: goto out2; /* - * Race condition! In the gap, when rt6_lock was + * Race condition! In the gap, when table->tb6_lock was * released someone could insert this route. Relookup. */ dst_release(&rt->u.dst); @@ -608,30 +634,54 @@ out: goto restart_2; } dst_hold(&rt->u.dst); - read_unlock_bh(&rt6_lock); + read_unlock_bh(&table->tb6_lock); out2: rt->u.dst.lastuse = jiffies; rt->u.dst.__use++; - skb->dst = (struct dst_entry *) rt; - return; + + return rt; } -struct dst_entry * ip6_route_output(struct sock *sk, struct flowi *fl) +void ip6_route_input(struct sk_buff *skb) +{ + struct ipv6hdr *iph = skb->nh.ipv6h; + struct flowi fl = { + .iif = skb->dev->ifindex, + .nl_u = { + .ip6_u = { + .daddr = iph->daddr, + .saddr = iph->saddr, + .flowlabel = (* (u32 *) iph)&IPV6_FLOWINFO_MASK, + }, + }, + .proto = iph->nexthdr, + }; + int flags = 0; + + if (rt6_need_strict(&iph->daddr)) + flags |= RT6_F_STRICT; + + skb->dst = fib6_rule_lookup(&fl, flags, ip6_pol_route_input); +} + +static struct rt6_info *ip6_pol_route_output(struct fib6_table *table, + struct flowi *fl, int flags) { struct fib6_node *fn; struct rt6_info *rt, *nrt; - int strict; + int strict = 0; int attempts = 3; int err; int reachable = RT6_SELECT_F_REACHABLE; - strict = ipv6_addr_type(&fl->fl6_dst) & (IPV6_ADDR_MULTICAST|IPV6_ADDR_LINKLOCAL) ? RT6_SELECT_F_IFACE : 0; + if (flags & RT6_F_STRICT) + strict = RT6_SELECT_F_IFACE; relookup: - read_lock_bh(&rt6_lock); + read_lock_bh(&table->tb6_lock); restart_2: - fn = fib6_lookup(&ip6_routing_table, &fl->fl6_dst, &fl->fl6_src); + fn = fib6_lookup(&table->tb6_root, &fl->fl6_dst, &fl->fl6_src); restart: rt = rt6_select(&fn->leaf, fl->oif, strict | reachable); @@ -641,7 +691,7 @@ restart: goto out; dst_hold(&rt->u.dst); - read_unlock_bh(&rt6_lock); + read_unlock_bh(&table->tb6_lock); if (!rt->rt6i_nexthop && !(rt->rt6i_flags & RTF_NONEXTHOP)) nrt = rt6_alloc_cow(rt, &fl->fl6_dst, &fl->fl6_src); @@ -667,7 +717,7 @@ restart: goto out2; /* - * Race condition! In the gap, when rt6_lock was + * Race condition! In the gap, when table->tb6_lock was * released someone could insert this route. Relookup. */ dst_release(&rt->u.dst); @@ -679,11 +729,21 @@ out: goto restart_2; } dst_hold(&rt->u.dst); - read_unlock_bh(&rt6_lock); + read_unlock_bh(&table->tb6_lock); out2: rt->u.dst.lastuse = jiffies; rt->u.dst.__use++; - return &rt->u.dst; + return rt; +} + +struct dst_entry * ip6_route_output(struct sock *sk, struct flowi *fl) +{ + int flags = 0; + + if (rt6_need_strict(&fl->fl6_dst)) + flags |= RT6_F_STRICT; + + return fib6_rule_lookup(fl, flags, ip6_pol_route_output); } @@ -906,7 +966,8 @@ int ipv6_get_hoplimit(struct net_device *dev) */ int ip6_route_add(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, - void *_rtattr, struct netlink_skb_parms *req) + void *_rtattr, struct netlink_skb_parms *req, + u32 table_id) { int err; struct rtmsg *r; @@ -914,6 +975,7 @@ int ip6_route_add(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, struct rt6_info *rt = NULL; struct net_device *dev = NULL; struct inet6_dev *idev = NULL; + struct fib6_table *table; int addr_type; rta = (struct rtattr **) _rtattr; @@ -937,6 +999,12 @@ int ip6_route_add(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, if (rtmsg->rtmsg_metric == 0) rtmsg->rtmsg_metric = IP6_RT_PRIO_USER; + table = fib6_new_table(table_id); + if (table == NULL) { + err = -ENOBUFS; + goto out; + } + rt = ip6_dst_alloc(); if (rt == NULL) { @@ -1093,6 +1161,7 @@ install_route: rt->u.dst.metrics[RTAX_ADVMSS-1] = ipv6_advmss(dst_mtu(&rt->u.dst)); rt->u.dst.dev = dev; rt->rt6i_idev = idev; + rt->rt6i_table = table; return ip6_ins_rt(rt, nlh, _rtattr, req); out: @@ -1108,26 +1177,35 @@ out: int ip6_del_rt(struct rt6_info *rt, struct nlmsghdr *nlh, void *_rtattr, struct netlink_skb_parms *req) { int err; + struct fib6_table *table; - write_lock_bh(&rt6_lock); + table = rt->rt6i_table; + write_lock_bh(&table->tb6_lock); err = fib6_del(rt, nlh, _rtattr, req); dst_release(&rt->u.dst); - write_unlock_bh(&rt6_lock); + write_unlock_bh(&table->tb6_lock); return err; } -static int ip6_route_del(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, void *_rtattr, struct netlink_skb_parms *req) +static int ip6_route_del(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, + void *_rtattr, struct netlink_skb_parms *req, + u32 table_id) { + struct fib6_table *table; struct fib6_node *fn; struct rt6_info *rt; int err = -ESRCH; - read_lock_bh(&rt6_lock); + table = fib6_get_table(table_id); + if (table == NULL) + return err; + + read_lock_bh(&table->tb6_lock); - fn = fib6_locate(&ip6_routing_table, + fn = fib6_locate(&table->tb6_root, &rtmsg->rtmsg_dst, rtmsg->rtmsg_dst_len, &rtmsg->rtmsg_src, rtmsg->rtmsg_src_len); @@ -1144,12 +1222,12 @@ static int ip6_route_del(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, void *_r rtmsg->rtmsg_metric != rt->rt6i_metric) continue; dst_hold(&rt->u.dst); - read_unlock_bh(&rt6_lock); + read_unlock_bh(&table->tb6_lock); return ip6_del_rt(rt, nlh, _rtattr, req); } } - read_unlock_bh(&rt6_lock); + read_unlock_bh(&table->tb6_lock); return err; } @@ -1161,10 +1239,15 @@ void rt6_redirect(struct in6_addr *dest, struct in6_addr *saddr, struct neighbour *neigh, u8 *lladdr, int on_link) { struct rt6_info *rt, *nrt = NULL; - int strict; struct fib6_node *fn; + struct fib6_table *table; struct netevent_redirect netevent; + /* TODO: Very lazy, might need to check all tables */ + table = fib6_get_table(RT6_TABLE_MAIN); + if (table == NULL) + return; + /* * Get the "current" route for this destination and * check if the redirect has come from approriate router. @@ -1175,10 +1258,9 @@ void rt6_redirect(struct in6_addr *dest, struct in6_addr *saddr, * is a bit fuzzy and one might need to check all possible * routes. */ - strict = ipv6_addr_type(dest) & (IPV6_ADDR_MULTICAST | IPV6_ADDR_LINKLOCAL); - read_lock_bh(&rt6_lock); - fn = fib6_lookup(&ip6_routing_table, dest, NULL); + read_lock_bh(&table->tb6_lock); + fn = fib6_lookup(&table->tb6_root, dest, NULL); restart: for (rt = fn->leaf; rt; rt = rt->u.next) { /* @@ -1201,7 +1283,7 @@ restart: } if (rt) dst_hold(&rt->u.dst); - else if (strict) { + else if (rt6_need_strict(dest)) { while ((fn = fn->parent) != NULL) { if (fn->fn_flags & RTN_ROOT) break; @@ -1209,7 +1291,7 @@ restart: goto restart; } } - read_unlock_bh(&rt6_lock); + read_unlock_bh(&table->tb6_lock); if (!rt) { if (net_ratelimit()) @@ -1384,6 +1466,7 @@ static struct rt6_info * ip6_rt_copy(struct rt6_info *ort) #ifdef CONFIG_IPV6_SUBTREES memcpy(&rt->rt6i_src, &ort->rt6i_src, sizeof(struct rt6key)); #endif + rt->rt6i_table = ort->rt6i_table; } return rt; } @@ -1394,9 +1477,14 @@ static struct rt6_info *rt6_get_route_info(struct in6_addr *prefix, int prefixle { struct fib6_node *fn; struct rt6_info *rt = NULL; + struct fib6_table *table; + + table = fib6_get_table(RT6_TABLE_INFO); + if (table == NULL) + return NULL; - write_lock_bh(&rt6_lock); - fn = fib6_locate(&ip6_routing_table, prefix ,prefixlen, NULL, 0); + write_lock_bh(&table->tb6_lock); + fn = fib6_locate(&table->tb6_root, prefix ,prefixlen, NULL, 0); if (!fn) goto out; @@ -1411,7 +1499,7 @@ static struct rt6_info *rt6_get_route_info(struct in6_addr *prefix, int prefixle break; } out: - write_unlock_bh(&rt6_lock); + write_unlock_bh(&table->tb6_lock); return rt; } @@ -1433,7 +1521,7 @@ static struct rt6_info *rt6_add_route_info(struct in6_addr *prefix, int prefixle rtmsg.rtmsg_flags |= RTF_DEFAULT; rtmsg.rtmsg_ifindex = ifindex; - ip6_route_add(&rtmsg, NULL, NULL, NULL); + ip6_route_add(&rtmsg, NULL, NULL, NULL, RT6_TABLE_INFO); return rt6_get_route_info(prefix, prefixlen, gwaddr, ifindex); } @@ -1442,12 +1530,14 @@ static struct rt6_info *rt6_add_route_info(struct in6_addr *prefix, int prefixle struct rt6_info *rt6_get_dflt_router(struct in6_addr *addr, struct net_device *dev) { struct rt6_info *rt; - struct fib6_node *fn; + struct fib6_table *table; - fn = &ip6_routing_table; + table = fib6_get_table(RT6_TABLE_DFLT); + if (table == NULL) + return NULL; - write_lock_bh(&rt6_lock); - for (rt = fn->leaf; rt; rt=rt->u.next) { + write_lock_bh(&table->tb6_lock); + for (rt = table->tb6_root.leaf; rt; rt=rt->u.next) { if (dev == rt->rt6i_dev && ((rt->rt6i_flags & (RTF_ADDRCONF | RTF_DEFAULT)) == (RTF_ADDRCONF | RTF_DEFAULT)) && ipv6_addr_equal(&rt->rt6i_gateway, addr)) @@ -1455,7 +1545,7 @@ struct rt6_info *rt6_get_dflt_router(struct in6_addr *addr, struct net_device *d } if (rt) dst_hold(&rt->u.dst); - write_unlock_bh(&rt6_lock); + write_unlock_bh(&table->tb6_lock); return rt; } @@ -1474,28 +1564,31 @@ struct rt6_info *rt6_add_dflt_router(struct in6_addr *gwaddr, rtmsg.rtmsg_ifindex = dev->ifindex; - ip6_route_add(&rtmsg, NULL, NULL, NULL); + ip6_route_add(&rtmsg, NULL, NULL, NULL, RT6_TABLE_DFLT); return rt6_get_dflt_router(gwaddr, dev); } void rt6_purge_dflt_routers(void) { struct rt6_info *rt; + struct fib6_table *table; + + /* NOTE: Keep consistent with rt6_get_dflt_router */ + table = fib6_get_table(RT6_TABLE_DFLT); + if (table == NULL) + return; restart: - read_lock_bh(&rt6_lock); - for (rt = ip6_routing_table.leaf; rt; rt = rt->u.next) { + read_lock_bh(&table->tb6_lock); + for (rt = table->tb6_root.leaf; rt; rt = rt->u.next) { if (rt->rt6i_flags & (RTF_DEFAULT | RTF_ADDRCONF)) { dst_hold(&rt->u.dst); - - read_unlock_bh(&rt6_lock); - + read_unlock_bh(&table->tb6_lock); ip6_del_rt(rt, NULL, NULL, NULL); - goto restart; } } - read_unlock_bh(&rt6_lock); + read_unlock_bh(&table->tb6_lock); } int ipv6_route_ioctl(unsigned int cmd, void __user *arg) @@ -1516,10 +1609,12 @@ int ipv6_route_ioctl(unsigned int cmd, void __user *arg) rtnl_lock(); switch (cmd) { case SIOCADDRT: - err = ip6_route_add(&rtmsg, NULL, NULL, NULL); + err = ip6_route_add(&rtmsg, NULL, NULL, NULL, + RT6_TABLE_MAIN); break; case SIOCDELRT: - err = ip6_route_del(&rtmsg, NULL, NULL, NULL); + err = ip6_route_del(&rtmsg, NULL, NULL, NULL, + RT6_TABLE_MAIN); break; default: err = -EINVAL; @@ -1593,6 +1688,7 @@ struct rt6_info *addrconf_dst_alloc(struct inet6_dev *idev, ipv6_addr_copy(&rt->rt6i_dst.addr, addr); rt->rt6i_dst.plen = 128; + rt->rt6i_table = fib6_get_table(RT6_TABLE_LOCAL); atomic_set(&rt->u.dst.__refcnt, 1); @@ -1611,9 +1707,7 @@ static int fib6_ifdown(struct rt6_info *rt, void *arg) void rt6_ifdown(struct net_device *dev) { - write_lock_bh(&rt6_lock); - fib6_clean_tree(&ip6_routing_table, fib6_ifdown, 0, dev); - write_unlock_bh(&rt6_lock); + fib6_clean_all(fib6_ifdown, 0, dev); } struct rt6_mtu_change_arg @@ -1663,13 +1757,12 @@ static int rt6_mtu_change_route(struct rt6_info *rt, void *p_arg) void rt6_mtu_change(struct net_device *dev, unsigned mtu) { - struct rt6_mtu_change_arg arg; + struct rt6_mtu_change_arg arg = { + .dev = dev, + .mtu = mtu, + }; - arg.dev = dev; - arg.mtu = mtu; - read_lock_bh(&rt6_lock); - fib6_clean_tree(&ip6_routing_table, rt6_mtu_change_route, 0, &arg); - read_unlock_bh(&rt6_lock); + fib6_clean_all(rt6_mtu_change_route, 0, &arg); } static int inet6_rtm_to_rtmsg(struct rtmsg *r, struct rtattr **rta, @@ -1719,7 +1812,7 @@ int inet6_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) if (inet6_rtm_to_rtmsg(r, arg, &rtmsg)) return -EINVAL; - return ip6_route_del(&rtmsg, nlh, arg, &NETLINK_CB(skb)); + return ip6_route_del(&rtmsg, nlh, arg, &NETLINK_CB(skb), r->rtm_table); } int inet6_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) @@ -1729,7 +1822,7 @@ int inet6_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) if (inet6_rtm_to_rtmsg(r, arg, &rtmsg)) return -EINVAL; - return ip6_route_add(&rtmsg, nlh, arg, &NETLINK_CB(skb)); + return ip6_route_add(&rtmsg, nlh, arg, &NETLINK_CB(skb), r->rtm_table); } struct rt6_rtnl_dump_arg @@ -1761,6 +1854,10 @@ static int rt6_fill_node(struct sk_buff *skb, struct rt6_info *rt, rtm->rtm_dst_len = rt->rt6i_dst.plen; rtm->rtm_src_len = rt->rt6i_src.plen; rtm->rtm_tos = 0; + if (rt->rt6i_table) + rtm->rtm_table = rt->rt6i_table->tb6_id; + else + rtm->rtm_table = RT6_TABLE_UNSPEC; rtm->rtm_table = RT_TABLE_MAIN; if (rt->rt6i_flags&RTF_REJECT) rtm->rtm_type = RTN_UNREACHABLE; @@ -1868,7 +1965,6 @@ static void fib6_dump_end(struct netlink_callback *cb) if (w) { cb->args[0] = 0; - fib6_walker_unlink(w); kfree(w); } cb->done = (void*)cb->args[1]; @@ -1883,13 +1979,20 @@ static int fib6_dump_done(struct netlink_callback *cb) int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb) { + struct fib6_table *table; struct rt6_rtnl_dump_arg arg; struct fib6_walker_t *w; - int res; + int i, res = 0; arg.skb = skb; arg.cb = cb; + /* + * cb->args[0] = pointer to walker structure + * cb->args[1] = saved cb->done() pointer + * cb->args[2] = current table being dumped + */ + w = (void*)cb->args[0]; if (w == NULL) { /* New dump: @@ -1905,24 +2008,48 @@ int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb) w = kzalloc(sizeof(*w), GFP_ATOMIC); if (w == NULL) return -ENOMEM; - RT6_TRACE("dump<%p", w); - w->root = &ip6_routing_table; w->func = fib6_dump_node; w->args = &arg; cb->args[0] = (long)w; - read_lock_bh(&rt6_lock); - res = fib6_walk(w); - read_unlock_bh(&rt6_lock); + cb->args[2] = FIB6_TABLE_MIN; } else { w->args = &arg; - read_lock_bh(&rt6_lock); - res = fib6_walk_continue(w); - read_unlock_bh(&rt6_lock); + i = cb->args[2]; + if (i > FIB6_TABLE_MAX) + goto end; + + table = fib6_get_table(i); + if (table != NULL) { + read_lock_bh(&table->tb6_lock); + w->root = &table->tb6_root; + res = fib6_walk_continue(w); + read_unlock_bh(&table->tb6_lock); + if (res != 0) { + if (res < 0) + fib6_walker_unlink(w); + goto end; + } + } + + fib6_walker_unlink(w); + cb->args[2] = ++i; } -#if RT6_DEBUG >= 3 - if (res <= 0 && skb->len == 0) - RT6_TRACE("%p>dump end\n", w); -#endif + + for (i = cb->args[2]; i <= FIB6_TABLE_MAX; i++) { + table = fib6_get_table(i); + if (table == NULL) + continue; + + read_lock_bh(&table->tb6_lock); + w->root = &table->tb6_root; + res = fib6_walk(w); + read_unlock_bh(&table->tb6_lock); + if (res) + break; + } +end: + cb->args[2] = i; + res = res < 0 ? res : skb->len; /* res < 0 is an error. (really, impossible) res == 0 means that dump is complete, but skb still can contain data. @@ -2102,16 +2229,13 @@ static int rt6_info_route(struct rt6_info *rt, void *p_arg) static int rt6_proc_info(char *buffer, char **start, off_t offset, int length) { - struct rt6_proc_arg arg; - arg.buffer = buffer; - arg.offset = offset; - arg.length = length; - arg.skip = 0; - arg.len = 0; + struct rt6_proc_arg arg = { + .buffer = buffer, + .offset = offset, + .length = length, + }; - read_lock_bh(&rt6_lock); - fib6_clean_tree(&ip6_routing_table, rt6_info_route, 0, &arg); - read_unlock_bh(&rt6_lock); + fib6_clean_all(rt6_info_route, 0, &arg); *start = buffer; if (offset) -- cgit v0.10.2 From 14c0b97ddfc2944982d078b8e33b088840068976 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Fri, 4 Aug 2006 03:38:38 -0700 Subject: [NET]: Protocol Independant Policy Routing Rules Framework Derived from net/ipv/fib_rules.c Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/linux/fib_rules.h b/include/linux/fib_rules.h new file mode 100644 index 0000000..5e503f0 --- /dev/null +++ b/include/linux/fib_rules.h @@ -0,0 +1,60 @@ +#ifndef __LINUX_FIB_RULES_H +#define __LINUX_FIB_RULES_H + +#include +#include + +/* rule is permanent, and cannot be deleted */ +#define FIB_RULE_PERMANENT 1 + +struct fib_rule_hdr +{ + __u8 family; + __u8 dst_len; + __u8 src_len; + __u8 tos; + + __u8 table; + __u8 res1; /* reserved */ + __u8 res2; /* reserved */ + __u8 action; + + __u32 flags; +}; + +enum +{ + FRA_UNSPEC, + FRA_DST, /* destination address */ + FRA_SRC, /* source address */ + FRA_IFNAME, /* interface name */ + FRA_UNUSED1, + FRA_UNUSED2, + FRA_PRIORITY, /* priority/preference */ + FRA_UNUSED3, + FRA_UNUSED4, + FRA_UNUSED5, + FRA_FWMARK, /* netfilter mark (IPv4) */ + FRA_FLOW, /* flow/class id */ + __FRA_MAX +}; + +#define FRA_MAX (__FRA_MAX - 1) + +enum +{ + FR_ACT_UNSPEC, + FR_ACT_TO_TBL, /* Pass to fixed table */ + FR_ACT_RES1, + FR_ACT_RES2, + FR_ACT_RES3, + FR_ACT_RES4, + FR_ACT_BLACKHOLE, /* Drop without notification */ + FR_ACT_UNREACHABLE, /* Drop with ENETUNREACH */ + FR_ACT_PROHIBIT, /* Drop with EACCES */ + __FR_ACT_MAX, +}; + +#define FR_ACT_MAX (__FR_ACT_MAX - 1) + +#endif diff --git a/include/net/fib_rules.h b/include/net/fib_rules.h new file mode 100644 index 0000000..61375d9 --- /dev/null +++ b/include/net/fib_rules.h @@ -0,0 +1,90 @@ +#ifndef __NET_FIB_RULES_H +#define __NET_FIB_RULES_H + +#include +#include +#include +#include +#include + +struct fib_rule +{ + struct list_head list; + atomic_t refcnt; + int ifindex; + char ifname[IFNAMSIZ]; + u32 pref; + u32 flags; + u32 table; + u8 action; + struct rcu_head rcu; +}; + +struct fib_lookup_arg +{ + void *lookup_ptr; + void *result; + struct fib_rule *rule; +}; + +struct fib_rules_ops +{ + int family; + struct list_head list; + int rule_size; + + int (*action)(struct fib_rule *, + struct flowi *, int, + struct fib_lookup_arg *); + int (*match)(struct fib_rule *, + struct flowi *, int); + int (*configure)(struct fib_rule *, + struct sk_buff *, + struct nlmsghdr *, + struct fib_rule_hdr *, + struct nlattr **); + int (*compare)(struct fib_rule *, + struct fib_rule_hdr *, + struct nlattr **); + int (*fill)(struct fib_rule *, struct sk_buff *, + struct nlmsghdr *, + struct fib_rule_hdr *); + u32 (*default_pref)(void); + + int nlgroup; + struct nla_policy *policy; + struct list_head *rules_list; + struct module *owner; +}; + +static inline void fib_rule_get(struct fib_rule *rule) +{ + atomic_inc(&rule->refcnt); +} + +static inline void fib_rule_put_rcu(struct rcu_head *head) +{ + struct fib_rule *rule = container_of(head, struct fib_rule, rcu); + kfree(rule); +} + +static inline void fib_rule_put(struct fib_rule *rule) +{ + if (atomic_dec_and_test(&rule->refcnt)) + call_rcu(&rule->rcu, fib_rule_put_rcu); +} + +extern int fib_rules_register(struct fib_rules_ops *); +extern int fib_rules_unregister(struct fib_rules_ops *); + +extern int fib_rules_lookup(struct fib_rules_ops *, + struct flowi *, int flags, + struct fib_lookup_arg *); + +extern int fib_nl_newrule(struct sk_buff *, + struct nlmsghdr *, void *); +extern int fib_nl_delrule(struct sk_buff *, + struct nlmsghdr *, void *); +extern int fib_rules_dump(struct sk_buff *, + struct netlink_callback *, int); +#endif diff --git a/net/Kconfig b/net/Kconfig index eb855b7..6528a935 100644 --- a/net/Kconfig +++ b/net/Kconfig @@ -251,6 +251,9 @@ config WIRELESS_EXT source "net/netlabel/Kconfig" +config FIB_RULES + bool + endif # if NET endmenu # Networking diff --git a/net/core/Makefile b/net/core/Makefile index 2645ba4..1195680 100644 --- a/net/core/Makefile +++ b/net/core/Makefile @@ -17,3 +17,4 @@ obj-$(CONFIG_NET_PKTGEN) += pktgen.o obj-$(CONFIG_WIRELESS_EXT) += wireless.o obj-$(CONFIG_NETPOLL) += netpoll.o obj-$(CONFIG_NET_DMA) += user_dma.o +obj-$(CONFIG_FIB_RULES) += fib_rules.o diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c new file mode 100644 index 0000000..6cdad24 --- /dev/null +++ b/net/core/fib_rules.c @@ -0,0 +1,416 @@ +/* + * net/core/fib_rules.c Generic Routing Rules + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation, version 2. + * + * Authors: Thomas Graf + */ + +#include +#include +#include +#include +#include + +static LIST_HEAD(rules_ops); +static DEFINE_SPINLOCK(rules_mod_lock); + +static void notify_rule_change(int event, struct fib_rule *rule, + struct fib_rules_ops *ops); + +static struct fib_rules_ops *lookup_rules_ops(int family) +{ + struct fib_rules_ops *ops; + + rcu_read_lock(); + list_for_each_entry_rcu(ops, &rules_ops, list) { + if (ops->family == family) { + if (!try_module_get(ops->owner)) + ops = NULL; + rcu_read_unlock(); + return ops; + } + } + rcu_read_unlock(); + + return NULL; +} + +static void rules_ops_put(struct fib_rules_ops *ops) +{ + if (ops) + module_put(ops->owner); +} + +int fib_rules_register(struct fib_rules_ops *ops) +{ + int err = -EEXIST; + struct fib_rules_ops *o; + + if (ops->rule_size < sizeof(struct fib_rule)) + return -EINVAL; + + if (ops->match == NULL || ops->configure == NULL || + ops->compare == NULL || ops->fill == NULL || + ops->action == NULL) + return -EINVAL; + + spin_lock(&rules_mod_lock); + list_for_each_entry(o, &rules_ops, list) + if (ops->family == o->family) + goto errout; + + list_add_tail_rcu(&ops->list, &rules_ops); + err = 0; +errout: + spin_unlock(&rules_mod_lock); + + return err; +} + +EXPORT_SYMBOL_GPL(fib_rules_register); + +static void cleanup_ops(struct fib_rules_ops *ops) +{ + struct fib_rule *rule, *tmp; + + list_for_each_entry_safe(rule, tmp, ops->rules_list, list) { + list_del_rcu(&rule->list); + fib_rule_put(rule); + } +} + +int fib_rules_unregister(struct fib_rules_ops *ops) +{ + int err = 0; + struct fib_rules_ops *o; + + spin_lock(&rules_mod_lock); + list_for_each_entry(o, &rules_ops, list) { + if (o == ops) { + list_del_rcu(&o->list); + cleanup_ops(ops); + goto out; + } + } + + err = -ENOENT; +out: + spin_unlock(&rules_mod_lock); + + synchronize_rcu(); + + return err; +} + +EXPORT_SYMBOL_GPL(fib_rules_unregister); + +int fib_rules_lookup(struct fib_rules_ops *ops, struct flowi *fl, + int flags, struct fib_lookup_arg *arg) +{ + struct fib_rule *rule; + int err; + + rcu_read_lock(); + + list_for_each_entry_rcu(rule, ops->rules_list, list) { + if (rule->ifindex && (rule->ifindex != fl->iif)) + continue; + + if (!ops->match(rule, fl, flags)) + continue; + + err = ops->action(rule, fl, flags, arg); + if (err != -EAGAIN) { + fib_rule_get(rule); + arg->rule = rule; + goto out; + } + } + + err = -ENETUNREACH; +out: + rcu_read_unlock(); + + return err; +} + +EXPORT_SYMBOL_GPL(fib_rules_lookup); + +int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) +{ + struct fib_rule_hdr *frh = nlmsg_data(nlh); + struct fib_rules_ops *ops = NULL; + struct fib_rule *rule, *r, *last = NULL; + struct nlattr *tb[FRA_MAX+1]; + int err = -EINVAL; + + if (nlh->nlmsg_len < nlmsg_msg_size(sizeof(*frh))) + goto errout; + + ops = lookup_rules_ops(frh->family); + if (ops == NULL) { + err = EAFNOSUPPORT; + goto errout; + } + + err = nlmsg_parse(nlh, sizeof(*frh), tb, FRA_MAX, ops->policy); + if (err < 0) + goto errout; + + if (tb[FRA_IFNAME] && nla_len(tb[FRA_IFNAME]) > IFNAMSIZ) + goto errout; + + rule = kzalloc(ops->rule_size, GFP_KERNEL); + if (rule == NULL) { + err = -ENOMEM; + goto errout; + } + + if (tb[FRA_PRIORITY]) + rule->pref = nla_get_u32(tb[FRA_PRIORITY]); + + if (tb[FRA_IFNAME]) { + struct net_device *dev; + + rule->ifindex = -1; + if (nla_strlcpy(rule->ifname, tb[FRA_IFNAME], + IFNAMSIZ) >= IFNAMSIZ) + goto errout_free; + + dev = __dev_get_by_name(rule->ifname); + if (dev) + rule->ifindex = dev->ifindex; + } + + rule->action = frh->action; + rule->flags = frh->flags; + rule->table = frh->table; + + if (!rule->pref && ops->default_pref) + rule->pref = ops->default_pref(); + + err = ops->configure(rule, skb, nlh, frh, tb); + if (err < 0) + goto errout_free; + + list_for_each_entry(r, ops->rules_list, list) { + if (r->pref > rule->pref) + break; + last = r; + } + + fib_rule_get(rule); + + if (last) + list_add_rcu(&rule->list, &last->list); + else + list_add_rcu(&rule->list, ops->rules_list); + + notify_rule_change(RTM_NEWRULE, rule, ops); + rules_ops_put(ops); + return 0; + +errout_free: + kfree(rule); +errout: + rules_ops_put(ops); + return err; +} + +int fib_nl_delrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) +{ + struct fib_rule_hdr *frh = nlmsg_data(nlh); + struct fib_rules_ops *ops = NULL; + struct fib_rule *rule; + struct nlattr *tb[FRA_MAX+1]; + int err = -EINVAL; + + if (nlh->nlmsg_len < nlmsg_msg_size(sizeof(*frh))) + goto errout; + + ops = lookup_rules_ops(frh->family); + if (ops == NULL) { + err = EAFNOSUPPORT; + goto errout; + } + + err = nlmsg_parse(nlh, sizeof(*frh), tb, FRA_MAX, ops->policy); + if (err < 0) + goto errout; + + list_for_each_entry(rule, ops->rules_list, list) { + if (frh->action && (frh->action != rule->action)) + continue; + + if (frh->table && (frh->table != rule->table)) + continue; + + if (tb[FRA_PRIORITY] && + (rule->pref != nla_get_u32(tb[FRA_PRIORITY]))) + continue; + + if (tb[FRA_IFNAME] && + nla_strcmp(tb[FRA_IFNAME], rule->ifname)) + continue; + + if (!ops->compare(rule, frh, tb)) + continue; + + if (rule->flags & FIB_RULE_PERMANENT) { + err = -EPERM; + goto errout; + } + + list_del_rcu(&rule->list); + synchronize_rcu(); + notify_rule_change(RTM_DELRULE, rule, ops); + fib_rule_put(rule); + rules_ops_put(ops); + return 0; + } + + err = -ENOENT; +errout: + rules_ops_put(ops); + return err; +} + +static int fib_nl_fill_rule(struct sk_buff *skb, struct fib_rule *rule, + u32 pid, u32 seq, int type, int flags, + struct fib_rules_ops *ops) +{ + struct nlmsghdr *nlh; + struct fib_rule_hdr *frh; + + nlh = nlmsg_put(skb, pid, seq, type, sizeof(*frh), flags); + if (nlh == NULL) + return -1; + + frh = nlmsg_data(nlh); + frh->table = rule->table; + frh->res1 = 0; + frh->res2 = 0; + frh->action = rule->action; + frh->flags = rule->flags; + + if (rule->ifname[0]) + NLA_PUT_STRING(skb, FRA_IFNAME, rule->ifname); + + if (rule->pref) + NLA_PUT_U32(skb, FRA_PRIORITY, rule->pref); + + if (ops->fill(rule, skb, nlh, frh) < 0) + goto nla_put_failure; + + return nlmsg_end(skb, nlh); + +nla_put_failure: + return nlmsg_cancel(skb, nlh); +} + +int fib_rules_dump(struct sk_buff *skb, struct netlink_callback *cb, int family) +{ + int idx = 0; + struct fib_rule *rule; + struct fib_rules_ops *ops; + + ops = lookup_rules_ops(family); + if (ops == NULL) + return -EAFNOSUPPORT; + + rcu_read_lock(); + list_for_each_entry(rule, ops->rules_list, list) { + if (idx < cb->args[0]) + goto skip; + + if (fib_nl_fill_rule(skb, rule, NETLINK_CB(cb->skb).pid, + cb->nlh->nlmsg_seq, RTM_NEWRULE, + NLM_F_MULTI, ops) < 0) + break; +skip: + idx++; + } + rcu_read_unlock(); + cb->args[0] = idx; + rules_ops_put(ops); + + return skb->len; +} + +EXPORT_SYMBOL_GPL(fib_rules_dump); + +static void notify_rule_change(int event, struct fib_rule *rule, + struct fib_rules_ops *ops) +{ + int size = nlmsg_total_size(sizeof(struct fib_rule_hdr) + 128); + struct sk_buff *skb = alloc_skb(size, GFP_KERNEL); + + if (skb == NULL) + netlink_set_err(rtnl, 0, ops->nlgroup, ENOBUFS); + else if (fib_nl_fill_rule(skb, rule, 0, 0, event, 0, ops) < 0) { + kfree_skb(skb); + netlink_set_err(rtnl, 0, ops->nlgroup, EINVAL); + } else + netlink_broadcast(rtnl, skb, 0, ops->nlgroup, GFP_KERNEL); +} + +static void attach_rules(struct list_head *rules, struct net_device *dev) +{ + struct fib_rule *rule; + + list_for_each_entry(rule, rules, list) { + if (rule->ifindex == -1 && + strcmp(dev->name, rule->ifname) == 0) + rule->ifindex = dev->ifindex; + } +} + +static void detach_rules(struct list_head *rules, struct net_device *dev) +{ + struct fib_rule *rule; + + list_for_each_entry(rule, rules, list) + if (rule->ifindex == dev->ifindex) + rule->ifindex = -1; +} + + +static int fib_rules_event(struct notifier_block *this, unsigned long event, + void *ptr) +{ + struct net_device *dev = ptr; + struct fib_rules_ops *ops; + + ASSERT_RTNL(); + rcu_read_lock(); + + switch (event) { + case NETDEV_REGISTER: + list_for_each_entry(ops, &rules_ops, list) + attach_rules(ops->rules_list, dev); + break; + + case NETDEV_UNREGISTER: + list_for_each_entry(ops, &rules_ops, list) + detach_rules(ops->rules_list, dev); + break; + } + + rcu_read_unlock(); + + return NOTIFY_DONE; +} + +static struct notifier_block fib_rules_notifier = { + .notifier_call = fib_rules_event, +}; + +static int __init fib_rules_init(void) +{ + return register_netdevice_notifier(&fib_rules_notifier); +} + +subsys_initcall(fib_rules_init); diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 30cc1ba..aa7cff2 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -49,6 +49,7 @@ #include #include #include +#include #include #ifdef CONFIG_NET_WIRELESS_RTNETLINK #include @@ -103,7 +104,7 @@ static const int rtm_min[RTM_NR_FAMILIES] = [RTM_FAM(RTM_NEWADDR)] = NLMSG_LENGTH(sizeof(struct ifaddrmsg)), [RTM_FAM(RTM_NEWROUTE)] = NLMSG_LENGTH(sizeof(struct rtmsg)), [RTM_FAM(RTM_NEWNEIGH)] = NLMSG_LENGTH(sizeof(struct ndmsg)), - [RTM_FAM(RTM_NEWRULE)] = NLMSG_LENGTH(sizeof(struct rtmsg)), + [RTM_FAM(RTM_NEWRULE)] = NLMSG_LENGTH(sizeof(struct fib_rule_hdr)), [RTM_FAM(RTM_NEWQDISC)] = NLMSG_LENGTH(sizeof(struct tcmsg)), [RTM_FAM(RTM_NEWTCLASS)] = NLMSG_LENGTH(sizeof(struct tcmsg)), [RTM_FAM(RTM_NEWTFILTER)] = NLMSG_LENGTH(sizeof(struct tcmsg)), @@ -120,7 +121,7 @@ static const int rta_max[RTM_NR_FAMILIES] = [RTM_FAM(RTM_NEWADDR)] = IFA_MAX, [RTM_FAM(RTM_NEWROUTE)] = RTA_MAX, [RTM_FAM(RTM_NEWNEIGH)] = NDA_MAX, - [RTM_FAM(RTM_NEWRULE)] = RTA_MAX, + [RTM_FAM(RTM_NEWRULE)] = FRA_MAX, [RTM_FAM(RTM_NEWQDISC)] = TCA_MAX, [RTM_FAM(RTM_NEWTCLASS)] = TCA_MAX, [RTM_FAM(RTM_NEWTFILTER)] = TCA_MAX, @@ -757,6 +758,10 @@ static struct rtnetlink_link link_rtnetlink_table[RTM_NR_MSGTYPES] = [RTM_NEWNEIGH - RTM_BASE] = { .doit = neigh_add }, [RTM_DELNEIGH - RTM_BASE] = { .doit = neigh_delete }, [RTM_GETNEIGH - RTM_BASE] = { .dumpit = neigh_dump_info }, +#ifdef CONFIG_FIB_RULES + [RTM_NEWRULE - RTM_BASE] = { .doit = fib_nl_newrule }, + [RTM_DELRULE - RTM_BASE] = { .doit = fib_nl_delrule }, +#endif [RTM_GETRULE - RTM_BASE] = { .dumpit = rtnetlink_dump_all }, [RTM_GETNEIGHTBL - RTM_BASE] = { .dumpit = neightbl_dump_info }, [RTM_SETNEIGHTBL - RTM_BASE] = { .doit = neightbl_set }, -- cgit v0.10.2 From 101367c2f8c464ea96643192673aa18d88e6336d Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Fri, 4 Aug 2006 03:39:02 -0700 Subject: [IPV6]: Policy Routing Rules Adds support for policy routing rules including a new local table for routes with a local destination. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h index facd9ee..bf35353 100644 --- a/include/linux/rtnetlink.h +++ b/include/linux/rtnetlink.h @@ -889,6 +889,8 @@ enum rtnetlink_groups { RTNLGRP_NOP4, RTNLGRP_IPV6_PREFIX, #define RTNLGRP_IPV6_PREFIX RTNLGRP_IPV6_PREFIX + RTNLGRP_IPV6_RULE, +#define RTNLGRP_IPV6_RULE RTNLGRP_IPV6_RULE __RTNLGRP_MAX }; #define RTNLGRP_MAX (__RTNLGRP_MAX - 1) diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h index 8184115..7b47e8d 100644 --- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -155,7 +155,6 @@ struct fib6_table { #define RT6_TABLE_UNSPEC RT_TABLE_UNSPEC #define RT6_TABLE_MAIN RT_TABLE_MAIN -#define RT6_TABLE_LOCAL RT6_TABLE_MAIN #define RT6_TABLE_DFLT RT6_TABLE_MAIN #define RT6_TABLE_INFO RT6_TABLE_MAIN #define RT6_TABLE_PREFIX RT6_TABLE_MAIN @@ -163,9 +162,11 @@ struct fib6_table { #ifdef CONFIG_IPV6_MULTIPLE_TABLES #define FIB6_TABLE_MIN 1 #define FIB6_TABLE_MAX RT_TABLE_MAX +#define RT6_TABLE_LOCAL RT_TABLE_LOCAL #else #define FIB6_TABLE_MIN RT_TABLE_MAIN #define FIB6_TABLE_MAX FIB6_TABLE_MIN +#define RT6_TABLE_LOCAL RT6_TABLE_MAIN #endif #define RT6_F_STRICT 1 @@ -221,5 +222,11 @@ extern void fib6_run_gc(unsigned long dummy); extern void fib6_gc_cleanup(void); extern void fib6_init(void); + +extern void fib6_rules_init(void); +extern void fib6_rules_cleanup(void); +extern int fib6_rules_dump(struct sk_buff *, + struct netlink_callback *); + #endif #endif diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h index d49c8c9..9bfa3cc 100644 --- a/include/net/ip6_route.h +++ b/include/net/ip6_route.h @@ -41,6 +41,11 @@ struct pol_chain { extern struct rt6_info ip6_null_entry; +#ifdef CONFIG_IPV6_MULTIPLE_TABLES +extern struct rt6_info ip6_prohibit_entry; +extern struct rt6_info ip6_blk_hole_entry; +#endif + extern int ip6_rt_gc_interval; extern void ip6_route_input(struct sk_buff *skb); diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig index 159c63d..36a6c2b 100644 --- a/net/ipv6/Kconfig +++ b/net/ipv6/Kconfig @@ -139,6 +139,7 @@ config IPV6_TUNNEL config IPV6_MULTIPLE_TABLES bool "IPv6: Multiple Routing Tables" depends on IPV6 && EXPERIMENTAL + select FIB_RULES ---help--- Support multiple routing tables. diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile index 386e0a6..9eebf60 100644 --- a/net/ipv6/Makefile +++ b/net/ipv6/Makefile @@ -13,6 +13,7 @@ ipv6-objs := af_inet6.o anycast.o ip6_output.o ip6_input.o addrconf.o sit.o \ ipv6-$(CONFIG_XFRM) += xfrm6_policy.o xfrm6_state.o xfrm6_input.o \ xfrm6_output.o ipv6-$(CONFIG_NETFILTER) += netfilter.o +ipv6-$(CONFIG_IPV6_MULTIPLE_TABLES) += fib6_rules.o ipv6-objs += $(ipv6-y) obj-$(CONFIG_INET6_AH) += ah6.o diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 318767f..ed766ee 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -3528,6 +3528,7 @@ static struct rtnetlink_link inet6_rtnetlink_table[RTM_NR_MSGTYPES] = { [RTM_DELROUTE - RTM_BASE] = { .doit = inet6_rtm_delroute, }, [RTM_GETROUTE - RTM_BASE] = { .doit = inet6_rtm_getroute, .dumpit = inet6_dump_fib, }, + [RTM_GETRULE - RTM_BASE] = { .dumpit = fib6_rules_dump, }, }; static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c new file mode 100644 index 0000000..c3c8195 --- /dev/null +++ b/net/ipv6/fib6_rules.c @@ -0,0 +1,251 @@ +/* + * net/ipv6/fib6_rules.c IPv6 Routing Policy Rules + * + * Copyright (C)2003-2006 Helsinki University of Technology + * Copyright (C)2003-2006 USAGI/WIDE Project + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License as + * published by the Free Software Foundation, version 2. + * + * Authors + * Thomas Graf + * Ville Nuorvala + */ + +#include +#include + +#include +#include +#include +#include + +struct fib6_rule +{ + struct fib_rule common; + struct rt6key src; + struct rt6key dst; + u8 tclass; +}; + +static struct fib_rules_ops fib6_rules_ops; + +static struct fib6_rule main_rule = { + .common = { + .refcnt = ATOMIC_INIT(2), + .pref = 0x7FFE, + .action = FR_ACT_TO_TBL, + .table = RT6_TABLE_MAIN, + }, +}; + +static struct fib6_rule local_rule = { + .common = { + .refcnt = ATOMIC_INIT(2), + .pref = 0, + .action = FR_ACT_TO_TBL, + .table = RT6_TABLE_LOCAL, + .flags = FIB_RULE_PERMANENT, + }, +}; + +static LIST_HEAD(fib6_rules); + +struct dst_entry *fib6_rule_lookup(struct flowi *fl, int flags, + pol_lookup_t lookup) +{ + struct fib_lookup_arg arg = { + .lookup_ptr = lookup, + }; + + fib_rules_lookup(&fib6_rules_ops, fl, flags, &arg); + if (arg.rule) + fib_rule_put(arg.rule); + + return (struct dst_entry *) arg.result; +} + +int fib6_rule_action(struct fib_rule *rule, struct flowi *flp, + int flags, struct fib_lookup_arg *arg) +{ + struct rt6_info *rt = NULL; + struct fib6_table *table; + pol_lookup_t lookup = arg->lookup_ptr; + + switch (rule->action) { + case FR_ACT_TO_TBL: + break; + case FR_ACT_UNREACHABLE: + rt = &ip6_null_entry; + goto discard_pkt; + default: + case FR_ACT_BLACKHOLE: + rt = &ip6_blk_hole_entry; + goto discard_pkt; + case FR_ACT_PROHIBIT: + rt = &ip6_prohibit_entry; + goto discard_pkt; + } + + table = fib6_get_table(rule->table); + if (table) + rt = lookup(table, flp, flags); + + if (rt != &ip6_null_entry) + goto out; + + dst_release(&rt->u.dst); +discard_pkt: + dst_hold(&rt->u.dst); +out: + arg->result = rt; + return rt == NULL ? -EAGAIN : 0; +} + + +static int fib6_rule_match(struct fib_rule *rule, struct flowi *fl, int flags) +{ + struct fib6_rule *r = (struct fib6_rule *) rule; + + if (!ipv6_prefix_equal(&fl->fl6_dst, &r->dst.addr, r->dst.plen)) + return 0; + + if ((flags & RT6_F_HAS_SADDR) && + !ipv6_prefix_equal(&fl->fl6_src, &r->src.addr, r->src.plen)) + return 0; + + return 1; +} + +static struct nla_policy fib6_rule_policy[RTA_MAX+1] __read_mostly = { + [FRA_IFNAME] = { .type = NLA_STRING }, + [FRA_PRIORITY] = { .type = NLA_U32 }, + [FRA_SRC] = { .minlen = sizeof(struct in6_addr) }, + [FRA_DST] = { .minlen = sizeof(struct in6_addr) }, +}; + +static int fib6_rule_configure(struct fib_rule *rule, struct sk_buff *skb, + struct nlmsghdr *nlh, struct fib_rule_hdr *frh, + struct nlattr **tb) +{ + int err = -EINVAL; + struct fib6_rule *rule6 = (struct fib6_rule *) rule; + + if (frh->src_len > 128 || frh->dst_len > 128 || + (frh->tos & ~IPV6_FLOWINFO_MASK)) + goto errout; + + if (rule->action == FR_ACT_TO_TBL) { + if (rule->table == RT6_TABLE_UNSPEC) + goto errout; + + if (fib6_new_table(rule->table) == NULL) { + err = -ENOBUFS; + goto errout; + } + } + + if (tb[FRA_SRC]) + nla_memcpy(&rule6->src.addr, tb[FRA_SRC], + sizeof(struct in6_addr)); + + if (tb[FRA_DST]) + nla_memcpy(&rule6->dst.addr, tb[FRA_DST], + sizeof(struct in6_addr)); + + rule6->src.plen = frh->src_len; + rule6->dst.plen = frh->dst_len; + rule6->tclass = frh->tos; + + err = 0; +errout: + return err; +} + +static int fib6_rule_compare(struct fib_rule *rule, struct fib_rule_hdr *frh, + struct nlattr **tb) +{ + struct fib6_rule *rule6 = (struct fib6_rule *) rule; + + if (frh->src_len && (rule6->src.plen != frh->src_len)) + return 0; + + if (frh->dst_len && (rule6->dst.plen != frh->dst_len)) + return 0; + + if (frh->tos && (rule6->tclass != frh->tos)) + return 0; + + if (tb[FRA_SRC] && + nla_memcmp(tb[FRA_SRC], &rule6->src.addr, sizeof(struct in6_addr))) + return 0; + + if (tb[FRA_DST] && + nla_memcmp(tb[FRA_DST], &rule6->dst.addr, sizeof(struct in6_addr))) + return 0; + + return 1; +} + +static int fib6_rule_fill(struct fib_rule *rule, struct sk_buff *skb, + struct nlmsghdr *nlh, struct fib_rule_hdr *frh) +{ + struct fib6_rule *rule6 = (struct fib6_rule *) rule; + + frh->family = AF_INET6; + frh->dst_len = rule6->dst.plen; + frh->src_len = rule6->src.plen; + frh->tos = rule6->tclass; + + if (rule6->dst.plen) + NLA_PUT(skb, FRA_DST, sizeof(struct in6_addr), + &rule6->dst.addr); + + if (rule6->src.plen) + NLA_PUT(skb, FRA_SRC, sizeof(struct in6_addr), + &rule6->src.addr); + + return 0; + +nla_put_failure: + return -ENOBUFS; +} + +int fib6_rules_dump(struct sk_buff *skb, struct netlink_callback *cb) +{ + return fib_rules_dump(skb, cb, AF_INET6); +} + +static u32 fib6_rule_default_pref(void) +{ + return 0x3FFF; +} + +static struct fib_rules_ops fib6_rules_ops = { + .family = AF_INET6, + .rule_size = sizeof(struct fib6_rule), + .action = fib6_rule_action, + .match = fib6_rule_match, + .configure = fib6_rule_configure, + .compare = fib6_rule_compare, + .fill = fib6_rule_fill, + .default_pref = fib6_rule_default_pref, + .nlgroup = RTNLGRP_IPV6_RULE, + .policy = fib6_rule_policy, + .rules_list = &fib6_rules, + .owner = THIS_MODULE, +}; + +void __init fib6_rules_init(void) +{ + list_add_tail(&local_rule.common.list, &fib6_rules); + list_add_tail(&main_rule.common.list, &fib6_rules); + + fib_rules_register(&fib6_rules_ops); +} + +void fib6_rules_cleanup(void) +{ + fib_rules_unregister(&fib6_rules_ops); +} diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index fcd7da8..ce226c1 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -159,6 +159,15 @@ static struct fib6_table fib6_main_tbl = { #ifdef CONFIG_IPV6_MULTIPLE_TABLES +static struct fib6_table fib6_local_tbl = { + .tb6_id = RT6_TABLE_LOCAL, + .tb6_lock = RW_LOCK_UNLOCKED, + .tb6_root = { + .leaf = &ip6_null_entry, + .fn_flags = RTN_ROOT | RTN_TL_ROOT | RTN_RTINFO, + }, +}; + #define FIB_TABLE_HASHSZ 256 static struct hlist_head fib_table_hash[FIB_TABLE_HASHSZ]; @@ -228,20 +237,10 @@ struct fib6_table *fib6_get_table(u32 id) return NULL; } -struct dst_entry *fib6_rule_lookup(struct flowi *fl, int flags, - pol_lookup_t lookup) -{ - /* - * TODO: Add rule lookup - */ - struct fib6_table *table = fib6_get_table(RT6_TABLE_MAIN); - - return (struct dst_entry *) lookup(table, fl, flags); -} - static void __init fib6_tables_init(void) { fib6_link_table(&fib6_main_tbl); + fib6_link_table(&fib6_local_tbl); } #else diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 73efdad..438977e 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -140,6 +140,50 @@ struct rt6_info ip6_null_entry = { .rt6i_ref = ATOMIC_INIT(1), }; +#ifdef CONFIG_IPV6_MULTIPLE_TABLES + +struct rt6_info ip6_prohibit_entry = { + .u = { + .dst = { + .__refcnt = ATOMIC_INIT(1), + .__use = 1, + .dev = &loopback_dev, + .obsolete = -1, + .error = -EACCES, + .metrics = { [RTAX_HOPLIMIT - 1] = 255, }, + .input = ip6_pkt_discard, + .output = ip6_pkt_discard_out, + .ops = &ip6_dst_ops, + .path = (struct dst_entry*)&ip6_prohibit_entry, + } + }, + .rt6i_flags = (RTF_REJECT | RTF_NONEXTHOP), + .rt6i_metric = ~(u32) 0, + .rt6i_ref = ATOMIC_INIT(1), +}; + +struct rt6_info ip6_blk_hole_entry = { + .u = { + .dst = { + .__refcnt = ATOMIC_INIT(1), + .__use = 1, + .dev = &loopback_dev, + .obsolete = -1, + .error = -EINVAL, + .metrics = { [RTAX_HOPLIMIT - 1] = 255, }, + .input = ip6_pkt_discard, + .output = ip6_pkt_discard_out, + .ops = &ip6_dst_ops, + .path = (struct dst_entry*)&ip6_blk_hole_entry, + } + }, + .rt6i_flags = (RTF_REJECT | RTF_NONEXTHOP), + .rt6i_metric = ~(u32) 0, + .rt6i_ref = ATOMIC_INIT(1), +}; + +#endif + /* allocate dst with ip6_dst_ops */ static __inline__ struct rt6_info *ip6_dst_alloc(void) { @@ -2408,10 +2452,16 @@ void __init ip6_route_init(void) #ifdef CONFIG_XFRM xfrm6_init(); #endif +#ifdef CONFIG_IPV6_MULTIPLE_TABLES + fib6_rules_init(); +#endif } void ip6_route_cleanup(void) { +#ifdef CONFIG_IPV6_MULTIPLE_TABLES + fib6_rules_cleanup(); +#endif #ifdef CONFIG_PROC_FS proc_net_remove("ipv6_route"); proc_net_remove("rt6_stats"); -- cgit v0.10.2 From e1ef4bf23b1ced0bf78a1c98289f746486e5c912 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Fri, 4 Aug 2006 03:39:22 -0700 Subject: [IPV4]: Use Protocol Independant Policy Routing Rules Framework Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h index a095d1d..14c82e6 100644 --- a/include/net/ip_fib.h +++ b/include/net/ip_fib.h @@ -18,6 +18,7 @@ #include #include +#include /* WARNING: The ordering of these elements must match ordering * of RTA_* rtnetlink attribute numbers. @@ -203,9 +204,8 @@ static inline void fib_select_default(const struct flowi *flp, struct fib_result #define ip_fib_main_table (fib_tables[RT_TABLE_MAIN]) extern struct fib_table * fib_tables[RT_TABLE_MAX+1]; -extern int fib_lookup(const struct flowi *flp, struct fib_result *res); +extern int fib_lookup(struct flowi *flp, struct fib_result *res); extern struct fib_table *__fib_new_table(int id); -extern void fib_rule_put(struct fib_rule *r); static inline struct fib_table *fib_get_table(int id) { @@ -251,15 +251,15 @@ extern u32 __fib_res_prefsrc(struct fib_result *res); extern struct fib_table *fib_hash_init(int id); #ifdef CONFIG_IP_MULTIPLE_TABLES -/* Exported by fib_rules.c */ +extern int fib4_rules_dump(struct sk_buff *skb, struct netlink_callback *cb); + +extern void __init fib4_rules_init(void); +extern void __exit fib4_rules_cleanup(void); -extern int inet_rtm_delrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); -extern int inet_rtm_newrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); -extern int inet_dump_rules(struct sk_buff *skb, struct netlink_callback *cb); #ifdef CONFIG_NET_CLS_ROUTE extern u32 fib_rules_tclass(struct fib_result *res); #endif -extern void fib_rules_init(void); + #endif static inline void fib_combine_itag(u32 *itag, struct fib_result *res) diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig index 3b5d504..1650b64 100644 --- a/net/ipv4/Kconfig +++ b/net/ipv4/Kconfig @@ -88,6 +88,7 @@ config IP_FIB_HASH config IP_MULTIPLE_TABLES bool "IP: policy routing" depends on IP_ADVANCED_ROUTER + select FIB_RULES ---help--- Normally, a router decides what to do with a received packet based solely on the packet's final destination address. If you say Y here, diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index a6cc31d..9f3ffbe 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -1151,9 +1151,7 @@ static struct rtnetlink_link inet_rtnetlink_table[RTM_NR_MSGTYPES] = { [RTM_GETROUTE - RTM_BASE] = { .doit = inet_rtm_getroute, .dumpit = inet_dump_fib, }, #ifdef CONFIG_IP_MULTIPLE_TABLES - [RTM_NEWRULE - RTM_BASE] = { .doit = inet_rtm_newrule, }, - [RTM_DELRULE - RTM_BASE] = { .doit = inet_rtm_delrule, }, - [RTM_GETRULE - RTM_BASE] = { .dumpit = inet_dump_rules, }, + [RTM_GETRULE - RTM_BASE] = { .dumpit = fib4_rules_dump, }, #endif }; diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index ba2a707..fe4a53d 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -656,7 +656,7 @@ void __init ip_fib_init(void) ip_fib_local_table = fib_hash_init(RT_TABLE_LOCAL); ip_fib_main_table = fib_hash_init(RT_TABLE_MAIN); #else - fib_rules_init(); + fib4_rules_init(); #endif register_netdevice_notifier(&fib_netdev_notifier); diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c index 79b0471..23ec6ae 100644 --- a/net/ipv4/fib_rules.c +++ b/net/ipv4/fib_rules.c @@ -5,9 +5,8 @@ * * IPv4 Forwarding Information Base: policy rules. * - * Version: $Id: fib_rules.c,v 1.17 2001/10/31 21:55:54 davem Exp $ - * * Authors: Alexey Kuznetsov, + * Thomas Graf * * This program is free software; you can redistribute it and/or * modify it under the terms of the GNU General Public License @@ -19,129 +18,154 @@ * Marc Boucher : routing by fwmark */ -#include -#include -#include #include #include -#include -#include -#include -#include -#include -#include -#include -#include -#include #include -#include -#include -#include #include +#include #include #include #include - #include -#include #include #include -#include #include +#include -#define FRprintk(a...) +static struct fib_rules_ops fib4_rules_ops; -struct fib_rule +struct fib4_rule { - struct hlist_node hlist; - atomic_t r_clntref; - u32 r_preference; - unsigned char r_table; - unsigned char r_action; - unsigned char r_dst_len; - unsigned char r_src_len; - u32 r_src; - u32 r_srcmask; - u32 r_dst; - u32 r_dstmask; - u32 r_srcmap; - u8 r_flags; - u8 r_tos; + struct fib_rule common; + u8 dst_len; + u8 src_len; + u8 tos; + u32 src; + u32 srcmask; + u32 dst; + u32 dstmask; #ifdef CONFIG_IP_ROUTE_FWMARK - u32 r_fwmark; + u32 fwmark; #endif - int r_ifindex; #ifdef CONFIG_NET_CLS_ROUTE - __u32 r_tclassid; + u32 tclassid; #endif - char r_ifname[IFNAMSIZ]; - int r_dead; - struct rcu_head rcu; }; -static struct fib_rule default_rule = { - .r_clntref = ATOMIC_INIT(2), - .r_preference = 0x7FFF, - .r_table = RT_TABLE_DEFAULT, - .r_action = RTN_UNICAST, +static struct fib4_rule default_rule = { + .common = { + .refcnt = ATOMIC_INIT(2), + .pref = 0x7FFF, + .table = RT_TABLE_DEFAULT, + .action = FR_ACT_TO_TBL, + }, }; -static struct fib_rule main_rule = { - .r_clntref = ATOMIC_INIT(2), - .r_preference = 0x7FFE, - .r_table = RT_TABLE_MAIN, - .r_action = RTN_UNICAST, +static struct fib4_rule main_rule = { + .common = { + .refcnt = ATOMIC_INIT(2), + .pref = 0x7FFE, + .table = RT_TABLE_MAIN, + .action = FR_ACT_TO_TBL, + }, }; -static struct fib_rule local_rule = { - .r_clntref = ATOMIC_INIT(2), - .r_table = RT_TABLE_LOCAL, - .r_action = RTN_UNICAST, +static struct fib4_rule local_rule = { + .common = { + .refcnt = ATOMIC_INIT(2), + .table = RT_TABLE_LOCAL, + .action = FR_ACT_TO_TBL, + .flags = FIB_RULE_PERMANENT, + }, }; -static struct hlist_head fib_rules; +static LIST_HEAD(fib4_rules); + +#ifdef CONFIG_NET_CLS_ROUTE +u32 fib_rules_tclass(struct fib_result *res) +{ + return res->r ? ((struct fib4_rule *) res->r)->tclassid : 0; +} +#endif -/* writer func called from netlink -- rtnl_sem hold*/ +int fib_lookup(struct flowi *flp, struct fib_result *res) +{ + struct fib_lookup_arg arg = { + .result = res, + }; + int err; -static void rtmsg_rule(int, struct fib_rule *); + err = fib_rules_lookup(&fib4_rules_ops, flp, 0, &arg); + res->r = arg.rule; -int inet_rtm_delrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) + return err; +} + +int fib4_rule_action(struct fib_rule *rule, struct flowi *flp, int flags, + struct fib_lookup_arg *arg) { - struct rtattr **rta = arg; - struct rtmsg *rtm = NLMSG_DATA(nlh); - struct fib_rule *r; - struct hlist_node *node; - int err = -ESRCH; - - hlist_for_each_entry(r, node, &fib_rules, hlist) { - if ((!rta[RTA_SRC-1] || memcmp(RTA_DATA(rta[RTA_SRC-1]), &r->r_src, 4) == 0) && - rtm->rtm_src_len == r->r_src_len && - rtm->rtm_dst_len == r->r_dst_len && - (!rta[RTA_DST-1] || memcmp(RTA_DATA(rta[RTA_DST-1]), &r->r_dst, 4) == 0) && - rtm->rtm_tos == r->r_tos && -#ifdef CONFIG_IP_ROUTE_FWMARK - (!rta[RTA_PROTOINFO-1] || memcmp(RTA_DATA(rta[RTA_PROTOINFO-1]), &r->r_fwmark, 4) == 0) && -#endif - (!rtm->rtm_type || rtm->rtm_type == r->r_action) && - (!rta[RTA_PRIORITY-1] || memcmp(RTA_DATA(rta[RTA_PRIORITY-1]), &r->r_preference, 4) == 0) && - (!rta[RTA_IIF-1] || rtattr_strcmp(rta[RTA_IIF-1], r->r_ifname) == 0) && - (!rtm->rtm_table || (r && rtm->rtm_table == r->r_table))) { - err = -EPERM; - if (r == &local_rule) - break; - - hlist_del_rcu(&r->hlist); - r->r_dead = 1; - rtmsg_rule(RTM_DELRULE, r); - fib_rule_put(r); - err = 0; - break; - } + int err = -EAGAIN; + struct fib_table *tbl; + + switch (rule->action) { + case FR_ACT_TO_TBL: + break; + + case FR_ACT_UNREACHABLE: + err = -ENETUNREACH; + goto errout; + + case FR_ACT_PROHIBIT: + err = -EACCES; + goto errout; + + case FR_ACT_BLACKHOLE: + default: + err = -EINVAL; + goto errout; } + + if ((tbl = fib_get_table(rule->table)) == NULL) + goto errout; + + err = tbl->tb_lookup(tbl, flp, (struct fib_result *) arg->result); + if (err > 0) + err = -EAGAIN; +errout: return err; } -/* Allocate new unique table id */ + +void fib_select_default(const struct flowi *flp, struct fib_result *res) +{ + if (res->r && res->r->action == FR_ACT_TO_TBL && + FIB_RES_GW(*res) && FIB_RES_NH(*res).nh_scope == RT_SCOPE_LINK) { + struct fib_table *tb; + if ((tb = fib_get_table(res->r->table)) != NULL) + tb->tb_select_default(tb, flp, res); + } +} + +static int fib4_rule_match(struct fib_rule *rule, struct flowi *fl, int flags) +{ + struct fib4_rule *r = (struct fib4_rule *) rule; + u32 daddr = fl->fl4_dst; + u32 saddr = fl->fl4_src; + + if (((saddr ^ r->src) & r->srcmask) || + ((daddr ^ r->dst) & r->dstmask)) + return 0; + + if (r->tos && (r->tos != fl->fl4_tos)) + return 0; + +#ifdef CONFIG_IP_ROUTE_FWMARK + if (r->fwmark && (r->fwmark != fl->fl4_fwmark)) + return 0; +#endif + + return 1; +} static struct fib_table *fib_empty_table(void) { @@ -153,329 +177,178 @@ static struct fib_table *fib_empty_table(void) return NULL; } -static inline void fib_rule_put_rcu(struct rcu_head *head) -{ - struct fib_rule *r = container_of(head, struct fib_rule, rcu); - kfree(r); -} +static struct nla_policy fib4_rule_policy[FRA_MAX+1] __read_mostly = { + [FRA_IFNAME] = { .type = NLA_STRING }, + [FRA_PRIORITY] = { .type = NLA_U32 }, + [FRA_SRC] = { .type = NLA_U32 }, + [FRA_DST] = { .type = NLA_U32 }, + [FRA_FWMARK] = { .type = NLA_U32 }, + [FRA_FLOW] = { .type = NLA_U32 }, +}; -void fib_rule_put(struct fib_rule *r) +static int fib4_rule_configure(struct fib_rule *rule, struct sk_buff *skb, + struct nlmsghdr *nlh, struct fib_rule_hdr *frh, + struct nlattr **tb) { - if (atomic_dec_and_test(&r->r_clntref)) { - if (r->r_dead) - call_rcu(&r->rcu, fib_rule_put_rcu); - else - printk("Freeing alive rule %p\n", r); - } -} + int err = -EINVAL; + struct fib4_rule *rule4 = (struct fib4_rule *) rule; -/* writer func called from netlink -- rtnl_sem hold*/ + if (frh->src_len > 32 || frh->dst_len > 32 || + (frh->tos & ~IPTOS_TOS_MASK)) + goto errout; -int inet_rtm_newrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) -{ - struct rtattr **rta = arg; - struct rtmsg *rtm = NLMSG_DATA(nlh); - struct fib_rule *r, *new_r, *last = NULL; - struct hlist_node *node = NULL; - unsigned char table_id; - - if (rtm->rtm_src_len > 32 || rtm->rtm_dst_len > 32 || - (rtm->rtm_tos & ~IPTOS_TOS_MASK)) - return -EINVAL; - - if (rta[RTA_IIF-1] && RTA_PAYLOAD(rta[RTA_IIF-1]) > IFNAMSIZ) - return -EINVAL; - - table_id = rtm->rtm_table; - if (table_id == RT_TABLE_UNSPEC) { - struct fib_table *table; - if (rtm->rtm_type == RTN_UNICAST) { - if ((table = fib_empty_table()) == NULL) - return -ENOBUFS; - table_id = table->tb_id; - } - } + if (rule->table == RT_TABLE_UNSPEC) { + if (rule->action == FR_ACT_TO_TBL) { + struct fib_table *table; - new_r = kzalloc(sizeof(*new_r), GFP_KERNEL); - if (!new_r) - return -ENOMEM; - - if (rta[RTA_SRC-1]) - memcpy(&new_r->r_src, RTA_DATA(rta[RTA_SRC-1]), 4); - if (rta[RTA_DST-1]) - memcpy(&new_r->r_dst, RTA_DATA(rta[RTA_DST-1]), 4); - if (rta[RTA_GATEWAY-1]) - memcpy(&new_r->r_srcmap, RTA_DATA(rta[RTA_GATEWAY-1]), 4); - new_r->r_src_len = rtm->rtm_src_len; - new_r->r_dst_len = rtm->rtm_dst_len; - new_r->r_srcmask = inet_make_mask(rtm->rtm_src_len); - new_r->r_dstmask = inet_make_mask(rtm->rtm_dst_len); - new_r->r_tos = rtm->rtm_tos; -#ifdef CONFIG_IP_ROUTE_FWMARK - if (rta[RTA_PROTOINFO-1]) - memcpy(&new_r->r_fwmark, RTA_DATA(rta[RTA_PROTOINFO-1]), 4); -#endif - new_r->r_action = rtm->rtm_type; - new_r->r_flags = rtm->rtm_flags; - if (rta[RTA_PRIORITY-1]) - memcpy(&new_r->r_preference, RTA_DATA(rta[RTA_PRIORITY-1]), 4); - new_r->r_table = table_id; - if (rta[RTA_IIF-1]) { - struct net_device *dev; - rtattr_strlcpy(new_r->r_ifname, rta[RTA_IIF-1], IFNAMSIZ); - new_r->r_ifindex = -1; - dev = __dev_get_by_name(new_r->r_ifname); - if (dev) - new_r->r_ifindex = dev->ifindex; - } -#ifdef CONFIG_NET_CLS_ROUTE - if (rta[RTA_FLOW-1]) - memcpy(&new_r->r_tclassid, RTA_DATA(rta[RTA_FLOW-1]), 4); -#endif - r = container_of(fib_rules.first, struct fib_rule, hlist); + table = fib_empty_table(); + if (table == NULL) { + err = -ENOBUFS; + goto errout; + } - if (!new_r->r_preference) { - if (r && r->hlist.next != NULL) { - r = container_of(r->hlist.next, struct fib_rule, hlist); - if (r->r_preference) - new_r->r_preference = r->r_preference - 1; + rule->table = table->tb_id; } } - hlist_for_each_entry(r, node, &fib_rules, hlist) { - if (r->r_preference > new_r->r_preference) - break; - last = r; - } - atomic_inc(&new_r->r_clntref); + if (tb[FRA_SRC]) + rule4->src = nla_get_u32(tb[FRA_SRC]); - if (last) - hlist_add_after_rcu(&last->hlist, &new_r->hlist); - else - hlist_add_before_rcu(&new_r->hlist, &r->hlist); + if (tb[FRA_DST]) + rule4->dst = nla_get_u32(tb[FRA_DST]); - rtmsg_rule(RTM_NEWRULE, new_r); - return 0; -} +#ifdef CONFIG_IP_ROUTE_FWMARK + if (tb[FRA_FWMARK]) + rule4->fwmark = nla_get_u32(tb[FRA_FWMARK]); +#endif #ifdef CONFIG_NET_CLS_ROUTE -u32 fib_rules_tclass(struct fib_result *res) -{ - if (res->r) - return res->r->r_tclassid; - return 0; -} + if (tb[FRA_FLOW]) + rule4->tclassid = nla_get_u32(tb[FRA_FLOW]); #endif -/* callers should hold rtnl semaphore */ - -static void fib_rules_detach(struct net_device *dev) -{ - struct hlist_node *node; - struct fib_rule *r; - - hlist_for_each_entry(r, node, &fib_rules, hlist) { - if (r->r_ifindex == dev->ifindex) - r->r_ifindex = -1; + rule4->src_len = frh->src_len; + rule4->srcmask = inet_make_mask(rule4->src_len); + rule4->dst_len = frh->dst_len; + rule4->dstmask = inet_make_mask(rule4->dst_len); + rule4->tos = frh->tos; - } -} - -/* callers should hold rtnl semaphore */ - -static void fib_rules_attach(struct net_device *dev) -{ - struct hlist_node *node; - struct fib_rule *r; - - hlist_for_each_entry(r, node, &fib_rules, hlist) { - if (r->r_ifindex == -1 && strcmp(dev->name, r->r_ifname) == 0) - r->r_ifindex = dev->ifindex; - } + err = 0; +errout: + return err; } -int fib_lookup(const struct flowi *flp, struct fib_result *res) +static int fib4_rule_compare(struct fib_rule *rule, struct fib_rule_hdr *frh, + struct nlattr **tb) { - int err; - struct fib_rule *r, *policy; - struct fib_table *tb; - struct hlist_node *node; + struct fib4_rule *rule4 = (struct fib4_rule *) rule; - u32 daddr = flp->fl4_dst; - u32 saddr = flp->fl4_src; + if (frh->src_len && (rule4->src_len != frh->src_len)) + return 0; -FRprintk("Lookup: %u.%u.%u.%u <- %u.%u.%u.%u ", - NIPQUAD(flp->fl4_dst), NIPQUAD(flp->fl4_src)); + if (frh->dst_len && (rule4->dst_len != frh->dst_len)) + return 0; - rcu_read_lock(); + if (frh->tos && (rule4->tos != frh->tos)) + return 0; - hlist_for_each_entry_rcu(r, node, &fib_rules, hlist) { - if (((saddr^r->r_src) & r->r_srcmask) || - ((daddr^r->r_dst) & r->r_dstmask) || - (r->r_tos && r->r_tos != flp->fl4_tos) || #ifdef CONFIG_IP_ROUTE_FWMARK - (r->r_fwmark && r->r_fwmark != flp->fl4_fwmark) || + if (tb[FRA_FWMARK] && (rule4->fwmark != nla_get_u32(tb[FRA_FWMARK]))) + return 0; #endif - (r->r_ifindex && r->r_ifindex != flp->iif)) - continue; - -FRprintk("tb %d r %d ", r->r_table, r->r_action); - switch (r->r_action) { - case RTN_UNICAST: - policy = r; - break; - case RTN_UNREACHABLE: - rcu_read_unlock(); - return -ENETUNREACH; - default: - case RTN_BLACKHOLE: - rcu_read_unlock(); - return -EINVAL; - case RTN_PROHIBIT: - rcu_read_unlock(); - return -EACCES; - } - if ((tb = fib_get_table(r->r_table)) == NULL) - continue; - err = tb->tb_lookup(tb, flp, res); - if (err == 0) { - res->r = policy; - if (policy) - atomic_inc(&policy->r_clntref); - rcu_read_unlock(); - return 0; - } - if (err < 0 && err != -EAGAIN) { - rcu_read_unlock(); - return err; - } - } -FRprintk("FAILURE\n"); - rcu_read_unlock(); - return -ENETUNREACH; -} +#ifdef CONFIG_NET_CLS_ROUTE + if (tb[FRA_FLOW] && (rule4->tclassid != nla_get_u32(tb[FRA_FLOW]))) + return 0; +#endif -void fib_select_default(const struct flowi *flp, struct fib_result *res) -{ - if (res->r && res->r->r_action == RTN_UNICAST && - FIB_RES_GW(*res) && FIB_RES_NH(*res).nh_scope == RT_SCOPE_LINK) { - struct fib_table *tb; - if ((tb = fib_get_table(res->r->r_table)) != NULL) - tb->tb_select_default(tb, flp, res); - } -} + if (tb[FRA_SRC] && (rule4->src != nla_get_u32(tb[FRA_SRC]))) + return 0; -static int fib_rules_event(struct notifier_block *this, unsigned long event, void *ptr) -{ - struct net_device *dev = ptr; + if (tb[FRA_DST] && (rule4->dst != nla_get_u32(tb[FRA_DST]))) + return 0; - if (event == NETDEV_UNREGISTER) - fib_rules_detach(dev); - else if (event == NETDEV_REGISTER) - fib_rules_attach(dev); - return NOTIFY_DONE; + return 1; } +static int fib4_rule_fill(struct fib_rule *rule, struct sk_buff *skb, + struct nlmsghdr *nlh, struct fib_rule_hdr *frh) +{ + struct fib4_rule *rule4 = (struct fib4_rule *) rule; -static struct notifier_block fib_rules_notifier = { - .notifier_call =fib_rules_event, -}; + frh->family = AF_INET; + frh->dst_len = rule4->dst_len; + frh->src_len = rule4->src_len; + frh->tos = rule4->tos; -static __inline__ int inet_fill_rule(struct sk_buff *skb, - struct fib_rule *r, - u32 pid, u32 seq, int event, - unsigned int flags) -{ - struct rtmsg *rtm; - struct nlmsghdr *nlh; - unsigned char *b = skb->tail; - - nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*rtm), flags); - rtm = NLMSG_DATA(nlh); - rtm->rtm_family = AF_INET; - rtm->rtm_dst_len = r->r_dst_len; - rtm->rtm_src_len = r->r_src_len; - rtm->rtm_tos = r->r_tos; #ifdef CONFIG_IP_ROUTE_FWMARK - if (r->r_fwmark) - RTA_PUT(skb, RTA_PROTOINFO, 4, &r->r_fwmark); + if (rule4->fwmark) + NLA_PUT_U32(skb, FRA_FWMARK, rule4->fwmark); #endif - rtm->rtm_table = r->r_table; - rtm->rtm_protocol = 0; - rtm->rtm_scope = 0; - rtm->rtm_type = r->r_action; - rtm->rtm_flags = r->r_flags; - - if (r->r_dst_len) - RTA_PUT(skb, RTA_DST, 4, &r->r_dst); - if (r->r_src_len) - RTA_PUT(skb, RTA_SRC, 4, &r->r_src); - if (r->r_ifname[0]) - RTA_PUT(skb, RTA_IIF, IFNAMSIZ, &r->r_ifname); - if (r->r_preference) - RTA_PUT(skb, RTA_PRIORITY, 4, &r->r_preference); - if (r->r_srcmap) - RTA_PUT(skb, RTA_GATEWAY, 4, &r->r_srcmap); + + if (rule4->dst_len) + NLA_PUT_U32(skb, FRA_DST, rule4->dst); + + if (rule4->src_len) + NLA_PUT_U32(skb, FRA_SRC, rule4->src); + #ifdef CONFIG_NET_CLS_ROUTE - if (r->r_tclassid) - RTA_PUT(skb, RTA_FLOW, 4, &r->r_tclassid); + if (rule4->tclassid) + NLA_PUT_U32(skb, FRA_FLOW, rule4->tclassid); #endif - nlh->nlmsg_len = skb->tail - b; - return skb->len; + return 0; -nlmsg_failure: -rtattr_failure: - skb_trim(skb, b - skb->data); - return -1; +nla_put_failure: + return -ENOBUFS; } -/* callers should hold rtnl semaphore */ +int fib4_rules_dump(struct sk_buff *skb, struct netlink_callback *cb) +{ + return fib_rules_dump(skb, cb, AF_INET); +} -static void rtmsg_rule(int event, struct fib_rule *r) +static u32 fib4_rule_default_pref(void) { - int size = NLMSG_SPACE(sizeof(struct rtmsg) + 128); - struct sk_buff *skb = alloc_skb(size, GFP_KERNEL); - - if (!skb) - netlink_set_err(rtnl, 0, RTNLGRP_IPV4_RULE, ENOBUFS); - else if (inet_fill_rule(skb, r, 0, 0, event, 0) < 0) { - kfree_skb(skb); - netlink_set_err(rtnl, 0, RTNLGRP_IPV4_RULE, EINVAL); - } else { - netlink_broadcast(rtnl, skb, 0, RTNLGRP_IPV4_RULE, GFP_KERNEL); + struct list_head *pos; + struct fib_rule *rule; + + if (!list_empty(&fib4_rules)) { + pos = fib4_rules.next; + if (pos->next != &fib4_rules) { + rule = list_entry(pos->next, struct fib_rule, list); + if (rule->pref) + return rule->pref - 1; + } } + + return 0; } -int inet_dump_rules(struct sk_buff *skb, struct netlink_callback *cb) +static struct fib_rules_ops fib4_rules_ops = { + .family = AF_INET, + .rule_size = sizeof(struct fib4_rule), + .action = fib4_rule_action, + .match = fib4_rule_match, + .configure = fib4_rule_configure, + .compare = fib4_rule_compare, + .fill = fib4_rule_fill, + .default_pref = fib4_rule_default_pref, + .nlgroup = RTNLGRP_IPV4_RULE, + .policy = fib4_rule_policy, + .rules_list = &fib4_rules, + .owner = THIS_MODULE, +}; + +void __init fib4_rules_init(void) { - int idx = 0; - int s_idx = cb->args[0]; - struct fib_rule *r; - struct hlist_node *node; - - rcu_read_lock(); - hlist_for_each_entry(r, node, &fib_rules, hlist) { - if (idx < s_idx) - goto next; - if (inet_fill_rule(skb, r, NETLINK_CB(cb->skb).pid, - cb->nlh->nlmsg_seq, - RTM_NEWRULE, NLM_F_MULTI) < 0) - break; -next: - idx++; - } - rcu_read_unlock(); - cb->args[0] = idx; + list_add_tail(&local_rule.common.list, &fib4_rules); + list_add_tail(&main_rule.common.list, &fib4_rules); + list_add_tail(&default_rule.common.list, &fib4_rules); - return skb->len; + fib_rules_register(&fib4_rules_ops); } -void __init fib_rules_init(void) +void __exit fib4_rules_cleanup(void) { - INIT_HLIST_HEAD(&fib_rules); - hlist_add_head(&local_rule.hlist, &fib_rules); - hlist_add_after(&local_rule.hlist, &main_rule.hlist); - hlist_add_after(&main_rule.hlist, &default_rule.hlist); - register_netdevice_notifier(&fib_rules_notifier); + fib_rules_unregister(&fib4_rules_ops); } -- cgit v0.10.2 From fe4944e59c357f945f81bc67edb7ed1392e875ad Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Fri, 4 Aug 2006 23:03:05 -0700 Subject: [NETLINK]: Extend netlink messaging interface Adds: nlmsg_get_pos() return current position in message nlmsg_trim() trim part of message nla_reserve_nohdr(skb, len) reserve room for an attribute w/o hdr nla_put_nohdr(skb, len, data) add attribute w/o hdr nla_find_nested() find attribute in nested attributes Fixes nlmsg_new() to take allocation flags and consider size. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/net/netlink.h b/include/net/netlink.h index 640c26a..3a5e40b 100644 --- a/include/net/netlink.h +++ b/include/net/netlink.h @@ -35,6 +35,8 @@ * nlmsg_put() add a netlink message to an skb * nlmsg_put_answer() callback based nlmsg_put() * nlmsg_end() finanlize netlink message + * nlmsg_get_pos() return current position in message + * nlmsg_trim() trim part of message * nlmsg_cancel() cancel message construction * nlmsg_free() free a netlink message * @@ -80,8 +82,10 @@ * struct nlattr netlink attribtue header * * Attribute Construction: - * nla_reserve(skb, type, len) reserve skb tailroom for an attribute + * nla_reserve(skb, type, len) reserve room for an attribute + * nla_reserve_nohdr(skb, len) reserve room for an attribute w/o hdr * nla_put(skb, type, len, data) add attribute to skb + * nla_put_nohdr(skb, len, data) add attribute w/o hdr * * Attribute Construction for Basic Types: * nla_put_u8(skb, type, value) add u8 attribute to skb @@ -139,6 +143,7 @@ * nla_next(nla, remaining) get next netlink attribute * nla_validate() validate a stream of attributes * nla_find() find attribute in stream of attributes + * nla_find_nested() find attribute in nested attributes * nla_parse() parse and validate stream of attrs * nla_parse_nested() parse nested attribuets * nla_for_each_attr() loop over all attributes @@ -203,12 +208,18 @@ extern int nla_memcmp(const struct nlattr *nla, const void *data, extern int nla_strcmp(const struct nlattr *nla, const char *str); extern struct nlattr * __nla_reserve(struct sk_buff *skb, int attrtype, int attrlen); +extern void * __nla_reserve_nohdr(struct sk_buff *skb, int attrlen); extern struct nlattr * nla_reserve(struct sk_buff *skb, int attrtype, int attrlen); +extern void * nla_reserve_nohdr(struct sk_buff *skb, int attrlen); extern void __nla_put(struct sk_buff *skb, int attrtype, int attrlen, const void *data); +extern void __nla_put_nohdr(struct sk_buff *skb, int attrlen, + const void *data); extern int nla_put(struct sk_buff *skb, int attrtype, int attrlen, const void *data); +extern int nla_put_nohdr(struct sk_buff *skb, int attrlen, + const void *data); /************************************************************************** * Netlink Messages @@ -453,12 +464,13 @@ static inline struct nlmsghdr *nlmsg_put_answer(struct sk_buff *skb, /** * nlmsg_new - Allocate a new netlink message * @size: maximum size of message + * @flags: the type of memory to allocate. * * Use NLMSG_GOODSIZE if size isn't know and you need a good default size. */ -static inline struct sk_buff *nlmsg_new(int size) +static inline struct sk_buff *nlmsg_new(int size, gfp_t flags) { - return alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL); + return alloc_skb(size, flags); } /** @@ -480,6 +492,32 @@ static inline int nlmsg_end(struct sk_buff *skb, struct nlmsghdr *nlh) } /** + * nlmsg_get_pos - return current position in netlink message + * @skb: socket buffer the message is stored in + * + * Returns a pointer to the current tail of the message. + */ +static inline void *nlmsg_get_pos(struct sk_buff *skb) +{ + return skb->tail; +} + +/** + * nlmsg_trim - Trim message to a mark + * @skb: socket buffer the message is stored in + * @mark: mark to trim to + * + * Trims the message to the provided mark. Returns -1. + */ +static inline int nlmsg_trim(struct sk_buff *skb, void *mark) +{ + if (mark) + skb_trim(skb, (unsigned char *) mark - skb->data); + + return -1; +} + +/** * nlmsg_cancel - Cancel construction of a netlink message * @skb: socket buffer the message is stored in * @nlh: netlink message header @@ -489,9 +527,7 @@ static inline int nlmsg_end(struct sk_buff *skb, struct nlmsghdr *nlh) */ static inline int nlmsg_cancel(struct sk_buff *skb, struct nlmsghdr *nlh) { - skb_trim(skb, (unsigned char *) nlh - skb->data); - - return -1; + return nlmsg_trim(skb, nlh); } /** @@ -631,6 +667,18 @@ static inline struct nlattr *nla_next(const struct nlattr *nla, int *remaining) } /** + * nla_find_nested - find attribute in a set of nested attributes + * @nla: attribute containing the nested attributes + * @attrtype: type of attribute to look for + * + * Returns the first attribute which matches the specified type. + */ +static inline struct nlattr *nla_find_nested(struct nlattr *nla, int attrtype) +{ + return nla_find(nla_data(nla), nla_len(nla), attrtype); +} + +/** * nla_parse_nested - parse nested attributes * @tb: destination array with maxtype+1 elements * @maxtype: maximum attribute type to be expected @@ -862,10 +910,7 @@ static inline int nla_nest_end(struct sk_buff *skb, struct nlattr *start) */ static inline int nla_nest_cancel(struct sk_buff *skb, struct nlattr *start) { - if (start) - skb_trim(skb, (unsigned char *) start - skb->data); - - return -1; + return nlmsg_trim(skb, start); } /** @@ -880,4 +925,13 @@ static inline int nla_nest_cancel(struct sk_buff *skb, struct nlattr *start) nla_ok(pos, rem); \ pos = nla_next(pos, &(rem))) +/** + * nla_for_each_nested - iterate over nested attributes + * @pos: loop counter, set to current attribute + * @nla: attribute containing the nested attributes + * @rem: initialized to len, holds bytes currently remaining in stream + */ +#define nla_for_each_nested(pos, nla, rem) \ + nla_for_each_attr(pos, nla_data(nla), nla_len(nla), rem) + #endif diff --git a/kernel/taskstats.c b/kernel/taskstats.c index e781876..2ed4040 100644 --- a/kernel/taskstats.c +++ b/kernel/taskstats.c @@ -75,7 +75,7 @@ static int prepare_reply(struct genl_info *info, u8 cmd, struct sk_buff **skbp, /* * If new attributes are added, please revisit this allocation */ - skb = nlmsg_new(size); + skb = nlmsg_new(size, GFP_KERNEL); if (!skb) return -ENOMEM; diff --git a/net/netlink/attr.c b/net/netlink/attr.c index dddbd15..136e529 100644 --- a/net/netlink/attr.c +++ b/net/netlink/attr.c @@ -255,6 +255,26 @@ struct nlattr *__nla_reserve(struct sk_buff *skb, int attrtype, int attrlen) } /** + * __nla_reserve_nohdr - reserve room for attribute without header + * @skb: socket buffer to reserve room on + * @attrlen: length of attribute payload + * + * Reserves room for attribute payload without a header. + * + * The caller is responsible to ensure that the skb provides enough + * tailroom for the payload. + */ +void *__nla_reserve_nohdr(struct sk_buff *skb, int attrlen) +{ + void *start; + + start = skb_put(skb, NLA_ALIGN(attrlen)); + memset(start, 0, NLA_ALIGN(attrlen)); + + return start; +} + +/** * nla_reserve - reserve room for attribute on the skb * @skb: socket buffer to reserve room on * @attrtype: attribute type @@ -275,6 +295,24 @@ struct nlattr *nla_reserve(struct sk_buff *skb, int attrtype, int attrlen) } /** + * nla_reserve - reserve room for attribute without header + * @skb: socket buffer to reserve room on + * @len: length of attribute payload + * + * Reserves room for attribute payload without a header. + * + * Returns NULL if the tailroom of the skb is insufficient to store + * the attribute payload. + */ +void *nla_reserve_nohdr(struct sk_buff *skb, int attrlen) +{ + if (unlikely(skb_tailroom(skb) < NLA_ALIGN(attrlen))) + return NULL; + + return __nla_reserve_nohdr(skb, attrlen); +} + +/** * __nla_put - Add a netlink attribute to a socket buffer * @skb: socket buffer to add attribute to * @attrtype: attribute type @@ -293,6 +331,22 @@ void __nla_put(struct sk_buff *skb, int attrtype, int attrlen, memcpy(nla_data(nla), data, attrlen); } +/** + * __nla_put_nohdr - Add a netlink attribute without header + * @skb: socket buffer to add attribute to + * @attrlen: length of attribute payload + * @data: head of attribute payload + * + * The caller is responsible to ensure that the skb provides enough + * tailroom for the attribute payload. + */ +void __nla_put_nohdr(struct sk_buff *skb, int attrlen, const void *data) +{ + void *start; + + start = __nla_reserve_nohdr(skb, attrlen); + memcpy(start, data, attrlen); +} /** * nla_put - Add a netlink attribute to a socket buffer @@ -313,15 +367,36 @@ int nla_put(struct sk_buff *skb, int attrtype, int attrlen, const void *data) return 0; } +/** + * nla_put_nohdr - Add a netlink attribute without header + * @skb: socket buffer to add attribute to + * @attrlen: length of attribute payload + * @data: head of attribute payload + * + * Returns -1 if the tailroom of the skb is insufficient to store + * the attribute payload. + */ +int nla_put_nohdr(struct sk_buff *skb, int attrlen, const void *data) +{ + if (unlikely(skb_tailroom(skb) < NLA_ALIGN(attrlen))) + return -1; + + __nla_put_nohdr(skb, attrlen, data); + return 0; +} EXPORT_SYMBOL(nla_validate); EXPORT_SYMBOL(nla_parse); EXPORT_SYMBOL(nla_find); EXPORT_SYMBOL(nla_strlcpy); EXPORT_SYMBOL(__nla_reserve); +EXPORT_SYMBOL(__nla_reserve_nohdr); EXPORT_SYMBOL(nla_reserve); +EXPORT_SYMBOL(nla_reserve_nohdr); EXPORT_SYMBOL(__nla_put); +EXPORT_SYMBOL(__nla_put_nohdr); EXPORT_SYMBOL(nla_put); +EXPORT_SYMBOL(nla_put_nohdr); EXPORT_SYMBOL(nla_memcpy); EXPORT_SYMBOL(nla_memcmp); EXPORT_SYMBOL(nla_strcmp); diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c index a298f77..75bb47a 100644 --- a/net/netlink/genetlink.c +++ b/net/netlink/genetlink.c @@ -440,7 +440,7 @@ static struct sk_buff *ctrl_build_msg(struct genl_family *family, u32 pid, struct sk_buff *skb; int err; - skb = nlmsg_new(NLMSG_GOODSIZE); + skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL); if (skb == NULL) return ERR_PTR(-ENOBUFS); -- cgit v0.10.2 From bf8b79e444a748963c71d2a58709e1ce5597e1b5 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Fri, 4 Aug 2006 23:03:29 -0700 Subject: [NETLINK]: Convert core netlink handling to new netlink api Fixes a theoretical memory and locking leak when the size of the netlink header would exceed the skb tailroom. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 8b85036..0f36ddc 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -1147,7 +1147,7 @@ static int netlink_sendmsg(struct kiocb *kiocb, struct socket *sock, if (len > sk->sk_sndbuf - 32) goto out; err = -ENOBUFS; - skb = alloc_skb(len, GFP_KERNEL); + skb = nlmsg_new(len, GFP_KERNEL); if (skb==NULL) goto out; @@ -1341,19 +1341,18 @@ static int netlink_dump(struct sock *sk) struct netlink_callback *cb; struct sk_buff *skb; struct nlmsghdr *nlh; - int len; + int len, err = -ENOBUFS; skb = sock_rmalloc(sk, NLMSG_GOODSIZE, 0, GFP_KERNEL); if (!skb) - return -ENOBUFS; + goto errout; spin_lock(&nlk->cb_lock); cb = nlk->cb; if (cb == NULL) { - spin_unlock(&nlk->cb_lock); - kfree_skb(skb); - return -EINVAL; + err = -EINVAL; + goto errout_skb; } len = cb->dump(skb, cb); @@ -1365,8 +1364,12 @@ static int netlink_dump(struct sock *sk) return 0; } - nlh = NLMSG_NEW_ANSWER(skb, cb, NLMSG_DONE, sizeof(len), NLM_F_MULTI); - memcpy(NLMSG_DATA(nlh), &len, sizeof(len)); + nlh = nlmsg_put_answer(skb, cb, NLMSG_DONE, sizeof(len), NLM_F_MULTI); + if (!nlh) + goto errout_skb; + + memcpy(nlmsg_data(nlh), &len, sizeof(len)); + skb_queue_tail(&sk->sk_receive_queue, skb); sk->sk_data_ready(sk, skb->len); @@ -1378,8 +1381,11 @@ static int netlink_dump(struct sock *sk) netlink_destroy_callback(cb); return 0; -nlmsg_failure: - return -ENOBUFS; +errout_skb: + spin_unlock(&nlk->cb_lock); + kfree_skb(skb); +errout: + return err; } int netlink_dump_start(struct sock *ssk, struct sk_buff *skb, @@ -1431,11 +1437,11 @@ void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err) int size; if (err == 0) - size = NLMSG_SPACE(sizeof(struct nlmsgerr)); + size = nlmsg_total_size(sizeof(*errmsg)); else - size = NLMSG_SPACE(4 + NLMSG_ALIGN(nlh->nlmsg_len)); + size = nlmsg_total_size(sizeof(*errmsg) + nlmsg_len(nlh)); - skb = alloc_skb(size, GFP_KERNEL); + skb = nlmsg_new(size, GFP_KERNEL); if (!skb) { struct sock *sk; @@ -1451,16 +1457,15 @@ void netlink_ack(struct sk_buff *in_skb, struct nlmsghdr *nlh, int err) rep = __nlmsg_put(skb, NETLINK_CB(in_skb).pid, nlh->nlmsg_seq, NLMSG_ERROR, sizeof(struct nlmsgerr), 0); - errmsg = NLMSG_DATA(rep); + errmsg = nlmsg_data(rep); errmsg->error = err; - memcpy(&errmsg->msg, nlh, err ? nlh->nlmsg_len : sizeof(struct nlmsghdr)); + memcpy(&errmsg->msg, nlh, err ? nlh->nlmsg_len : sizeof(*nlh)); netlink_unicast(in_skb->sk, skb, NETLINK_CB(in_skb).pid, MSG_DONTWAIT); } static int netlink_rcv_skb(struct sk_buff *skb, int (*cb)(struct sk_buff *, struct nlmsghdr *, int *)) { - unsigned int total_len; struct nlmsghdr *nlh; int err; @@ -1470,8 +1475,6 @@ static int netlink_rcv_skb(struct sk_buff *skb, int (*cb)(struct sk_buff *, if (nlh->nlmsg_len < NLMSG_HDRLEN || skb->len < nlh->nlmsg_len) return 0; - total_len = min(NLMSG_ALIGN(nlh->nlmsg_len), skb->len); - if (cb(skb, nlh, &err) < 0) { /* Not an error, but we have to interrupt processing * here. Note: that in this case we do not pull @@ -1483,7 +1486,7 @@ static int netlink_rcv_skb(struct sk_buff *skb, int (*cb)(struct sk_buff *, } else if (nlh->nlmsg_flags & NLM_F_ACK) netlink_ack(skb, nlh, 0); - skb_pull(skb, total_len); + netlink_queue_skip(nlh, skb); } return 0; -- cgit v0.10.2 From 5c7539781d392629fb40b04aad9a1f197b66cd01 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Fri, 4 Aug 2006 23:03:53 -0700 Subject: [IPV4]: Convert address addition to new netlink api Adds rtm_to_ifaddr() transforming a netlink message to a struct in_ifaddr. Fixes various unvalidated netlink attributes causing memory corruptions when left empty by userspace applications. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index 9f3ffbe..6b297c8 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -62,6 +62,7 @@ #include #include #include +#include struct ipv4_devconf ipv4_devconf = { .accept_redirects = 1, @@ -78,6 +79,14 @@ static struct ipv4_devconf ipv4_devconf_dflt = { .accept_source_route = 1, }; +static struct nla_policy ifa_ipv4_policy[IFA_MAX+1] __read_mostly = { + [IFA_LOCAL] = { .type = NLA_U32 }, + [IFA_ADDRESS] = { .type = NLA_U32 }, + [IFA_BROADCAST] = { .type = NLA_U32 }, + [IFA_ANYCAST] = { .type = NLA_U32 }, + [IFA_LABEL] = { .type = NLA_STRING }, +}; + static void rtmsg_ifa(int event, struct in_ifaddr *); static BLOCKING_NOTIFIER_HEAD(inetaddr_chain); @@ -451,57 +460,90 @@ out: return -EADDRNOTAVAIL; } -static int inet_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) +static struct in_ifaddr *rtm_to_ifaddr(struct nlmsghdr *nlh) { - struct rtattr **rta = arg; + struct nlattr *tb[IFA_MAX+1]; + struct in_ifaddr *ifa; + struct ifaddrmsg *ifm; struct net_device *dev; struct in_device *in_dev; - struct ifaddrmsg *ifm = NLMSG_DATA(nlh); - struct in_ifaddr *ifa; - int rc = -EINVAL; + int err = -EINVAL; - ASSERT_RTNL(); + err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFA_MAX, ifa_ipv4_policy); + if (err < 0) + goto errout; - if (ifm->ifa_prefixlen > 32 || !rta[IFA_LOCAL - 1]) - goto out; + ifm = nlmsg_data(nlh); + if (ifm->ifa_prefixlen > 32 || tb[IFA_LOCAL] == NULL) + goto errout; - rc = -ENODEV; - if ((dev = __dev_get_by_index(ifm->ifa_index)) == NULL) - goto out; + dev = __dev_get_by_index(ifm->ifa_index); + if (dev == NULL) { + err = -ENODEV; + goto errout; + } - rc = -ENOBUFS; - if ((in_dev = __in_dev_get_rtnl(dev)) == NULL) { + in_dev = __in_dev_get_rtnl(dev); + if (in_dev == NULL) { in_dev = inetdev_init(dev); - if (!in_dev) - goto out; + if (in_dev == NULL) { + err = -ENOBUFS; + goto errout; + } } - if ((ifa = inet_alloc_ifa()) == NULL) - goto out; + ifa = inet_alloc_ifa(); + if (ifa == NULL) { + /* + * A potential indev allocation can be left alive, it stays + * assigned to its device and is destroy with it. + */ + err = -ENOBUFS; + goto errout; + } + + in_dev_hold(in_dev); + + if (tb[IFA_ADDRESS] == NULL) + tb[IFA_ADDRESS] = tb[IFA_LOCAL]; - if (!rta[IFA_ADDRESS - 1]) - rta[IFA_ADDRESS - 1] = rta[IFA_LOCAL - 1]; - memcpy(&ifa->ifa_local, RTA_DATA(rta[IFA_LOCAL - 1]), 4); - memcpy(&ifa->ifa_address, RTA_DATA(rta[IFA_ADDRESS - 1]), 4); ifa->ifa_prefixlen = ifm->ifa_prefixlen; ifa->ifa_mask = inet_make_mask(ifm->ifa_prefixlen); - if (rta[IFA_BROADCAST - 1]) - memcpy(&ifa->ifa_broadcast, - RTA_DATA(rta[IFA_BROADCAST - 1]), 4); - if (rta[IFA_ANYCAST - 1]) - memcpy(&ifa->ifa_anycast, RTA_DATA(rta[IFA_ANYCAST - 1]), 4); ifa->ifa_flags = ifm->ifa_flags; ifa->ifa_scope = ifm->ifa_scope; - in_dev_hold(in_dev); - ifa->ifa_dev = in_dev; - if (rta[IFA_LABEL - 1]) - rtattr_strlcpy(ifa->ifa_label, rta[IFA_LABEL - 1], IFNAMSIZ); + ifa->ifa_dev = in_dev; + + ifa->ifa_local = nla_get_u32(tb[IFA_LOCAL]); + ifa->ifa_address = nla_get_u32(tb[IFA_ADDRESS]); + + if (tb[IFA_BROADCAST]) + ifa->ifa_broadcast = nla_get_u32(tb[IFA_BROADCAST]); + + if (tb[IFA_ANYCAST]) + ifa->ifa_anycast = nla_get_u32(tb[IFA_ANYCAST]); + + if (tb[IFA_LABEL]) + nla_strlcpy(ifa->ifa_label, tb[IFA_LABEL], IFNAMSIZ); else memcpy(ifa->ifa_label, dev->name, IFNAMSIZ); - rc = inet_insert_ifa(ifa); -out: - return rc; + return ifa; + +errout: + return ERR_PTR(err); +} + +static int inet_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) +{ + struct in_ifaddr *ifa; + + ASSERT_RTNL(); + + ifa = rtm_to_ifaddr(nlh); + if (IS_ERR(ifa)) + return PTR_ERR(ifa); + + return inet_insert_ifa(ifa); } /* -- cgit v0.10.2 From dfdd5fd4e93d98e06be9ac9db84e3b98c6c26706 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Fri, 4 Aug 2006 23:04:17 -0700 Subject: [IPV4]: Convert address deletion to new netlink api Fixes various unvalidated netlink attributes causing memory corruptions when left empty by userspace. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index 6b297c8..309640e 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -430,34 +430,48 @@ struct in_ifaddr *inet_ifa_byprefix(struct in_device *in_dev, u32 prefix, static int inet_rtm_deladdr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) { - struct rtattr **rta = arg; + struct nlattr *tb[IFA_MAX+1]; struct in_device *in_dev; - struct ifaddrmsg *ifm = NLMSG_DATA(nlh); + struct ifaddrmsg *ifm; struct in_ifaddr *ifa, **ifap; + int err = -EINVAL; ASSERT_RTNL(); - if ((in_dev = inetdev_by_index(ifm->ifa_index)) == NULL) - goto out; + err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFA_MAX, ifa_ipv4_policy); + if (err < 0) + goto errout; + + ifm = nlmsg_data(nlh); + in_dev = inetdev_by_index(ifm->ifa_index); + if (in_dev == NULL) { + err = -ENODEV; + goto errout; + } + __in_dev_put(in_dev); for (ifap = &in_dev->ifa_list; (ifa = *ifap) != NULL; ifap = &ifa->ifa_next) { - if ((rta[IFA_LOCAL - 1] && - memcmp(RTA_DATA(rta[IFA_LOCAL - 1]), - &ifa->ifa_local, 4)) || - (rta[IFA_LABEL - 1] && - rtattr_strcmp(rta[IFA_LABEL - 1], ifa->ifa_label)) || - (rta[IFA_ADDRESS - 1] && - (ifm->ifa_prefixlen != ifa->ifa_prefixlen || - !inet_ifa_match(*(u32*)RTA_DATA(rta[IFA_ADDRESS - 1]), - ifa)))) + if (tb[IFA_LOCAL] && + ifa->ifa_local != nla_get_u32(tb[IFA_LOCAL])) continue; + + if (tb[IFA_LABEL] && nla_strcmp(tb[IFA_LABEL], ifa->ifa_label)) + continue; + + if (tb[IFA_ADDRESS] && + (ifm->ifa_prefixlen != ifa->ifa_prefixlen || + !inet_ifa_match(nla_get_u32(tb[IFA_ADDRESS]), ifa))) + continue; + inet_del_ifa(in_dev, ifap, 1); return 0; } -out: - return -EADDRNOTAVAIL; + + err = -EADDRNOTAVAIL; +errout: + return err; } static struct in_ifaddr *rtm_to_ifaddr(struct nlmsghdr *nlh) -- cgit v0.10.2 From 47f68512d2685431f1781830dfcbab31bda87644 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Fri, 4 Aug 2006 23:04:36 -0700 Subject: [IPV4]: Convert address dumping to new netlink api Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index 309640e..80bf5b2 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -1112,32 +1112,37 @@ static int inet_fill_ifaddr(struct sk_buff *skb, struct in_ifaddr *ifa, { struct ifaddrmsg *ifm; struct nlmsghdr *nlh; - unsigned char *b = skb->tail; - nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*ifm), flags); - ifm = NLMSG_DATA(nlh); + nlh = nlmsg_put(skb, pid, seq, event, sizeof(*ifm), flags); + if (nlh == NULL) + return -ENOBUFS; + + ifm = nlmsg_data(nlh); ifm->ifa_family = AF_INET; ifm->ifa_prefixlen = ifa->ifa_prefixlen; ifm->ifa_flags = ifa->ifa_flags|IFA_F_PERMANENT; ifm->ifa_scope = ifa->ifa_scope; ifm->ifa_index = ifa->ifa_dev->dev->ifindex; + if (ifa->ifa_address) - RTA_PUT(skb, IFA_ADDRESS, 4, &ifa->ifa_address); + NLA_PUT_U32(skb, IFA_ADDRESS, ifa->ifa_address); + if (ifa->ifa_local) - RTA_PUT(skb, IFA_LOCAL, 4, &ifa->ifa_local); + NLA_PUT_U32(skb, IFA_LOCAL, ifa->ifa_local); + if (ifa->ifa_broadcast) - RTA_PUT(skb, IFA_BROADCAST, 4, &ifa->ifa_broadcast); + NLA_PUT_U32(skb, IFA_BROADCAST, ifa->ifa_broadcast); + if (ifa->ifa_anycast) - RTA_PUT(skb, IFA_ANYCAST, 4, &ifa->ifa_anycast); + NLA_PUT_U32(skb, IFA_ANYCAST, ifa->ifa_anycast); + if (ifa->ifa_label[0]) - RTA_PUT(skb, IFA_LABEL, IFNAMSIZ, &ifa->ifa_label); - nlh->nlmsg_len = skb->tail - b; - return skb->len; + NLA_PUT_STRING(skb, IFA_LABEL, ifa->ifa_label); + + return nlmsg_end(skb, nlh); -nlmsg_failure: -rtattr_failure: - skb_trim(skb, b - skb->data); - return -1; +nla_put_failure: + return nlmsg_cancel(skb, nlh); } static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb) @@ -1185,17 +1190,16 @@ done: static void rtmsg_ifa(int event, struct in_ifaddr* ifa) { - int size = NLMSG_SPACE(sizeof(struct ifaddrmsg) + 128); - struct sk_buff *skb = alloc_skb(size, GFP_KERNEL); + struct sk_buff *skb; - if (!skb) + skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL); + if (skb == NULL) netlink_set_err(rtnl, 0, RTNLGRP_IPV4_IFADDR, ENOBUFS); else if (inet_fill_ifaddr(skb, ifa, 0, 0, event, 0) < 0) { kfree_skb(skb); netlink_set_err(rtnl, 0, RTNLGRP_IPV4_IFADDR, EINVAL); - } else { + } else netlink_broadcast(rtnl, skb, 0, RTNLGRP_IPV4_IFADDR, GFP_KERNEL); - } } static struct rtnetlink_link inet_rtnetlink_table[RTM_NR_MSGTYPES] = { -- cgit v0.10.2 From 1823730fbc89fadde72a7bb3b7bdf03cc7b8835c Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Fri, 4 Aug 2006 23:04:54 -0700 Subject: [IPv4]: Move interface address bits to linux/if_addr.h Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/linux/if_addr.h b/include/linux/if_addr.h new file mode 100644 index 0000000..e159045 --- /dev/null +++ b/include/linux/if_addr.h @@ -0,0 +1,53 @@ +#ifndef __LINUX_IF_ADDR_H +#define __LINUX_IF_ADDR_H + +#include + +struct ifaddrmsg +{ + __u8 ifa_family; + __u8 ifa_prefixlen; /* The prefix length */ + __u8 ifa_flags; /* Flags */ + __u8 ifa_scope; /* Address scope */ + __u32 ifa_index; /* Link index */ +}; + +/* + * Important comment: + * IFA_ADDRESS is prefix address, rather than local interface address. + * It makes no difference for normally configured broadcast interfaces, + * but for point-to-point IFA_ADDRESS is DESTINATION address, + * local address is supplied in IFA_LOCAL attribute. + */ +enum +{ + IFA_UNSPEC, + IFA_ADDRESS, + IFA_LOCAL, + IFA_LABEL, + IFA_BROADCAST, + IFA_ANYCAST, + IFA_CACHEINFO, + IFA_MULTICAST, + __IFA_MAX, +}; + +#define IFA_MAX (__IFA_MAX - 1) + +/* ifa_flags */ +#define IFA_F_SECONDARY 0x01 +#define IFA_F_TEMPORARY IFA_F_SECONDARY + +#define IFA_F_DEPRECATED 0x20 +#define IFA_F_TENTATIVE 0x40 +#define IFA_F_PERMANENT 0x80 + +struct ifa_cacheinfo +{ + __u32 ifa_prefered; + __u32 ifa_valid; + __u32 cstamp; /* created timestamp, hundredths of seconds */ + __u32 tstamp; /* updated timestamp, hundredths of seconds */ +}; + +#endif diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h index bf35353..890c4d4 100644 --- a/include/linux/rtnetlink.h +++ b/include/linux/rtnetlink.h @@ -384,62 +384,6 @@ struct rta_session }; -/********************************************************* - * Interface address. - ****/ - -struct ifaddrmsg -{ - unsigned char ifa_family; - unsigned char ifa_prefixlen; /* The prefix length */ - unsigned char ifa_flags; /* Flags */ - unsigned char ifa_scope; /* See above */ - int ifa_index; /* Link index */ -}; - -enum -{ - IFA_UNSPEC, - IFA_ADDRESS, - IFA_LOCAL, - IFA_LABEL, - IFA_BROADCAST, - IFA_ANYCAST, - IFA_CACHEINFO, - IFA_MULTICAST, - __IFA_MAX -}; - -#define IFA_MAX (__IFA_MAX - 1) - -/* ifa_flags */ - -#define IFA_F_SECONDARY 0x01 -#define IFA_F_TEMPORARY IFA_F_SECONDARY - -#define IFA_F_DEPRECATED 0x20 -#define IFA_F_TENTATIVE 0x40 -#define IFA_F_PERMANENT 0x80 - -struct ifa_cacheinfo -{ - __u32 ifa_prefered; - __u32 ifa_valid; - __u32 cstamp; /* created timestamp, hundredths of seconds */ - __u32 tstamp; /* updated timestamp, hundredths of seconds */ -}; - - -#define IFA_RTA(r) ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct ifaddrmsg)))) -#define IFA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct ifaddrmsg)) - -/* - Important comment: - IFA_ADDRESS is prefix address, rather than local interface address. - It makes no difference for normally configured broadcast interfaces, - but for point-to-point IFA_ADDRESS is DESTINATION address, - local address is supplied in IFA_LOCAL attribute. - */ /************************************************************** * Neighbour discovery. diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index aa7cff2..35712031 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -35,6 +35,7 @@ #include #include #include +#include #include #include diff --git a/net/decnet/dn_dev.c b/net/decnet/dn_dev.c index 476455f..632c5a9 100644 --- a/net/decnet/dn_dev.c +++ b/net/decnet/dn_dev.c @@ -34,6 +34,7 @@ #include #include #include +#include #include #include #include diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index 80bf5b2..398e7b9 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -43,6 +43,7 @@ #include #include #include +#include #include #include #include diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index fe4a53d..a83f1aa 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -32,6 +32,7 @@ #include #include #include +#include #include #include #include diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index ed766ee..c2a4db8 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -48,6 +48,7 @@ #include #include #include +#include #include #include #include diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c index 67cfc38..5743e8b 100644 --- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -62,6 +62,7 @@ #include #endif +#include #include #include #include -- cgit v0.10.2 From da5e0494c542dddc56a1f1edfd30310ea30f41ff Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Thu, 10 Aug 2006 21:17:37 -0700 Subject: [NET]: Convert link modification to new netlink api Transforms do_setlink() into rtnl_setlink() using the new netlink api. A warning message printed to the console is added in the event that a change request fails while part of the change request has been comitted already. The ioctl() based nature of net devices makes it almost impossible to move on to atomic netlink operations without obsoleting some of the functionality. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 35712031..2adc966 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -336,52 +336,69 @@ static int rtnetlink_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *c return skb->len; } -static int do_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) +static struct nla_policy ifla_policy[IFLA_MAX+1] __read_mostly = { + [IFLA_IFNAME] = { .type = NLA_STRING }, + [IFLA_MAP] = { .minlen = sizeof(struct rtnl_link_ifmap) }, + [IFLA_MTU] = { .type = NLA_U32 }, + [IFLA_TXQLEN] = { .type = NLA_U32 }, + [IFLA_WEIGHT] = { .type = NLA_U32 }, + [IFLA_OPERSTATE] = { .type = NLA_U8 }, + [IFLA_LINKMODE] = { .type = NLA_U8 }, +}; + +static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) { - struct ifinfomsg *ifm = NLMSG_DATA(nlh); - struct rtattr **ida = arg; + struct ifinfomsg *ifm; struct net_device *dev; - int err, send_addr_notify = 0; + int err, send_addr_notify = 0, modified = 0; + struct nlattr *tb[IFLA_MAX+1]; + char ifname[IFNAMSIZ]; + err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy); + if (err < 0) + goto errout; + + if (tb[IFLA_IFNAME] && + nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ) >= IFNAMSIZ) + return -EINVAL; + + err = -EINVAL; + ifm = nlmsg_data(nlh); if (ifm->ifi_index >= 0) dev = dev_get_by_index(ifm->ifi_index); - else if (ida[IFLA_IFNAME - 1]) { - char ifname[IFNAMSIZ]; - - if (rtattr_strlcpy(ifname, ida[IFLA_IFNAME - 1], - IFNAMSIZ) >= IFNAMSIZ) - return -EINVAL; + else if (tb[IFLA_IFNAME]) dev = dev_get_by_name(ifname); - } else - return -EINVAL; + else + goto errout; - if (!dev) - return -ENODEV; + if (dev == NULL) { + err = -ENODEV; + goto errout; + } - err = -EINVAL; + if (tb[IFLA_ADDRESS] && + nla_len(tb[IFLA_ADDRESS]) < dev->addr_len) + goto errout_dev; - if (ifm->ifi_flags) - dev_change_flags(dev, ifm->ifi_flags); + if (tb[IFLA_BROADCAST] && + nla_len(tb[IFLA_BROADCAST]) < dev->addr_len) + goto errout_dev; - if (ida[IFLA_MAP - 1]) { + if (tb[IFLA_MAP]) { struct rtnl_link_ifmap *u_map; struct ifmap k_map; if (!dev->set_config) { err = -EOPNOTSUPP; - goto out; + goto errout_dev; } if (!netif_device_present(dev)) { err = -ENODEV; - goto out; + goto errout_dev; } - - if (ida[IFLA_MAP - 1]->rta_len != RTA_LENGTH(sizeof(*u_map))) - goto out; - - u_map = RTA_DATA(ida[IFLA_MAP - 1]); + u_map = nla_data(tb[IFLA_MAP]); k_map.mem_start = (unsigned long) u_map->mem_start; k_map.mem_end = (unsigned long) u_map->mem_end; k_map.base_addr = (unsigned short) u_map->base_addr; @@ -390,119 +407,111 @@ static int do_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) k_map.port = (unsigned char) u_map->port; err = dev->set_config(dev, &k_map); + if (err < 0) + goto errout_dev; - if (err) - goto out; + modified = 1; } - if (ida[IFLA_ADDRESS - 1]) { + if (tb[IFLA_ADDRESS]) { struct sockaddr *sa; int len; if (!dev->set_mac_address) { err = -EOPNOTSUPP; - goto out; + goto errout_dev; } + if (!netif_device_present(dev)) { err = -ENODEV; - goto out; + goto errout_dev; } - if (ida[IFLA_ADDRESS - 1]->rta_len != RTA_LENGTH(dev->addr_len)) - goto out; len = sizeof(sa_family_t) + dev->addr_len; sa = kmalloc(len, GFP_KERNEL); if (!sa) { err = -ENOMEM; - goto out; + goto errout_dev; } sa->sa_family = dev->type; - memcpy(sa->sa_data, RTA_DATA(ida[IFLA_ADDRESS - 1]), + memcpy(sa->sa_data, nla_data(tb[IFLA_ADDRESS]), dev->addr_len); err = dev->set_mac_address(dev, sa); kfree(sa); if (err) - goto out; + goto errout_dev; send_addr_notify = 1; + modified = 1; } - if (ida[IFLA_BROADCAST - 1]) { - if (ida[IFLA_BROADCAST - 1]->rta_len != RTA_LENGTH(dev->addr_len)) - goto out; - memcpy(dev->broadcast, RTA_DATA(ida[IFLA_BROADCAST - 1]), - dev->addr_len); - send_addr_notify = 1; + if (tb[IFLA_MTU]) { + err = dev_set_mtu(dev, nla_get_u32(tb[IFLA_MTU])); + if (err < 0) + goto errout_dev; + modified = 1; } - if (ida[IFLA_MTU - 1]) { - if (ida[IFLA_MTU - 1]->rta_len != RTA_LENGTH(sizeof(u32))) - goto out; - err = dev_set_mtu(dev, *((u32 *) RTA_DATA(ida[IFLA_MTU - 1]))); - - if (err) - goto out; - + /* + * Interface selected by interface index but interface + * name provided implies that a name change has been + * requested. + */ + if (ifm->ifi_index >= 0 && ifname[0]) { + err = dev_change_name(dev, ifname); + if (err < 0) + goto errout_dev; + modified = 1; } - if (ida[IFLA_TXQLEN - 1]) { - if (ida[IFLA_TXQLEN - 1]->rta_len != RTA_LENGTH(sizeof(u32))) - goto out; +#ifdef CONFIG_NET_WIRELESS_RTNETLINK + if (tb[IFLA_WIRELESS]) { + /* Call Wireless Extensions. + * Various stuff checked in there... */ + err = wireless_rtnetlink_set(dev, nla_data(tb[IFLA_WIRELESS]), + nla_len(tb[IFLA_WIRELESS])); + if (err < 0) + goto errout_dev; + } +#endif /* CONFIG_NET_WIRELESS_RTNETLINK */ - dev->tx_queue_len = *((u32 *) RTA_DATA(ida[IFLA_TXQLEN - 1])); + if (tb[IFLA_BROADCAST]) { + nla_memcpy(dev->broadcast, tb[IFLA_BROADCAST], dev->addr_len); + send_addr_notify = 1; } - if (ida[IFLA_WEIGHT - 1]) { - if (ida[IFLA_WEIGHT - 1]->rta_len != RTA_LENGTH(sizeof(u32))) - goto out; - dev->weight = *((u32 *) RTA_DATA(ida[IFLA_WEIGHT - 1])); - } + if (ifm->ifi_flags) + dev_change_flags(dev, ifm->ifi_flags); - if (ida[IFLA_OPERSTATE - 1]) { - if (ida[IFLA_OPERSTATE - 1]->rta_len != RTA_LENGTH(sizeof(u8))) - goto out; + if (tb[IFLA_TXQLEN]) + dev->tx_queue_len = nla_get_u32(tb[IFLA_TXQLEN]); - set_operstate(dev, *((u8 *) RTA_DATA(ida[IFLA_OPERSTATE - 1]))); - } + if (tb[IFLA_WEIGHT]) + dev->weight = nla_get_u32(tb[IFLA_WEIGHT]); - if (ida[IFLA_LINKMODE - 1]) { - if (ida[IFLA_LINKMODE - 1]->rta_len != RTA_LENGTH(sizeof(u8))) - goto out; + if (tb[IFLA_OPERSTATE]) + set_operstate(dev, nla_get_u8(tb[IFLA_OPERSTATE])); + if (tb[IFLA_LINKMODE]) { write_lock_bh(&dev_base_lock); - dev->link_mode = *((u8 *) RTA_DATA(ida[IFLA_LINKMODE - 1])); + dev->link_mode = nla_get_u8(tb[IFLA_LINKMODE]); write_unlock_bh(&dev_base_lock); } - if (ifm->ifi_index >= 0 && ida[IFLA_IFNAME - 1]) { - char ifname[IFNAMSIZ]; - - if (rtattr_strlcpy(ifname, ida[IFLA_IFNAME - 1], - IFNAMSIZ) >= IFNAMSIZ) - goto out; - err = dev_change_name(dev, ifname); - if (err) - goto out; - } - -#ifdef CONFIG_NET_WIRELESS_RTNETLINK - if (ida[IFLA_WIRELESS - 1]) { - - /* Call Wireless Extensions. - * Various stuff checked in there... */ - err = wireless_rtnetlink_set(dev, RTA_DATA(ida[IFLA_WIRELESS - 1]), ida[IFLA_WIRELESS - 1]->rta_len); - if (err) - goto out; - } -#endif /* CONFIG_NET_WIRELESS_RTNETLINK */ - err = 0; -out: +errout_dev: + if (err < 0 && modified && net_ratelimit()) + printk(KERN_WARNING "A link change request failed with " + "some changes comitted already. Interface %s may " + "have been left with an inconsistent configuration, " + "please check.\n", dev->name); + if (send_addr_notify) call_netdevice_notifiers(NETDEV_CHANGEADDR, dev); dev_put(dev); +errout: return err; } @@ -753,7 +762,7 @@ static struct rtnetlink_link link_rtnetlink_table[RTM_NR_MSGTYPES] = .doit = do_getlink, #endif /* CONFIG_NET_WIRELESS_RTNETLINK */ .dumpit = rtnetlink_dump_ifinfo }, - [RTM_SETLINK - RTM_BASE] = { .doit = do_setlink }, + [RTM_SETLINK - RTM_BASE] = { .doit = rtnl_setlink }, [RTM_GETADDR - RTM_BASE] = { .dumpit = rtnetlink_dump_all }, [RTM_GETROUTE - RTM_BASE] = { .dumpit = rtnetlink_dump_all }, [RTM_NEWNEIGH - RTM_BASE] = { .doit = neigh_add }, -- cgit v0.10.2 From b60c5115f4abf0b961a18682889798dcfbe6a801 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Fri, 4 Aug 2006 23:05:34 -0700 Subject: [NET]: Convert link dumping to new netlink api Transforms netlink code to dump link tables to use the new netlink api. Makes rtnl_getlink() available regardless of the availability of the wireless extensions. Adding copy_rtnl_link_stats() avoids the structural dependency of struct rtnl_link_stats on struct net_device_stats and thus avoids troubles later on. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 2adc966..93ba04f 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -218,41 +218,73 @@ static void set_operstate(struct net_device *dev, unsigned char transition) } } -static int rtnetlink_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, - int type, u32 pid, u32 seq, u32 change, - unsigned int flags) +static void copy_rtnl_link_stats(struct rtnl_link_stats *a, + struct net_device_stats *b) { - struct ifinfomsg *r; - struct nlmsghdr *nlh; - unsigned char *b = skb->tail; - - nlh = NLMSG_NEW(skb, pid, seq, type, sizeof(*r), flags); - r = NLMSG_DATA(nlh); - r->ifi_family = AF_UNSPEC; - r->__ifi_pad = 0; - r->ifi_type = dev->type; - r->ifi_index = dev->ifindex; - r->ifi_flags = dev_get_flags(dev); - r->ifi_change = change; + a->rx_packets = b->rx_packets; + a->tx_packets = b->tx_packets; + a->rx_bytes = b->rx_bytes; + a->tx_bytes = b->tx_bytes; + a->rx_errors = b->rx_errors; + a->tx_errors = b->tx_errors; + a->rx_dropped = b->rx_dropped; + a->tx_dropped = b->tx_dropped; + + a->multicast = b->multicast; + a->collisions = b->collisions; + + a->rx_length_errors = b->rx_length_errors; + a->rx_over_errors = b->rx_over_errors; + a->rx_crc_errors = b->rx_crc_errors; + a->rx_frame_errors = b->rx_frame_errors; + a->rx_fifo_errors = b->rx_fifo_errors; + a->rx_missed_errors = b->rx_missed_errors; + + a->tx_aborted_errors = b->tx_aborted_errors; + a->tx_carrier_errors = b->tx_carrier_errors; + a->tx_fifo_errors = b->tx_fifo_errors; + a->tx_heartbeat_errors = b->tx_heartbeat_errors; + a->tx_window_errors = b->tx_window_errors; + + a->rx_compressed = b->rx_compressed; + a->tx_compressed = b->tx_compressed; +}; - RTA_PUT(skb, IFLA_IFNAME, strlen(dev->name)+1, dev->name); +static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, + void *iwbuf, int iwbuflen, int type, u32 pid, + u32 seq, u32 change, unsigned int flags) +{ + struct ifinfomsg *ifm; + struct nlmsghdr *nlh; - if (1) { - u32 txqlen = dev->tx_queue_len; - RTA_PUT(skb, IFLA_TXQLEN, sizeof(txqlen), &txqlen); - } + nlh = nlmsg_put(skb, pid, seq, type, sizeof(*ifm), flags); + if (nlh == NULL) + return -ENOBUFS; - if (1) { - u32 weight = dev->weight; - RTA_PUT(skb, IFLA_WEIGHT, sizeof(weight), &weight); - } + ifm = nlmsg_data(nlh); + ifm->ifi_family = AF_UNSPEC; + ifm->__ifi_pad = 0; + ifm->ifi_type = dev->type; + ifm->ifi_index = dev->ifindex; + ifm->ifi_flags = dev_get_flags(dev); + ifm->ifi_change = change; + + NLA_PUT_STRING(skb, IFLA_IFNAME, dev->name); + NLA_PUT_U32(skb, IFLA_TXQLEN, dev->tx_queue_len); + NLA_PUT_U32(skb, IFLA_WEIGHT, dev->weight); + NLA_PUT_U8(skb, IFLA_OPERSTATE, + netif_running(dev) ? dev->operstate : IF_OPER_DOWN); + NLA_PUT_U8(skb, IFLA_LINKMODE, dev->link_mode); + NLA_PUT_U32(skb, IFLA_MTU, dev->mtu); + + if (dev->ifindex != dev->iflink) + NLA_PUT_U32(skb, IFLA_LINK, dev->iflink); + + if (dev->master) + NLA_PUT_U32(skb, IFLA_MASTER, dev->master->ifindex); - if (1) { - u8 operstate = netif_running(dev)?dev->operstate:IF_OPER_DOWN; - u8 link_mode = dev->link_mode; - RTA_PUT(skb, IFLA_OPERSTATE, sizeof(operstate), &operstate); - RTA_PUT(skb, IFLA_LINKMODE, sizeof(link_mode), &link_mode); - } + if (dev->qdisc_sleeping) + NLA_PUT_STRING(skb, IFLA_QDISC, dev->qdisc_sleeping->ops->id); if (1) { struct rtnl_link_ifmap map = { @@ -263,58 +295,38 @@ static int rtnetlink_fill_ifinfo(struct sk_buff *skb, struct net_device *dev, .dma = dev->dma, .port = dev->if_port, }; - RTA_PUT(skb, IFLA_MAP, sizeof(map), &map); + NLA_PUT(skb, IFLA_MAP, sizeof(map), &map); } if (dev->addr_len) { - RTA_PUT(skb, IFLA_ADDRESS, dev->addr_len, dev->dev_addr); - RTA_PUT(skb, IFLA_BROADCAST, dev->addr_len, dev->broadcast); - } - - if (1) { - u32 mtu = dev->mtu; - RTA_PUT(skb, IFLA_MTU, sizeof(mtu), &mtu); - } - - if (dev->ifindex != dev->iflink) { - u32 iflink = dev->iflink; - RTA_PUT(skb, IFLA_LINK, sizeof(iflink), &iflink); - } - - if (dev->qdisc_sleeping) - RTA_PUT(skb, IFLA_QDISC, - strlen(dev->qdisc_sleeping->ops->id) + 1, - dev->qdisc_sleeping->ops->id); - - if (dev->master) { - u32 master = dev->master->ifindex; - RTA_PUT(skb, IFLA_MASTER, sizeof(master), &master); + NLA_PUT(skb, IFLA_ADDRESS, dev->addr_len, dev->dev_addr); + NLA_PUT(skb, IFLA_BROADCAST, dev->addr_len, dev->broadcast); } if (dev->get_stats) { - unsigned long *stats = (unsigned long*)dev->get_stats(dev); + struct net_device_stats *stats = dev->get_stats(dev); if (stats) { - struct rtattr *a; - __u32 *s; - int i; - int n = sizeof(struct rtnl_link_stats)/4; - - a = __RTA_PUT(skb, IFLA_STATS, n*4); - s = RTA_DATA(a); - for (i=0; inlmsg_len = skb->tail - b; - return skb->len; -nlmsg_failure: -rtattr_failure: - skb_trim(skb, b - skb->data); - return -1; + if (iwbuf) + NLA_PUT(skb, IFLA_WIRELESS, iwbuflen, iwbuf); + + return nlmsg_end(skb, nlh); + +nla_put_failure: + return nlmsg_cancel(skb, nlh); } -static int rtnetlink_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb) +static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb) { int idx; int s_idx = cb->args[0]; @@ -324,10 +336,9 @@ static int rtnetlink_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *c for (dev=dev_base, idx=0; dev; dev = dev->next, idx++) { if (idx < s_idx) continue; - if (rtnetlink_fill_ifinfo(skb, dev, RTM_NEWLINK, - NETLINK_CB(cb->skb).pid, - cb->nlh->nlmsg_seq, 0, - NLM_F_MULTI) <= 0) + if (rtnl_fill_ifinfo(skb, dev, NULL, 0, RTM_NEWLINK, + NETLINK_CB(cb->skb).pid, + cb->nlh->nlmsg_seq, 0, NLM_F_MULTI) <= 0) break; } read_unlock(&dev_base_lock); @@ -515,84 +526,69 @@ errout: return err; } -#ifdef CONFIG_NET_WIRELESS_RTNETLINK -static int do_getlink(struct sk_buff *in_skb, struct nlmsghdr* in_nlh, void *arg) +static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) { - struct ifinfomsg *ifm = NLMSG_DATA(in_nlh); - struct rtattr **ida = arg; - struct net_device *dev; - struct ifinfomsg *r; - struct nlmsghdr *nlh; - int err = -ENOBUFS; - struct sk_buff *skb; - unsigned char *b; - char *iw_buf = NULL; + struct ifinfomsg *ifm; + struct nlattr *tb[IFLA_MAX+1]; + struct net_device *dev = NULL; + struct sk_buff *nskb; + char *iw_buf = NULL, *iw = NULL; int iw_buf_len = 0; + int err, payload; - if (ifm->ifi_index >= 0) + err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFLA_MAX, ifla_policy); + if (err < 0) + goto errout; + + ifm = nlmsg_data(nlh); + if (ifm->ifi_index >= 0) { dev = dev_get_by_index(ifm->ifi_index); - else + if (dev == NULL) + return -ENODEV; + } else return -EINVAL; - if (!dev) - return -ENODEV; -#ifdef CONFIG_NET_WIRELESS_RTNETLINK - if (ida[IFLA_WIRELESS - 1]) { +#ifdef CONFIG_NET_WIRELESS_RTNETLINK + if (tb[IFLA_WIRELESS]) { /* Call Wireless Extensions. We need to know the size before * we can alloc. Various stuff checked in there... */ - err = wireless_rtnetlink_get(dev, RTA_DATA(ida[IFLA_WIRELESS - 1]), ida[IFLA_WIRELESS - 1]->rta_len, &iw_buf, &iw_buf_len); - if (err) - goto out; + err = wireless_rtnetlink_get(dev, nla_data(tb[IFLA_WIRELESS]), + nla_len(tb[IFLA_WIRELESS]), + &iw_buf, &iw_buf_len); + if (err < 0) + goto errout; + + iw += IW_EV_POINT_OFF; } #endif /* CONFIG_NET_WIRELESS_RTNETLINK */ - /* Create a skb big enough to include all the data. - * Some requests are way bigger than 4k... Jean II */ - skb = alloc_skb((NLMSG_LENGTH(sizeof(*r))) + (RTA_SPACE(iw_buf_len)), - GFP_KERNEL); - if (!skb) - goto out; - b = skb->tail; - - /* Put in the message the usual good stuff */ - nlh = NLMSG_PUT(skb, NETLINK_CB(in_skb).pid, in_nlh->nlmsg_seq, - RTM_NEWLINK, sizeof(*r)); - r = NLMSG_DATA(nlh); - r->ifi_family = AF_UNSPEC; - r->__ifi_pad = 0; - r->ifi_type = dev->type; - r->ifi_index = dev->ifindex; - r->ifi_flags = dev->flags; - r->ifi_change = 0; - - /* Put the wireless payload if it exist */ - if(iw_buf != NULL) - RTA_PUT(skb, IFLA_WIRELESS, iw_buf_len, - iw_buf + IW_EV_POINT_OFF); - - nlh->nlmsg_len = skb->tail - b; - - /* Needed ? */ - NETLINK_CB(skb).dst_pid = NETLINK_CB(in_skb).pid; - - err = netlink_unicast(rtnl, skb, NETLINK_CB(in_skb).pid, MSG_DONTWAIT); + payload = NLMSG_ALIGN(sizeof(struct ifinfomsg) + + nla_total_size(iw_buf_len)); + nskb = nlmsg_new(nlmsg_total_size(payload), GFP_KERNEL); + if (nskb == NULL) { + err = -ENOBUFS; + goto errout; + } + + err = rtnl_fill_ifinfo(nskb, dev, iw, iw_buf_len, RTM_NEWLINK, + NETLINK_CB(skb).pid, nlh->nlmsg_seq, 0, 0); + if (err <= 0) { + kfree_skb(skb); + goto errout; + } + + err = netlink_unicast(rtnl, skb, NETLINK_CB(skb).pid, MSG_DONTWAIT); if (err > 0) err = 0; -out: - if(iw_buf != NULL) - kfree(iw_buf); +errout: + kfree(iw_buf); dev_put(dev); - return err; -rtattr_failure: -nlmsg_failure: - kfree_skb(skb); - goto out; + return err; } -#endif /* CONFIG_NET_WIRELESS_RTNETLINK */ -static int rtnetlink_dump_all(struct sk_buff *skb, struct netlink_callback *cb) +static int rtnl_dump_all(struct sk_buff *skb, struct netlink_callback *cb) { int idx; int s_idx = cb->family; @@ -623,11 +619,11 @@ void rtmsg_ifinfo(int type, struct net_device *dev, unsigned change) sizeof(struct rtnl_link_ifmap) + sizeof(struct rtnl_link_stats) + 128); - skb = alloc_skb(size, GFP_KERNEL); + skb = nlmsg_new(size, GFP_KERNEL); if (!skb) return; - if (rtnetlink_fill_ifinfo(skb, dev, type, 0, 0, change, 0) < 0) { + if (rtnl_fill_ifinfo(skb, dev, NULL, 0, type, 0, 0, change, 0) < 0) { kfree_skb(skb); return; } @@ -757,14 +753,11 @@ static void rtnetlink_rcv(struct sock *sk, int len) static struct rtnetlink_link link_rtnetlink_table[RTM_NR_MSGTYPES] = { - [RTM_GETLINK - RTM_BASE] = { -#ifdef CONFIG_NET_WIRELESS_RTNETLINK - .doit = do_getlink, -#endif /* CONFIG_NET_WIRELESS_RTNETLINK */ - .dumpit = rtnetlink_dump_ifinfo }, + [RTM_GETLINK - RTM_BASE] = { .doit = rtnl_getlink, + .dumpit = rtnl_dump_ifinfo }, [RTM_SETLINK - RTM_BASE] = { .doit = rtnl_setlink }, - [RTM_GETADDR - RTM_BASE] = { .dumpit = rtnetlink_dump_all }, - [RTM_GETROUTE - RTM_BASE] = { .dumpit = rtnetlink_dump_all }, + [RTM_GETADDR - RTM_BASE] = { .dumpit = rtnl_dump_all }, + [RTM_GETROUTE - RTM_BASE] = { .dumpit = rtnl_dump_all }, [RTM_NEWNEIGH - RTM_BASE] = { .doit = neigh_add }, [RTM_DELNEIGH - RTM_BASE] = { .doit = neigh_delete }, [RTM_GETNEIGH - RTM_BASE] = { .dumpit = neigh_dump_info }, @@ -772,7 +765,7 @@ static struct rtnetlink_link link_rtnetlink_table[RTM_NR_MSGTYPES] = [RTM_NEWRULE - RTM_BASE] = { .doit = fib_nl_newrule }, [RTM_DELRULE - RTM_BASE] = { .doit = fib_nl_delrule }, #endif - [RTM_GETRULE - RTM_BASE] = { .dumpit = rtnetlink_dump_all }, + [RTM_GETRULE - RTM_BASE] = { .dumpit = rtnl_dump_all }, [RTM_GETNEIGHTBL - RTM_BASE] = { .dumpit = neightbl_dump_info }, [RTM_SETNEIGHTBL - RTM_BASE] = { .doit = neightbl_set }, }; -- cgit v0.10.2 From 0844565fb8a9418f5a860aa480c1aef70319c9a2 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Fri, 4 Aug 2006 23:05:56 -0700 Subject: [NET]: Move netlink interface bits to linux/if.h Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/linux/if.h b/include/linux/if.h index 374e20a..cd080d7 100644 --- a/include/linux/if.h +++ b/include/linux/if.h @@ -212,5 +212,134 @@ struct ifconf #define ifc_buf ifc_ifcu.ifcu_buf /* buffer address */ #define ifc_req ifc_ifcu.ifcu_req /* array of structures */ +/* The struct should be in sync with struct net_device_stats */ +struct rtnl_link_stats +{ + __u32 rx_packets; /* total packets received */ + __u32 tx_packets; /* total packets transmitted */ + __u32 rx_bytes; /* total bytes received */ + __u32 tx_bytes; /* total bytes transmitted */ + __u32 rx_errors; /* bad packets received */ + __u32 tx_errors; /* packet transmit problems */ + __u32 rx_dropped; /* no space in linux buffers */ + __u32 tx_dropped; /* no space available in linux */ + __u32 multicast; /* multicast packets received */ + __u32 collisions; + + /* detailed rx_errors: */ + __u32 rx_length_errors; + __u32 rx_over_errors; /* receiver ring buff overflow */ + __u32 rx_crc_errors; /* recved pkt with crc error */ + __u32 rx_frame_errors; /* recv'd frame alignment error */ + __u32 rx_fifo_errors; /* recv'r fifo overrun */ + __u32 rx_missed_errors; /* receiver missed packet */ + + /* detailed tx_errors */ + __u32 tx_aborted_errors; + __u32 tx_carrier_errors; + __u32 tx_fifo_errors; + __u32 tx_heartbeat_errors; + __u32 tx_window_errors; + + /* for cslip etc */ + __u32 rx_compressed; + __u32 tx_compressed; +}; + +/* The struct should be in sync with struct ifmap */ +struct rtnl_link_ifmap +{ + __u64 mem_start; + __u64 mem_end; + __u64 base_addr; + __u16 irq; + __u8 dma; + __u8 port; +}; + +enum +{ + IFLA_UNSPEC, + IFLA_ADDRESS, + IFLA_BROADCAST, + IFLA_IFNAME, + IFLA_MTU, + IFLA_LINK, + IFLA_QDISC, + IFLA_STATS, + IFLA_COST, +#define IFLA_COST IFLA_COST + IFLA_PRIORITY, +#define IFLA_PRIORITY IFLA_PRIORITY + IFLA_MASTER, +#define IFLA_MASTER IFLA_MASTER + IFLA_WIRELESS, /* Wireless Extension event - see wireless.h */ +#define IFLA_WIRELESS IFLA_WIRELESS + IFLA_PROTINFO, /* Protocol specific information for a link */ +#define IFLA_PROTINFO IFLA_PROTINFO + IFLA_TXQLEN, +#define IFLA_TXQLEN IFLA_TXQLEN + IFLA_MAP, +#define IFLA_MAP IFLA_MAP + IFLA_WEIGHT, +#define IFLA_WEIGHT IFLA_WEIGHT + IFLA_OPERSTATE, + IFLA_LINKMODE, + __IFLA_MAX +}; + + +#define IFLA_MAX (__IFLA_MAX - 1) + +/* ifi_flags. + + IFF_* flags. + + The only change is: + IFF_LOOPBACK, IFF_BROADCAST and IFF_POINTOPOINT are + more not changeable by user. They describe link media + characteristics and set by device driver. + + Comments: + - Combination IFF_BROADCAST|IFF_POINTOPOINT is invalid + - If neither of these three flags are set; + the interface is NBMA. + + - IFF_MULTICAST does not mean anything special: + multicasts can be used on all not-NBMA links. + IFF_MULTICAST means that this media uses special encapsulation + for multicast frames. Apparently, all IFF_POINTOPOINT and + IFF_BROADCAST devices are able to use multicasts too. + */ + +/* IFLA_LINK. + For usual devices it is equal ifi_index. + If it is a "virtual interface" (f.e. tunnel), ifi_link + can point to real physical interface (f.e. for bandwidth calculations), + or maybe 0, what means, that real media is unknown (usual + for IPIP tunnels, when route to endpoint is allowed to change) + */ + +/* Subtype attributes for IFLA_PROTINFO */ +enum +{ + IFLA_INET6_UNSPEC, + IFLA_INET6_FLAGS, /* link flags */ + IFLA_INET6_CONF, /* sysctl parameters */ + IFLA_INET6_STATS, /* statistics */ + IFLA_INET6_MCAST, /* MC things. What of them? */ + IFLA_INET6_CACHEINFO, /* time values and max reasm size */ + __IFLA_INET6_MAX +}; + +#define IFLA_INET6_MAX (__IFLA_INET6_MAX - 1) + +struct ifla_cacheinfo +{ + __u32 max_reasm_len; + __u32 tstamp; /* ipv6InterfaceTable updated timestamp */ + __u32 reachable_time; + __u32 retrans_time; +}; #endif /* _LINUX_IF_H */ diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h index 890c4d4..84f3eb4 100644 --- a/include/linux/rtnetlink.h +++ b/include/linux/rtnetlink.h @@ -2,6 +2,7 @@ #define __LINUX_RTNETLINK_H #include +#include /**** * Routing/neighbour discovery messages. @@ -607,138 +608,6 @@ struct prefix_cacheinfo __u32 valid_time; }; -/* The struct should be in sync with struct net_device_stats */ -struct rtnl_link_stats -{ - __u32 rx_packets; /* total packets received */ - __u32 tx_packets; /* total packets transmitted */ - __u32 rx_bytes; /* total bytes received */ - __u32 tx_bytes; /* total bytes transmitted */ - __u32 rx_errors; /* bad packets received */ - __u32 tx_errors; /* packet transmit problems */ - __u32 rx_dropped; /* no space in linux buffers */ - __u32 tx_dropped; /* no space available in linux */ - __u32 multicast; /* multicast packets received */ - __u32 collisions; - - /* detailed rx_errors: */ - __u32 rx_length_errors; - __u32 rx_over_errors; /* receiver ring buff overflow */ - __u32 rx_crc_errors; /* recved pkt with crc error */ - __u32 rx_frame_errors; /* recv'd frame alignment error */ - __u32 rx_fifo_errors; /* recv'r fifo overrun */ - __u32 rx_missed_errors; /* receiver missed packet */ - - /* detailed tx_errors */ - __u32 tx_aborted_errors; - __u32 tx_carrier_errors; - __u32 tx_fifo_errors; - __u32 tx_heartbeat_errors; - __u32 tx_window_errors; - - /* for cslip etc */ - __u32 rx_compressed; - __u32 tx_compressed; -}; - -/* The struct should be in sync with struct ifmap */ -struct rtnl_link_ifmap -{ - __u64 mem_start; - __u64 mem_end; - __u64 base_addr; - __u16 irq; - __u8 dma; - __u8 port; -}; - -enum -{ - IFLA_UNSPEC, - IFLA_ADDRESS, - IFLA_BROADCAST, - IFLA_IFNAME, - IFLA_MTU, - IFLA_LINK, - IFLA_QDISC, - IFLA_STATS, - IFLA_COST, -#define IFLA_COST IFLA_COST - IFLA_PRIORITY, -#define IFLA_PRIORITY IFLA_PRIORITY - IFLA_MASTER, -#define IFLA_MASTER IFLA_MASTER - IFLA_WIRELESS, /* Wireless Extension event - see wireless.h */ -#define IFLA_WIRELESS IFLA_WIRELESS - IFLA_PROTINFO, /* Protocol specific information for a link */ -#define IFLA_PROTINFO IFLA_PROTINFO - IFLA_TXQLEN, -#define IFLA_TXQLEN IFLA_TXQLEN - IFLA_MAP, -#define IFLA_MAP IFLA_MAP - IFLA_WEIGHT, -#define IFLA_WEIGHT IFLA_WEIGHT - IFLA_OPERSTATE, - IFLA_LINKMODE, - __IFLA_MAX -}; - - -#define IFLA_MAX (__IFLA_MAX - 1) - -#define IFLA_RTA(r) ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct ifinfomsg)))) -#define IFLA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct ifinfomsg)) - -/* ifi_flags. - - IFF_* flags. - - The only change is: - IFF_LOOPBACK, IFF_BROADCAST and IFF_POINTOPOINT are - more not changeable by user. They describe link media - characteristics and set by device driver. - - Comments: - - Combination IFF_BROADCAST|IFF_POINTOPOINT is invalid - - If neither of these three flags are set; - the interface is NBMA. - - - IFF_MULTICAST does not mean anything special: - multicasts can be used on all not-NBMA links. - IFF_MULTICAST means that this media uses special encapsulation - for multicast frames. Apparently, all IFF_POINTOPOINT and - IFF_BROADCAST devices are able to use multicasts too. - */ - -/* IFLA_LINK. - For usual devices it is equal ifi_index. - If it is a "virtual interface" (f.e. tunnel), ifi_link - can point to real physical interface (f.e. for bandwidth calculations), - or maybe 0, what means, that real media is unknown (usual - for IPIP tunnels, when route to endpoint is allowed to change) - */ - -/* Subtype attributes for IFLA_PROTINFO */ -enum -{ - IFLA_INET6_UNSPEC, - IFLA_INET6_FLAGS, /* link flags */ - IFLA_INET6_CONF, /* sysctl parameters */ - IFLA_INET6_STATS, /* statistics */ - IFLA_INET6_MCAST, /* MC things. What of them? */ - IFLA_INET6_CACHEINFO, /* time values and max reasm size */ - __IFLA_INET6_MAX -}; - -#define IFLA_INET6_MAX (__IFLA_INET6_MAX - 1) - -struct ifla_cacheinfo -{ - __u32 max_reasm_len; - __u32 tstamp; /* ipv6InterfaceTable updated timestamp */ - __u32 reachable_time; - __u32 retrans_time; -}; /***************************************************************** * Traffic control messages. -- cgit v0.10.2 From 8584d6df39db5601965f9bc5e3bf2fea833ad7bb Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Sat, 5 Aug 2006 00:56:16 -0700 Subject: [NETFILTER]: netbios conntrack: fix compile Fix compile breakage caused by move of IFA_F_SECONDARY to new header file. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_netbios_ns.c b/net/ipv4/netfilter/ip_conntrack_netbios_ns.c index a566a81..3d0b438 100644 --- a/net/ipv4/netfilter/ip_conntrack_netbios_ns.c +++ b/net/ipv4/netfilter/ip_conntrack_netbios_ns.c @@ -21,6 +21,7 @@ #include #include #include +#include #include #include #include -- cgit v0.10.2 From 84fa7933a33f806bbbaae6775e87459b1ec584c0 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Tue, 29 Aug 2006 16:44:56 -0700 Subject: [NET]: Replace CHECKSUM_HW by CHECKSUM_PARTIAL/CHECKSUM_COMPLETE Replace CHECKSUM_HW by CHECKSUM_PARTIAL (for outgoing packets, whose checksum still needs to be completed) and CHECKSUM_COMPLETE (for incoming packets, device supplied full checksum). Patch originally from Herbert Xu, updated by myself for 2.6.18-rc3. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/drivers/atm/he.c b/drivers/atm/he.c index ffcb9fd..41e052f 100644 --- a/drivers/atm/he.c +++ b/drivers/atm/he.c @@ -1912,7 +1912,7 @@ he_service_rbrq(struct he_dev *he_dev, int group) skb->tail = skb->data + skb->len; #ifdef USE_CHECKSUM_HW if (vcc->vpi == 0 && vcc->vci >= ATM_NOT_RSV_VCI) { - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_COMPLETE; skb->csum = TCP_CKSUM(skb->data, he_vcc->pdu_len); } diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c index 80e8ca0..29dede2 100644 --- a/drivers/net/3c59x.c +++ b/drivers/net/3c59x.c @@ -2077,7 +2077,7 @@ boomerang_start_xmit(struct sk_buff *skb, struct net_device *dev) vp->tx_ring[entry].next = 0; #if DO_ZEROCOPY - if (skb->ip_summed != CHECKSUM_HW) + if (skb->ip_summed != CHECKSUM_PARTIAL) vp->tx_ring[entry].status = cpu_to_le32(skb->len | TxIntrUploaded); else vp->tx_ring[entry].status = cpu_to_le32(skb->len | TxIntrUploaded | AddTCPChksum | AddUDPChksum); diff --git a/drivers/net/8139cp.c b/drivers/net/8139cp.c index 1428bb7..a48b211 100644 --- a/drivers/net/8139cp.c +++ b/drivers/net/8139cp.c @@ -813,7 +813,7 @@ static int cp_start_xmit (struct sk_buff *skb, struct net_device *dev) if (mss) flags |= LargeSend | ((mss & MSSMask) << MSSShift); - else if (skb->ip_summed == CHECKSUM_HW) { + else if (skb->ip_summed == CHECKSUM_PARTIAL) { const struct iphdr *ip = skb->nh.iph; if (ip->protocol == IPPROTO_TCP) flags |= IPCS | TCPCS; @@ -867,7 +867,7 @@ static int cp_start_xmit (struct sk_buff *skb, struct net_device *dev) if (mss) ctrl |= LargeSend | ((mss & MSSMask) << MSSShift); - else if (skb->ip_summed == CHECKSUM_HW) { + else if (skb->ip_summed == CHECKSUM_PARTIAL) { if (ip->protocol == IPPROTO_TCP) ctrl |= IPCS | TCPCS; else if (ip->protocol == IPPROTO_UDP) @@ -898,7 +898,7 @@ static int cp_start_xmit (struct sk_buff *skb, struct net_device *dev) txd->addr = cpu_to_le64(first_mapping); wmb(); - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { if (ip->protocol == IPPROTO_TCP) txd->opts1 = cpu_to_le32(first_eor | first_len | FirstFrag | DescOwn | diff --git a/drivers/net/acenic.c b/drivers/net/acenic.c index 1c01e9b..8265486 100644 --- a/drivers/net/acenic.c +++ b/drivers/net/acenic.c @@ -2040,7 +2040,7 @@ static void ace_rx_int(struct net_device *dev, u32 rxretprd, u32 rxretcsm) */ if (bd_flags & BD_FLG_TCP_UDP_SUM) { skb->csum = htons(csum); - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_COMPLETE; } else { skb->ip_summed = CHECKSUM_NONE; } @@ -2511,7 +2511,7 @@ restart: mapping = ace_map_tx_skb(ap, skb, skb, idx); flagsize = (skb->len << 16) | (BD_FLG_END); - if (skb->ip_summed == CHECKSUM_HW) + if (skb->ip_summed == CHECKSUM_PARTIAL) flagsize |= BD_FLG_TCP_UDP_SUM; #if ACENIC_DO_VLAN if (vlan_tx_tag_present(skb)) { @@ -2534,7 +2534,7 @@ restart: mapping = ace_map_tx_skb(ap, skb, NULL, idx); flagsize = (skb_headlen(skb) << 16); - if (skb->ip_summed == CHECKSUM_HW) + if (skb->ip_summed == CHECKSUM_PARTIAL) flagsize |= BD_FLG_TCP_UDP_SUM; #if ACENIC_DO_VLAN if (vlan_tx_tag_present(skb)) { @@ -2560,7 +2560,7 @@ restart: PCI_DMA_TODEVICE); flagsize = (frag->size << 16); - if (skb->ip_summed == CHECKSUM_HW) + if (skb->ip_summed == CHECKSUM_PARTIAL) flagsize |= BD_FLG_TCP_UDP_SUM; idx = (idx + 1) % ACE_TX_RING_ENTRIES(ap); diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c index 652eb05..7857b463 100644 --- a/drivers/net/bnx2.c +++ b/drivers/net/bnx2.c @@ -4423,7 +4423,7 @@ bnx2_start_xmit(struct sk_buff *skb, struct net_device *dev) ring_prod = TX_RING_IDX(prod); vlan_tag_flags = 0; - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { vlan_tag_flags |= TX_BD_FLAGS_TCP_UDP_CKSUM; } diff --git a/drivers/net/cassini.c b/drivers/net/cassini.c index a31544c..558fdb8 100644 --- a/drivers/net/cassini.c +++ b/drivers/net/cassini.c @@ -2167,7 +2167,7 @@ end_copy_pkt: cas_page_unmap(addr); } skb->csum = ntohs(i ^ 0xffff); - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_COMPLETE; skb->protocol = eth_type_trans(skb, cp->dev); return len; } @@ -2821,7 +2821,7 @@ static inline int cas_xmit_tx_ringN(struct cas *cp, int ring, } ctrl = 0; - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { u64 csum_start_off, csum_stuff_off; csum_start_off = (u64) (skb->h.raw - skb->data); diff --git a/drivers/net/chelsio/sge.c b/drivers/net/chelsio/sge.c index 61b3754..ddd0bdb 100644 --- a/drivers/net/chelsio/sge.c +++ b/drivers/net/chelsio/sge.c @@ -1470,9 +1470,9 @@ int t1_start_xmit(struct sk_buff *skb, struct net_device *dev) } if (!(adapter->flags & UDP_CSUM_CAPABLE) && - skb->ip_summed == CHECKSUM_HW && + skb->ip_summed == CHECKSUM_PARTIAL && skb->nh.iph->protocol == IPPROTO_UDP) - if (unlikely(skb_checksum_help(skb, 0))) { + if (unlikely(skb_checksum_help(skb))) { dev_kfree_skb_any(skb); return NETDEV_TX_OK; } @@ -1495,11 +1495,11 @@ int t1_start_xmit(struct sk_buff *skb, struct net_device *dev) cpl = (struct cpl_tx_pkt *)__skb_push(skb, sizeof(*cpl)); cpl->opcode = CPL_TX_PKT; cpl->ip_csum_dis = 1; /* SW calculates IP csum */ - cpl->l4_csum_dis = skb->ip_summed == CHECKSUM_HW ? 0 : 1; + cpl->l4_csum_dis = skb->ip_summed == CHECKSUM_PARTIAL ? 0 : 1; /* the length field isn't used so don't bother setting it */ - st->tx_cso += (skb->ip_summed == CHECKSUM_HW); - sge->stats.tx_do_cksum += (skb->ip_summed == CHECKSUM_HW); + st->tx_cso += (skb->ip_summed == CHECKSUM_PARTIAL); + sge->stats.tx_do_cksum += (skb->ip_summed == CHECKSUM_PARTIAL); sge->stats.tx_reg_pkts++; } cpl->iff = dev->if_port; diff --git a/drivers/net/dl2k.c b/drivers/net/dl2k.c index 402961e..b74e676 100644 --- a/drivers/net/dl2k.c +++ b/drivers/net/dl2k.c @@ -611,7 +611,7 @@ start_xmit (struct sk_buff *skb, struct net_device *dev) txdesc = &np->tx_ring[entry]; #if 0 - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { txdesc->status |= cpu_to_le64 (TCPChecksumEnable | UDPChecksumEnable | IPChecksumEnable); diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c index 98ef9f8..2ab9f96 100644 --- a/drivers/net/e1000/e1000_main.c +++ b/drivers/net/e1000/e1000_main.c @@ -2600,7 +2600,7 @@ e1000_tx_csum(struct e1000_adapter *adapter, struct e1000_tx_ring *tx_ring, unsigned int i; uint8_t css; - if (likely(skb->ip_summed == CHECKSUM_HW)) { + if (likely(skb->ip_summed == CHECKSUM_PARTIAL)) { css = skb->h.raw - skb->data; i = tx_ring->next_to_use; @@ -2927,11 +2927,11 @@ e1000_xmit_frame(struct sk_buff *skb, struct net_device *netdev) } /* reserve a descriptor for the offload context */ - if ((mss) || (skb->ip_summed == CHECKSUM_HW)) + if ((mss) || (skb->ip_summed == CHECKSUM_PARTIAL)) count++; count++; #else - if (skb->ip_summed == CHECKSUM_HW) + if (skb->ip_summed == CHECKSUM_PARTIAL) count++; #endif @@ -3608,7 +3608,7 @@ e1000_rx_checksum(struct e1000_adapter *adapter, */ csum = ntohl(csum ^ 0xFFFF); skb->csum = csum; - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_COMPLETE; } adapter->hw_csum_good++; } diff --git a/drivers/net/forcedeth.c b/drivers/net/forcedeth.c index 11b8f1b..32cacf1 100644 --- a/drivers/net/forcedeth.c +++ b/drivers/net/forcedeth.c @@ -1503,7 +1503,8 @@ static int nv_start_xmit(struct sk_buff *skb, struct net_device *dev) tx_flags_extra = NV_TX2_TSO | (skb_shinfo(skb)->gso_size << NV_TX2_TSO_SHIFT); else #endif - tx_flags_extra = (skb->ip_summed == CHECKSUM_HW ? (NV_TX2_CHECKSUM_L3|NV_TX2_CHECKSUM_L4) : 0); + tx_flags_extra = skb->ip_summed == CHECKSUM_PARTIAL ? + NV_TX2_CHECKSUM_L3 | NV_TX2_CHECKSUM_L4 : 0; /* vlan tag */ if (np->vlangrp && vlan_tx_tag_present(skb)) { diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c index ebbbd6c..ba96091 100644 --- a/drivers/net/gianfar.c +++ b/drivers/net/gianfar.c @@ -947,7 +947,7 @@ static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev) /* Set up checksumming */ if (likely((dev->features & NETIF_F_IP_CSUM) - && (CHECKSUM_HW == skb->ip_summed))) { + && (CHECKSUM_PARTIAL == skb->ip_summed))) { fcb = gfar_add_fcb(skb, txbdp); status |= TXBD_TOE; gfar_tx_checksum(skb, fcb); diff --git a/drivers/net/hamachi.c b/drivers/net/hamachi.c index 409c6aa..763373a 100644 --- a/drivers/net/hamachi.c +++ b/drivers/net/hamachi.c @@ -1648,7 +1648,7 @@ static int hamachi_rx(struct net_device *dev) * could do the pseudo myself and return * CHECKSUM_UNNECESSARY */ - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_COMPLETE; } } } diff --git a/drivers/net/ibm_emac/ibm_emac_core.c b/drivers/net/ibm_emac/ibm_emac_core.c index 82468e2..57e214d 100644 --- a/drivers/net/ibm_emac/ibm_emac_core.c +++ b/drivers/net/ibm_emac/ibm_emac_core.c @@ -1036,7 +1036,7 @@ static inline u16 emac_tx_csum(struct ocp_enet_private *dev, struct sk_buff *skb) { #if defined(CONFIG_IBM_EMAC_TAH) - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { ++dev->stats.tx_packets_csum; return EMAC_TX_CTRL_TAH_CSUM; } diff --git a/drivers/net/ioc3-eth.c b/drivers/net/ioc3-eth.c index 68d8af7..65f897dd 100644 --- a/drivers/net/ioc3-eth.c +++ b/drivers/net/ioc3-eth.c @@ -1387,7 +1387,7 @@ static int ioc3_start_xmit(struct sk_buff *skb, struct net_device *dev) * MAC header which should not be summed and the TCP/UDP pseudo headers * manually. */ - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { int proto = ntohs(skb->nh.iph->protocol); unsigned int csoff; struct iphdr *ih = skb->nh.iph; diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c index 7bbd447..9405b44 100644 --- a/drivers/net/ixgb/ixgb_main.c +++ b/drivers/net/ixgb/ixgb_main.c @@ -1232,7 +1232,7 @@ ixgb_tx_csum(struct ixgb_adapter *adapter, struct sk_buff *skb) unsigned int i; uint8_t css, cso; - if(likely(skb->ip_summed == CHECKSUM_HW)) { + if(likely(skb->ip_summed == CHECKSUM_PARTIAL)) { css = skb->h.raw - skb->data; cso = (skb->h.raw + skb->csum) - skb->data; diff --git a/drivers/net/mv643xx_eth.c b/drivers/net/mv643xx_eth.c index eeab1df..38df58f 100644 --- a/drivers/net/mv643xx_eth.c +++ b/drivers/net/mv643xx_eth.c @@ -1147,7 +1147,7 @@ static void eth_tx_submit_descs_for_skb(struct mv643xx_private *mp, desc->byte_cnt = length; desc->buf_ptr = dma_map_single(NULL, skb->data, length, DMA_TO_DEVICE); - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { BUG_ON(skb->protocol != ETH_P_IP); cmd_sts |= ETH_GEN_TCP_UDP_CHECKSUM | diff --git a/drivers/net/myri10ge/myri10ge.c b/drivers/net/myri10ge/myri10ge.c index 9bdd43a..9f16681 100644 --- a/drivers/net/myri10ge/myri10ge.c +++ b/drivers/net/myri10ge/myri10ge.c @@ -930,7 +930,7 @@ static inline void myri10ge_vlan_ip_csum(struct sk_buff *skb, u16 hw_csum) (vh->h_vlan_encapsulated_proto == htons(ETH_P_IP) || vh->h_vlan_encapsulated_proto == htons(ETH_P_IPV6))) { skb->csum = hw_csum; - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_COMPLETE; } } @@ -973,7 +973,7 @@ myri10ge_rx_done(struct myri10ge_priv *mgp, struct myri10ge_rx_buf *rx, if ((skb->protocol == ntohs(ETH_P_IP)) || (skb->protocol == ntohs(ETH_P_IPV6))) { skb->csum = ntohs((u16) csum); - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_COMPLETE; } else myri10ge_vlan_ip_csum(skb, ntohs((u16) csum)); } @@ -1897,13 +1897,13 @@ again: pseudo_hdr_offset = 0; odd_flag = 0; flags = (MXGEFW_FLAGS_NO_TSO | MXGEFW_FLAGS_FIRST); - if (likely(skb->ip_summed == CHECKSUM_HW)) { + if (likely(skb->ip_summed == CHECKSUM_PARTIAL)) { cksum_offset = (skb->h.raw - skb->data); pseudo_hdr_offset = (skb->h.raw + skb->csum) - skb->data; /* If the headers are excessively large, then we must * fall back to a software checksum */ if (unlikely(cksum_offset > 255 || pseudo_hdr_offset > 127)) { - if (skb_checksum_help(skb, 0)) + if (skb_checksum_help(skb)) goto drop; cksum_offset = 0; pseudo_hdr_offset = 0; diff --git a/drivers/net/ns83820.c b/drivers/net/ns83820.c index 0e76859..5143f5d 100644 --- a/drivers/net/ns83820.c +++ b/drivers/net/ns83820.c @@ -1153,7 +1153,7 @@ again: if (!nr_frags) frag = NULL; extsts = 0; - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { extsts |= EXTSTS_IPPKT; if (IPPROTO_TCP == skb->nh.iph->protocol) extsts |= EXTSTS_TCPPKT; diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c index 4c2f575..d9b960a 100644 --- a/drivers/net/r8169.c +++ b/drivers/net/r8169.c @@ -2169,7 +2169,7 @@ static inline u32 rtl8169_tso_csum(struct sk_buff *skb, struct net_device *dev) if (mss) return LargeSend | ((mss & MSSMask) << MSSShift); } - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { const struct iphdr *ip = skb->nh.iph; if (ip->protocol == IPPROTO_TCP) diff --git a/drivers/net/s2io.c b/drivers/net/s2io.c index e72e0e0..5b3713f 100644 --- a/drivers/net/s2io.c +++ b/drivers/net/s2io.c @@ -3893,7 +3893,7 @@ static int s2io_xmit(struct sk_buff *skb, struct net_device *dev) txdp->Control_1 |= TXD_TCP_LSO_MSS(s2io_tcp_mss(skb)); } #endif - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { txdp->Control_2 |= (TXD_TX_CKO_IPV4_EN | TXD_TX_CKO_TCP_EN | TXD_TX_CKO_UDP_EN); diff --git a/drivers/net/sk98lin/skge.c b/drivers/net/sk98lin/skge.c index ee62845..eb3b351 100644 --- a/drivers/net/sk98lin/skge.c +++ b/drivers/net/sk98lin/skge.c @@ -1559,7 +1559,7 @@ struct sk_buff *pMessage) /* pointer to send-message */ pTxd->VDataHigh = (SK_U32) (PhysAddr >> 32); pTxd->pMBuf = pMessage; - if (pMessage->ip_summed == CHECKSUM_HW) { + if (pMessage->ip_summed == CHECKSUM_PARTIAL) { u16 hdrlen = pMessage->h.raw - pMessage->data; u16 offset = hdrlen + pMessage->csum; @@ -1678,7 +1678,7 @@ struct sk_buff *pMessage) /* pointer to send-message */ /* ** Does the HW need to evaluate checksum for TCP or UDP packets? */ - if (pMessage->ip_summed == CHECKSUM_HW) { + if (pMessage->ip_summed == CHECKSUM_PARTIAL) { u16 hdrlen = pMessage->h.raw - pMessage->data; u16 offset = hdrlen + pMessage->csum; @@ -2158,7 +2158,7 @@ rx_start: #ifdef USE_SK_RX_CHECKSUM pMsg->csum = pRxd->TcpSums & 0xffff; - pMsg->ip_summed = CHECKSUM_HW; + pMsg->ip_summed = CHECKSUM_COMPLETE; #else pMsg->ip_summed = CHECKSUM_NONE; #endif diff --git a/drivers/net/skge.c b/drivers/net/skge.c index ad878df..b3d6fa3 100644 --- a/drivers/net/skge.c +++ b/drivers/net/skge.c @@ -2338,7 +2338,7 @@ static int skge_xmit_frame(struct sk_buff *skb, struct net_device *dev) td->dma_lo = map; td->dma_hi = map >> 32; - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { int offset = skb->h.raw - skb->data; /* This seems backwards, but it is what the sk98lin @@ -2642,7 +2642,7 @@ static inline struct sk_buff *skge_rx_get(struct skge_port *skge, skb->dev = skge->netdev; if (skge->rx_csum) { skb->csum = csum; - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_COMPLETE; } skb->protocol = eth_type_trans(skb, skge->netdev); diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c index 933e87f..8e92566 100644 --- a/drivers/net/sky2.c +++ b/drivers/net/sky2.c @@ -1163,7 +1163,7 @@ static unsigned tx_le_req(const struct sk_buff *skb) if (skb_is_gso(skb)) ++count; - if (skb->ip_summed == CHECKSUM_HW) + if (skb->ip_summed == CHECKSUM_PARTIAL) ++count; return count; @@ -1272,7 +1272,7 @@ static int sky2_xmit_frame(struct sk_buff *skb, struct net_device *dev) #endif /* Handle TCP checksum offload */ - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { u16 hdr = skb->h.raw - skb->data; u16 offset = hdr + skb->csum; @@ -2000,7 +2000,7 @@ static int sky2_status_intr(struct sky2_hw *hw, int to_do) #endif case OP_RXCHKS: skb = sky2->rx_ring[sky2->rx_next].skb; - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_COMPLETE; skb->csum = le16_to_cpu(status); break; diff --git a/drivers/net/starfire.c b/drivers/net/starfire.c index c0a62b0..2607aa5 100644 --- a/drivers/net/starfire.c +++ b/drivers/net/starfire.c @@ -1230,7 +1230,7 @@ static int start_tx(struct sk_buff *skb, struct net_device *dev) } #if defined(ZEROCOPY) && defined(HAS_BROKEN_FIRMWARE) - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { if (skb_padto(skb, (skb->len + PADDING_MASK) & ~PADDING_MASK)) return NETDEV_TX_OK; } @@ -1252,7 +1252,7 @@ static int start_tx(struct sk_buff *skb, struct net_device *dev) status |= TxDescIntr; np->reap_tx = 0; } - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { status |= TxCalTCP; np->stats.tx_compressed++; } @@ -1499,7 +1499,7 @@ static int __netdev_rx(struct net_device *dev, int *quota) * Until then, the printk stays. :-) -Ion */ else if (le16_to_cpu(desc->status2) & 0x0040) { - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_COMPLETE; skb->csum = le16_to_cpu(desc->csum); printk(KERN_DEBUG "%s: checksum_hw, status2 = %#x\n", dev->name, le16_to_cpu(desc->status2)); } diff --git a/drivers/net/sungem.c b/drivers/net/sungem.c index d7b1d18..b388651 100644 --- a/drivers/net/sungem.c +++ b/drivers/net/sungem.c @@ -855,7 +855,7 @@ static int gem_rx(struct gem *gp, int work_to_do) } skb->csum = ntohs((status & RXDCTRL_TCPCSUM) ^ 0xffff); - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_COMPLETE; skb->protocol = eth_type_trans(skb, gp->dev); netif_receive_skb(skb); @@ -1026,7 +1026,7 @@ static int gem_start_xmit(struct sk_buff *skb, struct net_device *dev) unsigned long flags; ctrl = 0; - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { u64 csum_start_off, csum_stuff_off; csum_start_off = (u64) (skb->h.raw - skb->data); diff --git a/drivers/net/sunhme.c b/drivers/net/sunhme.c index c6f5bc3..17981da 100644 --- a/drivers/net/sunhme.c +++ b/drivers/net/sunhme.c @@ -1207,7 +1207,7 @@ static void happy_meal_transceiver_check(struct happy_meal *hp, void __iomem *tr * flags, thus: * * skb->csum = rxd->rx_flags & 0xffff; - * skb->ip_summed = CHECKSUM_HW; + * skb->ip_summed = CHECKSUM_COMPLETE; * * before sending off the skb to the protocols, and we are good as gold. */ @@ -2074,7 +2074,7 @@ static void happy_meal_rx(struct happy_meal *hp, struct net_device *dev) /* This card is _fucking_ hot... */ skb->csum = ntohs(csum ^ 0xffff); - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_COMPLETE; RXD(("len=%d csum=%4x]", len, csum)); skb->protocol = eth_type_trans(skb, dev); @@ -2268,7 +2268,7 @@ static int happy_meal_start_xmit(struct sk_buff *skb, struct net_device *dev) u32 tx_flags; tx_flags = TXFLAG_OWN; - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { u32 csum_start_off, csum_stuff_off; csum_start_off = (u32) (skb->h.raw - skb->data); diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c index eafabb2..6f5d3a3 100644 --- a/drivers/net/tg3.c +++ b/drivers/net/tg3.c @@ -3851,11 +3851,11 @@ static int tg3_start_xmit(struct sk_buff *skb, struct net_device *dev) skb->h.th->check = 0; } - else if (skb->ip_summed == CHECKSUM_HW) + else if (skb->ip_summed == CHECKSUM_PARTIAL) base_flags |= TXD_FLAG_TCPUDP_CSUM; #else mss = 0; - if (skb->ip_summed == CHECKSUM_HW) + if (skb->ip_summed == CHECKSUM_PARTIAL) base_flags |= TXD_FLAG_TCPUDP_CSUM; #endif #if TG3_VLAN_TAG_USED @@ -3981,7 +3981,7 @@ static int tg3_start_xmit_dma_bug(struct sk_buff *skb, struct net_device *dev) entry = tp->tx_prod; base_flags = 0; - if (skb->ip_summed == CHECKSUM_HW) + if (skb->ip_summed == CHECKSUM_PARTIAL) base_flags |= TXD_FLAG_TCPUDP_CSUM; #if TG3_TSO_SUPPORT != 0 mss = 0; diff --git a/drivers/net/typhoon.c b/drivers/net/typhoon.c index 4103c37..c6e601d 100644 --- a/drivers/net/typhoon.c +++ b/drivers/net/typhoon.c @@ -830,7 +830,7 @@ typhoon_start_tx(struct sk_buff *skb, struct net_device *dev) first_txd->addrHi = (u64)((unsigned long) skb) >> 32; first_txd->processFlags = 0; - if(skb->ip_summed == CHECKSUM_HW) { + if(skb->ip_summed == CHECKSUM_PARTIAL) { /* The 3XP will figure out if this is UDP/TCP */ first_txd->processFlags |= TYPHOON_TX_PF_TCP_CHKSUM; first_txd->processFlags |= TYPHOON_TX_PF_UDP_CHKSUM; diff --git a/drivers/net/via-rhine.c b/drivers/net/via-rhine.c index ae97108..6654715 100644 --- a/drivers/net/via-rhine.c +++ b/drivers/net/via-rhine.c @@ -1230,7 +1230,7 @@ static int rhine_start_tx(struct sk_buff *skb, struct net_device *dev) rp->tx_skbuff[entry] = skb; if ((rp->quirks & rqRhineI) && - (((unsigned long)skb->data & 3) || skb_shinfo(skb)->nr_frags != 0 || skb->ip_summed == CHECKSUM_HW)) { + (((unsigned long)skb->data & 3) || skb_shinfo(skb)->nr_frags != 0 || skb->ip_summed == CHECKSUM_PARTIAL)) { /* Must use alignment buffer. */ if (skb->len > PKT_BUF_SZ) { /* packet too long, drop it */ diff --git a/drivers/net/via-velocity.c b/drivers/net/via-velocity.c index aa9cd92..f1e0c74 100644 --- a/drivers/net/via-velocity.c +++ b/drivers/net/via-velocity.c @@ -2002,7 +2002,7 @@ static int velocity_xmit(struct sk_buff *skb, struct net_device *dev) * Handle hardware checksum */ if ((vptr->flags & VELOCITY_FLAGS_TX_CSUM) - && (skb->ip_summed == CHECKSUM_HW)) { + && (skb->ip_summed == CHECKSUM_PARTIAL)) { struct iphdr *ip = skb->nh.iph; if (ip->protocol == IPPROTO_TCP) td_ptr->tdesc1.TCR |= TCR0_TCPCK; diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 50a4719..4f2c2b6 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -976,7 +976,7 @@ extern void dev_mcast_init(void); extern int netdev_max_backlog; extern int weight_p; extern int netdev_set_master(struct net_device *dev, struct net_device *master); -extern int skb_checksum_help(struct sk_buff *skb, int inward); +extern int skb_checksum_help(struct sk_buff *skb); extern struct sk_buff *skb_gso_segment(struct sk_buff *skb, int features); #ifdef CONFIG_BUG extern void netdev_rx_csum_fault(struct net_device *dev); @@ -1012,7 +1012,7 @@ static inline int netif_needs_gso(struct net_device *dev, struct sk_buff *skb) { return skb_is_gso(skb) && (!skb_gso_ok(skb, dev->features) || - unlikely(skb->ip_summed != CHECKSUM_HW)); + unlikely(skb->ip_summed != CHECKSUM_PARTIAL)); } /* On bonding slaves other than the currently active slave, suppress diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 755e9cd..85577a4 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -34,8 +34,9 @@ #define HAVE_ALIGNABLE_SKB /* Ditto 8) */ #define CHECKSUM_NONE 0 -#define CHECKSUM_HW 1 +#define CHECKSUM_PARTIAL 1 #define CHECKSUM_UNNECESSARY 2 +#define CHECKSUM_COMPLETE 3 #define SKB_DATA_ALIGN(X) (((X) + (SMP_CACHE_BYTES - 1)) & \ ~(SMP_CACHE_BYTES - 1)) @@ -56,17 +57,17 @@ * Apparently with secret goal to sell you new device, when you * will add new protocol to your host. F.e. IPv6. 8) * - * HW: the most generic way. Device supplied checksum of _all_ + * COMPLETE: the most generic way. Device supplied checksum of _all_ * the packet as seen by netif_rx in skb->csum. * NOTE: Even if device supports only some protocols, but - * is able to produce some skb->csum, it MUST use HW, + * is able to produce some skb->csum, it MUST use COMPLETE, * not UNNECESSARY. * * B. Checksumming on output. * * NONE: skb is checksummed by protocol or csum is not required. * - * HW: device is required to csum packet as seen by hard_start_xmit + * PARTIAL: device is required to csum packet as seen by hard_start_xmit * from skb->h.raw to the end and to record the checksum * at skb->h.raw+skb->csum. * @@ -1261,14 +1262,14 @@ static inline int skb_linearize_cow(struct sk_buff *skb) * @len: length of data pulled * * After doing a pull on a received packet, you need to call this to - * update the CHECKSUM_HW checksum, or set ip_summed to CHECKSUM_NONE - * so that it can be recomputed from scratch. + * update the CHECKSUM_COMPLETE checksum, or set ip_summed to + * CHECKSUM_NONE so that it can be recomputed from scratch. */ static inline void skb_postpull_rcsum(struct sk_buff *skb, const void *start, unsigned int len) { - if (skb->ip_summed == CHECKSUM_HW) + if (skb->ip_summed == CHECKSUM_COMPLETE) skb->csum = csum_sub(skb->csum, csum_partial(start, len, 0)); } @@ -1287,7 +1288,7 @@ static inline int pskb_trim_rcsum(struct sk_buff *skb, unsigned int len) { if (likely(len >= skb->len)) return 0; - if (skb->ip_summed == CHECKSUM_HW) + if (skb->ip_summed == CHECKSUM_COMPLETE) skb->ip_summed = CHECKSUM_NONE; return __pskb_trim(skb, len); } diff --git a/net/core/datagram.c b/net/core/datagram.c index aecddcc..f558c61 100644 --- a/net/core/datagram.c +++ b/net/core/datagram.c @@ -417,7 +417,7 @@ unsigned int __skb_checksum_complete(struct sk_buff *skb) sum = (u16)csum_fold(skb_checksum(skb, 0, skb->len, skb->csum)); if (likely(!sum)) { - if (unlikely(skb->ip_summed == CHECKSUM_HW)) + if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE)) netdev_rx_csum_fault(skb->dev); skb->ip_summed = CHECKSUM_UNNECESSARY; } @@ -462,7 +462,7 @@ int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb, goto fault; if ((unsigned short)csum_fold(csum)) goto csum_error; - if (unlikely(skb->ip_summed == CHECKSUM_HW)) + if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE)) netdev_rx_csum_fault(skb->dev); iov->iov_len -= chunk; iov->iov_base += chunk; diff --git a/net/core/dev.c b/net/core/dev.c index d4a1ec3..fc82f6f 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1166,12 +1166,12 @@ EXPORT_SYMBOL(netif_device_attach); * Invalidate hardware checksum when packet is to be mangled, and * complete checksum manually on outgoing path. */ -int skb_checksum_help(struct sk_buff *skb, int inward) +int skb_checksum_help(struct sk_buff *skb) { unsigned int csum; int ret = 0, offset = skb->h.raw - skb->data; - if (inward) + if (skb->ip_summed == CHECKSUM_COMPLETE) goto out_set_summed; if (unlikely(skb_shinfo(skb)->gso_size)) { @@ -1223,7 +1223,7 @@ struct sk_buff *skb_gso_segment(struct sk_buff *skb, int features) skb->mac_len = skb->nh.raw - skb->data; __skb_pull(skb, skb->mac_len); - if (unlikely(skb->ip_summed != CHECKSUM_HW)) { + if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL)) { if (skb_header_cloned(skb) && (err = pskb_expand_head(skb, 0, 0, GFP_ATOMIC))) return ERR_PTR(err); @@ -1232,7 +1232,7 @@ struct sk_buff *skb_gso_segment(struct sk_buff *skb, int features) rcu_read_lock(); list_for_each_entry_rcu(ptype, &ptype_base[ntohs(type) & 15], list) { if (ptype->type == type && !ptype->dev && ptype->gso_segment) { - if (unlikely(skb->ip_summed != CHECKSUM_HW)) { + if (unlikely(skb->ip_summed != CHECKSUM_PARTIAL)) { err = ptype->gso_send_check(skb); segs = ERR_PTR(err); if (err || skb_gso_ok(skb, features)) @@ -1444,11 +1444,11 @@ int dev_queue_xmit(struct sk_buff *skb) /* If packet is not checksummed and device does not support * checksumming for this protocol, complete checksumming here. */ - if (skb->ip_summed == CHECKSUM_HW && + if (skb->ip_summed == CHECKSUM_PARTIAL && (!(dev->features & NETIF_F_GEN_CSUM) && (!(dev->features & NETIF_F_IP_CSUM) || skb->protocol != htons(ETH_P_IP)))) - if (skb_checksum_help(skb, 0)) + if (skb_checksum_help(skb)) goto out_kfree_skb; gso: diff --git a/net/core/netpoll.c b/net/core/netpoll.c index 471da45..ead5920 100644 --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -110,7 +110,7 @@ static int checksum_udp(struct sk_buff *skb, struct udphdr *uh, psum = csum_tcpudp_nofold(saddr, daddr, ulen, IPPROTO_UDP, 0); - if (skb->ip_summed == CHECKSUM_HW && + if (skb->ip_summed == CHECKSUM_COMPLETE && !(u16)csum_fold(csum_add(psum, skb->csum))) return 0; diff --git a/net/core/skbuff.c b/net/core/skbuff.c index c54f366..8a476f1 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -1397,7 +1397,7 @@ void skb_copy_and_csum_dev(const struct sk_buff *skb, u8 *to) unsigned int csum; long csstart; - if (skb->ip_summed == CHECKSUM_HW) + if (skb->ip_summed == CHECKSUM_PARTIAL) csstart = skb->h.raw - skb->data; else csstart = skb_headlen(skb); @@ -1411,7 +1411,7 @@ void skb_copy_and_csum_dev(const struct sk_buff *skb, u8 *to) csum = skb_copy_and_csum_bits(skb, csstart, to + csstart, skb->len - csstart, 0); - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { long csstuff = csstart + skb->csum; *((unsigned short *)(to + csstuff)) = csum_fold(csum); @@ -1898,10 +1898,10 @@ int skb_append_datato_frags(struct sock *sk, struct sk_buff *skb, * @len: length of data pulled * * This function performs an skb_pull on the packet and updates - * update the CHECKSUM_HW checksum. It should be used on receive - * path processing instead of skb_pull unless you know that the - * checksum difference is zero (e.g., a valid IP header) or you - * are setting ip_summed to CHECKSUM_NONE. + * update the CHECKSUM_COMPLETE checksum. It should be used on + * receive path processing instead of skb_pull unless you know + * that the checksum difference is zero (e.g., a valid IP header) + * or you are setting ip_summed to CHECKSUM_NONE. */ unsigned char *skb_pull_rcsum(struct sk_buff *skb, unsigned int len) { @@ -1994,7 +1994,7 @@ struct sk_buff *skb_segment(struct sk_buff *skb, int features) frag = skb_shinfo(nskb)->frags; k = 0; - nskb->ip_summed = CHECKSUM_HW; + nskb->ip_summed = CHECKSUM_PARTIAL; nskb->csum = skb->csum; memcpy(skb_put(nskb, hsize), skb->data + offset, hsize); diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index 6ad797c..6d223e5 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -930,7 +930,7 @@ int icmp_rcv(struct sk_buff *skb) ICMP_INC_STATS_BH(ICMP_MIB_INMSGS); switch (skb->ip_summed) { - case CHECKSUM_HW: + case CHECKSUM_COMPLETE: if (!(u16)csum_fold(skb->csum)) break; /* fall through */ diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c index 8e8117c..7003e76 100644 --- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -931,7 +931,7 @@ int igmp_rcv(struct sk_buff *skb) goto drop; switch (skb->ip_summed) { - case CHECKSUM_HW: + case CHECKSUM_COMPLETE: if (!(u16)csum_fold(skb->csum)) break; /* fall through */ diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c index b84b53a..8d7f107 100644 --- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -665,7 +665,7 @@ static struct sk_buff *ip_frag_reasm(struct ipq *qp, struct net_device *dev) head->len += fp->len; if (head->ip_summed != fp->ip_summed) head->ip_summed = CHECKSUM_NONE; - else if (head->ip_summed == CHECKSUM_HW) + else if (head->ip_summed == CHECKSUM_COMPLETE) head->csum = csum_add(head->csum, fp->csum); head->truesize += fp->truesize; atomic_sub(fp->truesize, &ip_frag_mem); diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index 0f9b3a3..e66f6ff 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -576,7 +576,7 @@ static int ipgre_rcv(struct sk_buff *skb) if (flags&GRE_CSUM) { switch (skb->ip_summed) { - case CHECKSUM_HW: + case CHECKSUM_COMPLETE: csum = (u16)csum_fold(skb->csum); if (!csum) break; @@ -584,7 +584,7 @@ static int ipgre_rcv(struct sk_buff *skb) case CHECKSUM_NONE: skb->csum = 0; csum = __skb_checksum_complete(skb); - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_COMPLETE; } offset += 4; } diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 308bdea..1b9b674 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -680,7 +680,7 @@ ip_generic_getfrag(void *from, char *to, int offset, int len, int odd, struct sk { struct iovec *iov = from; - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { if (memcpy_fromiovecend(to, iov, offset, len) < 0) return -EFAULT; } else { @@ -736,7 +736,7 @@ static inline int ip_ufo_append_data(struct sock *sk, /* initialize protocol header pointer */ skb->h.raw = skb->data + fragheaderlen; - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_PARTIAL; skb->csum = 0; sk->sk_sndmsg_off = 0; } @@ -844,7 +844,7 @@ int ip_append_data(struct sock *sk, length + fragheaderlen <= mtu && rt->u.dst.dev->features & NETIF_F_ALL_CSUM && !exthdrlen) - csummode = CHECKSUM_HW; + csummode = CHECKSUM_PARTIAL; inet->cork.length += length; if (((length > mtu) && (sk->sk_protocol == IPPROTO_UDP)) && diff --git a/net/ipv4/ipvs/ip_vs_proto_tcp.c b/net/ipv4/ipvs/ip_vs_proto_tcp.c index bc28b11..820e831 100644 --- a/net/ipv4/ipvs/ip_vs_proto_tcp.c +++ b/net/ipv4/ipvs/ip_vs_proto_tcp.c @@ -151,7 +151,7 @@ tcp_snat_handler(struct sk_buff **pskb, /* Only port and addr are changed, do fast csum update */ tcp_fast_csum_update(tcph, cp->daddr, cp->vaddr, cp->dport, cp->vport); - if ((*pskb)->ip_summed == CHECKSUM_HW) + if ((*pskb)->ip_summed == CHECKSUM_COMPLETE) (*pskb)->ip_summed = CHECKSUM_NONE; } else { /* full checksum calculation */ @@ -204,7 +204,7 @@ tcp_dnat_handler(struct sk_buff **pskb, /* Only port and addr are changed, do fast csum update */ tcp_fast_csum_update(tcph, cp->vaddr, cp->daddr, cp->vport, cp->dport); - if ((*pskb)->ip_summed == CHECKSUM_HW) + if ((*pskb)->ip_summed == CHECKSUM_COMPLETE) (*pskb)->ip_summed = CHECKSUM_NONE; } else { /* full checksum calculation */ @@ -229,7 +229,7 @@ tcp_csum_check(struct sk_buff *skb, struct ip_vs_protocol *pp) switch (skb->ip_summed) { case CHECKSUM_NONE: skb->csum = skb_checksum(skb, tcphoff, skb->len - tcphoff, 0); - case CHECKSUM_HW: + case CHECKSUM_COMPLETE: if (csum_tcpudp_magic(skb->nh.iph->saddr, skb->nh.iph->daddr, skb->len - tcphoff, skb->nh.iph->protocol, skb->csum)) { @@ -239,7 +239,7 @@ tcp_csum_check(struct sk_buff *skb, struct ip_vs_protocol *pp) } break; default: - /* CHECKSUM_UNNECESSARY */ + /* No need to checksum. */ break; } diff --git a/net/ipv4/ipvs/ip_vs_proto_udp.c b/net/ipv4/ipvs/ip_vs_proto_udp.c index 89d9175..90c8166 100644 --- a/net/ipv4/ipvs/ip_vs_proto_udp.c +++ b/net/ipv4/ipvs/ip_vs_proto_udp.c @@ -161,7 +161,7 @@ udp_snat_handler(struct sk_buff **pskb, /* Only port and addr are changed, do fast csum update */ udp_fast_csum_update(udph, cp->daddr, cp->vaddr, cp->dport, cp->vport); - if ((*pskb)->ip_summed == CHECKSUM_HW) + if ((*pskb)->ip_summed == CHECKSUM_COMPLETE) (*pskb)->ip_summed = CHECKSUM_NONE; } else { /* full checksum calculation */ @@ -216,7 +216,7 @@ udp_dnat_handler(struct sk_buff **pskb, /* Only port and addr are changed, do fast csum update */ udp_fast_csum_update(udph, cp->vaddr, cp->daddr, cp->vport, cp->dport); - if ((*pskb)->ip_summed == CHECKSUM_HW) + if ((*pskb)->ip_summed == CHECKSUM_COMPLETE) (*pskb)->ip_summed = CHECKSUM_NONE; } else { /* full checksum calculation */ @@ -250,7 +250,7 @@ udp_csum_check(struct sk_buff *skb, struct ip_vs_protocol *pp) case CHECKSUM_NONE: skb->csum = skb_checksum(skb, udphoff, skb->len - udphoff, 0); - case CHECKSUM_HW: + case CHECKSUM_COMPLETE: if (csum_tcpudp_magic(skb->nh.iph->saddr, skb->nh.iph->daddr, skb->len - udphoff, @@ -262,7 +262,7 @@ udp_csum_check(struct sk_buff *skb, struct ip_vs_protocol *pp) } break; default: - /* CHECKSUM_UNNECESSARY */ + /* No need to checksum. */ break; } } diff --git a/net/ipv4/netfilter.c b/net/ipv4/netfilter.c index 6a9e34b..f88347d 100644 --- a/net/ipv4/netfilter.c +++ b/net/ipv4/netfilter.c @@ -168,7 +168,7 @@ unsigned int nf_ip_checksum(struct sk_buff *skb, unsigned int hook, unsigned int csum = 0; switch (skb->ip_summed) { - case CHECKSUM_HW: + case CHECKSUM_COMPLETE: if (hook != NF_IP_PRE_ROUTING && hook != NF_IP_LOCAL_IN) break; if ((protocol == 0 && !(u16)csum_fold(skb->csum)) || diff --git a/net/ipv4/netfilter/ip_conntrack_proto_tcp.c b/net/ipv4/netfilter/ip_conntrack_proto_tcp.c index fb920e7..9de81ff 100644 --- a/net/ipv4/netfilter/ip_conntrack_proto_tcp.c +++ b/net/ipv4/netfilter/ip_conntrack_proto_tcp.c @@ -865,8 +865,7 @@ static int tcp_error(struct sk_buff *skb, /* Checksum invalid? Ignore. * We skip checking packets on the outgoing path - * because the semantic of CHECKSUM_HW is different there - * and moreover root might send raw packets. + * because it is assumed to be correct. */ /* FIXME: Source route IP option packets --RR */ if (ip_conntrack_checksum && hooknum == NF_IP_PRE_ROUTING && diff --git a/net/ipv4/netfilter/ip_conntrack_proto_udp.c b/net/ipv4/netfilter/ip_conntrack_proto_udp.c index 9b2c16b..e58e52f 100644 --- a/net/ipv4/netfilter/ip_conntrack_proto_udp.c +++ b/net/ipv4/netfilter/ip_conntrack_proto_udp.c @@ -117,8 +117,7 @@ static int udp_error(struct sk_buff *skb, enum ip_conntrack_info *ctinfo, /* Checksum invalid? Ignore. * We skip checking packets on the outgoing path - * because the semantic of CHECKSUM_HW is different there - * and moreover root might send raw packets. + * because the checksum is assumed to be correct. * FIXME: Source route IP option packets --RR */ if (ip_conntrack_checksum && hooknum == NF_IP_PRE_ROUTING && nf_ip_checksum(skb, hooknum, iph->ihl * 4, IPPROTO_UDP)) { diff --git a/net/ipv4/netfilter/ip_nat_standalone.c b/net/ipv4/netfilter/ip_nat_standalone.c index 17de077..f4f00c8 100644 --- a/net/ipv4/netfilter/ip_nat_standalone.c +++ b/net/ipv4/netfilter/ip_nat_standalone.c @@ -111,8 +111,9 @@ ip_nat_fn(unsigned int hooknum, & htons(IP_MF|IP_OFFSET))); /* If we had a hardware checksum before, it's now invalid */ - if ((*pskb)->ip_summed == CHECKSUM_HW) - if (skb_checksum_help(*pskb, (out == NULL))) + if ((*pskb)->ip_summed == CHECKSUM_PARTIAL || + (*pskb)->ip_summed == CHECKSUM_COMPLETE) + if (skb_checksum_help(*pskb)) return NF_DROP; ct = ip_conntrack_get(*pskb, &ctinfo); diff --git a/net/ipv4/netfilter/ip_queue.c b/net/ipv4/netfilter/ip_queue.c index 198ac36..276a964 100644 --- a/net/ipv4/netfilter/ip_queue.c +++ b/net/ipv4/netfilter/ip_queue.c @@ -208,9 +208,9 @@ ipq_build_packet_message(struct ipq_queue_entry *entry, int *errp) break; case IPQ_COPY_PACKET: - if (entry->skb->ip_summed == CHECKSUM_HW && - (*errp = skb_checksum_help(entry->skb, - entry->info->outdev == NULL))) { + if ((entry->skb->ip_summed == CHECKSUM_PARTIAL || + entry->skb->ip_summed == CHECKSUM_COMPLETE) && + (*errp = skb_checksum_help(entry->skb))) { read_unlock_bh(&queue_lock); return NULL; } diff --git a/net/ipv4/netfilter/ipt_ECN.c b/net/ipv4/netfilter/ipt_ECN.c index 4adf5c9..4ec43f9 100644 --- a/net/ipv4/netfilter/ipt_ECN.c +++ b/net/ipv4/netfilter/ipt_ECN.c @@ -49,7 +49,7 @@ set_ect_ip(struct sk_buff **pskb, const struct ipt_ECN_info *einfo) /* Return 0 if there was an error. */ static inline int -set_ect_tcp(struct sk_buff **pskb, const struct ipt_ECN_info *einfo, int inward) +set_ect_tcp(struct sk_buff **pskb, const struct ipt_ECN_info *einfo) { struct tcphdr _tcph, *tcph; u_int16_t diffs[2]; @@ -70,8 +70,9 @@ set_ect_tcp(struct sk_buff **pskb, const struct ipt_ECN_info *einfo, int inward) return 0; tcph = (void *)(*pskb)->nh.iph + (*pskb)->nh.iph->ihl*4; - if ((*pskb)->ip_summed == CHECKSUM_HW && - skb_checksum_help(*pskb, inward)) + if (((*pskb)->ip_summed == CHECKSUM_PARTIAL || + (*pskb)->ip_summed == CHECKSUM_COMPLETE) && + skb_checksum_help(*pskb)) return 0; diffs[0] = ((u_int16_t *)tcph)[6]; @@ -106,7 +107,7 @@ target(struct sk_buff **pskb, if (einfo->operation & (IPT_ECN_OP_SET_ECE | IPT_ECN_OP_SET_CWR) && (*pskb)->nh.iph->protocol == IPPROTO_TCP) - if (!set_ect_tcp(pskb, einfo, (out == NULL))) + if (!set_ect_tcp(pskb, einfo)) return NF_DROP; return IPT_CONTINUE; diff --git a/net/ipv4/netfilter/ipt_TCPMSS.c b/net/ipv4/netfilter/ipt_TCPMSS.c index ef2fe5b..c998dc0 100644 --- a/net/ipv4/netfilter/ipt_TCPMSS.c +++ b/net/ipv4/netfilter/ipt_TCPMSS.c @@ -62,8 +62,9 @@ ipt_tcpmss_target(struct sk_buff **pskb, if (!skb_make_writable(pskb, (*pskb)->len)) return NF_DROP; - if ((*pskb)->ip_summed == CHECKSUM_HW && - skb_checksum_help(*pskb, out == NULL)) + if (((*pskb)->ip_summed == CHECKSUM_PARTIAL || + (*pskb)->ip_summed == CHECKSUM_COMPLETE) && + skb_checksum_help(*pskb)) return NF_DROP; iph = (*pskb)->nh.iph; diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 934396b..b0124e69 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -568,7 +568,7 @@ new_segment: skb->truesize += copy; sk->sk_wmem_queued += copy; sk->sk_forward_alloc -= copy; - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_PARTIAL; tp->write_seq += copy; TCP_SKB_CB(skb)->end_seq += copy; skb_shinfo(skb)->gso_segs = 0; @@ -723,7 +723,7 @@ new_segment: * Check whether we can use HW checksum. */ if (sk->sk_route_caps & NETIF_F_ALL_CSUM) - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_PARTIAL; skb_entail(sk, tp, skb); copy = size_goal; @@ -2205,7 +2205,7 @@ struct sk_buff *tcp_tso_segment(struct sk_buff *skb, int features) th->fin = th->psh = 0; th->check = ~csum_fold(th->check + delta); - if (skb->ip_summed != CHECKSUM_HW) + if (skb->ip_summed != CHECKSUM_PARTIAL) th->check = csum_fold(csum_partial(skb->h.raw, thlen, skb->csum)); @@ -2219,7 +2219,7 @@ struct sk_buff *tcp_tso_segment(struct sk_buff *skb, int features) delta = htonl(oldlen + (skb->tail - skb->h.raw) + skb->data_len); th->check = ~csum_fold(th->check + delta); - if (skb->ip_summed != CHECKSUM_HW) + if (skb->ip_summed != CHECKSUM_PARTIAL) th->check = csum_fold(csum_partial(skb->h.raw, thlen, skb->csum)); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 43f6740..b2aa512 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -484,7 +484,7 @@ void tcp_v4_send_check(struct sock *sk, int len, struct sk_buff *skb) struct inet_sock *inet = inet_sk(sk); struct tcphdr *th = skb->h.th; - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { th->check = ~tcp_v4_check(th, len, inet->saddr, inet->daddr, 0); skb->csum = offsetof(struct tcphdr, check); } else { @@ -509,7 +509,7 @@ int tcp_v4_gso_send_check(struct sk_buff *skb) th->check = 0; th->check = ~tcp_v4_check(th, skb->len, iph->saddr, iph->daddr, 0); skb->csum = offsetof(struct tcphdr, check); - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_PARTIAL; return 0; } @@ -973,7 +973,7 @@ static struct sock *tcp_v4_hnd_req(struct sock *sk, struct sk_buff *skb) static int tcp_v4_checksum_init(struct sk_buff *skb) { - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_COMPLETE) { if (!tcp_v4_check(skb->h.th, skb->len, skb->nh.iph->saddr, skb->nh.iph->daddr, skb->csum)) { skb->ip_summed = CHECKSUM_UNNECESSARY; diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index b4f3ffe..9252a50 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -577,7 +577,7 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len, unsigned int mss TCP_SKB_CB(buff)->sacked = TCP_SKB_CB(skb)->sacked; TCP_SKB_CB(skb)->sacked &= ~TCPCB_AT_TAIL; - if (!skb_shinfo(skb)->nr_frags && skb->ip_summed != CHECKSUM_HW) { + if (!skb_shinfo(skb)->nr_frags && skb->ip_summed != CHECKSUM_PARTIAL) { /* Copy and checksum data tail into the new buffer. */ buff->csum = csum_partial_copy_nocheck(skb->data + len, skb_put(buff, nsize), nsize, 0); @@ -586,7 +586,7 @@ int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len, unsigned int mss skb->csum = csum_block_sub(skb->csum, buff->csum, len); } else { - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_PARTIAL; skb_split(skb, buff, len); } @@ -689,7 +689,7 @@ int tcp_trim_head(struct sock *sk, struct sk_buff *skb, u32 len) __pskb_trim_head(skb, len - skb_headlen(skb)); TCP_SKB_CB(skb)->seq += len; - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_PARTIAL; skb->truesize -= len; sk->sk_wmem_queued -= len; @@ -1062,7 +1062,7 @@ static int tso_fragment(struct sock *sk, struct sk_buff *skb, unsigned int len, /* This packet was never sent out yet, so no SACK bits. */ TCP_SKB_CB(buff)->sacked = 0; - buff->ip_summed = skb->ip_summed = CHECKSUM_HW; + buff->ip_summed = skb->ip_summed = CHECKSUM_PARTIAL; skb_split(skb, buff, len); /* Fix up tso_factor for both original and new SKB. */ @@ -1206,8 +1206,7 @@ static int tcp_mtu_probe(struct sock *sk) TCP_SKB_CB(nskb)->flags = TCPCB_FLAG_ACK; TCP_SKB_CB(nskb)->sacked = 0; nskb->csum = 0; - if (skb->ip_summed == CHECKSUM_HW) - nskb->ip_summed = CHECKSUM_HW; + nskb->ip_summed = skb->ip_summed; len = 0; while (len < probe_size) { @@ -1231,7 +1230,7 @@ static int tcp_mtu_probe(struct sock *sk) ~(TCPCB_FLAG_FIN|TCPCB_FLAG_PSH); if (!skb_shinfo(skb)->nr_frags) { skb_pull(skb, copy); - if (skb->ip_summed != CHECKSUM_HW) + if (skb->ip_summed != CHECKSUM_PARTIAL) skb->csum = csum_partial(skb->data, skb->len, 0); } else { __pskb_trim_head(skb, copy); @@ -1572,10 +1571,9 @@ static void tcp_retrans_try_collapse(struct sock *sk, struct sk_buff *skb, int m memcpy(skb_put(skb, next_skb_size), next_skb->data, next_skb_size); - if (next_skb->ip_summed == CHECKSUM_HW) - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = next_skb->ip_summed; - if (skb->ip_summed != CHECKSUM_HW) + if (skb->ip_summed != CHECKSUM_PARTIAL) skb->csum = csum_block_add(skb->csum, next_skb->csum, skb_size); /* Update sequence range on original skb. */ diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index a4d005e..8715251 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -429,7 +429,7 @@ static int udp_push_pending_frames(struct sock *sk, struct udp_sock *up) /* * Only one fragment on the socket. */ - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { skb->csum = offsetof(struct udphdr, check); uh->check = ~csum_tcpudp_magic(fl->fl4_src, fl->fl4_dst, up->len, IPPROTO_UDP, 0); @@ -448,7 +448,7 @@ static int udp_push_pending_frames(struct sock *sk, struct udp_sock *up) * fragments on the socket so that all csums of sk_buffs * should be together. */ - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { int offset = (unsigned char *)uh - skb->data; skb->csum = skb_checksum(skb, offset, skb->len - offset, 0); @@ -1088,7 +1088,7 @@ static void udp_checksum_init(struct sk_buff *skb, struct udphdr *uh, { if (uh->check == 0) { skb->ip_summed = CHECKSUM_UNNECESSARY; - } else if (skb->ip_summed == CHECKSUM_HW) { + } else if (skb->ip_summed == CHECKSUM_COMPLETE) { if (!udp_check(uh, ulen, saddr, daddr, skb->csum)) skb->ip_summed = CHECKSUM_UNNECESSARY; } diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c index d16f863..4a96a9e 100644 --- a/net/ipv4/xfrm4_output.c +++ b/net/ipv4/xfrm4_output.c @@ -48,8 +48,8 @@ static int xfrm4_output_one(struct sk_buff *skb) struct xfrm_state *x = dst->xfrm; int err; - if (skb->ip_summed == CHECKSUM_HW) { - err = skb_checksum_help(skb, 0); + if (skb->ip_summed == CHECKSUM_PARTIAL) { + err = skb_checksum_help(skb); if (err) goto error_nolock; } diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c index 86dac10..05afa6b 100644 --- a/net/ipv6/exthdrs.c +++ b/net/ipv6/exthdrs.c @@ -294,7 +294,7 @@ looped_back: hdr = (struct ipv6_rt_hdr *) skb2->h.raw; } - if (skb->ip_summed == CHECKSUM_HW) + if (skb->ip_summed == CHECKSUM_COMPLETE) skb->ip_summed = CHECKSUM_NONE; i = n - --hdr->segments_left; diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c index dbfce08..1030551 100644 --- a/net/ipv6/icmp.c +++ b/net/ipv6/icmp.c @@ -606,7 +606,7 @@ static int icmpv6_rcv(struct sk_buff **pskb) /* Perform checksum. */ switch (skb->ip_summed) { - case CHECKSUM_HW: + case CHECKSUM_COMPLETE: if (!csum_ipv6_magic(saddr, daddr, skb->len, IPPROTO_ICMPV6, skb->csum)) break; diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 4fb47a2..65514f2 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -866,7 +866,7 @@ static inline int ip6_ufo_append_data(struct sock *sk, /* initialize protocol header pointer */ skb->h.raw = skb->data + fragheaderlen; - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_PARTIAL; skb->csum = 0; sk->sk_sndmsg_off = 0; } diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c index 395a417..580b1ab 100644 --- a/net/ipv6/netfilter.c +++ b/net/ipv6/netfilter.c @@ -87,7 +87,7 @@ unsigned int nf_ip6_checksum(struct sk_buff *skb, unsigned int hook, unsigned int csum = 0; switch (skb->ip_summed) { - case CHECKSUM_HW: + case CHECKSUM_COMPLETE: if (hook != NF_IP6_PRE_ROUTING && hook != NF_IP6_LOCAL_IN) break; if (!csum_ipv6_magic(&ip6h->saddr, &ip6h->daddr, diff --git a/net/ipv6/netfilter/ip6_queue.c b/net/ipv6/netfilter/ip6_queue.c index 968a14b..c01c126 100644 --- a/net/ipv6/netfilter/ip6_queue.c +++ b/net/ipv6/netfilter/ip6_queue.c @@ -206,9 +206,9 @@ ipq_build_packet_message(struct ipq_queue_entry *entry, int *errp) break; case IPQ_COPY_PACKET: - if (entry->skb->ip_summed == CHECKSUM_HW && - (*errp = skb_checksum_help(entry->skb, - entry->info->outdev == NULL))) { + if ((entry->skb->ip_summed == CHECKSUM_PARTIAL || + entry->skb->ip_summed == CHECKSUM_COMPLETE) && + (*errp = skb_checksum_help(entry->skb))) { read_unlock_bh(&queue_lock); return NULL; } diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c index 00d5583..7a4e4c2 100644 --- a/net/ipv6/netfilter/nf_conntrack_reasm.c +++ b/net/ipv6/netfilter/nf_conntrack_reasm.c @@ -408,7 +408,7 @@ static int nf_ct_frag6_queue(struct nf_ct_frag6_queue *fq, struct sk_buff *skb, return -1; } - if (skb->ip_summed == CHECKSUM_HW) + if (skb->ip_summed == CHECKSUM_COMPLETE) skb->csum = csum_sub(skb->csum, csum_partial(skb->nh.raw, (u8*)(fhdr + 1) - skb->nh.raw, @@ -640,7 +640,7 @@ nf_ct_frag6_reasm(struct nf_ct_frag6_queue *fq, struct net_device *dev) head->len += fp->len; if (head->ip_summed != fp->ip_summed) head->ip_summed = CHECKSUM_NONE; - else if (head->ip_summed == CHECKSUM_HW) + else if (head->ip_summed == CHECKSUM_COMPLETE) head->csum = csum_add(head->csum, fp->csum); head->truesize += fp->truesize; atomic_sub(fp->truesize, &nf_ct_frag6_mem); @@ -652,7 +652,7 @@ nf_ct_frag6_reasm(struct nf_ct_frag6_queue *fq, struct net_device *dev) head->nh.ipv6h->payload_len = htons(payload_len); /* Yes, and fold redundant checksum back. 8) */ - if (head->ip_summed == CHECKSUM_HW) + if (head->ip_summed == CHECKSUM_COMPLETE) head->csum = csum_partial(head->nh.raw, head->h.raw-head->nh.raw, head->csum); fq->fragments = NULL; diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c index d5040e1..d4af1cb 100644 --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -334,7 +334,7 @@ int rawv6_rcv(struct sock *sk, struct sk_buff *skb) if (!rp->checksum) skb->ip_summed = CHECKSUM_UNNECESSARY; - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_COMPLETE) { skb_postpull_rcsum(skb, skb->nh.raw, skb->h.raw - skb->nh.raw); if (!csum_ipv6_magic(&skb->nh.ipv6h->saddr, diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c index 4e299c6..a8623d2 100644 --- a/net/ipv6/reassembly.c +++ b/net/ipv6/reassembly.c @@ -433,7 +433,7 @@ static void ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb, return; } - if (skb->ip_summed == CHECKSUM_HW) + if (skb->ip_summed == CHECKSUM_COMPLETE) skb->csum = csum_sub(skb->csum, csum_partial(skb->nh.raw, (u8*)(fhdr+1)-skb->nh.raw, 0)); @@ -647,7 +647,7 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff **skb_in, head->len += fp->len; if (head->ip_summed != fp->ip_summed) head->ip_summed = CHECKSUM_NONE; - else if (head->ip_summed == CHECKSUM_HW) + else if (head->ip_summed == CHECKSUM_COMPLETE) head->csum = csum_add(head->csum, fp->csum); head->truesize += fp->truesize; atomic_sub(fp->truesize, &ip6_frag_mem); @@ -662,7 +662,7 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff **skb_in, *skb_in = head; /* Yes, and fold redundant checksum back. 8) */ - if (head->ip_summed == CHECKSUM_HW) + if (head->ip_summed == CHECKSUM_COMPLETE) head->csum = csum_partial(head->nh.raw, head->h.raw-head->nh.raw, head->csum); IP6_INC_STATS_BH(IPSTATS_MIB_REASMOKS); diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 302786a..7f1b660 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -545,7 +545,7 @@ static void tcp_v6_send_check(struct sock *sk, int len, struct sk_buff *skb) struct ipv6_pinfo *np = inet6_sk(sk); struct tcphdr *th = skb->h.th; - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_PARTIAL) { th->check = ~csum_ipv6_magic(&np->saddr, &np->daddr, len, IPPROTO_TCP, 0); skb->csum = offsetof(struct tcphdr, check); } else { @@ -570,7 +570,7 @@ static int tcp_v6_gso_send_check(struct sk_buff *skb) th->check = ~csum_ipv6_magic(&ipv6h->saddr, &ipv6h->daddr, skb->len, IPPROTO_TCP, 0); skb->csum = offsetof(struct tcphdr, check); - skb->ip_summed = CHECKSUM_HW; + skb->ip_summed = CHECKSUM_PARTIAL; return 0; } @@ -1033,7 +1033,7 @@ out: static int tcp_v6_checksum_init(struct sk_buff *skb) { - if (skb->ip_summed == CHECKSUM_HW) { + if (skb->ip_summed == CHECKSUM_COMPLETE) { if (!tcp_v6_check(skb->h.th,skb->len,&skb->nh.ipv6h->saddr, &skb->nh.ipv6h->daddr,skb->csum)) { skb->ip_summed = CHECKSUM_UNNECESSARY; diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 82c7c9c..780b89f 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -475,7 +475,7 @@ static int udpv6_rcv(struct sk_buff **pskb) uh = skb->h.uh; } - if (skb->ip_summed == CHECKSUM_HW && + if (skb->ip_summed == CHECKSUM_COMPLETE && !csum_ipv6_magic(saddr, daddr, ulen, IPPROTO_UDP, skb->csum)) skb->ip_summed = CHECKSUM_UNNECESSARY; diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c index c8c8b44..6d11174 100644 --- a/net/ipv6/xfrm6_output.c +++ b/net/ipv6/xfrm6_output.c @@ -41,8 +41,8 @@ static int xfrm6_output_one(struct sk_buff *skb) struct xfrm_state *x = dst->xfrm; int err; - if (skb->ip_summed == CHECKSUM_HW) { - err = skb_checksum_help(skb, 0); + if (skb->ip_summed == CHECKSUM_PARTIAL) { + err = skb_checksum_help(skb); if (err) goto error_nolock; } diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c index af8adcb..308d2ab 100644 --- a/net/netfilter/nf_conntrack_proto_tcp.c +++ b/net/netfilter/nf_conntrack_proto_tcp.c @@ -823,8 +823,7 @@ static int tcp_error(struct sk_buff *skb, /* Checksum invalid? Ignore. * We skip checking packets on the outgoing path - * because the semantic of CHECKSUM_HW is different there - * and moreover root might send raw packets. + * because the checksum is assumed to be correct. */ /* FIXME: Source route IP option packets --RR */ if (nf_conntrack_checksum && diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c index ae07ebe..d36e031 100644 --- a/net/netfilter/nf_conntrack_proto_udp.c +++ b/net/netfilter/nf_conntrack_proto_udp.c @@ -131,8 +131,7 @@ static int udp_error(struct sk_buff *skb, unsigned int dataoff, /* Checksum invalid? Ignore. * We skip checking packets on the outgoing path - * because the semantic of CHECKSUM_HW is different there - * and moreover root might send raw packets. + * because the checksum is assumed to be correct. * FIXME: Source route IP option packets --RR */ if (nf_conntrack_checksum && ((pf == PF_INET && hooknum == NF_IP_PRE_ROUTING) || diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c index 49ef41e..eddfbe4 100644 --- a/net/netfilter/nfnetlink_queue.c +++ b/net/netfilter/nfnetlink_queue.c @@ -377,9 +377,9 @@ nfqnl_build_packet_message(struct nfqnl_instance *queue, break; case NFQNL_COPY_PACKET: - if (entskb->ip_summed == CHECKSUM_HW && - (*errp = skb_checksum_help(entskb, - outdev == NULL))) { + if ((entskb->ip_summed == CHECKSUM_PARTIAL || + entskb->ip_summed == CHECKSUM_COMPLETE) && + (*errp = skb_checksum_help(entskb))) { spin_unlock_bh(&queue->lock); return NULL; } diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 4172a52..300215b 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -586,7 +586,7 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev, struct packe else if (skb->pkt_type == PACKET_OUTGOING) { /* Special case: outgoing packets have ll header at head */ skb_pull(skb, skb->nh.raw - skb->data); - if (skb->ip_summed == CHECKSUM_HW) + if (skb->ip_summed == CHECKSUM_PARTIAL) status |= TP_STATUS_CSUMNOTREADY; } } diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c index a08ec4c7c..45939ba 100644 --- a/net/sched/sch_netem.c +++ b/net/sched/sch_netem.c @@ -192,8 +192,8 @@ static int netem_enqueue(struct sk_buff *skb, struct Qdisc *sch) */ if (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor)) { if (!(skb = skb_unshare(skb, GFP_ATOMIC)) - || (skb->ip_summed == CHECKSUM_HW - && skb_checksum_help(skb, 0))) { + || (skb->ip_summed == CHECKSUM_PARTIAL + && skb_checksum_help(skb))) { sch->qstats.drops++; return NET_XMIT_DROP; } diff --git a/net/sunrpc/socklib.c b/net/sunrpc/socklib.c index eb330d4..6f17527 100644 --- a/net/sunrpc/socklib.c +++ b/net/sunrpc/socklib.c @@ -168,7 +168,7 @@ int csum_partial_copy_to_xdr(struct xdr_buf *xdr, struct sk_buff *skb) return -1; if ((unsigned short)csum_fold(desc.csum)) return -1; - if (unlikely(skb->ip_summed == CHECKSUM_HW)) + if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE)) netdev_rx_csum_fault(skb->dev); return 0; no_checksum: -- cgit v0.10.2 From 4cf411de49c65140b3c259748629b561c0d3340f Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Sat, 5 Aug 2006 00:58:33 -0700 Subject: [NETFILTER]: Get rid of HW checksum invalidation Update hardware checksums incrementally to avoid breaking GSO. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h index 10168e2..b7e67d1 100644 --- a/include/linux/netfilter.h +++ b/include/linux/netfilter.h @@ -282,6 +282,12 @@ extern void nf_invalidate_cache(int pf); Returns true or false. */ extern int skb_make_writable(struct sk_buff **pskb, unsigned int writable_len); +extern u_int16_t nf_csum_update(u_int32_t oldval, u_int32_t newval, + u_int32_t csum); +extern u_int16_t nf_proto_csum_update(struct sk_buff *skb, + u_int32_t oldval, u_int32_t newval, + u_int16_t csum, int pseudohdr); + struct nf_afinfo { unsigned short family; unsigned int (*checksum)(struct sk_buff *skb, unsigned int hook, diff --git a/include/linux/netfilter_ipv4/ip_nat.h b/include/linux/netfilter_ipv4/ip_nat.h index e9f5ed1..98f8407 100644 --- a/include/linux/netfilter_ipv4/ip_nat.h +++ b/include/linux/netfilter_ipv4/ip_nat.h @@ -72,10 +72,6 @@ extern unsigned int ip_nat_setup_info(struct ip_conntrack *conntrack, extern int ip_nat_used_tuple(const struct ip_conntrack_tuple *tuple, const struct ip_conntrack *ignored_conntrack); -/* Calculate relative checksum. */ -extern u_int16_t ip_nat_cheat_check(u_int32_t oldvalinv, - u_int32_t newval, - u_int16_t oldcheck); #else /* !__KERNEL__: iptables wants this to compile. */ #define ip_nat_multi_range ip_nat_multi_range_compat #endif /*__KERNEL__*/ diff --git a/include/linux/netfilter_ipv4/ip_nat_core.h b/include/linux/netfilter_ipv4/ip_nat_core.h index 30db23f..60566f9f 100644 --- a/include/linux/netfilter_ipv4/ip_nat_core.h +++ b/include/linux/netfilter_ipv4/ip_nat_core.h @@ -11,8 +11,8 @@ extern unsigned int ip_nat_packet(struct ip_conntrack *ct, unsigned int hooknum, struct sk_buff **pskb); -extern int ip_nat_icmp_reply_translation(struct sk_buff **pskb, - struct ip_conntrack *ct, - enum ip_nat_manip_type manip, - enum ip_conntrack_dir dir); +extern int ip_nat_icmp_reply_translation(struct ip_conntrack *ct, + enum ip_conntrack_info ctinfo, + unsigned int hooknum, + struct sk_buff **pskb); #endif /* _IP_NAT_CORE_H */ diff --git a/net/ipv4/netfilter/ip_nat_core.c b/net/ipv4/netfilter/ip_nat_core.c index 1741d55..4c540d0 100644 --- a/net/ipv4/netfilter/ip_nat_core.c +++ b/net/ipv4/netfilter/ip_nat_core.c @@ -101,18 +101,6 @@ static void ip_nat_cleanup_conntrack(struct ip_conntrack *conn) write_unlock_bh(&ip_nat_lock); } -/* We do checksum mangling, so if they were wrong before they're still - * wrong. Also works for incomplete packets (eg. ICMP dest - * unreachables.) */ -u_int16_t -ip_nat_cheat_check(u_int32_t oldvalinv, u_int32_t newval, u_int16_t oldcheck) -{ - u_int32_t diffs[] = { oldvalinv, newval }; - return csum_fold(csum_partial((char *)diffs, sizeof(diffs), - oldcheck^0xFFFF)); -} -EXPORT_SYMBOL(ip_nat_cheat_check); - /* Is this tuple already taken? (not by us) */ int ip_nat_used_tuple(const struct ip_conntrack_tuple *tuple, @@ -378,12 +366,12 @@ manip_pkt(u_int16_t proto, iph = (void *)(*pskb)->data + iphdroff; if (maniptype == IP_NAT_MANIP_SRC) { - iph->check = ip_nat_cheat_check(~iph->saddr, target->src.ip, - iph->check); + iph->check = nf_csum_update(~iph->saddr, target->src.ip, + iph->check); iph->saddr = target->src.ip; } else { - iph->check = ip_nat_cheat_check(~iph->daddr, target->dst.ip, - iph->check); + iph->check = nf_csum_update(~iph->daddr, target->dst.ip, + iph->check); iph->daddr = target->dst.ip; } return 1; @@ -423,10 +411,10 @@ unsigned int ip_nat_packet(struct ip_conntrack *ct, EXPORT_SYMBOL_GPL(ip_nat_packet); /* Dir is direction ICMP is coming from (opposite to packet it contains) */ -int ip_nat_icmp_reply_translation(struct sk_buff **pskb, - struct ip_conntrack *ct, - enum ip_nat_manip_type manip, - enum ip_conntrack_dir dir) +int ip_nat_icmp_reply_translation(struct ip_conntrack *ct, + enum ip_conntrack_info ctinfo, + unsigned int hooknum, + struct sk_buff **pskb) { struct { struct icmphdr icmp; @@ -434,7 +422,9 @@ int ip_nat_icmp_reply_translation(struct sk_buff **pskb, } *inside; struct ip_conntrack_tuple inner, target; int hdrlen = (*pskb)->nh.iph->ihl * 4; + enum ip_conntrack_dir dir = CTINFO2DIR(ctinfo); unsigned long statusbit; + enum ip_nat_manip_type manip = HOOK2MANIP(hooknum); if (!skb_make_writable(pskb, hdrlen + sizeof(*inside))) return 0; @@ -443,12 +433,8 @@ int ip_nat_icmp_reply_translation(struct sk_buff **pskb, /* We're actually going to mangle it beyond trivial checksum adjustment, so make sure the current checksum is correct. */ - if ((*pskb)->ip_summed != CHECKSUM_UNNECESSARY) { - hdrlen = (*pskb)->nh.iph->ihl * 4; - if ((u16)csum_fold(skb_checksum(*pskb, hdrlen, - (*pskb)->len - hdrlen, 0))) - return 0; - } + if (nf_ip_checksum(*pskb, hooknum, hdrlen, 0)) + return 0; /* Must be RELATED */ IP_NF_ASSERT((*pskb)->nfctinfo == IP_CT_RELATED || @@ -487,12 +473,14 @@ int ip_nat_icmp_reply_translation(struct sk_buff **pskb, !manip)) return 0; - /* Reloading "inside" here since manip_pkt inner. */ - inside = (void *)(*pskb)->data + (*pskb)->nh.iph->ihl*4; - inside->icmp.checksum = 0; - inside->icmp.checksum = csum_fold(skb_checksum(*pskb, hdrlen, - (*pskb)->len - hdrlen, - 0)); + if ((*pskb)->ip_summed != CHECKSUM_PARTIAL) { + /* Reloading "inside" here since manip_pkt inner. */ + inside = (void *)(*pskb)->data + (*pskb)->nh.iph->ihl*4; + inside->icmp.checksum = 0; + inside->icmp.checksum = csum_fold(skb_checksum(*pskb, hdrlen, + (*pskb)->len - hdrlen, + 0)); + } /* Change outer to look the reply to an incoming packet * (proto 0 means don't invert per-proto part). */ diff --git a/net/ipv4/netfilter/ip_nat_helper.c b/net/ipv4/netfilter/ip_nat_helper.c index cbcaa45..021c3da 100644 --- a/net/ipv4/netfilter/ip_nat_helper.c +++ b/net/ipv4/netfilter/ip_nat_helper.c @@ -165,7 +165,7 @@ ip_nat_mangle_tcp_packet(struct sk_buff **pskb, { struct iphdr *iph; struct tcphdr *tcph; - int datalen; + int oldlen, datalen; if (!skb_make_writable(pskb, (*pskb)->len)) return 0; @@ -180,13 +180,22 @@ ip_nat_mangle_tcp_packet(struct sk_buff **pskb, iph = (*pskb)->nh.iph; tcph = (void *)iph + iph->ihl*4; + oldlen = (*pskb)->len - iph->ihl*4; mangle_contents(*pskb, iph->ihl*4 + tcph->doff*4, match_offset, match_len, rep_buffer, rep_len); datalen = (*pskb)->len - iph->ihl*4; - tcph->check = 0; - tcph->check = tcp_v4_check(tcph, datalen, iph->saddr, iph->daddr, - csum_partial((char *)tcph, datalen, 0)); + if ((*pskb)->ip_summed != CHECKSUM_PARTIAL) { + tcph->check = 0; + tcph->check = tcp_v4_check(tcph, datalen, + iph->saddr, iph->daddr, + csum_partial((char *)tcph, + datalen, 0)); + } else + tcph->check = nf_proto_csum_update(*pskb, + htons(oldlen) ^ 0xFFFF, + htons(datalen), + tcph->check, 1); if (rep_len != match_len) { set_bit(IPS_SEQ_ADJUST_BIT, &ct->status); @@ -221,6 +230,7 @@ ip_nat_mangle_udp_packet(struct sk_buff **pskb, { struct iphdr *iph; struct udphdr *udph; + int datalen, oldlen; /* UDP helpers might accidentally mangle the wrong packet */ iph = (*pskb)->nh.iph; @@ -238,22 +248,32 @@ ip_nat_mangle_udp_packet(struct sk_buff **pskb, iph = (*pskb)->nh.iph; udph = (void *)iph + iph->ihl*4; + + oldlen = (*pskb)->len - iph->ihl*4; mangle_contents(*pskb, iph->ihl*4 + sizeof(*udph), match_offset, match_len, rep_buffer, rep_len); /* update the length of the UDP packet */ - udph->len = htons((*pskb)->len - iph->ihl*4); + datalen = (*pskb)->len - iph->ihl*4; + udph->len = htons(datalen); /* fix udp checksum if udp checksum was previously calculated */ - if (udph->check) { - int datalen = (*pskb)->len - iph->ihl * 4; + if (!udph->check && (*pskb)->ip_summed != CHECKSUM_PARTIAL) + return 1; + + if ((*pskb)->ip_summed != CHECKSUM_PARTIAL) { udph->check = 0; udph->check = csum_tcpudp_magic(iph->saddr, iph->daddr, datalen, IPPROTO_UDP, csum_partial((char *)udph, datalen, 0)); - } - + if (!udph->check) + udph->check = -1; + } else + udph->check = nf_proto_csum_update(*pskb, + htons(oldlen) ^ 0xFFFF, + htons(datalen), + udph->check, 1); return 1; } EXPORT_SYMBOL(ip_nat_mangle_udp_packet); @@ -293,11 +313,14 @@ sack_adjust(struct sk_buff *skb, ntohl(sack->start_seq), new_start_seq, ntohl(sack->end_seq), new_end_seq); - tcph->check = - ip_nat_cheat_check(~sack->start_seq, new_start_seq, - ip_nat_cheat_check(~sack->end_seq, - new_end_seq, - tcph->check)); + tcph->check = nf_proto_csum_update(skb, + ~sack->start_seq, + new_start_seq, + tcph->check, 0); + tcph->check = nf_proto_csum_update(skb, + ~sack->end_seq, + new_end_seq, + tcph->check, 0); sack->start_seq = new_start_seq; sack->end_seq = new_end_seq; sackoff += sizeof(*sack); @@ -381,10 +404,10 @@ ip_nat_seq_adjust(struct sk_buff **pskb, newack = ntohl(tcph->ack_seq) - other_way->offset_before; newack = htonl(newack); - tcph->check = ip_nat_cheat_check(~tcph->seq, newseq, - ip_nat_cheat_check(~tcph->ack_seq, - newack, - tcph->check)); + tcph->check = nf_proto_csum_update(*pskb, ~tcph->seq, newseq, + tcph->check, 0); + tcph->check = nf_proto_csum_update(*pskb, ~tcph->ack_seq, newack, + tcph->check, 0); DEBUGP("Adjusting sequence number from %u->%u, ack from %u->%u\n", ntohl(tcph->seq), ntohl(newseq), ntohl(tcph->ack_seq), diff --git a/net/ipv4/netfilter/ip_nat_proto_gre.c b/net/ipv4/netfilter/ip_nat_proto_gre.c index 38acfdf..70a6537 100644 --- a/net/ipv4/netfilter/ip_nat_proto_gre.c +++ b/net/ipv4/netfilter/ip_nat_proto_gre.c @@ -130,9 +130,10 @@ gre_manip_pkt(struct sk_buff **pskb, if (greh->csum) { /* FIXME: Never tested this code... */ *(gre_csum(greh)) = - ip_nat_cheat_check(~*(gre_key(greh)), + nf_proto_csum_update(*pskb, + ~*(gre_key(greh)), tuple->dst.u.gre.key, - *(gre_csum(greh))); + *(gre_csum(greh)), 0); } *(gre_key(greh)) = tuple->dst.u.gre.key; break; diff --git a/net/ipv4/netfilter/ip_nat_proto_icmp.c b/net/ipv4/netfilter/ip_nat_proto_icmp.c index 31a3f4c..ec50cc2 100644 --- a/net/ipv4/netfilter/ip_nat_proto_icmp.c +++ b/net/ipv4/netfilter/ip_nat_proto_icmp.c @@ -66,10 +66,10 @@ icmp_manip_pkt(struct sk_buff **pskb, return 0; hdr = (struct icmphdr *)((*pskb)->data + hdroff); - - hdr->checksum = ip_nat_cheat_check(hdr->un.echo.id ^ 0xFFFF, - tuple->src.u.icmp.id, - hdr->checksum); + hdr->checksum = nf_proto_csum_update(*pskb, + hdr->un.echo.id ^ 0xFFFF, + tuple->src.u.icmp.id, + hdr->checksum, 0); hdr->un.echo.id = tuple->src.u.icmp.id; return 1; } diff --git a/net/ipv4/netfilter/ip_nat_proto_tcp.c b/net/ipv4/netfilter/ip_nat_proto_tcp.c index a3d1407..72a6307 100644 --- a/net/ipv4/netfilter/ip_nat_proto_tcp.c +++ b/net/ipv4/netfilter/ip_nat_proto_tcp.c @@ -129,10 +129,9 @@ tcp_manip_pkt(struct sk_buff **pskb, if (hdrsize < sizeof(*hdr)) return 1; - hdr->check = ip_nat_cheat_check(~oldip, newip, - ip_nat_cheat_check(oldport ^ 0xFFFF, - newport, - hdr->check)); + hdr->check = nf_proto_csum_update(*pskb, ~oldip, newip, hdr->check, 1); + hdr->check = nf_proto_csum_update(*pskb, oldport ^ 0xFFFF, newport, + hdr->check, 0); return 1; } diff --git a/net/ipv4/netfilter/ip_nat_proto_udp.c b/net/ipv4/netfilter/ip_nat_proto_udp.c index ec6053f..5da196a 100644 --- a/net/ipv4/netfilter/ip_nat_proto_udp.c +++ b/net/ipv4/netfilter/ip_nat_proto_udp.c @@ -113,11 +113,16 @@ udp_manip_pkt(struct sk_buff **pskb, newport = tuple->dst.u.udp.port; portptr = &hdr->dest; } - if (hdr->check) /* 0 is a special case meaning no checksum */ - hdr->check = ip_nat_cheat_check(~oldip, newip, - ip_nat_cheat_check(*portptr ^ 0xFFFF, - newport, - hdr->check)); + + if (hdr->check || (*pskb)->ip_summed == CHECKSUM_PARTIAL) { + hdr->check = nf_proto_csum_update(*pskb, ~oldip, newip, + hdr->check, 1); + hdr->check = nf_proto_csum_update(*pskb, + *portptr ^ 0xFFFF, newport, + hdr->check, 0); + if (!hdr->check) + hdr->check = -1; + } *portptr = newport; return 1; } diff --git a/net/ipv4/netfilter/ip_nat_standalone.c b/net/ipv4/netfilter/ip_nat_standalone.c index f4f00c8..f3b7783 100644 --- a/net/ipv4/netfilter/ip_nat_standalone.c +++ b/net/ipv4/netfilter/ip_nat_standalone.c @@ -110,12 +110,6 @@ ip_nat_fn(unsigned int hooknum, IP_NF_ASSERT(!((*pskb)->nh.iph->frag_off & htons(IP_MF|IP_OFFSET))); - /* If we had a hardware checksum before, it's now invalid */ - if ((*pskb)->ip_summed == CHECKSUM_PARTIAL || - (*pskb)->ip_summed == CHECKSUM_COMPLETE) - if (skb_checksum_help(*pskb)) - return NF_DROP; - ct = ip_conntrack_get(*pskb, &ctinfo); /* Can't track? It's not due to stress, or conntrack would have dropped it. Hence it's the user's responsibilty to @@ -146,8 +140,8 @@ ip_nat_fn(unsigned int hooknum, case IP_CT_RELATED: case IP_CT_RELATED+IP_CT_IS_REPLY: if ((*pskb)->nh.iph->protocol == IPPROTO_ICMP) { - if (!ip_nat_icmp_reply_translation(pskb, ct, maniptype, - CTINFO2DIR(ctinfo))) + if (!ip_nat_icmp_reply_translation(ct, ctinfo, + hooknum, pskb)) return NF_DROP; else return NF_ACCEPT; diff --git a/net/ipv4/netfilter/ipt_ECN.c b/net/ipv4/netfilter/ipt_ECN.c index 4ec43f9..35916c7 100644 --- a/net/ipv4/netfilter/ipt_ECN.c +++ b/net/ipv4/netfilter/ipt_ECN.c @@ -52,7 +52,7 @@ static inline int set_ect_tcp(struct sk_buff **pskb, const struct ipt_ECN_info *einfo) { struct tcphdr _tcph, *tcph; - u_int16_t diffs[2]; + u_int16_t oldval; /* Not enought header? */ tcph = skb_header_pointer(*pskb, (*pskb)->nh.iph->ihl*4, @@ -70,23 +70,16 @@ set_ect_tcp(struct sk_buff **pskb, const struct ipt_ECN_info *einfo) return 0; tcph = (void *)(*pskb)->nh.iph + (*pskb)->nh.iph->ihl*4; - if (((*pskb)->ip_summed == CHECKSUM_PARTIAL || - (*pskb)->ip_summed == CHECKSUM_COMPLETE) && - skb_checksum_help(*pskb)) - return 0; - - diffs[0] = ((u_int16_t *)tcph)[6]; + oldval = ((u_int16_t *)tcph)[6]; if (einfo->operation & IPT_ECN_OP_SET_ECE) tcph->ece = einfo->proto.tcp.ece; if (einfo->operation & IPT_ECN_OP_SET_CWR) tcph->cwr = einfo->proto.tcp.cwr; - diffs[1] = ((u_int16_t *)tcph)[6]; - diffs[0] = diffs[0] ^ 0xFFFF; - if ((*pskb)->ip_summed != CHECKSUM_UNNECESSARY) - tcph->check = csum_fold(csum_partial((char *)diffs, - sizeof(diffs), - tcph->check^0xFFFF)); + tcph->check = nf_proto_csum_update((*pskb), + oldval ^ 0xFFFF, + ((u_int16_t *)tcph)[6], + tcph->check, 0); return 1; } diff --git a/net/ipv4/netfilter/ipt_REJECT.c b/net/ipv4/netfilter/ipt_REJECT.c index 7f905bf..95c6662 100644 --- a/net/ipv4/netfilter/ipt_REJECT.c +++ b/net/ipv4/netfilter/ipt_REJECT.c @@ -185,6 +185,7 @@ static void send_reset(struct sk_buff *oldskb, int hook) tcph->urg_ptr = 0; /* Adjust TCP checksum */ + nskb->ip_summed = CHECKSUM_NONE; tcph->check = 0; tcph->check = tcp_v4_check(tcph, sizeof(struct tcphdr), nskb->nh.iph->saddr, diff --git a/net/ipv4/netfilter/ipt_TCPMSS.c b/net/ipv4/netfilter/ipt_TCPMSS.c index c998dc0..0fce85e 100644 --- a/net/ipv4/netfilter/ipt_TCPMSS.c +++ b/net/ipv4/netfilter/ipt_TCPMSS.c @@ -27,14 +27,6 @@ MODULE_DESCRIPTION("iptables TCP MSS modification module"); #define DEBUGP(format, args...) #endif -static u_int16_t -cheat_check(u_int32_t oldvalinv, u_int32_t newval, u_int16_t oldcheck) -{ - u_int32_t diffs[] = { oldvalinv, newval }; - return csum_fold(csum_partial((char *)diffs, sizeof(diffs), - oldcheck^0xFFFF)); -} - static inline unsigned int optlen(const u_int8_t *opt, unsigned int offset) { @@ -62,11 +54,6 @@ ipt_tcpmss_target(struct sk_buff **pskb, if (!skb_make_writable(pskb, (*pskb)->len)) return NF_DROP; - if (((*pskb)->ip_summed == CHECKSUM_PARTIAL || - (*pskb)->ip_summed == CHECKSUM_COMPLETE) && - skb_checksum_help(*pskb)) - return NF_DROP; - iph = (*pskb)->nh.iph; tcplen = (*pskb)->len - iph->ihl*4; @@ -120,9 +107,10 @@ ipt_tcpmss_target(struct sk_buff **pskb, opt[i+2] = (newmss & 0xff00) >> 8; opt[i+3] = (newmss & 0x00ff); - tcph->check = cheat_check(htons(oldmss)^0xFFFF, - htons(newmss), - tcph->check); + tcph->check = nf_proto_csum_update(*pskb, + htons(oldmss)^0xFFFF, + htons(newmss), + tcph->check, 0); DEBUGP(KERN_INFO "ipt_tcpmss_target: %u.%u.%u.%u:%hu" "->%u.%u.%u.%u:%hu changed TCP MSS option" @@ -162,8 +150,10 @@ ipt_tcpmss_target(struct sk_buff **pskb, opt = (u_int8_t *)tcph + sizeof(struct tcphdr); memmove(opt + TCPOLEN_MSS, opt, tcplen - sizeof(struct tcphdr)); - tcph->check = cheat_check(htons(tcplen) ^ 0xFFFF, - htons(tcplen + TCPOLEN_MSS), tcph->check); + tcph->check = nf_proto_csum_update(*pskb, + htons(tcplen) ^ 0xFFFF, + htons(tcplen + TCPOLEN_MSS), + tcph->check, 1); tcplen += TCPOLEN_MSS; opt[0] = TCPOPT_MSS; @@ -171,16 +161,19 @@ ipt_tcpmss_target(struct sk_buff **pskb, opt[2] = (newmss & 0xff00) >> 8; opt[3] = (newmss & 0x00ff); - tcph->check = cheat_check(~0, *((u_int32_t *)opt), tcph->check); + tcph->check = nf_proto_csum_update(*pskb, ~0, *((u_int32_t *)opt), + tcph->check, 0); oldval = ((u_int16_t *)tcph)[6]; tcph->doff += TCPOLEN_MSS/4; - tcph->check = cheat_check(oldval ^ 0xFFFF, - ((u_int16_t *)tcph)[6], tcph->check); + tcph->check = nf_proto_csum_update(*pskb, + oldval ^ 0xFFFF, + ((u_int16_t *)tcph)[6], + tcph->check, 0); newtotlen = htons(ntohs(iph->tot_len) + TCPOLEN_MSS); - iph->check = cheat_check(iph->tot_len ^ 0xFFFF, - newtotlen, iph->check); + iph->check = nf_csum_update(iph->tot_len ^ 0xFFFF, + newtotlen, iph->check); iph->tot_len = newtotlen; DEBUGP(KERN_INFO "ipt_tcpmss_target: %u.%u.%u.%u:%hu" diff --git a/net/netfilter/core.c b/net/netfilter/core.c index 5d29d5e..27f639f 100644 --- a/net/netfilter/core.c +++ b/net/netfilter/core.c @@ -222,6 +222,28 @@ copy_skb: } EXPORT_SYMBOL(skb_make_writable); +u_int16_t nf_csum_update(u_int32_t oldval, u_int32_t newval, u_int32_t csum) +{ + u_int32_t diff[] = { oldval, newval }; + + return csum_fold(csum_partial((char *)diff, sizeof(diff), ~csum)); +} +EXPORT_SYMBOL(nf_csum_update); + +u_int16_t nf_proto_csum_update(struct sk_buff *skb, + u_int32_t oldval, u_int32_t newval, + u_int16_t csum, int pseudohdr) +{ + if (skb->ip_summed != CHECKSUM_PARTIAL) { + csum = nf_csum_update(oldval, newval, csum); + if (skb->ip_summed == CHECKSUM_COMPLETE && pseudohdr) + skb->csum = nf_csum_update(oldval, newval, skb->csum); + } else if (pseudohdr) + csum = ~nf_csum_update(oldval, newval, ~csum); + + return csum; +} +EXPORT_SYMBOL(nf_proto_csum_update); /* This does not belong here, but locally generated errors need it if connection tracking in use: without this, connection may not be in hash table, and hence -- cgit v0.10.2 From 394f545db6e7e4d7a6a2fa3f543b755ca39d58ac Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Sat, 5 Aug 2006 00:58:52 -0700 Subject: [NETFILTER]: nf_queue: handle GSO packets Handle GSO packets in nf_queue by segmenting them before queueing to avoid breaking GSO in case they get mangled. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/netfilter/core.c b/net/netfilter/core.c index 27f639f..d80b935 100644 --- a/net/netfilter/core.c +++ b/net/netfilter/core.c @@ -182,7 +182,7 @@ next_hook: ret = -EPERM; } else if ((verdict & NF_VERDICT_MASK) == NF_QUEUE) { NFDEBUG("nf_hook: Verdict = QUEUE.\n"); - if (!nf_queue(pskb, elem, pf, hook, indev, outdev, okfn, + if (!nf_queue(*pskb, elem, pf, hook, indev, outdev, okfn, verdict >> NF_VERDICT_BITS)) goto next_hook; } diff --git a/net/netfilter/nf_internals.h b/net/netfilter/nf_internals.h index 86e392bf..a981971 100644 --- a/net/netfilter/nf_internals.h +++ b/net/netfilter/nf_internals.h @@ -23,7 +23,7 @@ extern unsigned int nf_iterate(struct list_head *head, int hook_thresh); /* nf_queue.c */ -extern int nf_queue(struct sk_buff **skb, +extern int nf_queue(struct sk_buff *skb, struct list_head *elem, int pf, unsigned int hook, struct net_device *indev, diff --git a/net/netfilter/nf_queue.c b/net/netfilter/nf_queue.c index 662a869..4d8936e 100644 --- a/net/netfilter/nf_queue.c +++ b/net/netfilter/nf_queue.c @@ -74,13 +74,13 @@ EXPORT_SYMBOL_GPL(nf_unregister_queue_handlers); * Any packet that leaves via this function must come back * through nf_reinject(). */ -int nf_queue(struct sk_buff **skb, - struct list_head *elem, - int pf, unsigned int hook, - struct net_device *indev, - struct net_device *outdev, - int (*okfn)(struct sk_buff *), - unsigned int queuenum) +static int __nf_queue(struct sk_buff *skb, + struct list_head *elem, + int pf, unsigned int hook, + struct net_device *indev, + struct net_device *outdev, + int (*okfn)(struct sk_buff *), + unsigned int queuenum) { int status; struct nf_info *info; @@ -94,14 +94,14 @@ int nf_queue(struct sk_buff **skb, read_lock(&queue_handler_lock); if (!queue_handler[pf]) { read_unlock(&queue_handler_lock); - kfree_skb(*skb); + kfree_skb(skb); return 1; } afinfo = nf_get_afinfo(pf); if (!afinfo) { read_unlock(&queue_handler_lock); - kfree_skb(*skb); + kfree_skb(skb); return 1; } @@ -109,9 +109,9 @@ int nf_queue(struct sk_buff **skb, if (!info) { if (net_ratelimit()) printk(KERN_ERR "OOM queueing packet %p\n", - *skb); + skb); read_unlock(&queue_handler_lock); - kfree_skb(*skb); + kfree_skb(skb); return 1; } @@ -130,15 +130,15 @@ int nf_queue(struct sk_buff **skb, if (outdev) dev_hold(outdev); #ifdef CONFIG_BRIDGE_NETFILTER - if ((*skb)->nf_bridge) { - physindev = (*skb)->nf_bridge->physindev; + if (skb->nf_bridge) { + physindev = skb->nf_bridge->physindev; if (physindev) dev_hold(physindev); - physoutdev = (*skb)->nf_bridge->physoutdev; + physoutdev = skb->nf_bridge->physoutdev; if (physoutdev) dev_hold(physoutdev); } #endif - afinfo->saveroute(*skb, info); - status = queue_handler[pf]->outfn(*skb, info, queuenum, + afinfo->saveroute(skb, info); + status = queue_handler[pf]->outfn(skb, info, queuenum, queue_handler[pf]->data); read_unlock(&queue_handler_lock); @@ -153,7 +153,7 @@ int nf_queue(struct sk_buff **skb, #endif module_put(info->elem->owner); kfree(info); - kfree_skb(*skb); + kfree_skb(skb); return 1; } @@ -161,6 +161,46 @@ int nf_queue(struct sk_buff **skb, return 1; } +int nf_queue(struct sk_buff *skb, + struct list_head *elem, + int pf, unsigned int hook, + struct net_device *indev, + struct net_device *outdev, + int (*okfn)(struct sk_buff *), + unsigned int queuenum) +{ + struct sk_buff *segs; + + if (!skb_is_gso(skb)) + return __nf_queue(skb, elem, pf, hook, indev, outdev, okfn, + queuenum); + + switch (pf) { + case AF_INET: + skb->protocol = htons(ETH_P_IP); + break; + case AF_INET6: + skb->protocol = htons(ETH_P_IPV6); + break; + } + + segs = skb_gso_segment(skb, 0); + kfree_skb(skb); + if (unlikely(IS_ERR(segs))) + return 1; + + do { + struct sk_buff *nskb = segs->next; + + segs->next = NULL; + if (!__nf_queue(segs, elem, pf, hook, indev, outdev, okfn, + queuenum)) + kfree_skb(segs); + segs = nskb; + } while (segs); + return 1; +} + void nf_reinject(struct sk_buff *skb, struct nf_info *info, unsigned int verdict) { @@ -224,9 +264,9 @@ void nf_reinject(struct sk_buff *skb, struct nf_info *info, case NF_STOLEN: break; case NF_QUEUE: - if (!nf_queue(&skb, elem, info->pf, info->hook, - info->indev, info->outdev, info->okfn, - verdict >> NF_VERDICT_BITS)) + if (!__nf_queue(skb, elem, info->pf, info->hook, + info->indev, info->outdev, info->okfn, + verdict >> NF_VERDICT_BITS)) goto next_hook; break; default: -- cgit v0.10.2 From d7aba67f814729647c938ac6da2d5224b790f926 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Sat, 5 Aug 2006 02:20:42 -0700 Subject: [IPV6]: Fix thinko in rt6_fill_node This looks like a mistake, the table ID is overwritten again. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 438977e..ff5affe 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1902,7 +1902,6 @@ static int rt6_fill_node(struct sk_buff *skb, struct rt6_info *rt, rtm->rtm_table = rt->rt6i_table->tb6_id; else rtm->rtm_table = RT6_TABLE_UNSPEC; - rtm->rtm_table = RT_TABLE_MAIN; if (rt->rt6i_flags&RTF_REJECT) rtm->rtm_type = RTN_UNREACHABLE; else if (rt->rt6i_dev && (rt->rt6i_dev->flags&IFF_LOOPBACK)) -- cgit v0.10.2 From 6c813a7297e3af4cd7c3458e09e9ee3d161c6830 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Sun, 6 Aug 2006 22:22:47 -0700 Subject: [IPV6]: Fix crash in ip6_del_rt ip6_null_entry doesn't have rt6i_table set, when trying to delete it the kernel crashes dereferencing table->tb6_lock. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv6/route.c b/net/ipv6/route.c index ff5affe..41c5905 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1223,6 +1223,9 @@ int ip6_del_rt(struct rt6_info *rt, struct nlmsghdr *nlh, void *_rtattr, struct int err; struct fib6_table *table; + if (rt == &ip6_null_entry) + return -ENOENT; + table = rt->rt6i_table; write_lock_bh(&table->tb6_lock); -- cgit v0.10.2 From 3226f6881719e61e00e92b4c85a8ef49aa4d42b1 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Sun, 6 Aug 2006 22:24:08 -0700 Subject: [IPV6]: Fix policy routing lookup When the lookup in a table returns ip6_null_entry the policy routing lookup returns it instead of continuing in the next table, which effectively means it only searches the local table. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c index c3c8195..94a46ec 100644 --- a/net/ipv6/fib6_rules.c +++ b/net/ipv6/fib6_rules.c @@ -94,8 +94,10 @@ int fib6_rule_action(struct fib_rule *rule, struct flowi *flp, if (rt != &ip6_null_entry) goto out; - dst_release(&rt->u.dst); + rt = NULL; + goto out; + discard_pkt: dst_hold(&rt->u.dst); out: -- cgit v0.10.2 From a14a49d2b7b9290e87751f21f503f1954267d4c4 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 7 Aug 2006 17:53:08 -0700 Subject: [NEIGH]: Convert neighbour deletion to new netlink api Fixes: Return ENOENT if the neighbour is not found (was EINVAL) Return EAFNOSUPPORT if no table matches the specified address family. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/core/neighbour.c b/net/core/neighbour.c index fe2113f..39c07cc 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -30,6 +30,7 @@ #include #include #include +#include #include #include #include @@ -1440,48 +1441,62 @@ int neigh_table_clear(struct neigh_table *tbl) int neigh_delete(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) { - struct ndmsg *ndm = NLMSG_DATA(nlh); - struct rtattr **nda = arg; + struct ndmsg *ndm; + struct nlattr *dst_attr; struct neigh_table *tbl; struct net_device *dev = NULL; - int err = -ENODEV; + int err = -EINVAL; - if (ndm->ndm_ifindex && - (dev = dev_get_by_index(ndm->ndm_ifindex)) == NULL) + if (nlmsg_len(nlh) < sizeof(*ndm)) + goto out; + + dst_attr = nlmsg_find_attr(nlh, sizeof(*ndm), NDA_DST); + if (dst_attr == NULL) goto out; + ndm = nlmsg_data(nlh); + if (ndm->ndm_ifindex) { + dev = dev_get_by_index(ndm->ndm_ifindex); + if (dev == NULL) { + err = -ENODEV; + goto out; + } + } + read_lock(&neigh_tbl_lock); for (tbl = neigh_tables; tbl; tbl = tbl->next) { - struct rtattr *dst_attr = nda[NDA_DST - 1]; - struct neighbour *n; + struct neighbour *neigh; if (tbl->family != ndm->ndm_family) continue; read_unlock(&neigh_tbl_lock); - err = -EINVAL; - if (!dst_attr || RTA_PAYLOAD(dst_attr) < tbl->key_len) + if (nla_len(dst_attr) < tbl->key_len) goto out_dev_put; if (ndm->ndm_flags & NTF_PROXY) { - err = pneigh_delete(tbl, RTA_DATA(dst_attr), dev); + err = pneigh_delete(tbl, nla_data(dst_attr), dev); goto out_dev_put; } - if (!dev) - goto out; + if (dev == NULL) + goto out_dev_put; - n = neigh_lookup(tbl, RTA_DATA(dst_attr), dev); - if (n) { - err = neigh_update(n, NULL, NUD_FAILED, - NEIGH_UPDATE_F_OVERRIDE| - NEIGH_UPDATE_F_ADMIN); - neigh_release(n); + neigh = neigh_lookup(tbl, nla_data(dst_attr), dev); + if (neigh == NULL) { + err = -ENOENT; + goto out_dev_put; } + + err = neigh_update(neigh, NULL, NUD_FAILED, + NEIGH_UPDATE_F_OVERRIDE | + NEIGH_UPDATE_F_ADMIN); + neigh_release(neigh); goto out_dev_put; } read_unlock(&neigh_tbl_lock); - err = -EADDRNOTAVAIL; + err = -EAFNOSUPPORT; + out_dev_put: if (dev) dev_put(dev); -- cgit v0.10.2 From 5208debd0f1da07bbb350f8b0b142775d4f002ea Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 7 Aug 2006 17:55:40 -0700 Subject: [NEIGH]: Convert neighbour addition to new netlink api Fixes: Return EAFNOSUPPORT if no table matches the specified address family. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/core/neighbour.c b/net/core/neighbour.c index 39c07cc..6036f43 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -1506,76 +1506,88 @@ out: int neigh_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) { - struct ndmsg *ndm = NLMSG_DATA(nlh); - struct rtattr **nda = arg; + struct ndmsg *ndm; + struct nlattr *tb[NDA_MAX+1]; struct neigh_table *tbl; struct net_device *dev = NULL; - int err = -ENODEV; + int err; + + err = nlmsg_parse(nlh, sizeof(*ndm), tb, NDA_MAX, NULL); + if (err < 0) + goto out; - if (ndm->ndm_ifindex && - (dev = dev_get_by_index(ndm->ndm_ifindex)) == NULL) + err = -EINVAL; + if (tb[NDA_DST] == NULL) goto out; + ndm = nlmsg_data(nlh); + if (ndm->ndm_ifindex) { + dev = dev_get_by_index(ndm->ndm_ifindex); + if (dev == NULL) { + err = -ENODEV; + goto out; + } + + if (tb[NDA_LLADDR] && nla_len(tb[NDA_LLADDR]) < dev->addr_len) + goto out_dev_put; + } + read_lock(&neigh_tbl_lock); for (tbl = neigh_tables; tbl; tbl = tbl->next) { - struct rtattr *lladdr_attr = nda[NDA_LLADDR - 1]; - struct rtattr *dst_attr = nda[NDA_DST - 1]; - int override = 1; - struct neighbour *n; + int flags = NEIGH_UPDATE_F_ADMIN | NEIGH_UPDATE_F_OVERRIDE; + struct neighbour *neigh; + void *dst, *lladdr; if (tbl->family != ndm->ndm_family) continue; read_unlock(&neigh_tbl_lock); - err = -EINVAL; - if (!dst_attr || RTA_PAYLOAD(dst_attr) < tbl->key_len) + if (nla_len(tb[NDA_DST]) < tbl->key_len) goto out_dev_put; + dst = nla_data(tb[NDA_DST]); + lladdr = tb[NDA_LLADDR] ? nla_data(tb[NDA_LLADDR]) : NULL; if (ndm->ndm_flags & NTF_PROXY) { - err = -ENOBUFS; - if (pneigh_lookup(tbl, RTA_DATA(dst_attr), dev, 1)) - err = 0; + err = 0; + if (pneigh_lookup(tbl, dst, dev, 1) == NULL) + err = -ENOBUFS; goto out_dev_put; } - err = -EINVAL; - if (!dev) - goto out; - if (lladdr_attr && RTA_PAYLOAD(lladdr_attr) < dev->addr_len) + if (dev == NULL) goto out_dev_put; + + neigh = neigh_lookup(tbl, dst, dev); + if (neigh == NULL) { + if (!(nlh->nlmsg_flags & NLM_F_CREATE)) { + err = -ENOENT; + goto out_dev_put; + } - n = neigh_lookup(tbl, RTA_DATA(dst_attr), dev); - if (n) { - if (nlh->nlmsg_flags & NLM_F_EXCL) { - err = -EEXIST; - neigh_release(n); + neigh = __neigh_lookup_errno(tbl, dst, dev); + if (IS_ERR(neigh)) { + err = PTR_ERR(neigh); goto out_dev_put; } - - override = nlh->nlmsg_flags & NLM_F_REPLACE; - } else if (!(nlh->nlmsg_flags & NLM_F_CREATE)) { - err = -ENOENT; - goto out_dev_put; } else { - n = __neigh_lookup_errno(tbl, RTA_DATA(dst_attr), dev); - if (IS_ERR(n)) { - err = PTR_ERR(n); + if (nlh->nlmsg_flags & NLM_F_EXCL) { + err = -EEXIST; + neigh_release(neigh); goto out_dev_put; } - } - err = neigh_update(n, - lladdr_attr ? RTA_DATA(lladdr_attr) : NULL, - ndm->ndm_state, - (override ? NEIGH_UPDATE_F_OVERRIDE : 0) | - NEIGH_UPDATE_F_ADMIN); + if (!(nlh->nlmsg_flags & NLM_F_REPLACE)) + flags &= ~NEIGH_UPDATE_F_OVERRIDE; + } - neigh_release(n); + err = neigh_update(neigh, lladdr, ndm->ndm_state, flags); + neigh_release(neigh); goto out_dev_put; } read_unlock(&neigh_tbl_lock); - err = -EADDRNOTAVAIL; + err = -EAFNOSUPPORT; + out_dev_put: if (dev) dev_put(dev); -- cgit v0.10.2 From 8b8aec508302d4e63fd88f47894805115277f70f Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 7 Aug 2006 17:56:37 -0700 Subject: [NEIGH]: Convert neighbour dumping to new netlink api Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/core/neighbour.c b/net/core/neighbour.c index 6036f43..5490afd 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -1901,48 +1901,49 @@ out: return skb->len; } -static int neigh_fill_info(struct sk_buff *skb, struct neighbour *n, - u32 pid, u32 seq, int event, unsigned int flags) +static int neigh_fill_info(struct sk_buff *skb, struct neighbour *neigh, + u32 pid, u32 seq, int type, unsigned int flags) { unsigned long now = jiffies; - unsigned char *b = skb->tail; struct nda_cacheinfo ci; - int locked = 0; - u32 probes; - struct nlmsghdr *nlh = NLMSG_NEW(skb, pid, seq, event, - sizeof(struct ndmsg), flags); - struct ndmsg *ndm = NLMSG_DATA(nlh); + struct nlmsghdr *nlh; + struct ndmsg *ndm; + + nlh = nlmsg_put(skb, pid, seq, type, sizeof(*ndm), flags); + if (nlh == NULL) + return -ENOBUFS; - ndm->ndm_family = n->ops->family; + ndm = nlmsg_data(nlh); + ndm->ndm_family = neigh->ops->family; ndm->ndm_pad1 = 0; ndm->ndm_pad2 = 0; - ndm->ndm_flags = n->flags; - ndm->ndm_type = n->type; - ndm->ndm_ifindex = n->dev->ifindex; - RTA_PUT(skb, NDA_DST, n->tbl->key_len, n->primary_key); - read_lock_bh(&n->lock); - locked = 1; - ndm->ndm_state = n->nud_state; - if (n->nud_state & NUD_VALID) - RTA_PUT(skb, NDA_LLADDR, n->dev->addr_len, n->ha); - ci.ndm_used = now - n->used; - ci.ndm_confirmed = now - n->confirmed; - ci.ndm_updated = now - n->updated; - ci.ndm_refcnt = atomic_read(&n->refcnt) - 1; - probes = atomic_read(&n->probes); - read_unlock_bh(&n->lock); - locked = 0; - RTA_PUT(skb, NDA_CACHEINFO, sizeof(ci), &ci); - RTA_PUT(skb, NDA_PROBES, sizeof(probes), &probes); - nlh->nlmsg_len = skb->tail - b; - return skb->len; + ndm->ndm_flags = neigh->flags; + ndm->ndm_type = neigh->type; + ndm->ndm_ifindex = neigh->dev->ifindex; -nlmsg_failure: -rtattr_failure: - if (locked) - read_unlock_bh(&n->lock); - skb_trim(skb, b - skb->data); - return -1; + NLA_PUT(skb, NDA_DST, neigh->tbl->key_len, neigh->primary_key); + + read_lock_bh(&neigh->lock); + ndm->ndm_state = neigh->nud_state; + if ((neigh->nud_state & NUD_VALID) && + nla_put(skb, NDA_LLADDR, neigh->dev->addr_len, neigh->ha) < 0) { + read_unlock_bh(&neigh->lock); + goto nla_put_failure; + } + + ci.ndm_used = now - neigh->used; + ci.ndm_confirmed = now - neigh->confirmed; + ci.ndm_updated = now - neigh->updated; + ci.ndm_refcnt = atomic_read(&neigh->refcnt) - 1; + read_unlock_bh(&neigh->lock); + + NLA_PUT_U32(skb, NDA_PROBES, atomic_read(&neigh->probes)); + NLA_PUT(skb, NDA_CACHEINFO, sizeof(ci), &ci); + + return nlmsg_end(skb, nlh); + +nla_put_failure: + return nlmsg_cancel(skb, nlh); } @@ -1986,7 +1987,7 @@ int neigh_dump_info(struct sk_buff *skb, struct netlink_callback *cb) int t, family, s_t; read_lock(&neigh_tbl_lock); - family = ((struct rtgenmsg *)NLMSG_DATA(cb->nlh))->rtgen_family; + family = ((struct rtgenmsg *) nlmsg_data(cb->nlh))->rtgen_family; s_t = cb->args[0]; for (tbl = neigh_tables, t = 0; tbl; tbl = tbl->next, t++) { @@ -2367,39 +2368,34 @@ static struct file_operations neigh_stat_seq_fops = { #ifdef CONFIG_ARPD void neigh_app_ns(struct neighbour *n) { - struct nlmsghdr *nlh; - int size = NLMSG_SPACE(sizeof(struct ndmsg) + 256); - struct sk_buff *skb = alloc_skb(size, GFP_ATOMIC); + struct sk_buff *skb; - if (!skb) + skb = nlmsg_new(NLMSG_GOODSIZE, GFP_ATOMIC); + if (skb == NULL) return; - if (neigh_fill_info(skb, n, 0, 0, RTM_GETNEIGH, 0) < 0) { + if (neigh_fill_info(skb, n, 0, 0, RTM_GETNEIGH, NLM_F_REQUEST) <= 0) kfree_skb(skb); - return; + else { + NETLINK_CB(skb).dst_group = RTNLGRP_NEIGH; + netlink_broadcast(rtnl, skb, 0, RTNLGRP_NEIGH, GFP_ATOMIC); } - nlh = (struct nlmsghdr *)skb->data; - nlh->nlmsg_flags = NLM_F_REQUEST; - NETLINK_CB(skb).dst_group = RTNLGRP_NEIGH; - netlink_broadcast(rtnl, skb, 0, RTNLGRP_NEIGH, GFP_ATOMIC); } static void neigh_app_notify(struct neighbour *n) { - struct nlmsghdr *nlh; - int size = NLMSG_SPACE(sizeof(struct ndmsg) + 256); - struct sk_buff *skb = alloc_skb(size, GFP_ATOMIC); + struct sk_buff *skb; - if (!skb) + skb = nlmsg_new(NLMSG_GOODSIZE, GFP_ATOMIC); + if (skb == NULL) return; - if (neigh_fill_info(skb, n, 0, 0, RTM_NEWNEIGH, 0) < 0) { + if (neigh_fill_info(skb, n, 0, 0, RTM_NEWNEIGH, 0) <= 0) kfree_skb(skb); - return; + else { + NETLINK_CB(skb).dst_group = RTNLGRP_NEIGH; + netlink_broadcast(rtnl, skb, 0, RTNLGRP_NEIGH, GFP_ATOMIC); } - nlh = (struct nlmsghdr *)skb->data; - NETLINK_CB(skb).dst_group = RTNLGRP_NEIGH; - netlink_broadcast(rtnl, skb, 0, RTNLGRP_NEIGH, GFP_ATOMIC); } #endif /* CONFIG_ARPD */ -- cgit v0.10.2 From 9067c722cf6930adf1df2d169de9094dd90b0c33 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 7 Aug 2006 17:57:44 -0700 Subject: [NEIGH]: Move netlink neighbour bits to linux/neighbour.h Moves netlink neighbour bits to linux/neighbour.h. Also moves bits to be exported to userspace from net/neighbour.h to linux/neighbour.h and removes __KERNEL__ guards, userspace is not supposed to be using it. rtnetlink_rcv_msg() is not longer required to parse attributes for the neighbour layer, remove dependency on obsolete and buggy rta_buf. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/linux/neighbour.h b/include/linux/neighbour.h new file mode 100644 index 0000000..8e8293d --- /dev/null +++ b/include/linux/neighbour.h @@ -0,0 +1,65 @@ +#ifndef __LINUX_NEIGHBOUR_H +#define __LINUX_NEIGHBOUR_H + +#include + +struct ndmsg +{ + __u8 ndm_family; + __u8 ndm_pad1; + __u16 ndm_pad2; + __s32 ndm_ifindex; + __u16 ndm_state; + __u8 ndm_flags; + __u8 ndm_type; +}; + +enum +{ + NDA_UNSPEC, + NDA_DST, + NDA_LLADDR, + NDA_CACHEINFO, + NDA_PROBES, + __NDA_MAX +}; + +#define NDA_MAX (__NDA_MAX - 1) + +/* + * Neighbor Cache Entry Flags + */ + +#define NTF_PROXY 0x08 /* == ATF_PUBL */ +#define NTF_ROUTER 0x80 + +/* + * Neighbor Cache Entry States. + */ + +#define NUD_INCOMPLETE 0x01 +#define NUD_REACHABLE 0x02 +#define NUD_STALE 0x04 +#define NUD_DELAY 0x08 +#define NUD_PROBE 0x10 +#define NUD_FAILED 0x20 + +/* Dummy states */ +#define NUD_NOARP 0x40 +#define NUD_PERMANENT 0x80 +#define NUD_NONE 0x00 + +/* NUD_NOARP & NUD_PERMANENT are pseudostates, they never change + and make no address resolution or NUD. + NUD_PERMANENT is also cannot be deleted by garbage collectors. + */ + +struct nda_cacheinfo +{ + __u32 ndm_confirmed; + __u32 ndm_used; + __u32 ndm_updated; + __u32 ndm_refcnt; +}; + +#endif diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h index 84f3eb4..9750f02 100644 --- a/include/linux/rtnetlink.h +++ b/include/linux/rtnetlink.h @@ -386,69 +386,6 @@ struct rta_session -/************************************************************** - * Neighbour discovery. - ****/ - -struct ndmsg -{ - unsigned char ndm_family; - unsigned char ndm_pad1; - unsigned short ndm_pad2; - int ndm_ifindex; /* Link index */ - __u16 ndm_state; - __u8 ndm_flags; - __u8 ndm_type; -}; - -enum -{ - NDA_UNSPEC, - NDA_DST, - NDA_LLADDR, - NDA_CACHEINFO, - NDA_PROBES, - __NDA_MAX -}; - -#define NDA_MAX (__NDA_MAX - 1) - -#define NDA_RTA(r) ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct ndmsg)))) -#define NDA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct ndmsg)) - -/* - * Neighbor Cache Entry Flags - */ - -#define NTF_PROXY 0x08 /* == ATF_PUBL */ -#define NTF_ROUTER 0x80 - -/* - * Neighbor Cache Entry States. - */ - -#define NUD_INCOMPLETE 0x01 -#define NUD_REACHABLE 0x02 -#define NUD_STALE 0x04 -#define NUD_DELAY 0x08 -#define NUD_PROBE 0x10 -#define NUD_FAILED 0x20 - -/* Dummy states */ -#define NUD_NOARP 0x40 -#define NUD_PERMANENT 0x80 -#define NUD_NONE 0x00 - - -struct nda_cacheinfo -{ - __u32 ndm_confirmed; - __u32 ndm_used; - __u32 ndm_updated; - __u32 ndm_refcnt; -}; - - /***************************************************************** * Neighbour tables specific messages. * diff --git a/include/net/neighbour.h b/include/net/neighbour.h index 4901ee4..74c4b6f 100644 --- a/include/net/neighbour.h +++ b/include/net/neighbour.h @@ -1,6 +1,8 @@ #ifndef _NET_NEIGHBOUR_H #define _NET_NEIGHBOUR_H +#include + /* * Generic neighbour manipulation * @@ -14,40 +16,6 @@ * - Add neighbour cache statistics like rtstat */ -/* The following flags & states are exported to user space, - so that they should be moved to include/linux/ directory. - */ - -/* - * Neighbor Cache Entry Flags - */ - -#define NTF_PROXY 0x08 /* == ATF_PUBL */ -#define NTF_ROUTER 0x80 - -/* - * Neighbor Cache Entry States. - */ - -#define NUD_INCOMPLETE 0x01 -#define NUD_REACHABLE 0x02 -#define NUD_STALE 0x04 -#define NUD_DELAY 0x08 -#define NUD_PROBE 0x10 -#define NUD_FAILED 0x20 - -/* Dummy states */ -#define NUD_NOARP 0x40 -#define NUD_PERMANENT 0x80 -#define NUD_NONE 0x00 - -/* NUD_NOARP & NUD_PERMANENT are pseudostates, they never change - and make no address resolution or NUD. - NUD_PERMANENT is also cannot be deleted by garbage collectors. - */ - -#ifdef __KERNEL__ - #include #include #include @@ -374,6 +342,3 @@ struct neighbour_cb { #define NEIGH_CB(skb) ((struct neighbour_cb *)(skb)->cb) #endif -#endif - - diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 93ba04f..78ccbd4 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -104,7 +104,6 @@ static const int rtm_min[RTM_NR_FAMILIES] = [RTM_FAM(RTM_NEWLINK)] = NLMSG_LENGTH(sizeof(struct ifinfomsg)), [RTM_FAM(RTM_NEWADDR)] = NLMSG_LENGTH(sizeof(struct ifaddrmsg)), [RTM_FAM(RTM_NEWROUTE)] = NLMSG_LENGTH(sizeof(struct rtmsg)), - [RTM_FAM(RTM_NEWNEIGH)] = NLMSG_LENGTH(sizeof(struct ndmsg)), [RTM_FAM(RTM_NEWRULE)] = NLMSG_LENGTH(sizeof(struct fib_rule_hdr)), [RTM_FAM(RTM_NEWQDISC)] = NLMSG_LENGTH(sizeof(struct tcmsg)), [RTM_FAM(RTM_NEWTCLASS)] = NLMSG_LENGTH(sizeof(struct tcmsg)), @@ -121,7 +120,6 @@ static const int rta_max[RTM_NR_FAMILIES] = [RTM_FAM(RTM_NEWLINK)] = IFLA_MAX, [RTM_FAM(RTM_NEWADDR)] = IFA_MAX, [RTM_FAM(RTM_NEWROUTE)] = RTA_MAX, - [RTM_FAM(RTM_NEWNEIGH)] = NDA_MAX, [RTM_FAM(RTM_NEWRULE)] = FRA_MAX, [RTM_FAM(RTM_NEWQDISC)] = TCA_MAX, [RTM_FAM(RTM_NEWTCLASS)] = TCA_MAX, -- cgit v0.10.2 From 6b3f8674bccbb2e784d01e44373fb730af6cb149 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 7 Aug 2006 17:58:53 -0700 Subject: [NEIGH]: Convert neighbour table modification to new netlink api Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/core/neighbour.c b/net/core/neighbour.c index 5490afd..5a0b8f4 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -1754,28 +1754,61 @@ static inline struct neigh_parms *lookup_neigh_params(struct neigh_table *tbl, return NULL; } +static struct nla_policy nl_neightbl_policy[NDTA_MAX+1] __read_mostly = { + [NDTA_NAME] = { .type = NLA_STRING }, + [NDTA_THRESH1] = { .type = NLA_U32 }, + [NDTA_THRESH2] = { .type = NLA_U32 }, + [NDTA_THRESH3] = { .type = NLA_U32 }, + [NDTA_GC_INTERVAL] = { .type = NLA_U64 }, + [NDTA_PARMS] = { .type = NLA_NESTED }, +}; + +static struct nla_policy nl_ntbl_parm_policy[NDTPA_MAX+1] __read_mostly = { + [NDTPA_IFINDEX] = { .type = NLA_U32 }, + [NDTPA_QUEUE_LEN] = { .type = NLA_U32 }, + [NDTPA_PROXY_QLEN] = { .type = NLA_U32 }, + [NDTPA_APP_PROBES] = { .type = NLA_U32 }, + [NDTPA_UCAST_PROBES] = { .type = NLA_U32 }, + [NDTPA_MCAST_PROBES] = { .type = NLA_U32 }, + [NDTPA_BASE_REACHABLE_TIME] = { .type = NLA_U64 }, + [NDTPA_GC_STALETIME] = { .type = NLA_U64 }, + [NDTPA_DELAY_PROBE_TIME] = { .type = NLA_U64 }, + [NDTPA_RETRANS_TIME] = { .type = NLA_U64 }, + [NDTPA_ANYCAST_DELAY] = { .type = NLA_U64 }, + [NDTPA_PROXY_DELAY] = { .type = NLA_U64 }, + [NDTPA_LOCKTIME] = { .type = NLA_U64 }, +}; + int neightbl_set(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) { struct neigh_table *tbl; - struct ndtmsg *ndtmsg = NLMSG_DATA(nlh); - struct rtattr **tb = arg; - int err = -EINVAL; + struct ndtmsg *ndtmsg; + struct nlattr *tb[NDTA_MAX+1]; + int err; - if (!tb[NDTA_NAME - 1] || !RTA_PAYLOAD(tb[NDTA_NAME - 1])) - return -EINVAL; + err = nlmsg_parse(nlh, sizeof(*ndtmsg), tb, NDTA_MAX, + nl_neightbl_policy); + if (err < 0) + goto errout; + + if (tb[NDTA_NAME] == NULL) { + err = -EINVAL; + goto errout; + } + ndtmsg = nlmsg_data(nlh); read_lock(&neigh_tbl_lock); for (tbl = neigh_tables; tbl; tbl = tbl->next) { if (ndtmsg->ndtm_family && tbl->family != ndtmsg->ndtm_family) continue; - if (!rtattr_strcmp(tb[NDTA_NAME - 1], tbl->id)) + if (nla_strcmp(tb[NDTA_NAME], tbl->id) == 0) break; } if (tbl == NULL) { err = -ENOENT; - goto errout; + goto errout_locked; } /* @@ -1784,86 +1817,89 @@ int neightbl_set(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) */ write_lock_bh(&tbl->lock); - if (tb[NDTA_THRESH1 - 1]) - tbl->gc_thresh1 = RTA_GET_U32(tb[NDTA_THRESH1 - 1]); - - if (tb[NDTA_THRESH2 - 1]) - tbl->gc_thresh2 = RTA_GET_U32(tb[NDTA_THRESH2 - 1]); - - if (tb[NDTA_THRESH3 - 1]) - tbl->gc_thresh3 = RTA_GET_U32(tb[NDTA_THRESH3 - 1]); - - if (tb[NDTA_GC_INTERVAL - 1]) - tbl->gc_interval = RTA_GET_MSECS(tb[NDTA_GC_INTERVAL - 1]); - - if (tb[NDTA_PARMS - 1]) { - struct rtattr *tbp[NDTPA_MAX]; + if (tb[NDTA_PARMS]) { + struct nlattr *tbp[NDTPA_MAX+1]; struct neigh_parms *p; - u32 ifindex = 0; + int i, ifindex = 0; - if (rtattr_parse_nested(tbp, NDTPA_MAX, tb[NDTA_PARMS - 1]) < 0) - goto rtattr_failure; + err = nla_parse_nested(tbp, NDTPA_MAX, tb[NDTA_PARMS], + nl_ntbl_parm_policy); + if (err < 0) + goto errout_tbl_lock; - if (tbp[NDTPA_IFINDEX - 1]) - ifindex = RTA_GET_U32(tbp[NDTPA_IFINDEX - 1]); + if (tbp[NDTPA_IFINDEX]) + ifindex = nla_get_u32(tbp[NDTPA_IFINDEX]); p = lookup_neigh_params(tbl, ifindex); if (p == NULL) { err = -ENOENT; - goto rtattr_failure; + goto errout_tbl_lock; } - - if (tbp[NDTPA_QUEUE_LEN - 1]) - p->queue_len = RTA_GET_U32(tbp[NDTPA_QUEUE_LEN - 1]); - - if (tbp[NDTPA_PROXY_QLEN - 1]) - p->proxy_qlen = RTA_GET_U32(tbp[NDTPA_PROXY_QLEN - 1]); - - if (tbp[NDTPA_APP_PROBES - 1]) - p->app_probes = RTA_GET_U32(tbp[NDTPA_APP_PROBES - 1]); - - if (tbp[NDTPA_UCAST_PROBES - 1]) - p->ucast_probes = - RTA_GET_U32(tbp[NDTPA_UCAST_PROBES - 1]); - if (tbp[NDTPA_MCAST_PROBES - 1]) - p->mcast_probes = - RTA_GET_U32(tbp[NDTPA_MCAST_PROBES - 1]); - - if (tbp[NDTPA_BASE_REACHABLE_TIME - 1]) - p->base_reachable_time = - RTA_GET_MSECS(tbp[NDTPA_BASE_REACHABLE_TIME - 1]); - - if (tbp[NDTPA_GC_STALETIME - 1]) - p->gc_staletime = - RTA_GET_MSECS(tbp[NDTPA_GC_STALETIME - 1]); + for (i = 1; i <= NDTPA_MAX; i++) { + if (tbp[i] == NULL) + continue; - if (tbp[NDTPA_DELAY_PROBE_TIME - 1]) - p->delay_probe_time = - RTA_GET_MSECS(tbp[NDTPA_DELAY_PROBE_TIME - 1]); + switch (i) { + case NDTPA_QUEUE_LEN: + p->queue_len = nla_get_u32(tbp[i]); + break; + case NDTPA_PROXY_QLEN: + p->proxy_qlen = nla_get_u32(tbp[i]); + break; + case NDTPA_APP_PROBES: + p->app_probes = nla_get_u32(tbp[i]); + break; + case NDTPA_UCAST_PROBES: + p->ucast_probes = nla_get_u32(tbp[i]); + break; + case NDTPA_MCAST_PROBES: + p->mcast_probes = nla_get_u32(tbp[i]); + break; + case NDTPA_BASE_REACHABLE_TIME: + p->base_reachable_time = nla_get_msecs(tbp[i]); + break; + case NDTPA_GC_STALETIME: + p->gc_staletime = nla_get_msecs(tbp[i]); + break; + case NDTPA_DELAY_PROBE_TIME: + p->delay_probe_time = nla_get_msecs(tbp[i]); + break; + case NDTPA_RETRANS_TIME: + p->retrans_time = nla_get_msecs(tbp[i]); + break; + case NDTPA_ANYCAST_DELAY: + p->anycast_delay = nla_get_msecs(tbp[i]); + break; + case NDTPA_PROXY_DELAY: + p->proxy_delay = nla_get_msecs(tbp[i]); + break; + case NDTPA_LOCKTIME: + p->locktime = nla_get_msecs(tbp[i]); + break; + } + } + } - if (tbp[NDTPA_RETRANS_TIME - 1]) - p->retrans_time = - RTA_GET_MSECS(tbp[NDTPA_RETRANS_TIME - 1]); + if (tb[NDTA_THRESH1]) + tbl->gc_thresh1 = nla_get_u32(tb[NDTA_THRESH1]); - if (tbp[NDTPA_ANYCAST_DELAY - 1]) - p->anycast_delay = - RTA_GET_MSECS(tbp[NDTPA_ANYCAST_DELAY - 1]); + if (tb[NDTA_THRESH2]) + tbl->gc_thresh2 = nla_get_u32(tb[NDTA_THRESH2]); - if (tbp[NDTPA_PROXY_DELAY - 1]) - p->proxy_delay = - RTA_GET_MSECS(tbp[NDTPA_PROXY_DELAY - 1]); + if (tb[NDTA_THRESH3]) + tbl->gc_thresh3 = nla_get_u32(tb[NDTA_THRESH3]); - if (tbp[NDTPA_LOCKTIME - 1]) - p->locktime = RTA_GET_MSECS(tbp[NDTPA_LOCKTIME - 1]); - } + if (tb[NDTA_GC_INTERVAL]) + tbl->gc_interval = nla_get_msecs(tb[NDTA_GC_INTERVAL]); err = 0; -rtattr_failure: +errout_tbl_lock: write_unlock_bh(&tbl->lock); -errout: +errout_locked: read_unlock(&neigh_tbl_lock); +errout: return err; } -- cgit v0.10.2 From ca860fb39b4aa1479e2fea67435a2c1eac9ce789 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 7 Aug 2006 18:00:18 -0700 Subject: [NEIGH]: Convert neighbour table dumping to new netlink api Also fixes skipping of already dumped neighbours. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/core/neighbour.c b/net/core/neighbour.c index 5a0b8f4..2f4e06a 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -1597,56 +1597,59 @@ out: static int neightbl_fill_parms(struct sk_buff *skb, struct neigh_parms *parms) { - struct rtattr *nest = NULL; - - nest = RTA_NEST(skb, NDTA_PARMS); + struct nlattr *nest; + + nest = nla_nest_start(skb, NDTA_PARMS); + if (nest == NULL) + return -ENOBUFS; if (parms->dev) - RTA_PUT_U32(skb, NDTPA_IFINDEX, parms->dev->ifindex); - - RTA_PUT_U32(skb, NDTPA_REFCNT, atomic_read(&parms->refcnt)); - RTA_PUT_U32(skb, NDTPA_QUEUE_LEN, parms->queue_len); - RTA_PUT_U32(skb, NDTPA_PROXY_QLEN, parms->proxy_qlen); - RTA_PUT_U32(skb, NDTPA_APP_PROBES, parms->app_probes); - RTA_PUT_U32(skb, NDTPA_UCAST_PROBES, parms->ucast_probes); - RTA_PUT_U32(skb, NDTPA_MCAST_PROBES, parms->mcast_probes); - RTA_PUT_MSECS(skb, NDTPA_REACHABLE_TIME, parms->reachable_time); - RTA_PUT_MSECS(skb, NDTPA_BASE_REACHABLE_TIME, + NLA_PUT_U32(skb, NDTPA_IFINDEX, parms->dev->ifindex); + + NLA_PUT_U32(skb, NDTPA_REFCNT, atomic_read(&parms->refcnt)); + NLA_PUT_U32(skb, NDTPA_QUEUE_LEN, parms->queue_len); + NLA_PUT_U32(skb, NDTPA_PROXY_QLEN, parms->proxy_qlen); + NLA_PUT_U32(skb, NDTPA_APP_PROBES, parms->app_probes); + NLA_PUT_U32(skb, NDTPA_UCAST_PROBES, parms->ucast_probes); + NLA_PUT_U32(skb, NDTPA_MCAST_PROBES, parms->mcast_probes); + NLA_PUT_MSECS(skb, NDTPA_REACHABLE_TIME, parms->reachable_time); + NLA_PUT_MSECS(skb, NDTPA_BASE_REACHABLE_TIME, parms->base_reachable_time); - RTA_PUT_MSECS(skb, NDTPA_GC_STALETIME, parms->gc_staletime); - RTA_PUT_MSECS(skb, NDTPA_DELAY_PROBE_TIME, parms->delay_probe_time); - RTA_PUT_MSECS(skb, NDTPA_RETRANS_TIME, parms->retrans_time); - RTA_PUT_MSECS(skb, NDTPA_ANYCAST_DELAY, parms->anycast_delay); - RTA_PUT_MSECS(skb, NDTPA_PROXY_DELAY, parms->proxy_delay); - RTA_PUT_MSECS(skb, NDTPA_LOCKTIME, parms->locktime); + NLA_PUT_MSECS(skb, NDTPA_GC_STALETIME, parms->gc_staletime); + NLA_PUT_MSECS(skb, NDTPA_DELAY_PROBE_TIME, parms->delay_probe_time); + NLA_PUT_MSECS(skb, NDTPA_RETRANS_TIME, parms->retrans_time); + NLA_PUT_MSECS(skb, NDTPA_ANYCAST_DELAY, parms->anycast_delay); + NLA_PUT_MSECS(skb, NDTPA_PROXY_DELAY, parms->proxy_delay); + NLA_PUT_MSECS(skb, NDTPA_LOCKTIME, parms->locktime); - return RTA_NEST_END(skb, nest); + return nla_nest_end(skb, nest); -rtattr_failure: - return RTA_NEST_CANCEL(skb, nest); +nla_put_failure: + return nla_nest_cancel(skb, nest); } -static int neightbl_fill_info(struct neigh_table *tbl, struct sk_buff *skb, - struct netlink_callback *cb) +static int neightbl_fill_info(struct sk_buff *skb, struct neigh_table *tbl, + u32 pid, u32 seq, int type, int flags) { struct nlmsghdr *nlh; struct ndtmsg *ndtmsg; - nlh = NLMSG_NEW_ANSWER(skb, cb, RTM_NEWNEIGHTBL, sizeof(struct ndtmsg), - NLM_F_MULTI); + nlh = nlmsg_put(skb, pid, seq, type, sizeof(*ndtmsg), flags); + if (nlh == NULL) + return -ENOBUFS; - ndtmsg = NLMSG_DATA(nlh); + ndtmsg = nlmsg_data(nlh); read_lock_bh(&tbl->lock); ndtmsg->ndtm_family = tbl->family; ndtmsg->ndtm_pad1 = 0; ndtmsg->ndtm_pad2 = 0; - RTA_PUT_STRING(skb, NDTA_NAME, tbl->id); - RTA_PUT_MSECS(skb, NDTA_GC_INTERVAL, tbl->gc_interval); - RTA_PUT_U32(skb, NDTA_THRESH1, tbl->gc_thresh1); - RTA_PUT_U32(skb, NDTA_THRESH2, tbl->gc_thresh2); - RTA_PUT_U32(skb, NDTA_THRESH3, tbl->gc_thresh3); + NLA_PUT_STRING(skb, NDTA_NAME, tbl->id); + NLA_PUT_MSECS(skb, NDTA_GC_INTERVAL, tbl->gc_interval); + NLA_PUT_U32(skb, NDTA_THRESH1, tbl->gc_thresh1); + NLA_PUT_U32(skb, NDTA_THRESH2, tbl->gc_thresh2); + NLA_PUT_U32(skb, NDTA_THRESH3, tbl->gc_thresh3); { unsigned long now = jiffies; @@ -1665,7 +1668,7 @@ static int neightbl_fill_info(struct neigh_table *tbl, struct sk_buff *skb, .ndtc_proxy_qlen = tbl->proxy_queue.qlen, }; - RTA_PUT(skb, NDTA_CONFIG, sizeof(ndc), &ndc); + NLA_PUT(skb, NDTA_CONFIG, sizeof(ndc), &ndc); } { @@ -1690,55 +1693,50 @@ static int neightbl_fill_info(struct neigh_table *tbl, struct sk_buff *skb, ndst.ndts_forced_gc_runs += st->forced_gc_runs; } - RTA_PUT(skb, NDTA_STATS, sizeof(ndst), &ndst); + NLA_PUT(skb, NDTA_STATS, sizeof(ndst), &ndst); } BUG_ON(tbl->parms.dev); if (neightbl_fill_parms(skb, &tbl->parms) < 0) - goto rtattr_failure; + goto nla_put_failure; read_unlock_bh(&tbl->lock); - return NLMSG_END(skb, nlh); + return nlmsg_end(skb, nlh); -rtattr_failure: +nla_put_failure: read_unlock_bh(&tbl->lock); - return NLMSG_CANCEL(skb, nlh); - -nlmsg_failure: - return -1; + return nlmsg_cancel(skb, nlh); } -static int neightbl_fill_param_info(struct neigh_table *tbl, +static int neightbl_fill_param_info(struct sk_buff *skb, + struct neigh_table *tbl, struct neigh_parms *parms, - struct sk_buff *skb, - struct netlink_callback *cb) + u32 pid, u32 seq, int type, + unsigned int flags) { struct ndtmsg *ndtmsg; struct nlmsghdr *nlh; - nlh = NLMSG_NEW_ANSWER(skb, cb, RTM_NEWNEIGHTBL, sizeof(struct ndtmsg), - NLM_F_MULTI); + nlh = nlmsg_put(skb, pid, seq, type, sizeof(*ndtmsg), flags); + if (nlh == NULL) + return -ENOBUFS; - ndtmsg = NLMSG_DATA(nlh); + ndtmsg = nlmsg_data(nlh); read_lock_bh(&tbl->lock); ndtmsg->ndtm_family = tbl->family; ndtmsg->ndtm_pad1 = 0; ndtmsg->ndtm_pad2 = 0; - RTA_PUT_STRING(skb, NDTA_NAME, tbl->id); - if (neightbl_fill_parms(skb, parms) < 0) - goto rtattr_failure; + if (nla_put_string(skb, NDTA_NAME, tbl->id) < 0 || + neightbl_fill_parms(skb, parms) < 0) + goto errout; read_unlock_bh(&tbl->lock); - return NLMSG_END(skb, nlh); - -rtattr_failure: + return nlmsg_end(skb, nlh); +errout: read_unlock_bh(&tbl->lock); - return NLMSG_CANCEL(skb, nlh); - -nlmsg_failure: - return -1; + return nlmsg_cancel(skb, nlh); } static inline struct neigh_parms *lookup_neigh_params(struct neigh_table *tbl, @@ -1905,34 +1903,43 @@ errout: int neightbl_dump_info(struct sk_buff *skb, struct netlink_callback *cb) { - int idx, family; - int s_idx = cb->args[0]; + int family, tidx, nidx = 0; + int tbl_skip = cb->args[0]; + int neigh_skip = cb->args[1]; struct neigh_table *tbl; - family = ((struct rtgenmsg *)NLMSG_DATA(cb->nlh))->rtgen_family; + family = ((struct rtgenmsg *) nlmsg_data(cb->nlh))->rtgen_family; read_lock(&neigh_tbl_lock); - for (tbl = neigh_tables, idx = 0; tbl; tbl = tbl->next) { + for (tbl = neigh_tables, tidx = 0; tbl; tbl = tbl->next, tidx++) { struct neigh_parms *p; - if (idx < s_idx || (family && tbl->family != family)) + if (tidx < tbl_skip || (family && tbl->family != family)) continue; - if (neightbl_fill_info(tbl, skb, cb) <= 0) + if (neightbl_fill_info(skb, tbl, NETLINK_CB(cb->skb).pid, + cb->nlh->nlmsg_seq, RTM_NEWNEIGHTBL, + NLM_F_MULTI) <= 0) break; - for (++idx, p = tbl->parms.next; p; p = p->next, idx++) { - if (idx < s_idx) + for (nidx = 0, p = tbl->parms.next; p; p = p->next, nidx++) { + if (nidx < neigh_skip) continue; - if (neightbl_fill_param_info(tbl, p, skb, cb) <= 0) + if (neightbl_fill_param_info(skb, tbl, p, + NETLINK_CB(cb->skb).pid, + cb->nlh->nlmsg_seq, + RTM_NEWNEIGHTBL, + NLM_F_MULTI) <= 0) goto out; } + neigh_skip = 0; } out: read_unlock(&neigh_tbl_lock); - cb->args[0] = idx; + cb->args[0] = tidx; + cb->args[1] = nidx; return skb->len; } -- cgit v0.10.2 From b63bbc5006a0a62fabc81c4f77e95f16ff16f340 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 7 Aug 2006 18:00:57 -0700 Subject: [NEIGH]: Move netlink neighbour table bits to linux/neighbour.h rtnetlink_rcv_msg() is not longer required to parse attributes for the neighbour tables layer, remove dependency on obsolete and buggy rta_buf. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/linux/neighbour.h b/include/linux/neighbour.h index 8e8293d..bd3bbf6 100644 --- a/include/linux/neighbour.h +++ b/include/linux/neighbour.h @@ -62,4 +62,98 @@ struct nda_cacheinfo __u32 ndm_refcnt; }; +/***************************************************************** + * Neighbour tables specific messages. + * + * To retrieve the neighbour tables send RTM_GETNEIGHTBL with the + * NLM_F_DUMP flag set. Every neighbour table configuration is + * spread over multiple messages to avoid running into message + * size limits on systems with many interfaces. The first message + * in the sequence transports all not device specific data such as + * statistics, configuration, and the default parameter set. + * This message is followed by 0..n messages carrying device + * specific parameter sets. + * Although the ordering should be sufficient, NDTA_NAME can be + * used to identify sequences. The initial message can be identified + * by checking for NDTA_CONFIG. The device specific messages do + * not contain this TLV but have NDTPA_IFINDEX set to the + * corresponding interface index. + * + * To change neighbour table attributes, send RTM_SETNEIGHTBL + * with NDTA_NAME set. Changeable attribute include NDTA_THRESH[1-3], + * NDTA_GC_INTERVAL, and all TLVs in NDTA_PARMS unless marked + * otherwise. Device specific parameter sets can be changed by + * setting NDTPA_IFINDEX to the interface index of the corresponding + * device. + ****/ + +struct ndt_stats +{ + __u64 ndts_allocs; + __u64 ndts_destroys; + __u64 ndts_hash_grows; + __u64 ndts_res_failed; + __u64 ndts_lookups; + __u64 ndts_hits; + __u64 ndts_rcv_probes_mcast; + __u64 ndts_rcv_probes_ucast; + __u64 ndts_periodic_gc_runs; + __u64 ndts_forced_gc_runs; +}; + +enum { + NDTPA_UNSPEC, + NDTPA_IFINDEX, /* u32, unchangeable */ + NDTPA_REFCNT, /* u32, read-only */ + NDTPA_REACHABLE_TIME, /* u64, read-only, msecs */ + NDTPA_BASE_REACHABLE_TIME, /* u64, msecs */ + NDTPA_RETRANS_TIME, /* u64, msecs */ + NDTPA_GC_STALETIME, /* u64, msecs */ + NDTPA_DELAY_PROBE_TIME, /* u64, msecs */ + NDTPA_QUEUE_LEN, /* u32 */ + NDTPA_APP_PROBES, /* u32 */ + NDTPA_UCAST_PROBES, /* u32 */ + NDTPA_MCAST_PROBES, /* u32 */ + NDTPA_ANYCAST_DELAY, /* u64, msecs */ + NDTPA_PROXY_DELAY, /* u64, msecs */ + NDTPA_PROXY_QLEN, /* u32 */ + NDTPA_LOCKTIME, /* u64, msecs */ + __NDTPA_MAX +}; +#define NDTPA_MAX (__NDTPA_MAX - 1) + +struct ndtmsg +{ + __u8 ndtm_family; + __u8 ndtm_pad1; + __u16 ndtm_pad2; +}; + +struct ndt_config +{ + __u16 ndtc_key_len; + __u16 ndtc_entry_size; + __u32 ndtc_entries; + __u32 ndtc_last_flush; /* delta to now in msecs */ + __u32 ndtc_last_rand; /* delta to now in msecs */ + __u32 ndtc_hash_rnd; + __u32 ndtc_hash_mask; + __u32 ndtc_hash_chain_gc; + __u32 ndtc_proxy_qlen; +}; + +enum { + NDTA_UNSPEC, + NDTA_NAME, /* char *, unchangeable */ + NDTA_THRESH1, /* u32 */ + NDTA_THRESH2, /* u32 */ + NDTA_THRESH3, /* u32 */ + NDTA_CONFIG, /* struct ndt_config, read-only */ + NDTA_PARMS, /* nested TLV NDTPA_* */ + NDTA_STATS, /* struct ndt_stats, read-only */ + NDTA_GC_INTERVAL, /* u64, msecs */ + __NDTA_MAX +}; +#define NDTA_MAX (__NDTA_MAX - 1) + #endif diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h index 9750f02..784a1a2 100644 --- a/include/linux/rtnetlink.h +++ b/include/linux/rtnetlink.h @@ -384,107 +384,6 @@ struct rta_session } u; }; - - -/***************************************************************** - * Neighbour tables specific messages. - * - * To retrieve the neighbour tables send RTM_GETNEIGHTBL with the - * NLM_F_DUMP flag set. Every neighbour table configuration is - * spread over multiple messages to avoid running into message - * size limits on systems with many interfaces. The first message - * in the sequence transports all not device specific data such as - * statistics, configuration, and the default parameter set. - * This message is followed by 0..n messages carrying device - * specific parameter sets. - * Although the ordering should be sufficient, NDTA_NAME can be - * used to identify sequences. The initial message can be identified - * by checking for NDTA_CONFIG. The device specific messages do - * not contain this TLV but have NDTPA_IFINDEX set to the - * corresponding interface index. - * - * To change neighbour table attributes, send RTM_SETNEIGHTBL - * with NDTA_NAME set. Changeable attribute include NDTA_THRESH[1-3], - * NDTA_GC_INTERVAL, and all TLVs in NDTA_PARMS unless marked - * otherwise. Device specific parameter sets can be changed by - * setting NDTPA_IFINDEX to the interface index of the corresponding - * device. - ****/ - -struct ndt_stats -{ - __u64 ndts_allocs; - __u64 ndts_destroys; - __u64 ndts_hash_grows; - __u64 ndts_res_failed; - __u64 ndts_lookups; - __u64 ndts_hits; - __u64 ndts_rcv_probes_mcast; - __u64 ndts_rcv_probes_ucast; - __u64 ndts_periodic_gc_runs; - __u64 ndts_forced_gc_runs; -}; - -enum { - NDTPA_UNSPEC, - NDTPA_IFINDEX, /* u32, unchangeable */ - NDTPA_REFCNT, /* u32, read-only */ - NDTPA_REACHABLE_TIME, /* u64, read-only, msecs */ - NDTPA_BASE_REACHABLE_TIME, /* u64, msecs */ - NDTPA_RETRANS_TIME, /* u64, msecs */ - NDTPA_GC_STALETIME, /* u64, msecs */ - NDTPA_DELAY_PROBE_TIME, /* u64, msecs */ - NDTPA_QUEUE_LEN, /* u32 */ - NDTPA_APP_PROBES, /* u32 */ - NDTPA_UCAST_PROBES, /* u32 */ - NDTPA_MCAST_PROBES, /* u32 */ - NDTPA_ANYCAST_DELAY, /* u64, msecs */ - NDTPA_PROXY_DELAY, /* u64, msecs */ - NDTPA_PROXY_QLEN, /* u32 */ - NDTPA_LOCKTIME, /* u64, msecs */ - __NDTPA_MAX -}; -#define NDTPA_MAX (__NDTPA_MAX - 1) - -struct ndtmsg -{ - __u8 ndtm_family; - __u8 ndtm_pad1; - __u16 ndtm_pad2; -}; - -struct ndt_config -{ - __u16 ndtc_key_len; - __u16 ndtc_entry_size; - __u32 ndtc_entries; - __u32 ndtc_last_flush; /* delta to now in msecs */ - __u32 ndtc_last_rand; /* delta to now in msecs */ - __u32 ndtc_hash_rnd; - __u32 ndtc_hash_mask; - __u32 ndtc_hash_chain_gc; - __u32 ndtc_proxy_qlen; -}; - -enum { - NDTA_UNSPEC, - NDTA_NAME, /* char *, unchangeable */ - NDTA_THRESH1, /* u32 */ - NDTA_THRESH2, /* u32 */ - NDTA_THRESH3, /* u32 */ - NDTA_CONFIG, /* struct ndt_config, read-only */ - NDTA_PARMS, /* nested TLV NDTPA_* */ - NDTA_STATS, /* struct ndt_stats, read-only */ - NDTA_GC_INTERVAL, /* u64, msecs */ - __NDTA_MAX -}; -#define NDTA_MAX (__NDTA_MAX - 1) - -#define NDTA_RTA(r) ((struct rtattr*)(((char*)(r)) + \ - NLMSG_ALIGN(sizeof(struct ndtmsg)))) -#define NDTA_PAYLOAD(n) NLMSG_PAYLOAD(n,sizeof(struct ndtmsg)) - - /**** * General form of address family dependent message. ****/ diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 78ccbd4..a1b783a 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -112,7 +112,6 @@ static const int rtm_min[RTM_NR_FAMILIES] = [RTM_FAM(RTM_NEWPREFIX)] = NLMSG_LENGTH(sizeof(struct rtgenmsg)), [RTM_FAM(RTM_GETMULTICAST)] = NLMSG_LENGTH(sizeof(struct rtgenmsg)), [RTM_FAM(RTM_GETANYCAST)] = NLMSG_LENGTH(sizeof(struct rtgenmsg)), - [RTM_FAM(RTM_NEWNEIGHTBL)] = NLMSG_LENGTH(sizeof(struct ndtmsg)), }; static const int rta_max[RTM_NR_FAMILIES] = @@ -125,7 +124,6 @@ static const int rta_max[RTM_NR_FAMILIES] = [RTM_FAM(RTM_NEWTCLASS)] = TCA_MAX, [RTM_FAM(RTM_NEWTFILTER)] = TCA_MAX, [RTM_FAM(RTM_NEWACTION)] = TCAA_MAX, - [RTM_FAM(RTM_NEWNEIGHTBL)] = NDTA_MAX, }; void __rta_fill(struct sk_buff *skb, int attrtype, int attrlen, const void *data) -- cgit v0.10.2 From ac5a488ef252ed673cb067843e411f8cc43f7ab9 Mon Sep 17 00:00:00 2001 From: Sridhar Samudrala Date: Mon, 7 Aug 2006 20:57:31 -0700 Subject: [NET]: Round out in-kernel sockets API This patch implements wrapper functions that provide a convenient way to access the sockets API for in-kernel users like sunrpc, cifs & ocfs2 etc and any future users. Signed-off-by: Sridhar Samudrala Acked-by: James Morris Signed-off-by: David S. Miller diff --git a/include/linux/net.h b/include/linux/net.h index b20c53c..19da2c0 100644 --- a/include/linux/net.h +++ b/include/linux/net.h @@ -208,6 +208,25 @@ extern int kernel_recvmsg(struct socket *sock, struct msghdr *msg, struct kvec *vec, size_t num, size_t len, int flags); +extern int kernel_bind(struct socket *sock, struct sockaddr *addr, + int addrlen); +extern int kernel_listen(struct socket *sock, int backlog); +extern int kernel_accept(struct socket *sock, struct socket **newsock, + int flags); +extern int kernel_connect(struct socket *sock, struct sockaddr *addr, + int addrlen, int flags); +extern int kernel_getsockname(struct socket *sock, struct sockaddr *addr, + int *addrlen); +extern int kernel_getpeername(struct socket *sock, struct sockaddr *addr, + int *addrlen); +extern int kernel_getsockopt(struct socket *sock, int level, int optname, + char *optval, int *optlen); +extern int kernel_setsockopt(struct socket *sock, int level, int optname, + char *optval, int optlen); +extern int kernel_sendpage(struct socket *sock, struct page *page, int offset, + size_t size, int flags); +extern int kernel_sock_ioctl(struct socket *sock, int cmd, unsigned long arg); + #ifndef CONFIG_SMP #define SOCKOPS_WRAPPED(name) name #define SOCKOPS_WRAP(name, fam) diff --git a/net/socket.c b/net/socket.c index 6756e57..2eaebf9 100644 --- a/net/socket.c +++ b/net/socket.c @@ -2170,6 +2170,109 @@ static long compat_sock_ioctl(struct file *file, unsigned cmd, } #endif +int kernel_bind(struct socket *sock, struct sockaddr *addr, int addrlen) +{ + return sock->ops->bind(sock, addr, addrlen); +} + +int kernel_listen(struct socket *sock, int backlog) +{ + return sock->ops->listen(sock, backlog); +} + +int kernel_accept(struct socket *sock, struct socket **newsock, int flags) +{ + struct sock *sk = sock->sk; + int err; + + err = sock_create_lite(sk->sk_family, sk->sk_type, sk->sk_protocol, + newsock); + if (err < 0) + goto done; + + err = sock->ops->accept(sock, *newsock, flags); + if (err < 0) { + sock_release(*newsock); + goto done; + } + + (*newsock)->ops = sock->ops; + +done: + return err; +} + +int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen, + int flags) +{ + return sock->ops->connect(sock, addr, addrlen, flags); +} + +int kernel_getsockname(struct socket *sock, struct sockaddr *addr, + int *addrlen) +{ + return sock->ops->getname(sock, addr, addrlen, 0); +} + +int kernel_getpeername(struct socket *sock, struct sockaddr *addr, + int *addrlen) +{ + return sock->ops->getname(sock, addr, addrlen, 1); +} + +int kernel_getsockopt(struct socket *sock, int level, int optname, + char *optval, int *optlen) +{ + mm_segment_t oldfs = get_fs(); + int err; + + set_fs(KERNEL_DS); + if (level == SOL_SOCKET) + err = sock_getsockopt(sock, level, optname, optval, optlen); + else + err = sock->ops->getsockopt(sock, level, optname, optval, + optlen); + set_fs(oldfs); + return err; +} + +int kernel_setsockopt(struct socket *sock, int level, int optname, + char *optval, int optlen) +{ + mm_segment_t oldfs = get_fs(); + int err; + + set_fs(KERNEL_DS); + if (level == SOL_SOCKET) + err = sock_setsockopt(sock, level, optname, optval, optlen); + else + err = sock->ops->setsockopt(sock, level, optname, optval, + optlen); + set_fs(oldfs); + return err; +} + +int kernel_sendpage(struct socket *sock, struct page *page, int offset, + size_t size, int flags) +{ + if (sock->ops->sendpage) + return sock->ops->sendpage(sock, page, offset, size, flags); + + return sock_no_sendpage(sock, page, offset, size, flags); +} + +int kernel_sock_ioctl(struct socket *sock, int cmd, unsigned long arg) +{ + mm_segment_t oldfs = get_fs(); + int err; + + set_fs(KERNEL_DS); + err = sock->ops->ioctl(sock, cmd, arg); + set_fs(oldfs); + + return err; +} + /* ABI emulation layers need these two */ EXPORT_SYMBOL(move_addr_to_kernel); EXPORT_SYMBOL(move_addr_to_user); @@ -2186,3 +2289,13 @@ EXPORT_SYMBOL(sock_wake_async); EXPORT_SYMBOL(sockfd_lookup); EXPORT_SYMBOL(kernel_sendmsg); EXPORT_SYMBOL(kernel_recvmsg); +EXPORT_SYMBOL(kernel_bind); +EXPORT_SYMBOL(kernel_listen); +EXPORT_SYMBOL(kernel_accept); +EXPORT_SYMBOL(kernel_connect); +EXPORT_SYMBOL(kernel_getsockname); +EXPORT_SYMBOL(kernel_getpeername); +EXPORT_SYMBOL(kernel_getsockopt); +EXPORT_SYMBOL(kernel_setsockopt); +EXPORT_SYMBOL(kernel_sendpage); +EXPORT_SYMBOL(kernel_sock_ioctl); -- cgit v0.10.2 From e6242e928ef1e4ed853f909a7479e4934f4bcb70 Mon Sep 17 00:00:00 2001 From: Sridhar Samudrala Date: Mon, 7 Aug 2006 20:58:01 -0700 Subject: [SUNRPC]: Update to use in-kernel sockets API. Signed-off-by: Sridhar Samudrala Acked-by: James Morris Signed-off-by: David S. Miller diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c index d9a9573..953aff8 100644 --- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -388,7 +388,7 @@ svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr) /* send head */ if (slen == xdr->head[0].iov_len) flags = 0; - len = sock->ops->sendpage(sock, rqstp->rq_respages[0], 0, xdr->head[0].iov_len, flags); + len = kernel_sendpage(sock, rqstp->rq_respages[0], 0, xdr->head[0].iov_len, flags); if (len != xdr->head[0].iov_len) goto out; slen -= xdr->head[0].iov_len; @@ -400,7 +400,7 @@ svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr) while (pglen > 0) { if (slen == size) flags = 0; - result = sock->ops->sendpage(sock, *ppage, base, size, flags); + result = kernel_sendpage(sock, *ppage, base, size, flags); if (result > 0) len += result; if (result != size) @@ -413,7 +413,7 @@ svc_sendto(struct svc_rqst *rqstp, struct xdr_buf *xdr) } /* send tail */ if (xdr->tail[0].iov_len) { - result = sock->ops->sendpage(sock, rqstp->rq_respages[rqstp->rq_restailpage], + result = kernel_sendpage(sock, rqstp->rq_respages[rqstp->rq_restailpage], ((unsigned long)xdr->tail[0].iov_base)& (PAGE_SIZE-1), xdr->tail[0].iov_len, 0); @@ -434,13 +434,10 @@ out: static int svc_recv_available(struct svc_sock *svsk) { - mm_segment_t oldfs; struct socket *sock = svsk->sk_sock; int avail, err; - oldfs = get_fs(); set_fs(KERNEL_DS); - err = sock->ops->ioctl(sock, TIOCINQ, (unsigned long) &avail); - set_fs(oldfs); + err = kernel_sock_ioctl(sock, TIOCINQ, (unsigned long) &avail); return (err >= 0)? avail : err; } @@ -472,7 +469,7 @@ svc_recvfrom(struct svc_rqst *rqstp, struct kvec *iov, int nr, int buflen) * at accept time. FIXME */ alen = sizeof(rqstp->rq_addr); - sock->ops->getname(sock, (struct sockaddr *)&rqstp->rq_addr, &alen, 1); + kernel_getpeername(sock, (struct sockaddr *)&rqstp->rq_addr, &alen); dprintk("svc: socket %p recvfrom(%p, %Zu) = %d\n", rqstp->rq_sock, iov[0].iov_base, iov[0].iov_len, len); @@ -758,7 +755,6 @@ svc_tcp_accept(struct svc_sock *svsk) struct svc_serv *serv = svsk->sk_server; struct socket *sock = svsk->sk_sock; struct socket *newsock; - const struct proto_ops *ops; struct svc_sock *newsvsk; int err, slen; @@ -766,29 +762,23 @@ svc_tcp_accept(struct svc_sock *svsk) if (!sock) return; - err = sock_create_lite(PF_INET, SOCK_STREAM, IPPROTO_TCP, &newsock); - if (err) { + clear_bit(SK_CONN, &svsk->sk_flags); + err = kernel_accept(sock, &newsock, O_NONBLOCK); + if (err < 0) { if (err == -ENOMEM) printk(KERN_WARNING "%s: no more sockets!\n", serv->sv_name); - return; - } - - dprintk("svc: tcp_accept %p allocated\n", newsock); - newsock->ops = ops = sock->ops; - - clear_bit(SK_CONN, &svsk->sk_flags); - if ((err = ops->accept(sock, newsock, O_NONBLOCK)) < 0) { - if (err != -EAGAIN && net_ratelimit()) + else if (err != -EAGAIN && net_ratelimit()) printk(KERN_WARNING "%s: accept failed (err %d)!\n", serv->sv_name, -err); - goto failed; /* aborted connection or whatever */ + return; } + set_bit(SK_CONN, &svsk->sk_flags); svc_sock_enqueue(svsk); slen = sizeof(sin); - err = ops->getname(newsock, (struct sockaddr *) &sin, &slen, 1); + err = kernel_getpeername(newsock, (struct sockaddr *) &sin, &slen); if (err < 0) { if (net_ratelimit()) printk(KERN_WARNING "%s: peername failed (err %d)!\n", @@ -1406,14 +1396,14 @@ svc_create_socket(struct svc_serv *serv, int protocol, struct sockaddr_in *sin) if (sin != NULL) { if (type == SOCK_STREAM) sock->sk->sk_reuse = 1; /* allow address reuse */ - error = sock->ops->bind(sock, (struct sockaddr *) sin, + error = kernel_bind(sock, (struct sockaddr *) sin, sizeof(*sin)); if (error < 0) goto bummer; } if (protocol == IPPROTO_TCP) { - if ((error = sock->ops->listen(sock, 64)) < 0) + if ((error = kernel_listen(sock, 64)) < 0) goto bummer; } diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index 441bd53..8b319e3 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -207,7 +207,7 @@ static inline int xs_sendpages(struct socket *sock, struct sockaddr *addr, int a base &= ~PAGE_CACHE_MASK; } - sendpage = sock->ops->sendpage ? : sock_no_sendpage; + sendpage = kernel_sendpage; do { int flags = XS_SENDMSG_FLAGS; @@ -986,7 +986,7 @@ static int xs_bindresvport(struct rpc_xprt *xprt, struct socket *sock) do { myaddr.sin_port = htons(port); - err = sock->ops->bind(sock, (struct sockaddr *) &myaddr, + err = kernel_bind(sock, (struct sockaddr *) &myaddr, sizeof(myaddr)); if (err == 0) { xprt->port = port; @@ -1081,7 +1081,7 @@ static void xs_tcp_reuse_connection(struct rpc_xprt *xprt) */ memset(&any, 0, sizeof(any)); any.sa_family = AF_UNSPEC; - result = sock->ops->connect(sock, &any, sizeof(any), 0); + result = kernel_connect(sock, &any, sizeof(any), 0); if (result) dprintk("RPC: AF_UNSPEC connect return code %d\n", result); @@ -1151,7 +1151,7 @@ static void xs_tcp_connect_worker(void *args) /* Tell the socket layer to start connecting... */ xprt->stat.connect_count++; xprt->stat.connect_start = jiffies; - status = sock->ops->connect(sock, (struct sockaddr *) &xprt->addr, + status = kernel_connect(sock, (struct sockaddr *) &xprt->addr, sizeof(xprt->addr), O_NONBLOCK); dprintk("RPC: %p connect status %d connected %d sock state %d\n", xprt, -status, xprt_connected(xprt), sock->sk->sk_state); -- cgit v0.10.2 From 8ce11e6a9faf1f1c849b77104adc1642c46aee95 Mon Sep 17 00:00:00 2001 From: Adrian Bunk Date: Mon, 7 Aug 2006 21:50:48 -0700 Subject: [NET]: Make code static. This patch makes needlessly global code static. Signed-off-by: Adrian Bunk Signed-off-by: David S. Miller diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h index 7b47e8d..c0660ce 100644 --- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -192,10 +192,6 @@ struct fib6_node *fib6_locate(struct fib6_node *root, struct in6_addr *daddr, int dst_len, struct in6_addr *saddr, int src_len); -extern void fib6_clean_tree(struct fib6_node *root, - int (*func)(struct rt6_info *, void *arg), - int prune, void *arg); - extern void fib6_clean_all(int (*func)(struct rt6_info *, void *arg), int prune, void *arg); diff --git a/net/ipv4/cipso_ipv4.c b/net/ipv4/cipso_ipv4.c index b82a101..80a2a09 100644 --- a/net/ipv4/cipso_ipv4.c +++ b/net/ipv4/cipso_ipv4.c @@ -60,7 +60,7 @@ struct cipso_v4_domhsh_entry { * if in practice there are a lot of different DOIs this list should * probably be turned into a hash table or something similar so we * can do quick lookups. */ -DEFINE_SPINLOCK(cipso_v4_doi_list_lock); +static DEFINE_SPINLOCK(cipso_v4_doi_list_lock); static struct list_head cipso_v4_doi_list = LIST_HEAD_INIT(cipso_v4_doi_list); /* Label mapping cache */ diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c index 23ec6ae..03d1e8a 100644 --- a/net/ipv4/fib_rules.c +++ b/net/ipv4/fib_rules.c @@ -101,8 +101,8 @@ int fib_lookup(struct flowi *flp, struct fib_result *res) return err; } -int fib4_rule_action(struct fib_rule *rule, struct flowi *flp, int flags, - struct fib_lookup_arg *arg) +static int fib4_rule_action(struct fib_rule *rule, struct flowi *flp, + int flags, struct fib_lookup_arg *arg) { int err = -EAGAIN; struct fib_table *tbl; diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c index 94a46ec..bf9bba8 100644 --- a/net/ipv6/fib6_rules.c +++ b/net/ipv6/fib6_rules.c @@ -66,8 +66,8 @@ struct dst_entry *fib6_rule_lookup(struct flowi *fl, int flags, return (struct dst_entry *) arg.result; } -int fib6_rule_action(struct fib_rule *rule, struct flowi *flp, - int flags, struct fib_lookup_arg *arg) +static int fib6_rule_action(struct fib_rule *rule, struct flowi *flp, + int flags, struct fib_lookup_arg *arg) { struct rt6_info *rt = NULL; struct fib6_table *table; diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index ce226c1..1f23161 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -1169,9 +1169,9 @@ static int fib6_clean_node(struct fib6_walker_t *w) * ignoring pure split nodes) will be scanned. */ -void fib6_clean_tree(struct fib6_node *root, - int (*func)(struct rt6_info *, void *arg), - int prune, void *arg) +static void fib6_clean_tree(struct fib6_node *root, + int (*func)(struct rt6_info *, void *arg), + int prune, void *arg) { struct fib6_cleaner_t c; diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 41c5905..e08d840 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -613,8 +613,8 @@ static struct rt6_info *rt6_alloc_clone(struct rt6_info *ort, struct in6_addr *d return rt; } -struct rt6_info *ip6_pol_route_input(struct fib6_table *table, struct flowi *fl, - int flags) +static struct rt6_info *ip6_pol_route_input(struct fib6_table *table, + struct flowi *fl, int flags) { struct fib6_node *fn; struct rt6_info *rt, *nrt; @@ -872,7 +872,7 @@ static inline unsigned int ipv6_advmss(unsigned int mtu) } static struct dst_entry *ndisc_dst_gc_list; -DEFINE_SPINLOCK(ndisc_lock); +static DEFINE_SPINLOCK(ndisc_lock); struct dst_entry *ndisc_dst_alloc(struct net_device *dev, struct neighbour *neigh, diff --git a/net/netlabel/netlabel_domainhash.c b/net/netlabel/netlabel_domainhash.c index 5bb3fad..0489a13 100644 --- a/net/netlabel/netlabel_domainhash.c +++ b/net/netlabel/netlabel_domainhash.c @@ -50,11 +50,11 @@ struct netlbl_domhsh_tbl { /* Domain hash table */ /* XXX - updates should be so rare that having one spinlock for the entire * hash table should be okay */ -DEFINE_SPINLOCK(netlbl_domhsh_lock); +static DEFINE_SPINLOCK(netlbl_domhsh_lock); static struct netlbl_domhsh_tbl *netlbl_domhsh = NULL; /* Default domain mapping */ -DEFINE_SPINLOCK(netlbl_domhsh_def_lock); +static DEFINE_SPINLOCK(netlbl_domhsh_def_lock); static struct netlbl_dom_map *netlbl_domhsh_def = NULL; /* -- cgit v0.10.2 From 8423a9aadfaa135fd5fd1ab8bbd4a1e76b4143c9 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Mon, 7 Aug 2006 21:54:37 -0700 Subject: [IPV6]: Protect RTM_GETRULE table entry with IPV6_MULTIPLE_TABLES ifdef This is how IPv4 handles this case too. Based upon a patch from Andrew Morton. Signed-off-by: David S. Miller diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index c2a4db8..9ba1e81 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -3529,7 +3529,9 @@ static struct rtnetlink_link inet6_rtnetlink_table[RTM_NR_MSGTYPES] = { [RTM_DELROUTE - RTM_BASE] = { .doit = inet6_rtm_delroute, }, [RTM_GETROUTE - RTM_BASE] = { .doit = inet6_rtm_getroute, .dumpit = inet6_dump_fib, }, +#ifdef CONFIG_IPV6_MULTIPLE_TABLES [RTM_GETRULE - RTM_BASE] = { .dumpit = fib6_rules_dump, }, +#endif }; static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) -- cgit v0.10.2 From 0298f36a579b5bd7f10f6f6d57e5929977a865a1 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Mon, 7 Aug 2006 21:56:52 -0700 Subject: [IPV4]: Kill fib4_rules_clean(). As noted by Adrian Bunk this function is totally unused. Signed-off-by: David S. Miller diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h index 14c82e6..adf7358 100644 --- a/include/net/ip_fib.h +++ b/include/net/ip_fib.h @@ -254,7 +254,6 @@ extern struct fib_table *fib_hash_init(int id); extern int fib4_rules_dump(struct sk_buff *skb, struct netlink_callback *cb); extern void __init fib4_rules_init(void); -extern void __exit fib4_rules_cleanup(void); #ifdef CONFIG_NET_CLS_ROUTE extern u32 fib_rules_tclass(struct fib_result *res); diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c index 03d1e8a..d242e52 100644 --- a/net/ipv4/fib_rules.c +++ b/net/ipv4/fib_rules.c @@ -347,8 +347,3 @@ void __init fib4_rules_init(void) fib_rules_register(&fib4_rules_ops); } - -void __exit fib4_rules_cleanup(void) -{ - fib_rules_unregister(&fib4_rules_ops); -} -- cgit v0.10.2 From 1a01912ae0a5666c4c24eaae2b4821711e2ad79a Mon Sep 17 00:00:00 2001 From: Louis Nyffenegger Date: Tue, 8 Aug 2006 00:56:11 -0700 Subject: [INET]: Remove is_setbyuser patch The value is_setbyuser from struct ip_options is never used and set only one time (http://linux-net.osdl.org/index.php/TODO#IPV4). This little patch removes it from the kernel source. Signed-off-by: Louis Nyffenegger Signed-off-by: David S. Miller diff --git a/include/net/inet_sock.h b/include/net/inet_sock.h index f4caad5..f624271 100644 --- a/include/net/inet_sock.h +++ b/include/net/inet_sock.h @@ -27,7 +27,6 @@ /** struct ip_options - IP Options * * @faddr - Saved first hop address - * @is_setbyuser - Set by setsockopt? * @is_data - Options in __data, rather than skb * @is_strictroute - Strict source route * @srr_is_hit - Packet destination addr was our one @@ -42,8 +41,7 @@ struct ip_options { unsigned char srr; unsigned char rr; unsigned char ts; - unsigned char is_setbyuser:1, - is_data:1, + unsigned char is_data:1, is_strictroute:1, srr_is_hit:1, is_changed:1, diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c index e0a93b4..e7437c0 100644 --- a/net/ipv4/ip_options.c +++ b/net/ipv4/ip_options.c @@ -525,7 +525,6 @@ static int ip_options_get_finish(struct ip_options **optp, opt->__data[optlen++] = IPOPT_END; opt->optlen = optlen; opt->is_data = 1; - opt->is_setbyuser = 1; if (optlen && ip_options_compile(opt, NULL)) { kfree(opt); return -EINVAL; -- cgit v0.10.2 From 99a92ff50424146ba01a222248fd47a1cd55b78f Mon Sep 17 00:00:00 2001 From: Herbert Xu Date: Tue, 8 Aug 2006 02:18:10 -0700 Subject: [IPV4]: Uninline inet_lookup_listener By modern standards this function is way too big to be inlined. It's even bigger than __inet_lookup_listener :) Signed-off-by: Herbert Xu Signed-off-by: David S. Miller diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h index 98e0bb3..bd513f3 100644 --- a/include/net/inet_hashtables.h +++ b/include/net/inet_hashtables.h @@ -271,39 +271,10 @@ static inline int inet_iif(const struct sk_buff *skb) return ((struct rtable *)skb->dst)->rt_iif; } -extern struct sock *__inet_lookup_listener(const struct hlist_head *head, - const u32 daddr, - const unsigned short hnum, - const int dif); - -/* Optimize the common listener case. */ -static inline struct sock * +extern struct sock * inet_lookup_listener(struct inet_hashinfo *hashinfo, const u32 daddr, - const unsigned short hnum, const int dif) -{ - struct sock *sk = NULL; - const struct hlist_head *head; - - read_lock(&hashinfo->lhash_lock); - head = &hashinfo->listening_hash[inet_lhashfn(hnum)]; - if (!hlist_empty(head)) { - const struct inet_sock *inet = inet_sk((sk = __sk_head(head))); - - if (inet->num == hnum && !sk->sk_node.next && - (!inet->rcv_saddr || inet->rcv_saddr == daddr) && - (sk->sk_family == PF_INET || !ipv6_only_sock(sk)) && - !sk->sk_bound_dev_if) - goto sherry_cache; - sk = __inet_lookup_listener(head, daddr, hnum, dif); - } - if (sk) { -sherry_cache: - sock_hold(sk); - } - read_unlock(&hashinfo->lhash_lock); - return sk; -} + const unsigned short hnum, const int dif); /* Socket demux engine toys. */ #ifdef __BIG_ENDIAN diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index 95fac55..bfc39066 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -124,8 +124,10 @@ EXPORT_SYMBOL(inet_listen_wlock); * remote address for the connection. So always assume those are both * wildcarded during the search since they can never be otherwise. */ -struct sock *__inet_lookup_listener(const struct hlist_head *head, const u32 daddr, - const unsigned short hnum, const int dif) +static struct sock *__inet_lookup_listener(const struct hlist_head *head, + const u32 daddr, + const unsigned short hnum, + const int dif) { struct sock *result = NULL, *sk; const struct hlist_node *node; @@ -159,7 +161,34 @@ struct sock *__inet_lookup_listener(const struct hlist_head *head, const u32 dad return result; } -EXPORT_SYMBOL_GPL(__inet_lookup_listener); +/* Optimize the common listener case. */ +struct sock *inet_lookup_listener(struct inet_hashinfo *hashinfo, + const u32 daddr, const unsigned short hnum, + const int dif) +{ + struct sock *sk = NULL; + const struct hlist_head *head; + + read_lock(&hashinfo->lhash_lock); + head = &hashinfo->listening_hash[inet_lhashfn(hnum)]; + if (!hlist_empty(head)) { + const struct inet_sock *inet = inet_sk((sk = __sk_head(head))); + + if (inet->num == hnum && !sk->sk_node.next && + (!inet->rcv_saddr || inet->rcv_saddr == daddr) && + (sk->sk_family == PF_INET || !ipv6_only_sock(sk)) && + !sk->sk_bound_dev_if) + goto sherry_cache; + sk = __inet_lookup_listener(head, daddr, hnum, dif); + } + if (sk) { +sherry_cache: + sock_hold(sk); + } + read_unlock(&hashinfo->lhash_lock); + return sk; +} +EXPORT_SYMBOL_GPL(inet_lookup_listener); /* called with local bh disabled */ static int __inet_check_established(struct inet_timewait_death_row *death_row, -- cgit v0.10.2 From b14295532421c40f82ee099fdbd3d011f022e756 Mon Sep 17 00:00:00 2001 From: Ville Nuorvala Date: Tue, 8 Aug 2006 16:44:17 -0700 Subject: [IPV6]: Make sure fib6_rule_lookup doesn't return NULL The callers of fib6_rule_lookup don't expect it to return NULL, therefore it must return ip6_null_entry whenever fib_rule_lookup fails. Signed-off-by: Ville Nuorvala Signed-off-by: David S. Miller diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c index bf9bba8..22a2fdb 100644 --- a/net/ipv6/fib6_rules.c +++ b/net/ipv6/fib6_rules.c @@ -63,7 +63,11 @@ struct dst_entry *fib6_rule_lookup(struct flowi *fl, int flags, if (arg.rule) fib_rule_put(arg.rule); - return (struct dst_entry *) arg.result; + if (arg.result) + return (struct dst_entry *) arg.result; + + dst_hold(&ip6_null_entry.u.dst); + return &ip6_null_entry.u.dst; } static int fib6_rule_action(struct fib_rule *rule, struct flowi *flp, -- cgit v0.10.2 From 832b4c5e184391773e462653aa862a8cab71f38d Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Tue, 29 Aug 2006 16:48:09 -0700 Subject: [IPV4] fib: convert reader/writer to spinlock Ther is no point in using a more expensive reader/writer lock for a low contention lock like the fib_info_lock. The only reader case is in handling route redirects. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index 5173800..38bca47 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -49,7 +49,7 @@ #define FSprintk(a...) -static DEFINE_RWLOCK(fib_info_lock); +static DEFINE_SPINLOCK(fib_info_lock); static struct hlist_head *fib_info_hash; static struct hlist_head *fib_info_laddrhash; static unsigned int fib_hash_size; @@ -159,7 +159,7 @@ void free_fib_info(struct fib_info *fi) void fib_release_info(struct fib_info *fi) { - write_lock_bh(&fib_info_lock); + spin_lock_bh(&fib_info_lock); if (fi && --fi->fib_treeref == 0) { hlist_del(&fi->fib_hash); if (fi->fib_prefsrc) @@ -172,7 +172,7 @@ void fib_release_info(struct fib_info *fi) fi->fib_dead = 1; fib_info_put(fi); } - write_unlock_bh(&fib_info_lock); + spin_unlock_bh(&fib_info_lock); } static __inline__ int nh_comp(const struct fib_info *fi, const struct fib_info *ofi) @@ -254,7 +254,7 @@ int ip_fib_check_default(u32 gw, struct net_device *dev) struct fib_nh *nh; unsigned int hash; - read_lock(&fib_info_lock); + spin_lock(&fib_info_lock); hash = fib_devindex_hashfn(dev->ifindex); head = &fib_info_devhash[hash]; @@ -262,12 +262,12 @@ int ip_fib_check_default(u32 gw, struct net_device *dev) if (nh->nh_dev == dev && nh->nh_gw == gw && !(nh->nh_flags&RTNH_F_DEAD)) { - read_unlock(&fib_info_lock); + spin_unlock(&fib_info_lock); return 0; } } - read_unlock(&fib_info_lock); + spin_unlock(&fib_info_lock); return -1; } @@ -598,7 +598,7 @@ static void fib_hash_move(struct hlist_head *new_info_hash, unsigned int old_size = fib_hash_size; unsigned int i, bytes; - write_lock_bh(&fib_info_lock); + spin_lock_bh(&fib_info_lock); old_info_hash = fib_info_hash; old_laddrhash = fib_info_laddrhash; fib_hash_size = new_size; @@ -639,7 +639,7 @@ static void fib_hash_move(struct hlist_head *new_info_hash, } fib_info_laddrhash = new_laddrhash; - write_unlock_bh(&fib_info_lock); + spin_unlock_bh(&fib_info_lock); bytes = old_size * sizeof(struct hlist_head *); fib_hash_free(old_info_hash, bytes); @@ -820,7 +820,7 @@ link_it: fi->fib_treeref++; atomic_inc(&fi->fib_clntref); - write_lock_bh(&fib_info_lock); + spin_lock_bh(&fib_info_lock); hlist_add_head(&fi->fib_hash, &fib_info_hash[fib_info_hashfn(fi)]); if (fi->fib_prefsrc) { @@ -839,7 +839,7 @@ link_it: head = &fib_info_devhash[hash]; hlist_add_head(&nh->nh_hash, head); } endfor_nexthops(fi) - write_unlock_bh(&fib_info_lock); + spin_unlock_bh(&fib_info_lock); return fi; err_inval: -- cgit v0.10.2 From 8f491069b40be5d627007a343f99759e9da6a178 Mon Sep 17 00:00:00 2001 From: Herbert Xu Date: Wed, 9 Aug 2006 15:47:12 -0700 Subject: [IPV4]: Use network-order dport for all visible inet_lookup_* Right now most inet_lookup_* functions take a host-order hnum instead of a network-order dport because that's how it is represented internally. This means that users of these functions have to be careful about using the right byte-order. To add more confusion, inet_lookup takes a network-order dport unlike all other functions. So this patch changes all visible inet_lookup functions to take a dport and move all dport->hnum conversion inside them. Signed-off-by: Herbert Xu Signed-off-by: David S. Miller diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h index bd513f3..b4491c9 100644 --- a/include/net/inet_hashtables.h +++ b/include/net/inet_hashtables.h @@ -271,10 +271,16 @@ static inline int inet_iif(const struct sk_buff *skb) return ((struct rtable *)skb->dst)->rt_iif; } -extern struct sock * - inet_lookup_listener(struct inet_hashinfo *hashinfo, - const u32 daddr, - const unsigned short hnum, const int dif); +extern struct sock *__inet_lookup_listener(struct inet_hashinfo *hashinfo, + const u32 daddr, + const unsigned short hnum, + const int dif); + +static inline struct sock *inet_lookup_listener(struct inet_hashinfo *hashinfo, + u32 daddr, u16 dport, int dif) +{ + return __inet_lookup_listener(hashinfo, daddr, ntohs(dport), dif); +} /* Socket demux engine toys. */ #ifdef __BIG_ENDIAN @@ -362,14 +368,25 @@ hit: goto out; } +static inline struct sock * + inet_lookup_established(struct inet_hashinfo *hashinfo, + const u32 saddr, const u16 sport, + const u32 daddr, const u16 dport, + const int dif) +{ + return __inet_lookup_established(hashinfo, saddr, sport, daddr, + ntohs(dport), dif); +} + static inline struct sock *__inet_lookup(struct inet_hashinfo *hashinfo, const u32 saddr, const u16 sport, - const u32 daddr, const u16 hnum, + const u32 daddr, const u16 dport, const int dif) { + u16 hnum = ntohs(dport); struct sock *sk = __inet_lookup_established(hashinfo, saddr, sport, daddr, hnum, dif); - return sk ? : inet_lookup_listener(hashinfo, daddr, hnum, dif); + return sk ? : __inet_lookup_listener(hashinfo, daddr, hnum, dif); } static inline struct sock *inet_lookup(struct inet_hashinfo *hashinfo, @@ -380,7 +397,7 @@ static inline struct sock *inet_lookup(struct inet_hashinfo *hashinfo, struct sock *sk; local_bh_disable(); - sk = __inet_lookup(hashinfo, saddr, sport, daddr, ntohs(dport), dif); + sk = __inet_lookup(hashinfo, saddr, sport, daddr, dport, dif); local_bh_enable(); return sk; diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c index 171d363..9a1a76a 100644 --- a/net/dccp/ipv4.c +++ b/net/dccp/ipv4.c @@ -608,10 +608,10 @@ static struct sock *dccp_v4_hnd_req(struct sock *sk, struct sk_buff *skb) if (req != NULL) return dccp_check_req(sk, skb, req, prev); - nsk = __inet_lookup_established(&dccp_hashinfo, - iph->saddr, dh->dccph_sport, - iph->daddr, ntohs(dh->dccph_dport), - inet_iif(skb)); + nsk = inet_lookup_established(&dccp_hashinfo, + iph->saddr, dh->dccph_sport, + iph->daddr, dh->dccph_dport, + inet_iif(skb)); if (nsk != NULL) { if (nsk->sk_state != DCCP_TIME_WAIT) { bh_lock_sock(nsk); @@ -925,7 +925,7 @@ static int dccp_v4_rcv(struct sk_buff *skb) * Look up flow ID in table and get corresponding socket */ sk = __inet_lookup(&dccp_hashinfo, skb->nh.iph->saddr, dh->dccph_sport, - skb->nh.iph->daddr, ntohs(dh->dccph_dport), + skb->nh.iph->daddr, dh->dccph_dport, inet_iif(skb)); /* diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index bfc39066..fb296c9 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -124,10 +124,10 @@ EXPORT_SYMBOL(inet_listen_wlock); * remote address for the connection. So always assume those are both * wildcarded during the search since they can never be otherwise. */ -static struct sock *__inet_lookup_listener(const struct hlist_head *head, - const u32 daddr, - const unsigned short hnum, - const int dif) +static struct sock *inet_lookup_listener_slow(const struct hlist_head *head, + const u32 daddr, + const unsigned short hnum, + const int dif) { struct sock *result = NULL, *sk; const struct hlist_node *node; @@ -162,9 +162,9 @@ static struct sock *__inet_lookup_listener(const struct hlist_head *head, } /* Optimize the common listener case. */ -struct sock *inet_lookup_listener(struct inet_hashinfo *hashinfo, - const u32 daddr, const unsigned short hnum, - const int dif) +struct sock *__inet_lookup_listener(struct inet_hashinfo *hashinfo, + const u32 daddr, const unsigned short hnum, + const int dif) { struct sock *sk = NULL; const struct hlist_head *head; @@ -179,7 +179,7 @@ struct sock *inet_lookup_listener(struct inet_hashinfo *hashinfo, (sk->sk_family == PF_INET || !ipv6_only_sock(sk)) && !sk->sk_bound_dev_if) goto sherry_cache; - sk = __inet_lookup_listener(head, daddr, hnum, dif); + sk = inet_lookup_listener_slow(head, daddr, hnum, dif); } if (sk) { sherry_cache: @@ -188,7 +188,7 @@ sherry_cache: read_unlock(&hashinfo->lhash_lock); return sk; } -EXPORT_SYMBOL_GPL(inet_lookup_listener); +EXPORT_SYMBOL_GPL(__inet_lookup_listener); /* called with local bh disabled */ static int __inet_check_established(struct inet_timewait_death_row *death_row, diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index b2aa512..2973dee 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -951,9 +951,9 @@ static struct sock *tcp_v4_hnd_req(struct sock *sk, struct sk_buff *skb) if (req) return tcp_check_req(sk, skb, req, prev); - nsk = __inet_lookup_established(&tcp_hashinfo, skb->nh.iph->saddr, - th->source, skb->nh.iph->daddr, - ntohs(th->dest), inet_iif(skb)); + nsk = inet_lookup_established(&tcp_hashinfo, skb->nh.iph->saddr, + th->source, skb->nh.iph->daddr, + th->dest, inet_iif(skb)); if (nsk) { if (nsk->sk_state != TCP_TIME_WAIT) { @@ -1090,7 +1090,7 @@ int tcp_v4_rcv(struct sk_buff *skb) TCP_SKB_CB(skb)->sacked = 0; sk = __inet_lookup(&tcp_hashinfo, skb->nh.iph->saddr, th->source, - skb->nh.iph->daddr, ntohs(th->dest), + skb->nh.iph->daddr, th->dest, inet_iif(skb)); if (!sk) @@ -1168,7 +1168,7 @@ do_time_wait: case TCP_TW_SYN: { struct sock *sk2 = inet_lookup_listener(&tcp_hashinfo, skb->nh.iph->daddr, - ntohs(th->dest), + th->dest, inet_iif(skb)); if (sk2) { inet_twsk_deschedule((struct inet_timewait_sock *)sk, -- cgit v0.10.2 From a8731cbf61c8768ea129780b70dc7dfc6795aad4 Mon Sep 17 00:00:00 2001 From: Steven Whitehouse Date: Wed, 9 Aug 2006 15:56:46 -0700 Subject: [DECNET]: Covert rules to use generic code This patch converts the DECnet rules code to use the generic rules system created by Thomas Graf . Signed-off-by: Steven Whitehouse Acked-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h index 784a1a2..0aaffa2 100644 --- a/include/linux/rtnetlink.h +++ b/include/linux/rtnetlink.h @@ -534,7 +534,8 @@ enum rtnetlink_groups { RTNLGRP_NOP2, RTNLGRP_DECnet_ROUTE, #define RTNLGRP_DECnet_ROUTE RTNLGRP_DECnet_ROUTE - RTNLGRP_NOP3, + RTNLGRP_DECnet_RULE, +#define RTNLGRP_DECnet_RULE RTNLGRP_DECnet_RULE RTNLGRP_NOP4, RTNLGRP_IPV6_PREFIX, #define RTNLGRP_IPV6_PREFIX RTNLGRP_IPV6_PREFIX diff --git a/include/net/dn_fib.h b/include/net/dn_fib.h index a15dcf0..32bc8ce 100644 --- a/include/net/dn_fib.h +++ b/include/net/dn_fib.h @@ -22,7 +22,7 @@ struct dn_kern_rta }; struct dn_fib_res { - struct dn_fib_rule *r; + struct fib_rule *r; struct dn_fib_info *fi; unsigned char prefixlen; unsigned char nh_sel; @@ -147,10 +147,8 @@ extern void dn_fib_table_cleanup(void); */ extern void dn_fib_rules_init(void); extern void dn_fib_rules_cleanup(void); -extern void dn_fib_rule_put(struct dn_fib_rule *); -extern __le16 dn_fib_rules_policy(__le16 saddr, struct dn_fib_res *res, unsigned *flags); extern unsigned dnet_addr_type(__le16 addr); -extern int dn_fib_lookup(const struct flowi *fl, struct dn_fib_res *res); +extern int dn_fib_lookup(struct flowi *fl, struct dn_fib_res *res); /* * rtnetlink interface @@ -176,7 +174,7 @@ static inline void dn_fib_res_put(struct dn_fib_res *res) if (res->fi) dn_fib_info_put(res->fi); if (res->r) - dn_fib_rule_put(res->r); + fib_rule_put(res->r); } extern struct dn_fib_table *dn_fib_tables[]; diff --git a/net/decnet/Kconfig b/net/decnet/Kconfig index 92f2ec4..36e72cb 100644 --- a/net/decnet/Kconfig +++ b/net/decnet/Kconfig @@ -27,6 +27,7 @@ config DECNET config DECNET_ROUTER bool "DECnet: router support (EXPERIMENTAL)" depends on DECNET && EXPERIMENTAL + select FIB_RULES ---help--- Add support for turning your DECnet Endnode into a level 1 or 2 router. This is an experimental, but functional option. If you diff --git a/net/decnet/af_decnet.c b/net/decnet/af_decnet.c index 5486247..70e0273 100644 --- a/net/decnet/af_decnet.c +++ b/net/decnet/af_decnet.c @@ -130,6 +130,7 @@ Version 0.0.6 2.1.110 07-aug-98 Eduardo Marcelo Serrat #include #include #include +#include #include #include #include diff --git a/net/decnet/dn_dev.c b/net/decnet/dn_dev.c index 632c5a9..88ea7a1 100644 --- a/net/decnet/dn_dev.c +++ b/net/decnet/dn_dev.c @@ -46,6 +46,7 @@ #include #include #include +#include #include #include #include @@ -1418,8 +1419,6 @@ static struct rtnetlink_link dnet_rtnetlink_table[RTM_NR_MSGTYPES] = [RTM_DELROUTE - RTM_BASE] = { .doit = dn_fib_rtm_delroute, }, [RTM_GETROUTE - RTM_BASE] = { .doit = dn_cache_getroute, .dumpit = dn_fib_dump, }, - [RTM_NEWRULE - RTM_BASE] = { .doit = dn_fib_rtm_newrule, }, - [RTM_DELRULE - RTM_BASE] = { .doit = dn_fib_rtm_delrule, }, [RTM_GETRULE - RTM_BASE] = { .dumpit = dn_fib_dump_rules, }, #else [RTM_GETROUTE - RTM_BASE] = { .doit = dn_cache_getroute, diff --git a/net/decnet/dn_fib.c b/net/decnet/dn_fib.c index fa20e2e..846df39 100644 --- a/net/decnet/dn_fib.c +++ b/net/decnet/dn_fib.c @@ -34,6 +34,7 @@ #include #include #include +#include #include #include #include diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c index 743e9fc..5e6f4616 100644 --- a/net/decnet/dn_route.c +++ b/net/decnet/dn_route.c @@ -80,6 +80,7 @@ #include #include #include +#include #include #include #include @@ -1284,7 +1285,7 @@ static int dn_route_input_slow(struct sk_buff *skb) dev_hold(out_dev); if (res.r) - src_map = dn_fib_rules_policy(fl.fld_src, &res, &flags); + src_map = fl.fld_src; /* no NAT support for now */ gateway = DN_FIB_RES_GW(res); if (res.type == RTN_NAT) { diff --git a/net/decnet/dn_rules.c b/net/decnet/dn_rules.c index 6986be7..096f127 100644 --- a/net/decnet/dn_rules.c +++ b/net/decnet/dn_rules.c @@ -11,259 +11,198 @@ * * * Changes: + * Steve Whitehouse + * Updated for Thomas Graf's generic rules * */ -#include #include -#include -#include #include -#include #include #include -#include #include -#include #include -#include #include #include -#include -#include #include #include #include +#include #include #include #include #include +static struct fib_rules_ops dn_fib_rules_ops; + struct dn_fib_rule { - struct hlist_node r_hlist; - atomic_t r_clntref; - u32 r_preference; - unsigned char r_table; - unsigned char r_action; - unsigned char r_dst_len; - unsigned char r_src_len; - __le16 r_src; - __le16 r_srcmask; - __le16 r_dst; - __le16 r_dstmask; - __le16 r_srcmap; - u8 r_flags; + struct fib_rule common; + unsigned char dst_len; + unsigned char src_len; + __le16 src; + __le16 srcmask; + __le16 dst; + __le16 dstmask; + __le16 srcmap; + u8 flags; #ifdef CONFIG_DECNET_ROUTE_FWMARK - u32 r_fwmark; + u32 fwmark; #endif - int r_ifindex; - char r_ifname[IFNAMSIZ]; - int r_dead; - struct rcu_head rcu; }; static struct dn_fib_rule default_rule = { - .r_clntref = ATOMIC_INIT(2), - .r_preference = 0x7fff, - .r_table = RT_TABLE_MAIN, - .r_action = RTN_UNICAST + .common = { + .refcnt = ATOMIC_INIT(2), + .pref = 0x7fff, + .table = RT_TABLE_MAIN, + .action = FR_ACT_TO_TBL, + }, }; -static struct hlist_head dn_fib_rules; +static LIST_HEAD(dn_fib_rules); + -int dn_fib_rtm_delrule(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) +int dn_fib_lookup(struct flowi *flp, struct dn_fib_res *res) { - struct rtattr **rta = arg; - struct rtmsg *rtm = NLMSG_DATA(nlh); - struct dn_fib_rule *r; - struct hlist_node *node; - int err = -ESRCH; - - hlist_for_each_entry(r, node, &dn_fib_rules, r_hlist) { - if ((!rta[RTA_SRC-1] || memcmp(RTA_DATA(rta[RTA_SRC-1]), &r->r_src, 2) == 0) && - rtm->rtm_src_len == r->r_src_len && - rtm->rtm_dst_len == r->r_dst_len && - (!rta[RTA_DST-1] || memcmp(RTA_DATA(rta[RTA_DST-1]), &r->r_dst, 2) == 0) && -#ifdef CONFIG_DECNET_ROUTE_FWMARK - (!rta[RTA_PROTOINFO-1] || memcmp(RTA_DATA(rta[RTA_PROTOINFO-1]), &r->r_fwmark, 4) == 0) && -#endif - (!rtm->rtm_type || rtm->rtm_type == r->r_action) && - (!rta[RTA_PRIORITY-1] || memcmp(RTA_DATA(rta[RTA_PRIORITY-1]), &r->r_preference, 4) == 0) && - (!rta[RTA_IIF-1] || rtattr_strcmp(rta[RTA_IIF-1], r->r_ifname) == 0) && - (!rtm->rtm_table || (r && rtm->rtm_table == r->r_table))) { - - err = -EPERM; - if (r == &default_rule) - break; - - hlist_del_rcu(&r->r_hlist); - r->r_dead = 1; - dn_fib_rule_put(r); - err = 0; - break; - } - } + struct fib_lookup_arg arg = { + .result = res, + }; + int err; + + err = fib_rules_lookup(&dn_fib_rules_ops, flp, 0, &arg); + res->r = arg.rule; return err; } -static inline void dn_fib_rule_put_rcu(struct rcu_head *head) +int dn_fib_rule_action(struct fib_rule *rule, struct flowi *flp, int flags, + struct fib_lookup_arg *arg) { - struct dn_fib_rule *r = container_of(head, struct dn_fib_rule, rcu); - kfree(r); -} + int err = -EAGAIN; + struct dn_fib_table *tbl; -void dn_fib_rule_put(struct dn_fib_rule *r) -{ - if (atomic_dec_and_test(&r->r_clntref)) { - if (r->r_dead) - call_rcu(&r->rcu, dn_fib_rule_put_rcu); - else - printk(KERN_DEBUG "Attempt to free alive dn_fib_rule\n"); + switch(rule->action) { + case FR_ACT_TO_TBL: + break; + + case FR_ACT_UNREACHABLE: + err = -ENETUNREACH; + goto errout; + + case FR_ACT_PROHIBIT: + err = -EACCES; + goto errout; + + case FR_ACT_BLACKHOLE: + default: + err = -EINVAL; + goto errout; } + + tbl = dn_fib_get_table(rule->table, 0); + if (tbl == NULL) + goto errout; + + err = tbl->lookup(tbl, flp, (struct dn_fib_res *)arg->result); + if (err > 0) + err = -EAGAIN; +errout: + return err; } +static struct nla_policy dn_fib_rule_policy[FRA_MAX+1] __read_mostly = { + [FRA_IFNAME] = { .type = NLA_STRING }, + [FRA_PRIORITY] = { .type = NLA_U32 }, + [FRA_SRC] = { .type = NLA_U16 }, + [FRA_DST] = { .type = NLA_U16 }, + [FRA_FWMARK] = { .type = NLA_U32 }, +}; -int dn_fib_rtm_newrule(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) +static int dn_fib_rule_match(struct fib_rule *rule, struct flowi *fl, int flags) { - struct rtattr **rta = arg; - struct rtmsg *rtm = NLMSG_DATA(nlh); - struct dn_fib_rule *r, *new_r, *last = NULL; - struct hlist_node *node = NULL; - unsigned char table_id; - - if (rtm->rtm_src_len > 16 || rtm->rtm_dst_len > 16) - return -EINVAL; - - if (rta[RTA_IIF-1] && RTA_PAYLOAD(rta[RTA_IIF-1]) > IFNAMSIZ) - return -EINVAL; - - if (rtm->rtm_type == RTN_NAT) - return -EINVAL; - - table_id = rtm->rtm_table; - if (table_id == RT_TABLE_UNSPEC) { - struct dn_fib_table *tb; - if (rtm->rtm_type == RTN_UNICAST) { - if ((tb = dn_fib_empty_table()) == NULL) - return -ENOBUFS; - table_id = tb->n; - } - } + struct dn_fib_rule *r = (struct dn_fib_rule *)rule; + u16 daddr = fl->fld_dst; + u16 saddr = fl->fld_src; + + if (((saddr ^ r->src) & r->srcmask) || + ((daddr ^ r->dst) & r->dstmask)) + return 0; - new_r = kzalloc(sizeof(*new_r), GFP_KERNEL); - if (!new_r) - return -ENOMEM; - - if (rta[RTA_SRC-1]) - memcpy(&new_r->r_src, RTA_DATA(rta[RTA_SRC-1]), 2); - if (rta[RTA_DST-1]) - memcpy(&new_r->r_dst, RTA_DATA(rta[RTA_DST-1]), 2); - if (rta[RTA_GATEWAY-1]) - memcpy(&new_r->r_srcmap, RTA_DATA(rta[RTA_GATEWAY-1]), 2); - new_r->r_src_len = rtm->rtm_src_len; - new_r->r_dst_len = rtm->rtm_dst_len; - new_r->r_srcmask = dnet_make_mask(rtm->rtm_src_len); - new_r->r_dstmask = dnet_make_mask(rtm->rtm_dst_len); #ifdef CONFIG_DECNET_ROUTE_FWMARK - if (rta[RTA_PROTOINFO-1]) - memcpy(&new_r->r_fwmark, RTA_DATA(rta[RTA_PROTOINFO-1]), 4); + if (r->fwmark && (r->fwmark != fl->fld_fwmark)) + return 0; #endif - new_r->r_action = rtm->rtm_type; - new_r->r_flags = rtm->rtm_flags; - if (rta[RTA_PRIORITY-1]) - memcpy(&new_r->r_preference, RTA_DATA(rta[RTA_PRIORITY-1]), 4); - new_r->r_table = table_id; - if (rta[RTA_IIF-1]) { - struct net_device *dev; - rtattr_strlcpy(new_r->r_ifname, rta[RTA_IIF-1], IFNAMSIZ); - new_r->r_ifindex = -1; - dev = dev_get_by_name(new_r->r_ifname); - if (dev) { - new_r->r_ifindex = dev->ifindex; - dev_put(dev); - } - } - r = container_of(dn_fib_rules.first, struct dn_fib_rule, r_hlist); - if (!new_r->r_preference) { - if (r && r->r_hlist.next != NULL) { - r = container_of(r->r_hlist.next, struct dn_fib_rule, r_hlist); - if (r->r_preference) - new_r->r_preference = r->r_preference - 1; + return 1; +} + +static int dn_fib_rule_configure(struct fib_rule *rule, struct sk_buff *skb, + struct nlmsghdr *nlh, struct fib_rule_hdr *frh, + struct nlattr **tb) +{ + int err = -EINVAL; + struct dn_fib_rule *r = (struct dn_fib_rule *)rule; + + if (frh->src_len > 16 || frh->dst_len > 16 || frh->tos) + goto errout; + + if (rule->table == RT_TABLE_UNSPEC) { + if (rule->action == FR_ACT_TO_TBL) { + struct dn_fib_table *table; + + table = dn_fib_empty_table(); + if (table == NULL) { + err = -ENOBUFS; + goto errout; + } + + rule->table = table->n; } } - hlist_for_each_entry(r, node, &dn_fib_rules, r_hlist) { - if (r->r_preference > new_r->r_preference) - break; - last = r; - } - atomic_inc(&new_r->r_clntref); + if (tb[FRA_SRC]) + r->src = nla_get_u16(tb[FRA_SRC]); - if (last) - hlist_add_after_rcu(&last->r_hlist, &new_r->r_hlist); - else - hlist_add_before_rcu(&new_r->r_hlist, &r->r_hlist); - return 0; -} + if (tb[FRA_DST]) + r->dst = nla_get_u16(tb[FRA_DST]); +#ifdef CONFIG_DECNET_ROUTE_FWMARK + if (tb[FRA_FWMARK]) + r->fwmark = nla_get_u32(tb[FRA_FWMARK]); +#endif + + r->src_len = frh->src_len; + r->srcmask = dnet_make_mask(r->src_len); + r->dst_len = frh->dst_len; + r->dstmask = dnet_make_mask(r->dst_len); + err = 0; +errout: + return err; +} -int dn_fib_lookup(const struct flowi *flp, struct dn_fib_res *res) +static int dn_fib_rule_compare(struct fib_rule *rule, struct fib_rule_hdr *frh, + struct nlattr **tb) { - struct dn_fib_rule *r, *policy; - struct dn_fib_table *tb; - __le16 saddr = flp->fld_src; - __le16 daddr = flp->fld_dst; - struct hlist_node *node; - int err; + struct dn_fib_rule *r = (struct dn_fib_rule *)rule; + + if (frh->src_len && (r->src_len != frh->src_len)) + return 0; - rcu_read_lock(); + if (frh->dst_len && (r->dst_len != frh->dst_len)) + return 0; - hlist_for_each_entry_rcu(r, node, &dn_fib_rules, r_hlist) { - if (((saddr^r->r_src) & r->r_srcmask) || - ((daddr^r->r_dst) & r->r_dstmask) || #ifdef CONFIG_DECNET_ROUTE_FWMARK - (r->r_fwmark && r->r_fwmark != flp->fld_fwmark) || + if (tb[FRA_FWMARK] && (r->fwmark != nla_get_u32(tb[FRA_FWMARK]))) + return 0; #endif - (r->r_ifindex && r->r_ifindex != flp->iif)) - continue; - - switch(r->r_action) { - case RTN_UNICAST: - case RTN_NAT: - policy = r; - break; - case RTN_UNREACHABLE: - rcu_read_unlock(); - return -ENETUNREACH; - default: - case RTN_BLACKHOLE: - rcu_read_unlock(); - return -EINVAL; - case RTN_PROHIBIT: - rcu_read_unlock(); - return -EACCES; - } - if ((tb = dn_fib_get_table(r->r_table, 0)) == NULL) - continue; - err = tb->lookup(tb, flp, res); - if (err == 0) { - res->r = policy; - if (policy) - atomic_inc(&policy->r_clntref); - rcu_read_unlock(); - return 0; - } - if (err < 0 && err != -EAGAIN) { - rcu_read_unlock(); - return err; - } - } + if (tb[FRA_SRC] && (r->src != nla_get_u32(tb[FRA_SRC]))) + return 0; + + if (tb[FRA_DST] && (r->dst != nla_get_u32(tb[FRA_DST]))) + return 0; - rcu_read_unlock(); - return -ESRCH; + return 1; } unsigned dnet_addr_type(__le16 addr) @@ -284,142 +223,77 @@ unsigned dnet_addr_type(__le16 addr) return ret; } -__le16 dn_fib_rules_policy(__le16 saddr, struct dn_fib_res *res, unsigned *flags) +static int dn_fib_rule_fill(struct fib_rule *rule, struct sk_buff *skb, + struct nlmsghdr *nlh, struct fib_rule_hdr *frh) { - struct dn_fib_rule *r = res->r; - - if (r->r_action == RTN_NAT) { - int addrtype = dnet_addr_type(r->r_srcmap); - - if (addrtype == RTN_NAT) { - saddr = (saddr&~r->r_srcmask)|r->r_srcmap; - *flags |= RTCF_SNAT; - } else if (addrtype == RTN_LOCAL || r->r_srcmap == 0) { - saddr = r->r_srcmap; - *flags |= RTCF_MASQ; - } - } - return saddr; -} + struct dn_fib_rule *r = (struct dn_fib_rule *)rule; -static void dn_fib_rules_detach(struct net_device *dev) -{ - struct hlist_node *node; - struct dn_fib_rule *r; + frh->family = AF_DECnet; + frh->dst_len = r->dst_len; + frh->src_len = r->src_len; + frh->tos = 0; - hlist_for_each_entry(r, node, &dn_fib_rules, r_hlist) { - if (r->r_ifindex == dev->ifindex) - r->r_ifindex = -1; - } -} +#ifdef CONFIG_DECNET_ROUTE_FWMARK + if (r->fwmark) + NLA_PUT_U32(skb, FRA_FWMARK, r->fwmark); +#endif + if (r->dst_len) + NLA_PUT_U16(skb, FRA_DST, r->dst); + if (r->src_len) + NLA_PUT_U16(skb, FRA_SRC, r->src); -static void dn_fib_rules_attach(struct net_device *dev) -{ - struct hlist_node *node; - struct dn_fib_rule *r; + return 0; - hlist_for_each_entry(r, node, &dn_fib_rules, r_hlist) { - if (r->r_ifindex == -1 && strcmp(dev->name, r->r_ifname) == 0) - r->r_ifindex = dev->ifindex; - } +nla_put_failure: + return -ENOBUFS; } -static int dn_fib_rules_event(struct notifier_block *this, unsigned long event, void *ptr) +static u32 dn_fib_rule_default_pref(void) { - struct net_device *dev = ptr; - - switch(event) { - case NETDEV_UNREGISTER: - dn_fib_rules_detach(dev); - dn_fib_sync_down(0, dev, 1); - case NETDEV_REGISTER: - dn_fib_rules_attach(dev); - dn_fib_sync_up(dev); + struct list_head *pos; + struct fib_rule *rule; + + if (!list_empty(&dn_fib_rules)) { + pos = dn_fib_rules.next; + if (pos->next != &dn_fib_rules) { + rule = list_entry(pos->next, struct fib_rule, list); + if (rule->pref) + return rule->pref - 1; + } } - return NOTIFY_DONE; -} - - -static struct notifier_block dn_fib_rules_notifier = { - .notifier_call = dn_fib_rules_event, -}; - -static int dn_fib_fill_rule(struct sk_buff *skb, struct dn_fib_rule *r, - struct netlink_callback *cb, unsigned int flags) -{ - struct rtmsg *rtm; - struct nlmsghdr *nlh; - unsigned char *b = skb->tail; - - - nlh = NLMSG_NEW_ANSWER(skb, cb, RTM_NEWRULE, sizeof(*rtm), flags); - rtm = NLMSG_DATA(nlh); - rtm->rtm_family = AF_DECnet; - rtm->rtm_dst_len = r->r_dst_len; - rtm->rtm_src_len = r->r_src_len; - rtm->rtm_tos = 0; -#ifdef CONFIG_DECNET_ROUTE_FWMARK - if (r->r_fwmark) - RTA_PUT(skb, RTA_PROTOINFO, 4, &r->r_fwmark); -#endif - rtm->rtm_table = r->r_table; - rtm->rtm_protocol = 0; - rtm->rtm_scope = 0; - rtm->rtm_type = r->r_action; - rtm->rtm_flags = r->r_flags; - - if (r->r_dst_len) - RTA_PUT(skb, RTA_DST, 2, &r->r_dst); - if (r->r_src_len) - RTA_PUT(skb, RTA_SRC, 2, &r->r_src); - if (r->r_ifname[0]) - RTA_PUT(skb, RTA_IIF, IFNAMSIZ, &r->r_ifname); - if (r->r_preference) - RTA_PUT(skb, RTA_PRIORITY, 4, &r->r_preference); - if (r->r_srcmap) - RTA_PUT(skb, RTA_GATEWAY, 2, &r->r_srcmap); - nlh->nlmsg_len = skb->tail - b; - return skb->len; - -nlmsg_failure: -rtattr_failure: - skb_trim(skb, b - skb->data); - return -1; + return 0; } int dn_fib_dump_rules(struct sk_buff *skb, struct netlink_callback *cb) { - int idx = 0; - int s_idx = cb->args[0]; - struct dn_fib_rule *r; - struct hlist_node *node; - - rcu_read_lock(); - hlist_for_each_entry(r, node, &dn_fib_rules, r_hlist) { - if (idx < s_idx) - goto next; - if (dn_fib_fill_rule(skb, r, cb, NLM_F_MULTI) < 0) - break; -next: - idx++; - } - rcu_read_unlock(); - cb->args[0] = idx; - - return skb->len; + return fib_rules_dump(skb, cb, AF_DECnet); } +static struct fib_rules_ops dn_fib_rules_ops = { + .family = AF_DECnet, + .rule_size = sizeof(struct dn_fib_rule), + .action = dn_fib_rule_action, + .match = dn_fib_rule_match, + .configure = dn_fib_rule_configure, + .compare = dn_fib_rule_compare, + .fill = dn_fib_rule_fill, + .default_pref = dn_fib_rule_default_pref, + .nlgroup = RTNLGRP_DECnet_RULE, + .policy = dn_fib_rule_policy, + .rules_list = &dn_fib_rules, + .owner = THIS_MODULE, +}; + void __init dn_fib_rules_init(void) { - INIT_HLIST_HEAD(&dn_fib_rules); - hlist_add_head(&default_rule.r_hlist, &dn_fib_rules); - register_netdevice_notifier(&dn_fib_rules_notifier); + list_add_tail(&default_rule.common.list, &dn_fib_rules); + fib_rules_register(&dn_fib_rules_ops); } void __exit dn_fib_rules_cleanup(void) { - unregister_netdevice_notifier(&dn_fib_rules_notifier); + fib_rules_unregister(&dn_fib_rules_ops); } diff --git a/net/decnet/dn_table.c b/net/decnet/dn_table.c index e926c95..2e01b67 100644 --- a/net/decnet/dn_table.c +++ b/net/decnet/dn_table.c @@ -30,6 +30,7 @@ #include #include #include +#include #include #include #include -- cgit v0.10.2 From a22ec367b08455f95fa0096ce1999950b6f6911c Mon Sep 17 00:00:00 2001 From: Steven Whitehouse Date: Wed, 9 Aug 2006 16:00:57 -0700 Subject: [DECNET]: Convert rwlock to spinlock As per Stephen Hemminger's recent patch to ipv4/fib_semantics.c this is the same change but for DECnet. Signed-off-by: Steven Whitehouse Signed-off-by: David S. Miller diff --git a/net/decnet/dn_fib.c b/net/decnet/dn_fib.c index 846df39..ed5fb5c 100644 --- a/net/decnet/dn_fib.c +++ b/net/decnet/dn_fib.c @@ -59,7 +59,7 @@ extern int dn_cache_dump(struct sk_buff *skb, struct netlink_callback *cb); static DEFINE_SPINLOCK(dn_fib_multipath_lock); static struct dn_fib_info *dn_fib_info_list; -static DEFINE_RWLOCK(dn_fib_info_lock); +static DEFINE_SPINLOCK(dn_fib_info_lock); static struct { @@ -97,7 +97,7 @@ void dn_fib_free_info(struct dn_fib_info *fi) void dn_fib_release_info(struct dn_fib_info *fi) { - write_lock(&dn_fib_info_lock); + spin_lock(&dn_fib_info_lock); if (fi && --fi->fib_treeref == 0) { if (fi->fib_next) fi->fib_next->fib_prev = fi->fib_prev; @@ -108,7 +108,7 @@ void dn_fib_release_info(struct dn_fib_info *fi) fi->fib_dead = 1; dn_fib_info_put(fi); } - write_unlock(&dn_fib_info_lock); + spin_unlock(&dn_fib_info_lock); } static inline int dn_fib_nh_comp(const struct dn_fib_info *fi, const struct dn_fib_info *ofi) @@ -379,13 +379,13 @@ link_it: fi->fib_treeref++; atomic_inc(&fi->fib_clntref); - write_lock(&dn_fib_info_lock); + spin_lock(&dn_fib_info_lock); fi->fib_next = dn_fib_info_list; fi->fib_prev = NULL; if (dn_fib_info_list) dn_fib_info_list->fib_prev = fi; dn_fib_info_list = fi; - write_unlock(&dn_fib_info_lock); + spin_unlock(&dn_fib_info_lock); return fi; err_inval: -- cgit v0.10.2 From 53fad3cbff120d8987f377eff374cf4db4ecb177 Mon Sep 17 00:00:00 2001 From: Sridhar Samudrala Date: Wed, 9 Aug 2006 17:03:17 -0700 Subject: [SUNRPC]: Remove the unnecessary check for highmem in xs_sendpages(). Just call kernel_sendpage() directly. Signed-off-by: Sridhar Samudrala Signed-off-by: David S. Miller diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index 8b319e3..897bdd9 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -174,7 +174,6 @@ static inline int xs_sendpages(struct socket *sock, struct sockaddr *addr, int a struct page **ppage = xdr->pages; unsigned int len, pglen = xdr->page_len; int err, ret = 0; - ssize_t (*sendpage)(struct socket *, struct page *, int, size_t, int); if (unlikely(!sock)) return -ENOTCONN; @@ -207,7 +206,6 @@ static inline int xs_sendpages(struct socket *sock, struct sockaddr *addr, int a base &= ~PAGE_CACHE_MASK; } - sendpage = kernel_sendpage; do { int flags = XS_SENDMSG_FLAGS; @@ -220,10 +218,7 @@ static inline int xs_sendpages(struct socket *sock, struct sockaddr *addr, int a if (pglen != len || xdr->tail[0].iov_len != 0) flags |= MSG_MORE; - /* Hmm... We might be dealing with highmem pages */ - if (PageHighMem(*ppage)) - sendpage = sock_no_sendpage; - err = sendpage(sock, *ppage, base, len, flags); + err = kernel_sendpage(sock, *ppage, base, len, flags); if (ret == 0) ret = err; else if (err > 0) -- cgit v0.10.2 From 89bddce58e85bb18b13f5077e8349ba9a3ee2597 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Fri, 1 Sep 2006 00:19:31 -0700 Subject: [NET] socket: code style cleanup Make socket.c conform to current style: * run through Lindent * get rid of unneeded casts * split assignment and comparsion where possible Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/net/socket.c b/net/socket.c index 2eaebf9..156f2ef 100644 --- a/net/socket.c +++ b/net/socket.c @@ -42,7 +42,7 @@ * Andi Kleen : Some small cleanups, optimizations, * and fixed a copy_from_user() bug. * Tigran Aivazian : sys_send(args) calls sys_sendto(args, NULL, 0) - * Tigran Aivazian : Made listen(2) backlog sanity checks + * Tigran Aivazian : Made listen(2) backlog sanity checks * protocol-independent * * @@ -53,7 +53,7 @@ * * * This module is effectively the top level interface to the BSD socket - * paradigm. + * paradigm. * * Based upon Swansea University Computer Society NET3.039 */ @@ -96,25 +96,24 @@ static int sock_no_open(struct inode *irrelevant, struct file *dontcare); static ssize_t sock_aio_read(struct kiocb *iocb, char __user *buf, - size_t size, loff_t pos); + size_t size, loff_t pos); static ssize_t sock_aio_write(struct kiocb *iocb, const char __user *buf, - size_t size, loff_t pos); -static int sock_mmap(struct file *file, struct vm_area_struct * vma); + size_t size, loff_t pos); +static int sock_mmap(struct file *file, struct vm_area_struct *vma); static int sock_close(struct inode *inode, struct file *file); static unsigned int sock_poll(struct file *file, struct poll_table_struct *wait); -static long sock_ioctl(struct file *file, - unsigned int cmd, unsigned long arg); +static long sock_ioctl(struct file *file, unsigned int cmd, unsigned long arg); #ifdef CONFIG_COMPAT static long compat_sock_ioctl(struct file *file, - unsigned int cmd, unsigned long arg); + unsigned int cmd, unsigned long arg); #endif static int sock_fasync(int fd, struct file *filp, int on); static ssize_t sock_readv(struct file *file, const struct iovec *vector, unsigned long count, loff_t *ppos); static ssize_t sock_writev(struct file *file, const struct iovec *vector, - unsigned long count, loff_t *ppos); + unsigned long count, loff_t *ppos); static ssize_t sock_sendpage(struct file *file, struct page *page, int offset, size_t size, loff_t *ppos, int more); @@ -193,7 +192,6 @@ static __inline__ void net_family_read_unlock(void) #define net_family_read_unlock() do { } while(0) #endif - /* * Statistics counters of the socket lists */ @@ -201,19 +199,20 @@ static __inline__ void net_family_read_unlock(void) static DEFINE_PER_CPU(int, sockets_in_use) = 0; /* - * Support routines. Move socket addresses back and forth across the kernel/user - * divide and look after the messy bits. + * Support routines. + * Move socket addresses back and forth across the kernel/user + * divide and look after the messy bits. */ -#define MAX_SOCK_ADDR 128 /* 108 for Unix domain - +#define MAX_SOCK_ADDR 128 /* 108 for Unix domain - 16 for IP, 16 for IPX, 24 for IPv6, - about 80 for AX.25 + about 80 for AX.25 must be at least one bigger than the AF_UNIX size (see net/unix/af_unix.c - :unix_mkname()). + :unix_mkname()). */ - + /** * move_addr_to_kernel - copy a socket address into kernel space * @uaddr: Address in user space @@ -227,11 +226,11 @@ static DEFINE_PER_CPU(int, sockets_in_use) = 0; int move_addr_to_kernel(void __user *uaddr, int ulen, void *kaddr) { - if(ulen<0||ulen>MAX_SOCK_ADDR) + if (ulen < 0 || ulen > MAX_SOCK_ADDR) return -EINVAL; - if(ulen==0) + if (ulen == 0) return 0; - if(copy_from_user(kaddr,uaddr,ulen)) + if (copy_from_user(kaddr, uaddr, ulen)) return -EFAULT; return audit_sockaddr(ulen, kaddr); } @@ -252,44 +251,46 @@ int move_addr_to_kernel(void __user *uaddr, int ulen, void *kaddr) * length of the data is written over the length limit the user * specified. Zero is returned for a success. */ - -int move_addr_to_user(void *kaddr, int klen, void __user *uaddr, int __user *ulen) + +int move_addr_to_user(void *kaddr, int klen, void __user *uaddr, + int __user *ulen) { int err; int len; - if((err=get_user(len, ulen))) + err = get_user(len, ulen); + if (err) return err; - if(len>klen) - len=klen; - if(len<0 || len> MAX_SOCK_ADDR) + if (len > klen) + len = klen; + if (len < 0 || len > MAX_SOCK_ADDR) return -EINVAL; - if(len) - { + if (len) { if (audit_sockaddr(klen, kaddr)) return -ENOMEM; - if(copy_to_user(uaddr,kaddr,len)) + if (copy_to_user(uaddr, kaddr, len)) return -EFAULT; } /* - * "fromlen shall refer to the value before truncation.." - * 1003.1g + * "fromlen shall refer to the value before truncation.." + * 1003.1g */ return __put_user(klen, ulen); } #define SOCKFS_MAGIC 0x534F434B -static kmem_cache_t * sock_inode_cachep __read_mostly; +static kmem_cache_t *sock_inode_cachep __read_mostly; static struct inode *sock_alloc_inode(struct super_block *sb) { struct socket_alloc *ei; - ei = (struct socket_alloc *)kmem_cache_alloc(sock_inode_cachep, SLAB_KERNEL); + + ei = kmem_cache_alloc(sock_inode_cachep, SLAB_KERNEL); if (!ei) return NULL; init_waitqueue_head(&ei->socket.wait); - + ei->socket.fasync_list = NULL; ei->socket.state = SS_UNCONNECTED; ei->socket.flags = 0; @@ -307,22 +308,25 @@ static void sock_destroy_inode(struct inode *inode) container_of(inode, struct socket_alloc, vfs_inode)); } -static void init_once(void * foo, kmem_cache_t * cachep, unsigned long flags) +static void init_once(void *foo, kmem_cache_t *cachep, unsigned long flags) { - struct socket_alloc *ei = (struct socket_alloc *) foo; + struct socket_alloc *ei = (struct socket_alloc *)foo; - if ((flags & (SLAB_CTOR_VERIFY|SLAB_CTOR_CONSTRUCTOR)) == - SLAB_CTOR_CONSTRUCTOR) + if ((flags & (SLAB_CTOR_VERIFY|SLAB_CTOR_CONSTRUCTOR)) + == SLAB_CTOR_CONSTRUCTOR) inode_init_once(&ei->vfs_inode); } - + static int init_inodecache(void) { sock_inode_cachep = kmem_cache_create("sock_inode_cache", - sizeof(struct socket_alloc), - 0, (SLAB_HWCACHE_ALIGN|SLAB_RECLAIM_ACCOUNT| - SLAB_MEM_SPREAD), - init_once, NULL); + sizeof(struct socket_alloc), + 0, + (SLAB_HWCACHE_ALIGN | + SLAB_RECLAIM_ACCOUNT | + SLAB_MEM_SPREAD), + init_once, + NULL); if (sock_inode_cachep == NULL) return -ENOMEM; return 0; @@ -335,7 +339,8 @@ static struct super_operations sockfs_ops = { }; static int sockfs_get_sb(struct file_system_type *fs_type, - int flags, const char *dev_name, void *data, struct vfsmount *mnt) + int flags, const char *dev_name, void *data, + struct vfsmount *mnt) { return get_sb_pseudo(fs_type, "socket:", &sockfs_ops, SOCKFS_MAGIC, mnt); @@ -348,12 +353,13 @@ static struct file_system_type sock_fs_type = { .get_sb = sockfs_get_sb, .kill_sb = kill_anon_super, }; + static int sockfs_delete_dentry(struct dentry *dentry) { return 1; } static struct dentry_operations sockfs_dentry_operations = { - .d_delete = sockfs_delete_dentry, + .d_delete = sockfs_delete_dentry, }; /* @@ -477,10 +483,12 @@ struct socket *sockfd_lookup(int fd, int *err) struct file *file; struct socket *sock; - if (!(file = fget(fd))) { + file = fget(fd); + if (!file) { *err = -EBADF; return NULL; } + sock = sock_from_file(file, err); if (!sock) fput(file); @@ -505,7 +513,7 @@ static struct socket *sockfd_lookup_light(int fd, int *err, int *fput_needed) /** * sock_alloc - allocate a socket - * + * * Allocate a new inode and socket object. The two are bound together * and initialised. The socket is then returned. If we are out of inodes * NULL is returned. @@ -513,8 +521,8 @@ static struct socket *sockfd_lookup_light(int fd, int *err, int *fput_needed) static struct socket *sock_alloc(void) { - struct inode * inode; - struct socket * sock; + struct inode *inode; + struct socket *sock; inode = new_inode(sock_mnt->mnt_sb); if (!inode) @@ -522,7 +530,7 @@ static struct socket *sock_alloc(void) sock = SOCKET_I(inode); - inode->i_mode = S_IFSOCK|S_IRWXUGO; + inode->i_mode = S_IFSOCK | S_IRWXUGO; inode->i_uid = current->fsuid; inode->i_gid = current->fsgid; @@ -536,7 +544,7 @@ static struct socket *sock_alloc(void) * a back door. Remember to keep it shut otherwise you'll let the * creepy crawlies in. */ - + static int sock_no_open(struct inode *irrelevant, struct file *dontcare) { return -ENXIO; @@ -553,9 +561,9 @@ const struct file_operations bad_sock_fops = { * * The socket is released from the protocol stack if it has a release * callback, and the inode is then released if the socket is bound to - * an inode not a file. + * an inode not a file. */ - + void sock_release(struct socket *sock) { if (sock->ops) { @@ -575,10 +583,10 @@ void sock_release(struct socket *sock) iput(SOCK_INODE(sock)); return; } - sock->file=NULL; + sock->file = NULL; } -static inline int __sock_sendmsg(struct kiocb *iocb, struct socket *sock, +static inline int __sock_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, size_t size) { struct sock_iocb *si = kiocb_to_siocb(iocb); @@ -621,14 +629,14 @@ int kernel_sendmsg(struct socket *sock, struct msghdr *msg, * the following is safe, since for compiler definitions of kvec and * iovec are identical, yielding the same in-core layout and alignment */ - msg->msg_iov = (struct iovec *)vec, + msg->msg_iov = (struct iovec *)vec; msg->msg_iovlen = num; result = sock_sendmsg(sock, msg, size); set_fs(oldfs); return result; } -static inline int __sock_recvmsg(struct kiocb *iocb, struct socket *sock, +static inline int __sock_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg, size_t size, int flags) { int err; @@ -647,14 +655,14 @@ static inline int __sock_recvmsg(struct kiocb *iocb, struct socket *sock, return sock->ops->recvmsg(iocb, sock, msg, size, flags); } -int sock_recvmsg(struct socket *sock, struct msghdr *msg, +int sock_recvmsg(struct socket *sock, struct msghdr *msg, size_t size, int flags) { struct kiocb iocb; struct sock_iocb siocb; int ret; - init_sync_kiocb(&iocb, NULL); + init_sync_kiocb(&iocb, NULL); iocb.private = &siocb; ret = __sock_recvmsg(&iocb, sock, msg, size, flags); if (-EIOCBQUEUED == ret) @@ -662,9 +670,8 @@ int sock_recvmsg(struct socket *sock, struct msghdr *msg, return ret; } -int kernel_recvmsg(struct socket *sock, struct msghdr *msg, - struct kvec *vec, size_t num, - size_t size, int flags) +int kernel_recvmsg(struct socket *sock, struct msghdr *msg, + struct kvec *vec, size_t num, size_t size, int flags) { mm_segment_t oldfs = get_fs(); int result; @@ -674,8 +681,7 @@ int kernel_recvmsg(struct socket *sock, struct msghdr *msg, * the following is safe, since for compiler definitions of kvec and * iovec are identical, yielding the same in-core layout and alignment */ - msg->msg_iov = (struct iovec *)vec, - msg->msg_iovlen = num; + msg->msg_iov = (struct iovec *)vec, msg->msg_iovlen = num; result = sock_recvmsg(sock, msg, size, flags); set_fs(oldfs); return result; @@ -702,7 +708,8 @@ static ssize_t sock_sendpage(struct file *file, struct page *page, } static struct sock_iocb *alloc_sock_iocb(struct kiocb *iocb, - char __user *ubuf, size_t size, struct sock_iocb *siocb) + char __user *ubuf, size_t size, + struct sock_iocb *siocb) { if (!is_sync_kiocb(iocb)) { siocb = kmalloc(sizeof(*siocb), GFP_KERNEL); @@ -720,20 +727,21 @@ static struct sock_iocb *alloc_sock_iocb(struct kiocb *iocb, } static ssize_t do_sock_read(struct msghdr *msg, struct kiocb *iocb, - struct file *file, struct iovec *iov, unsigned long nr_segs) + struct file *file, struct iovec *iov, + unsigned long nr_segs) { struct socket *sock = file->private_data; size_t size = 0; int i; - for (i = 0 ; i < nr_segs ; i++) - size += iov[i].iov_len; + for (i = 0; i < nr_segs; i++) + size += iov[i].iov_len; msg->msg_name = NULL; msg->msg_namelen = 0; msg->msg_control = NULL; msg->msg_controllen = 0; - msg->msg_iov = (struct iovec *) iov; + msg->msg_iov = (struct iovec *)iov; msg->msg_iovlen = nr_segs; msg->msg_flags = (file->f_flags & O_NONBLOCK) ? MSG_DONTWAIT : 0; @@ -748,7 +756,7 @@ static ssize_t sock_readv(struct file *file, const struct iovec *iov, struct msghdr msg; int ret; - init_sync_kiocb(&iocb, NULL); + init_sync_kiocb(&iocb, NULL); iocb.private = &siocb; ret = do_sock_read(&msg, &iocb, file, (struct iovec *)iov, nr_segs); @@ -758,7 +766,7 @@ static ssize_t sock_readv(struct file *file, const struct iovec *iov, } static ssize_t sock_aio_read(struct kiocb *iocb, char __user *ubuf, - size_t count, loff_t pos) + size_t count, loff_t pos) { struct sock_iocb siocb, *x; @@ -771,24 +779,25 @@ static ssize_t sock_aio_read(struct kiocb *iocb, char __user *ubuf, if (!x) return -ENOMEM; return do_sock_read(&x->async_msg, iocb, iocb->ki_filp, - &x->async_iov, 1); + &x->async_iov, 1); } static ssize_t do_sock_write(struct msghdr *msg, struct kiocb *iocb, - struct file *file, struct iovec *iov, unsigned long nr_segs) + struct file *file, struct iovec *iov, + unsigned long nr_segs) { struct socket *sock = file->private_data; size_t size = 0; int i; - for (i = 0 ; i < nr_segs ; i++) - size += iov[i].iov_len; + for (i = 0; i < nr_segs; i++) + size += iov[i].iov_len; msg->msg_name = NULL; msg->msg_namelen = 0; msg->msg_control = NULL; msg->msg_controllen = 0; - msg->msg_iov = (struct iovec *) iov; + msg->msg_iov = (struct iovec *)iov; msg->msg_iovlen = nr_segs; msg->msg_flags = (file->f_flags & O_NONBLOCK) ? MSG_DONTWAIT : 0; if (sock->type == SOCK_SEQPACKET) @@ -815,7 +824,7 @@ static ssize_t sock_writev(struct file *file, const struct iovec *iov, } static ssize_t sock_aio_write(struct kiocb *iocb, const char __user *ubuf, - size_t count, loff_t pos) + size_t count, loff_t pos) { struct sock_iocb siocb, *x; @@ -829,46 +838,48 @@ static ssize_t sock_aio_write(struct kiocb *iocb, const char __user *ubuf, return -ENOMEM; return do_sock_write(&x->async_msg, iocb, iocb->ki_filp, - &x->async_iov, 1); + &x->async_iov, 1); } - /* * Atomic setting of ioctl hooks to avoid race * with module unload. */ static DEFINE_MUTEX(br_ioctl_mutex); -static int (*br_ioctl_hook)(unsigned int cmd, void __user *arg) = NULL; +static int (*br_ioctl_hook) (unsigned int cmd, void __user *arg) = NULL; -void brioctl_set(int (*hook)(unsigned int, void __user *)) +void brioctl_set(int (*hook) (unsigned int, void __user *)) { mutex_lock(&br_ioctl_mutex); br_ioctl_hook = hook; mutex_unlock(&br_ioctl_mutex); } + EXPORT_SYMBOL(brioctl_set); static DEFINE_MUTEX(vlan_ioctl_mutex); -static int (*vlan_ioctl_hook)(void __user *arg); +static int (*vlan_ioctl_hook) (void __user *arg); -void vlan_ioctl_set(int (*hook)(void __user *)) +void vlan_ioctl_set(int (*hook) (void __user *)) { mutex_lock(&vlan_ioctl_mutex); vlan_ioctl_hook = hook; mutex_unlock(&vlan_ioctl_mutex); } + EXPORT_SYMBOL(vlan_ioctl_set); static DEFINE_MUTEX(dlci_ioctl_mutex); -static int (*dlci_ioctl_hook)(unsigned int, void __user *); +static int (*dlci_ioctl_hook) (unsigned int, void __user *); -void dlci_ioctl_set(int (*hook)(unsigned int, void __user *)) +void dlci_ioctl_set(int (*hook) (unsigned int, void __user *)) { mutex_lock(&dlci_ioctl_mutex); dlci_ioctl_hook = hook; mutex_unlock(&dlci_ioctl_mutex); } + EXPORT_SYMBOL(dlci_ioctl_set); /* @@ -890,8 +901,8 @@ static long sock_ioctl(struct file *file, unsigned cmd, unsigned long arg) if (cmd >= SIOCIWFIRST && cmd <= SIOCIWLAST) { err = dev_ioctl(cmd, argp); } else -#endif /* CONFIG_WIRELESS_EXT */ - switch (cmd) { +#endif /* CONFIG_WIRELESS_EXT */ + switch (cmd) { case FIOSETOWN: case SIOCSPGRP: err = -EFAULT; @@ -901,7 +912,8 @@ static long sock_ioctl(struct file *file, unsigned cmd, unsigned long arg) break; case FIOGETOWN: case SIOCGPGRP: - err = put_user(sock->file->f_owner.pid, (int __user *)argp); + err = put_user(sock->file->f_owner.pid, + (int __user *)argp); break; case SIOCGIFBR: case SIOCSIFBR: @@ -912,7 +924,7 @@ static long sock_ioctl(struct file *file, unsigned cmd, unsigned long arg) request_module("bridge"); mutex_lock(&br_ioctl_mutex); - if (br_ioctl_hook) + if (br_ioctl_hook) err = br_ioctl_hook(cmd, argp); mutex_unlock(&br_ioctl_mutex); break; @@ -929,7 +941,7 @@ static long sock_ioctl(struct file *file, unsigned cmd, unsigned long arg) break; case SIOCGIFDIVERT: case SIOCSIFDIVERT: - /* Convert this to call through a hook */ + /* Convert this to call through a hook */ err = divert_ioctl(cmd, argp); break; case SIOCADDDLCI: @@ -954,7 +966,7 @@ static long sock_ioctl(struct file *file, unsigned cmd, unsigned long arg) if (err == -ENOIOCTLCMD) err = dev_ioctl(cmd, argp); break; - } + } return err; } @@ -962,7 +974,7 @@ int sock_create_lite(int family, int type, int protocol, struct socket **res) { int err; struct socket *sock = NULL; - + err = security_socket_create(family, type, protocol, 1); if (err) goto out; @@ -988,18 +1000,18 @@ out_release: } /* No kernel lock held - perfect */ -static unsigned int sock_poll(struct file *file, poll_table * wait) +static unsigned int sock_poll(struct file *file, poll_table *wait) { struct socket *sock; /* - * We can't return errors to poll, so it's either yes or no. + * We can't return errors to poll, so it's either yes or no. */ sock = file->private_data; return sock->ops->poll(file, sock, wait); } -static int sock_mmap(struct file * file, struct vm_area_struct * vma) +static int sock_mmap(struct file *file, struct vm_area_struct *vma) { struct socket *sock = file->private_data; @@ -1009,12 +1021,11 @@ static int sock_mmap(struct file * file, struct vm_area_struct * vma) static int sock_close(struct inode *inode, struct file *filp) { /* - * It was possible the inode is NULL we were - * closing an unfinished socket. + * It was possible the inode is NULL we were + * closing an unfinished socket. */ - if (!inode) - { + if (!inode) { printk(KERN_DEBUG "sock_close: NULL inode\n"); return 0; } @@ -1040,57 +1051,52 @@ static int sock_close(struct inode *inode, struct file *filp) static int sock_fasync(int fd, struct file *filp, int on) { - struct fasync_struct *fa, *fna=NULL, **prev; + struct fasync_struct *fa, *fna = NULL, **prev; struct socket *sock; struct sock *sk; - if (on) - { + if (on) { fna = kmalloc(sizeof(struct fasync_struct), GFP_KERNEL); - if(fna==NULL) + if (fna == NULL) return -ENOMEM; } sock = filp->private_data; - if ((sk=sock->sk) == NULL) { + sk = sock->sk; + if (sk == NULL) { kfree(fna); return -EINVAL; } lock_sock(sk); - prev=&(sock->fasync_list); + prev = &(sock->fasync_list); - for (fa=*prev; fa!=NULL; prev=&fa->fa_next,fa=*prev) - if (fa->fa_file==filp) + for (fa = *prev; fa != NULL; prev = &fa->fa_next, fa = *prev) + if (fa->fa_file == filp) break; - if(on) - { - if(fa!=NULL) - { + if (on) { + if (fa != NULL) { write_lock_bh(&sk->sk_callback_lock); - fa->fa_fd=fd; + fa->fa_fd = fd; write_unlock_bh(&sk->sk_callback_lock); kfree(fna); goto out; } - fna->fa_file=filp; - fna->fa_fd=fd; - fna->magic=FASYNC_MAGIC; - fna->fa_next=sock->fasync_list; + fna->fa_file = filp; + fna->fa_fd = fd; + fna->magic = FASYNC_MAGIC; + fna->fa_next = sock->fasync_list; write_lock_bh(&sk->sk_callback_lock); - sock->fasync_list=fna; + sock->fasync_list = fna; write_unlock_bh(&sk->sk_callback_lock); - } - else - { - if (fa!=NULL) - { + } else { + if (fa != NULL) { write_lock_bh(&sk->sk_callback_lock); - *prev=fa->fa_next; + *prev = fa->fa_next; write_unlock_bh(&sk->sk_callback_lock); kfree(fa); } @@ -1107,10 +1113,9 @@ int sock_wake_async(struct socket *sock, int how, int band) { if (!sock || !sock->fasync_list) return -1; - switch (how) - { + switch (how) { case 1: - + if (test_bit(SOCK_ASYNC_WAITDATA, &sock->flags)) break; goto call_kill; @@ -1119,7 +1124,7 @@ int sock_wake_async(struct socket *sock, int how, int band) break; /* fall through */ case 0: - call_kill: +call_kill: __kill_fasync(sock->fasync_list, SIGIO, band); break; case 3: @@ -1128,13 +1133,14 @@ int sock_wake_async(struct socket *sock, int how, int band) return 0; } -static int __sock_create(int family, int type, int protocol, struct socket **res, int kern) +static int __sock_create(int family, int type, int protocol, + struct socket **res, int kern) { int err; struct socket *sock; /* - * Check protocol is in range + * Check protocol is in range */ if (family < 0 || family >= NPROTO) return -EAFNOSUPPORT; @@ -1147,10 +1153,11 @@ static int __sock_create(int family, int type, int protocol, struct socket **res deadlock in module load. */ if (family == PF_INET && type == SOCK_PACKET) { - static int warned; + static int warned; if (!warned) { warned = 1; - printk(KERN_INFO "%s uses obsolete (PF_INET,SOCK_PACKET)\n", current->comm); + printk(KERN_INFO "%s uses obsolete (PF_INET,SOCK_PACKET)\n", + current->comm); } family = PF_PACKET; } @@ -1158,17 +1165,16 @@ static int __sock_create(int family, int type, int protocol, struct socket **res err = security_socket_create(family, type, protocol, kern); if (err) return err; - + #if defined(CONFIG_KMOD) - /* Attempt to load a protocol module if the find failed. - * - * 12/09/1996 Marcin: But! this makes REALLY only sense, if the user + /* Attempt to load a protocol module if the find failed. + * + * 12/09/1996 Marcin: But! this makes REALLY only sense, if the user * requested real, full-featured networking support upon configuration. * Otherwise module support will break! */ - if (net_families[family]==NULL) - { - request_module("net-pf-%d",family); + if (net_families[family] == NULL) { + request_module("net-pf-%d", family); } #endif @@ -1187,12 +1193,12 @@ static int __sock_create(int family, int type, int protocol, struct socket **res if (!(sock = sock_alloc())) { if (net_ratelimit()) printk(KERN_WARNING "socket: no more sockets\n"); - err = -ENFILE; /* Not exactly a match, but its the - closest posix thing */ + err = -ENFILE; /* Not exactly a match, but its the + closest posix thing */ goto out; } - sock->type = type; + sock->type = type; /* * We will call the ->create function, that possibly is in a loadable @@ -1271,7 +1277,8 @@ out_release: * Create a pair of connected sockets. */ -asmlinkage long sys_socketpair(int family, int type, int protocol, int __user *usockvec) +asmlinkage long sys_socketpair(int family, int type, int protocol, + int __user *usockvec) { struct socket *sock1, *sock2; int fd1, fd2, err; @@ -1290,7 +1297,7 @@ asmlinkage long sys_socketpair(int family, int type, int protocol, int __user *u goto out_release_1; err = sock1->ops->socketpair(sock1, sock2); - if (err < 0) + if (err < 0) goto out_release_both; fd1 = fd2 = -1; @@ -1309,7 +1316,7 @@ asmlinkage long sys_socketpair(int family, int type, int protocol, int __user *u * Not kernel problem. */ - err = put_user(fd1, &usockvec[0]); + err = put_user(fd1, &usockvec[0]); if (!err) err = put_user(fd2, &usockvec[1]); if (!err) @@ -1320,19 +1327,18 @@ asmlinkage long sys_socketpair(int family, int type, int protocol, int __user *u return err; out_close_1: - sock_release(sock2); + sock_release(sock2); sys_close(fd1); return err; out_release_both: - sock_release(sock2); + sock_release(sock2); out_release_1: - sock_release(sock1); + sock_release(sock1); out: return err; } - /* * Bind a name to a socket. Nothing much to do here since it's * the protocol's responsibility to handle the local address. @@ -1347,20 +1353,23 @@ asmlinkage long sys_bind(int fd, struct sockaddr __user *umyaddr, int addrlen) char address[MAX_SOCK_ADDR]; int err, fput_needed; - if((sock = sockfd_lookup_light(fd, &err, &fput_needed))!=NULL) - { - if((err=move_addr_to_kernel(umyaddr,addrlen,address))>=0) { - err = security_socket_bind(sock, (struct sockaddr *)address, addrlen); + sock = sockfd_lookup_light(fd, &err, &fput_needed); + if(sock) { + err = move_addr_to_kernel(umyaddr, addrlen, address); + if (err >= 0) { + err = security_socket_bind(sock, + (struct sockaddr *)address, + addrlen); if (!err) err = sock->ops->bind(sock, - (struct sockaddr *)address, addrlen); + (struct sockaddr *) + address, addrlen); } fput_light(sock->file, fput_needed); - } + } return err; } - /* * Perform a listen. Basically, we allow the protocol to do anything * necessary for a listen, and if that works, we mark the socket as @@ -1373,9 +1382,10 @@ asmlinkage long sys_listen(int fd, int backlog) { struct socket *sock; int err, fput_needed; - - if ((sock = sockfd_lookup_light(fd, &err, &fput_needed)) != NULL) { - if ((unsigned) backlog > sysctl_somaxconn) + + sock = sockfd_lookup_light(fd, &err, &fput_needed); + if (sock) { + if ((unsigned)backlog > sysctl_somaxconn) backlog = sysctl_somaxconn; err = security_socket_listen(sock, backlog); @@ -1387,7 +1397,6 @@ asmlinkage long sys_listen(int fd, int backlog) return err; } - /* * For accept, we attempt to create a new socket, set up the link * with the client, wake up the client, then return the new @@ -1400,7 +1409,8 @@ asmlinkage long sys_listen(int fd, int backlog) * clean when we restucture accept also. */ -asmlinkage long sys_accept(int fd, struct sockaddr __user *upeer_sockaddr, int __user *upeer_addrlen) +asmlinkage long sys_accept(int fd, struct sockaddr __user *upeer_sockaddr, + int __user *upeer_addrlen) { struct socket *sock, *newsock; struct file *newfile; @@ -1412,7 +1422,7 @@ asmlinkage long sys_accept(int fd, struct sockaddr __user *upeer_sockaddr, int _ goto out; err = -ENFILE; - if (!(newsock = sock_alloc())) + if (!(newsock = sock_alloc())) goto out_put; newsock->type = sock->type; @@ -1444,11 +1454,13 @@ asmlinkage long sys_accept(int fd, struct sockaddr __user *upeer_sockaddr, int _ goto out_fd; if (upeer_sockaddr) { - if(newsock->ops->getname(newsock, (struct sockaddr *)address, &len, 2)<0) { + if (newsock->ops->getname(newsock, (struct sockaddr *)address, + &len, 2) < 0) { err = -ECONNABORTED; goto out_fd; } - err = move_addr_to_user(address, len, upeer_sockaddr, upeer_addrlen); + err = move_addr_to_user(address, len, upeer_sockaddr, + upeer_addrlen); if (err < 0) goto out_fd; } @@ -1470,7 +1482,6 @@ out_fd: goto out_put; } - /* * Attempt to connect to a socket with the server address. The address * is in user space so we verify it is OK and move it to kernel space. @@ -1483,7 +1494,8 @@ out_fd: * include the -EINPROGRESS status for such sockets. */ -asmlinkage long sys_connect(int fd, struct sockaddr __user *uservaddr, int addrlen) +asmlinkage long sys_connect(int fd, struct sockaddr __user *uservaddr, + int addrlen) { struct socket *sock; char address[MAX_SOCK_ADDR]; @@ -1496,11 +1508,12 @@ asmlinkage long sys_connect(int fd, struct sockaddr __user *uservaddr, int addrl if (err < 0) goto out_put; - err = security_socket_connect(sock, (struct sockaddr *)address, addrlen); + err = + security_socket_connect(sock, (struct sockaddr *)address, addrlen); if (err) goto out_put; - err = sock->ops->connect(sock, (struct sockaddr *) address, addrlen, + err = sock->ops->connect(sock, (struct sockaddr *)address, addrlen, sock->file->f_flags); out_put: fput_light(sock->file, fput_needed); @@ -1513,12 +1526,13 @@ out: * name to user space. */ -asmlinkage long sys_getsockname(int fd, struct sockaddr __user *usockaddr, int __user *usockaddr_len) +asmlinkage long sys_getsockname(int fd, struct sockaddr __user *usockaddr, + int __user *usockaddr_len) { struct socket *sock; char address[MAX_SOCK_ADDR]; int len, err, fput_needed; - + sock = sockfd_lookup_light(fd, &err, &fput_needed); if (!sock) goto out; @@ -1543,22 +1557,27 @@ out: * name to user space. */ -asmlinkage long sys_getpeername(int fd, struct sockaddr __user *usockaddr, int __user *usockaddr_len) +asmlinkage long sys_getpeername(int fd, struct sockaddr __user *usockaddr, + int __user *usockaddr_len) { struct socket *sock; char address[MAX_SOCK_ADDR]; int len, err, fput_needed; - if ((sock = sockfd_lookup_light(fd, &err, &fput_needed)) != NULL) { + sock = sockfd_lookup_light(fd, &err, &fput_needed); + if (sock != NULL) { err = security_socket_getpeername(sock); if (err) { fput_light(sock->file, fput_needed); return err; } - err = sock->ops->getname(sock, (struct sockaddr *)address, &len, 1); + err = + sock->ops->getname(sock, (struct sockaddr *)address, &len, + 1); if (!err) - err=move_addr_to_user(address,len, usockaddr, usockaddr_len); + err = move_addr_to_user(address, len, usockaddr, + usockaddr_len); fput_light(sock->file, fput_needed); } return err; @@ -1570,8 +1589,9 @@ asmlinkage long sys_getpeername(int fd, struct sockaddr __user *usockaddr, int _ * the protocol. */ -asmlinkage long sys_sendto(int fd, void __user * buff, size_t len, unsigned flags, - struct sockaddr __user *addr, int addr_len) +asmlinkage long sys_sendto(int fd, void __user *buff, size_t len, + unsigned flags, struct sockaddr __user *addr, + int addr_len) { struct socket *sock; char address[MAX_SOCK_ADDR]; @@ -1588,54 +1608,55 @@ asmlinkage long sys_sendto(int fd, void __user * buff, size_t len, unsigned flag sock = sock_from_file(sock_file, &err); if (!sock) goto out_put; - iov.iov_base=buff; - iov.iov_len=len; - msg.msg_name=NULL; - msg.msg_iov=&iov; - msg.msg_iovlen=1; - msg.msg_control=NULL; - msg.msg_controllen=0; - msg.msg_namelen=0; + iov.iov_base = buff; + iov.iov_len = len; + msg.msg_name = NULL; + msg.msg_iov = &iov; + msg.msg_iovlen = 1; + msg.msg_control = NULL; + msg.msg_controllen = 0; + msg.msg_namelen = 0; if (addr) { err = move_addr_to_kernel(addr, addr_len, address); if (err < 0) goto out_put; - msg.msg_name=address; - msg.msg_namelen=addr_len; + msg.msg_name = address; + msg.msg_namelen = addr_len; } if (sock->file->f_flags & O_NONBLOCK) flags |= MSG_DONTWAIT; msg.msg_flags = flags; err = sock_sendmsg(sock, &msg, len); -out_put: +out_put: fput_light(sock_file, fput_needed); return err; } /* - * Send a datagram down a socket. + * Send a datagram down a socket. */ -asmlinkage long sys_send(int fd, void __user * buff, size_t len, unsigned flags) +asmlinkage long sys_send(int fd, void __user *buff, size_t len, unsigned flags) { return sys_sendto(fd, buff, len, flags, NULL, 0); } /* - * Receive a frame from the socket and optionally record the address of the + * Receive a frame from the socket and optionally record the address of the * sender. We verify the buffers are writable and if needed move the * sender address from kernel to user space. */ -asmlinkage long sys_recvfrom(int fd, void __user * ubuf, size_t size, unsigned flags, - struct sockaddr __user *addr, int __user *addr_len) +asmlinkage long sys_recvfrom(int fd, void __user *ubuf, size_t size, + unsigned flags, struct sockaddr __user *addr, + int __user *addr_len) { struct socket *sock; struct iovec iov; struct msghdr msg; char address[MAX_SOCK_ADDR]; - int err,err2; + int err, err2; struct file *sock_file; int fput_needed; @@ -1647,23 +1668,22 @@ asmlinkage long sys_recvfrom(int fd, void __user * ubuf, size_t size, unsigned f if (!sock) goto out; - msg.msg_control=NULL; - msg.msg_controllen=0; - msg.msg_iovlen=1; - msg.msg_iov=&iov; - iov.iov_len=size; - iov.iov_base=ubuf; - msg.msg_name=address; - msg.msg_namelen=MAX_SOCK_ADDR; + msg.msg_control = NULL; + msg.msg_controllen = 0; + msg.msg_iovlen = 1; + msg.msg_iov = &iov; + iov.iov_len = size; + iov.iov_base = ubuf; + msg.msg_name = address; + msg.msg_namelen = MAX_SOCK_ADDR; if (sock->file->f_flags & O_NONBLOCK) flags |= MSG_DONTWAIT; - err=sock_recvmsg(sock, &msg, size, flags); + err = sock_recvmsg(sock, &msg, size, flags); - if(err >= 0 && addr != NULL) - { - err2=move_addr_to_user(address, msg.msg_namelen, addr, addr_len); - if(err2<0) - err=err2; + if (err >= 0 && addr != NULL) { + err2 = move_addr_to_user(address, msg.msg_namelen, addr, addr_len); + if (err2 < 0) + err = err2; } out: fput_light(sock_file, fput_needed); @@ -1671,10 +1691,11 @@ out: } /* - * Receive a datagram from a socket. + * Receive a datagram from a socket. */ -asmlinkage long sys_recv(int fd, void __user * ubuf, size_t size, unsigned flags) +asmlinkage long sys_recv(int fd, void __user *ubuf, size_t size, + unsigned flags) { return sys_recvfrom(fd, ubuf, size, flags, NULL, NULL); } @@ -1684,24 +1705,29 @@ asmlinkage long sys_recv(int fd, void __user * ubuf, size_t size, unsigned flags * to pass the user mode parameter for the protocols to sort out. */ -asmlinkage long sys_setsockopt(int fd, int level, int optname, char __user *optval, int optlen) +asmlinkage long sys_setsockopt(int fd, int level, int optname, + char __user *optval, int optlen) { int err, fput_needed; struct socket *sock; if (optlen < 0) return -EINVAL; - - if ((sock = sockfd_lookup_light(fd, &err, &fput_needed)) != NULL) - { - err = security_socket_setsockopt(sock,level,optname); + + sock = sockfd_lookup_light(fd, &err, &fput_needed); + if (sock != NULL) { + err = security_socket_setsockopt(sock, level, optname); if (err) goto out_put; if (level == SOL_SOCKET) - err=sock_setsockopt(sock,level,optname,optval,optlen); + err = + sock_setsockopt(sock, level, optname, optval, + optlen); else - err=sock->ops->setsockopt(sock, level, optname, optval, optlen); + err = + sock->ops->setsockopt(sock, level, optname, optval, + optlen); out_put: fput_light(sock->file, fput_needed); } @@ -1713,27 +1739,32 @@ out_put: * to pass a user mode parameter for the protocols to sort out. */ -asmlinkage long sys_getsockopt(int fd, int level, int optname, char __user *optval, int __user *optlen) +asmlinkage long sys_getsockopt(int fd, int level, int optname, + char __user *optval, int __user *optlen) { int err, fput_needed; struct socket *sock; - if ((sock = sockfd_lookup_light(fd, &err, &fput_needed)) != NULL) { + sock = sockfd_lookup_light(fd, &err, &fput_needed); + if (sock != NULL) { err = security_socket_getsockopt(sock, level, optname); if (err) goto out_put; if (level == SOL_SOCKET) - err=sock_getsockopt(sock,level,optname,optval,optlen); + err = + sock_getsockopt(sock, level, optname, optval, + optlen); else - err=sock->ops->getsockopt(sock, level, optname, optval, optlen); + err = + sock->ops->getsockopt(sock, level, optname, optval, + optlen); out_put: fput_light(sock->file, fput_needed); } return err; } - /* * Shutdown a socket. */ @@ -1743,8 +1774,8 @@ asmlinkage long sys_shutdown(int fd, int how) int err, fput_needed; struct socket *sock; - if ((sock = sockfd_lookup_light(fd, &err, &fput_needed))!=NULL) - { + sock = sockfd_lookup_light(fd, &err, &fput_needed); + if (sock != NULL) { err = security_socket_shutdown(sock, how); if (!err) err = sock->ops->shutdown(sock, how); @@ -1753,41 +1784,42 @@ asmlinkage long sys_shutdown(int fd, int how) return err; } -/* A couple of helpful macros for getting the address of the 32/64 bit +/* A couple of helpful macros for getting the address of the 32/64 bit * fields which are the same type (int / unsigned) on our platforms. */ #define COMPAT_MSG(msg, member) ((MSG_CMSG_COMPAT & flags) ? &msg##_compat->member : &msg->member) #define COMPAT_NAMELEN(msg) COMPAT_MSG(msg, msg_namelen) #define COMPAT_FLAGS(msg) COMPAT_MSG(msg, msg_flags) - /* * BSD sendmsg interface */ asmlinkage long sys_sendmsg(int fd, struct msghdr __user *msg, unsigned flags) { - struct compat_msghdr __user *msg_compat = (struct compat_msghdr __user *)msg; + struct compat_msghdr __user *msg_compat = + (struct compat_msghdr __user *)msg; struct socket *sock; char address[MAX_SOCK_ADDR]; struct iovec iovstack[UIO_FASTIOV], *iov = iovstack; unsigned char ctl[sizeof(struct cmsghdr) + 20] - __attribute__ ((aligned (sizeof(__kernel_size_t)))); - /* 20 is size of ipv6_pktinfo */ + __attribute__ ((aligned(sizeof(__kernel_size_t)))); + /* 20 is size of ipv6_pktinfo */ unsigned char *ctl_buf = ctl; struct msghdr msg_sys; int err, ctl_len, iov_size, total_len; int fput_needed; - + err = -EFAULT; if (MSG_CMSG_COMPAT & flags) { if (get_compat_msghdr(&msg_sys, msg_compat)) return -EFAULT; - } else if (copy_from_user(&msg_sys, msg, sizeof(struct msghdr))) + } + else if (copy_from_user(&msg_sys, msg, sizeof(struct msghdr))) return -EFAULT; sock = sockfd_lookup_light(fd, &err, &fput_needed); - if (!sock) + if (!sock) goto out; /* do not move before msg_sys is valid */ @@ -1795,7 +1827,7 @@ asmlinkage long sys_sendmsg(int fd, struct msghdr __user *msg, unsigned flags) if (msg_sys.msg_iovlen > UIO_MAXIOV) goto out_put; - /* Check whether to allocate the iovec area*/ + /* Check whether to allocate the iovec area */ err = -ENOMEM; iov_size = msg_sys.msg_iovlen * sizeof(struct iovec); if (msg_sys.msg_iovlen > UIO_FASTIOV) { @@ -1809,7 +1841,7 @@ asmlinkage long sys_sendmsg(int fd, struct msghdr __user *msg, unsigned flags) err = verify_compat_iovec(&msg_sys, iov, address, VERIFY_READ); } else err = verify_iovec(&msg_sys, iov, address, VERIFY_READ); - if (err < 0) + if (err < 0) goto out_freeiov; total_len = err; @@ -1817,18 +1849,19 @@ asmlinkage long sys_sendmsg(int fd, struct msghdr __user *msg, unsigned flags) if (msg_sys.msg_controllen > INT_MAX) goto out_freeiov; - ctl_len = msg_sys.msg_controllen; + ctl_len = msg_sys.msg_controllen; if ((MSG_CMSG_COMPAT & flags) && ctl_len) { - err = cmsghdr_from_user_compat_to_kern(&msg_sys, sock->sk, ctl, sizeof(ctl)); + err = + cmsghdr_from_user_compat_to_kern(&msg_sys, sock->sk, ctl, + sizeof(ctl)); if (err) goto out_freeiov; ctl_buf = msg_sys.msg_control; ctl_len = msg_sys.msg_controllen; } else if (ctl_len) { - if (ctl_len > sizeof(ctl)) - { + if (ctl_len > sizeof(ctl)) { ctl_buf = sock_kmalloc(sock->sk, ctl_len, GFP_KERNEL); - if (ctl_buf == NULL) + if (ctl_buf == NULL) goto out_freeiov; } err = -EFAULT; @@ -1837,7 +1870,8 @@ asmlinkage long sys_sendmsg(int fd, struct msghdr __user *msg, unsigned flags) * Afterwards, it will be a kernel pointer. Thus the compiler-assisted * checking falls down on this. */ - if (copy_from_user(ctl_buf, (void __user *) msg_sys.msg_control, ctl_len)) + if (copy_from_user(ctl_buf, (void __user *)msg_sys.msg_control, + ctl_len)) goto out_freectl; msg_sys.msg_control = ctl_buf; } @@ -1848,14 +1882,14 @@ asmlinkage long sys_sendmsg(int fd, struct msghdr __user *msg, unsigned flags) err = sock_sendmsg(sock, &msg_sys, total_len); out_freectl: - if (ctl_buf != ctl) + if (ctl_buf != ctl) sock_kfree_s(sock->sk, ctl_buf, ctl_len); out_freeiov: if (iov != iovstack) sock_kfree_s(sock->sk, iov, iov_size); out_put: fput_light(sock->file, fput_needed); -out: +out: return err; } @@ -1863,12 +1897,14 @@ out: * BSD recvmsg interface */ -asmlinkage long sys_recvmsg(int fd, struct msghdr __user *msg, unsigned int flags) +asmlinkage long sys_recvmsg(int fd, struct msghdr __user *msg, + unsigned int flags) { - struct compat_msghdr __user *msg_compat = (struct compat_msghdr __user *)msg; + struct compat_msghdr __user *msg_compat = + (struct compat_msghdr __user *)msg; struct socket *sock; struct iovec iovstack[UIO_FASTIOV]; - struct iovec *iov=iovstack; + struct iovec *iov = iovstack; struct msghdr msg_sys; unsigned long cmsg_ptr; int err, iov_size, total_len, len; @@ -1880,13 +1916,13 @@ asmlinkage long sys_recvmsg(int fd, struct msghdr __user *msg, unsigned int flag /* user mode address pointers */ struct sockaddr __user *uaddr; int __user *uaddr_len; - + if (MSG_CMSG_COMPAT & flags) { if (get_compat_msghdr(&msg_sys, msg_compat)) return -EFAULT; - } else - if (copy_from_user(&msg_sys,msg,sizeof(struct msghdr))) - return -EFAULT; + } + else if (copy_from_user(&msg_sys, msg, sizeof(struct msghdr))) + return -EFAULT; sock = sockfd_lookup_light(fd, &err, &fput_needed); if (!sock) @@ -1895,8 +1931,8 @@ asmlinkage long sys_recvmsg(int fd, struct msghdr __user *msg, unsigned int flag err = -EMSGSIZE; if (msg_sys.msg_iovlen > UIO_MAXIOV) goto out_put; - - /* Check whether to allocate the iovec area*/ + + /* Check whether to allocate the iovec area */ err = -ENOMEM; iov_size = msg_sys.msg_iovlen * sizeof(struct iovec); if (msg_sys.msg_iovlen > UIO_FASTIOV) { @@ -1906,11 +1942,11 @@ asmlinkage long sys_recvmsg(int fd, struct msghdr __user *msg, unsigned int flag } /* - * Save the user-mode address (verify_iovec will change the - * kernel msghdr to use the kernel address space) + * Save the user-mode address (verify_iovec will change the + * kernel msghdr to use the kernel address space) */ - - uaddr = (void __user *) msg_sys.msg_name; + + uaddr = (void __user *)msg_sys.msg_name; uaddr_len = COMPAT_NAMELEN(msg); if (MSG_CMSG_COMPAT & flags) { err = verify_compat_iovec(&msg_sys, iov, addr, VERIFY_WRITE); @@ -1918,13 +1954,13 @@ asmlinkage long sys_recvmsg(int fd, struct msghdr __user *msg, unsigned int flag err = verify_iovec(&msg_sys, iov, addr, VERIFY_WRITE); if (err < 0) goto out_freeiov; - total_len=err; + total_len = err; cmsg_ptr = (unsigned long)msg_sys.msg_control; msg_sys.msg_flags = 0; if (MSG_CMSG_COMPAT & flags) msg_sys.msg_flags = MSG_CMSG_COMPAT; - + if (sock->file->f_flags & O_NONBLOCK) flags |= MSG_DONTWAIT; err = sock_recvmsg(sock, &msg_sys, total_len, flags); @@ -1933,7 +1969,8 @@ asmlinkage long sys_recvmsg(int fd, struct msghdr __user *msg, unsigned int flag len = err; if (uaddr != NULL) { - err = move_addr_to_user(addr, msg_sys.msg_namelen, uaddr, uaddr_len); + err = move_addr_to_user(addr, msg_sys.msg_namelen, uaddr, + uaddr_len); if (err < 0) goto out_freeiov; } @@ -1942,10 +1979,10 @@ asmlinkage long sys_recvmsg(int fd, struct msghdr __user *msg, unsigned int flag if (err) goto out_freeiov; if (MSG_CMSG_COMPAT & flags) - err = __put_user((unsigned long)msg_sys.msg_control-cmsg_ptr, + err = __put_user((unsigned long)msg_sys.msg_control - cmsg_ptr, &msg_compat->msg_controllen); else - err = __put_user((unsigned long)msg_sys.msg_control-cmsg_ptr, + err = __put_user((unsigned long)msg_sys.msg_control - cmsg_ptr, &msg->msg_controllen); if (err) goto out_freeiov; @@ -1964,102 +2001,113 @@ out: /* Argument list sizes for sys_socketcall */ #define AL(x) ((x) * sizeof(unsigned long)) -static unsigned char nargs[18]={AL(0),AL(3),AL(3),AL(3),AL(2),AL(3), - AL(3),AL(3),AL(4),AL(4),AL(4),AL(6), - AL(6),AL(2),AL(5),AL(5),AL(3),AL(3)}; +static const unsigned char nargs[18]={ + AL(0),AL(3),AL(3),AL(3),AL(2),AL(3), + AL(3),AL(3),AL(4),AL(4),AL(4),AL(6), + AL(6),AL(2),AL(5),AL(5),AL(3),AL(3) +}; + #undef AL /* - * System call vectors. + * System call vectors. * * Argument checking cleaned up. Saved 20% in size. * This function doesn't need to set the kernel lock because - * it is set by the callees. + * it is set by the callees. */ asmlinkage long sys_socketcall(int call, unsigned long __user *args) { unsigned long a[6]; - unsigned long a0,a1; + unsigned long a0, a1; int err; - if(call<1||call>SYS_RECVMSG) + if (call < 1 || call > SYS_RECVMSG) return -EINVAL; /* copy_from_user should be SMP safe. */ if (copy_from_user(a, args, nargs[call])) return -EFAULT; - err = audit_socketcall(nargs[call]/sizeof(unsigned long), a); + err = audit_socketcall(nargs[call] / sizeof(unsigned long), a); if (err) return err; - a0=a[0]; - a1=a[1]; - - switch(call) - { - case SYS_SOCKET: - err = sys_socket(a0,a1,a[2]); - break; - case SYS_BIND: - err = sys_bind(a0,(struct sockaddr __user *)a1, a[2]); - break; - case SYS_CONNECT: - err = sys_connect(a0, (struct sockaddr __user *)a1, a[2]); - break; - case SYS_LISTEN: - err = sys_listen(a0,a1); - break; - case SYS_ACCEPT: - err = sys_accept(a0,(struct sockaddr __user *)a1, (int __user *)a[2]); - break; - case SYS_GETSOCKNAME: - err = sys_getsockname(a0,(struct sockaddr __user *)a1, (int __user *)a[2]); - break; - case SYS_GETPEERNAME: - err = sys_getpeername(a0, (struct sockaddr __user *)a1, (int __user *)a[2]); - break; - case SYS_SOCKETPAIR: - err = sys_socketpair(a0,a1, a[2], (int __user *)a[3]); - break; - case SYS_SEND: - err = sys_send(a0, (void __user *)a1, a[2], a[3]); - break; - case SYS_SENDTO: - err = sys_sendto(a0,(void __user *)a1, a[2], a[3], - (struct sockaddr __user *)a[4], a[5]); - break; - case SYS_RECV: - err = sys_recv(a0, (void __user *)a1, a[2], a[3]); - break; - case SYS_RECVFROM: - err = sys_recvfrom(a0, (void __user *)a1, a[2], a[3], - (struct sockaddr __user *)a[4], (int __user *)a[5]); - break; - case SYS_SHUTDOWN: - err = sys_shutdown(a0,a1); - break; - case SYS_SETSOCKOPT: - err = sys_setsockopt(a0, a1, a[2], (char __user *)a[3], a[4]); - break; - case SYS_GETSOCKOPT: - err = sys_getsockopt(a0, a1, a[2], (char __user *)a[3], (int __user *)a[4]); - break; - case SYS_SENDMSG: - err = sys_sendmsg(a0, (struct msghdr __user *) a1, a[2]); - break; - case SYS_RECVMSG: - err = sys_recvmsg(a0, (struct msghdr __user *) a1, a[2]); - break; - default: - err = -EINVAL; - break; + a0 = a[0]; + a1 = a[1]; + + switch (call) { + case SYS_SOCKET: + err = sys_socket(a0, a1, a[2]); + break; + case SYS_BIND: + err = sys_bind(a0, (struct sockaddr __user *)a1, a[2]); + break; + case SYS_CONNECT: + err = sys_connect(a0, (struct sockaddr __user *)a1, a[2]); + break; + case SYS_LISTEN: + err = sys_listen(a0, a1); + break; + case SYS_ACCEPT: + err = + sys_accept(a0, (struct sockaddr __user *)a1, + (int __user *)a[2]); + break; + case SYS_GETSOCKNAME: + err = + sys_getsockname(a0, (struct sockaddr __user *)a1, + (int __user *)a[2]); + break; + case SYS_GETPEERNAME: + err = + sys_getpeername(a0, (struct sockaddr __user *)a1, + (int __user *)a[2]); + break; + case SYS_SOCKETPAIR: + err = sys_socketpair(a0, a1, a[2], (int __user *)a[3]); + break; + case SYS_SEND: + err = sys_send(a0, (void __user *)a1, a[2], a[3]); + break; + case SYS_SENDTO: + err = sys_sendto(a0, (void __user *)a1, a[2], a[3], + (struct sockaddr __user *)a[4], a[5]); + break; + case SYS_RECV: + err = sys_recv(a0, (void __user *)a1, a[2], a[3]); + break; + case SYS_RECVFROM: + err = sys_recvfrom(a0, (void __user *)a1, a[2], a[3], + (struct sockaddr __user *)a[4], + (int __user *)a[5]); + break; + case SYS_SHUTDOWN: + err = sys_shutdown(a0, a1); + break; + case SYS_SETSOCKOPT: + err = sys_setsockopt(a0, a1, a[2], (char __user *)a[3], a[4]); + break; + case SYS_GETSOCKOPT: + err = + sys_getsockopt(a0, a1, a[2], (char __user *)a[3], + (int __user *)a[4]); + break; + case SYS_SENDMSG: + err = sys_sendmsg(a0, (struct msghdr __user *)a1, a[2]); + break; + case SYS_RECVMSG: + err = sys_recvmsg(a0, (struct msghdr __user *)a1, a[2]); + break; + default: + err = -EINVAL; + break; } return err; } -#endif /* __ARCH_WANT_SYS_SOCKETCALL */ +#endif /* __ARCH_WANT_SYS_SOCKETCALL */ /* * This function is called by a protocol handler that wants to @@ -2072,18 +2120,18 @@ int sock_register(struct net_proto_family *ops) int err; if (ops->family >= NPROTO) { - printk(KERN_CRIT "protocol %d >= NPROTO(%d)\n", ops->family, NPROTO); + printk(KERN_CRIT "protocol %d >= NPROTO(%d)\n", ops->family, + NPROTO); return -ENOBUFS; } net_family_write_lock(); err = -EEXIST; if (net_families[ops->family] == NULL) { - net_families[ops->family]=ops; + net_families[ops->family] = ops; err = 0; } net_family_write_unlock(); - printk(KERN_INFO "NET: Registered protocol family %d\n", - ops->family); + printk(KERN_INFO "NET: Registered protocol family %d\n", ops->family); return err; } @@ -2099,28 +2147,27 @@ int sock_unregister(int family) return -1; net_family_write_lock(); - net_families[family]=NULL; + net_families[family] = NULL; net_family_write_unlock(); - printk(KERN_INFO "NET: Unregistered protocol family %d\n", - family); + printk(KERN_INFO "NET: Unregistered protocol family %d\n", family); return 0; } static int __init sock_init(void) { /* - * Initialize sock SLAB cache. + * Initialize sock SLAB cache. */ - + sk_init(); /* - * Initialize skbuff SLAB cache + * Initialize skbuff SLAB cache */ skb_init(); /* - * Initialize the protocols module. + * Initialize the protocols module. */ init_inodecache(); @@ -2146,7 +2193,7 @@ void socket_seq_show(struct seq_file *seq) int counter = 0; for_each_possible_cpu(cpu) - counter += per_cpu(sockets_in_use, cpu); + counter += per_cpu(sockets_in_use, cpu); /* It can be negative, by the way. 8) */ if (counter < 0) @@ -2154,11 +2201,11 @@ void socket_seq_show(struct seq_file *seq) seq_printf(seq, "sockets: used %d\n", counter); } -#endif /* CONFIG_PROC_FS */ +#endif /* CONFIG_PROC_FS */ #ifdef CONFIG_COMPAT static long compat_sock_ioctl(struct file *file, unsigned cmd, - unsigned long arg) + unsigned long arg) { struct socket *sock = file->private_data; int ret = -ENOIOCTLCMD; -- cgit v0.10.2 From 757dbb494be3309fe41ce4c62f8057d8b41d8897 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Wed, 9 Aug 2006 20:50:00 -0700 Subject: [NET]: drop unused elements from net_proto_family Three values in net_proto_family are defined but never used. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/include/linux/net.h b/include/linux/net.h index 19da2c0..1bd7632 100644 --- a/include/linux/net.h +++ b/include/linux/net.h @@ -169,11 +169,6 @@ struct proto_ops { struct net_proto_family { int family; int (*create)(struct socket *sock, int protocol); - /* These are counters for the number of different methods of - each we support */ - short authentication; - short encryption; - short encrypt_net; struct module *owner; }; -- cgit v0.10.2 From 55737fda0bc73cb20f702301d8b52938a5a43630 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Fri, 1 Sep 2006 00:23:39 -0700 Subject: [NET]: socket family using RCU Replace the gross custom locking done in socket code for net_family[] with simple RCU usage. Some reordering necessary to avoid sleep issues with sock_alloc. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/net/socket.c b/net/socket.c index 156f2ef..b5a3fcb 100644 --- a/net/socket.c +++ b/net/socket.c @@ -59,11 +59,11 @@ */ #include -#include #include #include #include #include +#include #include #include #include @@ -146,51 +146,8 @@ static struct file_operations socket_file_ops = { * The protocol list. Each protocol is registered in here. */ -static struct net_proto_family *net_families[NPROTO]; - -#if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT) -static atomic_t net_family_lockct = ATOMIC_INIT(0); static DEFINE_SPINLOCK(net_family_lock); - -/* The strategy is: modifications net_family vector are short, do not - sleep and veeery rare, but read access should be free of any exclusive - locks. - */ - -static void net_family_write_lock(void) -{ - spin_lock(&net_family_lock); - while (atomic_read(&net_family_lockct) != 0) { - spin_unlock(&net_family_lock); - - yield(); - - spin_lock(&net_family_lock); - } -} - -static __inline__ void net_family_write_unlock(void) -{ - spin_unlock(&net_family_lock); -} - -static __inline__ void net_family_read_lock(void) -{ - atomic_inc(&net_family_lockct); - spin_unlock_wait(&net_family_lock); -} - -static __inline__ void net_family_read_unlock(void) -{ - atomic_dec(&net_family_lockct); -} - -#else -#define net_family_write_lock() do { } while(0) -#define net_family_write_unlock() do { } while(0) -#define net_family_read_lock() do { } while(0) -#define net_family_read_unlock() do { } while(0) -#endif +static const struct net_proto_family *net_families[NPROTO]; /* * Statistics counters of the socket lists @@ -1138,6 +1095,7 @@ static int __sock_create(int family, int type, int protocol, { int err; struct socket *sock; + const struct net_proto_family *pf; /* * Check protocol is in range @@ -1166,6 +1124,21 @@ static int __sock_create(int family, int type, int protocol, if (err) return err; + /* + * Allocate the socket and allow the family to set things up. if + * the protocol is 0, the family is instructed to select an appropriate + * default. + */ + sock = sock_alloc(); + if (!sock) { + if (net_ratelimit()) + printk(KERN_WARNING "socket: no more sockets\n"); + return -ENFILE; /* Not exactly a match, but its the + closest posix thing */ + } + + sock->type = type; + #if defined(CONFIG_KMOD) /* Attempt to load a protocol module if the find failed. * @@ -1173,72 +1146,61 @@ static int __sock_create(int family, int type, int protocol, * requested real, full-featured networking support upon configuration. * Otherwise module support will break! */ - if (net_families[family] == NULL) { + if (net_families[family] == NULL) request_module("net-pf-%d", family); - } #endif - net_family_read_lock(); - if (net_families[family] == NULL) { - err = -EAFNOSUPPORT; - goto out; - } - -/* - * Allocate the socket and allow the family to set things up. if - * the protocol is 0, the family is instructed to select an appropriate - * default. - */ - - if (!(sock = sock_alloc())) { - if (net_ratelimit()) - printk(KERN_WARNING "socket: no more sockets\n"); - err = -ENFILE; /* Not exactly a match, but its the - closest posix thing */ - goto out; - } - - sock->type = type; + rcu_read_lock(); + pf = rcu_dereference(net_families[family]); + err = -EAFNOSUPPORT; + if (!pf) + goto out_release; /* * We will call the ->create function, that possibly is in a loadable * module, so we have to bump that loadable module refcnt first. */ - err = -EAFNOSUPPORT; - if (!try_module_get(net_families[family]->owner)) + if (!try_module_get(pf->owner)) goto out_release; - if ((err = net_families[family]->create(sock, protocol)) < 0) { - sock->ops = NULL; + /* Now protected by module ref count */ + rcu_read_unlock(); + + err = pf->create(sock, protocol); + if (err < 0) goto out_module_put; - } /* * Now to bump the refcnt of the [loadable] module that owns this * socket at sock_release time we decrement its refcnt. */ - if (!try_module_get(sock->ops->owner)) { - sock->ops = NULL; - goto out_module_put; - } + if (!try_module_get(sock->ops->owner)) + goto out_module_busy; + /* * Now that we're done with the ->create function, the [loadable] * module can have its refcnt decremented */ - module_put(net_families[family]->owner); - *res = sock; + module_put(pf->owner); err = security_socket_post_create(sock, family, type, protocol, kern); if (err) goto out_release; + *res = sock; -out: - net_family_read_unlock(); - return err; + return 0; + +out_module_busy: + err = -EAFNOSUPPORT; out_module_put: - module_put(net_families[family]->owner); -out_release: + sock->ops = NULL; + module_put(pf->owner); +out_sock_release: sock_release(sock); - goto out; + return err; + +out_release: + rcu_read_unlock(); + goto out_sock_release; } int sock_create(int family, int type, int protocol, struct socket **res) @@ -2109,12 +2071,15 @@ asmlinkage long sys_socketcall(int call, unsigned long __user *args) #endif /* __ARCH_WANT_SYS_SOCKETCALL */ -/* +/** + * sock_register - add a socket protocol handler + * @ops: description of protocol + * * This function is called by a protocol handler that wants to * advertise its address family, and have it linked into the - * SOCKET module. + * socket interface. The value ops->family coresponds to the + * socket system call protocol family. */ - int sock_register(struct net_proto_family *ops) { int err; @@ -2124,31 +2089,44 @@ int sock_register(struct net_proto_family *ops) NPROTO); return -ENOBUFS; } - net_family_write_lock(); - err = -EEXIST; - if (net_families[ops->family] == NULL) { + + spin_lock(&net_family_lock); + if (net_families[ops->family]) + err = -EEXIST; + else { net_families[ops->family] = ops; err = 0; } - net_family_write_unlock(); + spin_unlock(&net_family_lock); + printk(KERN_INFO "NET: Registered protocol family %d\n", ops->family); return err; } -/* +/** + * sock_unregister - remove a protocol handler + * @family: protocol family to remove + * * This function is called by a protocol handler that wants to * remove its address family, and have it unlinked from the - * SOCKET module. + * new socket creation. + * + * If protocol handler is a module, then it can use module reference + * counts to protect against new references. If protocol handler is not + * a module then it needs to provide its own protection in + * the ops->create routine. */ - int sock_unregister(int family) { if (family < 0 || family >= NPROTO) - return -1; + return -EINVAL; - net_family_write_lock(); + spin_lock(&net_family_lock); net_families[family] = NULL; - net_family_write_unlock(); + spin_unlock(&net_family_lock); + + synchronize_rcu(); + printk(KERN_INFO "NET: Unregistered protocol family %d\n", family); return 0; } -- cgit v0.10.2 From f0fd27d42e39b91f85e1840ec49b072fd6c545b8 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Wed, 9 Aug 2006 21:03:17 -0700 Subject: [NET]: sock_register interface changes The sock_register() doesn't change the family, so the protocols can define it read-only. No caller ever checks return value from sock_unregister() Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/include/linux/net.h b/include/linux/net.h index 1bd7632..c257f71 100644 --- a/include/linux/net.h +++ b/include/linux/net.h @@ -176,8 +176,8 @@ struct iovec; struct kvec; extern int sock_wake_async(struct socket *sk, int how, int band); -extern int sock_register(struct net_proto_family *fam); -extern int sock_unregister(int family); +extern int sock_register(const struct net_proto_family *fam); +extern void sock_unregister(int family); extern int sock_create(int family, int type, int proto, struct socket **res); extern int sock_create_kern(int family, int type, int proto, diff --git a/net/socket.c b/net/socket.c index b5a3fcb..4147fe4 100644 --- a/net/socket.c +++ b/net/socket.c @@ -147,7 +147,7 @@ static struct file_operations socket_file_ops = { */ static DEFINE_SPINLOCK(net_family_lock); -static const struct net_proto_family *net_families[NPROTO]; +static const struct net_proto_family *net_families[NPROTO] __read_mostly; /* * Statistics counters of the socket lists @@ -2080,7 +2080,7 @@ asmlinkage long sys_socketcall(int call, unsigned long __user *args) * socket interface. The value ops->family coresponds to the * socket system call protocol family. */ -int sock_register(struct net_proto_family *ops) +int sock_register(const struct net_proto_family *ops) { int err; @@ -2116,10 +2116,9 @@ int sock_register(struct net_proto_family *ops) * a module then it needs to provide its own protection in * the ops->create routine. */ -int sock_unregister(int family) +void sock_unregister(int family) { - if (family < 0 || family >= NPROTO) - return -EINVAL; + BUG_ON(family < 0 || family >= NPROTO); spin_lock(&net_family_lock); net_families[family] = NULL; @@ -2128,7 +2127,6 @@ int sock_unregister(int family) synchronize_rcu(); printk(KERN_INFO "NET: Unregistered protocol family %d\n", family); - return 0; } static int __init sock_init(void) -- cgit v0.10.2 From bf0d52492d82ad70684e17c8a46942c13d0e140e Mon Sep 17 00:00:00 2001 From: Dave Jones Date: Fri, 22 Sep 2006 14:00:29 -0700 Subject: [NET]: Remove unnecessary config.h includes from net/ config.h is automatically included by kbuild these days. Signed-off-by: Dave Jones Signed-off-by: David S. Miller diff --git a/net/atm/atm_sysfs.c b/net/atm/atm_sysfs.c index 5df4b9a..c0a4ae2 100644 --- a/net/atm/atm_sysfs.c +++ b/net/atm/atm_sysfs.c @@ -1,6 +1,5 @@ /* ATM driver model support. */ -#include #include #include #include diff --git a/net/core/dev_mcast.c b/net/core/dev_mcast.c index c57d887..b22648d 100644 --- a/net/core/dev_mcast.c +++ b/net/core/dev_mcast.c @@ -21,8 +21,7 @@ * 2 of the License, or (at your option) any later version. */ -#include -#include +#include #include #include #include diff --git a/net/core/wireless.c b/net/core/wireless.c index de0bde4..348b9da 100644 --- a/net/core/wireless.c +++ b/net/core/wireless.c @@ -72,7 +72,6 @@ /***************************** INCLUDES *****************************/ -#include /* Not needed ??? */ #include #include /* off_t */ #include /* struct ifreq, dev_get_by_name() */ diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index fc40da3..36c38bf 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -67,7 +67,6 @@ * 2 of the License, or (at your option) any later version. */ -#include #include #include #include diff --git a/net/ipv4/ipconfig.c b/net/ipv4/ipconfig.c index cb8a92f..1fbb384 100644 --- a/net/ipv4/ipconfig.c +++ b/net/ipv4/ipconfig.c @@ -31,7 +31,6 @@ * -- Josef Siemes , Aug 2002 */ -#include #include #include #include diff --git a/net/ipv4/netfilter/ip_conntrack_sip.c b/net/ipv4/netfilter/ip_conntrack_sip.c index 4f222d6..2893e9c 100644 --- a/net/ipv4/netfilter/ip_conntrack_sip.c +++ b/net/ipv4/netfilter/ip_conntrack_sip.c @@ -8,7 +8,6 @@ * published by the Free Software Foundation. */ -#include #include #include #include diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c index fe44cb5..0e935b4 100644 --- a/net/ipv4/raw.c +++ b/net/ipv4/raw.c @@ -38,8 +38,7 @@ * as published by the Free Software Foundation; either version * 2 of the License, or (at your option) any later version. */ - -#include + #include #include #include diff --git a/net/ipv4/tcp_lp.c b/net/ipv4/tcp_lp.c index 48f28d6..649ebae 100644 --- a/net/ipv4/tcp_lp.c +++ b/net/ipv4/tcp_lp.c @@ -35,7 +35,6 @@ * Version: $Id: tcp_lp.c,v 1.24 2006/09/05 20:22:53 hswong3i Exp $ */ -#include #include #include diff --git a/net/ipv4/tcp_veno.c b/net/ipv4/tcp_veno.c index 11b42a7..5b2fe6d 100644 --- a/net/ipv4/tcp_veno.c +++ b/net/ipv4/tcp_veno.c @@ -9,7 +9,6 @@ * See http://www.ntu.edu.sg/home5/ZHOU0022/papers/CPFu03a.pdf */ -#include #include #include #include -- cgit v0.10.2 From 1e38bb3a38d08129d08c904b10ea3ba08e22d297 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Thu, 10 Aug 2006 00:22:41 -0700 Subject: [NET]: Kill double initialization in sock_alloc_inode. No need to set ei->socket.flags to zero twice. Signed-off-by: David S. Miller diff --git a/net/socket.c b/net/socket.c index 4147fe4..d6f27ed 100644 --- a/net/socket.c +++ b/net/socket.c @@ -254,7 +254,6 @@ static struct inode *sock_alloc_inode(struct super_block *sb) ei->socket.ops = NULL; ei->socket.sk = NULL; ei->socket.file = NULL; - ei->socket.flags = 0; return &ei->vfs_inode; } -- cgit v0.10.2 From d924424aaed116b362c6d0e667d912b77e655085 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Thu, 10 Aug 2006 23:03:23 -0700 Subject: [NEIGHBOUR]: Use ALIGN() macro. Rather than opencoding the mask, it looks better to use ALIGN() macro from kernel.h. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/include/net/neighbour.h b/include/net/neighbour.h index 74c4b6f..bd187da 100644 --- a/include/net/neighbour.h +++ b/include/net/neighbour.h @@ -101,7 +101,7 @@ struct neighbour __u8 dead; atomic_t probes; rwlock_t lock; - unsigned char ha[(MAX_ADDR_LEN+sizeof(unsigned long)-1)&~(sizeof(unsigned long)-1)]; + unsigned char ha[ALIGN(MAX_ADDR_LEN, sizeof(unsigned long))]; struct hh_cache *hh; atomic_t refcnt; int (*output)(struct sk_buff *skb); -- cgit v0.10.2 From 2dfe55b47e3d66ded5a84caf71e0da5710edf48b Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Thu, 10 Aug 2006 23:08:33 -0700 Subject: [NET]: Use u32 for routing table IDs Use u32 for routing table IDs in net/ipv4 and net/decnet in preparation of support for a larger number of routing tables. net/ipv6 already uses u32 everywhere and needs no further changes. No functional changes are made by this patch. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/net/dn_fib.h b/include/net/dn_fib.h index 32bc8ce..cd9c378 100644 --- a/include/net/dn_fib.h +++ b/include/net/dn_fib.h @@ -94,7 +94,7 @@ struct dn_fib_node { struct dn_fib_table { - int n; + u32 n; int (*insert)(struct dn_fib_table *t, struct rtmsg *r, struct dn_kern_rta *rta, struct nlmsghdr *n, @@ -137,7 +137,7 @@ extern int dn_fib_sync_up(struct net_device *dev); /* * dn_tables.c */ -extern struct dn_fib_table *dn_fib_get_table(int n, int creat); +extern struct dn_fib_table *dn_fib_get_table(u32 n, int creat); extern struct dn_fib_table *dn_fib_empty_table(void); extern void dn_fib_table_init(void); extern void dn_fib_table_cleanup(void); diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h index adf7358..0dcbf16 100644 --- a/include/net/ip_fib.h +++ b/include/net/ip_fib.h @@ -150,7 +150,7 @@ struct fib_result_nl { #endif /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ struct fib_table { - unsigned char tb_id; + u32 tb_id; unsigned tb_stamp; int (*tb_lookup)(struct fib_table *tb, const struct flowi *flp, struct fib_result *res); int (*tb_insert)(struct fib_table *table, struct rtmsg *r, @@ -173,14 +173,14 @@ struct fib_table { extern struct fib_table *ip_fib_local_table; extern struct fib_table *ip_fib_main_table; -static inline struct fib_table *fib_get_table(int id) +static inline struct fib_table *fib_get_table(u32 id) { if (id != RT_TABLE_LOCAL) return ip_fib_main_table; return ip_fib_local_table; } -static inline struct fib_table *fib_new_table(int id) +static inline struct fib_table *fib_new_table(u32 id) { return fib_get_table(id); } @@ -205,9 +205,9 @@ static inline void fib_select_default(const struct flowi *flp, struct fib_result extern struct fib_table * fib_tables[RT_TABLE_MAX+1]; extern int fib_lookup(struct flowi *flp, struct fib_result *res); -extern struct fib_table *__fib_new_table(int id); +extern struct fib_table *__fib_new_table(u32 id); -static inline struct fib_table *fib_get_table(int id) +static inline struct fib_table *fib_get_table(u32 id) { if (id == 0) id = RT_TABLE_MAIN; @@ -215,7 +215,7 @@ static inline struct fib_table *fib_get_table(int id) return fib_tables[id]; } -static inline struct fib_table *fib_new_table(int id) +static inline struct fib_table *fib_new_table(u32 id) { if (id == 0) id = RT_TABLE_MAIN; @@ -248,7 +248,7 @@ extern int fib_convert_rtentry(int cmd, struct nlmsghdr *nl, struct rtmsg *rtm, extern u32 __fib_res_prefsrc(struct fib_result *res); /* Exported by fib_hash.c */ -extern struct fib_table *fib_hash_init(int id); +extern struct fib_table *fib_hash_init(u32 id); #ifdef CONFIG_IP_MULTIPLE_TABLES extern int fib4_rules_dump(struct sk_buff *skb, struct netlink_callback *cb); diff --git a/net/decnet/dn_fib.c b/net/decnet/dn_fib.c index ed5fb5c..7b3bf5c 100644 --- a/net/decnet/dn_fib.c +++ b/net/decnet/dn_fib.c @@ -534,8 +534,8 @@ int dn_fib_rtm_newroute(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) int dn_fib_dump(struct sk_buff *skb, struct netlink_callback *cb) { - int t; - int s_t; + u32 t; + u32 s_t; struct dn_fib_table *tb; if (NLMSG_PAYLOAD(cb->nlh, 0) >= sizeof(struct rtmsg) && @@ -765,7 +765,7 @@ void dn_fib_flush(void) { int flushed = 0; struct dn_fib_table *tb; - int id; + u32 id; for(id = RT_TABLE_MAX; id > 0; id--) { if ((tb = dn_fib_get_table(id, 0)) == NULL) diff --git a/net/decnet/dn_table.c b/net/decnet/dn_table.c index 2e01b67..1601ee5 100644 --- a/net/decnet/dn_table.c +++ b/net/decnet/dn_table.c @@ -264,7 +264,7 @@ static int dn_fib_nh_match(struct rtmsg *r, struct nlmsghdr *nlh, struct dn_kern } static int dn_fib_dump_info(struct sk_buff *skb, u32 pid, u32 seq, int event, - u8 tb_id, u8 type, u8 scope, void *dst, int dst_len, + u32 tb_id, u8 type, u8 scope, void *dst, int dst_len, struct dn_fib_info *fi, unsigned int flags) { struct rtmsg *rtm; @@ -327,7 +327,7 @@ rtattr_failure: } -static void dn_rtmsg_fib(int event, struct dn_fib_node *f, int z, int tb_id, +static void dn_rtmsg_fib(int event, struct dn_fib_node *f, int z, u32 tb_id, struct nlmsghdr *nlh, struct netlink_skb_parms *req) { struct sk_buff *skb; @@ -740,7 +740,7 @@ out: } -struct dn_fib_table *dn_fib_get_table(int n, int create) +struct dn_fib_table *dn_fib_get_table(u32 n, int create) { struct dn_fib_table *t; @@ -777,7 +777,7 @@ struct dn_fib_table *dn_fib_get_table(int n, int create) return t; } -static void dn_fib_del_tree(int n) +static void dn_fib_del_tree(u32 n) { struct dn_fib_table *t; @@ -791,7 +791,7 @@ static void dn_fib_del_tree(int n) struct dn_fib_table *dn_fib_empty_table(void) { - int id; + u32 id; for(id = RT_TABLE_MIN; id <= RT_TABLE_MAX; id++) if (dn_fib_tables[id] == NULL) diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index a83f1aa..06f4b23 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -62,7 +62,7 @@ struct fib_table *ip_fib_main_table; struct fib_table *fib_tables[RT_TABLE_MAX+1]; -struct fib_table *__fib_new_table(int id) +struct fib_table *__fib_new_table(u32 id) { struct fib_table *tb; @@ -82,7 +82,7 @@ static void fib_flush(void) int flushed = 0; #ifdef CONFIG_IP_MULTIPLE_TABLES struct fib_table *tb; - int id; + u32 id; for (id = RT_TABLE_MAX; id>0; id--) { if ((tb = fib_get_table(id))==NULL) @@ -333,8 +333,8 @@ int inet_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) int inet_dump_fib(struct sk_buff *skb, struct netlink_callback *cb) { - int t; - int s_t; + u32 t; + u32 s_t; struct fib_table *tb; if (NLMSG_PAYLOAD(cb->nlh, 0) >= sizeof(struct rtmsg) && diff --git a/net/ipv4/fib_hash.c b/net/ipv4/fib_hash.c index 72c633b..f8d5c80 100644 --- a/net/ipv4/fib_hash.c +++ b/net/ipv4/fib_hash.c @@ -765,9 +765,9 @@ static int fn_hash_dump(struct fib_table *tb, struct sk_buff *skb, struct netlin } #ifdef CONFIG_IP_MULTIPLE_TABLES -struct fib_table * fib_hash_init(int id) +struct fib_table * fib_hash_init(u32 id) #else -struct fib_table * __init fib_hash_init(int id) +struct fib_table * __init fib_hash_init(u32 id) #endif { struct fib_table *tb; diff --git a/net/ipv4/fib_lookup.h b/net/ipv4/fib_lookup.h index ef6609e..ddd5249 100644 --- a/net/ipv4/fib_lookup.h +++ b/net/ipv4/fib_lookup.h @@ -30,11 +30,11 @@ extern struct fib_info *fib_create_info(const struct rtmsg *r, extern int fib_nh_match(struct rtmsg *r, struct nlmsghdr *, struct kern_rta *rta, struct fib_info *fi); extern int fib_dump_info(struct sk_buff *skb, u32 pid, u32 seq, int event, - u8 tb_id, u8 type, u8 scope, void *dst, + u32 tb_id, u8 type, u8 scope, void *dst, int dst_len, u8 tos, struct fib_info *fi, unsigned int); extern void rtmsg_fib(int event, u32 key, struct fib_alias *fa, - int z, int tb_id, + int z, u32 tb_id, struct nlmsghdr *n, struct netlink_skb_parms *req); extern struct fib_alias *fib_find_alias(struct list_head *fah, u8 tos, u32 prio); diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c index d242e52..58fb91b 100644 --- a/net/ipv4/fib_rules.c +++ b/net/ipv4/fib_rules.c @@ -169,7 +169,7 @@ static int fib4_rule_match(struct fib_rule *rule, struct flowi *fl, int flags) static struct fib_table *fib_empty_table(void) { - int id; + u32 id; for (id = 1; id <= RT_TABLE_MAX; id++) if (fib_tables[id] == NULL) diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index 38bca47..c7a112b 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -273,7 +273,7 @@ int ip_fib_check_default(u32 gw, struct net_device *dev) } void rtmsg_fib(int event, u32 key, struct fib_alias *fa, - int z, int tb_id, + int z, u32 tb_id, struct nlmsghdr *n, struct netlink_skb_parms *req) { struct sk_buff *skb; @@ -939,7 +939,7 @@ u32 __fib_res_prefsrc(struct fib_result *res) int fib_dump_info(struct sk_buff *skb, u32 pid, u32 seq, int event, - u8 tb_id, u8 type, u8 scope, void *dst, int dst_len, u8 tos, + u32 tb_id, u8 type, u8 scope, void *dst, int dst_len, u8 tos, struct fib_info *fi, unsigned int flags) { struct rtmsg *rtm; diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c index 01801c0..4a27b2d 100644 --- a/net/ipv4/fib_trie.c +++ b/net/ipv4/fib_trie.c @@ -1148,7 +1148,7 @@ fn_trie_insert(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, key = ntohl(key); - pr_debug("Insert table=%d %08x/%d\n", tb->tb_id, key, plen); + pr_debug("Insert table=%u %08x/%d\n", tb->tb_id, key, plen); mask = ntohl(inet_make_mask(plen)); @@ -1943,9 +1943,9 @@ out: /* Fix more generic FIB names for init later */ #ifdef CONFIG_IP_MULTIPLE_TABLES -struct fib_table * fib_hash_init(int id) +struct fib_table * fib_hash_init(u32 id) #else -struct fib_table * __init fib_hash_init(int id) +struct fib_table * __init fib_hash_init(u32 id) #endif { struct fib_table *tb; -- cgit v0.10.2 From 9e762a4a89b302cb3b26a1f9bb33eff459eaeca9 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Thu, 10 Aug 2006 23:09:48 -0700 Subject: [NET]: Introduce RTA_TABLE/FRA_TABLE attributes Introduce RTA_TABLE route attribute and FRA_TABLE routing rule attribute to hold 32 bit routing table IDs. Usespace compatibility is provided by continuing to accept and send the rtm_table field, but because of its limited size it can only carry the low 8 bits of the table ID. This implies that if larger IDs are used, _all_ userspace programs using them need to use RTA_TABLE. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/fib_rules.h b/include/linux/fib_rules.h index 5e503f0..19a82b6 100644 --- a/include/linux/fib_rules.h +++ b/include/linux/fib_rules.h @@ -36,6 +36,10 @@ enum FRA_UNUSED5, FRA_FWMARK, /* netfilter mark (IPv4) */ FRA_FLOW, /* flow/class id */ + FRA_UNUSED6, + FRA_UNUSED7, + FRA_UNUSED8, + FRA_TABLE, /* Extended table id */ __FRA_MAX }; diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h index 0aaffa2..ea422a5 100644 --- a/include/linux/rtnetlink.h +++ b/include/linux/rtnetlink.h @@ -264,6 +264,7 @@ enum rtattr_type_t RTA_CACHEINFO, RTA_SESSION, RTA_MP_ALGO, + RTA_TABLE, __RTA_MAX }; @@ -717,6 +718,13 @@ extern void __rtnl_unlock(void); } \ } while(0) +static inline u32 rtm_get_table(struct rtattr **rta, u8 table) +{ + return RTA_GET_U32(rta[RTA_TABLE-1]); +rtattr_failure: + return table; +} + #endif /* __KERNEL__ */ diff --git a/include/net/fib_rules.h b/include/net/fib_rules.h index 61375d9..8e2f473 100644 --- a/include/net/fib_rules.h +++ b/include/net/fib_rules.h @@ -74,6 +74,13 @@ static inline void fib_rule_put(struct fib_rule *rule) call_rcu(&rule->rcu, fib_rule_put_rcu); } +static inline u32 frh_get_table(struct fib_rule_hdr *frh, struct nlattr **nla) +{ + if (nla[FRA_TABLE]) + return nla_get_u32(nla[FRA_TABLE]); + return frh->table; +} + extern int fib_rules_register(struct fib_rules_ops *); extern int fib_rules_unregister(struct fib_rules_ops *); diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c index 6cdad24..873b04d 100644 --- a/net/core/fib_rules.c +++ b/net/core/fib_rules.c @@ -187,7 +187,7 @@ int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) rule->action = frh->action; rule->flags = frh->flags; - rule->table = frh->table; + rule->table = frh_get_table(frh, tb); if (!rule->pref && ops->default_pref) rule->pref = ops->default_pref(); @@ -245,7 +245,7 @@ int fib_nl_delrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) if (frh->action && (frh->action != rule->action)) continue; - if (frh->table && (frh->table != rule->table)) + if (frh->table && (frh_get_table(frh, tb) != rule->table)) continue; if (tb[FRA_PRIORITY] && @@ -291,6 +291,7 @@ static int fib_nl_fill_rule(struct sk_buff *skb, struct fib_rule *rule, frh = nlmsg_data(nlh); frh->table = rule->table; + NLA_PUT_U32(skb, FRA_TABLE, rule->table); frh->res1 = 0; frh->res2 = 0; frh->action = rule->action; diff --git a/net/decnet/dn_fib.c b/net/decnet/dn_fib.c index 7b3bf5c..fb59637 100644 --- a/net/decnet/dn_fib.c +++ b/net/decnet/dn_fib.c @@ -491,7 +491,8 @@ static int dn_fib_check_attr(struct rtmsg *r, struct rtattr **rta) if (attr) { if (RTA_PAYLOAD(attr) < 4 && RTA_PAYLOAD(attr) != 2) return -EINVAL; - if (i != RTA_MULTIPATH && i != RTA_METRICS) + if (i != RTA_MULTIPATH && i != RTA_METRICS && + i != RTA_TABLE) rta[i-1] = (struct rtattr *)RTA_DATA(attr); } } @@ -508,7 +509,7 @@ int dn_fib_rtm_delroute(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) if (dn_fib_check_attr(r, rta)) return -EINVAL; - tb = dn_fib_get_table(r->rtm_table, 0); + tb = dn_fib_get_table(rtm_get_table(rta, r->rtm_table), 0); if (tb) return tb->delete(tb, r, (struct dn_kern_rta *)rta, nlh, &NETLINK_CB(skb)); @@ -524,7 +525,7 @@ int dn_fib_rtm_newroute(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) if (dn_fib_check_attr(r, rta)) return -EINVAL; - tb = dn_fib_get_table(r->rtm_table, 1); + tb = dn_fib_get_table(rtm_get_table(rta, r->rtm_table), 1); if (tb) return tb->insert(tb, r, (struct dn_kern_rta *)rta, nlh, &NETLINK_CB(skb)); diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c index 5e6f4616..4c96321 100644 --- a/net/decnet/dn_route.c +++ b/net/decnet/dn_route.c @@ -1486,6 +1486,7 @@ static int dn_rt_fill_info(struct sk_buff *skb, u32 pid, u32 seq, r->rtm_src_len = 0; r->rtm_tos = 0; r->rtm_table = RT_TABLE_MAIN; + RTA_PUT_U32(skb, RTA_TABLE, RT_TABLE_MAIN); r->rtm_type = rt->rt_type; r->rtm_flags = (rt->rt_flags & ~0xFFFF) | RTM_F_CLONED; r->rtm_scope = RT_SCOPE_UNIVERSE; diff --git a/net/decnet/dn_table.c b/net/decnet/dn_table.c index 1601ee5..eca7c1e 100644 --- a/net/decnet/dn_table.c +++ b/net/decnet/dn_table.c @@ -278,6 +278,7 @@ static int dn_fib_dump_info(struct sk_buff *skb, u32 pid, u32 seq, int event, rtm->rtm_src_len = 0; rtm->rtm_tos = 0; rtm->rtm_table = tb_id; + RTA_PUT_U32(skb, RTA_TABLE, tb_id); rtm->rtm_flags = fi->fib_flags; rtm->rtm_scope = scope; rtm->rtm_type = type; diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index 06f4b23..2696ede 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -294,7 +294,8 @@ static int inet_check_attr(struct rtmsg *r, struct rtattr **rta) if (attr) { if (RTA_PAYLOAD(attr) < 4) return -EINVAL; - if (i != RTA_MULTIPATH && i != RTA_METRICS) + if (i != RTA_MULTIPATH && i != RTA_METRICS && + i != RTA_TABLE) *rta = (struct rtattr*)RTA_DATA(attr); } } @@ -310,7 +311,7 @@ int inet_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) if (inet_check_attr(r, rta)) return -EINVAL; - tb = fib_get_table(r->rtm_table); + tb = fib_get_table(rtm_get_table(rta, r->rtm_table)); if (tb) return tb->tb_delete(tb, r, (struct kern_rta*)rta, nlh, &NETLINK_CB(skb)); return -ESRCH; @@ -325,7 +326,7 @@ int inet_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) if (inet_check_attr(r, rta)) return -EINVAL; - tb = fib_new_table(r->rtm_table); + tb = fib_new_table(rtm_get_table(rta, r->rtm_table)); if (tb) return tb->tb_insert(tb, r, (struct kern_rta*)rta, nlh, &NETLINK_CB(skb)); return -ENOBUFS; diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c index 58fb91b..0330b9c 100644 --- a/net/ipv4/fib_rules.c +++ b/net/ipv4/fib_rules.c @@ -184,6 +184,7 @@ static struct nla_policy fib4_rule_policy[FRA_MAX+1] __read_mostly = { [FRA_DST] = { .type = NLA_U32 }, [FRA_FWMARK] = { .type = NLA_U32 }, [FRA_FLOW] = { .type = NLA_U32 }, + [FRA_TABLE] = { .type = NLA_U32 }, }; static int fib4_rule_configure(struct fib_rule *rule, struct sk_buff *skb, diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index c7a112b..ab753df 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -953,6 +953,7 @@ fib_dump_info(struct sk_buff *skb, u32 pid, u32 seq, int event, rtm->rtm_src_len = 0; rtm->rtm_tos = tos; rtm->rtm_table = tb_id; + RTA_PUT_U32(skb, RTA_TABLE, tb_id); rtm->rtm_type = type; rtm->rtm_flags = fi->fib_flags; rtm->rtm_scope = scope; diff --git a/net/ipv4/route.c b/net/ipv4/route.c index b873cbc..12128b8 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -2652,6 +2652,7 @@ static int rt_fill_info(struct sk_buff *skb, u32 pid, u32 seq, int event, r->rtm_src_len = 0; r->rtm_tos = rt->fl.fl4_tos; r->rtm_table = RT_TABLE_MAIN; + RTA_PUT_U32(skb, RTA_TABLE, RT_TABLE_MAIN); r->rtm_type = rt->rt_type; r->rtm_scope = RT_SCOPE_UNIVERSE; r->rtm_protocol = RTPROT_UNSPEC; diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c index 22a2fdb..2c4fbc8 100644 --- a/net/ipv6/fib6_rules.c +++ b/net/ipv6/fib6_rules.c @@ -129,6 +129,7 @@ static struct nla_policy fib6_rule_policy[RTA_MAX+1] __read_mostly = { [FRA_PRIORITY] = { .type = NLA_U32 }, [FRA_SRC] = { .minlen = sizeof(struct in6_addr) }, [FRA_DST] = { .minlen = sizeof(struct in6_addr) }, + [FRA_TABLE] = { .type = NLA_U32 }, }; static int fib6_rule_configure(struct fib_rule *rule, struct sk_buff *skb, diff --git a/net/ipv6/route.c b/net/ipv6/route.c index e08d840..843c550 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1859,7 +1859,8 @@ int inet6_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) if (inet6_rtm_to_rtmsg(r, arg, &rtmsg)) return -EINVAL; - return ip6_route_del(&rtmsg, nlh, arg, &NETLINK_CB(skb), r->rtm_table); + return ip6_route_del(&rtmsg, nlh, arg, &NETLINK_CB(skb), + rtm_get_table(arg, r->rtm_table)); } int inet6_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) @@ -1869,7 +1870,8 @@ int inet6_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) if (inet6_rtm_to_rtmsg(r, arg, &rtmsg)) return -EINVAL; - return ip6_route_add(&rtmsg, nlh, arg, &NETLINK_CB(skb), r->rtm_table); + return ip6_route_add(&rtmsg, nlh, arg, &NETLINK_CB(skb), + rtm_get_table(arg, r->rtm_table)); } struct rt6_rtnl_dump_arg @@ -1887,6 +1889,7 @@ static int rt6_fill_node(struct sk_buff *skb, struct rt6_info *rt, struct nlmsghdr *nlh; unsigned char *b = skb->tail; struct rta_cacheinfo ci; + u32 table; if (prefix) { /* user wants prefix routes only */ if (!(rt->rt6i_flags & RTF_PREFIX_RT)) { @@ -1902,9 +1905,11 @@ static int rt6_fill_node(struct sk_buff *skb, struct rt6_info *rt, rtm->rtm_src_len = rt->rt6i_src.plen; rtm->rtm_tos = 0; if (rt->rt6i_table) - rtm->rtm_table = rt->rt6i_table->tb6_id; + table = rt->rt6i_table->tb6_id; else - rtm->rtm_table = RT6_TABLE_UNSPEC; + table = RT6_TABLE_UNSPEC; + rtm->rtm_table = table; + RTA_PUT_U32(skb, RTA_TABLE, table); if (rt->rt6i_flags&RTF_REJECT) rtm->rtm_type = RTN_UNREACHABLE; else if (rt->rt6i_dev && (rt->rt6i_dev->flags&IFF_LOOPBACK)) -- cgit v0.10.2 From 1af5a8c4a11cfed0c9a7f30fcfb689981750599c Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Thu, 10 Aug 2006 23:10:46 -0700 Subject: [IPV4]: Increase number of possible routing tables to 2^32 Increase the number of possible routing tables to 2^32 by replacing the fixed sized array of pointers by a hash table and replacing iterations over all possible table IDs by hash table walking. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h index 0dcbf16..8e9ba56 100644 --- a/include/net/ip_fib.h +++ b/include/net/ip_fib.h @@ -150,6 +150,7 @@ struct fib_result_nl { #endif /* CONFIG_IP_ROUTE_MULTIPATH_WRANDOM */ struct fib_table { + struct hlist_node tb_hlist; u32 tb_id; unsigned tb_stamp; int (*tb_lookup)(struct fib_table *tb, const struct flowi *flp, struct fib_result *res); @@ -200,29 +201,13 @@ static inline void fib_select_default(const struct flowi *flp, struct fib_result } #else /* CONFIG_IP_MULTIPLE_TABLES */ -#define ip_fib_local_table (fib_tables[RT_TABLE_LOCAL]) -#define ip_fib_main_table (fib_tables[RT_TABLE_MAIN]) +#define ip_fib_local_table fib_get_table(RT_TABLE_LOCAL) +#define ip_fib_main_table fib_get_table(RT_TABLE_MAIN) -extern struct fib_table * fib_tables[RT_TABLE_MAX+1]; extern int fib_lookup(struct flowi *flp, struct fib_result *res); -extern struct fib_table *__fib_new_table(u32 id); - -static inline struct fib_table *fib_get_table(u32 id) -{ - if (id == 0) - id = RT_TABLE_MAIN; - - return fib_tables[id]; -} - -static inline struct fib_table *fib_new_table(u32 id) -{ - if (id == 0) - id = RT_TABLE_MAIN; - - return fib_tables[id] ? : __fib_new_table(id); -} +extern struct fib_table *fib_new_table(u32 id); +extern struct fib_table *fib_get_table(u32 id); extern void fib_select_default(const struct flowi *flp, struct fib_result *res); #endif /* CONFIG_IP_MULTIPLE_TABLES */ diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index 2696ede..ad4c14f 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -37,6 +37,7 @@ #include #include #include +#include #include #include @@ -51,48 +52,67 @@ #ifndef CONFIG_IP_MULTIPLE_TABLES -#define RT_TABLE_MIN RT_TABLE_MAIN - struct fib_table *ip_fib_local_table; struct fib_table *ip_fib_main_table; -#else +#define FIB_TABLE_HASHSZ 1 +static struct hlist_head fib_table_hash[FIB_TABLE_HASHSZ]; -#define RT_TABLE_MIN 1 +#else -struct fib_table *fib_tables[RT_TABLE_MAX+1]; +#define FIB_TABLE_HASHSZ 256 +static struct hlist_head fib_table_hash[FIB_TABLE_HASHSZ]; -struct fib_table *__fib_new_table(u32 id) +struct fib_table *fib_new_table(u32 id) { struct fib_table *tb; + unsigned int h; + if (id == 0) + id = RT_TABLE_MAIN; + tb = fib_get_table(id); + if (tb) + return tb; tb = fib_hash_init(id); if (!tb) return NULL; - fib_tables[id] = tb; + h = id & (FIB_TABLE_HASHSZ - 1); + hlist_add_head_rcu(&tb->tb_hlist, &fib_table_hash[h]); return tb; } +struct fib_table *fib_get_table(u32 id) +{ + struct fib_table *tb; + struct hlist_node *node; + unsigned int h; + if (id == 0) + id = RT_TABLE_MAIN; + h = id & (FIB_TABLE_HASHSZ - 1); + rcu_read_lock(); + hlist_for_each_entry_rcu(tb, node, &fib_table_hash[h], tb_hlist) { + if (tb->tb_id == id) { + rcu_read_unlock(); + return tb; + } + } + rcu_read_unlock(); + return NULL; +} #endif /* CONFIG_IP_MULTIPLE_TABLES */ - static void fib_flush(void) { int flushed = 0; -#ifdef CONFIG_IP_MULTIPLE_TABLES struct fib_table *tb; - u32 id; + struct hlist_node *node; + unsigned int h; - for (id = RT_TABLE_MAX; id>0; id--) { - if ((tb = fib_get_table(id))==NULL) - continue; - flushed += tb->tb_flush(tb); + for (h = 0; h < FIB_TABLE_HASHSZ; h++) { + hlist_for_each_entry(tb, node, &fib_table_hash[h], tb_hlist) + flushed += tb->tb_flush(tb); } -#else /* CONFIG_IP_MULTIPLE_TABLES */ - flushed += ip_fib_main_table->tb_flush(ip_fib_main_table); - flushed += ip_fib_local_table->tb_flush(ip_fib_local_table); -#endif /* CONFIG_IP_MULTIPLE_TABLES */ if (flushed) rt_cache_flush(-1); @@ -334,29 +354,37 @@ int inet_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) int inet_dump_fib(struct sk_buff *skb, struct netlink_callback *cb) { - u32 t; - u32 s_t; + unsigned int h, s_h; + unsigned int e = 0, s_e; struct fib_table *tb; + struct hlist_node *node; + int dumped = 0; if (NLMSG_PAYLOAD(cb->nlh, 0) >= sizeof(struct rtmsg) && ((struct rtmsg*)NLMSG_DATA(cb->nlh))->rtm_flags&RTM_F_CLONED) return ip_rt_dump(skb, cb); - s_t = cb->args[0]; - if (s_t == 0) - s_t = cb->args[0] = RT_TABLE_MIN; - - for (t=s_t; t<=RT_TABLE_MAX; t++) { - if (t < s_t) continue; - if (t > s_t) - memset(&cb->args[1], 0, sizeof(cb->args)-sizeof(cb->args[0])); - if ((tb = fib_get_table(t))==NULL) - continue; - if (tb->tb_dump(tb, skb, cb) < 0) - break; + s_h = cb->args[0]; + s_e = cb->args[1]; + + for (h = s_h; h < FIB_TABLE_HASHSZ; h++, s_e = 0) { + e = 0; + hlist_for_each_entry(tb, node, &fib_table_hash[h], tb_hlist) { + if (e < s_e) + goto next; + if (dumped) + memset(&cb->args[2], 0, sizeof(cb->args) - + 2 * sizeof(cb->args[0])); + if (tb->tb_dump(tb, skb, cb) < 0) + goto out; + dumped = 1; +next: + e++; + } } - - cb->args[0] = t; +out: + cb->args[1] = e; + cb->args[0] = h; return skb->len; } @@ -654,9 +682,15 @@ static struct notifier_block fib_netdev_notifier = { void __init ip_fib_init(void) { + unsigned int i; + + for (i = 0; i < FIB_TABLE_HASHSZ; i++) + INIT_HLIST_HEAD(&fib_table_hash[i]); #ifndef CONFIG_IP_MULTIPLE_TABLES ip_fib_local_table = fib_hash_init(RT_TABLE_LOCAL); + hlist_add_head_rcu(&ip_fib_local_table->tb_hlist, &fib_table_hash[0]); ip_fib_main_table = fib_hash_init(RT_TABLE_MAIN); + hlist_add_head_rcu(&ip_fib_main_table->tb_hlist, &fib_table_hash[0]); #else fib4_rules_init(); #endif diff --git a/net/ipv4/fib_hash.c b/net/ipv4/fib_hash.c index f8d5c80..b5bee1a 100644 --- a/net/ipv4/fib_hash.c +++ b/net/ipv4/fib_hash.c @@ -684,7 +684,7 @@ fn_hash_dump_bucket(struct sk_buff *skb, struct netlink_callback *cb, struct fib_node *f; int i, s_i; - s_i = cb->args[3]; + s_i = cb->args[4]; i = 0; hlist_for_each_entry(f, node, head, fn_hash) { struct fib_alias *fa; @@ -704,14 +704,14 @@ fn_hash_dump_bucket(struct sk_buff *skb, struct netlink_callback *cb, fa->fa_tos, fa->fa_info, NLM_F_MULTI) < 0) { - cb->args[3] = i; + cb->args[4] = i; return -1; } next: i++; } } - cb->args[3] = i; + cb->args[4] = i; return skb->len; } @@ -722,21 +722,21 @@ fn_hash_dump_zone(struct sk_buff *skb, struct netlink_callback *cb, { int h, s_h; - s_h = cb->args[2]; + s_h = cb->args[3]; for (h=0; h < fz->fz_divisor; h++) { if (h < s_h) continue; if (h > s_h) - memset(&cb->args[3], 0, - sizeof(cb->args) - 3*sizeof(cb->args[0])); + memset(&cb->args[4], 0, + sizeof(cb->args) - 4*sizeof(cb->args[0])); if (fz->fz_hash == NULL || hlist_empty(&fz->fz_hash[h])) continue; if (fn_hash_dump_bucket(skb, cb, tb, fz, &fz->fz_hash[h])<0) { - cb->args[2] = h; + cb->args[3] = h; return -1; } } - cb->args[2] = h; + cb->args[3] = h; return skb->len; } @@ -746,21 +746,21 @@ static int fn_hash_dump(struct fib_table *tb, struct sk_buff *skb, struct netlin struct fn_zone *fz; struct fn_hash *table = (struct fn_hash*)tb->tb_data; - s_m = cb->args[1]; + s_m = cb->args[2]; read_lock(&fib_hash_lock); for (fz = table->fn_zone_list, m=0; fz; fz = fz->fz_next, m++) { if (m < s_m) continue; if (m > s_m) - memset(&cb->args[2], 0, - sizeof(cb->args) - 2*sizeof(cb->args[0])); + memset(&cb->args[3], 0, + sizeof(cb->args) - 3*sizeof(cb->args[0])); if (fn_hash_dump_zone(skb, cb, tb, fz) < 0) { - cb->args[1] = m; + cb->args[2] = m; read_unlock(&fib_hash_lock); return -1; } } read_unlock(&fib_hash_lock); - cb->args[1] = m; + cb->args[2] = m; return skb->len; } diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c index 0330b9c..ce185ac 100644 --- a/net/ipv4/fib_rules.c +++ b/net/ipv4/fib_rules.c @@ -172,8 +172,8 @@ static struct fib_table *fib_empty_table(void) u32 id; for (id = 1; id <= RT_TABLE_MAX; id++) - if (fib_tables[id] == NULL) - return __fib_new_table(id); + if (fib_get_table(id) == NULL) + return fib_new_table(id); return NULL; } diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c index 4a27b2d..2a580eb 100644 --- a/net/ipv4/fib_trie.c +++ b/net/ipv4/fib_trie.c @@ -1848,7 +1848,7 @@ static int fn_trie_dump_fa(t_key key, int plen, struct list_head *fah, struct fi u32 xkey = htonl(key); - s_i = cb->args[3]; + s_i = cb->args[4]; i = 0; /* rcu_read_lock is hold by caller */ @@ -1870,12 +1870,12 @@ static int fn_trie_dump_fa(t_key key, int plen, struct list_head *fah, struct fi plen, fa->fa_tos, fa->fa_info, 0) < 0) { - cb->args[3] = i; + cb->args[4] = i; return -1; } i++; } - cb->args[3] = i; + cb->args[4] = i; return skb->len; } @@ -1886,14 +1886,14 @@ static int fn_trie_dump_plen(struct trie *t, int plen, struct fib_table *tb, str struct list_head *fa_head; struct leaf *l = NULL; - s_h = cb->args[2]; + s_h = cb->args[3]; for (h = 0; (l = nextleaf(t, l)) != NULL; h++) { if (h < s_h) continue; if (h > s_h) - memset(&cb->args[3], 0, - sizeof(cb->args) - 3*sizeof(cb->args[0])); + memset(&cb->args[4], 0, + sizeof(cb->args) - 4*sizeof(cb->args[0])); fa_head = get_fa_head(l, plen); @@ -1904,11 +1904,11 @@ static int fn_trie_dump_plen(struct trie *t, int plen, struct fib_table *tb, str continue; if (fn_trie_dump_fa(l->key, plen, fa_head, tb, skb, cb)<0) { - cb->args[2] = h; + cb->args[3] = h; return -1; } } - cb->args[2] = h; + cb->args[3] = h; return skb->len; } @@ -1917,23 +1917,23 @@ static int fn_trie_dump(struct fib_table *tb, struct sk_buff *skb, struct netlin int m, s_m; struct trie *t = (struct trie *) tb->tb_data; - s_m = cb->args[1]; + s_m = cb->args[2]; rcu_read_lock(); for (m = 0; m <= 32; m++) { if (m < s_m) continue; if (m > s_m) - memset(&cb->args[2], 0, - sizeof(cb->args) - 2*sizeof(cb->args[0])); + memset(&cb->args[3], 0, + sizeof(cb->args) - 3*sizeof(cb->args[0])); if (fn_trie_dump_plen(t, 32-m, tb, skb, cb)<0) { - cb->args[1] = m; + cb->args[2] = m; goto out; } } rcu_read_unlock(); - cb->args[1] = m; + cb->args[2] = m; return skb->len; out: rcu_read_unlock(); -- cgit v0.10.2 From 1b43af5480c351dbcb2eef478bafe179cbeb6e83 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Thu, 10 Aug 2006 23:11:17 -0700 Subject: [IPV6]: Increase number of possible routing tables to 2^32 Increase number of possible routing tables to 2^32 by replacing iterations over all possible table IDs by hash table walking. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h index 9bfa3cc..01bfe40 100644 --- a/include/net/ip6_route.h +++ b/include/net/ip6_route.h @@ -137,6 +137,13 @@ extern int inet6_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *a extern int inet6_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); extern int inet6_rtm_getroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); +struct rt6_rtnl_dump_arg +{ + struct sk_buff *skb; + struct netlink_callback *cb; +}; + +extern int rt6_dump_route(struct rt6_info *rt, void *p_arg); extern void rt6_ifdown(struct net_device *dev); extern void rt6_mtu_change(struct net_device *dev, unsigned mtu); diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 1f23161..bececbe 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -158,7 +158,26 @@ static struct fib6_table fib6_main_tbl = { }; #ifdef CONFIG_IPV6_MULTIPLE_TABLES +#define FIB_TABLE_HASHSZ 256 +#else +#define FIB_TABLE_HASHSZ 1 +#endif +static struct hlist_head fib_table_hash[FIB_TABLE_HASHSZ]; + +static void fib6_link_table(struct fib6_table *tb) +{ + unsigned int h; + + h = tb->tb6_id & (FIB_TABLE_HASHSZ - 1); + /* + * No protection necessary, this is the only list mutatation + * operation, tables never disappear once they exist. + */ + hlist_add_head_rcu(&tb->tb6_hlist, &fib_table_hash[h]); +} + +#ifdef CONFIG_IPV6_MULTIPLE_TABLES static struct fib6_table fib6_local_tbl = { .tb6_id = RT6_TABLE_LOCAL, .tb6_lock = RW_LOCK_UNLOCKED, @@ -168,9 +187,6 @@ static struct fib6_table fib6_local_tbl = { }, }; -#define FIB_TABLE_HASHSZ 256 -static struct hlist_head fib_table_hash[FIB_TABLE_HASHSZ]; - static struct fib6_table *fib6_alloc_table(u32 id) { struct fib6_table *table; @@ -186,19 +202,6 @@ static struct fib6_table *fib6_alloc_table(u32 id) return table; } -static void fib6_link_table(struct fib6_table *tb) -{ - unsigned int h; - - h = tb->tb6_id & (FIB_TABLE_HASHSZ - 1); - - /* - * No protection necessary, this is the only list mutatation - * operation, tables never disappear once they exist. - */ - hlist_add_head_rcu(&tb->tb6_hlist, &fib_table_hash[h]); -} - struct fib6_table *fib6_new_table(u32 id) { struct fib6_table *tb; @@ -263,10 +266,135 @@ struct dst_entry *fib6_rule_lookup(struct flowi *fl, int flags, static void __init fib6_tables_init(void) { + fib6_link_table(&fib6_main_tbl); } #endif +static int fib6_dump_node(struct fib6_walker_t *w) +{ + int res; + struct rt6_info *rt; + + for (rt = w->leaf; rt; rt = rt->u.next) { + res = rt6_dump_route(rt, w->args); + if (res < 0) { + /* Frame is full, suspend walking */ + w->leaf = rt; + return 1; + } + BUG_TRAP(res!=0); + } + w->leaf = NULL; + return 0; +} + +static void fib6_dump_end(struct netlink_callback *cb) +{ + struct fib6_walker_t *w = (void*)cb->args[2]; + + if (w) { + cb->args[2] = 0; + kfree(w); + } + cb->done = (void*)cb->args[3]; + cb->args[1] = 3; +} + +static int fib6_dump_done(struct netlink_callback *cb) +{ + fib6_dump_end(cb); + return cb->done ? cb->done(cb) : 0; +} + +static int fib6_dump_table(struct fib6_table *table, struct sk_buff *skb, + struct netlink_callback *cb) +{ + struct fib6_walker_t *w; + int res; + + w = (void *)cb->args[2]; + w->root = &table->tb6_root; + + if (cb->args[4] == 0) { + read_lock_bh(&table->tb6_lock); + res = fib6_walk(w); + read_unlock_bh(&table->tb6_lock); + if (res > 0) + cb->args[4] = 1; + } else { + read_lock_bh(&table->tb6_lock); + res = fib6_walk_continue(w); + read_unlock_bh(&table->tb6_lock); + if (res != 0) { + if (res < 0) + fib6_walker_unlink(w); + goto end; + } + fib6_walker_unlink(w); + cb->args[4] = 0; + } +end: + return res; +} + +int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb) +{ + unsigned int h, s_h; + unsigned int e = 0, s_e; + struct rt6_rtnl_dump_arg arg; + struct fib6_walker_t *w; + struct fib6_table *tb; + struct hlist_node *node; + int res = 0; + + s_h = cb->args[0]; + s_e = cb->args[1]; + + w = (void *)cb->args[2]; + if (w == NULL) { + /* New dump: + * + * 1. hook callback destructor. + */ + cb->args[3] = (long)cb->done; + cb->done = fib6_dump_done; + + /* + * 2. allocate and initialize walker. + */ + w = kzalloc(sizeof(*w), GFP_ATOMIC); + if (w == NULL) + return -ENOMEM; + w->func = fib6_dump_node; + cb->args[2] = (long)w; + } + + arg.skb = skb; + arg.cb = cb; + w->args = &arg; + + for (h = s_h; h < FIB_TABLE_HASHSZ; h++, s_e = 0) { + e = 0; + hlist_for_each_entry(tb, node, &fib_table_hash[h], tb6_hlist) { + if (e < s_e) + goto next; + res = fib6_dump_table(tb, skb, cb); + if (res != 0) + goto out; +next: + e++; + } + } +out: + cb->args[1] = e; + cb->args[0] = h; + + res = res < 0 ? res : skb->len; + if (res <= 0) + fib6_dump_end(cb); + return res; +} /* * Routing Table @@ -1187,17 +1315,20 @@ static void fib6_clean_tree(struct fib6_node *root, void fib6_clean_all(int (*func)(struct rt6_info *, void *arg), int prune, void *arg) { - int i; struct fib6_table *table; + struct hlist_node *node; + unsigned int h; - for (i = FIB6_TABLE_MIN; i <= FIB6_TABLE_MAX; i++) { - table = fib6_get_table(i); - if (table != NULL) { + rcu_read_lock(); + for (h = 0; h < FIB_TABLE_HASHSZ; h++) { + hlist_for_each_entry_rcu(table, node, &fib_table_hash[h], + tb6_hlist) { write_lock_bh(&table->tb6_lock); fib6_clean_tree(&table->tb6_root, func, prune, arg); write_unlock_bh(&table->tb6_lock); } } + rcu_read_unlock(); } static int fib6_prune_clone(struct rt6_info *rt, void *arg) diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 843c550..9ce28277 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1874,12 +1874,6 @@ int inet6_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) rtm_get_table(arg, r->rtm_table)); } -struct rt6_rtnl_dump_arg -{ - struct sk_buff *skb; - struct netlink_callback *cb; -}; - static int rt6_fill_node(struct sk_buff *skb, struct rt6_info *rt, struct in6_addr *dst, struct in6_addr *src, int iif, int type, u32 pid, u32 seq, @@ -1976,7 +1970,7 @@ rtattr_failure: return -1; } -static int rt6_dump_route(struct rt6_info *rt, void *p_arg) +int rt6_dump_route(struct rt6_info *rt, void *p_arg) { struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg; int prefix; @@ -1992,126 +1986,6 @@ static int rt6_dump_route(struct rt6_info *rt, void *p_arg) prefix, NLM_F_MULTI); } -static int fib6_dump_node(struct fib6_walker_t *w) -{ - int res; - struct rt6_info *rt; - - for (rt = w->leaf; rt; rt = rt->u.next) { - res = rt6_dump_route(rt, w->args); - if (res < 0) { - /* Frame is full, suspend walking */ - w->leaf = rt; - return 1; - } - BUG_TRAP(res!=0); - } - w->leaf = NULL; - return 0; -} - -static void fib6_dump_end(struct netlink_callback *cb) -{ - struct fib6_walker_t *w = (void*)cb->args[0]; - - if (w) { - cb->args[0] = 0; - kfree(w); - } - cb->done = (void*)cb->args[1]; - cb->args[1] = 0; -} - -static int fib6_dump_done(struct netlink_callback *cb) -{ - fib6_dump_end(cb); - return cb->done ? cb->done(cb) : 0; -} - -int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb) -{ - struct fib6_table *table; - struct rt6_rtnl_dump_arg arg; - struct fib6_walker_t *w; - int i, res = 0; - - arg.skb = skb; - arg.cb = cb; - - /* - * cb->args[0] = pointer to walker structure - * cb->args[1] = saved cb->done() pointer - * cb->args[2] = current table being dumped - */ - - w = (void*)cb->args[0]; - if (w == NULL) { - /* New dump: - * - * 1. hook callback destructor. - */ - cb->args[1] = (long)cb->done; - cb->done = fib6_dump_done; - - /* - * 2. allocate and initialize walker. - */ - w = kzalloc(sizeof(*w), GFP_ATOMIC); - if (w == NULL) - return -ENOMEM; - w->func = fib6_dump_node; - w->args = &arg; - cb->args[0] = (long)w; - cb->args[2] = FIB6_TABLE_MIN; - } else { - w->args = &arg; - i = cb->args[2]; - if (i > FIB6_TABLE_MAX) - goto end; - - table = fib6_get_table(i); - if (table != NULL) { - read_lock_bh(&table->tb6_lock); - w->root = &table->tb6_root; - res = fib6_walk_continue(w); - read_unlock_bh(&table->tb6_lock); - if (res != 0) { - if (res < 0) - fib6_walker_unlink(w); - goto end; - } - } - - fib6_walker_unlink(w); - cb->args[2] = ++i; - } - - for (i = cb->args[2]; i <= FIB6_TABLE_MAX; i++) { - table = fib6_get_table(i); - if (table == NULL) - continue; - - read_lock_bh(&table->tb6_lock); - w->root = &table->tb6_root; - res = fib6_walk(w); - read_unlock_bh(&table->tb6_lock); - if (res) - break; - } -end: - cb->args[2] = i; - - res = res < 0 ? res : skb->len; - /* res < 0 is an error. (really, impossible) - res == 0 means that dump is complete, but skb still can contain data. - res > 0 dump is not complete, but frame is full. - */ - /* Destroy walker, if dump of this table is complete. */ - if (res <= 0) - fib6_dump_end(cb); - return res; -} - int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr* nlh, void *arg) { struct rtattr **rta = arg; -- cgit v0.10.2 From abcab268303c22d24fc89fedd35d82271d20f5da Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Thu, 10 Aug 2006 23:11:47 -0700 Subject: [DECNET]: Increase number of possible routing tables to 2^32 Increase the number of possible routing tables to 2^32 by replacing the fixed sized array of pointers by a hash table and replacing iterations over all possible table IDs by hash table walking. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/net/dn_fib.h b/include/net/dn_fib.h index cd9c378..d97aa10 100644 --- a/include/net/dn_fib.h +++ b/include/net/dn_fib.h @@ -94,6 +94,7 @@ struct dn_fib_node { struct dn_fib_table { + struct hlist_node hlist; u32 n; int (*insert)(struct dn_fib_table *t, struct rtmsg *r, @@ -177,8 +178,6 @@ static inline void dn_fib_res_put(struct dn_fib_res *res) fib_rule_put(res->r); } -extern struct dn_fib_table *dn_fib_tables[]; - #else /* Endnode */ #define dn_fib_init() do { } while(0) diff --git a/net/decnet/dn_fib.c b/net/decnet/dn_fib.c index fb59637..5ccca3e 100644 --- a/net/decnet/dn_fib.c +++ b/net/decnet/dn_fib.c @@ -532,39 +532,6 @@ int dn_fib_rtm_newroute(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) return -ENOBUFS; } - -int dn_fib_dump(struct sk_buff *skb, struct netlink_callback *cb) -{ - u32 t; - u32 s_t; - struct dn_fib_table *tb; - - if (NLMSG_PAYLOAD(cb->nlh, 0) >= sizeof(struct rtmsg) && - ((struct rtmsg *)NLMSG_DATA(cb->nlh))->rtm_flags&RTM_F_CLONED) - return dn_cache_dump(skb, cb); - - s_t = cb->args[0]; - if (s_t == 0) - s_t = cb->args[0] = RT_MIN_TABLE; - - for(t = s_t; t <= RT_TABLE_MAX; t++) { - if (t < s_t) - continue; - if (t > s_t) - memset(&cb->args[1], 0, - sizeof(cb->args) - sizeof(cb->args[0])); - tb = dn_fib_get_table(t, 0); - if (tb == NULL) - continue; - if (tb->dump(tb, skb, cb) < 0) - break; - } - - cb->args[0] = t; - - return skb->len; -} - static void fib_magic(int cmd, int type, __le16 dst, int dst_len, struct dn_ifaddr *ifa) { struct dn_fib_table *tb; @@ -762,22 +729,6 @@ int dn_fib_sync_up(struct net_device *dev) return ret; } -void dn_fib_flush(void) -{ - int flushed = 0; - struct dn_fib_table *tb; - u32 id; - - for(id = RT_TABLE_MAX; id > 0; id--) { - if ((tb = dn_fib_get_table(id, 0)) == NULL) - continue; - flushed += tb->flush(tb); - } - - if (flushed) - dn_rt_cache_flush(-1); -} - static struct notifier_block dn_fib_dnaddr_notifier = { .notifier_call = dn_fib_dnaddr_event, }; diff --git a/net/decnet/dn_rules.c b/net/decnet/dn_rules.c index 096f127..878312f 100644 --- a/net/decnet/dn_rules.c +++ b/net/decnet/dn_rules.c @@ -210,7 +210,7 @@ unsigned dnet_addr_type(__le16 addr) struct flowi fl = { .nl_u = { .dn_u = { .daddr = addr } } }; struct dn_fib_res res; unsigned ret = RTN_UNICAST; - struct dn_fib_table *tb = dn_fib_tables[RT_TABLE_LOCAL]; + struct dn_fib_table *tb = dn_fib_get_table(RT_TABLE_LOCAL, 0); res.r = NULL; diff --git a/net/decnet/dn_table.c b/net/decnet/dn_table.c index eca7c1e..10e8726 100644 --- a/net/decnet/dn_table.c +++ b/net/decnet/dn_table.c @@ -75,9 +75,9 @@ for( ; ((f) = *(fp)) != NULL; (fp) = &(f)->fn_next) for( ; ((f) = *(fp)) != NULL && dn_key_eq((f)->fn_key, (key)); (fp) = &(f)->fn_next) #define RT_TABLE_MIN 1 - +#define DN_FIB_TABLE_HASHSZ 256 +static struct hlist_head dn_fib_table_hash[DN_FIB_TABLE_HASHSZ]; static DEFINE_RWLOCK(dn_fib_tables_lock); -struct dn_fib_table *dn_fib_tables[RT_TABLE_MAX + 1]; static kmem_cache_t *dn_hash_kmem __read_mostly; static int dn_fib_hash_zombies; @@ -361,7 +361,7 @@ static __inline__ int dn_hash_dump_bucket(struct sk_buff *skb, { int i, s_i; - s_i = cb->args[3]; + s_i = cb->args[4]; for(i = 0; f; i++, f = f->fn_next) { if (i < s_i) continue; @@ -374,11 +374,11 @@ static __inline__ int dn_hash_dump_bucket(struct sk_buff *skb, (f->fn_state & DN_S_ZOMBIE) ? 0 : f->fn_type, f->fn_scope, &f->fn_key, dz->dz_order, f->fn_info, NLM_F_MULTI) < 0) { - cb->args[3] = i; + cb->args[4] = i; return -1; } } - cb->args[3] = i; + cb->args[4] = i; return skb->len; } @@ -389,20 +389,20 @@ static __inline__ int dn_hash_dump_zone(struct sk_buff *skb, { int h, s_h; - s_h = cb->args[2]; + s_h = cb->args[3]; for(h = 0; h < dz->dz_divisor; h++) { if (h < s_h) continue; if (h > s_h) - memset(&cb->args[3], 0, sizeof(cb->args) - 3*sizeof(cb->args[0])); + memset(&cb->args[4], 0, sizeof(cb->args) - 4*sizeof(cb->args[0])); if (dz->dz_hash == NULL || dz->dz_hash[h] == NULL) continue; if (dn_hash_dump_bucket(skb, cb, tb, dz, dz->dz_hash[h]) < 0) { - cb->args[2] = h; + cb->args[3] = h; return -1; } } - cb->args[2] = h; + cb->args[3] = h; return skb->len; } @@ -413,26 +413,63 @@ static int dn_fib_table_dump(struct dn_fib_table *tb, struct sk_buff *skb, struct dn_zone *dz; struct dn_hash *table = (struct dn_hash *)tb->data; - s_m = cb->args[1]; + s_m = cb->args[2]; read_lock(&dn_fib_tables_lock); for(dz = table->dh_zone_list, m = 0; dz; dz = dz->dz_next, m++) { if (m < s_m) continue; if (m > s_m) - memset(&cb->args[2], 0, sizeof(cb->args) - 2*sizeof(cb->args[0])); + memset(&cb->args[3], 0, sizeof(cb->args) - 3*sizeof(cb->args[0])); if (dn_hash_dump_zone(skb, cb, tb, dz) < 0) { - cb->args[1] = m; + cb->args[2] = m; read_unlock(&dn_fib_tables_lock); return -1; } } read_unlock(&dn_fib_tables_lock); - cb->args[1] = m; + cb->args[2] = m; return skb->len; } +int dn_fib_dump(struct sk_buff *skb, struct netlink_callback *cb) +{ + unsigned int h, s_h; + unsigned int e = 0, s_e; + struct dn_fib_table *tb; + struct hlist_node *node; + int dumped = 0; + + if (NLMSG_PAYLOAD(cb->nlh, 0) >= sizeof(struct rtmsg) && + ((struct rtmsg *)NLMSG_DATA(cb->nlh))->rtm_flags&RTM_F_CLONED) + return dn_cache_dump(skb, cb); + + s_h = cb->args[0]; + s_e = cb->args[1]; + + for (h = s_h; h < DN_FIB_TABLE_HASHSZ; h++, s_h = 0) { + e = 0; + hlist_for_each_entry(tb, node, &dn_fib_table_hash[h], hlist) { + if (e < s_e) + goto next; + if (dumped) + memset(&cb->args[2], 0, sizeof(cb->args) - + 2 * sizeof(cb->args[0])); + if (tb->dump(tb, skb, cb) < 0) + goto out; + dumped = 1; +next: + e++; + } + } +out: + cb->args[1] = e; + cb->args[0] = h; + + return skb->len; +} + static int dn_fib_table_insert(struct dn_fib_table *tb, struct rtmsg *r, struct dn_kern_rta *rta, struct nlmsghdr *n, struct netlink_skb_parms *req) { struct dn_hash *table = (struct dn_hash *)tb->data; @@ -744,6 +781,8 @@ out: struct dn_fib_table *dn_fib_get_table(u32 n, int create) { struct dn_fib_table *t; + struct hlist_node *node; + unsigned int h; if (n < RT_TABLE_MIN) return NULL; @@ -751,8 +790,15 @@ struct dn_fib_table *dn_fib_get_table(u32 n, int create) if (n > RT_TABLE_MAX) return NULL; - if (dn_fib_tables[n]) - return dn_fib_tables[n]; + h = n & (DN_FIB_TABLE_HASHSZ - 1); + rcu_read_lock(); + hlist_for_each_entry_rcu(t, node, &dn_fib_table_hash[h], hlist) { + if (t->n == n) { + rcu_read_unlock(); + return t; + } + } + rcu_read_unlock(); if (!create) return NULL; @@ -773,33 +819,37 @@ struct dn_fib_table *dn_fib_get_table(u32 n, int create) t->flush = dn_fib_table_flush; t->dump = dn_fib_table_dump; memset(t->data, 0, sizeof(struct dn_hash)); - dn_fib_tables[n] = t; + hlist_add_head_rcu(&t->hlist, &dn_fib_table_hash[h]); return t; } -static void dn_fib_del_tree(u32 n) -{ - struct dn_fib_table *t; - - write_lock(&dn_fib_tables_lock); - t = dn_fib_tables[n]; - dn_fib_tables[n] = NULL; - write_unlock(&dn_fib_tables_lock); - - kfree(t); -} - struct dn_fib_table *dn_fib_empty_table(void) { u32 id; for(id = RT_TABLE_MIN; id <= RT_TABLE_MAX; id++) - if (dn_fib_tables[id] == NULL) + if (dn_fib_get_table(id, 0) == NULL) return dn_fib_get_table(id, 1); return NULL; } +void dn_fib_flush(void) +{ + int flushed = 0; + struct dn_fib_table *tb; + struct hlist_node *node; + unsigned int h; + + for (h = 0; h < DN_FIB_TABLE_HASHSZ; h++) { + hlist_for_each_entry(tb, node, &dn_fib_table_hash[h], hlist) + flushed += tb->flush(tb); + } + + if (flushed) + dn_rt_cache_flush(-1); +} + void __init dn_fib_table_init(void) { dn_hash_kmem = kmem_cache_create("dn_fib_info_cache", @@ -810,10 +860,17 @@ void __init dn_fib_table_init(void) void __exit dn_fib_table_cleanup(void) { - int i; - - for (i = RT_TABLE_MIN; i <= RT_TABLE_MAX; ++i) - dn_fib_del_tree(i); + struct dn_fib_table *t; + struct hlist_node *node, *next; + unsigned int h; - return; + write_lock(&dn_fib_tables_lock); + for (h = 0; h < DN_FIB_TABLE_HASHSZ; h++) { + hlist_for_each_entry_safe(t, node, next, &dn_fib_table_hash[h], + hlist) { + hlist_del(&t->hlist); + kfree(t); + } + } + write_unlock(&dn_fib_tables_lock); } -- cgit v0.10.2 From b801f54917b7c6e8540f877ee562cd0725e62ebd Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Thu, 10 Aug 2006 23:12:34 -0700 Subject: [NET]: Increate RT_TABLE_MAX to 2^32 Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h index ea422a5..7e4aa48 100644 --- a/include/linux/rtnetlink.h +++ b/include/linux/rtnetlink.h @@ -239,10 +239,8 @@ enum rt_class_t RT_TABLE_DEFAULT=253, RT_TABLE_MAIN=254, RT_TABLE_LOCAL=255, - __RT_TABLE_MAX + RT_TABLE_MAX=0xFFFFFFFF }; -#define RT_TABLE_MAX (__RT_TABLE_MAX - 1) - /* Routing message attributes */ -- cgit v0.10.2 From 3bf72957d2a553c343e4285463ef0a88139bdec4 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Thu, 10 Aug 2006 23:31:08 -0700 Subject: [HTB]: Remove broken debug code. The HTB network scheduler had debug code that wouldn't compile and confused and obfuscated the code, remove it. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index 880a339..73094e7 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -70,7 +70,6 @@ #define HTB_HSIZE 16 /* classid hash size */ #define HTB_EWMAC 2 /* rate average over HTB_EWMAC*HTB_HSIZE sec */ -#undef HTB_DEBUG /* compile debugging support (activated by tc tool) */ #define HTB_RATECM 1 /* whether to use rate computer */ #define HTB_HYSTERESIS 1/* whether to use mode hysteresis for speedup */ #define HTB_QLOCK(S) spin_lock_bh(&(S)->dev->queue_lock) @@ -81,51 +80,6 @@ #error "Mismatched sch_htb.c and pkt_sch.h" #endif -/* debugging support; S is subsystem, these are defined: - 0 - netlink messages - 1 - enqueue - 2 - drop & requeue - 3 - dequeue main - 4 - dequeue one prio DRR part - 5 - dequeue class accounting - 6 - class overlimit status computation - 7 - hint tree - 8 - event queue - 10 - rate estimator - 11 - classifier - 12 - fast dequeue cache - - L is level; 0 = none, 1 = basic info, 2 = detailed, 3 = full - q->debug uint32 contains 16 2-bit fields one for subsystem starting - from LSB - */ -#ifdef HTB_DEBUG -#define HTB_DBG_COND(S,L) (((q->debug>>(2*S))&3) >= L) -#define HTB_DBG(S,L,FMT,ARG...) if (HTB_DBG_COND(S,L)) \ - printk(KERN_DEBUG FMT,##ARG) -#define HTB_CHCL(cl) BUG_TRAP((cl)->magic == HTB_CMAGIC) -#define HTB_PASSQ q, -#define HTB_ARGQ struct htb_sched *q, -#define static -#undef __inline__ -#define __inline__ -#undef inline -#define inline -#define HTB_CMAGIC 0xFEFAFEF1 -#define htb_safe_rb_erase(N,R) do { BUG_TRAP((N)->rb_color != -1); \ - if ((N)->rb_color == -1) break; \ - rb_erase(N,R); \ - (N)->rb_color = -1; } while (0) -#else -#define HTB_DBG_COND(S,L) (0) -#define HTB_DBG(S,L,FMT,ARG...) -#define HTB_PASSQ -#define HTB_ARGQ -#define HTB_CHCL(cl) -#define htb_safe_rb_erase(N,R) rb_erase(N,R) -#endif - - /* used internaly to keep status of single class */ enum htb_cmode { HTB_CANT_SEND, /* class can't send and can't borrow */ @@ -136,9 +90,6 @@ enum htb_cmode { /* interior & leaf nodes; props specific to leaves are marked L: */ struct htb_class { -#ifdef HTB_DEBUG - unsigned magic; -#endif /* general class parameters */ u32 classid; struct gnet_stats_basic bstats; @@ -238,7 +189,6 @@ struct htb_sched int nwc_hit; /* this to disable mindelay complaint in dequeue */ int defcls; /* class where unclassified flows go to */ - u32 debug; /* subsystem debug levels */ /* filters for qdisc itself */ struct tcf_proto *filter_list; @@ -354,75 +304,21 @@ static struct htb_class *htb_classify(struct sk_buff *skb, struct Qdisc *sch, in return cl; } -#ifdef HTB_DEBUG -static void htb_next_rb_node(struct rb_node **n); -#define HTB_DUMTREE(root,memb) if(root) { \ - struct rb_node *n = (root)->rb_node; \ - while (n->rb_left) n = n->rb_left; \ - while (n) { \ - struct htb_class *cl = rb_entry(n, struct htb_class, memb); \ - printk(" %x",cl->classid); htb_next_rb_node (&n); \ - } } - -static void htb_debug_dump (struct htb_sched *q) -{ - int i,p; - printk(KERN_DEBUG "htb*g j=%lu lj=%lu\n",jiffies,q->jiffies); - /* rows */ - for (i=TC_HTB_MAXDEPTH-1;i>=0;i--) { - printk(KERN_DEBUG "htb*r%d m=%x",i,q->row_mask[i]); - for (p=0;prow[i][p].rb_node) continue; - printk(" p%d:",p); - HTB_DUMTREE(q->row[i]+p,node[p]); - } - printk("\n"); - } - /* classes */ - for (i = 0; i < HTB_HSIZE; i++) { - struct list_head *l; - list_for_each (l,q->hash+i) { - struct htb_class *cl = list_entry(l,struct htb_class,hlist); - long diff = PSCHED_TDIFF_SAFE(q->now, cl->t_c, (u32)cl->mbuffer); - printk(KERN_DEBUG "htb*c%x m=%d t=%ld c=%ld pq=%lu df=%ld ql=%d " - "pa=%x f:", - cl->classid,cl->cmode,cl->tokens,cl->ctokens, - cl->pq_node.rb_color==-1?0:cl->pq_key,diff, - cl->level?0:cl->un.leaf.q->q.qlen,cl->prio_activity); - if (cl->level) - for (p=0;pun.inner.feed[p].rb_node) continue; - printk(" p%d a=%x:",p,cl->un.inner.ptr[p]?rb_entry(cl->un.inner.ptr[p], struct htb_class,node[p])->classid:0); - HTB_DUMTREE(cl->un.inner.feed+p,node[p]); - } - printk("\n"); - } - } -} -#endif /** * htb_add_to_id_tree - adds class to the round robin list * * Routine adds class to the list (actually tree) sorted by classid. * Make sure that class is not already on such list for given prio. */ -static void htb_add_to_id_tree (HTB_ARGQ struct rb_root *root, +static void htb_add_to_id_tree (struct rb_root *root, struct htb_class *cl,int prio) { struct rb_node **p = &root->rb_node, *parent = NULL; - HTB_DBG(7,3,"htb_add_id_tree cl=%X prio=%d\n",cl->classid,prio); -#ifdef HTB_DEBUG - if (cl->node[prio].rb_color != -1) { BUG_TRAP(0); return; } - HTB_CHCL(cl); - if (*p) { - struct htb_class *x = rb_entry(*p,struct htb_class,node[prio]); - HTB_CHCL(x); - } -#endif + while (*p) { struct htb_class *c; parent = *p; c = rb_entry(parent, struct htb_class, node[prio]); - HTB_CHCL(c); + if (cl->classid > c->classid) p = &parent->rb_right; else @@ -440,16 +336,10 @@ static void htb_add_to_id_tree (HTB_ARGQ struct rb_root *root, * already in the queue. */ static void htb_add_to_wait_tree (struct htb_sched *q, - struct htb_class *cl,long delay,int debug_hint) + struct htb_class *cl,long delay) { struct rb_node **p = &q->wait_pq[cl->level].rb_node, *parent = NULL; - HTB_DBG(7,3,"htb_add_wt cl=%X key=%lu\n",cl->classid,cl->pq_key); -#ifdef HTB_DEBUG - if (cl->pq_node.rb_color != -1) { BUG_TRAP(0); return; } - HTB_CHCL(cl); - if ((delay <= 0 || delay > cl->mbuffer) && net_ratelimit()) - printk(KERN_ERR "HTB: suspicious delay in wait_tree d=%ld cl=%X h=%d\n",delay,cl->classid,debug_hint); -#endif + cl->pq_key = q->jiffies + PSCHED_US2JIFFIE(delay); if (cl->pq_key == q->jiffies) cl->pq_key++; @@ -490,14 +380,11 @@ static void htb_next_rb_node(struct rb_node **n) static inline void htb_add_class_to_row(struct htb_sched *q, struct htb_class *cl,int mask) { - HTB_DBG(7,2,"htb_addrow cl=%X mask=%X rmask=%X\n", - cl->classid,mask,q->row_mask[cl->level]); - HTB_CHCL(cl); q->row_mask[cl->level] |= mask; while (mask) { int prio = ffz(~mask); mask &= ~(1 << prio); - htb_add_to_id_tree(HTB_PASSQ q->row[cl->level]+prio,cl,prio); + htb_add_to_id_tree(q->row[cl->level]+prio,cl,prio); } } @@ -511,18 +398,16 @@ static __inline__ void htb_remove_class_from_row(struct htb_sched *q, struct htb_class *cl,int mask) { int m = 0; - HTB_CHCL(cl); + while (mask) { int prio = ffz(~mask); mask &= ~(1 << prio); if (q->ptr[cl->level][prio] == cl->node+prio) htb_next_rb_node(q->ptr[cl->level]+prio); - htb_safe_rb_erase(cl->node + prio,q->row[cl->level]+prio); + rb_erase(cl->node + prio,q->row[cl->level]+prio); if (!q->row[cl->level][prio].rb_node) m |= 1 << prio; } - HTB_DBG(7,2,"htb_delrow cl=%X mask=%X rmask=%X maskdel=%X\n", - cl->classid,mask,q->row_mask[cl->level],m); q->row_mask[cl->level] &= ~m; } @@ -537,11 +422,9 @@ static void htb_activate_prios(struct htb_sched *q,struct htb_class *cl) { struct htb_class *p = cl->parent; long m,mask = cl->prio_activity; - HTB_DBG(7,2,"htb_act_prios cl=%X mask=%lX cmode=%d\n",cl->classid,mask,cl->cmode); - HTB_CHCL(cl); while (cl->cmode == HTB_MAY_BORROW && p && mask) { - HTB_CHCL(p); + m = mask; while (m) { int prio = ffz(~m); m &= ~(1 << prio); @@ -551,13 +434,11 @@ static void htb_activate_prios(struct htb_sched *q,struct htb_class *cl) reset bit in mask as parent is already ok */ mask &= ~(1 << prio); - htb_add_to_id_tree(HTB_PASSQ p->un.inner.feed+prio,cl,prio); + htb_add_to_id_tree(p->un.inner.feed+prio,cl,prio); } - HTB_DBG(7,3,"htb_act_pr_aft p=%X pact=%X mask=%lX pmode=%d\n", - p->classid,p->prio_activity,mask,p->cmode); p->prio_activity |= mask; cl = p; p = cl->parent; - HTB_CHCL(cl); + } if (cl->cmode == HTB_CAN_SEND && mask) htb_add_class_to_row(q,cl,mask); @@ -574,8 +455,7 @@ static void htb_deactivate_prios(struct htb_sched *q, struct htb_class *cl) { struct htb_class *p = cl->parent; long m,mask = cl->prio_activity; - HTB_DBG(7,2,"htb_deact_prios cl=%X mask=%lX cmode=%d\n",cl->classid,mask,cl->cmode); - HTB_CHCL(cl); + while (cl->cmode == HTB_MAY_BORROW && p && mask) { m = mask; mask = 0; @@ -591,16 +471,15 @@ static void htb_deactivate_prios(struct htb_sched *q, struct htb_class *cl) p->un.inner.ptr[prio] = NULL; } - htb_safe_rb_erase(cl->node + prio,p->un.inner.feed + prio); + rb_erase(cl->node + prio,p->un.inner.feed + prio); if (!p->un.inner.feed[prio].rb_node) mask |= 1 << prio; } - HTB_DBG(7,3,"htb_deact_pr_aft p=%X pact=%X mask=%lX pmode=%d\n", - p->classid,p->prio_activity,mask,p->cmode); + p->prio_activity &= ~mask; cl = p; p = cl->parent; - HTB_CHCL(cl); + } if (cl->cmode == HTB_CAN_SEND && mask) htb_remove_class_from_row(q,cl,mask); @@ -655,8 +534,6 @@ htb_change_class_mode(struct htb_sched *q, struct htb_class *cl, long *diff) { enum htb_cmode new_mode = htb_class_mode(cl,diff); - HTB_CHCL(cl); - HTB_DBG(7,1,"htb_chging_clmode %d->%d cl=%X\n",cl->cmode,new_mode,cl->classid); if (new_mode == cl->cmode) return; @@ -681,7 +558,7 @@ htb_change_class_mode(struct htb_sched *q, struct htb_class *cl, long *diff) static __inline__ void htb_activate(struct htb_sched *q,struct htb_class *cl) { BUG_TRAP(!cl->level && cl->un.leaf.q && cl->un.leaf.q->q.qlen); - HTB_CHCL(cl); + if (!cl->prio_activity) { cl->prio_activity = 1 << (cl->un.leaf.aprio = cl->un.leaf.prio); htb_activate_prios(q,cl); @@ -699,7 +576,7 @@ static __inline__ void htb_deactivate(struct htb_sched *q,struct htb_class *cl) { BUG_TRAP(cl->prio_activity); - HTB_CHCL(cl); + htb_deactivate_prios(q,cl); cl->prio_activity = 0; list_del_init(&cl->un.leaf.drop_list); @@ -739,7 +616,6 @@ static int htb_enqueue(struct sk_buff *skb, struct Qdisc *sch) sch->q.qlen++; sch->bstats.packets++; sch->bstats.bytes += skb->len; - HTB_DBG(1,1,"htb_enq_ok cl=%X skb=%p\n",(cl && cl != HTB_DIRECT)?cl->classid:0,skb); return NET_XMIT_SUCCESS; } @@ -771,7 +647,6 @@ static int htb_requeue(struct sk_buff *skb, struct Qdisc *sch) sch->q.qlen++; sch->qstats.requeues++; - HTB_DBG(1,1,"htb_req_ok cl=%X skb=%p\n",(cl && cl != HTB_DIRECT)?cl->classid:0,skb); return NET_XMIT_SUCCESS; } @@ -793,7 +668,6 @@ static void htb_rate_timer(unsigned long arg) /* lock queue so that we can muck with it */ HTB_QLOCK(sch); - HTB_DBG(10,1,"htb_rttmr j=%ld\n",jiffies); q->rttim.expires = jiffies + HZ; add_timer(&q->rttim); @@ -803,8 +677,7 @@ static void htb_rate_timer(unsigned long arg) q->recmp_bucket = 0; list_for_each (p,q->hash+q->recmp_bucket) { struct htb_class *cl = list_entry(p,struct htb_class,hlist); - HTB_DBG(10,2,"htb_rttmr_cl cl=%X sbyte=%lu spkt=%lu\n", - cl->classid,cl->sum_bytes,cl->sum_packets); + RT_GEN (cl->sum_bytes,cl->rate_bytes); RT_GEN (cl->sum_packets,cl->rate_packets); } @@ -828,7 +701,6 @@ static void htb_charge_class(struct htb_sched *q,struct htb_class *cl, { long toks,diff; enum htb_cmode old_mode; - HTB_DBG(5,1,"htb_chrg_cl cl=%X lev=%d len=%d\n",cl->classid,level,bytes); #define HTB_ACCNT(T,B,R) toks = diff + cl->T; \ if (toks > cl->B) toks = cl->B; \ @@ -837,24 +709,7 @@ static void htb_charge_class(struct htb_sched *q,struct htb_class *cl, cl->T = toks while (cl) { - HTB_CHCL(cl); diff = PSCHED_TDIFF_SAFE(q->now, cl->t_c, (u32)cl->mbuffer); -#ifdef HTB_DEBUG - if (diff > cl->mbuffer || diff < 0 || PSCHED_TLESS(q->now, cl->t_c)) { - if (net_ratelimit()) - printk(KERN_ERR "HTB: bad diff in charge, cl=%X diff=%lX now=%Lu then=%Lu j=%lu\n", - cl->classid, diff, -#ifdef CONFIG_NET_SCH_CLK_GETTIMEOFDAY - q->now.tv_sec * 1000000ULL + q->now.tv_usec, - cl->t_c.tv_sec * 1000000ULL + cl->t_c.tv_usec, -#else - (unsigned long long) q->now, - (unsigned long long) cl->t_c, -#endif - q->jiffies); - diff = 1000; - } -#endif if (cl->level >= level) { if (cl->level == level) cl->xstats.lends++; HTB_ACCNT (tokens,buffer,rate); @@ -864,15 +719,14 @@ static void htb_charge_class(struct htb_sched *q,struct htb_class *cl, } HTB_ACCNT (ctokens,cbuffer,ceil); cl->t_c = q->now; - HTB_DBG(5,2,"htb_chrg_clp cl=%X diff=%ld tok=%ld ctok=%ld\n",cl->classid,diff,cl->tokens,cl->ctokens); old_mode = cl->cmode; diff = 0; htb_change_class_mode(q,cl,&diff); if (old_mode != cl->cmode) { if (old_mode != HTB_CAN_SEND) - htb_safe_rb_erase(&cl->pq_node,q->wait_pq+cl->level); + rb_erase(&cl->pq_node,q->wait_pq+cl->level); if (cl->cmode != HTB_CAN_SEND) - htb_add_to_wait_tree (q,cl,diff,1); + htb_add_to_wait_tree (q,cl,diff); } #ifdef HTB_RATECM @@ -899,8 +753,7 @@ static void htb_charge_class(struct htb_sched *q,struct htb_class *cl, static long htb_do_events(struct htb_sched *q,int level) { int i; - HTB_DBG(8,1,"htb_do_events l=%d root=%p rmask=%X\n", - level,q->wait_pq[level].rb_node,q->row_mask[level]); + for (i = 0; i < 500; i++) { struct htb_class *cl; long diff; @@ -910,30 +763,13 @@ static long htb_do_events(struct htb_sched *q,int level) cl = rb_entry(p, struct htb_class, pq_node); if (time_after(cl->pq_key, q->jiffies)) { - HTB_DBG(8,3,"htb_do_ev_ret delay=%ld\n",cl->pq_key - q->jiffies); return cl->pq_key - q->jiffies; } - htb_safe_rb_erase(p,q->wait_pq+level); + rb_erase(p,q->wait_pq+level); diff = PSCHED_TDIFF_SAFE(q->now, cl->t_c, (u32)cl->mbuffer); -#ifdef HTB_DEBUG - if (diff > cl->mbuffer || diff < 0 || PSCHED_TLESS(q->now, cl->t_c)) { - if (net_ratelimit()) - printk(KERN_ERR "HTB: bad diff in events, cl=%X diff=%lX now=%Lu then=%Lu j=%lu\n", - cl->classid, diff, -#ifdef CONFIG_NET_SCH_CLK_GETTIMEOFDAY - q->now.tv_sec * 1000000ULL + q->now.tv_usec, - cl->t_c.tv_sec * 1000000ULL + cl->t_c.tv_usec, -#else - (unsigned long long) q->now, - (unsigned long long) cl->t_c, -#endif - q->jiffies); - diff = 1000; - } -#endif htb_change_class_mode(q,cl,&diff); if (cl->cmode != HTB_CAN_SEND) - htb_add_to_wait_tree (q,cl,diff,2); + htb_add_to_wait_tree (q,cl,diff); } if (net_ratelimit()) printk(KERN_WARNING "htb: too many events !\n"); @@ -966,7 +802,7 @@ htb_id_find_next_upper(int prio,struct rb_node *n,u32 id) * Find leaf where current feed pointers points to. */ static struct htb_class * -htb_lookup_leaf(HTB_ARGQ struct rb_root *tree,int prio,struct rb_node **pptr,u32 *pid) +htb_lookup_leaf(struct rb_root *tree,int prio,struct rb_node **pptr,u32 *pid) { int i; struct { @@ -981,8 +817,6 @@ htb_lookup_leaf(HTB_ARGQ struct rb_root *tree,int prio,struct rb_node **pptr,u32 sp->pid = pid; for (i = 0; i < 65535; i++) { - HTB_DBG(4,2,"htb_lleaf ptr=%p pid=%X\n",*sp->pptr,*sp->pid); - if (!*sp->pptr && *sp->pid) { /* ptr was invalidated but id is valid - try to recover the original or next ptr */ @@ -1002,7 +836,6 @@ htb_lookup_leaf(HTB_ARGQ struct rb_root *tree,int prio,struct rb_node **pptr,u32 } else { struct htb_class *cl; cl = rb_entry(*sp->pptr,struct htb_class,node[prio]); - HTB_CHCL(cl); if (!cl->level) return cl; (++sp)->root = cl->un.inner.feed[prio].rb_node; @@ -1022,15 +855,13 @@ htb_dequeue_tree(struct htb_sched *q,int prio,int level) struct sk_buff *skb = NULL; struct htb_class *cl,*start; /* look initial class up in the row */ - start = cl = htb_lookup_leaf (HTB_PASSQ q->row[level]+prio,prio, + start = cl = htb_lookup_leaf (q->row[level]+prio,prio, q->ptr[level]+prio,q->last_ptr_id[level]+prio); do { next: BUG_TRAP(cl); if (!cl) return NULL; - HTB_DBG(4,1,"htb_deq_tr prio=%d lev=%d cl=%X defic=%d\n", - prio,level,cl->classid,cl->un.leaf.deficit[level]); /* class can be empty - it is unlikely but can be true if leaf qdisc drops packets in enqueue routine or if someone used @@ -1044,7 +875,7 @@ next: if ((q->row_mask[level] & (1 << prio)) == 0) return NULL; - next = htb_lookup_leaf (HTB_PASSQ q->row[level]+prio, + next = htb_lookup_leaf (q->row[level]+prio, prio,q->ptr[level]+prio,q->last_ptr_id[level]+prio); if (cl == start) /* fix start if we just deleted it */ @@ -1061,15 +892,13 @@ next: } q->nwc_hit++; htb_next_rb_node((level?cl->parent->un.inner.ptr:q->ptr[0])+prio); - cl = htb_lookup_leaf (HTB_PASSQ q->row[level]+prio,prio,q->ptr[level]+prio, + cl = htb_lookup_leaf (q->row[level]+prio,prio,q->ptr[level]+prio, q->last_ptr_id[level]+prio); } while (cl != start); if (likely(skb != NULL)) { if ((cl->un.leaf.deficit[level] -= skb->len) < 0) { - HTB_DBG(4,2,"htb_next_cl oldptr=%p quant_add=%d\n", - level?cl->parent->un.inner.ptr[prio]:q->ptr[0][prio],cl->un.leaf.quantum); cl->un.leaf.deficit[level] += cl->un.leaf.quantum; htb_next_rb_node((level?cl->parent->un.inner.ptr:q->ptr[0])+prio); } @@ -1095,7 +924,6 @@ static void htb_delay_by(struct Qdisc *sch,long delay) mod_timer(&q->timer, q->jiffies + delay); sch->flags |= TCQ_F_THROTTLED; sch->qstats.overlimits++; - HTB_DBG(3,1,"htb_deq t_delay=%ld\n",delay); } static struct sk_buff *htb_dequeue(struct Qdisc *sch) @@ -1104,13 +932,8 @@ static struct sk_buff *htb_dequeue(struct Qdisc *sch) struct htb_sched *q = qdisc_priv(sch); int level; long min_delay; -#ifdef HTB_DEBUG - int evs_used = 0; -#endif q->jiffies = jiffies; - HTB_DBG(3,1,"htb_deq dircnt=%d qlen=%d\n",skb_queue_len(&q->direct_queue), - sch->q.qlen); /* try to dequeue direct packets as high prio (!) to minimize cpu work */ if ((skb = __skb_dequeue(&q->direct_queue)) != NULL) { @@ -1131,9 +954,6 @@ static struct sk_buff *htb_dequeue(struct Qdisc *sch) if (time_after_eq(q->jiffies, q->near_ev_cache[level])) { delay = htb_do_events(q,level); q->near_ev_cache[level] = q->jiffies + (delay ? delay : HZ); -#ifdef HTB_DEBUG - evs_used++; -#endif } else delay = q->near_ev_cache[level] - q->jiffies; @@ -1151,20 +971,8 @@ static struct sk_buff *htb_dequeue(struct Qdisc *sch) } } } -#ifdef HTB_DEBUG - if (!q->nwc_hit && min_delay >= 10*HZ && net_ratelimit()) { - if (min_delay == LONG_MAX) { - printk(KERN_ERR "HTB: dequeue bug (%d,%lu,%lu), report it please !\n", - evs_used,q->jiffies,jiffies); - htb_debug_dump(q); - } else - printk(KERN_WARNING "HTB: mindelay=%ld, some class has " - "too small rate\n",min_delay); - } -#endif htb_delay_by (sch,min_delay > 5*HZ ? 5*HZ : min_delay); fin: - HTB_DBG(3,1,"htb_deq_end %s j=%lu skb=%p\n",sch->dev->name,q->jiffies,skb); return skb; } @@ -1198,7 +1006,6 @@ static void htb_reset(struct Qdisc* sch) { struct htb_sched *q = qdisc_priv(sch); int i; - HTB_DBG(0,1,"htb_reset sch=%p, handle=%X\n",sch,sch->handle); for (i = 0; i < HTB_HSIZE; i++) { struct list_head *p; @@ -1213,10 +1020,6 @@ static void htb_reset(struct Qdisc* sch) } cl->prio_activity = 0; cl->cmode = HTB_CAN_SEND; -#ifdef HTB_DEBUG - cl->pq_node.rb_color = -1; - memset(cl->node,255,sizeof(cl->node)); -#endif } } @@ -1238,10 +1041,6 @@ static int htb_init(struct Qdisc *sch, struct rtattr *opt) struct rtattr *tb[TCA_HTB_INIT]; struct tc_htb_glob *gopt; int i; -#ifdef HTB_DEBUG - printk(KERN_INFO "HTB init, kernel part version %d.%d\n", - HTB_VER >> 16,HTB_VER & 0xffff); -#endif if (!opt || rtattr_parse_nested(tb, TCA_HTB_INIT, opt) || tb[TCA_HTB_INIT-1] == NULL || RTA_PAYLOAD(tb[TCA_HTB_INIT-1]) < sizeof(*gopt)) { @@ -1254,8 +1053,6 @@ static int htb_init(struct Qdisc *sch, struct rtattr *opt) HTB_VER >> 16,HTB_VER & 0xffff,gopt->version); return -EINVAL; } - q->debug = gopt->debug; - HTB_DBG(0,1,"htb_init sch=%p handle=%X r2q=%d\n",sch,sch->handle,gopt->rate2quantum); INIT_LIST_HEAD(&q->root); for (i = 0; i < HTB_HSIZE; i++) @@ -1292,18 +1089,13 @@ static int htb_dump(struct Qdisc *sch, struct sk_buff *skb) unsigned char *b = skb->tail; struct rtattr *rta; struct tc_htb_glob gopt; - HTB_DBG(0,1,"htb_dump sch=%p, handle=%X\n",sch,sch->handle); HTB_QLOCK(sch); gopt.direct_pkts = q->direct_pkts; -#ifdef HTB_DEBUG - if (HTB_DBG_COND(0,2)) - htb_debug_dump(q); -#endif gopt.version = HTB_VER; gopt.rate2quantum = q->rate2quantum; gopt.defcls = q->defcls; - gopt.debug = q->debug; + gopt.debug = 0; rta = (struct rtattr*)b; RTA_PUT(skb, TCA_OPTIONS, 0, NULL); RTA_PUT(skb, TCA_HTB_INIT, sizeof(gopt), &gopt); @@ -1319,16 +1111,11 @@ rtattr_failure: static int htb_dump_class(struct Qdisc *sch, unsigned long arg, struct sk_buff *skb, struct tcmsg *tcm) { -#ifdef HTB_DEBUG - struct htb_sched *q = qdisc_priv(sch); -#endif struct htb_class *cl = (struct htb_class*)arg; unsigned char *b = skb->tail; struct rtattr *rta; struct tc_htb_opt opt; - HTB_DBG(0,1,"htb_dump_class handle=%X clid=%X\n",sch->handle,cl->classid); - HTB_QLOCK(sch); tcm->tcm_parent = cl->parent ? cl->parent->classid : TC_H_ROOT; tcm->tcm_handle = cl->classid; @@ -1410,11 +1197,7 @@ static struct Qdisc * htb_leaf(struct Qdisc *sch, unsigned long arg) static unsigned long htb_get(struct Qdisc *sch, u32 classid) { -#ifdef HTB_DEBUG - struct htb_sched *q = qdisc_priv(sch); -#endif struct htb_class *cl = htb_find(classid,sch); - HTB_DBG(0,1,"htb_get clid=%X q=%p cl=%p ref=%d\n",classid,q,cl,cl?cl->refcnt:0); if (cl) cl->refcnt++; return (unsigned long)cl; @@ -1433,7 +1216,6 @@ static void htb_destroy_filters(struct tcf_proto **fl) static void htb_destroy_class(struct Qdisc* sch,struct htb_class *cl) { struct htb_sched *q = qdisc_priv(sch); - HTB_DBG(0,1,"htb_destrycls clid=%X ref=%d\n", cl?cl->classid:0,cl?cl->refcnt:0); if (!cl->level) { BUG_TRAP(cl->un.leaf.q); sch->q.qlen -= cl->un.leaf.q->q.qlen; @@ -1456,7 +1238,7 @@ static void htb_destroy_class(struct Qdisc* sch,struct htb_class *cl) htb_deactivate (q,cl); if (cl->cmode != HTB_CAN_SEND) - htb_safe_rb_erase(&cl->pq_node,q->wait_pq+cl->level); + rb_erase(&cl->pq_node,q->wait_pq+cl->level); kfree(cl); } @@ -1465,7 +1247,6 @@ static void htb_destroy_class(struct Qdisc* sch,struct htb_class *cl) static void htb_destroy(struct Qdisc* sch) { struct htb_sched *q = qdisc_priv(sch); - HTB_DBG(0,1,"htb_destroy q=%p\n",q); del_timer_sync (&q->timer); #ifdef HTB_RATECM @@ -1488,7 +1269,6 @@ static int htb_delete(struct Qdisc *sch, unsigned long arg) { struct htb_sched *q = qdisc_priv(sch); struct htb_class *cl = (struct htb_class*)arg; - HTB_DBG(0,1,"htb_delete q=%p cl=%X ref=%d\n",q,cl?cl->classid:0,cl?cl->refcnt:0); // TODO: why don't allow to delete subtree ? references ? does // tc subsys quarantee us that in htb_destroy it holds no class @@ -1512,11 +1292,7 @@ static int htb_delete(struct Qdisc *sch, unsigned long arg) static void htb_put(struct Qdisc *sch, unsigned long arg) { -#ifdef HTB_DEBUG - struct htb_sched *q = qdisc_priv(sch); -#endif struct htb_class *cl = (struct htb_class*)arg; - HTB_DBG(0,1,"htb_put q=%p cl=%X ref=%d\n",q,cl?cl->classid:0,cl?cl->refcnt:0); if (--cl->refcnt == 0) htb_destroy_class(sch,cl); @@ -1542,7 +1318,7 @@ static int htb_change_class(struct Qdisc *sch, u32 classid, parent = parentid == TC_H_ROOT ? NULL : htb_find (parentid,sch); hopt = RTA_DATA(tb[TCA_HTB_PARMS-1]); - HTB_DBG(0,1,"htb_chg cl=%p(%X), clid=%X, parid=%X, opt/prio=%d, rate=%u, buff=%d, quant=%d\n", cl,cl?cl->classid:0,classid,parentid,(int)hopt->prio,hopt->rate.rate,hopt->buffer,hopt->quantum); + rtab = qdisc_get_rtab(&hopt->rate, tb[TCA_HTB_RTAB-1]); ctab = qdisc_get_rtab(&hopt->ceil, tb[TCA_HTB_CTAB-1]); if (!rtab || !ctab) goto failure; @@ -1567,9 +1343,6 @@ static int htb_change_class(struct Qdisc *sch, u32 classid, INIT_LIST_HEAD(&cl->hlist); INIT_LIST_HEAD(&cl->children); INIT_LIST_HEAD(&cl->un.leaf.drop_list); -#ifdef HTB_DEBUG - cl->magic = HTB_CMAGIC; -#endif /* create leaf qdisc early because it uses kmalloc(GFP_KERNEL) so that can't be used inside of sch_tree_lock @@ -1585,7 +1358,7 @@ static int htb_change_class(struct Qdisc *sch, u32 classid, /* remove from evt list because of level change */ if (parent->cmode != HTB_CAN_SEND) { - htb_safe_rb_erase(&parent->pq_node,q->wait_pq /*+0*/); + rb_erase(&parent->pq_node,q->wait_pq); parent->cmode = HTB_CAN_SEND; } parent->level = (parent->parent ? parent->parent->level @@ -1607,13 +1380,6 @@ static int htb_change_class(struct Qdisc *sch, u32 classid, /* attach to the hash list and parent's family */ list_add_tail(&cl->hlist, q->hash+htb_hash(classid)); list_add_tail(&cl->sibling, parent ? &parent->children : &q->root); -#ifdef HTB_DEBUG - { - int i; - for (i = 0; i < TC_HTB_NUMPRIO; i++) cl->node[i].rb_color = -1; - cl->pq_node.rb_color = -1; - } -#endif } else sch_tree_lock(sch); /* it used to be a nasty bug here, we have to check that node @@ -1654,7 +1420,7 @@ static struct tcf_proto **htb_find_tcf(struct Qdisc *sch, unsigned long arg) struct htb_sched *q = qdisc_priv(sch); struct htb_class *cl = (struct htb_class *)arg; struct tcf_proto **fl = cl ? &cl->filter_list : &q->filter_list; - HTB_DBG(0,2,"htb_tcf q=%p clid=%X fref=%d fl=%p\n",q,cl?cl->classid:0,cl?cl->filter_cnt:q->filter_cnt,*fl); + return fl; } @@ -1663,7 +1429,7 @@ static unsigned long htb_bind_filter(struct Qdisc *sch, unsigned long parent, { struct htb_sched *q = qdisc_priv(sch); struct htb_class *cl = htb_find (classid,sch); - HTB_DBG(0,2,"htb_bind q=%p clid=%X cl=%p fref=%d\n",q,classid,cl,cl?cl->filter_cnt:q->filter_cnt); + /*if (cl && !cl->level) return 0; The line above used to be there to prevent attaching filters to leaves. But at least tc_index filter uses this just to get class @@ -1684,7 +1450,7 @@ static void htb_unbind_filter(struct Qdisc *sch, unsigned long arg) { struct htb_sched *q = qdisc_priv(sch); struct htb_class *cl = (struct htb_class *)arg; - HTB_DBG(0,2,"htb_unbind q=%p cl=%p fref=%d\n",q,cl,cl?cl->filter_cnt:q->filter_cnt); + if (cl) cl->filter_cnt--; else -- cgit v0.10.2 From 9ac961ee05bfc837e5271be34ad7158e90dce7d9 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Thu, 10 Aug 2006 23:33:16 -0700 Subject: [HTB]: Remove lock macro. Get rid of the macro's being used to obscure the locking. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index 73094e7..c0b80b7 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -72,8 +72,6 @@ #define HTB_EWMAC 2 /* rate average over HTB_EWMAC*HTB_HSIZE sec */ #define HTB_RATECM 1 /* whether to use rate computer */ #define HTB_HYSTERESIS 1/* whether to use mode hysteresis for speedup */ -#define HTB_QLOCK(S) spin_lock_bh(&(S)->dev->queue_lock) -#define HTB_QUNLOCK(S) spin_unlock_bh(&(S)->dev->queue_lock) #define HTB_VER 0x30011 /* major must be matched with number suplied by TC as version */ #if HTB_VER >> 16 != TC_HTB_PROTOVER @@ -667,7 +665,7 @@ static void htb_rate_timer(unsigned long arg) struct list_head *p; /* lock queue so that we can muck with it */ - HTB_QLOCK(sch); + spin_lock_bh(&sch->dev->queue_lock); q->rttim.expires = jiffies + HZ; add_timer(&q->rttim); @@ -681,7 +679,7 @@ static void htb_rate_timer(unsigned long arg) RT_GEN (cl->sum_bytes,cl->rate_bytes); RT_GEN (cl->sum_packets,cl->rate_packets); } - HTB_QUNLOCK(sch); + spin_unlock_bh(&sch->dev->queue_lock); } #endif @@ -1089,7 +1087,7 @@ static int htb_dump(struct Qdisc *sch, struct sk_buff *skb) unsigned char *b = skb->tail; struct rtattr *rta; struct tc_htb_glob gopt; - HTB_QLOCK(sch); + spin_lock_bh(&sch->dev->queue_lock); gopt.direct_pkts = q->direct_pkts; gopt.version = HTB_VER; @@ -1100,10 +1098,10 @@ static int htb_dump(struct Qdisc *sch, struct sk_buff *skb) RTA_PUT(skb, TCA_OPTIONS, 0, NULL); RTA_PUT(skb, TCA_HTB_INIT, sizeof(gopt), &gopt); rta->rta_len = skb->tail - b; - HTB_QUNLOCK(sch); + spin_unlock_bh(&sch->dev->queue_lock); return skb->len; rtattr_failure: - HTB_QUNLOCK(sch); + spin_unlock_bh(&sch->dev->queue_lock); skb_trim(skb, skb->tail - skb->data); return -1; } @@ -1116,7 +1114,7 @@ static int htb_dump_class(struct Qdisc *sch, unsigned long arg, struct rtattr *rta; struct tc_htb_opt opt; - HTB_QLOCK(sch); + spin_lock_bh(&sch->dev->queue_lock); tcm->tcm_parent = cl->parent ? cl->parent->classid : TC_H_ROOT; tcm->tcm_handle = cl->classid; if (!cl->level && cl->un.leaf.q) @@ -1133,10 +1131,10 @@ static int htb_dump_class(struct Qdisc *sch, unsigned long arg, opt.level = cl->level; RTA_PUT(skb, TCA_HTB_PARMS, sizeof(opt), &opt); rta->rta_len = skb->tail - b; - HTB_QUNLOCK(sch); + spin_unlock_bh(&sch->dev->queue_lock); return skb->len; rtattr_failure: - HTB_QUNLOCK(sch); + spin_unlock_bh(&sch->dev->queue_lock); skb_trim(skb, b - skb->data); return -1; } -- cgit v0.10.2 From 18a63e868b04cf949643cc9d2c8a51d8cb5da9c4 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Thu, 10 Aug 2006 23:34:02 -0700 Subject: [HTB]: HTB_HYSTERESIS cleanup Change the conditional compilation around HTB_HYSTERSIS since code was splitting mid expression. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index c0b80b7..d8c1a6b 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -483,6 +483,20 @@ static void htb_deactivate_prios(struct htb_sched *q, struct htb_class *cl) htb_remove_class_from_row(q,cl,mask); } +#if HTB_HYSTERESIS +static inline long htb_lowater(const struct htb_class *cl) +{ + return cl->cmode != HTB_CANT_SEND ? -cl->cbuffer : 0; +} +static inline long htb_hiwater(const struct htb_class *cl) +{ + return cl->cmode == HTB_CAN_SEND ? -cl->buffer : 0; +} +#else +#define htb_lowater(cl) (0) +#define htb_hiwater(cl) (0) +#endif + /** * htb_class_mode - computes and returns current class mode * @@ -499,19 +513,12 @@ htb_class_mode(struct htb_class *cl,long *diff) { long toks; - if ((toks = (cl->ctokens + *diff)) < ( -#if HTB_HYSTERESIS - cl->cmode != HTB_CANT_SEND ? -cl->cbuffer : -#endif - 0)) { + if ((toks = (cl->ctokens + *diff)) < htb_lowater(cl)) { *diff = -toks; return HTB_CANT_SEND; } - if ((toks = (cl->tokens + *diff)) >= ( -#if HTB_HYSTERESIS - cl->cmode == HTB_CAN_SEND ? -cl->buffer : -#endif - 0)) + + if ((toks = (cl->tokens + *diff)) >= htb_hiwater(cl)) return HTB_CAN_SEND; *diff = -toks; -- cgit v0.10.2 From 87990467d387f922103db31678034785d8f21cb7 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Thu, 10 Aug 2006 23:35:16 -0700 Subject: [HTB]: Lindent Code was a mess in terms of indentation. Run through Lindent script, and cleanup the damage. Also, don't use, vim magic comment, and substitute inline for __inline__. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index d8c1a6b..6c6cac6 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -1,4 +1,4 @@ -/* vim: ts=8 sw=8 +/* * net/sched/sch_htb.c Hierarchical token bucket, feed tree version * * This program is free software; you can redistribute it and/or @@ -68,11 +68,11 @@ one less than their parent. */ -#define HTB_HSIZE 16 /* classid hash size */ -#define HTB_EWMAC 2 /* rate average over HTB_EWMAC*HTB_HSIZE sec */ -#define HTB_RATECM 1 /* whether to use rate computer */ -#define HTB_HYSTERESIS 1/* whether to use mode hysteresis for speedup */ -#define HTB_VER 0x30011 /* major must be matched with number suplied by TC as version */ +#define HTB_HSIZE 16 /* classid hash size */ +#define HTB_EWMAC 2 /* rate average over HTB_EWMAC*HTB_HSIZE sec */ +#define HTB_RATECM 1 /* whether to use rate computer */ +#define HTB_HYSTERESIS 1 /* whether to use mode hysteresis for speedup */ +#define HTB_VER 0x30011 /* major must be matched with number suplied by TC as version */ #if HTB_VER >> 16 != TC_HTB_PROTOVER #error "Mismatched sch_htb.c and pkt_sch.h" @@ -80,154 +80,152 @@ /* used internaly to keep status of single class */ enum htb_cmode { - HTB_CANT_SEND, /* class can't send and can't borrow */ - HTB_MAY_BORROW, /* class can't send but may borrow */ - HTB_CAN_SEND /* class can send */ + HTB_CANT_SEND, /* class can't send and can't borrow */ + HTB_MAY_BORROW, /* class can't send but may borrow */ + HTB_CAN_SEND /* class can send */ }; /* interior & leaf nodes; props specific to leaves are marked L: */ -struct htb_class -{ - /* general class parameters */ - u32 classid; - struct gnet_stats_basic bstats; - struct gnet_stats_queue qstats; - struct gnet_stats_rate_est rate_est; - struct tc_htb_xstats xstats;/* our special stats */ - int refcnt; /* usage count of this class */ +struct htb_class { + /* general class parameters */ + u32 classid; + struct gnet_stats_basic bstats; + struct gnet_stats_queue qstats; + struct gnet_stats_rate_est rate_est; + struct tc_htb_xstats xstats; /* our special stats */ + int refcnt; /* usage count of this class */ #ifdef HTB_RATECM - /* rate measurement counters */ - unsigned long rate_bytes,sum_bytes; - unsigned long rate_packets,sum_packets; + /* rate measurement counters */ + unsigned long rate_bytes, sum_bytes; + unsigned long rate_packets, sum_packets; #endif - /* topology */ - int level; /* our level (see above) */ - struct htb_class *parent; /* parent class */ - struct list_head hlist; /* classid hash list item */ - struct list_head sibling; /* sibling list item */ - struct list_head children; /* children list */ - - union { - struct htb_class_leaf { - struct Qdisc *q; - int prio; - int aprio; - int quantum; - int deficit[TC_HTB_MAXDEPTH]; - struct list_head drop_list; - } leaf; - struct htb_class_inner { - struct rb_root feed[TC_HTB_NUMPRIO]; /* feed trees */ - struct rb_node *ptr[TC_HTB_NUMPRIO]; /* current class ptr */ - /* When class changes from state 1->2 and disconnects from - parent's feed then we lost ptr value and start from the - first child again. Here we store classid of the - last valid ptr (used when ptr is NULL). */ - u32 last_ptr_id[TC_HTB_NUMPRIO]; - } inner; - } un; - struct rb_node node[TC_HTB_NUMPRIO]; /* node for self or feed tree */ - struct rb_node pq_node; /* node for event queue */ - unsigned long pq_key; /* the same type as jiffies global */ - - int prio_activity; /* for which prios are we active */ - enum htb_cmode cmode; /* current mode of the class */ - - /* class attached filters */ - struct tcf_proto *filter_list; - int filter_cnt; - - int warned; /* only one warning about non work conserving .. */ - - /* token bucket parameters */ - struct qdisc_rate_table *rate; /* rate table of the class itself */ - struct qdisc_rate_table *ceil; /* ceiling rate (limits borrows too) */ - long buffer,cbuffer; /* token bucket depth/rate */ - psched_tdiff_t mbuffer; /* max wait time */ - long tokens,ctokens; /* current number of tokens */ - psched_time_t t_c; /* checkpoint time */ + /* topology */ + int level; /* our level (see above) */ + struct htb_class *parent; /* parent class */ + struct list_head hlist; /* classid hash list item */ + struct list_head sibling; /* sibling list item */ + struct list_head children; /* children list */ + + union { + struct htb_class_leaf { + struct Qdisc *q; + int prio; + int aprio; + int quantum; + int deficit[TC_HTB_MAXDEPTH]; + struct list_head drop_list; + } leaf; + struct htb_class_inner { + struct rb_root feed[TC_HTB_NUMPRIO]; /* feed trees */ + struct rb_node *ptr[TC_HTB_NUMPRIO]; /* current class ptr */ + /* When class changes from state 1->2 and disconnects from + parent's feed then we lost ptr value and start from the + first child again. Here we store classid of the + last valid ptr (used when ptr is NULL). */ + u32 last_ptr_id[TC_HTB_NUMPRIO]; + } inner; + } un; + struct rb_node node[TC_HTB_NUMPRIO]; /* node for self or feed tree */ + struct rb_node pq_node; /* node for event queue */ + unsigned long pq_key; /* the same type as jiffies global */ + + int prio_activity; /* for which prios are we active */ + enum htb_cmode cmode; /* current mode of the class */ + + /* class attached filters */ + struct tcf_proto *filter_list; + int filter_cnt; + + int warned; /* only one warning about non work conserving .. */ + + /* token bucket parameters */ + struct qdisc_rate_table *rate; /* rate table of the class itself */ + struct qdisc_rate_table *ceil; /* ceiling rate (limits borrows too) */ + long buffer, cbuffer; /* token bucket depth/rate */ + psched_tdiff_t mbuffer; /* max wait time */ + long tokens, ctokens; /* current number of tokens */ + psched_time_t t_c; /* checkpoint time */ }; /* TODO: maybe compute rate when size is too large .. or drop ? */ -static __inline__ long L2T(struct htb_class *cl,struct qdisc_rate_table *rate, - int size) -{ - int slot = size >> rate->rate.cell_log; - if (slot > 255) { - cl->xstats.giants++; - slot = 255; - } - return rate->data[slot]; +static inline long L2T(struct htb_class *cl, struct qdisc_rate_table *rate, + int size) +{ + int slot = size >> rate->rate.cell_log; + if (slot > 255) { + cl->xstats.giants++; + slot = 255; + } + return rate->data[slot]; } -struct htb_sched -{ - struct list_head root; /* root classes list */ - struct list_head hash[HTB_HSIZE]; /* hashed by classid */ - struct list_head drops[TC_HTB_NUMPRIO]; /* active leaves (for drops) */ - - /* self list - roots of self generating tree */ - struct rb_root row[TC_HTB_MAXDEPTH][TC_HTB_NUMPRIO]; - int row_mask[TC_HTB_MAXDEPTH]; - struct rb_node *ptr[TC_HTB_MAXDEPTH][TC_HTB_NUMPRIO]; - u32 last_ptr_id[TC_HTB_MAXDEPTH][TC_HTB_NUMPRIO]; +struct htb_sched { + struct list_head root; /* root classes list */ + struct list_head hash[HTB_HSIZE]; /* hashed by classid */ + struct list_head drops[TC_HTB_NUMPRIO]; /* active leaves (for drops) */ + + /* self list - roots of self generating tree */ + struct rb_root row[TC_HTB_MAXDEPTH][TC_HTB_NUMPRIO]; + int row_mask[TC_HTB_MAXDEPTH]; + struct rb_node *ptr[TC_HTB_MAXDEPTH][TC_HTB_NUMPRIO]; + u32 last_ptr_id[TC_HTB_MAXDEPTH][TC_HTB_NUMPRIO]; - /* self wait list - roots of wait PQs per row */ - struct rb_root wait_pq[TC_HTB_MAXDEPTH]; + /* self wait list - roots of wait PQs per row */ + struct rb_root wait_pq[TC_HTB_MAXDEPTH]; - /* time of nearest event per level (row) */ - unsigned long near_ev_cache[TC_HTB_MAXDEPTH]; + /* time of nearest event per level (row) */ + unsigned long near_ev_cache[TC_HTB_MAXDEPTH]; - /* cached value of jiffies in dequeue */ - unsigned long jiffies; + /* cached value of jiffies in dequeue */ + unsigned long jiffies; - /* whether we hit non-work conserving class during this dequeue; we use */ - int nwc_hit; /* this to disable mindelay complaint in dequeue */ + /* whether we hit non-work conserving class during this dequeue; we use */ + int nwc_hit; /* this to disable mindelay complaint in dequeue */ - int defcls; /* class where unclassified flows go to */ + int defcls; /* class where unclassified flows go to */ - /* filters for qdisc itself */ - struct tcf_proto *filter_list; - int filter_cnt; + /* filters for qdisc itself */ + struct tcf_proto *filter_list; + int filter_cnt; - int rate2quantum; /* quant = rate / rate2quantum */ - psched_time_t now; /* cached dequeue time */ - struct timer_list timer; /* send delay timer */ + int rate2quantum; /* quant = rate / rate2quantum */ + psched_time_t now; /* cached dequeue time */ + struct timer_list timer; /* send delay timer */ #ifdef HTB_RATECM - struct timer_list rttim; /* rate computer timer */ - int recmp_bucket; /* which hash bucket to recompute next */ + struct timer_list rttim; /* rate computer timer */ + int recmp_bucket; /* which hash bucket to recompute next */ #endif - - /* non shaped skbs; let them go directly thru */ - struct sk_buff_head direct_queue; - int direct_qlen; /* max qlen of above */ - long direct_pkts; + /* non shaped skbs; let them go directly thru */ + struct sk_buff_head direct_queue; + int direct_qlen; /* max qlen of above */ + + long direct_pkts; }; /* compute hash of size HTB_HSIZE for given handle */ -static __inline__ int htb_hash(u32 h) +static inline int htb_hash(u32 h) { #if HTB_HSIZE != 16 - #error "Declare new hash for your HTB_HSIZE" +#error "Declare new hash for your HTB_HSIZE" #endif - h ^= h>>8; /* stolen from cbq_hash */ - h ^= h>>4; - return h & 0xf; + h ^= h >> 8; /* stolen from cbq_hash */ + h ^= h >> 4; + return h & 0xf; } /* find class in global hash table using given handle */ -static __inline__ struct htb_class *htb_find(u32 handle, struct Qdisc *sch) +static inline struct htb_class *htb_find(u32 handle, struct Qdisc *sch) { struct htb_sched *q = qdisc_priv(sch); struct list_head *p; - if (TC_H_MAJ(handle) != sch->handle) + if (TC_H_MAJ(handle) != sch->handle) return NULL; - - list_for_each (p,q->hash+htb_hash(handle)) { - struct htb_class *cl = list_entry(p,struct htb_class,hlist); + + list_for_each(p, q->hash + htb_hash(handle)) { + struct htb_class *cl = list_entry(p, struct htb_class, hlist); if (cl->classid == handle) return cl; } @@ -252,7 +250,8 @@ static inline u32 htb_classid(struct htb_class *cl) return (cl && cl != HTB_DIRECT) ? cl->classid : TC_H_UNSPEC; } -static struct htb_class *htb_classify(struct sk_buff *skb, struct Qdisc *sch, int *qerr) +static struct htb_class *htb_classify(struct sk_buff *skb, struct Qdisc *sch, + int *qerr) { struct htb_sched *q = qdisc_priv(sch); struct htb_class *cl; @@ -264,8 +263,8 @@ static struct htb_class *htb_classify(struct sk_buff *skb, struct Qdisc *sch, in note that nfmark can be used too by attaching filter fw with no rules in it */ if (skb->priority == sch->handle) - return HTB_DIRECT; /* X:0 (direct flow) selected */ - if ((cl = htb_find(skb->priority,sch)) != NULL && cl->level == 0) + return HTB_DIRECT; /* X:0 (direct flow) selected */ + if ((cl = htb_find(skb->priority, sch)) != NULL && cl->level == 0) return cl; *qerr = NET_XMIT_BYPASS; @@ -274,7 +273,7 @@ static struct htb_class *htb_classify(struct sk_buff *skb, struct Qdisc *sch, in #ifdef CONFIG_NET_CLS_ACT switch (result) { case TC_ACT_QUEUED: - case TC_ACT_STOLEN: + case TC_ACT_STOLEN: *qerr = NET_XMIT_SUCCESS; case TC_ACT_SHOT: return NULL; @@ -283,22 +282,22 @@ static struct htb_class *htb_classify(struct sk_buff *skb, struct Qdisc *sch, in if (result == TC_POLICE_SHOT) return HTB_DIRECT; #endif - if ((cl = (void*)res.class) == NULL) { + if ((cl = (void *)res.class) == NULL) { if (res.classid == sch->handle) - return HTB_DIRECT; /* X:0 (direct flow) */ - if ((cl = htb_find(res.classid,sch)) == NULL) - break; /* filter selected invalid classid */ + return HTB_DIRECT; /* X:0 (direct flow) */ + if ((cl = htb_find(res.classid, sch)) == NULL) + break; /* filter selected invalid classid */ } if (!cl->level) - return cl; /* we hit leaf; return it */ + return cl; /* we hit leaf; return it */ /* we have got inner class; apply inner filter chain */ tcf = cl->filter_list; } /* classification failed; try to use default class */ - cl = htb_find(TC_H_MAKE(TC_H_MAJ(sch->handle),q->defcls),sch); + cl = htb_find(TC_H_MAKE(TC_H_MAJ(sch->handle), q->defcls), sch); if (!cl || cl->level) - return HTB_DIRECT; /* bad default .. this is safe bet */ + return HTB_DIRECT; /* bad default .. this is safe bet */ return cl; } @@ -308,18 +307,19 @@ static struct htb_class *htb_classify(struct sk_buff *skb, struct Qdisc *sch, in * Routine adds class to the list (actually tree) sorted by classid. * Make sure that class is not already on such list for given prio. */ -static void htb_add_to_id_tree (struct rb_root *root, - struct htb_class *cl,int prio) +static void htb_add_to_id_tree(struct rb_root *root, + struct htb_class *cl, int prio) { struct rb_node **p = &root->rb_node, *parent = NULL; while (*p) { - struct htb_class *c; parent = *p; + struct htb_class *c; + parent = *p; c = rb_entry(parent, struct htb_class, node[prio]); if (cl->classid > c->classid) p = &parent->rb_right; - else + else p = &parent->rb_left; } rb_link_node(&cl->node[prio], parent, p); @@ -333,8 +333,8 @@ static void htb_add_to_id_tree (struct rb_root *root, * change its mode in cl->pq_key microseconds. Make sure that class is not * already in the queue. */ -static void htb_add_to_wait_tree (struct htb_sched *q, - struct htb_class *cl,long delay) +static void htb_add_to_wait_tree(struct htb_sched *q, + struct htb_class *cl, long delay) { struct rb_node **p = &q->wait_pq[cl->level].rb_node, *parent = NULL; @@ -345,13 +345,14 @@ static void htb_add_to_wait_tree (struct htb_sched *q, /* update the nearest event cache */ if (time_after(q->near_ev_cache[cl->level], cl->pq_key)) q->near_ev_cache[cl->level] = cl->pq_key; - + while (*p) { - struct htb_class *c; parent = *p; + struct htb_class *c; + parent = *p; c = rb_entry(parent, struct htb_class, pq_node); if (time_after_eq(cl->pq_key, c->pq_key)) p = &parent->rb_right; - else + else p = &parent->rb_left; } rb_link_node(&cl->pq_node, parent, p); @@ -375,14 +376,14 @@ static void htb_next_rb_node(struct rb_node **n) * The class is added to row at priorities marked in mask. * It does nothing if mask == 0. */ -static inline void htb_add_class_to_row(struct htb_sched *q, - struct htb_class *cl,int mask) +static inline void htb_add_class_to_row(struct htb_sched *q, + struct htb_class *cl, int mask) { q->row_mask[cl->level] |= mask; while (mask) { int prio = ffz(~mask); mask &= ~(1 << prio); - htb_add_to_id_tree(q->row[cl->level]+prio,cl,prio); + htb_add_to_id_tree(q->row[cl->level] + prio, cl, prio); } } @@ -392,18 +393,18 @@ static inline void htb_add_class_to_row(struct htb_sched *q, * The class is removed from row at priorities marked in mask. * It does nothing if mask == 0. */ -static __inline__ void htb_remove_class_from_row(struct htb_sched *q, - struct htb_class *cl,int mask) +static inline void htb_remove_class_from_row(struct htb_sched *q, + struct htb_class *cl, int mask) { int m = 0; while (mask) { int prio = ffz(~mask); mask &= ~(1 << prio); - if (q->ptr[cl->level][prio] == cl->node+prio) - htb_next_rb_node(q->ptr[cl->level]+prio); - rb_erase(cl->node + prio,q->row[cl->level]+prio); - if (!q->row[cl->level][prio].rb_node) + if (q->ptr[cl->level][prio] == cl->node + prio) + htb_next_rb_node(q->ptr[cl->level] + prio); + rb_erase(cl->node + prio, q->row[cl->level] + prio); + if (!q->row[cl->level][prio].rb_node) m |= 1 << prio; } q->row_mask[cl->level] &= ~m; @@ -416,30 +417,31 @@ static __inline__ void htb_remove_class_from_row(struct htb_sched *q, * for priorities it is participating on. cl->cmode must be new * (activated) mode. It does nothing if cl->prio_activity == 0. */ -static void htb_activate_prios(struct htb_sched *q,struct htb_class *cl) +static void htb_activate_prios(struct htb_sched *q, struct htb_class *cl) { struct htb_class *p = cl->parent; - long m,mask = cl->prio_activity; + long m, mask = cl->prio_activity; while (cl->cmode == HTB_MAY_BORROW && p && mask) { - - m = mask; while (m) { + m = mask; + while (m) { int prio = ffz(~m); m &= ~(1 << prio); - + if (p->un.inner.feed[prio].rb_node) /* parent already has its feed in use so that reset bit in mask as parent is already ok */ mask &= ~(1 << prio); - - htb_add_to_id_tree(p->un.inner.feed+prio,cl,prio); + + htb_add_to_id_tree(p->un.inner.feed + prio, cl, prio); } p->prio_activity |= mask; - cl = p; p = cl->parent; + cl = p; + p = cl->parent; } if (cl->cmode == HTB_CAN_SEND && mask) - htb_add_class_to_row(q,cl,mask); + htb_add_class_to_row(q, cl, mask); } /** @@ -452,35 +454,36 @@ static void htb_activate_prios(struct htb_sched *q,struct htb_class *cl) static void htb_deactivate_prios(struct htb_sched *q, struct htb_class *cl) { struct htb_class *p = cl->parent; - long m,mask = cl->prio_activity; - + long m, mask = cl->prio_activity; while (cl->cmode == HTB_MAY_BORROW && p && mask) { - m = mask; mask = 0; + m = mask; + mask = 0; while (m) { int prio = ffz(~m); m &= ~(1 << prio); - - if (p->un.inner.ptr[prio] == cl->node+prio) { + + if (p->un.inner.ptr[prio] == cl->node + prio) { /* we are removing child which is pointed to from parent feed - forget the pointer but remember classid */ p->un.inner.last_ptr_id[prio] = cl->classid; p->un.inner.ptr[prio] = NULL; } - - rb_erase(cl->node + prio,p->un.inner.feed + prio); - - if (!p->un.inner.feed[prio].rb_node) + + rb_erase(cl->node + prio, p->un.inner.feed + prio); + + if (!p->un.inner.feed[prio].rb_node) mask |= 1 << prio; } p->prio_activity &= ~mask; - cl = p; p = cl->parent; + cl = p; + p = cl->parent; } - if (cl->cmode == HTB_CAN_SEND && mask) - htb_remove_class_from_row(q,cl,mask); + if (cl->cmode == HTB_CAN_SEND && mask) + htb_remove_class_from_row(q, cl, mask); } #if HTB_HYSTERESIS @@ -508,21 +511,21 @@ static inline long htb_hiwater(const struct htb_class *cl) * 0 .. -cl->{c,}buffer range. It is meant to limit number of * mode transitions per time unit. The speed gain is about 1/6. */ -static __inline__ enum htb_cmode -htb_class_mode(struct htb_class *cl,long *diff) +static inline enum htb_cmode +htb_class_mode(struct htb_class *cl, long *diff) { - long toks; + long toks; - if ((toks = (cl->ctokens + *diff)) < htb_lowater(cl)) { - *diff = -toks; - return HTB_CANT_SEND; - } + if ((toks = (cl->ctokens + *diff)) < htb_lowater(cl)) { + *diff = -toks; + return HTB_CANT_SEND; + } - if ((toks = (cl->tokens + *diff)) >= htb_hiwater(cl)) - return HTB_CAN_SEND; + if ((toks = (cl->tokens + *diff)) >= htb_hiwater(cl)) + return HTB_CAN_SEND; - *diff = -toks; - return HTB_MAY_BORROW; + *diff = -toks; + return HTB_MAY_BORROW; } /** @@ -534,22 +537,21 @@ htb_class_mode(struct htb_class *cl,long *diff) * be different from old one and cl->pq_key has to be valid if changing * to mode other than HTB_CAN_SEND (see htb_add_to_wait_tree). */ -static void +static void htb_change_class_mode(struct htb_sched *q, struct htb_class *cl, long *diff) -{ - enum htb_cmode new_mode = htb_class_mode(cl,diff); - +{ + enum htb_cmode new_mode = htb_class_mode(cl, diff); if (new_mode == cl->cmode) - return; - - if (cl->prio_activity) { /* not necessary: speed optimization */ - if (cl->cmode != HTB_CANT_SEND) - htb_deactivate_prios(q,cl); + return; + + if (cl->prio_activity) { /* not necessary: speed optimization */ + if (cl->cmode != HTB_CANT_SEND) + htb_deactivate_prios(q, cl); cl->cmode = new_mode; - if (new_mode != HTB_CANT_SEND) - htb_activate_prios(q,cl); - } else + if (new_mode != HTB_CANT_SEND) + htb_activate_prios(q, cl); + } else cl->cmode = new_mode; } @@ -560,14 +562,15 @@ htb_change_class_mode(struct htb_sched *q, struct htb_class *cl, long *diff) * for the prio. It can be called on already active leaf safely. * It also adds leaf into droplist. */ -static __inline__ void htb_activate(struct htb_sched *q,struct htb_class *cl) +static inline void htb_activate(struct htb_sched *q, struct htb_class *cl) { BUG_TRAP(!cl->level && cl->un.leaf.q && cl->un.leaf.q->q.qlen); if (!cl->prio_activity) { cl->prio_activity = 1 << (cl->un.leaf.aprio = cl->un.leaf.prio); - htb_activate_prios(q,cl); - list_add_tail(&cl->un.leaf.drop_list,q->drops+cl->un.leaf.aprio); + htb_activate_prios(q, cl); + list_add_tail(&cl->un.leaf.drop_list, + q->drops + cl->un.leaf.aprio); } } @@ -577,97 +580,100 @@ static __inline__ void htb_activate(struct htb_sched *q,struct htb_class *cl) * Make sure that leaf is active. In the other words it can't be called * with non-active leaf. It also removes class from the drop list. */ -static __inline__ void -htb_deactivate(struct htb_sched *q,struct htb_class *cl) +static inline void htb_deactivate(struct htb_sched *q, struct htb_class *cl) { BUG_TRAP(cl->prio_activity); - htb_deactivate_prios(q,cl); + htb_deactivate_prios(q, cl); cl->prio_activity = 0; list_del_init(&cl->un.leaf.drop_list); } static int htb_enqueue(struct sk_buff *skb, struct Qdisc *sch) { - int ret; - struct htb_sched *q = qdisc_priv(sch); - struct htb_class *cl = htb_classify(skb,sch,&ret); - - if (cl == HTB_DIRECT) { - /* enqueue to helper queue */ - if (q->direct_queue.qlen < q->direct_qlen) { - __skb_queue_tail(&q->direct_queue, skb); - q->direct_pkts++; - } else { - kfree_skb(skb); - sch->qstats.drops++; - return NET_XMIT_DROP; - } + int ret; + struct htb_sched *q = qdisc_priv(sch); + struct htb_class *cl = htb_classify(skb, sch, &ret); + + if (cl == HTB_DIRECT) { + /* enqueue to helper queue */ + if (q->direct_queue.qlen < q->direct_qlen) { + __skb_queue_tail(&q->direct_queue, skb); + q->direct_pkts++; + } else { + kfree_skb(skb); + sch->qstats.drops++; + return NET_XMIT_DROP; + } #ifdef CONFIG_NET_CLS_ACT - } else if (!cl) { - if (ret == NET_XMIT_BYPASS) - sch->qstats.drops++; - kfree_skb (skb); - return ret; + } else if (!cl) { + if (ret == NET_XMIT_BYPASS) + sch->qstats.drops++; + kfree_skb(skb); + return ret; #endif - } else if (cl->un.leaf.q->enqueue(skb, cl->un.leaf.q) != NET_XMIT_SUCCESS) { - sch->qstats.drops++; - cl->qstats.drops++; - return NET_XMIT_DROP; - } else { - cl->bstats.packets++; cl->bstats.bytes += skb->len; - htb_activate (q,cl); - } - - sch->q.qlen++; - sch->bstats.packets++; sch->bstats.bytes += skb->len; - return NET_XMIT_SUCCESS; + } else if (cl->un.leaf.q->enqueue(skb, cl->un.leaf.q) != + NET_XMIT_SUCCESS) { + sch->qstats.drops++; + cl->qstats.drops++; + return NET_XMIT_DROP; + } else { + cl->bstats.packets++; + cl->bstats.bytes += skb->len; + htb_activate(q, cl); + } + + sch->q.qlen++; + sch->bstats.packets++; + sch->bstats.bytes += skb->len; + return NET_XMIT_SUCCESS; } /* TODO: requeuing packet charges it to policers again !! */ static int htb_requeue(struct sk_buff *skb, struct Qdisc *sch) { - struct htb_sched *q = qdisc_priv(sch); - int ret = NET_XMIT_SUCCESS; - struct htb_class *cl = htb_classify(skb,sch, &ret); - struct sk_buff *tskb; - - if (cl == HTB_DIRECT || !cl) { - /* enqueue to helper queue */ - if (q->direct_queue.qlen < q->direct_qlen && cl) { - __skb_queue_head(&q->direct_queue, skb); - } else { - __skb_queue_head(&q->direct_queue, skb); - tskb = __skb_dequeue_tail(&q->direct_queue); - kfree_skb (tskb); - sch->qstats.drops++; - return NET_XMIT_CN; - } - } else if (cl->un.leaf.q->ops->requeue(skb, cl->un.leaf.q) != NET_XMIT_SUCCESS) { - sch->qstats.drops++; - cl->qstats.drops++; - return NET_XMIT_DROP; - } else - htb_activate (q,cl); - - sch->q.qlen++; - sch->qstats.requeues++; - return NET_XMIT_SUCCESS; + struct htb_sched *q = qdisc_priv(sch); + int ret = NET_XMIT_SUCCESS; + struct htb_class *cl = htb_classify(skb, sch, &ret); + struct sk_buff *tskb; + + if (cl == HTB_DIRECT || !cl) { + /* enqueue to helper queue */ + if (q->direct_queue.qlen < q->direct_qlen && cl) { + __skb_queue_head(&q->direct_queue, skb); + } else { + __skb_queue_head(&q->direct_queue, skb); + tskb = __skb_dequeue_tail(&q->direct_queue); + kfree_skb(tskb); + sch->qstats.drops++; + return NET_XMIT_CN; + } + } else if (cl->un.leaf.q->ops->requeue(skb, cl->un.leaf.q) != + NET_XMIT_SUCCESS) { + sch->qstats.drops++; + cl->qstats.drops++; + return NET_XMIT_DROP; + } else + htb_activate(q, cl); + + sch->q.qlen++; + sch->qstats.requeues++; + return NET_XMIT_SUCCESS; } static void htb_timer(unsigned long arg) { - struct Qdisc *sch = (struct Qdisc*)arg; - sch->flags &= ~TCQ_F_THROTTLED; - wmb(); - netif_schedule(sch->dev); + struct Qdisc *sch = (struct Qdisc *)arg; + sch->flags &= ~TCQ_F_THROTTLED; + wmb(); + netif_schedule(sch->dev); } #ifdef HTB_RATECM #define RT_GEN(D,R) R+=D-(R/HTB_EWMAC);D=0 static void htb_rate_timer(unsigned long arg) { - struct Qdisc *sch = (struct Qdisc*)arg; + struct Qdisc *sch = (struct Qdisc *)arg; struct htb_sched *q = qdisc_priv(sch); struct list_head *p; @@ -678,13 +684,13 @@ static void htb_rate_timer(unsigned long arg) add_timer(&q->rttim); /* scan and recompute one bucket at time */ - if (++q->recmp_bucket >= HTB_HSIZE) + if (++q->recmp_bucket >= HTB_HSIZE) q->recmp_bucket = 0; - list_for_each (p,q->hash+q->recmp_bucket) { - struct htb_class *cl = list_entry(p,struct htb_class,hlist); + list_for_each(p, q->hash + q->recmp_bucket) { + struct htb_class *cl = list_entry(p, struct htb_class, hlist); - RT_GEN (cl->sum_bytes,cl->rate_bytes); - RT_GEN (cl->sum_packets,cl->rate_packets); + RT_GEN(cl->sum_bytes, cl->rate_bytes); + RT_GEN(cl->sum_packets, cl->rate_packets); } spin_unlock_bh(&sch->dev->queue_lock); } @@ -701,10 +707,10 @@ static void htb_rate_timer(unsigned long arg) * CAN_SEND) because we can use more precise clock that event queue here. * In such case we remove class from event queue first. */ -static void htb_charge_class(struct htb_sched *q,struct htb_class *cl, - int level,int bytes) -{ - long toks,diff; +static void htb_charge_class(struct htb_sched *q, struct htb_class *cl, + int level, int bytes) +{ + long toks, diff; enum htb_cmode old_mode; #define HTB_ACCNT(T,B,R) toks = diff + cl->T; \ @@ -714,29 +720,31 @@ static void htb_charge_class(struct htb_sched *q,struct htb_class *cl, cl->T = toks while (cl) { - diff = PSCHED_TDIFF_SAFE(q->now, cl->t_c, (u32)cl->mbuffer); + diff = PSCHED_TDIFF_SAFE(q->now, cl->t_c, (u32) cl->mbuffer); if (cl->level >= level) { - if (cl->level == level) cl->xstats.lends++; - HTB_ACCNT (tokens,buffer,rate); + if (cl->level == level) + cl->xstats.lends++; + HTB_ACCNT(tokens, buffer, rate); } else { cl->xstats.borrows++; - cl->tokens += diff; /* we moved t_c; update tokens */ + cl->tokens += diff; /* we moved t_c; update tokens */ } - HTB_ACCNT (ctokens,cbuffer,ceil); + HTB_ACCNT(ctokens, cbuffer, ceil); cl->t_c = q->now; - old_mode = cl->cmode; diff = 0; - htb_change_class_mode(q,cl,&diff); + old_mode = cl->cmode; + diff = 0; + htb_change_class_mode(q, cl, &diff); if (old_mode != cl->cmode) { if (old_mode != HTB_CAN_SEND) - rb_erase(&cl->pq_node,q->wait_pq+cl->level); + rb_erase(&cl->pq_node, q->wait_pq + cl->level); if (cl->cmode != HTB_CAN_SEND) - htb_add_to_wait_tree (q,cl,diff); + htb_add_to_wait_tree(q, cl, diff); } - #ifdef HTB_RATECM /* update rate counters */ - cl->sum_bytes += bytes; cl->sum_packets++; + cl->sum_bytes += bytes; + cl->sum_packets++; #endif /* update byte stats except for leaves which are already updated */ @@ -755,7 +763,7 @@ static void htb_charge_class(struct htb_sched *q,struct htb_class *cl, * next pending event (0 for no event in pq). * Note: Aplied are events whose have cl->pq_key <= jiffies. */ -static long htb_do_events(struct htb_sched *q,int level) +static long htb_do_events(struct htb_sched *q, int level) { int i; @@ -763,34 +771,38 @@ static long htb_do_events(struct htb_sched *q,int level) struct htb_class *cl; long diff; struct rb_node *p = q->wait_pq[level].rb_node; - if (!p) return 0; - while (p->rb_left) p = p->rb_left; + if (!p) + return 0; + while (p->rb_left) + p = p->rb_left; cl = rb_entry(p, struct htb_class, pq_node); if (time_after(cl->pq_key, q->jiffies)) { return cl->pq_key - q->jiffies; } - rb_erase(p,q->wait_pq+level); - diff = PSCHED_TDIFF_SAFE(q->now, cl->t_c, (u32)cl->mbuffer); - htb_change_class_mode(q,cl,&diff); + rb_erase(p, q->wait_pq + level); + diff = PSCHED_TDIFF_SAFE(q->now, cl->t_c, (u32) cl->mbuffer); + htb_change_class_mode(q, cl, &diff); if (cl->cmode != HTB_CAN_SEND) - htb_add_to_wait_tree (q,cl,diff); + htb_add_to_wait_tree(q, cl, diff); } if (net_ratelimit()) printk(KERN_WARNING "htb: too many events !\n"); - return HZ/10; + return HZ / 10; } /* Returns class->node+prio from id-tree where classe's id is >= id. NULL is no such one exists. */ -static struct rb_node * -htb_id_find_next_upper(int prio,struct rb_node *n,u32 id) +static struct rb_node *htb_id_find_next_upper(int prio, struct rb_node *n, + u32 id) { struct rb_node *r = NULL; while (n) { - struct htb_class *cl = rb_entry(n,struct htb_class,node[prio]); - if (id == cl->classid) return n; - + struct htb_class *cl = + rb_entry(n, struct htb_class, node[prio]); + if (id == cl->classid) + return n; + if (id > cl->classid) { n = n->rb_right; } else { @@ -806,46 +818,49 @@ htb_id_find_next_upper(int prio,struct rb_node *n,u32 id) * * Find leaf where current feed pointers points to. */ -static struct htb_class * -htb_lookup_leaf(struct rb_root *tree,int prio,struct rb_node **pptr,u32 *pid) +static struct htb_class *htb_lookup_leaf(struct rb_root *tree, int prio, + struct rb_node **pptr, u32 * pid) { int i; struct { struct rb_node *root; struct rb_node **pptr; u32 *pid; - } stk[TC_HTB_MAXDEPTH],*sp = stk; - + } stk[TC_HTB_MAXDEPTH], *sp = stk; + BUG_TRAP(tree->rb_node); sp->root = tree->rb_node; sp->pptr = pptr; sp->pid = pid; for (i = 0; i < 65535; i++) { - if (!*sp->pptr && *sp->pid) { + if (!*sp->pptr && *sp->pid) { /* ptr was invalidated but id is valid - try to recover the original or next ptr */ - *sp->pptr = htb_id_find_next_upper(prio,sp->root,*sp->pid); + *sp->pptr = + htb_id_find_next_upper(prio, sp->root, *sp->pid); } - *sp->pid = 0; /* ptr is valid now so that remove this hint as it - can become out of date quickly */ - if (!*sp->pptr) { /* we are at right end; rewind & go up */ + *sp->pid = 0; /* ptr is valid now so that remove this hint as it + can become out of date quickly */ + if (!*sp->pptr) { /* we are at right end; rewind & go up */ *sp->pptr = sp->root; - while ((*sp->pptr)->rb_left) + while ((*sp->pptr)->rb_left) *sp->pptr = (*sp->pptr)->rb_left; if (sp > stk) { sp--; - BUG_TRAP(*sp->pptr); if(!*sp->pptr) return NULL; - htb_next_rb_node (sp->pptr); + BUG_TRAP(*sp->pptr); + if (!*sp->pptr) + return NULL; + htb_next_rb_node(sp->pptr); } } else { struct htb_class *cl; - cl = rb_entry(*sp->pptr,struct htb_class,node[prio]); - if (!cl->level) + cl = rb_entry(*sp->pptr, struct htb_class, node[prio]); + if (!cl->level) return cl; (++sp)->root = cl->un.inner.feed[prio].rb_node; - sp->pptr = cl->un.inner.ptr+prio; - sp->pid = cl->un.inner.last_ptr_id+prio; + sp->pptr = cl->un.inner.ptr + prio; + sp->pid = cl->un.inner.last_ptr_id + prio; } } BUG_TRAP(0); @@ -854,19 +869,21 @@ htb_lookup_leaf(struct rb_root *tree,int prio,struct rb_node **pptr,u32 *pid) /* dequeues packet at given priority and level; call only if you are sure that there is active class at prio/level */ -static struct sk_buff * -htb_dequeue_tree(struct htb_sched *q,int prio,int level) +static struct sk_buff *htb_dequeue_tree(struct htb_sched *q, int prio, + int level) { struct sk_buff *skb = NULL; - struct htb_class *cl,*start; + struct htb_class *cl, *start; /* look initial class up in the row */ - start = cl = htb_lookup_leaf (q->row[level]+prio,prio, - q->ptr[level]+prio,q->last_ptr_id[level]+prio); - + start = cl = htb_lookup_leaf(q->row[level] + prio, prio, + q->ptr[level] + prio, + q->last_ptr_id[level] + prio); + do { next: - BUG_TRAP(cl); - if (!cl) return NULL; + BUG_TRAP(cl); + if (!cl) + return NULL; /* class can be empty - it is unlikely but can be true if leaf qdisc drops packets in enqueue routine or if someone used @@ -874,56 +891,64 @@ next: simply deactivate and skip such class */ if (unlikely(cl->un.leaf.q->q.qlen == 0)) { struct htb_class *next; - htb_deactivate(q,cl); + htb_deactivate(q, cl); /* row/level might become empty */ if ((q->row_mask[level] & (1 << prio)) == 0) - return NULL; - - next = htb_lookup_leaf (q->row[level]+prio, - prio,q->ptr[level]+prio,q->last_ptr_id[level]+prio); + return NULL; - if (cl == start) /* fix start if we just deleted it */ + next = htb_lookup_leaf(q->row[level] + prio, + prio, q->ptr[level] + prio, + q->last_ptr_id[level] + prio); + + if (cl == start) /* fix start if we just deleted it */ start = next; cl = next; goto next; } - - if (likely((skb = cl->un.leaf.q->dequeue(cl->un.leaf.q)) != NULL)) + + skb = cl->un.leaf.q->dequeue(cl->un.leaf.q); + if (likely(skb != NULL)) break; if (!cl->warned) { - printk(KERN_WARNING "htb: class %X isn't work conserving ?!\n",cl->classid); + printk(KERN_WARNING + "htb: class %X isn't work conserving ?!\n", + cl->classid); cl->warned = 1; } q->nwc_hit++; - htb_next_rb_node((level?cl->parent->un.inner.ptr:q->ptr[0])+prio); - cl = htb_lookup_leaf (q->row[level]+prio,prio,q->ptr[level]+prio, - q->last_ptr_id[level]+prio); + htb_next_rb_node((level ? cl->parent->un.inner.ptr : q-> + ptr[0]) + prio); + cl = htb_lookup_leaf(q->row[level] + prio, prio, + q->ptr[level] + prio, + q->last_ptr_id[level] + prio); } while (cl != start); if (likely(skb != NULL)) { if ((cl->un.leaf.deficit[level] -= skb->len) < 0) { cl->un.leaf.deficit[level] += cl->un.leaf.quantum; - htb_next_rb_node((level?cl->parent->un.inner.ptr:q->ptr[0])+prio); + htb_next_rb_node((level ? cl->parent->un.inner.ptr : q-> + ptr[0]) + prio); } /* this used to be after charge_class but this constelation gives us slightly better performance */ if (!cl->un.leaf.q->q.qlen) - htb_deactivate (q,cl); - htb_charge_class (q,cl,level,skb->len); + htb_deactivate(q, cl); + htb_charge_class(q, cl, level, skb->len); } return skb; } -static void htb_delay_by(struct Qdisc *sch,long delay) +static void htb_delay_by(struct Qdisc *sch, long delay) { struct htb_sched *q = qdisc_priv(sch); - if (delay <= 0) delay = 1; - if (unlikely(delay > 5*HZ)) { + if (delay <= 0) + delay = 1; + if (unlikely(delay > 5 * HZ)) { if (net_ratelimit()) printk(KERN_INFO "HTB delay %ld > 5sec\n", delay); - delay = 5*HZ; + delay = 5 * HZ; } /* why don't use jiffies here ? because expires can be in past */ mod_timer(&q->timer, q->jiffies + delay); @@ -941,13 +966,15 @@ static struct sk_buff *htb_dequeue(struct Qdisc *sch) q->jiffies = jiffies; /* try to dequeue direct packets as high prio (!) to minimize cpu work */ - if ((skb = __skb_dequeue(&q->direct_queue)) != NULL) { + skb = __skb_dequeue(&q->direct_queue); + if (skb != NULL) { sch->flags &= ~TCQ_F_THROTTLED; sch->q.qlen--; return skb; } - if (!sch->q.qlen) goto fin; + if (!sch->q.qlen) + goto fin; PSCHED_GET_TIME(q->now); min_delay = LONG_MAX; @@ -957,18 +984,19 @@ static struct sk_buff *htb_dequeue(struct Qdisc *sch) int m; long delay; if (time_after_eq(q->jiffies, q->near_ev_cache[level])) { - delay = htb_do_events(q,level); - q->near_ev_cache[level] = q->jiffies + (delay ? delay : HZ); + delay = htb_do_events(q, level); + q->near_ev_cache[level] = + q->jiffies + (delay ? delay : HZ); } else - delay = q->near_ev_cache[level] - q->jiffies; - - if (delay && min_delay > delay) + delay = q->near_ev_cache[level] - q->jiffies; + + if (delay && min_delay > delay) min_delay = delay; m = ~q->row_mask[level]; while (m != (int)(-1)) { - int prio = ffz (m); + int prio = ffz(m); m |= 1 << prio; - skb = htb_dequeue_tree(q,prio,level); + skb = htb_dequeue_tree(q, prio, level); if (likely(skb != NULL)) { sch->q.qlen--; sch->flags &= ~TCQ_F_THROTTLED; @@ -976,28 +1004,28 @@ static struct sk_buff *htb_dequeue(struct Qdisc *sch) } } } - htb_delay_by (sch,min_delay > 5*HZ ? 5*HZ : min_delay); + htb_delay_by(sch, min_delay > 5 * HZ ? 5 * HZ : min_delay); fin: return skb; } /* try to drop from each class (by prio) until one succeed */ -static unsigned int htb_drop(struct Qdisc* sch) +static unsigned int htb_drop(struct Qdisc *sch) { struct htb_sched *q = qdisc_priv(sch); int prio; for (prio = TC_HTB_NUMPRIO - 1; prio >= 0; prio--) { struct list_head *p; - list_for_each (p,q->drops+prio) { + list_for_each(p, q->drops + prio) { struct htb_class *cl = list_entry(p, struct htb_class, un.leaf.drop_list); unsigned int len; - if (cl->un.leaf.q->ops->drop && - (len = cl->un.leaf.q->ops->drop(cl->un.leaf.q))) { + if (cl->un.leaf.q->ops->drop && + (len = cl->un.leaf.q->ops->drop(cl->un.leaf.q))) { sch->q.qlen--; if (!cl->un.leaf.q->q.qlen) - htb_deactivate (q,cl); + htb_deactivate(q, cl); return len; } } @@ -1007,19 +1035,20 @@ static unsigned int htb_drop(struct Qdisc* sch) /* reset all classes */ /* always caled under BH & queue lock */ -static void htb_reset(struct Qdisc* sch) +static void htb_reset(struct Qdisc *sch) { struct htb_sched *q = qdisc_priv(sch); int i; for (i = 0; i < HTB_HSIZE; i++) { struct list_head *p; - list_for_each (p,q->hash+i) { - struct htb_class *cl = list_entry(p,struct htb_class,hlist); + list_for_each(p, q->hash + i) { + struct htb_class *cl = + list_entry(p, struct htb_class, hlist); if (cl->level) - memset(&cl->un.inner,0,sizeof(cl->un.inner)); + memset(&cl->un.inner, 0, sizeof(cl->un.inner)); else { - if (cl->un.leaf.q) + if (cl->un.leaf.q) qdisc_reset(cl->un.leaf.q); INIT_LIST_HEAD(&cl->un.leaf.drop_list); } @@ -1032,12 +1061,12 @@ static void htb_reset(struct Qdisc* sch) del_timer(&q->timer); __skb_queue_purge(&q->direct_queue); sch->q.qlen = 0; - memset(q->row,0,sizeof(q->row)); - memset(q->row_mask,0,sizeof(q->row_mask)); - memset(q->wait_pq,0,sizeof(q->wait_pq)); - memset(q->ptr,0,sizeof(q->ptr)); + memset(q->row, 0, sizeof(q->row)); + memset(q->row_mask, 0, sizeof(q->row_mask)); + memset(q->wait_pq, 0, sizeof(q->wait_pq)); + memset(q->ptr, 0, sizeof(q->ptr)); for (i = 0; i < TC_HTB_NUMPRIO; i++) - INIT_LIST_HEAD(q->drops+i); + INIT_LIST_HEAD(q->drops + i); } static int htb_init(struct Qdisc *sch, struct rtattr *opt) @@ -1047,29 +1076,30 @@ static int htb_init(struct Qdisc *sch, struct rtattr *opt) struct tc_htb_glob *gopt; int i; if (!opt || rtattr_parse_nested(tb, TCA_HTB_INIT, opt) || - tb[TCA_HTB_INIT-1] == NULL || - RTA_PAYLOAD(tb[TCA_HTB_INIT-1]) < sizeof(*gopt)) { + tb[TCA_HTB_INIT - 1] == NULL || + RTA_PAYLOAD(tb[TCA_HTB_INIT - 1]) < sizeof(*gopt)) { printk(KERN_ERR "HTB: hey probably you have bad tc tool ?\n"); return -EINVAL; } - gopt = RTA_DATA(tb[TCA_HTB_INIT-1]); + gopt = RTA_DATA(tb[TCA_HTB_INIT - 1]); if (gopt->version != HTB_VER >> 16) { - printk(KERN_ERR "HTB: need tc/htb version %d (minor is %d), you have %d\n", - HTB_VER >> 16,HTB_VER & 0xffff,gopt->version); + printk(KERN_ERR + "HTB: need tc/htb version %d (minor is %d), you have %d\n", + HTB_VER >> 16, HTB_VER & 0xffff, gopt->version); return -EINVAL; } INIT_LIST_HEAD(&q->root); for (i = 0; i < HTB_HSIZE; i++) - INIT_LIST_HEAD(q->hash+i); + INIT_LIST_HEAD(q->hash + i); for (i = 0; i < TC_HTB_NUMPRIO; i++) - INIT_LIST_HEAD(q->drops+i); + INIT_LIST_HEAD(q->drops + i); init_timer(&q->timer); skb_queue_head_init(&q->direct_queue); q->direct_qlen = sch->dev->tx_queue_len; - if (q->direct_qlen < 2) /* some devices have zero tx_queue_len */ + if (q->direct_qlen < 2) /* some devices have zero tx_queue_len */ q->direct_qlen = 2; q->timer.function = htb_timer; q->timer.data = (unsigned long)sch; @@ -1091,7 +1121,7 @@ static int htb_init(struct Qdisc *sch, struct rtattr *opt) static int htb_dump(struct Qdisc *sch, struct sk_buff *skb) { struct htb_sched *q = qdisc_priv(sch); - unsigned char *b = skb->tail; + unsigned char *b = skb->tail; struct rtattr *rta; struct tc_htb_glob gopt; spin_lock_bh(&sch->dev->queue_lock); @@ -1101,7 +1131,7 @@ static int htb_dump(struct Qdisc *sch, struct sk_buff *skb) gopt.rate2quantum = q->rate2quantum; gopt.defcls = q->defcls; gopt.debug = 0; - rta = (struct rtattr*)b; + rta = (struct rtattr *)b; RTA_PUT(skb, TCA_OPTIONS, 0, NULL); RTA_PUT(skb, TCA_HTB_INIT, sizeof(gopt), &gopt); rta->rta_len = skb->tail - b; @@ -1114,10 +1144,10 @@ rtattr_failure: } static int htb_dump_class(struct Qdisc *sch, unsigned long arg, - struct sk_buff *skb, struct tcmsg *tcm) + struct sk_buff *skb, struct tcmsg *tcm) { - struct htb_class *cl = (struct htb_class*)arg; - unsigned char *b = skb->tail; + struct htb_class *cl = (struct htb_class *)arg; + unsigned char *b = skb->tail; struct rtattr *rta; struct tc_htb_opt opt; @@ -1127,15 +1157,18 @@ static int htb_dump_class(struct Qdisc *sch, unsigned long arg, if (!cl->level && cl->un.leaf.q) tcm->tcm_info = cl->un.leaf.q->handle; - rta = (struct rtattr*)b; + rta = (struct rtattr *)b; RTA_PUT(skb, TCA_OPTIONS, 0, NULL); - memset (&opt,0,sizeof(opt)); + memset(&opt, 0, sizeof(opt)); - opt.rate = cl->rate->rate; opt.buffer = cl->buffer; - opt.ceil = cl->ceil->rate; opt.cbuffer = cl->cbuffer; - opt.quantum = cl->un.leaf.quantum; opt.prio = cl->un.leaf.prio; - opt.level = cl->level; + opt.rate = cl->rate->rate; + opt.buffer = cl->buffer; + opt.ceil = cl->ceil->rate; + opt.cbuffer = cl->cbuffer; + opt.quantum = cl->un.leaf.quantum; + opt.prio = cl->un.leaf.prio; + opt.level = cl->level; RTA_PUT(skb, TCA_HTB_PARMS, sizeof(opt), &opt); rta->rta_len = skb->tail - b; spin_unlock_bh(&sch->dev->queue_lock); @@ -1147,14 +1180,13 @@ rtattr_failure: } static int -htb_dump_class_stats(struct Qdisc *sch, unsigned long arg, - struct gnet_dump *d) +htb_dump_class_stats(struct Qdisc *sch, unsigned long arg, struct gnet_dump *d) { - struct htb_class *cl = (struct htb_class*)arg; + struct htb_class *cl = (struct htb_class *)arg; #ifdef HTB_RATECM - cl->rate_est.bps = cl->rate_bytes/(HTB_EWMAC*HTB_HSIZE); - cl->rate_est.pps = cl->rate_packets/(HTB_EWMAC*HTB_HSIZE); + cl->rate_est.bps = cl->rate_bytes / (HTB_EWMAC * HTB_HSIZE); + cl->rate_est.pps = cl->rate_packets / (HTB_EWMAC * HTB_HSIZE); #endif if (!cl->level && cl->un.leaf.q) @@ -1171,21 +1203,22 @@ htb_dump_class_stats(struct Qdisc *sch, unsigned long arg, } static int htb_graft(struct Qdisc *sch, unsigned long arg, struct Qdisc *new, - struct Qdisc **old) + struct Qdisc **old) { - struct htb_class *cl = (struct htb_class*)arg; + struct htb_class *cl = (struct htb_class *)arg; if (cl && !cl->level) { - if (new == NULL && (new = qdisc_create_dflt(sch->dev, - &pfifo_qdisc_ops)) == NULL) - return -ENOBUFS; + if (new == NULL && (new = qdisc_create_dflt(sch->dev, + &pfifo_qdisc_ops)) + == NULL) + return -ENOBUFS; sch_tree_lock(sch); if ((*old = xchg(&cl->un.leaf.q, new)) != NULL) { if (cl->prio_activity) - htb_deactivate (qdisc_priv(sch),cl); + htb_deactivate(qdisc_priv(sch), cl); /* TODO: is it correct ? Why CBQ doesn't do it ? */ - sch->q.qlen -= (*old)->q.qlen; + sch->q.qlen -= (*old)->q.qlen; qdisc_reset(*old); } sch_tree_unlock(sch); @@ -1194,16 +1227,16 @@ static int htb_graft(struct Qdisc *sch, unsigned long arg, struct Qdisc *new, return -ENOENT; } -static struct Qdisc * htb_leaf(struct Qdisc *sch, unsigned long arg) +static struct Qdisc *htb_leaf(struct Qdisc *sch, unsigned long arg) { - struct htb_class *cl = (struct htb_class*)arg; + struct htb_class *cl = (struct htb_class *)arg; return (cl && !cl->level) ? cl->un.leaf.q : NULL; } static unsigned long htb_get(struct Qdisc *sch, u32 classid) { - struct htb_class *cl = htb_find(classid,sch); - if (cl) + struct htb_class *cl = htb_find(classid, sch); + if (cl) cl->refcnt++; return (unsigned long)cl; } @@ -1218,7 +1251,7 @@ static void htb_destroy_filters(struct tcf_proto **fl) } } -static void htb_destroy_class(struct Qdisc* sch,struct htb_class *cl) +static void htb_destroy_class(struct Qdisc *sch, struct htb_class *cl) { struct htb_sched *q = qdisc_priv(sch); if (!cl->level) { @@ -1228,44 +1261,44 @@ static void htb_destroy_class(struct Qdisc* sch,struct htb_class *cl) } qdisc_put_rtab(cl->rate); qdisc_put_rtab(cl->ceil); - - htb_destroy_filters (&cl->filter_list); - - while (!list_empty(&cl->children)) - htb_destroy_class (sch,list_entry(cl->children.next, - struct htb_class,sibling)); + + htb_destroy_filters(&cl->filter_list); + + while (!list_empty(&cl->children)) + htb_destroy_class(sch, list_entry(cl->children.next, + struct htb_class, sibling)); /* note: this delete may happen twice (see htb_delete) */ list_del(&cl->hlist); list_del(&cl->sibling); - + if (cl->prio_activity) - htb_deactivate (q,cl); - + htb_deactivate(q, cl); + if (cl->cmode != HTB_CAN_SEND) - rb_erase(&cl->pq_node,q->wait_pq+cl->level); - + rb_erase(&cl->pq_node, q->wait_pq + cl->level); + kfree(cl); } /* always caled under BH & queue lock */ -static void htb_destroy(struct Qdisc* sch) +static void htb_destroy(struct Qdisc *sch) { struct htb_sched *q = qdisc_priv(sch); - del_timer_sync (&q->timer); + del_timer_sync(&q->timer); #ifdef HTB_RATECM - del_timer_sync (&q->rttim); + del_timer_sync(&q->rttim); #endif /* This line used to be after htb_destroy_class call below and surprisingly it worked in 2.4. But it must precede it because filter need its target class alive to be able to call unbind_filter on it (without Oops). */ htb_destroy_filters(&q->filter_list); - - while (!list_empty(&q->root)) - htb_destroy_class (sch,list_entry(q->root.next, - struct htb_class,sibling)); + + while (!list_empty(&q->root)) + htb_destroy_class(sch, list_entry(q->root.next, + struct htb_class, sibling)); __skb_queue_purge(&q->direct_queue); } @@ -1273,23 +1306,23 @@ static void htb_destroy(struct Qdisc* sch) static int htb_delete(struct Qdisc *sch, unsigned long arg) { struct htb_sched *q = qdisc_priv(sch); - struct htb_class *cl = (struct htb_class*)arg; + struct htb_class *cl = (struct htb_class *)arg; // TODO: why don't allow to delete subtree ? references ? does // tc subsys quarantee us that in htb_destroy it holds no class // refs so that we can remove children safely there ? if (!list_empty(&cl->children) || cl->filter_cnt) return -EBUSY; - + sch_tree_lock(sch); - + /* delete from hash and active; remainder in destroy_class */ list_del_init(&cl->hlist); if (cl->prio_activity) - htb_deactivate (q,cl); + htb_deactivate(q, cl); if (--cl->refcnt == 0) - htb_destroy_class(sch,cl); + htb_destroy_class(sch, cl); sch_tree_unlock(sch); return 0; @@ -1297,41 +1330,44 @@ static int htb_delete(struct Qdisc *sch, unsigned long arg) static void htb_put(struct Qdisc *sch, unsigned long arg) { - struct htb_class *cl = (struct htb_class*)arg; + struct htb_class *cl = (struct htb_class *)arg; if (--cl->refcnt == 0) - htb_destroy_class(sch,cl); + htb_destroy_class(sch, cl); } -static int htb_change_class(struct Qdisc *sch, u32 classid, - u32 parentid, struct rtattr **tca, unsigned long *arg) +static int htb_change_class(struct Qdisc *sch, u32 classid, + u32 parentid, struct rtattr **tca, + unsigned long *arg) { int err = -EINVAL; struct htb_sched *q = qdisc_priv(sch); - struct htb_class *cl = (struct htb_class*)*arg,*parent; - struct rtattr *opt = tca[TCA_OPTIONS-1]; + struct htb_class *cl = (struct htb_class *)*arg, *parent; + struct rtattr *opt = tca[TCA_OPTIONS - 1]; struct qdisc_rate_table *rtab = NULL, *ctab = NULL; struct rtattr *tb[TCA_HTB_RTAB]; struct tc_htb_opt *hopt; /* extract all subattrs from opt attr */ if (!opt || rtattr_parse_nested(tb, TCA_HTB_RTAB, opt) || - tb[TCA_HTB_PARMS-1] == NULL || - RTA_PAYLOAD(tb[TCA_HTB_PARMS-1]) < sizeof(*hopt)) + tb[TCA_HTB_PARMS - 1] == NULL || + RTA_PAYLOAD(tb[TCA_HTB_PARMS - 1]) < sizeof(*hopt)) goto failure; - - parent = parentid == TC_H_ROOT ? NULL : htb_find (parentid,sch); - hopt = RTA_DATA(tb[TCA_HTB_PARMS-1]); + parent = parentid == TC_H_ROOT ? NULL : htb_find(parentid, sch); + + hopt = RTA_DATA(tb[TCA_HTB_PARMS - 1]); - rtab = qdisc_get_rtab(&hopt->rate, tb[TCA_HTB_RTAB-1]); - ctab = qdisc_get_rtab(&hopt->ceil, tb[TCA_HTB_CTAB-1]); - if (!rtab || !ctab) goto failure; + rtab = qdisc_get_rtab(&hopt->rate, tb[TCA_HTB_RTAB - 1]); + ctab = qdisc_get_rtab(&hopt->ceil, tb[TCA_HTB_CTAB - 1]); + if (!rtab || !ctab) + goto failure; - if (!cl) { /* new class */ + if (!cl) { /* new class */ struct Qdisc *new_q; /* check for valid classid */ - if (!classid || TC_H_MAJ(classid^sch->handle) || htb_find(classid,sch)) + if (!classid || TC_H_MAJ(classid ^ sch->handle) + || htb_find(classid, sch)) goto failure; /* check maximal depth */ @@ -1342,7 +1378,7 @@ static int htb_change_class(struct Qdisc *sch, u32 classid, err = -ENOBUFS; if ((cl = kzalloc(sizeof(*cl), GFP_KERNEL)) == NULL) goto failure; - + cl->refcnt = 1; INIT_LIST_HEAD(&cl->sibling); INIT_LIST_HEAD(&cl->hlist); @@ -1357,46 +1393,53 @@ static int htb_change_class(struct Qdisc *sch, u32 classid, if (parent && !parent->level) { /* turn parent into inner node */ sch->q.qlen -= parent->un.leaf.q->q.qlen; - qdisc_destroy (parent->un.leaf.q); - if (parent->prio_activity) - htb_deactivate (q,parent); + qdisc_destroy(parent->un.leaf.q); + if (parent->prio_activity) + htb_deactivate(q, parent); /* remove from evt list because of level change */ if (parent->cmode != HTB_CAN_SEND) { - rb_erase(&parent->pq_node,q->wait_pq); + rb_erase(&parent->pq_node, q->wait_pq); parent->cmode = HTB_CAN_SEND; } parent->level = (parent->parent ? parent->parent->level - : TC_HTB_MAXDEPTH) - 1; - memset (&parent->un.inner,0,sizeof(parent->un.inner)); + : TC_HTB_MAXDEPTH) - 1; + memset(&parent->un.inner, 0, sizeof(parent->un.inner)); } /* leaf (we) needs elementary qdisc */ cl->un.leaf.q = new_q ? new_q : &noop_qdisc; - cl->classid = classid; cl->parent = parent; + cl->classid = classid; + cl->parent = parent; /* set class to be in HTB_CAN_SEND state */ cl->tokens = hopt->buffer; cl->ctokens = hopt->cbuffer; - cl->mbuffer = PSCHED_JIFFIE2US(HZ*60); /* 1min */ + cl->mbuffer = PSCHED_JIFFIE2US(HZ * 60); /* 1min */ PSCHED_GET_TIME(cl->t_c); cl->cmode = HTB_CAN_SEND; /* attach to the hash list and parent's family */ - list_add_tail(&cl->hlist, q->hash+htb_hash(classid)); - list_add_tail(&cl->sibling, parent ? &parent->children : &q->root); - } else sch_tree_lock(sch); + list_add_tail(&cl->hlist, q->hash + htb_hash(classid)); + list_add_tail(&cl->sibling, + parent ? &parent->children : &q->root); + } else + sch_tree_lock(sch); /* it used to be a nasty bug here, we have to check that node - is really leaf before changing cl->un.leaf ! */ + is really leaf before changing cl->un.leaf ! */ if (!cl->level) { cl->un.leaf.quantum = rtab->rate.rate / q->rate2quantum; if (!hopt->quantum && cl->un.leaf.quantum < 1000) { - printk(KERN_WARNING "HTB: quantum of class %X is small. Consider r2q change.\n", cl->classid); + printk(KERN_WARNING + "HTB: quantum of class %X is small. Consider r2q change.\n", + cl->classid); cl->un.leaf.quantum = 1000; } if (!hopt->quantum && cl->un.leaf.quantum > 200000) { - printk(KERN_WARNING "HTB: quantum of class %X is big. Consider r2q change.\n", cl->classid); + printk(KERN_WARNING + "HTB: quantum of class %X is big. Consider r2q change.\n", + cl->classid); cl->un.leaf.quantum = 200000; } if (hopt->quantum) @@ -1407,16 +1450,22 @@ static int htb_change_class(struct Qdisc *sch, u32 classid, cl->buffer = hopt->buffer; cl->cbuffer = hopt->cbuffer; - if (cl->rate) qdisc_put_rtab(cl->rate); cl->rate = rtab; - if (cl->ceil) qdisc_put_rtab(cl->ceil); cl->ceil = ctab; + if (cl->rate) + qdisc_put_rtab(cl->rate); + cl->rate = rtab; + if (cl->ceil) + qdisc_put_rtab(cl->ceil); + cl->ceil = ctab; sch_tree_unlock(sch); *arg = (unsigned long)cl; return 0; failure: - if (rtab) qdisc_put_rtab(rtab); - if (ctab) qdisc_put_rtab(ctab); + if (rtab) + qdisc_put_rtab(rtab); + if (ctab) + qdisc_put_rtab(ctab); return err; } @@ -1430,23 +1479,23 @@ static struct tcf_proto **htb_find_tcf(struct Qdisc *sch, unsigned long arg) } static unsigned long htb_bind_filter(struct Qdisc *sch, unsigned long parent, - u32 classid) + u32 classid) { struct htb_sched *q = qdisc_priv(sch); - struct htb_class *cl = htb_find (classid,sch); + struct htb_class *cl = htb_find(classid, sch); /*if (cl && !cl->level) return 0; - The line above used to be there to prevent attaching filters to - leaves. But at least tc_index filter uses this just to get class - for other reasons so that we have to allow for it. - ---- - 19.6.2002 As Werner explained it is ok - bind filter is just - another way to "lock" the class - unlike "get" this lock can - be broken by class during destroy IIUC. + The line above used to be there to prevent attaching filters to + leaves. But at least tc_index filter uses this just to get class + for other reasons so that we have to allow for it. + ---- + 19.6.2002 As Werner explained it is ok - bind filter is just + another way to "lock" the class - unlike "get" this lock can + be broken by class during destroy IIUC. */ - if (cl) - cl->filter_cnt++; - else + if (cl) + cl->filter_cnt++; + else q->filter_cnt++; return (unsigned long)cl; } @@ -1456,9 +1505,9 @@ static void htb_unbind_filter(struct Qdisc *sch, unsigned long arg) struct htb_sched *q = qdisc_priv(sch); struct htb_class *cl = (struct htb_class *)arg; - if (cl) - cl->filter_cnt--; - else + if (cl) + cl->filter_cnt--; + else q->filter_cnt--; } @@ -1472,8 +1521,9 @@ static void htb_walk(struct Qdisc *sch, struct qdisc_walker *arg) for (i = 0; i < HTB_HSIZE; i++) { struct list_head *p; - list_for_each (p,q->hash+i) { - struct htb_class *cl = list_entry(p,struct htb_class,hlist); + list_for_each(p, q->hash + i) { + struct htb_class *cl = + list_entry(p, struct htb_class, hlist); if (arg->count < arg->skip) { arg->count++; continue; @@ -1521,12 +1571,13 @@ static struct Qdisc_ops htb_qdisc_ops = { static int __init htb_module_init(void) { - return register_qdisc(&htb_qdisc_ops); + return register_qdisc(&htb_qdisc_ops); } -static void __exit htb_module_exit(void) +static void __exit htb_module_exit(void) { - unregister_qdisc(&htb_qdisc_ops); + unregister_qdisc(&htb_qdisc_ops); } + module_init(htb_module_init) module_exit(htb_module_exit) MODULE_LICENSE("GPL"); -- cgit v0.10.2 From 0cef296da9331e871401076b8c0688b2b31fcadd Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Thu, 10 Aug 2006 23:35:38 -0700 Subject: [HTB]: Use hlist for hash lists. Use hlist instead of list for the hash list. This saves space, and we can check for double delete better. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index 6c6cac6..a686b95 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -104,7 +104,7 @@ struct htb_class { /* topology */ int level; /* our level (see above) */ struct htb_class *parent; /* parent class */ - struct list_head hlist; /* classid hash list item */ + struct hlist_node hlist; /* classid hash list item */ struct list_head sibling; /* sibling list item */ struct list_head children; /* children list */ @@ -163,8 +163,8 @@ static inline long L2T(struct htb_class *cl, struct qdisc_rate_table *rate, struct htb_sched { struct list_head root; /* root classes list */ - struct list_head hash[HTB_HSIZE]; /* hashed by classid */ - struct list_head drops[TC_HTB_NUMPRIO]; /* active leaves (for drops) */ + struct hlist_head hash[HTB_HSIZE]; /* hashed by classid */ + struct list_head drops[TC_HTB_NUMPRIO];/* active leaves (for drops) */ /* self list - roots of self generating tree */ struct rb_root row[TC_HTB_MAXDEPTH][TC_HTB_NUMPRIO]; @@ -220,12 +220,13 @@ static inline int htb_hash(u32 h) static inline struct htb_class *htb_find(u32 handle, struct Qdisc *sch) { struct htb_sched *q = qdisc_priv(sch); - struct list_head *p; + struct hlist_node *p; + struct htb_class *cl; + if (TC_H_MAJ(handle) != sch->handle) return NULL; - list_for_each(p, q->hash + htb_hash(handle)) { - struct htb_class *cl = list_entry(p, struct htb_class, hlist); + hlist_for_each_entry(cl, p, q->hash + htb_hash(handle), hlist) { if (cl->classid == handle) return cl; } @@ -675,7 +676,9 @@ static void htb_rate_timer(unsigned long arg) { struct Qdisc *sch = (struct Qdisc *)arg; struct htb_sched *q = qdisc_priv(sch); - struct list_head *p; + struct hlist_node *p; + struct htb_class *cl; + /* lock queue so that we can muck with it */ spin_lock_bh(&sch->dev->queue_lock); @@ -686,9 +689,8 @@ static void htb_rate_timer(unsigned long arg) /* scan and recompute one bucket at time */ if (++q->recmp_bucket >= HTB_HSIZE) q->recmp_bucket = 0; - list_for_each(p, q->hash + q->recmp_bucket) { - struct htb_class *cl = list_entry(p, struct htb_class, hlist); + hlist_for_each_entry(cl,p, q->hash + q->recmp_bucket, hlist) { RT_GEN(cl->sum_bytes, cl->rate_bytes); RT_GEN(cl->sum_packets, cl->rate_packets); } @@ -1041,10 +1043,10 @@ static void htb_reset(struct Qdisc *sch) int i; for (i = 0; i < HTB_HSIZE; i++) { - struct list_head *p; - list_for_each(p, q->hash + i) { - struct htb_class *cl = - list_entry(p, struct htb_class, hlist); + struct hlist_node *p; + struct htb_class *cl; + + hlist_for_each_entry(cl, p, q->hash + i, hlist) { if (cl->level) memset(&cl->un.inner, 0, sizeof(cl->un.inner)); else { @@ -1091,7 +1093,7 @@ static int htb_init(struct Qdisc *sch, struct rtattr *opt) INIT_LIST_HEAD(&q->root); for (i = 0; i < HTB_HSIZE; i++) - INIT_LIST_HEAD(q->hash + i); + INIT_HLIST_HEAD(q->hash + i); for (i = 0; i < TC_HTB_NUMPRIO; i++) INIT_LIST_HEAD(q->drops + i); @@ -1269,7 +1271,8 @@ static void htb_destroy_class(struct Qdisc *sch, struct htb_class *cl) struct htb_class, sibling)); /* note: this delete may happen twice (see htb_delete) */ - list_del(&cl->hlist); + if (!hlist_unhashed(&cl->hlist)) + hlist_del(&cl->hlist); list_del(&cl->sibling); if (cl->prio_activity) @@ -1317,7 +1320,9 @@ static int htb_delete(struct Qdisc *sch, unsigned long arg) sch_tree_lock(sch); /* delete from hash and active; remainder in destroy_class */ - list_del_init(&cl->hlist); + if (!hlist_unhashed(&cl->hlist)) + hlist_del(&cl->hlist); + if (cl->prio_activity) htb_deactivate(q, cl); @@ -1381,7 +1386,7 @@ static int htb_change_class(struct Qdisc *sch, u32 classid, cl->refcnt = 1; INIT_LIST_HEAD(&cl->sibling); - INIT_LIST_HEAD(&cl->hlist); + INIT_HLIST_NODE(&cl->hlist); INIT_LIST_HEAD(&cl->children); INIT_LIST_HEAD(&cl->un.leaf.drop_list); @@ -1420,7 +1425,7 @@ static int htb_change_class(struct Qdisc *sch, u32 classid, cl->cmode = HTB_CAN_SEND; /* attach to the hash list and parent's family */ - list_add_tail(&cl->hlist, q->hash + htb_hash(classid)); + hlist_add_head(&cl->hlist, q->hash + htb_hash(classid)); list_add_tail(&cl->sibling, parent ? &parent->children : &q->root); } else @@ -1520,10 +1525,10 @@ static void htb_walk(struct Qdisc *sch, struct qdisc_walker *arg) return; for (i = 0; i < HTB_HSIZE; i++) { - struct list_head *p; - list_for_each(p, q->hash + i) { - struct htb_class *cl = - list_entry(p, struct htb_class, hlist); + struct hlist_node *p; + struct htb_class *cl; + + hlist_for_each_entry(cl, p, q->hash + i, hlist) { if (arg->count < arg->skip) { arg->count++; continue; -- cgit v0.10.2 From 3696f625e2efa1f1b228b276788274e1eb86fcfa Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Thu, 10 Aug 2006 23:36:01 -0700 Subject: [HTB]: rbtree cleanup Add code to initialize rb tree nodes, and check for double deletion. This is not a real fix, but I can make it trap sometimes and may be a bandaid for: http://bugzilla.kernel.org/show_bug.cgi?id=6681 Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c index a686b95..bb3ddd4 100644 --- a/net/sched/sch_htb.c +++ b/net/sched/sch_htb.c @@ -366,7 +366,7 @@ static void htb_add_to_wait_tree(struct htb_sched *q, * When we are past last key we return NULL. * Average complexity is 2 steps per call. */ -static void htb_next_rb_node(struct rb_node **n) +static inline void htb_next_rb_node(struct rb_node **n) { *n = rb_next(*n); } @@ -388,6 +388,18 @@ static inline void htb_add_class_to_row(struct htb_sched *q, } } +/* If this triggers, it is a bug in this code, but it need not be fatal */ +static void htb_safe_rb_erase(struct rb_node *rb, struct rb_root *root) +{ + if (RB_EMPTY_NODE(rb)) { + WARN_ON(1); + } else { + rb_erase(rb, root); + RB_CLEAR_NODE(rb); + } +} + + /** * htb_remove_class_from_row - removes class from its row * @@ -401,10 +413,12 @@ static inline void htb_remove_class_from_row(struct htb_sched *q, while (mask) { int prio = ffz(~mask); + mask &= ~(1 << prio); if (q->ptr[cl->level][prio] == cl->node + prio) htb_next_rb_node(q->ptr[cl->level] + prio); - rb_erase(cl->node + prio, q->row[cl->level] + prio); + + htb_safe_rb_erase(cl->node + prio, q->row[cl->level] + prio); if (!q->row[cl->level][prio].rb_node) m |= 1 << prio; } @@ -472,7 +486,7 @@ static void htb_deactivate_prios(struct htb_sched *q, struct htb_class *cl) p->un.inner.ptr[prio] = NULL; } - rb_erase(cl->node + prio, p->un.inner.feed + prio); + htb_safe_rb_erase(cl->node + prio, p->un.inner.feed + prio); if (!p->un.inner.feed[prio].rb_node) mask |= 1 << prio; @@ -739,7 +753,7 @@ static void htb_charge_class(struct htb_sched *q, struct htb_class *cl, htb_change_class_mode(q, cl, &diff); if (old_mode != cl->cmode) { if (old_mode != HTB_CAN_SEND) - rb_erase(&cl->pq_node, q->wait_pq + cl->level); + htb_safe_rb_erase(&cl->pq_node, q->wait_pq + cl->level); if (cl->cmode != HTB_CAN_SEND) htb_add_to_wait_tree(q, cl, diff); } @@ -782,7 +796,7 @@ static long htb_do_events(struct htb_sched *q, int level) if (time_after(cl->pq_key, q->jiffies)) { return cl->pq_key - q->jiffies; } - rb_erase(p, q->wait_pq + level); + htb_safe_rb_erase(p, q->wait_pq + level); diff = PSCHED_TDIFF_SAFE(q->now, cl->t_c, (u32) cl->mbuffer); htb_change_class_mode(q, cl, &diff); if (cl->cmode != HTB_CAN_SEND) @@ -1279,7 +1293,7 @@ static void htb_destroy_class(struct Qdisc *sch, struct htb_class *cl) htb_deactivate(q, cl); if (cl->cmode != HTB_CAN_SEND) - rb_erase(&cl->pq_node, q->wait_pq + cl->level); + htb_safe_rb_erase(&cl->pq_node, q->wait_pq + cl->level); kfree(cl); } @@ -1370,6 +1384,8 @@ static int htb_change_class(struct Qdisc *sch, u32 classid, if (!cl) { /* new class */ struct Qdisc *new_q; + int prio; + /* check for valid classid */ if (!classid || TC_H_MAJ(classid ^ sch->handle) || htb_find(classid, sch)) @@ -1389,6 +1405,10 @@ static int htb_change_class(struct Qdisc *sch, u32 classid, INIT_HLIST_NODE(&cl->hlist); INIT_LIST_HEAD(&cl->children); INIT_LIST_HEAD(&cl->un.leaf.drop_list); + RB_CLEAR_NODE(&cl->pq_node); + + for (prio = 0; prio < TC_HTB_NUMPRIO; prio++) + RB_CLEAR_NODE(&cl->node[prio]); /* create leaf qdisc early because it uses kmalloc(GFP_KERNEL) so that can't be used inside of sch_tree_lock @@ -1404,7 +1424,7 @@ static int htb_change_class(struct Qdisc *sch, u32 classid, /* remove from evt list because of level change */ if (parent->cmode != HTB_CAN_SEND) { - rb_erase(&parent->pq_node, q->wait_pq); + htb_safe_rb_erase(&parent->pq_node, q->wait_pq); parent->cmode = HTB_CAN_SEND; } parent->level = (parent->parent ? parent->parent->level -- cgit v0.10.2 From b6fe17d6cc5d570b72f8e4da351b593c5a680355 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Tue, 29 Aug 2006 17:06:13 -0700 Subject: [NET] netdev: Check name length Some improvements to robust name interface. These API's are safe now by convention, but it is worth providing some safety checks against future bugs. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/net/core/dev.c b/net/core/dev.c index fc82f6f..14de297 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -640,6 +640,8 @@ int dev_valid_name(const char *name) { if (*name == '\0') return 0; + if (strlen(name) >= IFNAMSIZ) + return 0; if (!strcmp(name, ".") || !strcmp(name, "..")) return 0; @@ -3191,13 +3193,15 @@ struct net_device *alloc_netdev(int sizeof_priv, const char *name, struct net_device *dev; int alloc_size; + BUG_ON(strlen(name) >= sizeof(dev->name)); + /* ensure 32-byte alignment of both the device and private area */ alloc_size = (sizeof(*dev) + NETDEV_ALIGN_CONST) & ~NETDEV_ALIGN_CONST; alloc_size += sizeof_priv + NETDEV_ALIGN_CONST; p = kzalloc(alloc_size, GFP_KERNEL); if (!p) { - printk(KERN_ERR "alloc_dev: Unable to allocate device.\n"); + printk(KERN_ERR "alloc_netdev: Unable to allocate device.\n"); return NULL; } -- cgit v0.10.2 From d880309ae17783c27016bf4f903782d322d0a2a1 Mon Sep 17 00:00:00 2001 From: Steven Whitehouse Date: Fri, 11 Aug 2006 16:43:41 -0700 Subject: [DECNET] Fix to multiple tables routing Here is a fix to Patrick McHardy's increase number of routing tables patch for DECnet. I did just test this and it appears to be working fine with this patch. Signed-off-by: Steven Whitehouse Acked-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/decnet/dn_rules.c b/net/decnet/dn_rules.c index 878312f..c8d9411 100644 --- a/net/decnet/dn_rules.c +++ b/net/decnet/dn_rules.c @@ -116,6 +116,7 @@ static struct nla_policy dn_fib_rule_policy[FRA_MAX+1] __read_mostly = { [FRA_SRC] = { .type = NLA_U16 }, [FRA_DST] = { .type = NLA_U16 }, [FRA_FWMARK] = { .type = NLA_U32 }, + [FRA_TABLE] = { .type = NLA_U32 }, }; static int dn_fib_rule_match(struct fib_rule *rule, struct flowi *fl, int flags) -- cgit v0.10.2 From d1aa62f15b511457af2233150c960dc1fd02769b Mon Sep 17 00:00:00 2001 From: Steven Whitehouse Date: Fri, 11 Aug 2006 16:44:18 -0700 Subject: [DECNET] Fix to decnet rules compare function Here is a fix to the DECnet rules compare function where we used 32bit values rather than 16bit values. Spotted by Patrick McHardy. Signed-off-by: Steven Whitehouse Signed-off-by: David S. Miller diff --git a/net/decnet/dn_rules.c b/net/decnet/dn_rules.c index c8d9411..977bb56 100644 --- a/net/decnet/dn_rules.c +++ b/net/decnet/dn_rules.c @@ -197,10 +197,10 @@ static int dn_fib_rule_compare(struct fib_rule *rule, struct fib_rule_hdr *frh, return 0; #endif - if (tb[FRA_SRC] && (r->src != nla_get_u32(tb[FRA_SRC]))) + if (tb[FRA_SRC] && (r->src != nla_get_u16(tb[FRA_SRC]))) return 0; - if (tb[FRA_DST] && (r->dst != nla_get_u32(tb[FRA_DST]))) + if (tb[FRA_DST] && (r->dst != nla_get_u16(tb[FRA_DST]))) return 0; return 1; -- cgit v0.10.2 From 90d41122f79c8c3687d965dde4c6d30a6e0cac4c Mon Sep 17 00:00:00 2001 From: Adrian Bunk Date: Mon, 14 Aug 2006 23:49:16 -0700 Subject: [IPV6] ip6_fib.c: make code static Make the following needlessly global code static: - fib6_walker_lock - struct fib6_walker_list - fib6_walk_continue() - fib6_walk() Signed-off-by: Adrian Bunk Signed-off-by: Andrew Morton Signed-off-by: David S. Miller diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h index c0660ce..69c4442 100644 --- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -92,28 +92,6 @@ struct fib6_walker_t void *args; }; -extern struct fib6_walker_t fib6_walker_list; -extern rwlock_t fib6_walker_lock; - -static inline void fib6_walker_link(struct fib6_walker_t *w) -{ - write_lock_bh(&fib6_walker_lock); - w->next = fib6_walker_list.next; - w->prev = &fib6_walker_list; - w->next->prev = w; - w->prev->next = w; - write_unlock_bh(&fib6_walker_lock); -} - -static inline void fib6_walker_unlink(struct fib6_walker_t *w) -{ - write_lock_bh(&fib6_walker_lock); - w->next->prev = w->prev; - w->prev->next = w->next; - w->prev = w->next = w; - write_unlock_bh(&fib6_walker_lock); -} - struct rt6_statistics { __u32 fib_nodes; __u32 fib_route_nodes; @@ -195,9 +173,6 @@ struct fib6_node *fib6_locate(struct fib6_node *root, extern void fib6_clean_all(int (*func)(struct rt6_info *, void *arg), int prune, void *arg); -extern int fib6_walk(struct fib6_walker_t *w); -extern int fib6_walk_continue(struct fib6_walker_t *w); - extern int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nlmsghdr *nlh, diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index bececbe..be36f4a 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -69,8 +69,7 @@ struct fib6_cleaner_t void *arg; }; -DEFINE_RWLOCK(fib6_walker_lock); - +static DEFINE_RWLOCK(fib6_walker_lock); #ifdef CONFIG_IPV6_SUBTREES #define FWS_INIT FWS_S @@ -82,6 +81,8 @@ DEFINE_RWLOCK(fib6_walker_lock); static void fib6_prune_clones(struct fib6_node *fn, struct rt6_info *rt); static struct fib6_node * fib6_repair_tree(struct fib6_node *fn); +static int fib6_walk(struct fib6_walker_t *w); +static int fib6_walk_continue(struct fib6_walker_t *w); /* * A routing update causes an increase of the serial number on the @@ -94,13 +95,31 @@ static __u32 rt_sernum; static DEFINE_TIMER(ip6_fib_timer, fib6_run_gc, 0, 0); -struct fib6_walker_t fib6_walker_list = { +static struct fib6_walker_t fib6_walker_list = { .prev = &fib6_walker_list, .next = &fib6_walker_list, }; #define FOR_WALKERS(w) for ((w)=fib6_walker_list.next; (w) != &fib6_walker_list; (w)=(w)->next) +static inline void fib6_walker_link(struct fib6_walker_t *w) +{ + write_lock_bh(&fib6_walker_lock); + w->next = fib6_walker_list.next; + w->prev = &fib6_walker_list; + w->next->prev = w; + w->prev->next = w; + write_unlock_bh(&fib6_walker_lock); +} + +static inline void fib6_walker_unlink(struct fib6_walker_t *w) +{ + write_lock_bh(&fib6_walker_lock); + w->next->prev = w->prev; + w->prev->next = w->next; + w->prev = w->next = w; + write_unlock_bh(&fib6_walker_lock); +} static __inline__ u32 fib6_new_sernum(void) { u32 n = ++rt_sernum; @@ -1173,7 +1192,7 @@ int fib6_del(struct rt6_info *rt, struct nlmsghdr *nlh, void *_rtattr, struct ne * <0 -> walk is terminated by an error. */ -int fib6_walk_continue(struct fib6_walker_t *w) +static int fib6_walk_continue(struct fib6_walker_t *w) { struct fib6_node *fn, *pn; @@ -1247,7 +1266,7 @@ int fib6_walk_continue(struct fib6_walker_t *w) } } -int fib6_walk(struct fib6_walker_t *w) +static int fib6_walk(struct fib6_walker_t *w) { int res; -- cgit v0.10.2 From 50da859d4e566fba90ebda87b843970d902c903e Mon Sep 17 00:00:00 2001 From: Andreas Mohr Date: Mon, 14 Aug 2006 23:54:30 -0700 Subject: [TG3]: Constify firmware structs Constify largish areas of firmware data in Tigon3 ethernet driver. non-const: lsmod: tg3 101404 0 objdump -x: .rodata 000003e8 .data 00004a0c ls -l: -rw-r--r-- 1 root root 114404 2006-08-19 21:36 drivers/net/tg3.ko const: lsmod: tg3 101404 0 objdump -x: .rodata 000042c8 .data 00000b4c ls -l: -rw-r--r-- 1 root root 114532 2006-08-19 21:06 drivers/net/tg3.ko Signed-off-by: Andreas Mohr Signed-off-by: Andrew Morton Signed-off-by: David S. Miller diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c index 6f5d3a3..34078a7 100644 --- a/drivers/net/tg3.c +++ b/drivers/net/tg3.c @@ -264,7 +264,7 @@ static struct pci_device_id tg3_pci_tbl[] = { MODULE_DEVICE_TABLE(pci, tg3_pci_tbl); -static struct { +static const struct { const char string[ETH_GSTRING_LEN]; } ethtool_stats_keys[TG3_NUM_STATS] = { { "rx_octets" }, @@ -345,7 +345,7 @@ static struct { { "nic_tx_threshold_hit" } }; -static struct { +static const struct { const char string[ETH_GSTRING_LEN]; } ethtool_test_keys[TG3_NUM_TEST] = { { "nvram test (online) " }, @@ -4969,7 +4969,7 @@ static int tg3_halt(struct tg3 *tp, int kind, int silent) #define TG3_FW_BSS_ADDR 0x08000a70 #define TG3_FW_BSS_LEN 0x10 -static u32 tg3FwText[(TG3_FW_TEXT_LEN / sizeof(u32)) + 1] = { +static const u32 tg3FwText[(TG3_FW_TEXT_LEN / sizeof(u32)) + 1] = { 0x00000000, 0x10000003, 0x00000000, 0x0000000d, 0x0000000d, 0x3c1d0800, 0x37bd3ffc, 0x03a0f021, 0x3c100800, 0x26100000, 0x0e000018, 0x00000000, 0x0000000d, 0x3c1d0800, 0x37bd3ffc, 0x03a0f021, 0x3c100800, 0x26100034, @@ -5063,7 +5063,7 @@ static u32 tg3FwText[(TG3_FW_TEXT_LEN / sizeof(u32)) + 1] = { 0x27bd0008, 0x03e00008, 0x00000000, 0x00000000, 0x00000000 }; -static u32 tg3FwRodata[(TG3_FW_RODATA_LEN / sizeof(u32)) + 1] = { +static const u32 tg3FwRodata[(TG3_FW_RODATA_LEN / sizeof(u32)) + 1] = { 0x35373031, 0x726c7341, 0x00000000, 0x00000000, 0x53774576, 0x656e7430, 0x00000000, 0x726c7045, 0x76656e74, 0x31000000, 0x556e6b6e, 0x45766e74, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x66617461, 0x6c457272, @@ -5128,13 +5128,13 @@ static int tg3_halt_cpu(struct tg3 *tp, u32 offset) struct fw_info { unsigned int text_base; unsigned int text_len; - u32 *text_data; + const u32 *text_data; unsigned int rodata_base; unsigned int rodata_len; - u32 *rodata_data; + const u32 *rodata_data; unsigned int data_base; unsigned int data_len; - u32 *data_data; + const u32 *data_data; }; /* tp->lock is held. */ @@ -5266,7 +5266,7 @@ static int tg3_load_5701_a0_firmware_fix(struct tg3 *tp) #define TG3_TSO_FW_BSS_ADDR 0x08001b80 #define TG3_TSO_FW_BSS_LEN 0x894 -static u32 tg3TsoFwText[(TG3_TSO_FW_TEXT_LEN / 4) + 1] = { +static const u32 tg3TsoFwText[(TG3_TSO_FW_TEXT_LEN / 4) + 1] = { 0x0e000003, 0x00000000, 0x08001b24, 0x00000000, 0x10000003, 0x00000000, 0x0000000d, 0x0000000d, 0x3c1d0800, 0x37bd4000, 0x03a0f021, 0x3c100800, 0x26100000, 0x0e000010, 0x00000000, 0x0000000d, 0x27bdffe0, 0x3c04fefe, @@ -5553,7 +5553,7 @@ static u32 tg3TsoFwText[(TG3_TSO_FW_TEXT_LEN / 4) + 1] = { 0xac470014, 0xac4a0018, 0x03e00008, 0xac4b001c, 0x00000000, 0x00000000, }; -static u32 tg3TsoFwRodata[] = { +static const u32 tg3TsoFwRodata[] = { 0x4d61696e, 0x43707542, 0x00000000, 0x4d61696e, 0x43707541, 0x00000000, 0x00000000, 0x00000000, 0x73746b6f, 0x66666c64, 0x496e0000, 0x73746b6f, 0x66662a2a, 0x00000000, 0x53774576, 0x656e7430, 0x00000000, 0x00000000, @@ -5561,7 +5561,7 @@ static u32 tg3TsoFwRodata[] = { 0x00000000, }; -static u32 tg3TsoFwData[] = { +static const u32 tg3TsoFwData[] = { 0x00000000, 0x73746b6f, 0x66666c64, 0x5f76312e, 0x362e3000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, @@ -5583,7 +5583,7 @@ static u32 tg3TsoFwData[] = { #define TG3_TSO5_FW_BSS_ADDR 0x00010f50 #define TG3_TSO5_FW_BSS_LEN 0x88 -static u32 tg3Tso5FwText[(TG3_TSO5_FW_TEXT_LEN / 4) + 1] = { +static const u32 tg3Tso5FwText[(TG3_TSO5_FW_TEXT_LEN / 4) + 1] = { 0x0c004003, 0x00000000, 0x00010f04, 0x00000000, 0x10000003, 0x00000000, 0x0000000d, 0x0000000d, 0x3c1d0001, 0x37bde000, 0x03a0f021, 0x3c100001, 0x26100000, 0x0c004010, 0x00000000, 0x0000000d, 0x27bdffe0, 0x3c04fefe, @@ -5742,14 +5742,14 @@ static u32 tg3Tso5FwText[(TG3_TSO5_FW_TEXT_LEN / 4) + 1] = { 0x00000000, 0x00000000, 0x00000000, }; -static u32 tg3Tso5FwRodata[(TG3_TSO5_FW_RODATA_LEN / 4) + 1] = { +static const u32 tg3Tso5FwRodata[(TG3_TSO5_FW_RODATA_LEN / 4) + 1] = { 0x4d61696e, 0x43707542, 0x00000000, 0x4d61696e, 0x43707541, 0x00000000, 0x00000000, 0x00000000, 0x73746b6f, 0x66666c64, 0x00000000, 0x00000000, 0x73746b6f, 0x66666c64, 0x00000000, 0x00000000, 0x66617461, 0x6c457272, 0x00000000, 0x00000000, 0x00000000, }; -static u32 tg3Tso5FwData[(TG3_TSO5_FW_DATA_LEN / 4) + 1] = { +static const u32 tg3Tso5FwData[(TG3_TSO5_FW_DATA_LEN / 4) + 1] = { 0x00000000, 0x73746b6f, 0x66666c64, 0x5f76312e, 0x322e3000, 0x00000000, 0x00000000, 0x00000000, 0x00000000, }; -- cgit v0.10.2 From 2aa7f36cdb332a32849afbf25fcbf35dce5b1940 Mon Sep 17 00:00:00 2001 From: Adrian Bunk Date: Mon, 14 Aug 2006 23:55:20 -0700 Subject: [DECNET]: cleanups - make the following needlessly global functions static: - dn_fib.c: dn_fib_sync_down() - dn_fib.c: dn_fib_sync_up() - dn_rules.c: dn_fib_rule_action() - remove the following unneeded prototype: - dn_fib.c: dn_cache_dump() Signed-off-by: Adrian Bunk Signed-off-by: Andrew Morton Signed-off-by: David S. Miller diff --git a/include/net/dn_fib.h b/include/net/dn_fib.h index d97aa10..f01626c 100644 --- a/include/net/dn_fib.h +++ b/include/net/dn_fib.h @@ -131,9 +131,6 @@ extern __le16 dn_fib_get_attr16(struct rtattr *attr, int attrlen, int type); extern void dn_fib_flush(void); extern void dn_fib_select_multipath(const struct flowi *fl, struct dn_fib_res *res); -extern int dn_fib_sync_down(__le16 local, struct net_device *dev, - int force); -extern int dn_fib_sync_up(struct net_device *dev); /* * dn_tables.c diff --git a/net/decnet/dn_fib.c b/net/decnet/dn_fib.c index 5ccca3e..1cf01012 100644 --- a/net/decnet/dn_fib.c +++ b/net/decnet/dn_fib.c @@ -55,8 +55,6 @@ #define endfor_nexthops(fi) } -extern int dn_cache_dump(struct sk_buff *skb, struct netlink_callback *cb); - static DEFINE_SPINLOCK(dn_fib_multipath_lock); static struct dn_fib_info *dn_fib_info_list; static DEFINE_SPINLOCK(dn_fib_info_lock); @@ -80,6 +78,9 @@ static struct [RTN_XRESOLVE] = { .error = -EINVAL, .scope = RT_SCOPE_NOWHERE }, }; +static int dn_fib_sync_down(__le16 local, struct net_device *dev, int force); +static int dn_fib_sync_up(struct net_device *dev); + void dn_fib_free_info(struct dn_fib_info *fi) { if (fi->fib_dead == 0) { @@ -651,7 +652,7 @@ static int dn_fib_dnaddr_event(struct notifier_block *this, unsigned long event, return NOTIFY_DONE; } -int dn_fib_sync_down(__le16 local, struct net_device *dev, int force) +static int dn_fib_sync_down(__le16 local, struct net_device *dev, int force) { int ret = 0; int scope = RT_SCOPE_NOWHERE; @@ -695,7 +696,7 @@ int dn_fib_sync_down(__le16 local, struct net_device *dev, int force) } -int dn_fib_sync_up(struct net_device *dev) +static int dn_fib_sync_up(struct net_device *dev) { int ret = 0; diff --git a/net/decnet/dn_rules.c b/net/decnet/dn_rules.c index 977bb56..50e819e 100644 --- a/net/decnet/dn_rules.c +++ b/net/decnet/dn_rules.c @@ -75,8 +75,8 @@ int dn_fib_lookup(struct flowi *flp, struct dn_fib_res *res) return err; } -int dn_fib_rule_action(struct fib_rule *rule, struct flowi *flp, int flags, - struct fib_lookup_arg *arg) +static int dn_fib_rule_action(struct fib_rule *rule, struct flowi *flp, + int flags, struct fib_lookup_arg *arg) { int err = -EAGAIN; struct dn_fib_table *tbl; -- cgit v0.10.2 From 81aa646cc4df3779bcbf9d18cc2c0813ee9b3262 Mon Sep 17 00:00:00 2001 From: Martin Bligh Date: Mon, 14 Aug 2006 23:57:10 -0700 Subject: [IPV4]: add the UdpSndbufErrors and UdpRcvbufErrors MIBs Signed-off-by: Martin Bligh Signed-off-by: Andrew Morton diff --git a/include/linux/snmp.h b/include/linux/snmp.h index 4db25d5..3015655 100644 --- a/include/linux/snmp.h +++ b/include/linux/snmp.h @@ -155,6 +155,8 @@ enum UDP_MIB_NOPORTS, /* NoPorts */ UDP_MIB_INERRORS, /* InErrors */ UDP_MIB_OUTDATAGRAMS, /* OutDatagrams */ + UDP_MIB_RCVBUFERRORS, /* RcvbufErrors */ + UDP_MIB_SNDBUFERRORS, /* SndbufErrors */ __UDP_MIB_MAX }; diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c index d61e2a9..9c6cbe3 100644 --- a/net/ipv4/proc.c +++ b/net/ipv4/proc.c @@ -173,6 +173,8 @@ static const struct snmp_mib snmp4_udp_list[] = { SNMP_MIB_ITEM("NoPorts", UDP_MIB_NOPORTS), SNMP_MIB_ITEM("InErrors", UDP_MIB_INERRORS), SNMP_MIB_ITEM("OutDatagrams", UDP_MIB_OUTDATAGRAMS), + SNMP_MIB_ITEM("RcvbufErrors", UDP_MIB_RCVBUFERRORS), + SNMP_MIB_ITEM("SndbufErrors", UDP_MIB_SNDBUFERRORS), SNMP_MIB_SENTINEL }; diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 8715251..514c1e9 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -662,6 +662,16 @@ out: UDP_INC_STATS_USER(UDP_MIB_OUTDATAGRAMS); return len; } + /* + * ENOBUFS = no kernel mem, SOCK_NOSPACE = no sndbuf space. Reporting + * ENOBUFS might not be good (it's not tunable per se), but otherwise + * we don't have a good statistic (IpOutDiscards but it can be too many + * things). We could add another new stat but at least for now that + * seems like overkill. + */ + if (err == -ENOBUFS || test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) { + UDP_INC_STATS_USER(UDP_MIB_SNDBUFERRORS); + } return err; do_confirm: @@ -981,6 +991,7 @@ static int udp_encap_rcv(struct sock * sk, struct sk_buff *skb) static int udp_queue_rcv_skb(struct sock * sk, struct sk_buff *skb) { struct udp_sock *up = udp_sk(sk); + int rc; /* * Charge it to the socket, dropping if the queue is full. @@ -1027,7 +1038,10 @@ static int udp_queue_rcv_skb(struct sock * sk, struct sk_buff *skb) skb->ip_summed = CHECKSUM_UNNECESSARY; } - if (sock_queue_rcv_skb(sk,skb)<0) { + if ((rc = sock_queue_rcv_skb(sk,skb)) < 0) { + /* Note that an ENOMEM error is charged twice */ + if (rc == -ENOMEM) + UDP_INC_STATS_BH(UDP_MIB_RCVBUFERRORS); UDP_INC_STATS_BH(UDP_MIB_INERRORS); kfree_skb(skb); return -1; -- cgit v0.10.2 From a18135eb9389c26d36ef5c05bd8bc526e0cbe883 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Tue, 15 Aug 2006 00:00:09 -0700 Subject: [IPV6]: Add UDP_MIB_{SND,RCV}BUFERRORS handling. Signed-off-by: David S. Miller diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 780b89f..c813381 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -345,6 +345,8 @@ out: static inline int udpv6_queue_rcv_skb(struct sock * sk, struct sk_buff *skb) { + int rc; + if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb)) { kfree_skb(skb); return -1; @@ -356,7 +358,10 @@ static inline int udpv6_queue_rcv_skb(struct sock * sk, struct sk_buff *skb) return 0; } - if (sock_queue_rcv_skb(sk,skb)<0) { + if ((rc = sock_queue_rcv_skb(sk,skb)) < 0) { + /* Note that an ENOMEM error is charged twice */ + if (rc == -ENOMEM) + UDP_INC_STATS_BH(UDP_MIB_RCVBUFERRORS); UDP6_INC_STATS_BH(UDP_MIB_INERRORS); kfree_skb(skb); return 0; @@ -857,6 +862,16 @@ out: UDP6_INC_STATS_USER(UDP_MIB_OUTDATAGRAMS); return len; } + /* + * ENOBUFS = no kernel mem, SOCK_NOSPACE = no sndbuf space. Reporting + * ENOBUFS might not be good (it's not tunable per se), but otherwise + * we don't have a good statistic (IpOutDiscards but it can be too many + * things). We could add another new stat but at least for now that + * seems like overkill. + */ + if (err == -ENOBUFS || test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) { + UDP_INC_STATS_USER(UDP_MIB_SNDBUFERRORS); + } return err; do_confirm: -- cgit v0.10.2 From 97a4f3e7110619568aa239fe19143d9ec42dede5 Mon Sep 17 00:00:00 2001 From: Alan Cox Date: Tue, 15 Aug 2006 00:01:05 -0700 Subject: [NETFILTER]: Make unused signal code go away so nobody copies its brokenness This code is wrong on so many levels, please lose it so it isn't replicated anywhere else. Signed-off-by: Alan Cox Signed-off-by: Andrew Morton Signed-off-by: David S. Miller diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c index 3a13ed6..d06a507 100644 --- a/net/bridge/netfilter/ebtables.c +++ b/net/bridge/netfilter/ebtables.c @@ -37,30 +37,9 @@ #include #include -#if 0 -/* use this for remote debugging - * Copyright (C) 1998 by Ori Pomerantz - * Print the string to the appropriate tty, the one - * the current task uses - */ -static void print_string(char *str) -{ - struct tty_struct *my_tty; - - /* The tty for the current task */ - my_tty = current->signal->tty; - if (my_tty != NULL) { - my_tty->driver->write(my_tty, 0, str, strlen(str)); - my_tty->driver->write(my_tty, 0, "\015\012", 2); - } -} - -#define BUGPRINT(args) print_string(args); -#else #define BUGPRINT(format, args...) printk("kernel msg: ebtables bug: please "\ "report to author: "format, ## args) /* #define BUGPRINT(format, args...) */ -#endif #define MEMPRINT(format, args...) printk("kernel msg: ebtables "\ ": out of memory: "format, ## args) /* #define MEMPRINT(format, args...) */ -- cgit v0.10.2 From 9a673e563e543a5c8a6f9824562e55e807b8a56c Mon Sep 17 00:00:00 2001 From: Adrian Bunk Date: Tue, 15 Aug 2006 00:03:53 -0700 Subject: [SELINUX]: security/selinux/hooks.c: Make 4 functions static. This patch makes four needlessly global functions static. Signed-off-by: Adrian Bunk Acked-by: James Morris Signed-off-by: Andrew Morton Signed-off-by: David S. Miller diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 2a6bbb9..180b26b 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -3598,7 +3598,7 @@ static void selinux_sk_getsecid(struct sock *sk, u32 *secid) } } -void selinux_sock_graft(struct sock* sk, struct socket *parent) +static void selinux_sock_graft(struct sock* sk, struct socket *parent) { struct inode_security_struct *isec = SOCK_INODE(parent)->i_security; struct sk_security_struct *sksec = sk->sk_security; @@ -3608,8 +3608,8 @@ void selinux_sock_graft(struct sock* sk, struct socket *parent) selinux_netlbl_sock_graft(sk, parent); } -int selinux_inet_conn_request(struct sock *sk, struct sk_buff *skb, - struct request_sock *req) +static int selinux_inet_conn_request(struct sock *sk, struct sk_buff *skb, + struct request_sock *req) { struct sk_security_struct *sksec = sk->sk_security; int err; @@ -3638,7 +3638,8 @@ int selinux_inet_conn_request(struct sock *sk, struct sk_buff *skb, return 0; } -void selinux_inet_csk_clone(struct sock *newsk, const struct request_sock *req) +static void selinux_inet_csk_clone(struct sock *newsk, + const struct request_sock *req) { struct sk_security_struct *newsksec = newsk->sk_security; @@ -3649,7 +3650,8 @@ void selinux_inet_csk_clone(struct sock *newsk, const struct request_sock *req) time it will have been created and available. */ } -void selinux_req_classify_flow(const struct request_sock *req, struct flowi *fl) +static void selinux_req_classify_flow(const struct request_sock *req, + struct flowi *fl) { fl->secid = req->secid; } -- cgit v0.10.2 From 9c3bd6833a4df1abd9ecd3b51492b8949bf9cd11 Mon Sep 17 00:00:00 2001 From: Bjorn Helgaas Date: Tue, 15 Aug 2006 00:05:38 -0700 Subject: [IRDA]: Replace hard-coded dev_self[] array sizes with ARRAY_SIZE() Several IR drivers used "for (i = 0; i < 4; i++)" to walk their dev_self[] table. Better to use ARRAY_SIZE(). And fix ali-ircc so it won't run off the end if we find too many adapters. Signed-off-by: Bjorn Helgaas Signed-off-by: Andrew Morton Signed-off-by: David S. Miller diff --git a/drivers/net/irda/ali-ircc.c b/drivers/net/irda/ali-ircc.c index e3c8cd5..68d4c41 100644 --- a/drivers/net/irda/ali-ircc.c +++ b/drivers/net/irda/ali-ircc.c @@ -249,7 +249,7 @@ static void __exit ali_ircc_cleanup(void) IRDA_DEBUG(2, "%s(), ---------------- Start ----------------\n", __FUNCTION__); - for (i=0; i < 4; i++) { + for (i=0; i < ARRAY_SIZE(dev_self); i++) { if (dev_self[i]) ali_ircc_close(dev_self[i]); } @@ -273,6 +273,12 @@ static int ali_ircc_open(int i, chipio_t *info) int err; IRDA_DEBUG(2, "%s(), ---------------- Start ----------------\n", __FUNCTION__); + + if (i >= ARRAY_SIZE(dev_self)) { + IRDA_ERROR("%s(), maximum number of supported chips reached!\n", + __FUNCTION__); + return -ENOMEM; + } /* Set FIR FIFO and DMA Threshold */ if ((ali_ircc_setup(info)) == -1) diff --git a/drivers/net/irda/irport.c b/drivers/net/irda/irport.c index 44efd49..ba4f3eb 100644 --- a/drivers/net/irda/irport.c +++ b/drivers/net/irda/irport.c @@ -1090,7 +1090,7 @@ static int __init irport_init(void) { int i; - for (i=0; (io[i] < 2000) && (i < 4); i++) { + for (i=0; (io[i] < 2000) && (i < ARRAY_SIZE(dev_self)); i++) { if (irport_open(i, io[i], irq[i]) != NULL) return 0; } @@ -1112,7 +1112,7 @@ static void __exit irport_cleanup(void) IRDA_DEBUG( 4, "%s()\n", __FUNCTION__); - for (i=0; i < 4; i++) { + for (i=0; i < ARRAY_SIZE(dev_self); i++) { if (dev_self[i]) irport_close(dev_self[i]); } diff --git a/drivers/net/irda/via-ircc.c b/drivers/net/irda/via-ircc.c index 8bafb45..79b85f3 100644 --- a/drivers/net/irda/via-ircc.c +++ b/drivers/net/irda/via-ircc.c @@ -279,7 +279,7 @@ static void via_ircc_clean(void) IRDA_DEBUG(3, "%s()\n", __FUNCTION__); - for (i=0; i < 4; i++) { + for (i=0; i < ARRAY_SIZE(dev_self); i++) { if (dev_self[i]) via_ircc_close(dev_self[i]); } @@ -327,6 +327,9 @@ static __devinit int via_ircc_open(int i, chipio_t * info, unsigned int id) IRDA_DEBUG(3, "%s()\n", __FUNCTION__); + if (i >= ARRAY_SIZE(dev_self)) + return -ENOMEM; + /* Allocate new instance of the driver */ dev = alloc_irdadev(sizeof(struct via_ircc_cb)); if (dev == NULL) diff --git a/drivers/net/irda/w83977af_ir.c b/drivers/net/irda/w83977af_ir.c index 0ea65c4..8421597 100644 --- a/drivers/net/irda/w83977af_ir.c +++ b/drivers/net/irda/w83977af_ir.c @@ -117,7 +117,7 @@ static int __init w83977af_init(void) IRDA_DEBUG(0, "%s()\n", __FUNCTION__ ); - for (i=0; (io[i] < 2000) && (i < 4); i++) { + for (i=0; (io[i] < 2000) && (i < ARRAY_SIZE(dev_self)); i++) { if (w83977af_open(i, io[i], irq[i], dma[i]) == 0) return 0; } @@ -136,7 +136,7 @@ static void __exit w83977af_cleanup(void) IRDA_DEBUG(4, "%s()\n", __FUNCTION__ ); - for (i=0; i < 4; i++) { + for (i=0; i < ARRAY_SIZE(dev_self); i++) { if (dev_self[i]) w83977af_close(dev_self[i]); } -- cgit v0.10.2 From 62872e2dcb3127b20a49e3b4b1d93523cf476cc4 Mon Sep 17 00:00:00 2001 From: Stphane Witzmann Date: Tue, 15 Aug 2006 00:09:17 -0700 Subject: [ARCNET]: SoHard PCI support Add support for a SoHard PCI ARCnet card. Signed-off-by: Andrew Morton Signed-off-by: David S. Miller diff --git a/drivers/net/arcnet/com20020-pci.c b/drivers/net/arcnet/com20020-pci.c index 979a33d..96d8a69 100644 --- a/drivers/net/arcnet/com20020-pci.c +++ b/drivers/net/arcnet/com20020-pci.c @@ -161,6 +161,7 @@ static struct pci_device_id com20020pci_id_table[] = { { 0x1571, 0xa204, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ARC_CAN_10MBIT }, { 0x1571, 0xa205, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ARC_CAN_10MBIT }, { 0x1571, 0xa206, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ARC_CAN_10MBIT }, + { 0x10B5, 0x9030, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ARC_CAN_10MBIT }, { 0x10B5, 0x9050, PCI_ANY_ID, PCI_ANY_ID, 0, 0, ARC_CAN_10MBIT }, {0,} }; -- cgit v0.10.2 From f8d8fda54a1bfcf8cf829e44c494b2b4582819aa Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Tue, 15 Aug 2006 00:15:41 -0700 Subject: [IPV6] udp: Fix type in previous change. UDPv6 stats are UDP6_foo not UDP_foo. Signed-off-by: David S. Miller diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index c813381..eb9e1b3 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -361,7 +361,7 @@ static inline int udpv6_queue_rcv_skb(struct sock * sk, struct sk_buff *skb) if ((rc = sock_queue_rcv_skb(sk,skb)) < 0) { /* Note that an ENOMEM error is charged twice */ if (rc == -ENOMEM) - UDP_INC_STATS_BH(UDP_MIB_RCVBUFERRORS); + UDP6_INC_STATS_BH(UDP_MIB_RCVBUFERRORS); UDP6_INC_STATS_BH(UDP_MIB_INERRORS); kfree_skb(skb); return 0; @@ -870,7 +870,7 @@ out: * seems like overkill. */ if (err == -ENOBUFS || test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) { - UDP_INC_STATS_USER(UDP_MIB_SNDBUFERRORS); + UDP6_INC_STATS_USER(UDP_MIB_SNDBUFERRORS); } return err; -- cgit v0.10.2 From 2942e90050569525628a9f34e0daaa9b661b49cc Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:30:25 -0700 Subject: [RTNETLINK]: Use rtnl_unicast() for rtnetlink unicasts Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h index 7e4aa48..0e4f478 100644 --- a/include/linux/rtnetlink.h +++ b/include/linux/rtnetlink.h @@ -584,6 +584,7 @@ struct rtnetlink_link extern struct rtnetlink_link * rtnetlink_links[NPROTO]; extern int rtnetlink_send(struct sk_buff *skb, u32 pid, u32 group, int echo); +extern int rtnl_unicast(struct sk_buff *skb, u32 pid); extern int rtnetlink_put_metrics(struct sk_buff *skb, u32 *metrics); extern void __rta_fill(struct sk_buff *skb, int attrtype, int attrlen, const void *data); diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index a1b783a..e02fa6a 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -166,6 +166,11 @@ int rtnetlink_send(struct sk_buff *skb, u32 pid, unsigned group, int echo) return err; } +int rtnl_unicast(struct sk_buff *skb, u32 pid) +{ + return nlmsg_unicast(rtnl, skb, pid); +} + int rtnetlink_put_metrics(struct sk_buff *skb, u32 *metrics) { struct rtattr *mx = (struct rtattr*)skb->tail; @@ -574,9 +579,7 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) goto errout; } - err = netlink_unicast(rtnl, skb, NETLINK_CB(skb).pid, MSG_DONTWAIT); - if (err > 0) - err = 0; + err = rtnl_unicast(skb, NETLINK_CB(skb).pid); errout: kfree(iw_buf); dev_put(dev); @@ -825,3 +828,4 @@ EXPORT_SYMBOL(rtnl); EXPORT_SYMBOL(rtnl_lock); EXPORT_SYMBOL(rtnl_trylock); EXPORT_SYMBOL(rtnl_unlock); +EXPORT_SYMBOL(rtnl_unicast); diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c index 4c96321..c5daf35 100644 --- a/net/decnet/dn_route.c +++ b/net/decnet/dn_route.c @@ -1611,9 +1611,7 @@ int dn_cache_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh, void *arg) goto out_free; } - err = netlink_unicast(rtnl, skb, NETLINK_CB(in_skb).pid, MSG_DONTWAIT); - - return err; + return rtnl_unicast(skb, NETLINK_CB(in_skb).pid); out_free: kfree_skb(skb); diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c index 85893ee..98f0aa0 100644 --- a/net/ipv4/ipmr.c +++ b/net/ipv4/ipmr.c @@ -312,7 +312,8 @@ static void ipmr_destroy_unres(struct mfc_cache *c) e = NLMSG_DATA(nlh); e->error = -ETIMEDOUT; memset(&e->msg, 0, sizeof(e->msg)); - netlink_unicast(rtnl, skb, NETLINK_CB(skb).dst_pid, MSG_DONTWAIT); + + rtnl_unicast(skb, NETLINK_CB(skb).pid); } else kfree_skb(skb); } @@ -512,7 +513,6 @@ static void ipmr_cache_resolve(struct mfc_cache *uc, struct mfc_cache *c) while((skb=__skb_dequeue(&uc->mfc_un.unres.unresolved))) { if (skb->nh.iph->version == 0) { - int err; struct nlmsghdr *nlh = (struct nlmsghdr *)skb_pull(skb, sizeof(struct iphdr)); if (ipmr_fill_mroute(skb, c, NLMSG_DATA(nlh)) > 0) { @@ -525,7 +525,8 @@ static void ipmr_cache_resolve(struct mfc_cache *uc, struct mfc_cache *c) e->error = -EMSGSIZE; memset(&e->msg, 0, sizeof(e->msg)); } - err = netlink_unicast(rtnl, skb, NETLINK_CB(skb).dst_pid, MSG_DONTWAIT); + + rtnl_unicast(skb, NETLINK_CB(skb).pid); } else ip_mr_forward(skb, c, 0); } diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 12128b8..b8f6cad 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -2809,10 +2809,9 @@ int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr* nlh, void *arg) goto out_free; } - err = netlink_unicast(rtnl, skb, NETLINK_CB(in_skb).pid, MSG_DONTWAIT); - if (err > 0) - err = 0; -out: return err; + err = rtnl_unicast(skb, NETLINK_CB(in_skb).pid); +out: + return err; out_free: kfree_skb(skb); diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 9ba1e81..4f991a2 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -3268,9 +3268,7 @@ static int inet6_rtm_getaddr(struct sk_buff *in_skb, goto out_free; } - err = netlink_unicast(rtnl, skb, NETLINK_CB(in_skb).pid, MSG_DONTWAIT); - if (err > 0) - err = 0; + err = rtnl_unicast(skb, NETLINK_CB(in_skb).pid); out: in6_ifa_put(ifa); return err; diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 9ce28277..024c8e2 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -2044,9 +2044,7 @@ int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr* nlh, void *arg) goto out_free; } - err = netlink_unicast(rtnl, skb, NETLINK_CB(in_skb).pid, MSG_DONTWAIT); - if (err > 0) - err = 0; + err = rtnl_unicast(skb, NETLINK_CB(in_skb).pid); out: return err; out_free: diff --git a/net/sched/act_api.c b/net/sched/act_api.c index a2587b5..6990747 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -459,7 +459,6 @@ static int act_get_notify(u32 pid, struct nlmsghdr *n, struct tc_action *a, int event) { struct sk_buff *skb; - int err = 0; skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL); if (!skb) @@ -468,10 +467,8 @@ act_get_notify(u32 pid, struct nlmsghdr *n, struct tc_action *a, int event) kfree_skb(skb); return -EINVAL; } - err = netlink_unicast(rtnl, skb, pid, MSG_DONTWAIT); - if (err > 0) - err = 0; - return err; + + return rtnl_unicast(skb, pid); } static struct tc_action * -- cgit v0.10.2 From d387f6ad10764fc2174373b4a1cca443adee36e3 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:31:06 -0700 Subject: [NETLINK]: Add notification message sending interface Adds nlmsg_notify() implementing proper notification logic. The message is multicasted to all listeners in the group. The applications the requests orignates from can request a unicast back report in which case said socket will be excluded from the multicast to avoid duplicated notifications. nlmsg_multicast() is extended to take allocation flags to allow notification in atomic contexts. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/net/genetlink.h b/include/net/genetlink.h index 8c22872..97d6d3a 100644 --- a/include/net/genetlink.h +++ b/include/net/genetlink.h @@ -133,11 +133,12 @@ static inline int genlmsg_cancel(struct sk_buff *skb, void *hdr) * @skb: netlink message as socket buffer * @pid: own netlink pid to avoid sending to yourself * @group: multicast group id + * @flags: allocation flags */ static inline int genlmsg_multicast(struct sk_buff *skb, u32 pid, - unsigned int group) + unsigned int group, gfp_t flags) { - return nlmsg_multicast(genl_sock, skb, pid, group); + return nlmsg_multicast(genl_sock, skb, pid, group, flags); } /** diff --git a/include/net/netlink.h b/include/net/netlink.h index 3a5e40b..b154b81 100644 --- a/include/net/netlink.h +++ b/include/net/netlink.h @@ -43,6 +43,7 @@ * Message Sending: * nlmsg_multicast() multicast message to several groups * nlmsg_unicast() unicast a message to a single socket + * nlmsg_notify() send notification message * * Message Length Calculations: * nlmsg_msg_size(payload) length of message w/o padding @@ -545,15 +546,16 @@ static inline void nlmsg_free(struct sk_buff *skb) * @skb: netlink message as socket buffer * @pid: own netlink pid to avoid sending to yourself * @group: multicast group id + * @flags: allocation flags */ static inline int nlmsg_multicast(struct sock *sk, struct sk_buff *skb, - u32 pid, unsigned int group) + u32 pid, unsigned int group, gfp_t flags) { int err; NETLINK_CB(skb).dst_group = group; - err = netlink_broadcast(sk, skb, pid, group, GFP_KERNEL); + err = netlink_broadcast(sk, skb, pid, group, flags); if (err > 0) err = 0; diff --git a/net/netlabel/netlabel_user.c b/net/netlabel/netlabel_user.c index 8002222..73cbe66 100644 --- a/net/netlabel/netlabel_user.c +++ b/net/netlabel/netlabel_user.c @@ -154,5 +154,5 @@ int netlbl_netlink_snd(struct sk_buff *skb, u32 pid) */ int netlbl_netlink_snd_multicast(struct sk_buff *skb, u32 pid, u32 group) { - return genlmsg_multicast(skb, pid, group); + return genlmsg_multicast(skb, pid, group, GFP_KERNEL); } diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index 0f36ddc..a80e445 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -1549,6 +1549,38 @@ void netlink_queue_skip(struct nlmsghdr *nlh, struct sk_buff *skb) skb_pull(skb, msglen); } +/** + * nlmsg_notify - send a notification netlink message + * @sk: netlink socket to use + * @skb: notification message + * @pid: destination netlink pid for reports or 0 + * @group: destination multicast group or 0 + * @report: 1 to report back, 0 to disable + * @flags: allocation flags + */ +int nlmsg_notify(struct sock *sk, struct sk_buff *skb, u32 pid, + unsigned int group, int report, gfp_t flags) +{ + int err = 0; + + if (group) { + int exclude_pid = 0; + + if (report) { + atomic_inc(&skb->users); + exclude_pid = pid; + } + + /* errors reported via destination sk->sk_err */ + nlmsg_multicast(sk, skb, exclude_pid, group, flags); + } + + if (report) + err = nlmsg_unicast(sk, skb, pid); + + return err; +} + #ifdef CONFIG_PROC_FS struct nl_seq_iter { int link; @@ -1802,4 +1834,4 @@ EXPORT_SYMBOL(netlink_set_err); EXPORT_SYMBOL(netlink_set_nonroot); EXPORT_SYMBOL(netlink_unicast); EXPORT_SYMBOL(netlink_unregister_notifier); - +EXPORT_SYMBOL(nlmsg_notify); diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c index 75bb47a..d325991 100644 --- a/net/netlink/genetlink.c +++ b/net/netlink/genetlink.c @@ -510,7 +510,7 @@ static int genl_ctrl_event(int event, void *data) if (IS_ERR(msg)) return PTR_ERR(msg); - genlmsg_multicast(msg, 0, GENL_ID_CTRL); + genlmsg_multicast(msg, 0, GENL_ID_CTRL, GFP_KERNEL); break; } -- cgit v0.10.2 From 97676b6b5538b3e059d33b8338e7d5cc41c5f1f1 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:31:41 -0700 Subject: [RTNETLINK]: Add rtnetlink notification interface Adds rtnl_notify() to send rtnetlink notification messages and rtnl_set_sk_err() to report notification errors as socket errors in order to indicate the need of a resync due to loss of events. nlmsg_report() is added to properly document the meaning of NLM_F_ECHO. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h index 0e4f478..ecbe034 100644 --- a/include/linux/rtnetlink.h +++ b/include/linux/rtnetlink.h @@ -585,6 +585,9 @@ struct rtnetlink_link extern struct rtnetlink_link * rtnetlink_links[NPROTO]; extern int rtnetlink_send(struct sk_buff *skb, u32 pid, u32 group, int echo); extern int rtnl_unicast(struct sk_buff *skb, u32 pid); +extern int rtnl_notify(struct sk_buff *skb, u32 pid, u32 group, + struct nlmsghdr *nlh, gfp_t flags); +extern void rtnl_set_sk_err(u32 group, int error); extern int rtnetlink_put_metrics(struct sk_buff *skb, u32 *metrics); extern void __rta_fill(struct sk_buff *skb, int attrtype, int attrlen, const void *data); diff --git a/include/net/netlink.h b/include/net/netlink.h index b154b81..bf593eb 100644 --- a/include/net/netlink.h +++ b/include/net/netlink.h @@ -65,6 +65,9 @@ * nlmsg_validate() validate netlink message incl. attrs * nlmsg_for_each_attr() loop over all attributes * + * Misc: + * nlmsg_report() report back to application? + * * ------------------------------------------------------------------------ * Attributes Interface * ------------------------------------------------------------------------ @@ -194,6 +197,9 @@ extern void netlink_run_queue(struct sock *sk, unsigned int *qlen, struct nlmsghdr *, int *)); extern void netlink_queue_skip(struct nlmsghdr *nlh, struct sk_buff *skb); +extern int nlmsg_notify(struct sock *sk, struct sk_buff *skb, + u32 pid, unsigned int group, int report, + gfp_t flags); extern int nla_validate(struct nlattr *head, int len, int maxtype, struct nla_policy *policy); @@ -376,6 +382,17 @@ static inline int nlmsg_validate(struct nlmsghdr *nlh, int hdrlen, int maxtype, } /** + * nlmsg_report - need to report back to application? + * @nlh: netlink message header + * + * Returns 1 if a report back to the application is requested. + */ +static inline int nlmsg_report(struct nlmsghdr *nlh) +{ + return !!(nlh->nlmsg_flags & NLM_F_ECHO); +} + +/** * nlmsg_for_each_attr - iterate over a stream of attributes * @pos: loop counter, set to current attribute * @nlh: netlink message header diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index e02fa6a..2b1af17 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -171,6 +171,22 @@ int rtnl_unicast(struct sk_buff *skb, u32 pid) return nlmsg_unicast(rtnl, skb, pid); } +int rtnl_notify(struct sk_buff *skb, u32 pid, u32 group, + struct nlmsghdr *nlh, gfp_t flags) +{ + int report = 0; + + if (nlh) + report = nlmsg_report(nlh); + + return nlmsg_notify(rtnl, skb, pid, group, report, flags); +} + +void rtnl_set_sk_err(u32 group, int error) +{ + netlink_set_err(rtnl, 0, group, error); +} + int rtnetlink_put_metrics(struct sk_buff *skb, u32 *metrics) { struct rtattr *mx = (struct rtattr*)skb->tail; @@ -829,3 +845,5 @@ EXPORT_SYMBOL(rtnl_lock); EXPORT_SYMBOL(rtnl_trylock); EXPORT_SYMBOL(rtnl_unlock); EXPORT_SYMBOL(rtnl_unicast); +EXPORT_SYMBOL(rtnl_notify); +EXPORT_SYMBOL(rtnl_set_sk_err); -- cgit v0.10.2 From c17084d21c18497b506bd28b82d964bc9e6c424b Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:32:48 -0700 Subject: [NET] fib_rules: Convert fib rule notification to use rtnl_notify() Adds support for NLM_F_ECHO to simplify the process of identifying inserted rules with an auto generated priority. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c index 873b04d..7b2e9bb 100644 --- a/net/core/fib_rules.c +++ b/net/core/fib_rules.c @@ -18,7 +18,8 @@ static LIST_HEAD(rules_ops); static DEFINE_SPINLOCK(rules_mod_lock); static void notify_rule_change(int event, struct fib_rule *rule, - struct fib_rules_ops *ops); + struct fib_rules_ops *ops, struct nlmsghdr *nlh, + u32 pid); static struct fib_rules_ops *lookup_rules_ops(int family) { @@ -209,7 +210,7 @@ int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) else list_add_rcu(&rule->list, ops->rules_list); - notify_rule_change(RTM_NEWRULE, rule, ops); + notify_rule_change(RTM_NEWRULE, rule, ops, nlh, NETLINK_CB(skb).pid); rules_ops_put(ops); return 0; @@ -266,7 +267,8 @@ int fib_nl_delrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) list_del_rcu(&rule->list); synchronize_rcu(); - notify_rule_change(RTM_DELRULE, rule, ops); + notify_rule_change(RTM_DELRULE, rule, ops, nlh, + NETLINK_CB(skb).pid); fib_rule_put(rule); rules_ops_put(ops); return 0; @@ -344,18 +346,26 @@ skip: EXPORT_SYMBOL_GPL(fib_rules_dump); static void notify_rule_change(int event, struct fib_rule *rule, - struct fib_rules_ops *ops) + struct fib_rules_ops *ops, struct nlmsghdr *nlh, + u32 pid) { - int size = nlmsg_total_size(sizeof(struct fib_rule_hdr) + 128); - struct sk_buff *skb = alloc_skb(size, GFP_KERNEL); + struct sk_buff *skb; + int err = -ENOBUFS; + skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL); if (skb == NULL) - netlink_set_err(rtnl, 0, ops->nlgroup, ENOBUFS); - else if (fib_nl_fill_rule(skb, rule, 0, 0, event, 0, ops) < 0) { + goto errout; + + err = fib_nl_fill_rule(skb, rule, pid, nlh->nlmsg_seq, event, 0, ops); + if (err < 0) { kfree_skb(skb); - netlink_set_err(rtnl, 0, ops->nlgroup, EINVAL); - } else - netlink_broadcast(rtnl, skb, 0, ops->nlgroup, GFP_KERNEL); + goto errout; + } + + err = rtnl_notify(skb, pid, ops->nlgroup, nlh, GFP_KERNEL); +errout: + if (err < 0) + rtnl_set_sk_err(ops->nlgroup, err); } static void attach_rules(struct list_head *rules, struct net_device *dev) -- cgit v0.10.2 From b8673311804ca29680dd584bd08352001fcbe2f8 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:33:14 -0700 Subject: [NEIGH]: Convert neighbour notifications ot use rtnl_notify() Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/core/neighbour.c b/net/core/neighbour.c index 2f4e06a..23ae5e5 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -2409,36 +2409,35 @@ static struct file_operations neigh_stat_seq_fops = { #endif /* CONFIG_PROC_FS */ #ifdef CONFIG_ARPD -void neigh_app_ns(struct neighbour *n) +static void __neigh_notify(struct neighbour *n, int type, int flags) { struct sk_buff *skb; + int err = -ENOBUFS; skb = nlmsg_new(NLMSG_GOODSIZE, GFP_ATOMIC); if (skb == NULL) - return; + goto errout; - if (neigh_fill_info(skb, n, 0, 0, RTM_GETNEIGH, NLM_F_REQUEST) <= 0) + err = neigh_fill_info(skb, n, 0, 0, type, flags); + if (err < 0) { kfree_skb(skb); - else { - NETLINK_CB(skb).dst_group = RTNLGRP_NEIGH; - netlink_broadcast(rtnl, skb, 0, RTNLGRP_NEIGH, GFP_ATOMIC); + goto errout; } + + err = rtnl_notify(skb, 0, RTNLGRP_NEIGH, NULL, GFP_ATOMIC); +errout: + if (err < 0) + rtnl_set_sk_err(RTNLGRP_NEIGH, err); } -static void neigh_app_notify(struct neighbour *n) +void neigh_app_ns(struct neighbour *n) { - struct sk_buff *skb; - - skb = nlmsg_new(NLMSG_GOODSIZE, GFP_ATOMIC); - if (skb == NULL) - return; + __neigh_notify(n, RTM_GETNEIGH, NLM_F_REQUEST); +} - if (neigh_fill_info(skb, n, 0, 0, RTM_NEWNEIGH, 0) <= 0) - kfree_skb(skb); - else { - NETLINK_CB(skb).dst_group = RTNLGRP_NEIGH; - netlink_broadcast(rtnl, skb, 0, RTNLGRP_NEIGH, GFP_ATOMIC); - } +static void neigh_app_notify(struct neighbour *n) +{ + __neigh_notify(n, RTM_NEWNEIGH, 0); } #endif /* CONFIG_ARPD */ -- cgit v0.10.2 From dc738dd83e88c3c5de55431f8cfb758de5d4df48 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:33:35 -0700 Subject: [DECNET]: Convert DECnet notifications to use rtnl_notify() Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/decnet/dn_dev.c b/net/decnet/dn_dev.c index 88ea7a1..01861fe 100644 --- a/net/decnet/dn_dev.c +++ b/net/decnet/dn_dev.c @@ -746,20 +746,23 @@ rtattr_failure: static void rtmsg_ifa(int event, struct dn_ifaddr *ifa) { struct sk_buff *skb; - int size = NLMSG_SPACE(sizeof(struct ifaddrmsg)+128); + int payload = sizeof(struct ifaddrmsg) + 128; + int err = -ENOBUFS; - skb = alloc_skb(size, GFP_KERNEL); - if (!skb) { - netlink_set_err(rtnl, 0, RTNLGRP_DECnet_IFADDR, ENOBUFS); - return; - } - if (dn_dev_fill_ifaddr(skb, ifa, 0, 0, event, 0) < 0) { + skb = alloc_skb(nlmsg_total_size(payload), GFP_KERNEL); + if (skb == NULL) + goto errout; + + err = dn_dev_fill_ifaddr(skb, ifa, 0, 0, event, 0); + if (err < 0) { kfree_skb(skb); - netlink_set_err(rtnl, 0, RTNLGRP_DECnet_IFADDR, EINVAL); - return; + goto errout; } - NETLINK_CB(skb).dst_group = RTNLGRP_DECnet_IFADDR; - netlink_broadcast(rtnl, skb, 0, RTNLGRP_DECnet_IFADDR, GFP_KERNEL); + + err = rtnl_notify(skb, 0, RTNLGRP_DECnet_IFADDR, NULL, GFP_KERNEL); +errout: + if (err < 0) + rtnl_set_sk_err(RTNLGRP_DECnet_IFADDR, err); } static int dn_dev_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb) diff --git a/net/decnet/dn_table.c b/net/decnet/dn_table.c index 10e8726..317904b 100644 --- a/net/decnet/dn_table.c +++ b/net/decnet/dn_table.c @@ -333,24 +333,24 @@ static void dn_rtmsg_fib(int event, struct dn_fib_node *f, int z, u32 tb_id, { struct sk_buff *skb; u32 pid = req ? req->pid : 0; - int size = NLMSG_SPACE(sizeof(struct rtmsg) + 256); + int err = -ENOBUFS; - skb = alloc_skb(size, GFP_KERNEL); - if (!skb) - return; + skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL); + if (skb == NULL) + goto errout; - if (dn_fib_dump_info(skb, pid, nlh->nlmsg_seq, event, tb_id, - f->fn_type, f->fn_scope, &f->fn_key, z, - DN_FIB_INFO(f), 0) < 0) { + err = dn_fib_dump_info(skb, pid, nlh->nlmsg_seq, event, tb_id, + f->fn_type, f->fn_scope, &f->fn_key, z, + DN_FIB_INFO(f), 0); + if (err < 0) { kfree_skb(skb); - return; + goto errout; } - NETLINK_CB(skb).dst_group = RTNLGRP_DECnet_ROUTE; - if (nlh->nlmsg_flags & NLM_F_ECHO) - atomic_inc(&skb->users); - netlink_broadcast(rtnl, skb, pid, RTNLGRP_DECnet_ROUTE, GFP_KERNEL); - if (nlh->nlmsg_flags & NLM_F_ECHO) - netlink_unicast(rtnl, skb, pid, MSG_DONTWAIT); + + err = rtnl_notify(skb, pid, RTNLGRP_DECnet_ROUTE, nlh, GFP_KERNEL); +errout: + if (err < 0) + rtnl_set_sk_err(RTNLGRP_DECnet_ROUTE, err); } static __inline__ int dn_hash_dump_bucket(struct sk_buff *skb, -- cgit v0.10.2 From d6062cbbd1f5e92c94e5eae9ef1a280ed48d56d5 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:33:59 -0700 Subject: [IPv4] address: Convert address notification to use rtnl_notify() Adds support for NLM_F_ECHO allowing applications to easly see which address have been deleted, added, or promoted. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index 398e7b9..0487677 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -88,7 +88,7 @@ static struct nla_policy ifa_ipv4_policy[IFA_MAX+1] __read_mostly = { [IFA_LABEL] = { .type = NLA_STRING }, }; -static void rtmsg_ifa(int event, struct in_ifaddr *); +static void rtmsg_ifa(int event, struct in_ifaddr *, struct nlmsghdr *, u32); static BLOCKING_NOTIFIER_HEAD(inetaddr_chain); static void inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap, @@ -239,8 +239,8 @@ int inet_addr_onlink(struct in_device *in_dev, u32 a, u32 b) return 0; } -static void inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap, - int destroy) +static void __inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap, + int destroy, struct nlmsghdr *nlh, u32 pid) { struct in_ifaddr *promote = NULL; struct in_ifaddr *ifa, *ifa1 = *ifap; @@ -273,7 +273,7 @@ static void inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap, if (!do_promote) { *ifap1 = ifa->ifa_next; - rtmsg_ifa(RTM_DELADDR, ifa); + rtmsg_ifa(RTM_DELADDR, ifa, nlh, pid); blocking_notifier_call_chain(&inetaddr_chain, NETDEV_DOWN, ifa); inet_free_ifa(ifa); @@ -298,7 +298,7 @@ static void inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap, is valid, it will try to restore deleted routes... Grr. So that, this order is correct. */ - rtmsg_ifa(RTM_DELADDR, ifa1); + rtmsg_ifa(RTM_DELADDR, ifa1, nlh, pid); blocking_notifier_call_chain(&inetaddr_chain, NETDEV_DOWN, ifa1); if (promote) { @@ -310,7 +310,7 @@ static void inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap, } promote->ifa_flags &= ~IFA_F_SECONDARY; - rtmsg_ifa(RTM_NEWADDR, promote); + rtmsg_ifa(RTM_NEWADDR, promote, nlh, pid); blocking_notifier_call_chain(&inetaddr_chain, NETDEV_UP, promote); for (ifa = promote->ifa_next; ifa; ifa = ifa->ifa_next) { @@ -329,7 +329,14 @@ static void inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap, } } -static int inet_insert_ifa(struct in_ifaddr *ifa) +static void inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap, + int destroy) +{ + __inet_del_ifa(in_dev, ifap, destroy, NULL, 0); +} + +static int __inet_insert_ifa(struct in_ifaddr *ifa, struct nlmsghdr *nlh, + u32 pid) { struct in_device *in_dev = ifa->ifa_dev; struct in_ifaddr *ifa1, **ifap, **last_primary; @@ -374,12 +381,17 @@ static int inet_insert_ifa(struct in_ifaddr *ifa) /* Send message first, then call notifier. Notifier will trigger FIB update, so that listeners of netlink will know about new ifaddr */ - rtmsg_ifa(RTM_NEWADDR, ifa); + rtmsg_ifa(RTM_NEWADDR, ifa, nlh, pid); blocking_notifier_call_chain(&inetaddr_chain, NETDEV_UP, ifa); return 0; } +static int inet_insert_ifa(struct in_ifaddr *ifa) +{ + return __inet_insert_ifa(ifa, NULL, 0); +} + static int inet_set_ifa(struct net_device *dev, struct in_ifaddr *ifa) { struct in_device *in_dev = __in_dev_get_rtnl(dev); @@ -466,7 +478,7 @@ static int inet_rtm_deladdr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg !inet_ifa_match(nla_get_u32(tb[IFA_ADDRESS]), ifa))) continue; - inet_del_ifa(in_dev, ifap, 1); + __inet_del_ifa(in_dev, ifap, 1, nlh, NETLINK_CB(skb).pid); return 0; } @@ -558,7 +570,7 @@ static int inet_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg if (IS_ERR(ifa)) return PTR_ERR(ifa); - return inet_insert_ifa(ifa); + return __inet_insert_ifa(ifa, nlh, NETLINK_CB(skb).pid); } /* @@ -1189,18 +1201,27 @@ done: return skb->len; } -static void rtmsg_ifa(int event, struct in_ifaddr* ifa) +static void rtmsg_ifa(int event, struct in_ifaddr* ifa, struct nlmsghdr *nlh, + u32 pid) { struct sk_buff *skb; + u32 seq = nlh ? nlh->nlmsg_seq : 0; + int err = -ENOBUFS; skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL); if (skb == NULL) - netlink_set_err(rtnl, 0, RTNLGRP_IPV4_IFADDR, ENOBUFS); - else if (inet_fill_ifaddr(skb, ifa, 0, 0, event, 0) < 0) { + goto errout; + + err = inet_fill_ifaddr(skb, ifa, pid, seq, event, 0); + if (err < 0) { kfree_skb(skb); - netlink_set_err(rtnl, 0, RTNLGRP_IPV4_IFADDR, EINVAL); - } else - netlink_broadcast(rtnl, skb, 0, RTNLGRP_IPV4_IFADDR, GFP_KERNEL); + goto errout; + } + + err = rtnl_notify(skb, pid, RTNLGRP_IPV4_IFADDR, nlh, GFP_KERNEL); +errout: + if (err < 0) + rtnl_set_sk_err(RTNLGRP_IPV4_IFADDR, err); } static struct rtnetlink_link inet_rtnetlink_table[RTM_NR_MSGTYPES] = { -- cgit v0.10.2 From f21c7bc5f6a0a5bd03988886ff46656bc3f255b7 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:34:17 -0700 Subject: [IPv4] route: Convert route notifications to use rtnl_notify() Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index ab753df..5dfdad5 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -33,7 +33,6 @@ #include #include #include -#include #include #include @@ -44,6 +43,7 @@ #include #include #include +#include #include "fib_lookup.h" @@ -278,25 +278,25 @@ void rtmsg_fib(int event, u32 key, struct fib_alias *fa, { struct sk_buff *skb; u32 pid = req ? req->pid : n->nlmsg_pid; - int size = NLMSG_SPACE(sizeof(struct rtmsg)+256); + int payload = sizeof(struct rtmsg) + 256; + int err = -ENOBUFS; - skb = alloc_skb(size, GFP_KERNEL); - if (!skb) - return; + skb = nlmsg_new(nlmsg_total_size(payload), GFP_KERNEL); + if (skb == NULL) + goto errout; - if (fib_dump_info(skb, pid, n->nlmsg_seq, event, tb_id, - fa->fa_type, fa->fa_scope, &key, z, - fa->fa_tos, - fa->fa_info, 0) < 0) { + err = fib_dump_info(skb, pid, n->nlmsg_seq, event, tb_id, + fa->fa_type, fa->fa_scope, &key, z, fa->fa_tos, + fa->fa_info, 0); + if (err < 0) { kfree_skb(skb); - return; + goto errout; } - NETLINK_CB(skb).dst_group = RTNLGRP_IPV4_ROUTE; - if (n->nlmsg_flags&NLM_F_ECHO) - atomic_inc(&skb->users); - netlink_broadcast(rtnl, skb, pid, RTNLGRP_IPV4_ROUTE, GFP_KERNEL); - if (n->nlmsg_flags&NLM_F_ECHO) - netlink_unicast(rtnl, skb, pid, MSG_DONTWAIT); + + err = rtnl_notify(skb, pid, RTNLGRP_IPV4_ROUTE, n, GFP_KERNEL); +errout: + if (err < 0) + rtnl_set_sk_err(RTNLGRP_IPV4_ROUTE, err); } /* Return the first fib alias matching TOS with -- cgit v0.10.2 From 5d620266431c03d1dac66287367c6da26c64a069 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:35:02 -0700 Subject: [IPv6] address: Convert address notification to use rtnl_notify() Fixes a wrong use of current->pid as netlink pid. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 4f991a2..81e9ef1 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -73,6 +73,7 @@ #include #include #include +#include #include #include @@ -3280,20 +3281,23 @@ out_free: static void inet6_ifa_notify(int event, struct inet6_ifaddr *ifa) { struct sk_buff *skb; - int size = NLMSG_SPACE(sizeof(struct ifaddrmsg) + INET6_IFADDR_RTA_SPACE); + int payload = sizeof(struct ifaddrmsg) + INET6_IFADDR_RTA_SPACE; + int err = -ENOBUFS; - skb = alloc_skb(size, GFP_ATOMIC); - if (!skb) { - netlink_set_err(rtnl, 0, RTNLGRP_IPV6_IFADDR, ENOBUFS); - return; - } - if (inet6_fill_ifaddr(skb, ifa, current->pid, 0, event, 0) < 0) { + skb = nlmsg_new(nlmsg_total_size(payload), GFP_ATOMIC); + if (skb == NULL) + goto errout; + + err = inet6_fill_ifaddr(skb, ifa, 0, 0, event, 0); + if (err < 0) { kfree_skb(skb); - netlink_set_err(rtnl, 0, RTNLGRP_IPV6_IFADDR, EINVAL); - return; + goto errout; } - NETLINK_CB(skb).dst_group = RTNLGRP_IPV6_IFADDR; - netlink_broadcast(rtnl, skb, 0, RTNLGRP_IPV6_IFADDR, GFP_ATOMIC); + + err = rtnl_notify(skb, 0, RTNLGRP_IPV6_IFADDR, NULL, GFP_ATOMIC); +errout: + if (err < 0) + rtnl_set_sk_err(RTNLGRP_IPV6_IFADDR, err); } static void inline ipv6_store_devconf(struct ipv6_devconf *cnf, -- cgit v0.10.2 From 21713ebc4f119950e87d21c4637d5a750eea20e8 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:35:24 -0700 Subject: [IPv6] route: Convert route notifications to use rtnl_notify() Fixes a wrong use of current->pid as netlink pid. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 024c8e2..1aca787 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -35,7 +35,6 @@ #include #include #include -#include #include #ifdef CONFIG_PROC_FS @@ -54,6 +53,7 @@ #include #include #include +#include #include @@ -2056,27 +2056,25 @@ void inet6_rt_notify(int event, struct rt6_info *rt, struct nlmsghdr *nlh, struct netlink_skb_parms *req) { struct sk_buff *skb; - int size = NLMSG_SPACE(sizeof(struct rtmsg)+256); - u32 pid = current->pid; - u32 seq = 0; - - if (req) - pid = req->pid; - if (nlh) - seq = nlh->nlmsg_seq; - - skb = alloc_skb(size, gfp_any()); - if (!skb) { - netlink_set_err(rtnl, 0, RTNLGRP_IPV6_ROUTE, ENOBUFS); - return; - } - if (rt6_fill_node(skb, rt, NULL, NULL, 0, event, pid, seq, 0, 0) < 0) { + u32 pid = req ? req->pid : 0; + u32 seq = nlh ? nlh->nlmsg_seq : 0; + int payload = sizeof(struct rtmsg) + 256; + int err = -ENOBUFS; + + skb = nlmsg_new(nlmsg_total_size(payload), gfp_any()); + if (skb == NULL) + goto errout; + + err = rt6_fill_node(skb, rt, NULL, NULL, 0, event, pid, seq, 0, 0); + if (err < 0) { kfree_skb(skb); - netlink_set_err(rtnl, 0, RTNLGRP_IPV6_ROUTE, EINVAL); - return; + goto errout; } - NETLINK_CB(skb).dst_group = RTNLGRP_IPV6_ROUTE; - netlink_broadcast(rtnl, skb, 0, RTNLGRP_IPV6_ROUTE, gfp_any()); + + err = rtnl_notify(skb, pid, RTNLGRP_IPV6_ROUTE, nlh, gfp_any()); +errout: + if (err < 0) + rtnl_set_sk_err(RTNLGRP_IPV6_ROUTE, err); } /* -- cgit v0.10.2 From 8d7a76c9b17866f426fcbb531c81af7a1f53e071 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:35:47 -0700 Subject: [IPv6] link: Convert link notifications to use rtnl_notify() Fixes a wrong use of current->pid as netlink pid. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 81e9ef1..2a3be0f 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -3438,20 +3438,23 @@ static int inet6_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb) void inet6_ifinfo_notify(int event, struct inet6_dev *idev) { struct sk_buff *skb; - int size = NLMSG_SPACE(sizeof(struct ifinfomsg) + INET6_IFINFO_RTA_SPACE); + int payload = sizeof(struct ifinfomsg) + INET6_IFINFO_RTA_SPACE; + int err = -ENOBUFS; - skb = alloc_skb(size, GFP_ATOMIC); - if (!skb) { - netlink_set_err(rtnl, 0, RTNLGRP_IPV6_IFINFO, ENOBUFS); - return; - } - if (inet6_fill_ifinfo(skb, idev, current->pid, 0, event, 0) < 0) { + skb = nlmsg_new(nlmsg_total_size(payload), GFP_ATOMIC); + if (skb == NULL) + goto errout; + + err = inet6_fill_ifinfo(skb, idev, 0, 0, event, 0); + if (err < 0) { kfree_skb(skb); - netlink_set_err(rtnl, 0, RTNLGRP_IPV6_IFINFO, EINVAL); - return; + goto errout; } - NETLINK_CB(skb).dst_group = RTNLGRP_IPV6_IFINFO; - netlink_broadcast(rtnl, skb, 0, RTNLGRP_IPV6_IFINFO, GFP_ATOMIC); + + err = rtnl_notify(skb, 0, RTNLGRP_IPV6_IFADDR, NULL, GFP_ATOMIC); +errout: + if (err < 0) + rtnl_set_sk_err(RTNLGRP_IPV6_IFADDR, err); } /* Maximum length of prefix_cacheinfo attributes */ -- cgit v0.10.2 From 8c384bfa36b1dbeba8154da20d49167ce3e275c4 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:36:07 -0700 Subject: [IPv6] prefix: Convert prefix notifications to use rtnl_notify() Fixes a wrong use of current->pid as netlink pid. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 2a3be0f..4af741e 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -3506,20 +3506,23 @@ static void inet6_prefix_notify(int event, struct inet6_dev *idev, struct prefix_info *pinfo) { struct sk_buff *skb; - int size = NLMSG_SPACE(sizeof(struct prefixmsg) + INET6_PREFIX_RTA_SPACE); + int payload = sizeof(struct prefixmsg) + INET6_PREFIX_RTA_SPACE; + int err = -ENOBUFS; - skb = alloc_skb(size, GFP_ATOMIC); - if (!skb) { - netlink_set_err(rtnl, 0, RTNLGRP_IPV6_PREFIX, ENOBUFS); - return; - } - if (inet6_fill_prefix(skb, idev, pinfo, current->pid, 0, event, 0) < 0) { + skb = nlmsg_new(nlmsg_total_size(payload), GFP_ATOMIC); + if (skb == NULL) + goto errout; + + err = inet6_fill_prefix(skb, idev, pinfo, 0, 0, event, 0); + if (err < 0) { kfree_skb(skb); - netlink_set_err(rtnl, 0, RTNLGRP_IPV6_PREFIX, EINVAL); - return; + goto errout; } - NETLINK_CB(skb).dst_group = RTNLGRP_IPV6_PREFIX; - netlink_broadcast(rtnl, skb, 0, RTNLGRP_IPV6_PREFIX, GFP_ATOMIC); + + err = rtnl_notify(skb, 0, RTNLGRP_IPV6_PREFIX, NULL, GFP_ATOMIC); +errout: + if (err < 0) + rtnl_set_sk_err(RTNLGRP_IPV6_PREFIX, err); } static struct rtnetlink_link inet6_rtnetlink_table[RTM_NR_MSGTYPES] = { -- cgit v0.10.2 From 280a306c539389156477cc9c07028d43fe4fbf86 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:36:28 -0700 Subject: [BRIDGE]: Convert notifications to use rtnl_notify() Fixes a wrong use of current->pid as netlink pid. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 53086fb..8f66119 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -12,6 +12,7 @@ #include #include +#include #include "br_private.h" /* @@ -76,26 +77,24 @@ rtattr_failure: void br_ifinfo_notify(int event, struct net_bridge_port *port) { struct sk_buff *skb; - int err = -ENOMEM; + int payload = sizeof(struct ifinfomsg) + 128; + int err = -ENOBUFS; pr_debug("bridge notify event=%d\n", event); - skb = alloc_skb(NLMSG_SPACE(sizeof(struct ifinfomsg) + 128), - GFP_ATOMIC); - if (!skb) - goto err_out; + skb = nlmsg_new(nlmsg_total_size(payload), GFP_ATOMIC); + if (skb == NULL) + goto errout; + + err = br_fill_ifinfo(skb, port, 0, 0, event, 0); + if (err < 0) { + kfree_skb(skb); + goto errout; + } - err = br_fill_ifinfo(skb, port, current->pid, 0, event, 0); + err = rtnl_notify(skb, 0, RTNLGRP_LINK, NULL, GFP_ATOMIC); +errout: if (err < 0) - goto err_kfree; - - NETLINK_CB(skb).dst_group = RTNLGRP_LINK; - netlink_broadcast(rtnl, skb, 0, RTNLGRP_LINK, GFP_ATOMIC); - return; - -err_kfree: - kfree_skb(skb); -err_out: - netlink_set_err(rtnl, 0, RTNLGRP_LINK, err); + rtnl_set_sk_err(RTNLGRP_LINK, err); } /* -- cgit v0.10.2 From bd5785ba3ac1c89aa4c351ceb2acd96686424d8c Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:36:49 -0700 Subject: [WIRELESS]: Convert notifications to use rtnl_notify() Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/core/wireless.c b/net/core/wireless.c index 348b9da..3168fca 100644 --- a/net/core/wireless.c +++ b/net/core/wireless.c @@ -85,6 +85,7 @@ #include /* Pretty obvious */ #include /* New driver API */ +#include #include /* copy_to_user() */ @@ -1849,7 +1850,7 @@ static void wireless_nlevent_process(unsigned long data) struct sk_buff *skb; while ((skb = skb_dequeue(&wireless_nlevent_queue))) - netlink_broadcast(rtnl, skb, 0, RTNLGRP_LINK, GFP_ATOMIC); + rtnl_notify(skb, 0, RTNLGRP_LINK, NULL, GFP_ATOMIC); } static DECLARE_TASKLET(wireless_nlevent_tasklet, wireless_nlevent_process, 0); -- cgit v0.10.2 From 0ec6d3f467faeec5dd3b617959eb90e9d520113d Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:37:09 -0700 Subject: [NET] link: Convert notifications to use rtnl_notify() Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 2b1af17..f5300b5 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -630,20 +630,22 @@ static int rtnl_dump_all(struct sk_buff *skb, struct netlink_callback *cb) void rtmsg_ifinfo(int type, struct net_device *dev, unsigned change) { struct sk_buff *skb; - int size = NLMSG_SPACE(sizeof(struct ifinfomsg) + - sizeof(struct rtnl_link_ifmap) + - sizeof(struct rtnl_link_stats) + 128); + int err = -ENOBUFS; - skb = nlmsg_new(size, GFP_KERNEL); - if (!skb) - return; + skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL); + if (skb == NULL) + goto errout; - if (rtnl_fill_ifinfo(skb, dev, NULL, 0, type, 0, 0, change, 0) < 0) { + err = rtnl_fill_ifinfo(skb, dev, NULL, 0, type, 0, 0, change, 0); + if (err < 0) { kfree_skb(skb); - return; + goto errout; } - NETLINK_CB(skb).dst_group = RTNLGRP_LINK; - netlink_broadcast(rtnl, skb, 0, RTNLGRP_LINK, GFP_KERNEL); + + err = rtnl_notify(skb, 0, RTNLGRP_LINK, NULL, GFP_KERNEL); +errout: + if (err < 0) + rtnl_set_sk_err(RTNLGRP_LINK, err); } /* Protected by RTNL sempahore. */ -- cgit v0.10.2 From 56fc85ac961e2c20dcb5ef07e2628b3f93de2e49 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 15 Aug 2006 00:37:29 -0700 Subject: [RTNETLINK]: Unexport rtnl socket Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h index ecbe034..9c92dc8 100644 --- a/include/linux/rtnetlink.h +++ b/include/linux/rtnetlink.h @@ -574,8 +574,6 @@ extern int rtattr_parse(struct rtattr *tb[], int maxattr, struct rtattr *rta, in #define rtattr_parse_nested(tb, max, rta) \ rtattr_parse((tb), (max), RTA_DATA((rta)), RTA_PAYLOAD((rta))) -extern struct sock *rtnl; - struct rtnetlink_link { int (*doit)(struct sk_buff *, struct nlmsghdr*, void *attr); diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index f5300b5..dfc5826 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -58,6 +58,7 @@ #endif /* CONFIG_NET_WIRELESS_RTNETLINK */ static DEFINE_MUTEX(rtnl_mutex); +static struct sock *rtnl; void rtnl_lock(void) { @@ -95,8 +96,6 @@ int rtattr_parse(struct rtattr *tb[], int maxattr, struct rtattr *rta, int len) return 0; } -struct sock *rtnl; - struct rtnetlink_link * rtnetlink_links[NPROTO]; static const int rtm_min[RTM_NR_FAMILIES] = @@ -842,7 +841,6 @@ EXPORT_SYMBOL(rtattr_strlcpy); EXPORT_SYMBOL(rtattr_parse); EXPORT_SYMBOL(rtnetlink_links); EXPORT_SYMBOL(rtnetlink_put_metrics); -EXPORT_SYMBOL(rtnl); EXPORT_SYMBOL(rtnl_lock); EXPORT_SYMBOL(rtnl_trylock); EXPORT_SYMBOL(rtnl_unlock); -- cgit v0.10.2 From ab32ea5d8a760e7dd4339634e95d7be24ee5b842 Mon Sep 17 00:00:00 2001 From: Brian Haley Date: Fri, 22 Sep 2006 14:15:41 -0700 Subject: [NET/IPV4/IPV6]: Change some sysctl variables to __read_mostly Change net/core, ipv4 and ipv6 sysctl variables to __read_mostly. Couldn't actually measure any performance increase while testing (.3% I consider noise), but seems like the right thing to do. Signed-off-by: Brian Haley Signed-off-by: David S. Miller diff --git a/net/core/neighbour.c b/net/core/neighbour.c index 23ae5e5..c7e653f 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -2451,7 +2451,7 @@ static struct neigh_sysctl_table { ctl_table neigh_neigh_dir[2]; ctl_table neigh_proto_dir[2]; ctl_table neigh_root_dir[2]; -} neigh_sysctl_template = { +} neigh_sysctl_template __read_mostly = { .neigh_vars = { { .ctl_name = NET_NEIGH_MCAST_SOLICIT, diff --git a/net/core/sock.c b/net/core/sock.c index b67d868..cfaf090 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -187,13 +187,13 @@ static struct lock_class_key af_callback_keys[AF_MAX]; #define SK_RMEM_MAX (_SK_MEM_OVERHEAD * _SK_MEM_PACKETS) /* Run time adjustable parameters. */ -__u32 sysctl_wmem_max = SK_WMEM_MAX; -__u32 sysctl_rmem_max = SK_RMEM_MAX; -__u32 sysctl_wmem_default = SK_WMEM_MAX; -__u32 sysctl_rmem_default = SK_RMEM_MAX; +__u32 sysctl_wmem_max __read_mostly = SK_WMEM_MAX; +__u32 sysctl_rmem_max __read_mostly = SK_RMEM_MAX; +__u32 sysctl_wmem_default __read_mostly = SK_WMEM_MAX; +__u32 sysctl_rmem_default __read_mostly = SK_RMEM_MAX; /* Maximal space eaten by iovec or ancilliary data plus some space */ -int sysctl_optmem_max = sizeof(unsigned long)*(2*UIO_MAXIOV + 512); +int sysctl_optmem_max __read_mostly = sizeof(unsigned long)*(2*UIO_MAXIOV+512); static int sock_set_timeout(long *timeo_p, char __user *optval, int optlen) { diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index 36c38bf..f2e8927 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -391,7 +391,7 @@ int inet_release(struct socket *sock) } /* It is off by default, see below. */ -int sysctl_ip_nonlocal_bind; +int sysctl_ip_nonlocal_bind __read_mostly; int inet_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len) { @@ -987,7 +987,7 @@ void inet_unregister_protosw(struct inet_protosw *p) * Shall we try to damage output packets if routing dev changes? */ -int sysctl_ip_dynaddr; +int sysctl_ip_dynaddr __read_mostly; static int inet_sk_reselect_saddr(struct sock *sk) { diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c index 6d223e5..c2ad07e 100644 --- a/net/ipv4/icmp.c +++ b/net/ipv4/icmp.c @@ -187,11 +187,11 @@ struct icmp_err icmp_err_convert[] = { }; /* Control parameters for ECHO replies. */ -int sysctl_icmp_echo_ignore_all; -int sysctl_icmp_echo_ignore_broadcasts = 1; +int sysctl_icmp_echo_ignore_all __read_mostly; +int sysctl_icmp_echo_ignore_broadcasts __read_mostly = 1; /* Control parameter - ignore bogus broadcast responses? */ -int sysctl_icmp_ignore_bogus_error_responses = 1; +int sysctl_icmp_ignore_bogus_error_responses __read_mostly = 1; /* * Configurable global rate limit. @@ -205,9 +205,9 @@ int sysctl_icmp_ignore_bogus_error_responses = 1; * time exceeded (11), parameter problem (12) */ -int sysctl_icmp_ratelimit = 1 * HZ; -int sysctl_icmp_ratemask = 0x1818; -int sysctl_icmp_errors_use_inbound_ifaddr; +int sysctl_icmp_ratelimit __read_mostly = 1 * HZ; +int sysctl_icmp_ratemask __read_mostly = 0x1818; +int sysctl_icmp_errors_use_inbound_ifaddr __read_mostly; /* * ICMP control array. This specifies what to do with each ICMP. diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c index 7003e76..58be822 100644 --- a/net/ipv4/igmp.c +++ b/net/ipv4/igmp.c @@ -1397,8 +1397,8 @@ static struct in_device * ip_mc_find_dev(struct ip_mreqn *imr) /* * Join a socket to a group */ -int sysctl_igmp_max_memberships = IP_MAX_MEMBERSHIPS; -int sysctl_igmp_max_msf = IP_MAX_MSF; +int sysctl_igmp_max_memberships __read_mostly = IP_MAX_MEMBERSHIPS; +int sysctl_igmp_max_msf __read_mostly = IP_MAX_MSF; static int ip_mc_del1_src(struct ip_mc_list *pmc, int sfmode, diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c index 8d7f107..165d728 100644 --- a/net/ipv4/ip_fragment.c +++ b/net/ipv4/ip_fragment.c @@ -54,15 +54,15 @@ * even the most extreme cases without allowing an attacker to measurably * harm machine performance. */ -int sysctl_ipfrag_high_thresh = 256*1024; -int sysctl_ipfrag_low_thresh = 192*1024; +int sysctl_ipfrag_high_thresh __read_mostly = 256*1024; +int sysctl_ipfrag_low_thresh __read_mostly = 192*1024; -int sysctl_ipfrag_max_dist = 64; +int sysctl_ipfrag_max_dist __read_mostly = 64; /* Important NOTE! Fragment queue must be destroyed before MSL expires. * RFC791 is wrong proposing to prolongate timer each fragment arrival by TTL. */ -int sysctl_ipfrag_time = IP_FRAG_TIME; +int sysctl_ipfrag_time __read_mostly = IP_FRAG_TIME; struct ipfrag_skb_cb { @@ -130,7 +130,7 @@ static unsigned int ipqhashfn(u16 id, u32 saddr, u32 daddr, u8 prot) } static struct timer_list ipfrag_secret_timer; -int sysctl_ipfrag_secret_interval = 10 * 60 * HZ; +int sysctl_ipfrag_secret_interval __read_mostly = 10 * 60 * HZ; static void ipfrag_secret_rebuild(unsigned long dummy) { diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 1b9b674..81b2795 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -83,7 +83,7 @@ #include #include -int sysctl_ip_default_ttl = IPDEFTTL; +int sysctl_ip_default_ttl __read_mostly = IPDEFTTL; /* Generate a checksum for an outgoing IP datagram. */ __inline__ void ip_send_check(struct iphdr *iph) diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index b0124e69..e570db4 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -268,7 +268,7 @@ #include #include -int sysctl_tcp_fin_timeout = TCP_FIN_TIMEOUT; +int sysctl_tcp_fin_timeout __read_mostly = TCP_FIN_TIMEOUT; DEFINE_SNMP_STAT(struct tcp_mib, tcp_statistics) __read_mostly; diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 159fa3f..caf3c41 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -72,24 +72,24 @@ #include #include -int sysctl_tcp_timestamps = 1; -int sysctl_tcp_window_scaling = 1; -int sysctl_tcp_sack = 1; -int sysctl_tcp_fack = 1; -int sysctl_tcp_reordering = TCP_FASTRETRANS_THRESH; -int sysctl_tcp_ecn; -int sysctl_tcp_dsack = 1; -int sysctl_tcp_app_win = 31; -int sysctl_tcp_adv_win_scale = 2; - -int sysctl_tcp_stdurg; -int sysctl_tcp_rfc1337; -int sysctl_tcp_max_orphans = NR_FILE; -int sysctl_tcp_frto; -int sysctl_tcp_nometrics_save; - -int sysctl_tcp_moderate_rcvbuf = 1; -int sysctl_tcp_abc; +int sysctl_tcp_timestamps __read_mostly = 1; +int sysctl_tcp_window_scaling __read_mostly = 1; +int sysctl_tcp_sack __read_mostly = 1; +int sysctl_tcp_fack __read_mostly = 1; +int sysctl_tcp_reordering __read_mostly = TCP_FASTRETRANS_THRESH; +int sysctl_tcp_ecn __read_mostly; +int sysctl_tcp_dsack __read_mostly = 1; +int sysctl_tcp_app_win __read_mostly = 31; +int sysctl_tcp_adv_win_scale __read_mostly = 2; + +int sysctl_tcp_stdurg __read_mostly; +int sysctl_tcp_rfc1337 __read_mostly; +int sysctl_tcp_max_orphans __read_mostly = NR_FILE; +int sysctl_tcp_frto __read_mostly; +int sysctl_tcp_nometrics_save __read_mostly; + +int sysctl_tcp_moderate_rcvbuf __read_mostly = 1; +int sysctl_tcp_abc __read_mostly; #define FLAG_DATA 0x01 /* Incoming frame contained data. */ #define FLAG_WIN_UPDATE 0x02 /* Incoming ACK was a window update. */ diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 2973dee..23b46e3 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -78,8 +78,8 @@ #include #include -int sysctl_tcp_tw_reuse; -int sysctl_tcp_low_latency; +int sysctl_tcp_tw_reuse __read_mostly; +int sysctl_tcp_low_latency __read_mostly; /* Check TCP sequence numbers in ICMP packets. */ #define ICMP_MIN_LENGTH 8 diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 624e2b2..0163d98 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -34,8 +34,8 @@ #define SYNC_INIT 1 #endif -int sysctl_tcp_syncookies = SYNC_INIT; -int sysctl_tcp_abort_on_overflow; +int sysctl_tcp_syncookies __read_mostly = SYNC_INIT; +int sysctl_tcp_abort_on_overflow __read_mostly; struct inet_timewait_death_row tcp_death_row = { .sysctl_max_tw_buckets = NR_FILE * 2, diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 9252a50..061edfa 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -43,24 +43,24 @@ #include /* People can turn this off for buggy TCP's found in printers etc. */ -int sysctl_tcp_retrans_collapse = 1; +int sysctl_tcp_retrans_collapse __read_mostly = 1; /* People can turn this on to work with those rare, broken TCPs that * interpret the window field as a signed quantity. */ -int sysctl_tcp_workaround_signed_windows = 0; +int sysctl_tcp_workaround_signed_windows __read_mostly = 0; /* This limits the percentage of the congestion window which we * will allow a single TSO frame to consume. Building TSO frames * which are too large can cause TCP streams to be bursty. */ -int sysctl_tcp_tso_win_divisor = 3; +int sysctl_tcp_tso_win_divisor __read_mostly = 3; -int sysctl_tcp_mtu_probing = 0; -int sysctl_tcp_base_mss = 512; +int sysctl_tcp_mtu_probing __read_mostly = 0; +int sysctl_tcp_base_mss __read_mostly = 512; /* By default, RFC2861 behavior. */ -int sysctl_tcp_slow_start_after_idle = 1; +int sysctl_tcp_slow_start_after_idle __read_mostly = 1; static void update_send_head(struct sock *sk, struct tcp_sock *tp, struct sk_buff *skb) diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c index 7c1bde3..fb09ade 100644 --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -23,14 +23,14 @@ #include #include -int sysctl_tcp_syn_retries = TCP_SYN_RETRIES; -int sysctl_tcp_synack_retries = TCP_SYNACK_RETRIES; -int sysctl_tcp_keepalive_time = TCP_KEEPALIVE_TIME; -int sysctl_tcp_keepalive_probes = TCP_KEEPALIVE_PROBES; -int sysctl_tcp_keepalive_intvl = TCP_KEEPALIVE_INTVL; -int sysctl_tcp_retries1 = TCP_RETR1; -int sysctl_tcp_retries2 = TCP_RETR2; -int sysctl_tcp_orphan_retries; +int sysctl_tcp_syn_retries __read_mostly = TCP_SYN_RETRIES; +int sysctl_tcp_synack_retries __read_mostly = TCP_SYNACK_RETRIES; +int sysctl_tcp_keepalive_time __read_mostly = TCP_KEEPALIVE_TIME; +int sysctl_tcp_keepalive_probes __read_mostly = TCP_KEEPALIVE_PROBES; +int sysctl_tcp_keepalive_intvl __read_mostly = TCP_KEEPALIVE_INTVL; +int sysctl_tcp_retries1 __read_mostly = TCP_RETR1; +int sysctl_tcp_retries2 __read_mostly = TCP_RETR2; +int sysctl_tcp_orphan_retries __read_mostly; static void tcp_write_timer(unsigned long); static void tcp_delack_timer(unsigned long); diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 4af741e..f1ede90 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -146,7 +146,7 @@ static int ipv6_chk_same_addr(const struct in6_addr *addr, struct net_device *de static ATOMIC_NOTIFIER_HEAD(inet6addr_chain); -struct ipv6_devconf ipv6_devconf = { +struct ipv6_devconf ipv6_devconf __read_mostly = { .forwarding = 0, .hop_limit = IPV6_DEFAULT_HOPLIMIT, .mtu6 = IPV6_MIN_MTU, @@ -177,7 +177,7 @@ struct ipv6_devconf ipv6_devconf = { #endif }; -static struct ipv6_devconf ipv6_devconf_dflt = { +static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = { .forwarding = 0, .hop_limit = IPV6_DEFAULT_HOPLIMIT, .mtu6 = IPV6_MIN_MTU, @@ -3665,7 +3665,7 @@ static struct addrconf_sysctl_table ctl_table addrconf_conf_dir[2]; ctl_table addrconf_proto_dir[2]; ctl_table addrconf_root_dir[2]; -} addrconf_sysctl = { +} addrconf_sysctl __read_mostly = { .sysctl_header = NULL, .addrconf_vars = { { diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 82a1b1a..2ff600c 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -67,7 +67,7 @@ MODULE_AUTHOR("Cast of dozens"); MODULE_DESCRIPTION("IPv6 protocol stack for Linux"); MODULE_LICENSE("GPL"); -int sysctl_ipv6_bindv6only; +int sysctl_ipv6_bindv6only __read_mostly; /* The inetsw table contains everything that inet_create needs to * build a new socket. diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c index 1030551..e3a8e27 100644 --- a/net/ipv6/icmp.c +++ b/net/ipv6/icmp.c @@ -151,7 +151,7 @@ static int is_ineligible(struct sk_buff *skb) return 0; } -static int sysctl_icmpv6_time = 1*HZ; +static int sysctl_icmpv6_time __read_mostly = 1*HZ; /* * Check the ICMP output rate limit diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c index 639eb20..3b114e3 100644 --- a/net/ipv6/mcast.c +++ b/net/ipv6/mcast.c @@ -171,7 +171,7 @@ static int ip6_mc_leave_src(struct sock *sk, struct ipv6_mc_socklist *iml, #define IPV6_MLD_MAX_MSF 64 -int sysctl_mld_max_msf = IPV6_MLD_MAX_MSF; +int sysctl_mld_max_msf __read_mostly = IPV6_MLD_MAX_MSF; /* * socket join on multicast group diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c index a8623d2..f39bbed 100644 --- a/net/ipv6/reassembly.c +++ b/net/ipv6/reassembly.c @@ -53,10 +53,10 @@ #include #include -int sysctl_ip6frag_high_thresh = 256*1024; -int sysctl_ip6frag_low_thresh = 192*1024; +int sysctl_ip6frag_high_thresh __read_mostly = 256*1024; +int sysctl_ip6frag_low_thresh __read_mostly = 192*1024; -int sysctl_ip6frag_time = IPV6_FRAG_TIMEOUT; +int sysctl_ip6frag_time __read_mostly = IPV6_FRAG_TIMEOUT; struct ip6frag_skb_cb { @@ -152,7 +152,7 @@ static unsigned int ip6qhashfn(u32 id, struct in6_addr *saddr, } static struct timer_list ip6_frag_secret_timer; -int sysctl_ip6frag_secret_interval = 10 * 60 * HZ; +int sysctl_ip6frag_secret_interval __read_mostly = 10 * 60 * HZ; static void ip6_frag_secret_rebuild(unsigned long dummy) { -- cgit v0.10.2 From 4e902c57417c4c285b98ba2722468d1c3ed83d1b Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Thu, 17 Aug 2006 18:14:52 -0700 Subject: [IPv4]: FIB configuration using struct fib_config Introduces struct fib_config replacing the ugly struct kern_rta prone to ordering issues. Avoids creating faked netlink messages for auto generated routes or requests via ioctl. A new interface net/nexthop.h is added to help navigate through nexthop configuration arrays. A new struct nl_info will be used to carry the necessary netlink information to be used for notifications later on. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h index 8e9ba56..42ed96f 100644 --- a/include/net/ip_fib.h +++ b/include/net/ip_fib.h @@ -20,25 +20,32 @@ #include #include -/* WARNING: The ordering of these elements must match ordering - * of RTA_* rtnetlink attribute numbers. - */ -struct kern_rta { - void *rta_dst; - void *rta_src; - int *rta_iif; - int *rta_oif; - void *rta_gw; - u32 *rta_priority; - void *rta_prefsrc; - struct rtattr *rta_mx; - struct rtattr *rta_mp; - unsigned char *rta_protoinfo; - u32 *rta_flow; - struct rta_cacheinfo *rta_ci; - struct rta_session *rta_sess; - u32 *rta_mp_alg; -}; +struct fib_config { + u8 fc_family; + u8 fc_dst_len; + u8 fc_src_len; + u8 fc_tos; + u8 fc_protocol; + u8 fc_scope; + u8 fc_type; + /* 1 byte unused */ + u32 fc_table; + u32 fc_dst; + u32 fc_src; + u32 fc_gw; + int fc_oif; + u32 fc_flags; + u32 fc_priority; + u32 fc_prefsrc; + struct nlattr *fc_mx; + struct rtnexthop *fc_mp; + int fc_mx_len; + int fc_mp_len; + u32 fc_flow; + u32 fc_mp_alg; + u32 fc_nlflags; + struct nl_info fc_nlinfo; + }; struct fib_info; @@ -154,12 +161,8 @@ struct fib_table { u32 tb_id; unsigned tb_stamp; int (*tb_lookup)(struct fib_table *tb, const struct flowi *flp, struct fib_result *res); - int (*tb_insert)(struct fib_table *table, struct rtmsg *r, - struct kern_rta *rta, struct nlmsghdr *n, - struct netlink_skb_parms *req); - int (*tb_delete)(struct fib_table *table, struct rtmsg *r, - struct kern_rta *rta, struct nlmsghdr *n, - struct netlink_skb_parms *req); + int (*tb_insert)(struct fib_table *, struct fib_config *); + int (*tb_delete)(struct fib_table *, struct fib_config *); int (*tb_dump)(struct fib_table *table, struct sk_buff *skb, struct netlink_callback *cb); int (*tb_flush)(struct fib_table *table); @@ -228,8 +231,6 @@ struct rtentry; extern int ip_fib_check_default(u32 gw, struct net_device *dev); extern int fib_sync_down(u32 local, struct net_device *dev, int force); extern int fib_sync_up(struct net_device *dev); -extern int fib_convert_rtentry(int cmd, struct nlmsghdr *nl, struct rtmsg *rtm, - struct kern_rta *rta, struct rtentry *r); extern u32 __fib_res_prefsrc(struct fib_result *res); /* Exported by fib_hash.c */ diff --git a/include/net/netlink.h b/include/net/netlink.h index bf593eb..47044da 100644 --- a/include/net/netlink.h +++ b/include/net/netlink.h @@ -192,6 +192,16 @@ struct nla_policy { u16 minlen; }; +/** + * struct nl_info - netlink source information + * @nlh: Netlink message header of original request + * @pid: Netlink PID of requesting application + */ +struct nl_info { + struct nlmsghdr *nlh; + u32 pid; +}; + extern void netlink_run_queue(struct sock *sk, unsigned int *qlen, int (*cb)(struct sk_buff *, struct nlmsghdr *, int *)); diff --git a/include/net/nexthop.h b/include/net/nexthop.h new file mode 100644 index 0000000..3334dbf --- /dev/null +++ b/include/net/nexthop.h @@ -0,0 +1,33 @@ +#ifndef __NET_NEXTHOP_H +#define __NET_NEXTHOP_H + +#include +#include + +static inline int rtnh_ok(const struct rtnexthop *rtnh, int remaining) +{ + return remaining >= sizeof(*rtnh) && + rtnh->rtnh_len >= sizeof(*rtnh) && + rtnh->rtnh_len <= remaining; +} + +static inline struct rtnexthop *rtnh_next(const struct rtnexthop *rtnh, + int *remaining) +{ + int totlen = NLA_ALIGN(rtnh->rtnh_len); + + *remaining -= totlen; + return (struct rtnexthop *) ((char *) rtnh + totlen); +} + +static inline struct nlattr *rtnh_attrs(const struct rtnexthop *rtnh) +{ + return (struct nlattr *) ((char *) rtnh + NLA_ALIGN(sizeof(*rtnh))); +} + +static inline int rtnh_attrlen(const struct rtnexthop *rtnh) +{ + return rtnh->rtnh_len - NLA_ALIGN(sizeof(*rtnh)); +} + +#endif diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index ad4c14f..acc18bd 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -253,42 +253,190 @@ e_inval: #ifndef CONFIG_IP_NOSIOCRT +static inline u32 sk_extract_addr(struct sockaddr *addr) +{ + return ((struct sockaddr_in *) addr)->sin_addr.s_addr; +} + +static int put_rtax(struct nlattr *mx, int len, int type, u32 value) +{ + struct nlattr *nla; + + nla = (struct nlattr *) ((char *) mx + len); + nla->nla_type = type; + nla->nla_len = nla_attr_size(4); + *(u32 *) nla_data(nla) = value; + + return len + nla_total_size(4); +} + +static int rtentry_to_fib_config(int cmd, struct rtentry *rt, + struct fib_config *cfg) +{ + u32 addr; + int plen; + + memset(cfg, 0, sizeof(*cfg)); + + if (rt->rt_dst.sa_family != AF_INET) + return -EAFNOSUPPORT; + + /* + * Check mask for validity: + * a) it must be contiguous. + * b) destination must have all host bits clear. + * c) if application forgot to set correct family (AF_INET), + * reject request unless it is absolutely clear i.e. + * both family and mask are zero. + */ + plen = 32; + addr = sk_extract_addr(&rt->rt_dst); + if (!(rt->rt_flags & RTF_HOST)) { + u32 mask = sk_extract_addr(&rt->rt_genmask); + + if (rt->rt_genmask.sa_family != AF_INET) { + if (mask || rt->rt_genmask.sa_family) + return -EAFNOSUPPORT; + } + + if (bad_mask(mask, addr)) + return -EINVAL; + + plen = inet_mask_len(mask); + } + + cfg->fc_dst_len = plen; + cfg->fc_dst = addr; + + if (cmd != SIOCDELRT) { + cfg->fc_nlflags = NLM_F_CREATE; + cfg->fc_protocol = RTPROT_BOOT; + } + + if (rt->rt_metric) + cfg->fc_priority = rt->rt_metric - 1; + + if (rt->rt_flags & RTF_REJECT) { + cfg->fc_scope = RT_SCOPE_HOST; + cfg->fc_type = RTN_UNREACHABLE; + return 0; + } + + cfg->fc_scope = RT_SCOPE_NOWHERE; + cfg->fc_type = RTN_UNICAST; + + if (rt->rt_dev) { + char *colon; + struct net_device *dev; + char devname[IFNAMSIZ]; + + if (copy_from_user(devname, rt->rt_dev, IFNAMSIZ-1)) + return -EFAULT; + + devname[IFNAMSIZ-1] = 0; + colon = strchr(devname, ':'); + if (colon) + *colon = 0; + dev = __dev_get_by_name(devname); + if (!dev) + return -ENODEV; + cfg->fc_oif = dev->ifindex; + if (colon) { + struct in_ifaddr *ifa; + struct in_device *in_dev = __in_dev_get_rtnl(dev); + if (!in_dev) + return -ENODEV; + *colon = ':'; + for (ifa = in_dev->ifa_list; ifa; ifa = ifa->ifa_next) + if (strcmp(ifa->ifa_label, devname) == 0) + break; + if (ifa == NULL) + return -ENODEV; + cfg->fc_prefsrc = ifa->ifa_local; + } + } + + addr = sk_extract_addr(&rt->rt_gateway); + if (rt->rt_gateway.sa_family == AF_INET && addr) { + cfg->fc_gw = addr; + if (rt->rt_flags & RTF_GATEWAY && + inet_addr_type(addr) == RTN_UNICAST) + cfg->fc_scope = RT_SCOPE_UNIVERSE; + } + + if (cmd == SIOCDELRT) + return 0; + + if (rt->rt_flags & RTF_GATEWAY && !cfg->fc_gw) + return -EINVAL; + + if (cfg->fc_scope == RT_SCOPE_NOWHERE) + cfg->fc_scope = RT_SCOPE_LINK; + + if (rt->rt_flags & (RTF_MTU | RTF_WINDOW | RTF_IRTT)) { + struct nlattr *mx; + int len = 0; + + mx = kzalloc(3 * nla_total_size(4), GFP_KERNEL); + if (mx == NULL) + return -ENOMEM; + + if (rt->rt_flags & RTF_MTU) + len = put_rtax(mx, len, RTAX_ADVMSS, rt->rt_mtu - 40); + + if (rt->rt_flags & RTF_WINDOW) + len = put_rtax(mx, len, RTAX_WINDOW, rt->rt_window); + + if (rt->rt_flags & RTF_IRTT) + len = put_rtax(mx, len, RTAX_RTT, rt->rt_irtt << 3); + + cfg->fc_mx = mx; + cfg->fc_mx_len = len; + } + + return 0; +} + /* * Handle IP routing ioctl calls. These are used to manipulate the routing tables */ int ip_rt_ioctl(unsigned int cmd, void __user *arg) { + struct fib_config cfg; + struct rtentry rt; int err; - struct kern_rta rta; - struct rtentry r; - struct { - struct nlmsghdr nlh; - struct rtmsg rtm; - } req; switch (cmd) { case SIOCADDRT: /* Add a route */ case SIOCDELRT: /* Delete a route */ if (!capable(CAP_NET_ADMIN)) return -EPERM; - if (copy_from_user(&r, arg, sizeof(struct rtentry))) + + if (copy_from_user(&rt, arg, sizeof(rt))) return -EFAULT; + rtnl_lock(); - err = fib_convert_rtentry(cmd, &req.nlh, &req.rtm, &rta, &r); + err = rtentry_to_fib_config(cmd, &rt, &cfg); if (err == 0) { + struct fib_table *tb; + if (cmd == SIOCDELRT) { - struct fib_table *tb = fib_get_table(req.rtm.rtm_table); - err = -ESRCH; + tb = fib_get_table(cfg.fc_table); if (tb) - err = tb->tb_delete(tb, &req.rtm, &rta, &req.nlh, NULL); + err = tb->tb_delete(tb, &cfg); + else + err = -ESRCH; } else { - struct fib_table *tb = fib_new_table(req.rtm.rtm_table); - err = -ENOBUFS; + tb = fib_new_table(cfg.fc_table); if (tb) - err = tb->tb_insert(tb, &req.rtm, &rta, &req.nlh, NULL); + err = tb->tb_insert(tb, &cfg); + else + err = -ENOBUFS; } - kfree(rta.rta_mx); + + /* allocated by rtentry_to_fib_config() */ + kfree(cfg.fc_mx); } rtnl_unlock(); return err; @@ -305,51 +453,134 @@ int ip_rt_ioctl(unsigned int cmd, void *arg) #endif -static int inet_check_attr(struct rtmsg *r, struct rtattr **rta) +static struct nla_policy rtm_ipv4_policy[RTA_MAX+1] __read_mostly = { + [RTA_DST] = { .type = NLA_U32 }, + [RTA_SRC] = { .type = NLA_U32 }, + [RTA_IIF] = { .type = NLA_U32 }, + [RTA_OIF] = { .type = NLA_U32 }, + [RTA_GATEWAY] = { .type = NLA_U32 }, + [RTA_PRIORITY] = { .type = NLA_U32 }, + [RTA_PREFSRC] = { .type = NLA_U32 }, + [RTA_METRICS] = { .type = NLA_NESTED }, + [RTA_MULTIPATH] = { .minlen = sizeof(struct rtnexthop) }, + [RTA_PROTOINFO] = { .type = NLA_U32 }, + [RTA_FLOW] = { .type = NLA_U32 }, + [RTA_MP_ALGO] = { .type = NLA_U32 }, +}; + +static int rtm_to_fib_config(struct sk_buff *skb, struct nlmsghdr *nlh, + struct fib_config *cfg) { - int i; - - for (i=1; i<=RTA_MAX; i++, rta++) { - struct rtattr *attr = *rta; - if (attr) { - if (RTA_PAYLOAD(attr) < 4) - return -EINVAL; - if (i != RTA_MULTIPATH && i != RTA_METRICS && - i != RTA_TABLE) - *rta = (struct rtattr*)RTA_DATA(attr); + struct nlattr *attr; + int err, remaining; + struct rtmsg *rtm; + + err = nlmsg_validate(nlh, sizeof(*rtm), RTA_MAX, rtm_ipv4_policy); + if (err < 0) + goto errout; + + memset(cfg, 0, sizeof(*cfg)); + + rtm = nlmsg_data(nlh); + cfg->fc_family = rtm->rtm_family; + cfg->fc_dst_len = rtm->rtm_dst_len; + cfg->fc_src_len = rtm->rtm_src_len; + cfg->fc_tos = rtm->rtm_tos; + cfg->fc_table = rtm->rtm_table; + cfg->fc_protocol = rtm->rtm_protocol; + cfg->fc_scope = rtm->rtm_scope; + cfg->fc_type = rtm->rtm_type; + cfg->fc_flags = rtm->rtm_flags; + cfg->fc_nlflags = nlh->nlmsg_flags; + + cfg->fc_nlinfo.pid = NETLINK_CB(skb).pid; + cfg->fc_nlinfo.nlh = nlh; + + nlmsg_for_each_attr(attr, nlh, sizeof(struct rtmsg), remaining) { + switch (attr->nla_type) { + case RTA_DST: + cfg->fc_dst = nla_get_u32(attr); + break; + case RTA_SRC: + cfg->fc_src = nla_get_u32(attr); + break; + case RTA_OIF: + cfg->fc_oif = nla_get_u32(attr); + break; + case RTA_GATEWAY: + cfg->fc_gw = nla_get_u32(attr); + break; + case RTA_PRIORITY: + cfg->fc_priority = nla_get_u32(attr); + break; + case RTA_PREFSRC: + cfg->fc_prefsrc = nla_get_u32(attr); + break; + case RTA_METRICS: + cfg->fc_mx = nla_data(attr); + cfg->fc_mx_len = nla_len(attr); + break; + case RTA_MULTIPATH: + cfg->fc_mp = nla_data(attr); + cfg->fc_mp_len = nla_len(attr); + break; + case RTA_FLOW: + cfg->fc_flow = nla_get_u32(attr); + break; + case RTA_MP_ALGO: + cfg->fc_mp_alg = nla_get_u32(attr); + break; + case RTA_TABLE: + cfg->fc_table = nla_get_u32(attr); + break; } } + return 0; +errout: + return err; } int inet_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) { - struct fib_table * tb; - struct rtattr **rta = arg; - struct rtmsg *r = NLMSG_DATA(nlh); + struct fib_config cfg; + struct fib_table *tb; + int err; - if (inet_check_attr(r, rta)) - return -EINVAL; + err = rtm_to_fib_config(skb, nlh, &cfg); + if (err < 0) + goto errout; - tb = fib_get_table(rtm_get_table(rta, r->rtm_table)); - if (tb) - return tb->tb_delete(tb, r, (struct kern_rta*)rta, nlh, &NETLINK_CB(skb)); - return -ESRCH; + tb = fib_get_table(cfg.fc_table); + if (tb == NULL) { + err = -ESRCH; + goto errout; + } + + err = tb->tb_delete(tb, &cfg); +errout: + return err; } int inet_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) { - struct fib_table * tb; - struct rtattr **rta = arg; - struct rtmsg *r = NLMSG_DATA(nlh); + struct fib_config cfg; + struct fib_table *tb; + int err; - if (inet_check_attr(r, rta)) - return -EINVAL; + err = rtm_to_fib_config(skb, nlh, &cfg); + if (err < 0) + goto errout; - tb = fib_new_table(rtm_get_table(rta, r->rtm_table)); - if (tb) - return tb->tb_insert(tb, r, (struct kern_rta*)rta, nlh, &NETLINK_CB(skb)); - return -ENOBUFS; + tb = fib_new_table(cfg.fc_table); + if (tb == NULL) { + err = -ENOBUFS; + goto errout; + } + + err = tb->tb_insert(tb, &cfg); +errout: + return err; } int inet_dump_fib(struct sk_buff *skb, struct netlink_callback *cb) @@ -396,17 +627,19 @@ out: only when netlink is already locked. */ -static void fib_magic(int cmd, int type, u32 dst, int dst_len, struct in_ifaddr *ifa) +static void fib_magic(int cmd, int type, u32 dst, int dst_len, + struct in_ifaddr *ifa) { - struct fib_table * tb; - struct { - struct nlmsghdr nlh; - struct rtmsg rtm; - } req; - struct kern_rta rta; - - memset(&req.rtm, 0, sizeof(req.rtm)); - memset(&rta, 0, sizeof(rta)); + struct fib_table *tb; + struct fib_config cfg = { + .fc_protocol = RTPROT_KERNEL, + .fc_type = type, + .fc_dst = dst, + .fc_dst_len = dst_len, + .fc_prefsrc = ifa->ifa_local, + .fc_oif = ifa->ifa_dev->dev->ifindex, + .fc_nlflags = NLM_F_CREATE | NLM_F_APPEND, + }; if (type == RTN_UNICAST) tb = fib_new_table(RT_TABLE_MAIN); @@ -416,26 +649,17 @@ static void fib_magic(int cmd, int type, u32 dst, int dst_len, struct in_ifaddr if (tb == NULL) return; - req.nlh.nlmsg_len = sizeof(req); - req.nlh.nlmsg_type = cmd; - req.nlh.nlmsg_flags = NLM_F_REQUEST|NLM_F_CREATE|NLM_F_APPEND; - req.nlh.nlmsg_pid = 0; - req.nlh.nlmsg_seq = 0; + cfg.fc_table = tb->tb_id; - req.rtm.rtm_dst_len = dst_len; - req.rtm.rtm_table = tb->tb_id; - req.rtm.rtm_protocol = RTPROT_KERNEL; - req.rtm.rtm_scope = (type != RTN_LOCAL ? RT_SCOPE_LINK : RT_SCOPE_HOST); - req.rtm.rtm_type = type; - - rta.rta_dst = &dst; - rta.rta_prefsrc = &ifa->ifa_local; - rta.rta_oif = &ifa->ifa_dev->dev->ifindex; + if (type != RTN_LOCAL) + cfg.fc_scope = RT_SCOPE_LINK; + else + cfg.fc_scope = RT_SCOPE_HOST; if (cmd == RTM_NEWROUTE) - tb->tb_insert(tb, &req.rtm, &rta, &req.nlh, NULL); + tb->tb_insert(tb, &cfg); else - tb->tb_delete(tb, &req.rtm, &rta, &req.nlh, NULL); + tb->tb_delete(tb, &cfg); } void fib_add_ifaddr(struct in_ifaddr *ifa) diff --git a/net/ipv4/fib_hash.c b/net/ipv4/fib_hash.c index b5bee1a..3575575 100644 --- a/net/ipv4/fib_hash.c +++ b/net/ipv4/fib_hash.c @@ -379,42 +379,39 @@ static struct fib_node *fib_find_node(struct fn_zone *fz, u32 key) return NULL; } -static int -fn_hash_insert(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, - struct nlmsghdr *n, struct netlink_skb_parms *req) +static int fn_hash_insert(struct fib_table *tb, struct fib_config *cfg) { struct fn_hash *table = (struct fn_hash *) tb->tb_data; struct fib_node *new_f, *f; struct fib_alias *fa, *new_fa; struct fn_zone *fz; struct fib_info *fi; - int z = r->rtm_dst_len; - int type = r->rtm_type; - u8 tos = r->rtm_tos; + u8 tos = cfg->fc_tos; u32 key; int err; - if (z > 32) + if (cfg->fc_dst_len > 32) return -EINVAL; - fz = table->fn_zones[z]; - if (!fz && !(fz = fn_new_zone(table, z))) + + fz = table->fn_zones[cfg->fc_dst_len]; + if (!fz && !(fz = fn_new_zone(table, cfg->fc_dst_len))) return -ENOBUFS; key = 0; - if (rta->rta_dst) { - u32 dst; - memcpy(&dst, rta->rta_dst, 4); - if (dst & ~FZ_MASK(fz)) + if (cfg->fc_dst) { + if (cfg->fc_dst & ~FZ_MASK(fz)) return -EINVAL; - key = fz_key(dst, fz); + key = fz_key(cfg->fc_dst, fz); } - if ((fi = fib_create_info(r, rta, n, &err)) == NULL) - return err; + fi = fib_create_info(cfg); + if (IS_ERR(fi)) + return PTR_ERR(fi); if (fz->fz_nent > (fz->fz_divisor<<1) && fz->fz_divisor < FZ_MAX_DIVISOR && - (z==32 || (1< fz->fz_divisor)) + (cfg->fc_dst_len == 32 || + (1 << cfg->fc_dst_len) > fz->fz_divisor)) fn_rehash_zone(fz); f = fib_find_node(fz, key); @@ -440,18 +437,18 @@ fn_hash_insert(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, struct fib_alias *fa_orig; err = -EEXIST; - if (n->nlmsg_flags & NLM_F_EXCL) + if (cfg->fc_nlflags & NLM_F_EXCL) goto out; - if (n->nlmsg_flags & NLM_F_REPLACE) { + if (cfg->fc_nlflags & NLM_F_REPLACE) { struct fib_info *fi_drop; u8 state; write_lock_bh(&fib_hash_lock); fi_drop = fa->fa_info; fa->fa_info = fi; - fa->fa_type = type; - fa->fa_scope = r->rtm_scope; + fa->fa_type = cfg->fc_type; + fa->fa_scope = cfg->fc_scope; state = fa->fa_state; fa->fa_state &= ~FA_S_ACCESSED; fib_hash_genid++; @@ -474,17 +471,17 @@ fn_hash_insert(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, break; if (fa->fa_info->fib_priority != fi->fib_priority) break; - if (fa->fa_type == type && - fa->fa_scope == r->rtm_scope && + if (fa->fa_type == cfg->fc_type && + fa->fa_scope == cfg->fc_scope && fa->fa_info == fi) goto out; } - if (!(n->nlmsg_flags & NLM_F_APPEND)) + if (!(cfg->fc_nlflags & NLM_F_APPEND)) fa = fa_orig; } err = -ENOENT; - if (!(n->nlmsg_flags&NLM_F_CREATE)) + if (!(cfg->fc_nlflags & NLM_F_CREATE)) goto out; err = -ENOBUFS; @@ -506,8 +503,8 @@ fn_hash_insert(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, new_fa->fa_info = fi; new_fa->fa_tos = tos; - new_fa->fa_type = type; - new_fa->fa_scope = r->rtm_scope; + new_fa->fa_type = cfg->fc_type; + new_fa->fa_scope = cfg->fc_scope; new_fa->fa_state = 0; /* @@ -526,7 +523,8 @@ fn_hash_insert(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, fz->fz_nent++; rt_cache_flush(-1); - rtmsg_fib(RTM_NEWROUTE, key, new_fa, z, tb->tb_id, n, req); + rtmsg_fib(RTM_NEWROUTE, key, new_fa, cfg->fc_dst_len, tb->tb_id, + &cfg->fc_nlinfo); return 0; out_free_new_fa: @@ -537,30 +535,25 @@ out: } -static int -fn_hash_delete(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, - struct nlmsghdr *n, struct netlink_skb_parms *req) +static int fn_hash_delete(struct fib_table *tb, struct fib_config *cfg) { struct fn_hash *table = (struct fn_hash*)tb->tb_data; struct fib_node *f; struct fib_alias *fa, *fa_to_delete; - int z = r->rtm_dst_len; struct fn_zone *fz; u32 key; - u8 tos = r->rtm_tos; - if (z > 32) + if (cfg->fc_dst_len > 32) return -EINVAL; - if ((fz = table->fn_zones[z]) == NULL) + + if ((fz = table->fn_zones[cfg->fc_dst_len]) == NULL) return -ESRCH; key = 0; - if (rta->rta_dst) { - u32 dst; - memcpy(&dst, rta->rta_dst, 4); - if (dst & ~FZ_MASK(fz)) + if (cfg->fc_dst) { + if (cfg->fc_dst & ~FZ_MASK(fz)) return -EINVAL; - key = fz_key(dst, fz); + key = fz_key(cfg->fc_dst, fz); } f = fib_find_node(fz, key); @@ -568,7 +561,7 @@ fn_hash_delete(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, if (!f) fa = NULL; else - fa = fib_find_alias(&f->fn_alias, tos, 0); + fa = fib_find_alias(&f->fn_alias, cfg->fc_tos, 0); if (!fa) return -ESRCH; @@ -577,16 +570,16 @@ fn_hash_delete(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, list_for_each_entry_continue(fa, &f->fn_alias, fa_list) { struct fib_info *fi = fa->fa_info; - if (fa->fa_tos != tos) + if (fa->fa_tos != cfg->fc_tos) break; - if ((!r->rtm_type || - fa->fa_type == r->rtm_type) && - (r->rtm_scope == RT_SCOPE_NOWHERE || - fa->fa_scope == r->rtm_scope) && - (!r->rtm_protocol || - fi->fib_protocol == r->rtm_protocol) && - fib_nh_match(r, n, rta, fi) == 0) { + if ((!cfg->fc_type || + fa->fa_type == cfg->fc_type) && + (cfg->fc_scope == RT_SCOPE_NOWHERE || + fa->fa_scope == cfg->fc_scope) && + (!cfg->fc_protocol || + fi->fib_protocol == cfg->fc_protocol) && + fib_nh_match(cfg, fi) == 0) { fa_to_delete = fa; break; } @@ -596,7 +589,8 @@ fn_hash_delete(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, int kill_fn; fa = fa_to_delete; - rtmsg_fib(RTM_DELROUTE, key, fa, z, tb->tb_id, n, req); + rtmsg_fib(RTM_DELROUTE, key, fa, cfg->fc_dst_len, + tb->tb_id, &cfg->fc_nlinfo); kill_fn = 0; write_lock_bh(&fib_hash_lock); diff --git a/net/ipv4/fib_lookup.h b/net/ipv4/fib_lookup.h index ddd5249..d6d1a89 100644 --- a/net/ipv4/fib_lookup.h +++ b/net/ipv4/fib_lookup.h @@ -23,19 +23,14 @@ extern int fib_semantic_match(struct list_head *head, struct fib_result *res, __u32 zone, __u32 mask, int prefixlen); extern void fib_release_info(struct fib_info *); -extern struct fib_info *fib_create_info(const struct rtmsg *r, - struct kern_rta *rta, - const struct nlmsghdr *, - int *err); -extern int fib_nh_match(struct rtmsg *r, struct nlmsghdr *, - struct kern_rta *rta, struct fib_info *fi); +extern struct fib_info *fib_create_info(struct fib_config *cfg); +extern int fib_nh_match(struct fib_config *cfg, struct fib_info *fi); extern int fib_dump_info(struct sk_buff *skb, u32 pid, u32 seq, int event, u32 tb_id, u8 type, u8 scope, void *dst, int dst_len, u8 tos, struct fib_info *fi, unsigned int); extern void rtmsg_fib(int event, u32 key, struct fib_alias *fa, - int z, u32 tb_id, - struct nlmsghdr *n, struct netlink_skb_parms *req); + int dst_len, u32 tb_id, struct nl_info *info); extern struct fib_alias *fib_find_alias(struct list_head *fah, u8 tos, u32 prio); extern int fib_detect_death(struct fib_info *fi, int order, diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index 5dfdad5..340f9db 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -44,6 +44,7 @@ #include #include #include +#include #include "fib_lookup.h" @@ -273,27 +274,27 @@ int ip_fib_check_default(u32 gw, struct net_device *dev) } void rtmsg_fib(int event, u32 key, struct fib_alias *fa, - int z, u32 tb_id, - struct nlmsghdr *n, struct netlink_skb_parms *req) + int dst_len, u32 tb_id, struct nl_info *info) { struct sk_buff *skb; - u32 pid = req ? req->pid : n->nlmsg_pid; int payload = sizeof(struct rtmsg) + 256; + u32 seq = info->nlh ? info->nlh->nlmsg_seq : 0; int err = -ENOBUFS; skb = nlmsg_new(nlmsg_total_size(payload), GFP_KERNEL); if (skb == NULL) goto errout; - err = fib_dump_info(skb, pid, n->nlmsg_seq, event, tb_id, - fa->fa_type, fa->fa_scope, &key, z, fa->fa_tos, - fa->fa_info, 0); + err = fib_dump_info(skb, info->pid, seq, event, tb_id, + fa->fa_type, fa->fa_scope, &key, dst_len, + fa->fa_tos, fa->fa_info, 0); if (err < 0) { kfree_skb(skb); goto errout; } - err = rtnl_notify(skb, pid, RTNLGRP_IPV4_ROUTE, n, GFP_KERNEL); + err = rtnl_notify(skb, info->pid, RTNLGRP_IPV4_ROUTE, + info->nlh, GFP_KERNEL); errout: if (err < 0) rtnl_set_sk_err(RTNLGRP_IPV4_ROUTE, err); @@ -342,102 +343,100 @@ int fib_detect_death(struct fib_info *fi, int order, #ifdef CONFIG_IP_ROUTE_MULTIPATH -static u32 fib_get_attr32(struct rtattr *attr, int attrlen, int type) -{ - while (RTA_OK(attr,attrlen)) { - if (attr->rta_type == type) - return *(u32*)RTA_DATA(attr); - attr = RTA_NEXT(attr, attrlen); - } - return 0; -} - -static int -fib_count_nexthops(struct rtattr *rta) +static int fib_count_nexthops(struct rtnexthop *rtnh, int remaining) { int nhs = 0; - struct rtnexthop *nhp = RTA_DATA(rta); - int nhlen = RTA_PAYLOAD(rta); - while (nhlen >= (int)sizeof(struct rtnexthop)) { - if ((nhlen -= nhp->rtnh_len) < 0) - return 0; + while (rtnh_ok(rtnh, remaining)) { nhs++; - nhp = RTNH_NEXT(nhp); - }; - return nhs; + rtnh = rtnh_next(rtnh, &remaining); + } + + /* leftover implies invalid nexthop configuration, discard it */ + return remaining > 0 ? 0 : nhs; } -static int -fib_get_nhs(struct fib_info *fi, const struct rtattr *rta, const struct rtmsg *r) +static int fib_get_nhs(struct fib_info *fi, struct rtnexthop *rtnh, + int remaining, struct fib_config *cfg) { - struct rtnexthop *nhp = RTA_DATA(rta); - int nhlen = RTA_PAYLOAD(rta); - change_nexthops(fi) { - int attrlen = nhlen - sizeof(struct rtnexthop); - if (attrlen < 0 || (nhlen -= nhp->rtnh_len) < 0) + int attrlen; + + if (!rtnh_ok(rtnh, remaining)) return -EINVAL; - nh->nh_flags = (r->rtm_flags&~0xFF) | nhp->rtnh_flags; - nh->nh_oif = nhp->rtnh_ifindex; - nh->nh_weight = nhp->rtnh_hops + 1; - if (attrlen) { - nh->nh_gw = fib_get_attr32(RTNH_DATA(nhp), attrlen, RTA_GATEWAY); + + nh->nh_flags = (cfg->fc_flags & ~0xFF) | rtnh->rtnh_flags; + nh->nh_oif = rtnh->rtnh_ifindex; + nh->nh_weight = rtnh->rtnh_hops + 1; + + attrlen = rtnh_attrlen(rtnh); + if (attrlen > 0) { + struct nlattr *nla, *attrs = rtnh_attrs(rtnh); + + nla = nla_find(attrs, attrlen, RTA_GATEWAY); + nh->nh_gw = nla ? nla_get_u32(nla) : 0; #ifdef CONFIG_NET_CLS_ROUTE - nh->nh_tclassid = fib_get_attr32(RTNH_DATA(nhp), attrlen, RTA_FLOW); + nla = nla_find(attrs, attrlen, RTA_FLOW); + nh->nh_tclassid = nla ? nla_get_u32(nla) : 0; #endif } - nhp = RTNH_NEXT(nhp); + + rtnh = rtnh_next(rtnh, &remaining); } endfor_nexthops(fi); + return 0; } #endif -int fib_nh_match(struct rtmsg *r, struct nlmsghdr *nlh, struct kern_rta *rta, - struct fib_info *fi) +int fib_nh_match(struct fib_config *cfg, struct fib_info *fi) { #ifdef CONFIG_IP_ROUTE_MULTIPATH - struct rtnexthop *nhp; - int nhlen; + struct rtnexthop *rtnh; + int remaining; #endif - if (rta->rta_priority && - *rta->rta_priority != fi->fib_priority) + if (cfg->fc_priority && cfg->fc_priority != fi->fib_priority) return 1; - if (rta->rta_oif || rta->rta_gw) { - if ((!rta->rta_oif || *rta->rta_oif == fi->fib_nh->nh_oif) && - (!rta->rta_gw || memcmp(rta->rta_gw, &fi->fib_nh->nh_gw, 4) == 0)) + if (cfg->fc_oif || cfg->fc_gw) { + if ((!cfg->fc_oif || cfg->fc_oif == fi->fib_nh->nh_oif) && + (!cfg->fc_gw || cfg->fc_gw == fi->fib_nh->nh_gw)) return 0; return 1; } #ifdef CONFIG_IP_ROUTE_MULTIPATH - if (rta->rta_mp == NULL) + if (cfg->fc_mp == NULL) return 0; - nhp = RTA_DATA(rta->rta_mp); - nhlen = RTA_PAYLOAD(rta->rta_mp); + + rtnh = cfg->fc_mp; + remaining = cfg->fc_mp_len; for_nexthops(fi) { - int attrlen = nhlen - sizeof(struct rtnexthop); - u32 gw; + int attrlen; - if (attrlen < 0 || (nhlen -= nhp->rtnh_len) < 0) + if (!rtnh_ok(rtnh, remaining)) return -EINVAL; - if (nhp->rtnh_ifindex && nhp->rtnh_ifindex != nh->nh_oif) + + if (rtnh->rtnh_ifindex && rtnh->rtnh_ifindex != nh->nh_oif) return 1; - if (attrlen) { - gw = fib_get_attr32(RTNH_DATA(nhp), attrlen, RTA_GATEWAY); - if (gw && gw != nh->nh_gw) + + attrlen = rtnh_attrlen(rtnh); + if (attrlen < 0) { + struct nlattr *nla, *attrs = rtnh_attrs(rtnh); + + nla = nla_find(attrs, attrlen, RTA_GATEWAY); + if (nla && nla_get_u32(nla) != nh->nh_gw) return 1; #ifdef CONFIG_NET_CLS_ROUTE - gw = fib_get_attr32(RTNH_DATA(nhp), attrlen, RTA_FLOW); - if (gw && gw != nh->nh_tclassid) + nla = nla_find(attrs, attrlen, RTA_FLOW); + if (nla && nla_get_u32(nla) != nh->nh_tclassid) return 1; #endif } - nhp = RTNH_NEXT(nhp); + + rtnh = rtnh_next(rtnh, &remaining); } endfor_nexthops(fi); #endif return 0; @@ -488,7 +487,8 @@ int fib_nh_match(struct rtmsg *r, struct nlmsghdr *nlh, struct kern_rta *rta, |-> {local prefix} (terminal node) */ -static int fib_check_nh(const struct rtmsg *r, struct fib_info *fi, struct fib_nh *nh) +static int fib_check_nh(struct fib_config *cfg, struct fib_info *fi, + struct fib_nh *nh) { int err; @@ -502,7 +502,7 @@ static int fib_check_nh(const struct rtmsg *r, struct fib_info *fi, struct fib_n if (nh->nh_flags&RTNH_F_ONLINK) { struct net_device *dev; - if (r->rtm_scope >= RT_SCOPE_LINK) + if (cfg->fc_scope >= RT_SCOPE_LINK) return -EINVAL; if (inet_addr_type(nh->nh_gw) != RTN_UNICAST) return -EINVAL; @@ -516,10 +516,15 @@ static int fib_check_nh(const struct rtmsg *r, struct fib_info *fi, struct fib_n return 0; } { - struct flowi fl = { .nl_u = { .ip4_u = - { .daddr = nh->nh_gw, - .scope = r->rtm_scope + 1 } }, - .oif = nh->nh_oif }; + struct flowi fl = { + .nl_u = { + .ip4_u = { + .daddr = nh->nh_gw, + .scope = cfg->fc_scope + 1, + }, + }, + .oif = nh->nh_oif, + }; /* It is not necessary, but requires a bit of thinking */ if (fl.fl4_scope < RT_SCOPE_LINK) @@ -646,39 +651,28 @@ static void fib_hash_move(struct hlist_head *new_info_hash, fib_hash_free(old_laddrhash, bytes); } -struct fib_info * -fib_create_info(const struct rtmsg *r, struct kern_rta *rta, - const struct nlmsghdr *nlh, int *errp) +struct fib_info *fib_create_info(struct fib_config *cfg) { int err; struct fib_info *fi = NULL; struct fib_info *ofi; -#ifdef CONFIG_IP_ROUTE_MULTIPATH int nhs = 1; -#else - const int nhs = 1; -#endif -#ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED - u32 mp_alg = IP_MP_ALG_NONE; -#endif /* Fast check to catch the most weird cases */ - if (fib_props[r->rtm_type].scope > r->rtm_scope) + if (fib_props[cfg->fc_type].scope > cfg->fc_scope) goto err_inval; #ifdef CONFIG_IP_ROUTE_MULTIPATH - if (rta->rta_mp) { - nhs = fib_count_nexthops(rta->rta_mp); + if (cfg->fc_mp) { + nhs = fib_count_nexthops(cfg->fc_mp, cfg->fc_mp_len); if (nhs == 0) goto err_inval; } #endif #ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED - if (rta->rta_mp_alg) { - mp_alg = *rta->rta_mp_alg; - - if (mp_alg < IP_MP_ALG_NONE || - mp_alg > IP_MP_ALG_MAX) + if (cfg->fc_mp_alg) { + if (cfg->fc_mp_alg < IP_MP_ALG_NONE || + cfg->fc_mp_alg > IP_MP_ALG_MAX) goto err_inval; } #endif @@ -714,43 +708,42 @@ fib_create_info(const struct rtmsg *r, struct kern_rta *rta, goto failure; fib_info_cnt++; - fi->fib_protocol = r->rtm_protocol; + fi->fib_protocol = cfg->fc_protocol; + fi->fib_flags = cfg->fc_flags; + fi->fib_priority = cfg->fc_priority; + fi->fib_prefsrc = cfg->fc_prefsrc; fi->fib_nhs = nhs; change_nexthops(fi) { nh->nh_parent = fi; } endfor_nexthops(fi) - fi->fib_flags = r->rtm_flags; - if (rta->rta_priority) - fi->fib_priority = *rta->rta_priority; - if (rta->rta_mx) { - int attrlen = RTA_PAYLOAD(rta->rta_mx); - struct rtattr *attr = RTA_DATA(rta->rta_mx); - - while (RTA_OK(attr, attrlen)) { - unsigned flavor = attr->rta_type; - if (flavor) { - if (flavor > RTAX_MAX) + if (cfg->fc_mx) { + struct nlattr *nla; + int remaining; + + nla_for_each_attr(nla, cfg->fc_mx, cfg->fc_mx_len, remaining) { + int type = nla->nla_type; + + if (type) { + if (type > RTAX_MAX) goto err_inval; - fi->fib_metrics[flavor-1] = *(unsigned*)RTA_DATA(attr); + fi->fib_metrics[type - 1] = nla_get_u32(nla); } - attr = RTA_NEXT(attr, attrlen); } } - if (rta->rta_prefsrc) - memcpy(&fi->fib_prefsrc, rta->rta_prefsrc, 4); - if (rta->rta_mp) { + if (cfg->fc_mp) { #ifdef CONFIG_IP_ROUTE_MULTIPATH - if ((err = fib_get_nhs(fi, rta->rta_mp, r)) != 0) + err = fib_get_nhs(fi, cfg->fc_mp, cfg->fc_mp_len, cfg); + if (err != 0) goto failure; - if (rta->rta_oif && fi->fib_nh->nh_oif != *rta->rta_oif) + if (cfg->fc_oif && fi->fib_nh->nh_oif != cfg->fc_oif) goto err_inval; - if (rta->rta_gw && memcmp(&fi->fib_nh->nh_gw, rta->rta_gw, 4)) + if (cfg->fc_gw && fi->fib_nh->nh_gw != cfg->fc_gw) goto err_inval; #ifdef CONFIG_NET_CLS_ROUTE - if (rta->rta_flow && memcmp(&fi->fib_nh->nh_tclassid, rta->rta_flow, 4)) + if (cfg->fc_flow && fi->fib_nh->nh_tclassid != cfg->fc_flow) goto err_inval; #endif #else @@ -758,34 +751,32 @@ fib_create_info(const struct rtmsg *r, struct kern_rta *rta, #endif } else { struct fib_nh *nh = fi->fib_nh; - if (rta->rta_oif) - nh->nh_oif = *rta->rta_oif; - if (rta->rta_gw) - memcpy(&nh->nh_gw, rta->rta_gw, 4); + + nh->nh_oif = cfg->fc_oif; + nh->nh_gw = cfg->fc_gw; + nh->nh_flags = cfg->fc_flags; #ifdef CONFIG_NET_CLS_ROUTE - if (rta->rta_flow) - memcpy(&nh->nh_tclassid, rta->rta_flow, 4); + nh->nh_tclassid = cfg->fc_flow; #endif - nh->nh_flags = r->rtm_flags; #ifdef CONFIG_IP_ROUTE_MULTIPATH nh->nh_weight = 1; #endif } #ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED - fi->fib_mp_alg = mp_alg; + fi->fib_mp_alg = cfg->fc_mp_alg; #endif - if (fib_props[r->rtm_type].error) { - if (rta->rta_gw || rta->rta_oif || rta->rta_mp) + if (fib_props[cfg->fc_type].error) { + if (cfg->fc_gw || cfg->fc_oif || cfg->fc_mp) goto err_inval; goto link_it; } - if (r->rtm_scope > RT_SCOPE_HOST) + if (cfg->fc_scope > RT_SCOPE_HOST) goto err_inval; - if (r->rtm_scope == RT_SCOPE_HOST) { + if (cfg->fc_scope == RT_SCOPE_HOST) { struct fib_nh *nh = fi->fib_nh; /* Local address is added. */ @@ -798,14 +789,14 @@ fib_create_info(const struct rtmsg *r, struct kern_rta *rta, goto failure; } else { change_nexthops(fi) { - if ((err = fib_check_nh(r, fi, nh)) != 0) + if ((err = fib_check_nh(cfg, fi, nh)) != 0) goto failure; } endfor_nexthops(fi) } if (fi->fib_prefsrc) { - if (r->rtm_type != RTN_LOCAL || rta->rta_dst == NULL || - memcmp(&fi->fib_prefsrc, rta->rta_dst, 4)) + if (cfg->fc_type != RTN_LOCAL || !cfg->fc_dst || + fi->fib_prefsrc != cfg->fc_dst) if (inet_addr_type(fi->fib_prefsrc) != RTN_LOCAL) goto err_inval; } @@ -846,12 +837,12 @@ err_inval: err = -EINVAL; failure: - *errp = err; if (fi) { fi->fib_dead = 1; free_fib_info(fi); } - return NULL; + + return ERR_PTR(err); } /* Note! fib_semantic_match intentionally uses RCU list functions. */ @@ -1012,150 +1003,6 @@ rtattr_failure: return -1; } -#ifndef CONFIG_IP_NOSIOCRT - -int -fib_convert_rtentry(int cmd, struct nlmsghdr *nl, struct rtmsg *rtm, - struct kern_rta *rta, struct rtentry *r) -{ - int plen; - u32 *ptr; - - memset(rtm, 0, sizeof(*rtm)); - memset(rta, 0, sizeof(*rta)); - - if (r->rt_dst.sa_family != AF_INET) - return -EAFNOSUPPORT; - - /* Check mask for validity: - a) it must be contiguous. - b) destination must have all host bits clear. - c) if application forgot to set correct family (AF_INET), - reject request unless it is absolutely clear i.e. - both family and mask are zero. - */ - plen = 32; - ptr = &((struct sockaddr_in*)&r->rt_dst)->sin_addr.s_addr; - if (!(r->rt_flags&RTF_HOST)) { - u32 mask = ((struct sockaddr_in*)&r->rt_genmask)->sin_addr.s_addr; - if (r->rt_genmask.sa_family != AF_INET) { - if (mask || r->rt_genmask.sa_family) - return -EAFNOSUPPORT; - } - if (bad_mask(mask, *ptr)) - return -EINVAL; - plen = inet_mask_len(mask); - } - - nl->nlmsg_flags = NLM_F_REQUEST; - nl->nlmsg_pid = 0; - nl->nlmsg_seq = 0; - nl->nlmsg_len = NLMSG_LENGTH(sizeof(*rtm)); - if (cmd == SIOCDELRT) { - nl->nlmsg_type = RTM_DELROUTE; - nl->nlmsg_flags = 0; - } else { - nl->nlmsg_type = RTM_NEWROUTE; - nl->nlmsg_flags = NLM_F_REQUEST|NLM_F_CREATE; - rtm->rtm_protocol = RTPROT_BOOT; - } - - rtm->rtm_dst_len = plen; - rta->rta_dst = ptr; - - if (r->rt_metric) { - *(u32*)&r->rt_pad3 = r->rt_metric - 1; - rta->rta_priority = (u32*)&r->rt_pad3; - } - if (r->rt_flags&RTF_REJECT) { - rtm->rtm_scope = RT_SCOPE_HOST; - rtm->rtm_type = RTN_UNREACHABLE; - return 0; - } - rtm->rtm_scope = RT_SCOPE_NOWHERE; - rtm->rtm_type = RTN_UNICAST; - - if (r->rt_dev) { - char *colon; - struct net_device *dev; - char devname[IFNAMSIZ]; - - if (copy_from_user(devname, r->rt_dev, IFNAMSIZ-1)) - return -EFAULT; - devname[IFNAMSIZ-1] = 0; - colon = strchr(devname, ':'); - if (colon) - *colon = 0; - dev = __dev_get_by_name(devname); - if (!dev) - return -ENODEV; - rta->rta_oif = &dev->ifindex; - if (colon) { - struct in_ifaddr *ifa; - struct in_device *in_dev = __in_dev_get_rtnl(dev); - if (!in_dev) - return -ENODEV; - *colon = ':'; - for (ifa = in_dev->ifa_list; ifa; ifa = ifa->ifa_next) - if (strcmp(ifa->ifa_label, devname) == 0) - break; - if (ifa == NULL) - return -ENODEV; - rta->rta_prefsrc = &ifa->ifa_local; - } - } - - ptr = &((struct sockaddr_in*)&r->rt_gateway)->sin_addr.s_addr; - if (r->rt_gateway.sa_family == AF_INET && *ptr) { - rta->rta_gw = ptr; - if (r->rt_flags&RTF_GATEWAY && inet_addr_type(*ptr) == RTN_UNICAST) - rtm->rtm_scope = RT_SCOPE_UNIVERSE; - } - - if (cmd == SIOCDELRT) - return 0; - - if (r->rt_flags&RTF_GATEWAY && rta->rta_gw == NULL) - return -EINVAL; - - if (rtm->rtm_scope == RT_SCOPE_NOWHERE) - rtm->rtm_scope = RT_SCOPE_LINK; - - if (r->rt_flags&(RTF_MTU|RTF_WINDOW|RTF_IRTT)) { - struct rtattr *rec; - struct rtattr *mx = kmalloc(RTA_LENGTH(3*RTA_LENGTH(4)), GFP_KERNEL); - if (mx == NULL) - return -ENOMEM; - rta->rta_mx = mx; - mx->rta_type = RTA_METRICS; - mx->rta_len = RTA_LENGTH(0); - if (r->rt_flags&RTF_MTU) { - rec = (void*)((char*)mx + RTA_ALIGN(mx->rta_len)); - rec->rta_type = RTAX_ADVMSS; - rec->rta_len = RTA_LENGTH(4); - mx->rta_len += RTA_LENGTH(4); - *(u32*)RTA_DATA(rec) = r->rt_mtu - 40; - } - if (r->rt_flags&RTF_WINDOW) { - rec = (void*)((char*)mx + RTA_ALIGN(mx->rta_len)); - rec->rta_type = RTAX_WINDOW; - rec->rta_len = RTA_LENGTH(4); - mx->rta_len += RTA_LENGTH(4); - *(u32*)RTA_DATA(rec) = r->rt_window; - } - if (r->rt_flags&RTF_IRTT) { - rec = (void*)((char*)mx + RTA_ALIGN(mx->rta_len)); - rec->rta_type = RTAX_RTT; - rec->rta_len = RTA_LENGTH(4); - mx->rta_len += RTA_LENGTH(4); - *(u32*)RTA_DATA(rec) = r->rt_irtt<<3; - } - } - return 0; -} - -#endif - /* Update FIB if: - local address disappeared -> we must delete all the entries diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c index 2a580eb..41bef0a 100644 --- a/net/ipv4/fib_trie.c +++ b/net/ipv4/fib_trie.c @@ -1124,17 +1124,14 @@ err: return fa_head; } -static int -fn_trie_insert(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, - struct nlmsghdr *nlhdr, struct netlink_skb_parms *req) +static int fn_trie_insert(struct fib_table *tb, struct fib_config *cfg) { struct trie *t = (struct trie *) tb->tb_data; struct fib_alias *fa, *new_fa; struct list_head *fa_head = NULL; struct fib_info *fi; - int plen = r->rtm_dst_len; - int type = r->rtm_type; - u8 tos = r->rtm_tos; + int plen = cfg->fc_dst_len; + u8 tos = cfg->fc_tos; u32 key, mask; int err; struct leaf *l; @@ -1142,11 +1139,7 @@ fn_trie_insert(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, if (plen > 32) return -EINVAL; - key = 0; - if (rta->rta_dst) - memcpy(&key, rta->rta_dst, 4); - - key = ntohl(key); + key = ntohl(cfg->fc_dst); pr_debug("Insert table=%u %08x/%d\n", tb->tb_id, key, plen); @@ -1157,10 +1150,11 @@ fn_trie_insert(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, key = key & mask; - fi = fib_create_info(r, rta, nlhdr, &err); - - if (!fi) + fi = fib_create_info(cfg); + if (IS_ERR(fi)) { + err = PTR_ERR(fi); goto err; + } l = fib_find_node(t, key); fa = NULL; @@ -1185,10 +1179,10 @@ fn_trie_insert(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, struct fib_alias *fa_orig; err = -EEXIST; - if (nlhdr->nlmsg_flags & NLM_F_EXCL) + if (cfg->fc_nlflags & NLM_F_EXCL) goto out; - if (nlhdr->nlmsg_flags & NLM_F_REPLACE) { + if (cfg->fc_nlflags & NLM_F_REPLACE) { struct fib_info *fi_drop; u8 state; @@ -1200,8 +1194,8 @@ fn_trie_insert(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, fi_drop = fa->fa_info; new_fa->fa_tos = fa->fa_tos; new_fa->fa_info = fi; - new_fa->fa_type = type; - new_fa->fa_scope = r->rtm_scope; + new_fa->fa_type = cfg->fc_type; + new_fa->fa_scope = cfg->fc_scope; state = fa->fa_state; new_fa->fa_state &= ~FA_S_ACCESSED; @@ -1224,17 +1218,17 @@ fn_trie_insert(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, break; if (fa->fa_info->fib_priority != fi->fib_priority) break; - if (fa->fa_type == type && - fa->fa_scope == r->rtm_scope && + if (fa->fa_type == cfg->fc_type && + fa->fa_scope == cfg->fc_scope && fa->fa_info == fi) { goto out; } } - if (!(nlhdr->nlmsg_flags & NLM_F_APPEND)) + if (!(cfg->fc_nlflags & NLM_F_APPEND)) fa = fa_orig; } err = -ENOENT; - if (!(nlhdr->nlmsg_flags & NLM_F_CREATE)) + if (!(cfg->fc_nlflags & NLM_F_CREATE)) goto out; err = -ENOBUFS; @@ -1244,8 +1238,8 @@ fn_trie_insert(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, new_fa->fa_info = fi; new_fa->fa_tos = tos; - new_fa->fa_type = type; - new_fa->fa_scope = r->rtm_scope; + new_fa->fa_type = cfg->fc_type; + new_fa->fa_scope = cfg->fc_scope; new_fa->fa_state = 0; /* * Insert new entry to the list. @@ -1262,7 +1256,8 @@ fn_trie_insert(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, (fa ? &fa->fa_list : fa_head)); rt_cache_flush(-1); - rtmsg_fib(RTM_NEWROUTE, htonl(key), new_fa, plen, tb->tb_id, nlhdr, req); + rtmsg_fib(RTM_NEWROUTE, htonl(key), new_fa, plen, tb->tb_id, + &cfg->fc_nlinfo); succeeded: return 0; @@ -1548,28 +1543,21 @@ static int trie_leaf_remove(struct trie *t, t_key key) return 1; } -static int -fn_trie_delete(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, - struct nlmsghdr *nlhdr, struct netlink_skb_parms *req) +static int fn_trie_delete(struct fib_table *tb, struct fib_config *cfg) { struct trie *t = (struct trie *) tb->tb_data; u32 key, mask; - int plen = r->rtm_dst_len; - u8 tos = r->rtm_tos; + int plen = cfg->fc_dst_len; + u8 tos = cfg->fc_tos; struct fib_alias *fa, *fa_to_delete; struct list_head *fa_head; struct leaf *l; struct leaf_info *li; - if (plen > 32) return -EINVAL; - key = 0; - if (rta->rta_dst) - memcpy(&key, rta->rta_dst, 4); - - key = ntohl(key); + key = ntohl(cfg->fc_dst); mask = ntohl(inet_make_mask(plen)); if (key & ~mask) @@ -1598,13 +1586,12 @@ fn_trie_delete(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, if (fa->fa_tos != tos) break; - if ((!r->rtm_type || - fa->fa_type == r->rtm_type) && - (r->rtm_scope == RT_SCOPE_NOWHERE || - fa->fa_scope == r->rtm_scope) && - (!r->rtm_protocol || - fi->fib_protocol == r->rtm_protocol) && - fib_nh_match(r, nlhdr, rta, fi) == 0) { + if ((!cfg->fc_type || fa->fa_type == cfg->fc_type) && + (cfg->fc_scope == RT_SCOPE_NOWHERE || + fa->fa_scope == cfg->fc_scope) && + (!cfg->fc_protocol || + fi->fib_protocol == cfg->fc_protocol) && + fib_nh_match(cfg, fi) == 0) { fa_to_delete = fa; break; } @@ -1614,7 +1601,8 @@ fn_trie_delete(struct fib_table *tb, struct rtmsg *r, struct kern_rta *rta, return -ESRCH; fa = fa_to_delete; - rtmsg_fib(RTM_DELROUTE, htonl(key), fa, plen, tb->tb_id, nlhdr, req); + rtmsg_fib(RTM_DELROUTE, htonl(key), fa, plen, tb->tb_id, + &cfg->fc_nlinfo); l = fib_find_node(t, key); li = find_leaf_info(l, plen); -- cgit v0.10.2 From be403ea1856f1428b5912b42184acbba808c41d6 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Thu, 17 Aug 2006 18:15:17 -0700 Subject: [IPv4]: Convert FIB dumping to use new netlink api Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index acc18bd..d537c93 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -591,8 +591,8 @@ int inet_dump_fib(struct sk_buff *skb, struct netlink_callback *cb) struct hlist_node *node; int dumped = 0; - if (NLMSG_PAYLOAD(cb->nlh, 0) >= sizeof(struct rtmsg) && - ((struct rtmsg*)NLMSG_DATA(cb->nlh))->rtm_flags&RTM_F_CLONED) + if (nlmsg_len(cb->nlh) >= sizeof(struct rtmsg) && + ((struct rtmsg *) nlmsg_data(cb->nlh))->rtm_flags & RTM_F_CLONED) return ip_rt_dump(skb, cb); s_h = cb->args[0]; diff --git a/net/ipv4/fib_hash.c b/net/ipv4/fib_hash.c index 3575575..88133b3 100644 --- a/net/ipv4/fib_hash.c +++ b/net/ipv4/fib_hash.c @@ -693,7 +693,7 @@ fn_hash_dump_bucket(struct sk_buff *skb, struct netlink_callback *cb, tb->tb_id, fa->fa_type, fa->fa_scope, - &f->fn_key, + f->fn_key, fz->fz_order, fa->fa_tos, fa->fa_info, diff --git a/net/ipv4/fib_lookup.h b/net/ipv4/fib_lookup.h index d6d1a89..fd6f776 100644 --- a/net/ipv4/fib_lookup.h +++ b/net/ipv4/fib_lookup.h @@ -26,7 +26,7 @@ extern void fib_release_info(struct fib_info *); extern struct fib_info *fib_create_info(struct fib_config *cfg); extern int fib_nh_match(struct fib_config *cfg, struct fib_info *fi); extern int fib_dump_info(struct sk_buff *skb, u32 pid, u32 seq, int event, - u32 tb_id, u8 type, u8 scope, void *dst, + u32 tb_id, u8 type, u8 scope, u32 dst, int dst_len, u8 tos, struct fib_info *fi, unsigned int); extern void rtmsg_fib(int event, u32 key, struct fib_alias *fa, diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index 340f9db..2ead0954 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -286,7 +286,7 @@ void rtmsg_fib(int event, u32 key, struct fib_alias *fa, goto errout; err = fib_dump_info(skb, info->pid, seq, event, tb_id, - fa->fa_type, fa->fa_scope, &key, dst_len, + fa->fa_type, fa->fa_scope, key, dst_len, fa->fa_tos, fa->fa_info, 0); if (err < 0) { kfree_skb(skb); @@ -928,79 +928,87 @@ u32 __fib_res_prefsrc(struct fib_result *res) return inet_select_addr(FIB_RES_DEV(*res), FIB_RES_GW(*res), res->scope); } -int -fib_dump_info(struct sk_buff *skb, u32 pid, u32 seq, int event, - u32 tb_id, u8 type, u8 scope, void *dst, int dst_len, u8 tos, - struct fib_info *fi, unsigned int flags) +int fib_dump_info(struct sk_buff *skb, u32 pid, u32 seq, int event, + u32 tb_id, u8 type, u8 scope, u32 dst, int dst_len, u8 tos, + struct fib_info *fi, unsigned int flags) { + struct nlmsghdr *nlh; struct rtmsg *rtm; - struct nlmsghdr *nlh; - unsigned char *b = skb->tail; - nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*rtm), flags); - rtm = NLMSG_DATA(nlh); + nlh = nlmsg_put(skb, pid, seq, event, sizeof(*rtm), flags); + if (nlh == NULL) + return -ENOBUFS; + + rtm = nlmsg_data(nlh); rtm->rtm_family = AF_INET; rtm->rtm_dst_len = dst_len; rtm->rtm_src_len = 0; rtm->rtm_tos = tos; rtm->rtm_table = tb_id; - RTA_PUT_U32(skb, RTA_TABLE, tb_id); + NLA_PUT_U32(skb, RTA_TABLE, tb_id); rtm->rtm_type = type; rtm->rtm_flags = fi->fib_flags; rtm->rtm_scope = scope; - if (rtm->rtm_dst_len) - RTA_PUT(skb, RTA_DST, 4, dst); rtm->rtm_protocol = fi->fib_protocol; + + if (rtm->rtm_dst_len) + NLA_PUT_U32(skb, RTA_DST, dst); + if (fi->fib_priority) - RTA_PUT(skb, RTA_PRIORITY, 4, &fi->fib_priority); + NLA_PUT_U32(skb, RTA_PRIORITY, fi->fib_priority); + if (rtnetlink_put_metrics(skb, fi->fib_metrics) < 0) - goto rtattr_failure; + goto nla_put_failure; + if (fi->fib_prefsrc) - RTA_PUT(skb, RTA_PREFSRC, 4, &fi->fib_prefsrc); + NLA_PUT_U32(skb, RTA_PREFSRC, fi->fib_prefsrc); + if (fi->fib_nhs == 1) { if (fi->fib_nh->nh_gw) - RTA_PUT(skb, RTA_GATEWAY, 4, &fi->fib_nh->nh_gw); + NLA_PUT_U32(skb, RTA_GATEWAY, fi->fib_nh->nh_gw); + if (fi->fib_nh->nh_oif) - RTA_PUT(skb, RTA_OIF, sizeof(int), &fi->fib_nh->nh_oif); + NLA_PUT_U32(skb, RTA_OIF, fi->fib_nh->nh_oif); #ifdef CONFIG_NET_CLS_ROUTE if (fi->fib_nh[0].nh_tclassid) - RTA_PUT(skb, RTA_FLOW, 4, &fi->fib_nh[0].nh_tclassid); + NLA_PUT_U32(skb, RTA_FLOW, fi->fib_nh[0].nh_tclassid); #endif } #ifdef CONFIG_IP_ROUTE_MULTIPATH if (fi->fib_nhs > 1) { - struct rtnexthop *nhp; - struct rtattr *mp_head; - if (skb_tailroom(skb) <= RTA_SPACE(0)) - goto rtattr_failure; - mp_head = (struct rtattr*)skb_put(skb, RTA_SPACE(0)); + struct rtnexthop *rtnh; + struct nlattr *mp; + + mp = nla_nest_start(skb, RTA_MULTIPATH); + if (mp == NULL) + goto nla_put_failure; for_nexthops(fi) { - if (skb_tailroom(skb) < RTA_ALIGN(RTA_ALIGN(sizeof(*nhp)) + 4)) - goto rtattr_failure; - nhp = (struct rtnexthop*)skb_put(skb, RTA_ALIGN(sizeof(*nhp))); - nhp->rtnh_flags = nh->nh_flags & 0xFF; - nhp->rtnh_hops = nh->nh_weight-1; - nhp->rtnh_ifindex = nh->nh_oif; + rtnh = nla_reserve_nohdr(skb, sizeof(*rtnh)); + if (rtnh == NULL) + goto nla_put_failure; + + rtnh->rtnh_flags = nh->nh_flags & 0xFF; + rtnh->rtnh_hops = nh->nh_weight - 1; + rtnh->rtnh_ifindex = nh->nh_oif; + if (nh->nh_gw) - RTA_PUT(skb, RTA_GATEWAY, 4, &nh->nh_gw); + NLA_PUT_U32(skb, RTA_GATEWAY, nh->nh_gw); #ifdef CONFIG_NET_CLS_ROUTE if (nh->nh_tclassid) - RTA_PUT(skb, RTA_FLOW, 4, &nh->nh_tclassid); + NLA_PUT_U32(skb, RTA_FLOW, nh->nh_tclassid); #endif - nhp->rtnh_len = skb->tail - (unsigned char*)nhp; + /* length of rtnetlink header + attributes */ + rtnh->rtnh_len = nlmsg_get_pos(skb) - (void *) rtnh; } endfor_nexthops(fi); - mp_head->rta_type = RTA_MULTIPATH; - mp_head->rta_len = skb->tail - (u8*)mp_head; + + nla_nest_end(skb, mp); } #endif - nlh->nlmsg_len = skb->tail - b; - return skb->len; + return nlmsg_end(skb, nlh); -nlmsg_failure: -rtattr_failure: - skb_trim(skb, b - skb->data); - return -1; +nla_put_failure: + return nlmsg_cancel(skb, nlh); } /* diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c index 41bef0a..9c3ff6b 100644 --- a/net/ipv4/fib_trie.c +++ b/net/ipv4/fib_trie.c @@ -1854,7 +1854,7 @@ static int fn_trie_dump_fa(t_key key, int plen, struct list_head *fah, struct fi tb->tb_id, fa->fa_type, fa->fa_scope, - &xkey, + xkey, plen, fa->fa_tos, fa->fa_info, 0) < 0) { diff --git a/net/ipv4/route.c b/net/ipv4/route.c index b8f6cad..31b6705 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -2639,52 +2639,54 @@ static int rt_fill_info(struct sk_buff *skb, u32 pid, u32 seq, int event, { struct rtable *rt = (struct rtable*)skb->dst; struct rtmsg *r; - struct nlmsghdr *nlh; - unsigned char *b = skb->tail; + struct nlmsghdr *nlh; struct rta_cacheinfo ci; -#ifdef CONFIG_IP_MROUTE - struct rtattr *eptr; -#endif - nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*r), flags); - r = NLMSG_DATA(nlh); + + nlh = nlmsg_put(skb, pid, seq, event, sizeof(*r), flags); + if (nlh == NULL) + return -ENOBUFS; + + r = nlmsg_data(nlh); r->rtm_family = AF_INET; r->rtm_dst_len = 32; r->rtm_src_len = 0; r->rtm_tos = rt->fl.fl4_tos; r->rtm_table = RT_TABLE_MAIN; - RTA_PUT_U32(skb, RTA_TABLE, RT_TABLE_MAIN); + NLA_PUT_U32(skb, RTA_TABLE, RT_TABLE_MAIN); r->rtm_type = rt->rt_type; r->rtm_scope = RT_SCOPE_UNIVERSE; r->rtm_protocol = RTPROT_UNSPEC; r->rtm_flags = (rt->rt_flags & ~0xFFFF) | RTM_F_CLONED; if (rt->rt_flags & RTCF_NOTIFY) r->rtm_flags |= RTM_F_NOTIFY; - RTA_PUT(skb, RTA_DST, 4, &rt->rt_dst); + + NLA_PUT_U32(skb, RTA_DST, rt->rt_dst); + if (rt->fl.fl4_src) { r->rtm_src_len = 32; - RTA_PUT(skb, RTA_SRC, 4, &rt->fl.fl4_src); + NLA_PUT_U32(skb, RTA_SRC, rt->fl.fl4_src); } if (rt->u.dst.dev) - RTA_PUT(skb, RTA_OIF, sizeof(int), &rt->u.dst.dev->ifindex); + NLA_PUT_U32(skb, RTA_OIF, rt->u.dst.dev->ifindex); #ifdef CONFIG_NET_CLS_ROUTE if (rt->u.dst.tclassid) - RTA_PUT(skb, RTA_FLOW, 4, &rt->u.dst.tclassid); + NLA_PUT_U32(skb, RTA_FLOW, rt->u.dst.tclassid); #endif #ifdef CONFIG_IP_ROUTE_MULTIPATH_CACHED - if (rt->rt_multipath_alg != IP_MP_ALG_NONE) { - __u32 alg = rt->rt_multipath_alg; - - RTA_PUT(skb, RTA_MP_ALGO, 4, &alg); - } + if (rt->rt_multipath_alg != IP_MP_ALG_NONE) + NLA_PUT_U32(skb, RTA_MP_ALGO, rt->rt_multipath_alg); #endif if (rt->fl.iif) - RTA_PUT(skb, RTA_PREFSRC, 4, &rt->rt_spec_dst); + NLA_PUT_U32(skb, RTA_PREFSRC, rt->rt_spec_dst); else if (rt->rt_src != rt->fl.fl4_src) - RTA_PUT(skb, RTA_PREFSRC, 4, &rt->rt_src); + NLA_PUT_U32(skb, RTA_PREFSRC, rt->rt_src); + if (rt->rt_dst != rt->rt_gateway) - RTA_PUT(skb, RTA_GATEWAY, 4, &rt->rt_gateway); + NLA_PUT_U32(skb, RTA_GATEWAY, rt->rt_gateway); + if (rtnetlink_put_metrics(skb, rt->u.dst.metrics) < 0) - goto rtattr_failure; + goto nla_put_failure; + ci.rta_lastuse = jiffies_to_clock_t(jiffies - rt->u.dst.lastuse); ci.rta_used = rt->u.dst.__use; ci.rta_clntref = atomic_read(&rt->u.dst.__refcnt); @@ -2701,10 +2703,7 @@ static int rt_fill_info(struct sk_buff *skb, u32 pid, u32 seq, int event, ci.rta_tsage = xtime.tv_sec - rt->peer->tcp_ts_stamp; } } -#ifdef CONFIG_IP_MROUTE - eptr = (struct rtattr*)skb->tail; -#endif - RTA_PUT(skb, RTA_CACHEINFO, sizeof(ci), &ci); + if (rt->fl.iif) { #ifdef CONFIG_IP_MROUTE u32 dst = rt->rt_dst; @@ -2716,25 +2715,24 @@ static int rt_fill_info(struct sk_buff *skb, u32 pid, u32 seq, int event, if (!nowait) { if (err == 0) return 0; - goto nlmsg_failure; + goto nla_put_failure; } else { if (err == -EMSGSIZE) - goto nlmsg_failure; - ((struct rta_cacheinfo*)RTA_DATA(eptr))->rta_error = err; + goto nla_put_failure; + ci.rta_error = err; } } } else #endif - RTA_PUT(skb, RTA_IIF, sizeof(int), &rt->fl.iif); + NLA_PUT_U32(skb, RTA_IIF, rt->fl.iif); } - nlh->nlmsg_len = skb->tail - b; - return skb->len; + NLA_PUT(skb, RTA_CACHEINFO, sizeof(ci), &ci); + + return nlmsg_end(skb, nlh); -nlmsg_failure: -rtattr_failure: - skb_trim(skb, b - skb->data); - return -1; +nla_put_failure: + return nlmsg_cancel(skb, nlh); } int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr* nlh, void *arg) -- cgit v0.10.2 From d889ce3b29e55b91257964b4c9aac70b91fedd91 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Thu, 17 Aug 2006 18:15:44 -0700 Subject: [IPv4]: Convert route get to new netlink api Fixes various unvalidated netlink attributes causing memory corruptions when left empty by userspace applications. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h index 42ed96f..fcc159a 100644 --- a/include/net/ip_fib.h +++ b/include/net/ip_fib.h @@ -216,6 +216,7 @@ extern void fib_select_default(const struct flowi *flp, struct fib_result *res); #endif /* CONFIG_IP_MULTIPLE_TABLES */ /* Exported by fib_frontend.c */ +extern struct nla_policy rtm_ipv4_policy[]; extern void ip_fib_init(void); extern int inet_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); extern int inet_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg); diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index d537c93..d0abeab 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -453,7 +453,7 @@ int ip_rt_ioctl(unsigned int cmd, void *arg) #endif -static struct nla_policy rtm_ipv4_policy[RTA_MAX+1] __read_mostly = { +struct nla_policy rtm_ipv4_policy[RTA_MAX+1] __read_mostly = { [RTA_DST] = { .type = NLA_U32 }, [RTA_SRC] = { .type = NLA_U32 }, [RTA_IIF] = { .type = NLA_U32 }, diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 31b6705..a4d4cb8 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -2737,18 +2737,24 @@ nla_put_failure: int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr* nlh, void *arg) { - struct rtattr **rta = arg; - struct rtmsg *rtm = NLMSG_DATA(nlh); + struct rtmsg *rtm; + struct nlattr *tb[RTA_MAX+1]; struct rtable *rt = NULL; - u32 dst = 0; - u32 src = 0; - int iif = 0; - int err = -ENOBUFS; + u32 dst, src, iif; + int err; struct sk_buff *skb; + err = nlmsg_parse(nlh, sizeof(*rtm), tb, RTA_MAX, rtm_ipv4_policy); + if (err < 0) + goto errout; + + rtm = nlmsg_data(nlh); + skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL); - if (!skb) - goto out; + if (skb == NULL) { + err = -ENOBUFS; + goto errout; + } /* Reserve room for dummy headers, this skb can pass through good chunk of routing engine. @@ -2759,61 +2765,61 @@ int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr* nlh, void *arg) skb->nh.iph->protocol = IPPROTO_ICMP; skb_reserve(skb, MAX_HEADER + sizeof(struct iphdr)); - if (rta[RTA_SRC - 1]) - memcpy(&src, RTA_DATA(rta[RTA_SRC - 1]), 4); - if (rta[RTA_DST - 1]) - memcpy(&dst, RTA_DATA(rta[RTA_DST - 1]), 4); - if (rta[RTA_IIF - 1]) - memcpy(&iif, RTA_DATA(rta[RTA_IIF - 1]), sizeof(int)); + src = tb[RTA_SRC] ? nla_get_u32(tb[RTA_SRC]) : 0; + dst = tb[RTA_DST] ? nla_get_u32(tb[RTA_DST]) : 0; + iif = tb[RTA_IIF] ? nla_get_u32(tb[RTA_IIF]) : 0; if (iif) { - struct net_device *dev = __dev_get_by_index(iif); - err = -ENODEV; - if (!dev) - goto out_free; + struct net_device *dev; + + dev = __dev_get_by_index(iif); + if (dev == NULL) { + err = -ENODEV; + goto errout_free; + } + skb->protocol = htons(ETH_P_IP); skb->dev = dev; local_bh_disable(); err = ip_route_input(skb, dst, src, rtm->rtm_tos, dev); local_bh_enable(); - rt = (struct rtable*)skb->dst; - if (!err && rt->u.dst.error) + + rt = (struct rtable*) skb->dst; + if (err == 0 && rt->u.dst.error) err = -rt->u.dst.error; } else { - struct flowi fl = { .nl_u = { .ip4_u = { .daddr = dst, - .saddr = src, - .tos = rtm->rtm_tos } } }; - int oif = 0; - if (rta[RTA_OIF - 1]) - memcpy(&oif, RTA_DATA(rta[RTA_OIF - 1]), sizeof(int)); - fl.oif = oif; + struct flowi fl = { + .nl_u = { + .ip4_u = { + .daddr = dst, + .saddr = src, + .tos = rtm->rtm_tos, + }, + }, + .oif = tb[RTA_OIF] ? nla_get_u32(tb[RTA_OIF]) : 0, + }; err = ip_route_output_key(&rt, &fl); } + if (err) - goto out_free; + goto errout_free; skb->dst = &rt->u.dst; if (rtm->rtm_flags & RTM_F_NOTIFY) rt->rt_flags |= RTCF_NOTIFY; - NETLINK_CB(skb).dst_pid = NETLINK_CB(in_skb).pid; - err = rt_fill_info(skb, NETLINK_CB(in_skb).pid, nlh->nlmsg_seq, RTM_NEWROUTE, 0, 0); - if (!err) - goto out_free; - if (err < 0) { - err = -EMSGSIZE; - goto out_free; - } + if (err <= 0) + goto errout_free; err = rtnl_unicast(skb, NETLINK_CB(in_skb).pid); -out: +errout: return err; -out_free: +errout_free: kfree_skb(skb); - goto out; + goto errout; } int ip_rt_dump(struct sk_buff *skb, struct netlink_callback *cb) -- cgit v0.10.2 From e92b43a3455d3e817c13481bb3ea3cd29d0a47f4 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Thu, 17 Aug 2006 18:17:37 -0700 Subject: [NET] neighbour: reduce exports There are several symbols only used by rtnetlink and since it can not be a module, there is no reason to export them. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/net/core/neighbour.c b/net/core/neighbour.c index c7e653f..c0a2740 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -889,7 +889,7 @@ out_unlock_bh: return rc; } -static __inline__ void neigh_update_hhs(struct neighbour *neigh) +static void neigh_update_hhs(struct neighbour *neigh) { struct hh_cache *hh; void (*update)(struct hh_cache*, struct net_device*, unsigned char *) = @@ -2724,7 +2724,6 @@ void neigh_sysctl_unregister(struct neigh_parms *p) #endif /* CONFIG_SYSCTL */ EXPORT_SYMBOL(__neigh_event_send); -EXPORT_SYMBOL(neigh_add); EXPORT_SYMBOL(neigh_changeaddr); EXPORT_SYMBOL(neigh_compat_output); EXPORT_SYMBOL(neigh_connected_output); @@ -2744,11 +2743,8 @@ EXPORT_SYMBOL(neigh_table_clear); EXPORT_SYMBOL(neigh_table_init); EXPORT_SYMBOL(neigh_table_init_no_netlink); EXPORT_SYMBOL(neigh_update); -EXPORT_SYMBOL(neigh_update_hhs); EXPORT_SYMBOL(pneigh_enqueue); EXPORT_SYMBOL(pneigh_lookup); -EXPORT_SYMBOL(neightbl_dump_info); -EXPORT_SYMBOL(neightbl_set); #ifdef CONFIG_ARPD EXPORT_SYMBOL(neigh_app_ns); -- cgit v0.10.2 From d3e01f71863da30a2d6bfca069a036168b6c8607 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Thu, 17 Aug 2006 18:18:53 -0700 Subject: [ETH]: docbook comments Add docbook style comments to ethernet support. Signed-off-by: Stephen Hemminger Acked-by: Randy Dunlap Signed-off-by: David S. Miller diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c index 387c71c5..72bdb15 100644 --- a/net/ethernet/eth.c +++ b/net/ethernet/eth.c @@ -64,23 +64,24 @@ __setup("ether=", netdev_boot_setup); -/* - * Create the Ethernet MAC header for an arbitrary protocol layer +/** + * eth_header - create the Ethernet header + * @skb: buffer to alter + * @dev: source device + * @type: Ethernet type field + * @daddr: destination address (NULL leave destination address) + * @saddr: source address (NULL use device source address) + * @len: packet length (<= skb->len) * - * saddr=NULL means use device source address - * daddr=NULL means leave destination address (eg unresolved arp) + * + * Set the protocol type. For a packet of type ETH_P_802_3 we put the length + * in here instead. It is up to the 802.2 layer to carry protocol information. */ - int eth_header(struct sk_buff *skb, struct net_device *dev, unsigned short type, void *daddr, void *saddr, unsigned len) { struct ethhdr *eth = (struct ethhdr *)skb_push(skb,ETH_HLEN); - /* - * Set the protocol type. For a packet of type ETH_P_802_3 we put the length - * in here instead. It is up to the 802.2 layer to carry protocol information. - */ - if(type!=ETH_P_802_3) eth->h_proto = htons(type); else @@ -113,16 +114,16 @@ int eth_header(struct sk_buff *skb, struct net_device *dev, unsigned short type, return -ETH_HLEN; } - -/* - * Rebuild the Ethernet MAC header. This is called after an ARP - * (or in future other address resolution) has completed on this - * sk_buff. We now let ARP fill in the other fields. +/** + * eth_rebuild_header- rebuild the Ethernet MAC header. + * @skb: socket buffer to update + * + * This is called after an ARP or IPV6 ndisc it's resolution on this + * sk_buff. We now let protocol (ARP) fill in the other fields. * - * This routine CANNOT use cached dst->neigh! - * Really, it is used only when dst->neigh is wrong. + * This routine CANNOT use cached dst->neigh! + * Really, it is used only when dst->neigh is wrong. */ - int eth_rebuild_header(struct sk_buff *skb) { struct ethhdr *eth = (struct ethhdr *)skb->data; @@ -147,12 +148,15 @@ int eth_rebuild_header(struct sk_buff *skb) } -/* - * Determine the packet's protocol ID. The rule here is that we - * assume 802.3 if the type field is short enough to be a length. - * This is normal practice and works for any 'now in use' protocol. +/** + * eth_type_trans - determine the packet's protocol ID. + * @skb: received socket data + * @dev: receiving network device + * + * The rule here is that we + * assume 802.3 if the type field is short enough to be a length. + * This is normal practice and works for any 'now in use' protocol. */ - __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev) { struct ethhdr *eth; @@ -202,6 +206,11 @@ __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev) return htons(ETH_P_802_2); } +/** + * eth_header_parse - extract hardware address from packet + * @skb: packet to extract header from + * @haddr: destination buffer + */ static int eth_header_parse(struct sk_buff *skb, unsigned char *haddr) { struct ethhdr *eth = eth_hdr(skb); @@ -209,6 +218,12 @@ static int eth_header_parse(struct sk_buff *skb, unsigned char *haddr) return ETH_ALEN; } +/** + * eth_header_cache - fill cache entry from neighbour + * @neigh: source neighbour + * @hh: destination cache entry + * Create an Ethernet header template from the neighbour. + */ int eth_header_cache(struct neighbour *neigh, struct hh_cache *hh) { unsigned short type = hh->hh_type; @@ -228,10 +243,14 @@ int eth_header_cache(struct neighbour *neigh, struct hh_cache *hh) return 0; } -/* +/** + * eth_header_cache_update - update cache entry + * @hh: destination cache entry + * @dev: network device + * @haddr: new hardware address + * * Called by Address Resolution module to notify changes in address. */ - void eth_header_cache_update(struct hh_cache *hh, struct net_device *dev, unsigned char * haddr) { memcpy(((u8*)hh->hh_data) + HH_DATA_OFF(sizeof(struct ethhdr)), @@ -240,6 +259,15 @@ void eth_header_cache_update(struct hh_cache *hh, struct net_device *dev, unsign EXPORT_SYMBOL(eth_type_trans); +/** + * eth_mac_addr - set new Ethernet hardware address + * @dev: network device + * @p: socket address + * Change hardware address of device. + * + * This doesn't change hardware matching, so needs to be overridden + * for most real devices. + */ static int eth_mac_addr(struct net_device *dev, void *p) { struct sockaddr *addr=p; @@ -249,6 +277,14 @@ static int eth_mac_addr(struct net_device *dev, void *p) return 0; } +/** + * eth_change_mtu - set new MTU size + * @dev: network device + * @new_mtu: new Maximum Transfer Unit + * + * Allow changing MTU size. Needs to be overridden for devices + * supporting jumbo frames. + */ static int eth_change_mtu(struct net_device *dev, int new_mtu) { if (new_mtu < 68 || new_mtu > ETH_DATA_LEN) @@ -257,8 +293,10 @@ static int eth_change_mtu(struct net_device *dev, int new_mtu) return 0; } -/* - * Fill in the fields of the device structure with ethernet-generic values. +/** + * ether_setup - setup Ethernet network device + * @dev: network device + * Fill in the fields of the device structure with Ethernet-generic values. */ void ether_setup(struct net_device *dev) { @@ -283,15 +321,15 @@ void ether_setup(struct net_device *dev) EXPORT_SYMBOL(ether_setup); /** - * alloc_etherdev - Allocates and sets up an ethernet device + * alloc_etherdev - Allocates and sets up an Ethernet device * @sizeof_priv: Size of additional driver-private structure to be allocated - * for this ethernet device + * for this Ethernet device * - * Fill in the fields of the device structure with ethernet-generic + * Fill in the fields of the device structure with Ethernet-generic * values. Basically does everything except registering the device. * * Constructs a new net device, complete with a private data area of - * size @sizeof_priv. A 32-byte (not bit) alignment is enforced for + * size (sizeof_priv). A 32-byte (not bit) alignment is enforced for * this private data area. */ -- cgit v0.10.2 From 2e4ca75b31b6851dcc036c2cdebf3ecfe279a653 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Thu, 17 Aug 2006 18:20:18 -0700 Subject: [ETH]: indentation and cleanup Run ethernet support through Lindent and fix up. Applies after docbook comments patch Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/net/ethernet/eth.c b/net/ethernet/eth.c index 72bdb15..4386393 100644 --- a/net/ethernet/eth.c +++ b/net/ethernet/eth.c @@ -78,39 +78,37 @@ __setup("ether=", netdev_boot_setup); * in here instead. It is up to the 802.2 layer to carry protocol information. */ int eth_header(struct sk_buff *skb, struct net_device *dev, unsigned short type, - void *daddr, void *saddr, unsigned len) + void *daddr, void *saddr, unsigned len) { - struct ethhdr *eth = (struct ethhdr *)skb_push(skb,ETH_HLEN); + struct ethhdr *eth = (struct ethhdr *)skb_push(skb, ETH_HLEN); - if(type!=ETH_P_802_3) + if (type != ETH_P_802_3) eth->h_proto = htons(type); else eth->h_proto = htons(len); /* - * Set the source hardware address. + * Set the source hardware address. */ - - if(!saddr) + + if (!saddr) saddr = dev->dev_addr; - memcpy(eth->h_source,saddr,dev->addr_len); + memcpy(eth->h_source, saddr, dev->addr_len); - if(daddr) - { - memcpy(eth->h_dest,daddr,dev->addr_len); + if (daddr) { + memcpy(eth->h_dest, daddr, dev->addr_len); return ETH_HLEN; } - + /* - * Anyway, the loopback-device should never use this function... + * Anyway, the loopback-device should never use this function... */ - if (dev->flags & (IFF_LOOPBACK|IFF_NOARP)) - { + if (dev->flags & (IFF_LOOPBACK | IFF_NOARP)) { memset(eth->h_dest, 0, dev->addr_len); return ETH_HLEN; } - + return -ETH_HLEN; } @@ -129,17 +127,16 @@ int eth_rebuild_header(struct sk_buff *skb) struct ethhdr *eth = (struct ethhdr *)skb->data; struct net_device *dev = skb->dev; - switch (eth->h_proto) - { + switch (eth->h_proto) { #ifdef CONFIG_INET case __constant_htons(ETH_P_IP): - return arp_find(eth->h_dest, skb); -#endif + return arp_find(eth->h_dest, skb); +#endif default: printk(KERN_DEBUG - "%s: unable to resolve type %X addresses.\n", + "%s: unable to resolve type %X addresses.\n", dev->name, (int)eth->h_proto); - + memcpy(eth->h_source, dev->dev_addr, dev->addr_len); break; } @@ -147,7 +144,6 @@ int eth_rebuild_header(struct sk_buff *skb) return 0; } - /** * eth_type_trans - determine the packet's protocol ID. * @skb: received socket data @@ -161,50 +157,51 @@ __be16 eth_type_trans(struct sk_buff *skb, struct net_device *dev) { struct ethhdr *eth; unsigned char *rawp; - + skb->mac.raw = skb->data; - skb_pull(skb,ETH_HLEN); + skb_pull(skb, ETH_HLEN); eth = eth_hdr(skb); - + if (is_multicast_ether_addr(eth->h_dest)) { if (!compare_ether_addr(eth->h_dest, dev->broadcast)) skb->pkt_type = PACKET_BROADCAST; else skb->pkt_type = PACKET_MULTICAST; } - + /* - * This ALLMULTI check should be redundant by 1.4 - * so don't forget to remove it. + * This ALLMULTI check should be redundant by 1.4 + * so don't forget to remove it. * - * Seems, you forgot to remove it. All silly devices - * seems to set IFF_PROMISC. + * Seems, you forgot to remove it. All silly devices + * seems to set IFF_PROMISC. */ - - else if(1 /*dev->flags&IFF_PROMISC*/) { + + else if (1 /*dev->flags&IFF_PROMISC */ ) { if (unlikely(compare_ether_addr(eth->h_dest, dev->dev_addr))) skb->pkt_type = PACKET_OTHERHOST; } - + if (ntohs(eth->h_proto) >= 1536) return eth->h_proto; - + rawp = skb->data; - + /* - * This is a magic hack to spot IPX packets. Older Novell breaks - * the protocol design and runs IPX over 802.3 without an 802.2 LLC - * layer. We look for FFFF which isn't a used 802.2 SSAP/DSAP. This - * won't work for fault tolerant netware but does for the rest. + * This is a magic hack to spot IPX packets. Older Novell breaks + * the protocol design and runs IPX over 802.3 without an 802.2 LLC + * layer. We look for FFFF which isn't a used 802.2 SSAP/DSAP. This + * won't work for fault tolerant netware but does for the rest. */ if (*(unsigned short *)rawp == 0xFFFF) return htons(ETH_P_802_3); - + /* - * Real 802.2 LLC + * Real 802.2 LLC */ return htons(ETH_P_802_2); } +EXPORT_SYMBOL(eth_type_trans); /** * eth_header_parse - extract hardware address from packet @@ -230,8 +227,8 @@ int eth_header_cache(struct neighbour *neigh, struct hh_cache *hh) struct ethhdr *eth; struct net_device *dev = neigh->dev; - eth = (struct ethhdr*) - (((u8*)hh->hh_data) + (HH_DATA_OFF(sizeof(*eth)))); + eth = (struct ethhdr *) + (((u8 *) hh->hh_data) + (HH_DATA_OFF(sizeof(*eth)))); if (type == __constant_htons(ETH_P_802_3)) return -1; @@ -251,14 +248,13 @@ int eth_header_cache(struct neighbour *neigh, struct hh_cache *hh) * * Called by Address Resolution module to notify changes in address. */ -void eth_header_cache_update(struct hh_cache *hh, struct net_device *dev, unsigned char * haddr) +void eth_header_cache_update(struct hh_cache *hh, struct net_device *dev, + unsigned char *haddr) { - memcpy(((u8*)hh->hh_data) + HH_DATA_OFF(sizeof(struct ethhdr)), + memcpy(((u8 *) hh->hh_data) + HH_DATA_OFF(sizeof(struct ethhdr)), haddr, dev->addr_len); } -EXPORT_SYMBOL(eth_type_trans); - /** * eth_mac_addr - set new Ethernet hardware address * @dev: network device @@ -270,10 +266,10 @@ EXPORT_SYMBOL(eth_type_trans); */ static int eth_mac_addr(struct net_device *dev, void *p) { - struct sockaddr *addr=p; + struct sockaddr *addr = p; if (netif_running(dev)) return -EBUSY; - memcpy(dev->dev_addr, addr->sa_data,dev->addr_len); + memcpy(dev->dev_addr, addr->sa_data, dev->addr_len); return 0; } @@ -315,7 +311,7 @@ void ether_setup(struct net_device *dev) dev->tx_queue_len = 1000; /* Ethernet wants good queues */ dev->flags = IFF_BROADCAST|IFF_MULTICAST; - memset(dev->broadcast,0xFF, ETH_ALEN); + memset(dev->broadcast, 0xFF, ETH_ALEN); } EXPORT_SYMBOL(ether_setup); -- cgit v0.10.2 From e9ce1cd3cf6cf35b21d0ce990f2e738f35907386 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Mon, 21 Aug 2006 23:54:55 -0700 Subject: [PKT_SCHED]: Kill pkt_act.h inlining. This was simply making templates of functions and mostly causing a lot of code duplication in the classifier action modules. We solve this more cleanly by having a common "struct tcf_common" that hash worker functions contained once in act_api.c can work with. Callers work with real action objects that have the common struct plus their module specific struct members. You go from a common object to the higher level one using a "to_foo()" macro which makes use of container_of() to do the dirty work. This also kills off act_generic.h which was only used by act_simple.c and keeping it around was more work than the it's value. Signed-off-by: David S. Miller diff --git a/include/net/act_api.h b/include/net/act_api.h index 11e9eaf..8b06c2f 100644 --- a/include/net/act_api.h +++ b/include/net/act_api.h @@ -8,70 +8,110 @@ #include #include -#define tca_gen(name) \ -struct tcf_##name *next; \ - u32 index; \ - int refcnt; \ - int bindcnt; \ - u32 capab; \ - int action; \ - struct tcf_t tm; \ - struct gnet_stats_basic bstats; \ - struct gnet_stats_queue qstats; \ - struct gnet_stats_rate_est rate_est; \ - spinlock_t *stats_lock; \ - spinlock_t lock - -struct tcf_police -{ - tca_gen(police); - int result; - u32 ewma_rate; - u32 burst; - u32 mtu; - u32 toks; - u32 ptoks; - psched_time_t t_c; - struct qdisc_rate_table *R_tab; - struct qdisc_rate_table *P_tab; +struct tcf_common { + struct tcf_common *tcfc_next; + u32 tcfc_index; + int tcfc_refcnt; + int tcfc_bindcnt; + u32 tcfc_capab; + int tcfc_action; + struct tcf_t tcfc_tm; + struct gnet_stats_basic tcfc_bstats; + struct gnet_stats_queue tcfc_qstats; + struct gnet_stats_rate_est tcfc_rate_est; + spinlock_t *tcfc_stats_lock; + spinlock_t tcfc_lock; +}; +#define tcf_next common.tcfc_next +#define tcf_index common.tcfc_index +#define tcf_refcnt common.tcfc_refcnt +#define tcf_bindcnt common.tcfc_bindcnt +#define tcf_capab common.tcfc_capab +#define tcf_action common.tcfc_action +#define tcf_tm common.tcfc_tm +#define tcf_bstats common.tcfc_bstats +#define tcf_qstats common.tcfc_qstats +#define tcf_rate_est common.tcfc_rate_est +#define tcf_stats_lock common.tcfc_stats_lock +#define tcf_lock common.tcfc_lock + +struct tcf_police { + struct tcf_common common; + int tcfp_result; + u32 tcfp_ewma_rate; + u32 tcfp_burst; + u32 tcfp_mtu; + u32 tcfp_toks; + u32 tcfp_ptoks; + psched_time_t tcfp_t_c; + struct qdisc_rate_table *tcfp_R_tab; + struct qdisc_rate_table *tcfp_P_tab; }; +#define to_police(pc) \ + container_of(pc, struct tcf_police, common) + +struct tcf_hashinfo { + struct tcf_common **htab; + unsigned int hmask; + rwlock_t *lock; +}; + +static inline unsigned int tcf_hash(u32 index, unsigned int hmask) +{ + return index & hmask; +} #ifdef CONFIG_NET_CLS_ACT #define ACT_P_CREATED 1 #define ACT_P_DELETED 1 -struct tcf_act_hdr -{ - tca_gen(act_hdr); +struct tcf_act_hdr { + struct tcf_common common; }; -struct tc_action -{ - void *priv; - struct tc_action_ops *ops; - __u32 type; /* for backward compat(TCA_OLD_COMPAT) */ - __u32 order; - struct tc_action *next; +struct tc_action { + void *priv; + struct tc_action_ops *ops; + __u32 type; /* for backward compat(TCA_OLD_COMPAT) */ + __u32 order; + struct tc_action *next; }; #define TCA_CAP_NONE 0 -struct tc_action_ops -{ +struct tc_action_ops { struct tc_action_ops *next; + struct tcf_hashinfo *hinfo; char kind[IFNAMSIZ]; __u32 type; /* TBD to match kind */ __u32 capab; /* capabilities includes 4 bit version */ struct module *owner; int (*act)(struct sk_buff *, struct tc_action *, struct tcf_result *); int (*get_stats)(struct sk_buff *, struct tc_action *); - int (*dump)(struct sk_buff *, struct tc_action *,int , int); + int (*dump)(struct sk_buff *, struct tc_action *, int, int); int (*cleanup)(struct tc_action *, int bind); - int (*lookup)(struct tc_action *, u32 ); - int (*init)(struct rtattr *,struct rtattr *,struct tc_action *, int , int ); - int (*walk)(struct sk_buff *, struct netlink_callback *, int , struct tc_action *); + int (*lookup)(struct tc_action *, u32); + int (*init)(struct rtattr *, struct rtattr *, struct tc_action *, int , int); + int (*walk)(struct sk_buff *, struct netlink_callback *, int, struct tc_action *); }; +extern struct tcf_common *tcf_hash_lookup(u32 index, + struct tcf_hashinfo *hinfo); +extern void tcf_hash_destroy(struct tcf_common *p, struct tcf_hashinfo *hinfo); +extern int tcf_hash_release(struct tcf_common *p, int bind, + struct tcf_hashinfo *hinfo); +extern int tcf_generic_walker(struct sk_buff *skb, struct netlink_callback *cb, + int type, struct tc_action *a); +extern u32 tcf_hash_new_index(u32 *idx_gen, struct tcf_hashinfo *hinfo); +extern int tcf_hash_search(struct tc_action *a, u32 index); +extern struct tcf_common *tcf_hash_check(u32 index, struct tc_action *a, + int bind, struct tcf_hashinfo *hinfo); +extern struct tcf_common *tcf_hash_create(u32 index, struct rtattr *est, + struct tc_action *a, int size, + int bind, u32 *idx_gen, + struct tcf_hashinfo *hinfo); +extern void tcf_hash_insert(struct tcf_common *p, struct tcf_hashinfo *hinfo); + extern int tcf_register_action(struct tc_action_ops *a); extern int tcf_unregister_action(struct tc_action_ops *a); extern void tcf_action_destroy(struct tc_action *a, int bind); @@ -96,17 +136,17 @@ tcf_police_release(struct tcf_police *p, int bind) int ret = 0; #ifdef CONFIG_NET_CLS_ACT if (p) { - if (bind) { - p->bindcnt--; - } - p->refcnt--; - if (p->refcnt <= 0 && !p->bindcnt) { + if (bind) + p->tcf_bindcnt--; + + p->tcf_refcnt--; + if (p->tcf_refcnt <= 0 && !p->tcf_bindcnt) { tcf_police_destroy(p); ret = 1; } } #else - if (p && --p->refcnt == 0) + if (p && --p->tcf_refcnt == 0) tcf_police_destroy(p); #endif /* CONFIG_NET_CLS_ACT */ diff --git a/include/net/act_generic.h b/include/net/act_generic.h deleted file mode 100644 index c9daa7e..0000000 --- a/include/net/act_generic.h +++ /dev/null @@ -1,142 +0,0 @@ -/* - * include/net/act_generic.h - * -*/ -#ifndef _NET_ACT_GENERIC_H -#define _NET_ACT_GENERIC_H -static inline int tcf_defact_release(struct tcf_defact *p, int bind) -{ - int ret = 0; - if (p) { - if (bind) { - p->bindcnt--; - } - p->refcnt--; - if (p->bindcnt <= 0 && p->refcnt <= 0) { - kfree(p->defdata); - tcf_hash_destroy(p); - ret = 1; - } - } - return ret; -} - -static inline int -alloc_defdata(struct tcf_defact *p, u32 datalen, void *defdata) -{ - p->defdata = kmalloc(datalen, GFP_KERNEL); - if (p->defdata == NULL) - return -ENOMEM; - p->datalen = datalen; - memcpy(p->defdata, defdata, datalen); - return 0; -} - -static inline int -realloc_defdata(struct tcf_defact *p, u32 datalen, void *defdata) -{ - /* safer to be just brute force for now */ - kfree(p->defdata); - return alloc_defdata(p, datalen, defdata); -} - -static inline int -tcf_defact_init(struct rtattr *rta, struct rtattr *est, - struct tc_action *a, int ovr, int bind) -{ - struct rtattr *tb[TCA_DEF_MAX]; - struct tc_defact *parm; - struct tcf_defact *p; - void *defdata; - u32 datalen = 0; - int ret = 0; - - if (rta == NULL || rtattr_parse_nested(tb, TCA_DEF_MAX, rta) < 0) - return -EINVAL; - - if (tb[TCA_DEF_PARMS - 1] == NULL || - RTA_PAYLOAD(tb[TCA_DEF_PARMS - 1]) < sizeof(*parm)) - return -EINVAL; - - parm = RTA_DATA(tb[TCA_DEF_PARMS - 1]); - defdata = RTA_DATA(tb[TCA_DEF_DATA - 1]); - if (defdata == NULL) - return -EINVAL; - - datalen = RTA_PAYLOAD(tb[TCA_DEF_DATA - 1]); - if (datalen <= 0) - return -EINVAL; - - p = tcf_hash_check(parm->index, a, ovr, bind); - if (p == NULL) { - p = tcf_hash_create(parm->index, est, a, sizeof(*p), ovr, bind); - if (p == NULL) - return -ENOMEM; - - ret = alloc_defdata(p, datalen, defdata); - if (ret < 0) { - kfree(p); - return ret; - } - ret = ACT_P_CREATED; - } else { - if (!ovr) { - tcf_defact_release(p, bind); - return -EEXIST; - } - realloc_defdata(p, datalen, defdata); - } - - spin_lock_bh(&p->lock); - p->action = parm->action; - spin_unlock_bh(&p->lock); - if (ret == ACT_P_CREATED) - tcf_hash_insert(p); - return ret; -} - -static inline int tcf_defact_cleanup(struct tc_action *a, int bind) -{ - struct tcf_defact *p = PRIV(a, defact); - - if (p != NULL) - return tcf_defact_release(p, bind); - return 0; -} - -static inline int -tcf_defact_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref) -{ - unsigned char *b = skb->tail; - struct tc_defact opt; - struct tcf_defact *p = PRIV(a, defact); - struct tcf_t t; - - opt.index = p->index; - opt.refcnt = p->refcnt - ref; - opt.bindcnt = p->bindcnt - bind; - opt.action = p->action; - RTA_PUT(skb, TCA_DEF_PARMS, sizeof(opt), &opt); - RTA_PUT(skb, TCA_DEF_DATA, p->datalen, p->defdata); - t.install = jiffies_to_clock_t(jiffies - p->tm.install); - t.lastuse = jiffies_to_clock_t(jiffies - p->tm.lastuse); - t.expires = jiffies_to_clock_t(p->tm.expires); - RTA_PUT(skb, TCA_DEF_TM, sizeof(t), &t); - return skb->len; - -rtattr_failure: - skb_trim(skb, b - skb->data); - return -1; -} - -#define tca_use_default_ops \ - .dump = tcf_defact_dump, \ - .cleanup = tcf_defact_cleanup, \ - .init = tcf_defact_init, \ - .walk = tcf_generic_walker, \ - -#define tca_use_default_defines(name) \ - static u32 idx_gen; \ - static struct tcf_defact *tcf_##name_ht[MY_TAB_SIZE]; \ - static DEFINE_RWLOCK(##name_lock); -#endif /* _NET_ACT_GENERIC_H */ diff --git a/include/net/pkt_act.h b/include/net/pkt_act.h deleted file mode 100644 index cf5e4d2..0000000 --- a/include/net/pkt_act.h +++ /dev/null @@ -1,273 +0,0 @@ -#ifndef __NET_PKT_ACT_H -#define __NET_PKT_ACT_H - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#define tca_st(val) (struct tcf_##val *) -#define PRIV(a,name) ( tca_st(name) (a)->priv) - -#if 0 /* control */ -#define DPRINTK(format,args...) printk(KERN_DEBUG format,##args) -#else -#define DPRINTK(format,args...) -#endif - -#if 0 /* data */ -#define D2PRINTK(format,args...) printk(KERN_DEBUG format,##args) -#else -#define D2PRINTK(format,args...) -#endif - -static __inline__ unsigned -tcf_hash(u32 index) -{ - return index & MY_TAB_MASK; -} - -/* probably move this from being inline - * and put into act_generic -*/ -static inline void -tcf_hash_destroy(struct tcf_st *p) -{ - unsigned h = tcf_hash(p->index); - struct tcf_st **p1p; - - for (p1p = &tcf_ht[h]; *p1p; p1p = &(*p1p)->next) { - if (*p1p == p) { - write_lock_bh(&tcf_t_lock); - *p1p = p->next; - write_unlock_bh(&tcf_t_lock); -#ifdef CONFIG_NET_ESTIMATOR - gen_kill_estimator(&p->bstats, &p->rate_est); -#endif - kfree(p); - return; - } - } - BUG_TRAP(0); -} - -static inline int -tcf_hash_release(struct tcf_st *p, int bind ) -{ - int ret = 0; - if (p) { - if (bind) { - p->bindcnt--; - } - p->refcnt--; - if(p->bindcnt <=0 && p->refcnt <= 0) { - tcf_hash_destroy(p); - ret = 1; - } - } - return ret; -} - -static __inline__ int -tcf_dump_walker(struct sk_buff *skb, struct netlink_callback *cb, - struct tc_action *a) -{ - struct tcf_st *p; - int err =0, index = -1,i= 0, s_i = 0, n_i = 0; - struct rtattr *r ; - - read_lock(&tcf_t_lock); - - s_i = cb->args[0]; - - for (i = 0; i < MY_TAB_SIZE; i++) { - p = tcf_ht[tcf_hash(i)]; - - for (; p; p = p->next) { - index++; - if (index < s_i) - continue; - a->priv = p; - a->order = n_i; - r = (struct rtattr*) skb->tail; - RTA_PUT(skb, a->order, 0, NULL); - err = tcf_action_dump_1(skb, a, 0, 0); - if (0 > err) { - index--; - skb_trim(skb, (u8*)r - skb->data); - goto done; - } - r->rta_len = skb->tail - (u8*)r; - n_i++; - if (n_i >= TCA_ACT_MAX_PRIO) { - goto done; - } - } - } -done: - read_unlock(&tcf_t_lock); - if (n_i) - cb->args[0] += n_i; - return n_i; - -rtattr_failure: - skb_trim(skb, (u8*)r - skb->data); - goto done; -} - -static __inline__ int -tcf_del_walker(struct sk_buff *skb, struct tc_action *a) -{ - struct tcf_st *p, *s_p; - struct rtattr *r ; - int i= 0, n_i = 0; - - r = (struct rtattr*) skb->tail; - RTA_PUT(skb, a->order, 0, NULL); - RTA_PUT(skb, TCA_KIND, IFNAMSIZ, a->ops->kind); - for (i = 0; i < MY_TAB_SIZE; i++) { - p = tcf_ht[tcf_hash(i)]; - - while (p != NULL) { - s_p = p->next; - if (ACT_P_DELETED == tcf_hash_release(p, 0)) { - module_put(a->ops->owner); - } - n_i++; - p = s_p; - } - } - RTA_PUT(skb, TCA_FCNT, 4, &n_i); - r->rta_len = skb->tail - (u8*)r; - - return n_i; -rtattr_failure: - skb_trim(skb, (u8*)r - skb->data); - return -EINVAL; -} - -static __inline__ int -tcf_generic_walker(struct sk_buff *skb, struct netlink_callback *cb, int type, - struct tc_action *a) -{ - if (type == RTM_DELACTION) { - return tcf_del_walker(skb,a); - } else if (type == RTM_GETACTION) { - return tcf_dump_walker(skb,cb,a); - } else { - printk("tcf_generic_walker: unknown action %d\n",type); - return -EINVAL; - } -} - -static __inline__ struct tcf_st * -tcf_hash_lookup(u32 index) -{ - struct tcf_st *p; - - read_lock(&tcf_t_lock); - for (p = tcf_ht[tcf_hash(index)]; p; p = p->next) { - if (p->index == index) - break; - } - read_unlock(&tcf_t_lock); - return p; -} - -static __inline__ u32 -tcf_hash_new_index(void) -{ - do { - if (++idx_gen == 0) - idx_gen = 1; - } while (tcf_hash_lookup(idx_gen)); - - return idx_gen; -} - - -static inline int -tcf_hash_search(struct tc_action *a, u32 index) -{ - struct tcf_st *p = tcf_hash_lookup(index); - - if (p != NULL) { - a->priv = p; - return 1; - } - return 0; -} - -#ifdef CONFIG_NET_ACT_INIT -static inline struct tcf_st * -tcf_hash_check(u32 index, struct tc_action *a, int ovr, int bind) -{ - struct tcf_st *p = NULL; - if (index && (p = tcf_hash_lookup(index)) != NULL) { - if (bind) { - p->bindcnt++; - p->refcnt++; - } - a->priv = p; - } - return p; -} - -static inline struct tcf_st * -tcf_hash_create(u32 index, struct rtattr *est, struct tc_action *a, int size, int ovr, int bind) -{ - struct tcf_st *p = NULL; - - p = kmalloc(size, GFP_KERNEL); - if (p == NULL) - return p; - - memset(p, 0, size); - p->refcnt = 1; - - if (bind) { - p->bindcnt = 1; - } - - spin_lock_init(&p->lock); - p->stats_lock = &p->lock; - p->index = index ? : tcf_hash_new_index(); - p->tm.install = jiffies; - p->tm.lastuse = jiffies; -#ifdef CONFIG_NET_ESTIMATOR - if (est) - gen_new_estimator(&p->bstats, &p->rate_est, p->stats_lock, est); -#endif - a->priv = (void *) p; - return p; -} - -static inline void tcf_hash_insert(struct tcf_st *p) -{ - unsigned h = tcf_hash(p->index); - - write_lock_bh(&tcf_t_lock); - p->next = tcf_ht[h]; - tcf_ht[h] = p; - write_unlock_bh(&tcf_t_lock); -} - -#endif - -#endif diff --git a/include/net/tc_act/tc_defact.h b/include/net/tc_act/tc_defact.h index 463aa671..65f024b 100644 --- a/include/net/tc_act/tc_defact.h +++ b/include/net/tc_act/tc_defact.h @@ -3,11 +3,12 @@ #include -struct tcf_defact -{ - tca_gen(defact); - u32 datalen; - void *defdata; +struct tcf_defact { + struct tcf_common common; + u32 tcfd_datalen; + void *tcfd_defdata; }; +#define to_defact(pc) \ + container_of(pc, struct tcf_defact, common) -#endif +#endif /* __NET_TC_DEF_H */ diff --git a/include/net/tc_act/tc_gact.h b/include/net/tc_act/tc_gact.h index 59f0d96..9e3f676 100644 --- a/include/net/tc_act/tc_gact.h +++ b/include/net/tc_act/tc_gact.h @@ -3,15 +3,15 @@ #include -struct tcf_gact -{ - tca_gen(gact); +struct tcf_gact { + struct tcf_common common; #ifdef CONFIG_GACT_PROB - u16 ptype; - u16 pval; - int paction; + u16 tcfg_ptype; + u16 tcfg_pval; + int tcfg_paction; #endif - }; - -#endif +#define to_gact(pc) \ + container_of(pc, struct tcf_gact, common) + +#endif /* __NET_TC_GACT_H */ diff --git a/include/net/tc_act/tc_ipt.h b/include/net/tc_act/tc_ipt.h index cb37ad0..f7d25df 100644 --- a/include/net/tc_act/tc_ipt.h +++ b/include/net/tc_act/tc_ipt.h @@ -5,12 +5,13 @@ struct xt_entry_target; -struct tcf_ipt -{ - tca_gen(ipt); - u32 hook; - char *tname; - struct xt_entry_target *t; +struct tcf_ipt { + struct tcf_common common; + u32 tcfi_hook; + char *tcfi_tname; + struct xt_entry_target *tcfi_t; }; +#define to_ipt(pc) \ + container_of(pc, struct tcf_ipt, common) -#endif +#endif /* __NET_TC_IPT_H */ diff --git a/include/net/tc_act/tc_mirred.h b/include/net/tc_act/tc_mirred.h index b5c32f6..ceac661 100644 --- a/include/net/tc_act/tc_mirred.h +++ b/include/net/tc_act/tc_mirred.h @@ -3,13 +3,14 @@ #include -struct tcf_mirred -{ - tca_gen(mirred); - int eaction; - int ifindex; - int ok_push; - struct net_device *dev; +struct tcf_mirred { + struct tcf_common common; + int tcfm_eaction; + int tcfm_ifindex; + int tcfm_ok_push; + struct net_device *tcfm_dev; }; +#define to_mirred(pc) \ + container_of(pc, struct tcf_mirred, common) -#endif +#endif /* __NET_TC_MIR_H */ diff --git a/include/net/tc_act/tc_pedit.h b/include/net/tc_act/tc_pedit.h index eb21689..e6f6e15 100644 --- a/include/net/tc_act/tc_pedit.h +++ b/include/net/tc_act/tc_pedit.h @@ -3,12 +3,13 @@ #include -struct tcf_pedit -{ - tca_gen(pedit); - unsigned char nkeys; - unsigned char flags; - struct tc_pedit_key *keys; +struct tcf_pedit { + struct tcf_common common; + unsigned char tcfp_nkeys; + unsigned char tcfp_flags; + struct tc_pedit_key *tcfp_keys; }; +#define to_pedit(pc) \ + container_of(pc, struct tcf_pedit, common) -#endif +#endif /* __NET_TC_PED_H */ diff --git a/net/sched/act_api.c b/net/sched/act_api.c index 6990747..835070e 100644 --- a/net/sched/act_api.c +++ b/net/sched/act_api.c @@ -33,16 +33,230 @@ #include #include -#if 0 /* control */ -#define DPRINTK(format, args...) printk(KERN_DEBUG format, ##args) -#else -#define DPRINTK(format, args...) +void tcf_hash_destroy(struct tcf_common *p, struct tcf_hashinfo *hinfo) +{ + unsigned int h = tcf_hash(p->tcfc_index, hinfo->hmask); + struct tcf_common **p1p; + + for (p1p = &hinfo->htab[h]; *p1p; p1p = &(*p1p)->tcfc_next) { + if (*p1p == p) { + write_lock_bh(hinfo->lock); + *p1p = p->tcfc_next; + write_unlock_bh(hinfo->lock); +#ifdef CONFIG_NET_ESTIMATOR + gen_kill_estimator(&p->tcfc_bstats, + &p->tcfc_rate_est); #endif -#if 0 /* data */ -#define D2PRINTK(format, args...) printk(KERN_DEBUG format, ##args) -#else -#define D2PRINTK(format, args...) + kfree(p); + return; + } + } + BUG_TRAP(0); +} +EXPORT_SYMBOL(tcf_hash_destroy); + +int tcf_hash_release(struct tcf_common *p, int bind, + struct tcf_hashinfo *hinfo) +{ + int ret = 0; + + if (p) { + if (bind) + p->tcfc_bindcnt--; + + p->tcfc_refcnt--; + if (p->tcfc_bindcnt <= 0 && p->tcfc_refcnt <= 0) { + tcf_hash_destroy(p, hinfo); + ret = 1; + } + } + return ret; +} +EXPORT_SYMBOL(tcf_hash_release); + +static int tcf_dump_walker(struct sk_buff *skb, struct netlink_callback *cb, + struct tc_action *a, struct tcf_hashinfo *hinfo) +{ + struct tcf_common *p; + int err = 0, index = -1,i = 0, s_i = 0, n_i = 0; + struct rtattr *r ; + + read_lock(hinfo->lock); + + s_i = cb->args[0]; + + for (i = 0; i < (hinfo->hmask + 1); i++) { + p = hinfo->htab[tcf_hash(i, hinfo->hmask)]; + + for (; p; p = p->tcfc_next) { + index++; + if (index < s_i) + continue; + a->priv = p; + a->order = n_i; + r = (struct rtattr*) skb->tail; + RTA_PUT(skb, a->order, 0, NULL); + err = tcf_action_dump_1(skb, a, 0, 0); + if (err < 0) { + index--; + skb_trim(skb, (u8*)r - skb->data); + goto done; + } + r->rta_len = skb->tail - (u8*)r; + n_i++; + if (n_i >= TCA_ACT_MAX_PRIO) + goto done; + } + } +done: + read_unlock(hinfo->lock); + if (n_i) + cb->args[0] += n_i; + return n_i; + +rtattr_failure: + skb_trim(skb, (u8*)r - skb->data); + goto done; +} + +static int tcf_del_walker(struct sk_buff *skb, struct tc_action *a, + struct tcf_hashinfo *hinfo) +{ + struct tcf_common *p, *s_p; + struct rtattr *r ; + int i= 0, n_i = 0; + + r = (struct rtattr*) skb->tail; + RTA_PUT(skb, a->order, 0, NULL); + RTA_PUT(skb, TCA_KIND, IFNAMSIZ, a->ops->kind); + for (i = 0; i < (hinfo->hmask + 1); i++) { + p = hinfo->htab[tcf_hash(i, hinfo->hmask)]; + + while (p != NULL) { + s_p = p->tcfc_next; + if (ACT_P_DELETED == tcf_hash_release(p, 0, hinfo)) + module_put(a->ops->owner); + n_i++; + p = s_p; + } + } + RTA_PUT(skb, TCA_FCNT, 4, &n_i); + r->rta_len = skb->tail - (u8*)r; + + return n_i; +rtattr_failure: + skb_trim(skb, (u8*)r - skb->data); + return -EINVAL; +} + +int tcf_generic_walker(struct sk_buff *skb, struct netlink_callback *cb, + int type, struct tc_action *a) +{ + struct tcf_hashinfo *hinfo = a->ops->hinfo; + + if (type == RTM_DELACTION) { + return tcf_del_walker(skb, a, hinfo); + } else if (type == RTM_GETACTION) { + return tcf_dump_walker(skb, cb, a, hinfo); + } else { + printk("tcf_generic_walker: unknown action %d\n", type); + return -EINVAL; + } +} +EXPORT_SYMBOL(tcf_generic_walker); + +struct tcf_common *tcf_hash_lookup(u32 index, struct tcf_hashinfo *hinfo) +{ + struct tcf_common *p; + + read_lock(hinfo->lock); + for (p = hinfo->htab[tcf_hash(index, hinfo->hmask)]; p; + p = p->tcfc_next) { + if (p->tcfc_index == index) + break; + } + read_unlock(hinfo->lock); + + return p; +} +EXPORT_SYMBOL(tcf_hash_lookup); + +u32 tcf_hash_new_index(u32 *idx_gen, struct tcf_hashinfo *hinfo) +{ + u32 val = *idx_gen; + + do { + if (++val == 0) + val = 1; + } while (tcf_hash_lookup(val, hinfo)); + + return (*idx_gen = val); +} +EXPORT_SYMBOL(tcf_hash_new_index); + +int tcf_hash_search(struct tc_action *a, u32 index) +{ + struct tcf_hashinfo *hinfo = a->ops->hinfo; + struct tcf_common *p = tcf_hash_lookup(index, hinfo); + + if (p) { + a->priv = p; + return 1; + } + return 0; +} +EXPORT_SYMBOL(tcf_hash_search); + +struct tcf_common *tcf_hash_check(u32 index, struct tc_action *a, int bind, + struct tcf_hashinfo *hinfo) +{ + struct tcf_common *p = NULL; + if (index && (p = tcf_hash_lookup(index, hinfo)) != NULL) { + if (bind) { + p->tcfc_bindcnt++; + p->tcfc_refcnt++; + } + a->priv = p; + } + return p; +} +EXPORT_SYMBOL(tcf_hash_check); + +struct tcf_common *tcf_hash_create(u32 index, struct rtattr *est, struct tc_action *a, int size, int bind, u32 *idx_gen, struct tcf_hashinfo *hinfo) +{ + struct tcf_common *p = kzalloc(size, GFP_KERNEL); + + if (unlikely(!p)) + return p; + p->tcfc_refcnt = 1; + if (bind) + p->tcfc_bindcnt = 1; + + spin_lock_init(&p->tcfc_lock); + p->tcfc_stats_lock = &p->tcfc_lock; + p->tcfc_index = index ? index : tcf_hash_new_index(idx_gen, hinfo); + p->tcfc_tm.install = jiffies; + p->tcfc_tm.lastuse = jiffies; +#ifdef CONFIG_NET_ESTIMATOR + if (est) + gen_new_estimator(&p->tcfc_bstats, &p->tcfc_rate_est, + p->tcfc_stats_lock, est); #endif + a->priv = (void *) p; + return p; +} +EXPORT_SYMBOL(tcf_hash_create); + +void tcf_hash_insert(struct tcf_common *p, struct tcf_hashinfo *hinfo) +{ + unsigned int h = tcf_hash(p->tcfc_index, hinfo->hmask); + + write_lock_bh(hinfo->lock); + p->tcfc_next = hinfo->htab[h]; + hinfo->htab[h] = p; + write_unlock_bh(hinfo->lock); +} +EXPORT_SYMBOL(tcf_hash_insert); static struct tc_action_ops *act_base = NULL; static DEFINE_RWLOCK(act_mod_lock); @@ -155,9 +369,6 @@ int tcf_action_exec(struct sk_buff *skb, struct tc_action *act, if (skb->tc_verd & TC_NCLS) { skb->tc_verd = CLR_TC_NCLS(skb->tc_verd); - D2PRINTK("(%p)tcf_action_exec: cleared TC_NCLS in %s out %s\n", - skb, skb->input_dev ? skb->input_dev->name : "xxx", - skb->dev->name); ret = TC_ACT_OK; goto exec_done; } @@ -187,8 +398,6 @@ void tcf_action_destroy(struct tc_action *act, int bind) for (a = act; a; a = act) { if (a->ops && a->ops->cleanup) { - DPRINTK("tcf_action_destroy destroying %p next %p\n", - a, a->next); if (a->ops->cleanup(a, bind) == ACT_P_DELETED) module_put(a->ops->owner); act = act->next; @@ -331,7 +540,6 @@ struct tc_action *tcf_action_init_1(struct rtattr *rta, struct rtattr *est, if (*err != ACT_P_CREATED) module_put(a_o->owner); a->ops = a_o; - DPRINTK("tcf_action_init_1: successfull %s\n", act_name); *err = 0; return a; @@ -392,12 +600,12 @@ int tcf_action_copy_stats(struct sk_buff *skb, struct tc_action *a, if (compat_mode) { if (a->type == TCA_OLD_COMPAT) err = gnet_stats_start_copy_compat(skb, 0, - TCA_STATS, TCA_XSTATS, h->stats_lock, &d); + TCA_STATS, TCA_XSTATS, h->tcf_stats_lock, &d); else return 0; } else err = gnet_stats_start_copy(skb, TCA_ACT_STATS, - h->stats_lock, &d); + h->tcf_stats_lock, &d); if (err < 0) goto errout; @@ -406,11 +614,11 @@ int tcf_action_copy_stats(struct sk_buff *skb, struct tc_action *a, if (a->ops->get_stats(skb, a) < 0) goto errout; - if (gnet_stats_copy_basic(&d, &h->bstats) < 0 || + if (gnet_stats_copy_basic(&d, &h->tcf_bstats) < 0 || #ifdef CONFIG_NET_ESTIMATOR - gnet_stats_copy_rate_est(&d, &h->rate_est) < 0 || + gnet_stats_copy_rate_est(&d, &h->tcf_rate_est) < 0 || #endif - gnet_stats_copy_queue(&d, &h->qstats) < 0) + gnet_stats_copy_queue(&d, &h->tcf_qstats) < 0) goto errout; if (gnet_stats_finish_copy(&d) < 0) diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c index e75a147..6cff566 100644 --- a/net/sched/act_gact.c +++ b/net/sched/act_gact.c @@ -34,48 +34,43 @@ #include #include -/* use generic hash table */ -#define MY_TAB_SIZE 16 -#define MY_TAB_MASK 15 - -static u32 idx_gen; -static struct tcf_gact *tcf_gact_ht[MY_TAB_SIZE]; +#define GACT_TAB_MASK 15 +static struct tcf_common *tcf_gact_ht[GACT_TAB_MASK + 1]; +static u32 gact_idx_gen; static DEFINE_RWLOCK(gact_lock); -/* ovewrride the defaults */ -#define tcf_st tcf_gact -#define tc_st tc_gact -#define tcf_t_lock gact_lock -#define tcf_ht tcf_gact_ht - -#define CONFIG_NET_ACT_INIT 1 -#include +static struct tcf_hashinfo gact_hash_info = { + .htab = tcf_gact_ht, + .hmask = GACT_TAB_MASK, + .lock = &gact_lock, +}; #ifdef CONFIG_GACT_PROB -static int gact_net_rand(struct tcf_gact *p) +static int gact_net_rand(struct tcf_gact *gact) { - if (net_random()%p->pval) - return p->action; - return p->paction; + if (net_random() % gact->tcfg_pval) + return gact->tcf_action; + return gact->tcfg_paction; } -static int gact_determ(struct tcf_gact *p) +static int gact_determ(struct tcf_gact *gact) { - if (p->bstats.packets%p->pval) - return p->action; - return p->paction; + if (gact->tcf_bstats.packets % gact->tcfg_pval) + return gact->tcf_action; + return gact->tcfg_paction; } -typedef int (*g_rand)(struct tcf_gact *p); +typedef int (*g_rand)(struct tcf_gact *gact); static g_rand gact_rand[MAX_RAND]= { NULL, gact_net_rand, gact_determ }; -#endif +#endif /* CONFIG_GACT_PROB */ static int tcf_gact_init(struct rtattr *rta, struct rtattr *est, struct tc_action *a, int ovr, int bind) { struct rtattr *tb[TCA_GACT_MAX]; struct tc_gact *parm; - struct tcf_gact *p; + struct tcf_gact *gact; + struct tcf_common *pc; int ret = 0; if (rta == NULL || rtattr_parse_nested(tb, TCA_GACT_MAX, rta) < 0) @@ -94,105 +89,106 @@ static int tcf_gact_init(struct rtattr *rta, struct rtattr *est, return -EOPNOTSUPP; #endif - p = tcf_hash_check(parm->index, a, ovr, bind); - if (p == NULL) { - p = tcf_hash_create(parm->index, est, a, sizeof(*p), ovr, bind); - if (p == NULL) + pc = tcf_hash_check(parm->index, a, bind, &gact_hash_info); + if (!pc) { + pc = tcf_hash_create(parm->index, est, a, sizeof(*gact), + bind, &gact_idx_gen, &gact_hash_info); + if (unlikely(!pc)) return -ENOMEM; ret = ACT_P_CREATED; } else { if (!ovr) { - tcf_hash_release(p, bind); + tcf_hash_release(pc, bind, &gact_hash_info); return -EEXIST; } } - spin_lock_bh(&p->lock); - p->action = parm->action; + gact = to_gact(pc); + + spin_lock_bh(&gact->tcf_lock); + gact->tcf_action = parm->action; #ifdef CONFIG_GACT_PROB if (tb[TCA_GACT_PROB-1] != NULL) { struct tc_gact_p *p_parm = RTA_DATA(tb[TCA_GACT_PROB-1]); - p->paction = p_parm->paction; - p->pval = p_parm->pval; - p->ptype = p_parm->ptype; + gact->tcfg_paction = p_parm->paction; + gact->tcfg_pval = p_parm->pval; + gact->tcfg_ptype = p_parm->ptype; } #endif - spin_unlock_bh(&p->lock); + spin_unlock_bh(&gact->tcf_lock); if (ret == ACT_P_CREATED) - tcf_hash_insert(p); + tcf_hash_insert(pc, &gact_hash_info); return ret; } -static int -tcf_gact_cleanup(struct tc_action *a, int bind) +static int tcf_gact_cleanup(struct tc_action *a, int bind) { - struct tcf_gact *p = PRIV(a, gact); + struct tcf_gact *gact = a->priv; - if (p != NULL) - return tcf_hash_release(p, bind); + if (gact) + return tcf_hash_release(&gact->common, bind, &gact_hash_info); return 0; } -static int -tcf_gact(struct sk_buff *skb, struct tc_action *a, struct tcf_result *res) +static int tcf_gact(struct sk_buff *skb, struct tc_action *a, struct tcf_result *res) { - struct tcf_gact *p = PRIV(a, gact); + struct tcf_gact *gact = a->priv; int action = TC_ACT_SHOT; - spin_lock(&p->lock); + spin_lock(&gact->tcf_lock); #ifdef CONFIG_GACT_PROB - if (p->ptype && gact_rand[p->ptype] != NULL) - action = gact_rand[p->ptype](p); + if (gact->tcfg_ptype && gact_rand[gact->tcfg_ptype] != NULL) + action = gact_rand[gact->tcfg_ptype](gact); else - action = p->action; + action = gact->tcf_action; #else - action = p->action; + action = gact->tcf_action; #endif - p->bstats.bytes += skb->len; - p->bstats.packets++; + gact->tcf_bstats.bytes += skb->len; + gact->tcf_bstats.packets++; if (action == TC_ACT_SHOT) - p->qstats.drops++; - p->tm.lastuse = jiffies; - spin_unlock(&p->lock); + gact->tcf_qstats.drops++; + gact->tcf_tm.lastuse = jiffies; + spin_unlock(&gact->tcf_lock); return action; } -static int -tcf_gact_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref) +static int tcf_gact_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref) { unsigned char *b = skb->tail; struct tc_gact opt; - struct tcf_gact *p = PRIV(a, gact); + struct tcf_gact *gact = a->priv; struct tcf_t t; - opt.index = p->index; - opt.refcnt = p->refcnt - ref; - opt.bindcnt = p->bindcnt - bind; - opt.action = p->action; + opt.index = gact->tcf_index; + opt.refcnt = gact->tcf_refcnt - ref; + opt.bindcnt = gact->tcf_bindcnt - bind; + opt.action = gact->tcf_action; RTA_PUT(skb, TCA_GACT_PARMS, sizeof(opt), &opt); #ifdef CONFIG_GACT_PROB - if (p->ptype) { + if (gact->tcfg_ptype) { struct tc_gact_p p_opt; - p_opt.paction = p->paction; - p_opt.pval = p->pval; - p_opt.ptype = p->ptype; + p_opt.paction = gact->tcfg_paction; + p_opt.pval = gact->tcfg_pval; + p_opt.ptype = gact->tcfg_ptype; RTA_PUT(skb, TCA_GACT_PROB, sizeof(p_opt), &p_opt); } #endif - t.install = jiffies_to_clock_t(jiffies - p->tm.install); - t.lastuse = jiffies_to_clock_t(jiffies - p->tm.lastuse); - t.expires = jiffies_to_clock_t(p->tm.expires); + t.install = jiffies_to_clock_t(jiffies - gact->tcf_tm.install); + t.lastuse = jiffies_to_clock_t(jiffies - gact->tcf_tm.lastuse); + t.expires = jiffies_to_clock_t(gact->tcf_tm.expires); RTA_PUT(skb, TCA_GACT_TM, sizeof(t), &t); return skb->len; - rtattr_failure: +rtattr_failure: skb_trim(skb, b - skb->data); return -1; } static struct tc_action_ops act_gact_ops = { .kind = "gact", + .hinfo = &gact_hash_info, .type = TCA_ACT_GACT, .capab = TCA_CAP_NONE, .owner = THIS_MODULE, @@ -208,8 +204,7 @@ MODULE_AUTHOR("Jamal Hadi Salim(2002-4)"); MODULE_DESCRIPTION("Generic Classifier actions"); MODULE_LICENSE("GPL"); -static int __init -gact_init_module(void) +static int __init gact_init_module(void) { #ifdef CONFIG_GACT_PROB printk("GACT probability on\n"); @@ -219,8 +214,7 @@ gact_init_module(void) return tcf_register_action(&act_gact_ops); } -static void __exit -gact_cleanup_module(void) +static void __exit gact_cleanup_module(void) { tcf_unregister_action(&act_gact_ops); } diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c index d799e01..224c078 100644 --- a/net/sched/act_ipt.c +++ b/net/sched/act_ipt.c @@ -38,25 +38,19 @@ #include -/* use generic hash table */ -#define MY_TAB_SIZE 16 -#define MY_TAB_MASK 15 -static u32 idx_gen; -static struct tcf_ipt *tcf_ipt_ht[MY_TAB_SIZE]; -/* ipt hash table lock */ +#define IPT_TAB_MASK 15 +static struct tcf_common *tcf_ipt_ht[IPT_TAB_MASK + 1]; +static u32 ipt_idx_gen; static DEFINE_RWLOCK(ipt_lock); -/* ovewrride the defaults */ -#define tcf_st tcf_ipt -#define tcf_t_lock ipt_lock -#define tcf_ht tcf_ipt_ht - -#define CONFIG_NET_ACT_INIT -#include +static struct tcf_hashinfo ipt_hash_info = { + .htab = tcf_ipt_ht, + .hmask = IPT_TAB_MASK, + .lock = &ipt_lock, +}; -static int -ipt_init_target(struct ipt_entry_target *t, char *table, unsigned int hook) +static int ipt_init_target(struct ipt_entry_target *t, char *table, unsigned int hook) { struct ipt_target *target; int ret = 0; @@ -65,7 +59,6 @@ ipt_init_target(struct ipt_entry_target *t, char *table, unsigned int hook) if (!target) return -ENOENT; - DPRINTK("ipt_init_target: found %s\n", target->name); t->u.kernel.target = target; ret = xt_check_target(target, AF_INET, t->u.target_size - sizeof(*t), @@ -78,8 +71,6 @@ ipt_init_target(struct ipt_entry_target *t, char *table, unsigned int hook) t->u.kernel.target, t->data, t->u.target_size - sizeof(*t), hook)) { - DPRINTK("ipt_init_target: check failed for `%s'.\n", - t->u.kernel.target->name); module_put(t->u.kernel.target->me); ret = -EINVAL; } @@ -87,8 +78,7 @@ ipt_init_target(struct ipt_entry_target *t, char *table, unsigned int hook) return ret; } -static void -ipt_destroy_target(struct ipt_entry_target *t) +static void ipt_destroy_target(struct ipt_entry_target *t) { if (t->u.kernel.target->destroy) t->u.kernel.target->destroy(t->u.kernel.target, t->data, @@ -96,31 +86,30 @@ ipt_destroy_target(struct ipt_entry_target *t) module_put(t->u.kernel.target->me); } -static int -tcf_ipt_release(struct tcf_ipt *p, int bind) +static int tcf_ipt_release(struct tcf_ipt *ipt, int bind) { int ret = 0; - if (p) { + if (ipt) { if (bind) - p->bindcnt--; - p->refcnt--; - if (p->bindcnt <= 0 && p->refcnt <= 0) { - ipt_destroy_target(p->t); - kfree(p->tname); - kfree(p->t); - tcf_hash_destroy(p); + ipt->tcf_bindcnt--; + ipt->tcf_refcnt--; + if (ipt->tcf_bindcnt <= 0 && ipt->tcf_refcnt <= 0) { + ipt_destroy_target(ipt->tcfi_t); + kfree(ipt->tcfi_tname); + kfree(ipt->tcfi_t); + tcf_hash_destroy(&ipt->common, &ipt_hash_info); ret = ACT_P_DELETED; } } return ret; } -static int -tcf_ipt_init(struct rtattr *rta, struct rtattr *est, struct tc_action *a, - int ovr, int bind) +static int tcf_ipt_init(struct rtattr *rta, struct rtattr *est, + struct tc_action *a, int ovr, int bind) { struct rtattr *tb[TCA_IPT_MAX]; - struct tcf_ipt *p; + struct tcf_ipt *ipt; + struct tcf_common *pc; struct ipt_entry_target *td, *t; char *tname; int ret = 0, err; @@ -144,49 +133,51 @@ tcf_ipt_init(struct rtattr *rta, struct rtattr *est, struct tc_action *a, RTA_PAYLOAD(tb[TCA_IPT_INDEX-1]) >= sizeof(u32)) index = *(u32 *)RTA_DATA(tb[TCA_IPT_INDEX-1]); - p = tcf_hash_check(index, a, ovr, bind); - if (p == NULL) { - p = tcf_hash_create(index, est, a, sizeof(*p), ovr, bind); - if (p == NULL) + pc = tcf_hash_check(index, a, bind, &ipt_hash_info); + if (!pc) { + pc = tcf_hash_create(index, est, a, sizeof(*ipt), bind, + &ipt_idx_gen, &ipt_hash_info); + if (unlikely(!pc)) return -ENOMEM; ret = ACT_P_CREATED; } else { if (!ovr) { - tcf_ipt_release(p, bind); + tcf_ipt_release(to_ipt(pc), bind); return -EEXIST; } } + ipt = to_ipt(pc); hook = *(u32 *)RTA_DATA(tb[TCA_IPT_HOOK-1]); err = -ENOMEM; tname = kmalloc(IFNAMSIZ, GFP_KERNEL); - if (tname == NULL) + if (unlikely(!tname)) goto err1; if (tb[TCA_IPT_TABLE - 1] == NULL || rtattr_strlcpy(tname, tb[TCA_IPT_TABLE-1], IFNAMSIZ) >= IFNAMSIZ) strcpy(tname, "mangle"); t = kmalloc(td->u.target_size, GFP_KERNEL); - if (t == NULL) + if (unlikely(!t)) goto err2; memcpy(t, td, td->u.target_size); if ((err = ipt_init_target(t, tname, hook)) < 0) goto err3; - spin_lock_bh(&p->lock); + spin_lock_bh(&ipt->tcf_lock); if (ret != ACT_P_CREATED) { - ipt_destroy_target(p->t); - kfree(p->tname); - kfree(p->t); + ipt_destroy_target(ipt->tcfi_t); + kfree(ipt->tcfi_tname); + kfree(ipt->tcfi_t); } - p->tname = tname; - p->t = t; - p->hook = hook; - spin_unlock_bh(&p->lock); + ipt->tcfi_tname = tname; + ipt->tcfi_t = t; + ipt->tcfi_hook = hook; + spin_unlock_bh(&ipt->tcf_lock); if (ret == ACT_P_CREATED) - tcf_hash_insert(p); + tcf_hash_insert(pc, &ipt_hash_info); return ret; err3: @@ -194,33 +185,32 @@ err3: err2: kfree(tname); err1: - kfree(p); + kfree(pc); return err; } -static int -tcf_ipt_cleanup(struct tc_action *a, int bind) +static int tcf_ipt_cleanup(struct tc_action *a, int bind) { - struct tcf_ipt *p = PRIV(a, ipt); - return tcf_ipt_release(p, bind); + struct tcf_ipt *ipt = a->priv; + return tcf_ipt_release(ipt, bind); } -static int -tcf_ipt(struct sk_buff *skb, struct tc_action *a, struct tcf_result *res) +static int tcf_ipt(struct sk_buff *skb, struct tc_action *a, + struct tcf_result *res) { int ret = 0, result = 0; - struct tcf_ipt *p = PRIV(a, ipt); + struct tcf_ipt *ipt = a->priv; if (skb_cloned(skb)) { if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) return TC_ACT_UNSPEC; } - spin_lock(&p->lock); + spin_lock(&ipt->tcf_lock); - p->tm.lastuse = jiffies; - p->bstats.bytes += skb->len; - p->bstats.packets++; + ipt->tcf_tm.lastuse = jiffies; + ipt->tcf_bstats.bytes += skb->len; + ipt->tcf_bstats.packets++; /* yes, we have to worry about both in and out dev worry later - danger - this API seems to have changed @@ -229,16 +219,17 @@ tcf_ipt(struct sk_buff *skb, struct tc_action *a, struct tcf_result *res) /* iptables targets take a double skb pointer in case the skb * needs to be replaced. We don't own the skb, so this must not * happen. The pskb_expand_head above should make sure of this */ - ret = p->t->u.kernel.target->target(&skb, skb->dev, NULL, p->hook, - p->t->u.kernel.target, p->t->data, - NULL); + ret = ipt->tcfi_t->u.kernel.target->target(&skb, skb->dev, NULL, + ipt->tcfi_hook, + ipt->tcfi_t->u.kernel.target, + ipt->tcfi_t->data, NULL); switch (ret) { case NF_ACCEPT: result = TC_ACT_OK; break; case NF_DROP: result = TC_ACT_SHOT; - p->qstats.drops++; + ipt->tcf_qstats.drops++; break; case IPT_CONTINUE: result = TC_ACT_PIPE; @@ -249,53 +240,46 @@ tcf_ipt(struct sk_buff *skb, struct tc_action *a, struct tcf_result *res) result = TC_POLICE_OK; break; } - spin_unlock(&p->lock); + spin_unlock(&ipt->tcf_lock); return result; } -static int -tcf_ipt_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref) +static int tcf_ipt_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref) { + unsigned char *b = skb->tail; + struct tcf_ipt *ipt = a->priv; struct ipt_entry_target *t; struct tcf_t tm; struct tc_cnt c; - unsigned char *b = skb->tail; - struct tcf_ipt *p = PRIV(a, ipt); /* for simple targets kernel size == user size ** user name = target name ** for foolproof you need to not assume this */ - t = kmalloc(p->t->u.user.target_size, GFP_ATOMIC); - if (t == NULL) + t = kmalloc(ipt->tcfi_t->u.user.target_size, GFP_ATOMIC); + if (unlikely(!t)) goto rtattr_failure; - c.bindcnt = p->bindcnt - bind; - c.refcnt = p->refcnt - ref; - memcpy(t, p->t, p->t->u.user.target_size); - strcpy(t->u.user.name, p->t->u.kernel.target->name); - - DPRINTK("\ttcf_ipt_dump tablename %s length %d\n", p->tname, - strlen(p->tname)); - DPRINTK("\tdump target name %s size %d size user %d " - "data[0] %x data[1] %x\n", p->t->u.kernel.target->name, - p->t->u.target_size, p->t->u.user.target_size, - p->t->data[0], p->t->data[1]); - RTA_PUT(skb, TCA_IPT_TARG, p->t->u.user.target_size, t); - RTA_PUT(skb, TCA_IPT_INDEX, 4, &p->index); - RTA_PUT(skb, TCA_IPT_HOOK, 4, &p->hook); + c.bindcnt = ipt->tcf_bindcnt - bind; + c.refcnt = ipt->tcf_refcnt - ref; + memcpy(t, ipt->tcfi_t, ipt->tcfi_t->u.user.target_size); + strcpy(t->u.user.name, ipt->tcfi_t->u.kernel.target->name); + + RTA_PUT(skb, TCA_IPT_TARG, ipt->tcfi_t->u.user.target_size, t); + RTA_PUT(skb, TCA_IPT_INDEX, 4, &ipt->tcf_index); + RTA_PUT(skb, TCA_IPT_HOOK, 4, &ipt->tcfi_hook); RTA_PUT(skb, TCA_IPT_CNT, sizeof(struct tc_cnt), &c); - RTA_PUT(skb, TCA_IPT_TABLE, IFNAMSIZ, p->tname); - tm.install = jiffies_to_clock_t(jiffies - p->tm.install); - tm.lastuse = jiffies_to_clock_t(jiffies - p->tm.lastuse); - tm.expires = jiffies_to_clock_t(p->tm.expires); + RTA_PUT(skb, TCA_IPT_TABLE, IFNAMSIZ, ipt->tcfi_tname); + tm.install = jiffies_to_clock_t(jiffies - ipt->tcf_tm.install); + tm.lastuse = jiffies_to_clock_t(jiffies - ipt->tcf_tm.lastuse); + tm.expires = jiffies_to_clock_t(ipt->tcf_tm.expires); RTA_PUT(skb, TCA_IPT_TM, sizeof (tm), &tm); kfree(t); return skb->len; - rtattr_failure: +rtattr_failure: skb_trim(skb, b - skb->data); kfree(t); return -1; @@ -303,6 +287,7 @@ tcf_ipt_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref) static struct tc_action_ops act_ipt_ops = { .kind = "ipt", + .hinfo = &ipt_hash_info, .type = TCA_ACT_IPT, .capab = TCA_CAP_NONE, .owner = THIS_MODULE, @@ -318,14 +303,12 @@ MODULE_AUTHOR("Jamal Hadi Salim(2002-4)"); MODULE_DESCRIPTION("Iptables target actions"); MODULE_LICENSE("GPL"); -static int __init -ipt_init_module(void) +static int __init ipt_init_module(void) { return tcf_register_action(&act_ipt_ops); } -static void __exit -ipt_cleanup_module(void) +static void __exit ipt_cleanup_module(void) { tcf_unregister_action(&act_ipt_ops); } diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c index fc56204..4838972 100644 --- a/net/sched/act_mirred.c +++ b/net/sched/act_mirred.c @@ -39,46 +39,39 @@ #include #include - -/* use generic hash table */ -#define MY_TAB_SIZE 8 -#define MY_TAB_MASK (MY_TAB_SIZE - 1) -static u32 idx_gen; -static struct tcf_mirred *tcf_mirred_ht[MY_TAB_SIZE]; +#define MIRRED_TAB_MASK 7 +static struct tcf_common *tcf_mirred_ht[MIRRED_TAB_MASK + 1]; +static u32 mirred_idx_gen; static DEFINE_RWLOCK(mirred_lock); -/* ovewrride the defaults */ -#define tcf_st tcf_mirred -#define tc_st tc_mirred -#define tcf_t_lock mirred_lock -#define tcf_ht tcf_mirred_ht - -#define CONFIG_NET_ACT_INIT 1 -#include +static struct tcf_hashinfo mirred_hash_info = { + .htab = tcf_mirred_ht, + .hmask = MIRRED_TAB_MASK, + .lock = &mirred_lock, +}; -static inline int -tcf_mirred_release(struct tcf_mirred *p, int bind) +static inline int tcf_mirred_release(struct tcf_mirred *m, int bind) { - if (p) { + if (m) { if (bind) - p->bindcnt--; - p->refcnt--; - if(!p->bindcnt && p->refcnt <= 0) { - dev_put(p->dev); - tcf_hash_destroy(p); + m->tcf_bindcnt--; + m->tcf_refcnt--; + if(!m->tcf_bindcnt && m->tcf_refcnt <= 0) { + dev_put(m->tcfm_dev); + tcf_hash_destroy(&m->common, &mirred_hash_info); return 1; } } return 0; } -static int -tcf_mirred_init(struct rtattr *rta, struct rtattr *est, struct tc_action *a, - int ovr, int bind) +static int tcf_mirred_init(struct rtattr *rta, struct rtattr *est, + struct tc_action *a, int ovr, int bind) { struct rtattr *tb[TCA_MIRRED_MAX]; struct tc_mirred *parm; - struct tcf_mirred *p; + struct tcf_mirred *m; + struct tcf_common *pc; struct net_device *dev = NULL; int ret = 0; int ok_push = 0; @@ -110,64 +103,62 @@ tcf_mirred_init(struct rtattr *rta, struct rtattr *est, struct tc_action *a, } } - p = tcf_hash_check(parm->index, a, ovr, bind); - if (p == NULL) { + pc = tcf_hash_check(parm->index, a, bind, &mirred_hash_info); + if (!pc) { if (!parm->ifindex) return -EINVAL; - p = tcf_hash_create(parm->index, est, a, sizeof(*p), ovr, bind); - if (p == NULL) + pc = tcf_hash_create(parm->index, est, a, sizeof(*m), bind, + &mirred_idx_gen, &mirred_hash_info); + if (unlikely(!pc)) return -ENOMEM; ret = ACT_P_CREATED; } else { if (!ovr) { - tcf_mirred_release(p, bind); + tcf_mirred_release(to_mirred(pc), bind); return -EEXIST; } } + m = to_mirred(pc); - spin_lock_bh(&p->lock); - p->action = parm->action; - p->eaction = parm->eaction; + spin_lock_bh(&m->tcf_lock); + m->tcf_action = parm->action; + m->tcfm_eaction = parm->eaction; if (parm->ifindex) { - p->ifindex = parm->ifindex; + m->tcfm_ifindex = parm->ifindex; if (ret != ACT_P_CREATED) - dev_put(p->dev); - p->dev = dev; + dev_put(m->tcfm_dev); + m->tcfm_dev = dev; dev_hold(dev); - p->ok_push = ok_push; + m->tcfm_ok_push = ok_push; } - spin_unlock_bh(&p->lock); + spin_unlock_bh(&m->tcf_lock); if (ret == ACT_P_CREATED) - tcf_hash_insert(p); + tcf_hash_insert(pc, &mirred_hash_info); - DPRINTK("tcf_mirred_init index %d action %d eaction %d device %s " - "ifindex %d\n", parm->index, parm->action, parm->eaction, - dev->name, parm->ifindex); return ret; } -static int -tcf_mirred_cleanup(struct tc_action *a, int bind) +static int tcf_mirred_cleanup(struct tc_action *a, int bind) { - struct tcf_mirred *p = PRIV(a, mirred); + struct tcf_mirred *m = a->priv; - if (p != NULL) - return tcf_mirred_release(p, bind); + if (m) + return tcf_mirred_release(m, bind); return 0; } -static int -tcf_mirred(struct sk_buff *skb, struct tc_action *a, struct tcf_result *res) +static int tcf_mirred(struct sk_buff *skb, struct tc_action *a, + struct tcf_result *res) { - struct tcf_mirred *p = PRIV(a, mirred); + struct tcf_mirred *m = a->priv; struct net_device *dev; struct sk_buff *skb2 = NULL; u32 at = G_TC_AT(skb->tc_verd); - spin_lock(&p->lock); + spin_lock(&m->tcf_lock); - dev = p->dev; - p->tm.lastuse = jiffies; + dev = m->tcfm_dev; + m->tcf_tm.lastuse = jiffies; if (!(dev->flags&IFF_UP) ) { if (net_ratelimit()) @@ -176,10 +167,10 @@ tcf_mirred(struct sk_buff *skb, struct tc_action *a, struct tcf_result *res) bad_mirred: if (skb2 != NULL) kfree_skb(skb2); - p->qstats.overlimits++; - p->bstats.bytes += skb->len; - p->bstats.packets++; - spin_unlock(&p->lock); + m->tcf_qstats.overlimits++; + m->tcf_bstats.bytes += skb->len; + m->tcf_bstats.packets++; + spin_unlock(&m->tcf_lock); /* should we be asking for packet to be dropped? * may make sense for redirect case only */ @@ -189,59 +180,59 @@ bad_mirred: skb2 = skb_clone(skb, GFP_ATOMIC); if (skb2 == NULL) goto bad_mirred; - if (p->eaction != TCA_EGRESS_MIRROR && p->eaction != TCA_EGRESS_REDIR) { + if (m->tcfm_eaction != TCA_EGRESS_MIRROR && + m->tcfm_eaction != TCA_EGRESS_REDIR) { if (net_ratelimit()) - printk("tcf_mirred unknown action %d\n", p->eaction); + printk("tcf_mirred unknown action %d\n", + m->tcfm_eaction); goto bad_mirred; } - p->bstats.bytes += skb2->len; - p->bstats.packets++; + m->tcf_bstats.bytes += skb2->len; + m->tcf_bstats.packets++; if (!(at & AT_EGRESS)) - if (p->ok_push) + if (m->tcfm_ok_push) skb_push(skb2, skb2->dev->hard_header_len); /* mirror is always swallowed */ - if (p->eaction != TCA_EGRESS_MIRROR) + if (m->tcfm_eaction != TCA_EGRESS_MIRROR) skb2->tc_verd = SET_TC_FROM(skb2->tc_verd, at); skb2->dev = dev; skb2->input_dev = skb->dev; dev_queue_xmit(skb2); - spin_unlock(&p->lock); - return p->action; + spin_unlock(&m->tcf_lock); + return m->tcf_action; } -static int -tcf_mirred_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref) +static int tcf_mirred_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref) { unsigned char *b = skb->tail; + struct tcf_mirred *m = a->priv; struct tc_mirred opt; - struct tcf_mirred *p = PRIV(a, mirred); struct tcf_t t; - opt.index = p->index; - opt.action = p->action; - opt.refcnt = p->refcnt - ref; - opt.bindcnt = p->bindcnt - bind; - opt.eaction = p->eaction; - opt.ifindex = p->ifindex; - DPRINTK("tcf_mirred_dump index %d action %d eaction %d ifindex %d\n", - p->index, p->action, p->eaction, p->ifindex); + opt.index = m->tcf_index; + opt.action = m->tcf_action; + opt.refcnt = m->tcf_refcnt - ref; + opt.bindcnt = m->tcf_bindcnt - bind; + opt.eaction = m->tcfm_eaction; + opt.ifindex = m->tcfm_ifindex; RTA_PUT(skb, TCA_MIRRED_PARMS, sizeof(opt), &opt); - t.install = jiffies_to_clock_t(jiffies - p->tm.install); - t.lastuse = jiffies_to_clock_t(jiffies - p->tm.lastuse); - t.expires = jiffies_to_clock_t(p->tm.expires); + t.install = jiffies_to_clock_t(jiffies - m->tcf_tm.install); + t.lastuse = jiffies_to_clock_t(jiffies - m->tcf_tm.lastuse); + t.expires = jiffies_to_clock_t(m->tcf_tm.expires); RTA_PUT(skb, TCA_MIRRED_TM, sizeof(t), &t); return skb->len; - rtattr_failure: +rtattr_failure: skb_trim(skb, b - skb->data); return -1; } static struct tc_action_ops act_mirred_ops = { .kind = "mirred", + .hinfo = &mirred_hash_info, .type = TCA_ACT_MIRRED, .capab = TCA_CAP_NONE, .owner = THIS_MODULE, @@ -257,15 +248,13 @@ MODULE_AUTHOR("Jamal Hadi Salim(2002)"); MODULE_DESCRIPTION("Device Mirror/redirect actions"); MODULE_LICENSE("GPL"); -static int __init -mirred_init_module(void) +static int __init mirred_init_module(void) { printk("Mirror/redirect action on\n"); return tcf_register_action(&act_mirred_ops); } -static void __exit -mirred_cleanup_module(void) +static void __exit mirred_cleanup_module(void) { tcf_unregister_action(&act_mirred_ops); } diff --git a/net/sched/act_pedit.c b/net/sched/act_pedit.c index f257475..8ac65c2 100644 --- a/net/sched/act_pedit.c +++ b/net/sched/act_pedit.c @@ -33,32 +33,25 @@ #include #include - -#define PEDIT_DEB 1 - -/* use generic hash table */ -#define MY_TAB_SIZE 16 -#define MY_TAB_MASK 15 -static u32 idx_gen; -static struct tcf_pedit *tcf_pedit_ht[MY_TAB_SIZE]; +#define PEDIT_TAB_MASK 15 +static struct tcf_common *tcf_pedit_ht[PEDIT_TAB_MASK + 1]; +static u32 pedit_idx_gen; static DEFINE_RWLOCK(pedit_lock); -#define tcf_st tcf_pedit -#define tc_st tc_pedit -#define tcf_t_lock pedit_lock -#define tcf_ht tcf_pedit_ht - -#define CONFIG_NET_ACT_INIT 1 -#include +static struct tcf_hashinfo pedit_hash_info = { + .htab = tcf_pedit_ht, + .hmask = PEDIT_TAB_MASK, + .lock = &pedit_lock, +}; -static int -tcf_pedit_init(struct rtattr *rta, struct rtattr *est, struct tc_action *a, - int ovr, int bind) +static int tcf_pedit_init(struct rtattr *rta, struct rtattr *est, + struct tc_action *a, int ovr, int bind) { struct rtattr *tb[TCA_PEDIT_MAX]; struct tc_pedit *parm; int ret = 0; struct tcf_pedit *p; + struct tcf_common *pc; struct tc_pedit_key *keys = NULL; int ksize; @@ -73,54 +66,56 @@ tcf_pedit_init(struct rtattr *rta, struct rtattr *est, struct tc_action *a, if (RTA_PAYLOAD(tb[TCA_PEDIT_PARMS-1]) < sizeof(*parm) + ksize) return -EINVAL; - p = tcf_hash_check(parm->index, a, ovr, bind); - if (p == NULL) { + pc = tcf_hash_check(parm->index, a, bind, &pedit_hash_info); + if (!pc) { if (!parm->nkeys) return -EINVAL; - p = tcf_hash_create(parm->index, est, a, sizeof(*p), ovr, bind); - if (p == NULL) + pc = tcf_hash_create(parm->index, est, a, sizeof(*p), bind, + &pedit_idx_gen, &pedit_hash_info); + if (unlikely(!pc)) return -ENOMEM; + p = to_pedit(pc); keys = kmalloc(ksize, GFP_KERNEL); if (keys == NULL) { - kfree(p); + kfree(pc); return -ENOMEM; } ret = ACT_P_CREATED; } else { + p = to_pedit(pc); if (!ovr) { - tcf_hash_release(p, bind); + tcf_hash_release(pc, bind, &pedit_hash_info); return -EEXIST; } - if (p->nkeys && p->nkeys != parm->nkeys) { + if (p->tcfp_nkeys && p->tcfp_nkeys != parm->nkeys) { keys = kmalloc(ksize, GFP_KERNEL); if (keys == NULL) return -ENOMEM; } } - spin_lock_bh(&p->lock); - p->flags = parm->flags; - p->action = parm->action; + spin_lock_bh(&p->tcf_lock); + p->tcfp_flags = parm->flags; + p->tcf_action = parm->action; if (keys) { - kfree(p->keys); - p->keys = keys; - p->nkeys = parm->nkeys; + kfree(p->tcfp_keys); + p->tcfp_keys = keys; + p->tcfp_nkeys = parm->nkeys; } - memcpy(p->keys, parm->keys, ksize); - spin_unlock_bh(&p->lock); + memcpy(p->tcfp_keys, parm->keys, ksize); + spin_unlock_bh(&p->tcf_lock); if (ret == ACT_P_CREATED) - tcf_hash_insert(p); + tcf_hash_insert(pc, &pedit_hash_info); return ret; } -static int -tcf_pedit_cleanup(struct tc_action *a, int bind) +static int tcf_pedit_cleanup(struct tc_action *a, int bind) { - struct tcf_pedit *p = PRIV(a, pedit); + struct tcf_pedit *p = a->priv; - if (p != NULL) { - struct tc_pedit_key *keys = p->keys; - if (tcf_hash_release(p, bind)) { + if (p) { + struct tc_pedit_key *keys = p->tcfp_keys; + if (tcf_hash_release(&p->common, bind, &pedit_hash_info)) { kfree(keys); return 1; } @@ -128,30 +123,30 @@ tcf_pedit_cleanup(struct tc_action *a, int bind) return 0; } -static int -tcf_pedit(struct sk_buff *skb, struct tc_action *a, struct tcf_result *res) +static int tcf_pedit(struct sk_buff *skb, struct tc_action *a, + struct tcf_result *res) { - struct tcf_pedit *p = PRIV(a, pedit); + struct tcf_pedit *p = a->priv; int i, munged = 0; u8 *pptr; if (!(skb->tc_verd & TC_OK2MUNGE)) { /* should we set skb->cloned? */ if (pskb_expand_head(skb, 0, 0, GFP_ATOMIC)) { - return p->action; + return p->tcf_action; } } pptr = skb->nh.raw; - spin_lock(&p->lock); + spin_lock(&p->tcf_lock); - p->tm.lastuse = jiffies; + p->tcf_tm.lastuse = jiffies; - if (p->nkeys > 0) { - struct tc_pedit_key *tkey = p->keys; + if (p->tcfp_nkeys > 0) { + struct tc_pedit_key *tkey = p->tcfp_keys; - for (i = p->nkeys; i > 0; i--, tkey++) { + for (i = p->tcfp_nkeys; i > 0; i--, tkey++) { u32 *ptr; int offset = tkey->off; @@ -169,7 +164,8 @@ tcf_pedit(struct sk_buff *skb, struct tc_action *a, struct tcf_result *res) printk("offset must be on 32 bit boundaries\n"); goto bad; } - if (skb->len < 0 || (offset > 0 && offset > skb->len)) { + if (skb->len < 0 || + (offset > 0 && offset > skb->len)) { printk("offset %d cant exceed pkt length %d\n", offset, skb->len); goto bad; @@ -185,63 +181,47 @@ tcf_pedit(struct sk_buff *skb, struct tc_action *a, struct tcf_result *res) skb->tc_verd = SET_TC_MUNGED(skb->tc_verd); goto done; } else { - printk("pedit BUG: index %d\n",p->index); + printk("pedit BUG: index %d\n", p->tcf_index); } bad: - p->qstats.overlimits++; + p->tcf_qstats.overlimits++; done: - p->bstats.bytes += skb->len; - p->bstats.packets++; - spin_unlock(&p->lock); - return p->action; + p->tcf_bstats.bytes += skb->len; + p->tcf_bstats.packets++; + spin_unlock(&p->tcf_lock); + return p->tcf_action; } -static int -tcf_pedit_dump(struct sk_buff *skb, struct tc_action *a,int bind, int ref) +static int tcf_pedit_dump(struct sk_buff *skb, struct tc_action *a, + int bind, int ref) { unsigned char *b = skb->tail; + struct tcf_pedit *p = a->priv; struct tc_pedit *opt; - struct tcf_pedit *p = PRIV(a, pedit); struct tcf_t t; int s; - s = sizeof(*opt) + p->nkeys * sizeof(struct tc_pedit_key); + s = sizeof(*opt) + p->tcfp_nkeys * sizeof(struct tc_pedit_key); /* netlink spinlocks held above us - must use ATOMIC */ opt = kzalloc(s, GFP_ATOMIC); - if (opt == NULL) + if (unlikely(!opt)) return -ENOBUFS; - memcpy(opt->keys, p->keys, p->nkeys * sizeof(struct tc_pedit_key)); - opt->index = p->index; - opt->nkeys = p->nkeys; - opt->flags = p->flags; - opt->action = p->action; - opt->refcnt = p->refcnt - ref; - opt->bindcnt = p->bindcnt - bind; - - -#ifdef PEDIT_DEB - { - /* Debug - get rid of later */ - int i; - struct tc_pedit_key *key = opt->keys; - - for (i=0; inkeys; i++, key++) { - printk( "\n key #%d",i); - printk( " at %d: val %08x mask %08x", - (unsigned int)key->off, - (unsigned int)key->val, - (unsigned int)key->mask); - } - } -#endif + memcpy(opt->keys, p->tcfp_keys, + p->tcfp_nkeys * sizeof(struct tc_pedit_key)); + opt->index = p->tcf_index; + opt->nkeys = p->tcfp_nkeys; + opt->flags = p->tcfp_flags; + opt->action = p->tcf_action; + opt->refcnt = p->tcf_refcnt - ref; + opt->bindcnt = p->tcf_bindcnt - bind; RTA_PUT(skb, TCA_PEDIT_PARMS, s, opt); - t.install = jiffies_to_clock_t(jiffies - p->tm.install); - t.lastuse = jiffies_to_clock_t(jiffies - p->tm.lastuse); - t.expires = jiffies_to_clock_t(p->tm.expires); + t.install = jiffies_to_clock_t(jiffies - p->tcf_tm.install); + t.lastuse = jiffies_to_clock_t(jiffies - p->tcf_tm.lastuse); + t.expires = jiffies_to_clock_t(p->tcf_tm.expires); RTA_PUT(skb, TCA_PEDIT_TM, sizeof(t), &t); kfree(opt); return skb->len; @@ -252,9 +232,9 @@ rtattr_failure: return -1; } -static -struct tc_action_ops act_pedit_ops = { +static struct tc_action_ops act_pedit_ops = { .kind = "pedit", + .hinfo = &pedit_hash_info, .type = TCA_ACT_PEDIT, .capab = TCA_CAP_NONE, .owner = THIS_MODULE, @@ -270,14 +250,12 @@ MODULE_AUTHOR("Jamal Hadi Salim(2002-4)"); MODULE_DESCRIPTION("Generic Packet Editor actions"); MODULE_LICENSE("GPL"); -static int __init -pedit_init_module(void) +static int __init pedit_init_module(void) { return tcf_register_action(&act_pedit_ops); } -static void __exit -pedit_cleanup_module(void) +static void __exit pedit_cleanup_module(void) { tcf_unregister_action(&act_pedit_ops); } diff --git a/net/sched/act_police.c b/net/sched/act_police.c index da905d7..fed47b6 100644 --- a/net/sched/act_police.c +++ b/net/sched/act_police.c @@ -32,43 +32,27 @@ #include #include -#define L2T(p,L) ((p)->R_tab->data[(L)>>(p)->R_tab->rate.cell_log]) -#define L2T_P(p,L) ((p)->P_tab->data[(L)>>(p)->P_tab->rate.cell_log]) -#define PRIV(a) ((struct tcf_police *) (a)->priv) - -/* use generic hash table */ -#define MY_TAB_SIZE 16 -#define MY_TAB_MASK 15 -static u32 idx_gen; -static struct tcf_police *tcf_police_ht[MY_TAB_SIZE]; -/* Policer hash table lock */ -static DEFINE_RWLOCK(police_lock); - -/* Each policer is serialized by its individual spinlock */ +#define L2T(p,L) ((p)->tcfp_R_tab->data[(L)>>(p)->tcfp_R_tab->rate.cell_log]) +#define L2T_P(p,L) ((p)->tcfp_P_tab->data[(L)>>(p)->tcfp_P_tab->rate.cell_log]) -static __inline__ unsigned tcf_police_hash(u32 index) -{ - return index&0xF; -} +#define POL_TAB_MASK 15 +static struct tcf_common *tcf_police_ht[POL_TAB_MASK + 1]; +static u32 police_idx_gen; +static DEFINE_RWLOCK(police_lock); -static __inline__ struct tcf_police * tcf_police_lookup(u32 index) -{ - struct tcf_police *p; +static struct tcf_hashinfo police_hash_info = { + .htab = tcf_police_ht, + .hmask = POL_TAB_MASK, + .lock = &police_lock, +}; - read_lock(&police_lock); - for (p = tcf_police_ht[tcf_police_hash(index)]; p; p = p->next) { - if (p->index == index) - break; - } - read_unlock(&police_lock); - return p; -} +/* Each policer is serialized by its individual spinlock */ #ifdef CONFIG_NET_CLS_ACT static int tcf_act_police_walker(struct sk_buff *skb, struct netlink_callback *cb, int type, struct tc_action *a) { - struct tcf_police *p; + struct tcf_common *p; int err = 0, index = -1, i = 0, s_i = 0, n_i = 0; struct rtattr *r; @@ -76,10 +60,10 @@ static int tcf_act_police_walker(struct sk_buff *skb, struct netlink_callback *c s_i = cb->args[0]; - for (i = 0; i < MY_TAB_SIZE; i++) { - p = tcf_police_ht[tcf_police_hash(i)]; + for (i = 0; i < (POL_TAB_MASK + 1); i++) { + p = tcf_police_ht[tcf_hash(i, POL_TAB_MASK)]; - for (; p; p = p->next) { + for (; p; p = p->tcfc_next) { index++; if (index < s_i) continue; @@ -110,48 +94,26 @@ rtattr_failure: skb_trim(skb, (u8*)r - skb->data); goto done; } - -static inline int -tcf_act_police_hash_search(struct tc_action *a, u32 index) -{ - struct tcf_police *p = tcf_police_lookup(index); - - if (p != NULL) { - a->priv = p; - return 1; - } else { - return 0; - } -} #endif -static inline u32 tcf_police_new_index(void) -{ - do { - if (++idx_gen == 0) - idx_gen = 1; - } while (tcf_police_lookup(idx_gen)); - - return idx_gen; -} - void tcf_police_destroy(struct tcf_police *p) { - unsigned h = tcf_police_hash(p->index); - struct tcf_police **p1p; + unsigned int h = tcf_hash(p->tcf_index, POL_TAB_MASK); + struct tcf_common **p1p; - for (p1p = &tcf_police_ht[h]; *p1p; p1p = &(*p1p)->next) { - if (*p1p == p) { + for (p1p = &tcf_police_ht[h]; *p1p; p1p = &(*p1p)->tcfc_next) { + if (*p1p == &p->common) { write_lock_bh(&police_lock); - *p1p = p->next; + *p1p = p->tcf_next; write_unlock_bh(&police_lock); #ifdef CONFIG_NET_ESTIMATOR - gen_kill_estimator(&p->bstats, &p->rate_est); + gen_kill_estimator(&p->tcf_bstats, + &p->tcf_rate_est); #endif - if (p->R_tab) - qdisc_put_rtab(p->R_tab); - if (p->P_tab) - qdisc_put_rtab(p->P_tab); + if (p->tcfp_R_tab) + qdisc_put_rtab(p->tcfp_R_tab); + if (p->tcfp_P_tab) + qdisc_put_rtab(p->tcfp_P_tab); kfree(p); return; } @@ -167,7 +129,7 @@ static int tcf_act_police_locate(struct rtattr *rta, struct rtattr *est, int ret = 0, err; struct rtattr *tb[TCA_POLICE_MAX]; struct tc_police *parm; - struct tcf_police *p; + struct tcf_police *police; struct qdisc_rate_table *R_tab = NULL, *P_tab = NULL; if (rta == NULL || rtattr_parse_nested(tb, TCA_POLICE_MAX, rta) < 0) @@ -185,27 +147,32 @@ static int tcf_act_police_locate(struct rtattr *rta, struct rtattr *est, RTA_PAYLOAD(tb[TCA_POLICE_RESULT-1]) != sizeof(u32)) return -EINVAL; - if (parm->index && (p = tcf_police_lookup(parm->index)) != NULL) { - a->priv = p; - if (bind) { - p->bindcnt += 1; - p->refcnt += 1; + if (parm->index) { + struct tcf_common *pc; + + pc = tcf_hash_lookup(parm->index, &police_hash_info); + if (pc != NULL) { + a->priv = pc; + police = to_police(pc); + if (bind) { + police->tcf_bindcnt += 1; + police->tcf_refcnt += 1; + } + if (ovr) + goto override; + return ret; } - if (ovr) - goto override; - return ret; } - p = kzalloc(sizeof(*p), GFP_KERNEL); - if (p == NULL) + police = kzalloc(sizeof(*police), GFP_KERNEL); + if (police == NULL) return -ENOMEM; - ret = ACT_P_CREATED; - p->refcnt = 1; - spin_lock_init(&p->lock); - p->stats_lock = &p->lock; + police->tcf_refcnt = 1; + spin_lock_init(&police->tcf_lock); + police->tcf_stats_lock = &police->tcf_lock; if (bind) - p->bindcnt = 1; + police->tcf_bindcnt = 1; override: if (parm->rate.rate) { err = -ENOMEM; @@ -215,67 +182,71 @@ override: if (parm->peakrate.rate) { P_tab = qdisc_get_rtab(&parm->peakrate, tb[TCA_POLICE_PEAKRATE-1]); - if (p->P_tab == NULL) { + if (P_tab == NULL) { qdisc_put_rtab(R_tab); goto failure; } } } /* No failure allowed after this point */ - spin_lock_bh(&p->lock); + spin_lock_bh(&police->tcf_lock); if (R_tab != NULL) { - qdisc_put_rtab(p->R_tab); - p->R_tab = R_tab; + qdisc_put_rtab(police->tcfp_R_tab); + police->tcfp_R_tab = R_tab; } if (P_tab != NULL) { - qdisc_put_rtab(p->P_tab); - p->P_tab = P_tab; + qdisc_put_rtab(police->tcfp_P_tab); + police->tcfp_P_tab = P_tab; } if (tb[TCA_POLICE_RESULT-1]) - p->result = *(u32*)RTA_DATA(tb[TCA_POLICE_RESULT-1]); - p->toks = p->burst = parm->burst; - p->mtu = parm->mtu; - if (p->mtu == 0) { - p->mtu = ~0; - if (p->R_tab) - p->mtu = 255<R_tab->rate.cell_log; + police->tcfp_result = *(u32*)RTA_DATA(tb[TCA_POLICE_RESULT-1]); + police->tcfp_toks = police->tcfp_burst = parm->burst; + police->tcfp_mtu = parm->mtu; + if (police->tcfp_mtu == 0) { + police->tcfp_mtu = ~0; + if (police->tcfp_R_tab) + police->tcfp_mtu = 255<tcfp_R_tab->rate.cell_log; } - if (p->P_tab) - p->ptoks = L2T_P(p, p->mtu); - p->action = parm->action; + if (police->tcfp_P_tab) + police->tcfp_ptoks = L2T_P(police, police->tcfp_mtu); + police->tcf_action = parm->action; #ifdef CONFIG_NET_ESTIMATOR if (tb[TCA_POLICE_AVRATE-1]) - p->ewma_rate = *(u32*)RTA_DATA(tb[TCA_POLICE_AVRATE-1]); + police->tcfp_ewma_rate = + *(u32*)RTA_DATA(tb[TCA_POLICE_AVRATE-1]); if (est) - gen_replace_estimator(&p->bstats, &p->rate_est, p->stats_lock, est); + gen_replace_estimator(&police->tcf_bstats, + &police->tcf_rate_est, + police->tcf_stats_lock, est); #endif - spin_unlock_bh(&p->lock); + spin_unlock_bh(&police->tcf_lock); if (ret != ACT_P_CREATED) return ret; - PSCHED_GET_TIME(p->t_c); - p->index = parm->index ? : tcf_police_new_index(); - h = tcf_police_hash(p->index); + PSCHED_GET_TIME(police->tcfp_t_c); + police->tcf_index = parm->index ? parm->index : + tcf_hash_new_index(&police_idx_gen, &police_hash_info); + h = tcf_hash(police->tcf_index, POL_TAB_MASK); write_lock_bh(&police_lock); - p->next = tcf_police_ht[h]; - tcf_police_ht[h] = p; + police->tcf_next = tcf_police_ht[h]; + tcf_police_ht[h] = &police->common; write_unlock_bh(&police_lock); - a->priv = p; + a->priv = police; return ret; failure: if (ret == ACT_P_CREATED) - kfree(p); + kfree(police); return err; } static int tcf_act_police_cleanup(struct tc_action *a, int bind) { - struct tcf_police *p = PRIV(a); + struct tcf_police *p = a->priv; if (p != NULL) return tcf_police_release(p, bind); @@ -285,86 +256,87 @@ static int tcf_act_police_cleanup(struct tc_action *a, int bind) static int tcf_act_police(struct sk_buff *skb, struct tc_action *a, struct tcf_result *res) { + struct tcf_police *police = a->priv; psched_time_t now; - struct tcf_police *p = PRIV(a); long toks; long ptoks = 0; - spin_lock(&p->lock); + spin_lock(&police->tcf_lock); - p->bstats.bytes += skb->len; - p->bstats.packets++; + police->tcf_bstats.bytes += skb->len; + police->tcf_bstats.packets++; #ifdef CONFIG_NET_ESTIMATOR - if (p->ewma_rate && p->rate_est.bps >= p->ewma_rate) { - p->qstats.overlimits++; - spin_unlock(&p->lock); - return p->action; + if (police->tcfp_ewma_rate && + police->tcf_rate_est.bps >= police->tcfp_ewma_rate) { + police->tcf_qstats.overlimits++; + spin_unlock(&police->tcf_lock); + return police->tcf_action; } #endif - if (skb->len <= p->mtu) { - if (p->R_tab == NULL) { - spin_unlock(&p->lock); - return p->result; + if (skb->len <= police->tcfp_mtu) { + if (police->tcfp_R_tab == NULL) { + spin_unlock(&police->tcf_lock); + return police->tcfp_result; } PSCHED_GET_TIME(now); - toks = PSCHED_TDIFF_SAFE(now, p->t_c, p->burst); - - if (p->P_tab) { - ptoks = toks + p->ptoks; - if (ptoks > (long)L2T_P(p, p->mtu)) - ptoks = (long)L2T_P(p, p->mtu); - ptoks -= L2T_P(p, skb->len); + toks = PSCHED_TDIFF_SAFE(now, police->tcfp_t_c, + police->tcfp_burst); + if (police->tcfp_P_tab) { + ptoks = toks + police->tcfp_ptoks; + if (ptoks > (long)L2T_P(police, police->tcfp_mtu)) + ptoks = (long)L2T_P(police, police->tcfp_mtu); + ptoks -= L2T_P(police, skb->len); } - toks += p->toks; - if (toks > (long)p->burst) - toks = p->burst; - toks -= L2T(p, skb->len); - + toks += police->tcfp_toks; + if (toks > (long)police->tcfp_burst) + toks = police->tcfp_burst; + toks -= L2T(police, skb->len); if ((toks|ptoks) >= 0) { - p->t_c = now; - p->toks = toks; - p->ptoks = ptoks; - spin_unlock(&p->lock); - return p->result; + police->tcfp_t_c = now; + police->tcfp_toks = toks; + police->tcfp_ptoks = ptoks; + spin_unlock(&police->tcf_lock); + return police->tcfp_result; } } - p->qstats.overlimits++; - spin_unlock(&p->lock); - return p->action; + police->tcf_qstats.overlimits++; + spin_unlock(&police->tcf_lock); + return police->tcf_action; } static int tcf_act_police_dump(struct sk_buff *skb, struct tc_action *a, int bind, int ref) { unsigned char *b = skb->tail; + struct tcf_police *police = a->priv; struct tc_police opt; - struct tcf_police *p = PRIV(a); - - opt.index = p->index; - opt.action = p->action; - opt.mtu = p->mtu; - opt.burst = p->burst; - opt.refcnt = p->refcnt - ref; - opt.bindcnt = p->bindcnt - bind; - if (p->R_tab) - opt.rate = p->R_tab->rate; + + opt.index = police->tcf_index; + opt.action = police->tcf_action; + opt.mtu = police->tcfp_mtu; + opt.burst = police->tcfp_burst; + opt.refcnt = police->tcf_refcnt - ref; + opt.bindcnt = police->tcf_bindcnt - bind; + if (police->tcfp_R_tab) + opt.rate = police->tcfp_R_tab->rate; else memset(&opt.rate, 0, sizeof(opt.rate)); - if (p->P_tab) - opt.peakrate = p->P_tab->rate; + if (police->tcfp_P_tab) + opt.peakrate = police->tcfp_P_tab->rate; else memset(&opt.peakrate, 0, sizeof(opt.peakrate)); RTA_PUT(skb, TCA_POLICE_TBF, sizeof(opt), &opt); - if (p->result) - RTA_PUT(skb, TCA_POLICE_RESULT, sizeof(int), &p->result); + if (police->tcfp_result) + RTA_PUT(skb, TCA_POLICE_RESULT, sizeof(int), + &police->tcfp_result); #ifdef CONFIG_NET_ESTIMATOR - if (p->ewma_rate) - RTA_PUT(skb, TCA_POLICE_AVRATE, 4, &p->ewma_rate); + if (police->tcfp_ewma_rate) + RTA_PUT(skb, TCA_POLICE_AVRATE, 4, &police->tcfp_ewma_rate); #endif return skb->len; @@ -379,13 +351,14 @@ MODULE_LICENSE("GPL"); static struct tc_action_ops act_police_ops = { .kind = "police", + .hinfo = &police_hash_info, .type = TCA_ID_POLICE, .capab = TCA_CAP_NONE, .owner = THIS_MODULE, .act = tcf_act_police, .dump = tcf_act_police_dump, .cleanup = tcf_act_police_cleanup, - .lookup = tcf_act_police_hash_search, + .lookup = tcf_hash_search, .init = tcf_act_police_locate, .walk = tcf_act_police_walker }; @@ -407,10 +380,39 @@ module_exit(police_cleanup_module); #else /* CONFIG_NET_CLS_ACT */ -struct tcf_police * tcf_police_locate(struct rtattr *rta, struct rtattr *est) +static struct tcf_common *tcf_police_lookup(u32 index) { - unsigned h; - struct tcf_police *p; + struct tcf_hashinfo *hinfo = &police_hash_info; + struct tcf_common *p; + + read_lock(hinfo->lock); + for (p = hinfo->htab[tcf_hash(index, hinfo->hmask)]; p; + p = p->tcfc_next) { + if (p->tcfc_index == index) + break; + } + read_unlock(hinfo->lock); + + return p; +} + +static u32 tcf_police_new_index(void) +{ + u32 *idx_gen = &police_idx_gen; + u32 val = *idx_gen; + + do { + if (++val == 0) + val = 1; + } while (tcf_police_lookup(val)); + + return (*idx_gen = val); +} + +struct tcf_police *tcf_police_locate(struct rtattr *rta, struct rtattr *est) +{ + unsigned int h; + struct tcf_police *police; struct rtattr *tb[TCA_POLICE_MAX]; struct tc_police *parm; @@ -423,149 +425,158 @@ struct tcf_police * tcf_police_locate(struct rtattr *rta, struct rtattr *est) parm = RTA_DATA(tb[TCA_POLICE_TBF-1]); - if (parm->index && (p = tcf_police_lookup(parm->index)) != NULL) { - p->refcnt++; - return p; - } + if (parm->index) { + struct tcf_common *pc; - p = kzalloc(sizeof(*p), GFP_KERNEL); - if (p == NULL) + pc = tcf_police_lookup(parm->index); + if (pc) { + police = to_police(pc); + police->tcf_refcnt++; + return police; + } + } + police = kzalloc(sizeof(*police), GFP_KERNEL); + if (unlikely(!police)) return NULL; - p->refcnt = 1; - spin_lock_init(&p->lock); - p->stats_lock = &p->lock; + police->tcf_refcnt = 1; + spin_lock_init(&police->tcf_lock); + police->tcf_stats_lock = &police->tcf_lock; if (parm->rate.rate) { - p->R_tab = qdisc_get_rtab(&parm->rate, tb[TCA_POLICE_RATE-1]); - if (p->R_tab == NULL) + police->tcfp_R_tab = + qdisc_get_rtab(&parm->rate, tb[TCA_POLICE_RATE-1]); + if (police->tcfp_R_tab == NULL) goto failure; if (parm->peakrate.rate) { - p->P_tab = qdisc_get_rtab(&parm->peakrate, - tb[TCA_POLICE_PEAKRATE-1]); - if (p->P_tab == NULL) + police->tcfp_P_tab = + qdisc_get_rtab(&parm->peakrate, + tb[TCA_POLICE_PEAKRATE-1]); + if (police->tcfp_P_tab == NULL) goto failure; } } if (tb[TCA_POLICE_RESULT-1]) { if (RTA_PAYLOAD(tb[TCA_POLICE_RESULT-1]) != sizeof(u32)) goto failure; - p->result = *(u32*)RTA_DATA(tb[TCA_POLICE_RESULT-1]); + police->tcfp_result = *(u32*)RTA_DATA(tb[TCA_POLICE_RESULT-1]); } #ifdef CONFIG_NET_ESTIMATOR if (tb[TCA_POLICE_AVRATE-1]) { if (RTA_PAYLOAD(tb[TCA_POLICE_AVRATE-1]) != sizeof(u32)) goto failure; - p->ewma_rate = *(u32*)RTA_DATA(tb[TCA_POLICE_AVRATE-1]); + police->tcfp_ewma_rate = + *(u32*)RTA_DATA(tb[TCA_POLICE_AVRATE-1]); } #endif - p->toks = p->burst = parm->burst; - p->mtu = parm->mtu; - if (p->mtu == 0) { - p->mtu = ~0; - if (p->R_tab) - p->mtu = 255<R_tab->rate.cell_log; + police->tcfp_toks = police->tcfp_burst = parm->burst; + police->tcfp_mtu = parm->mtu; + if (police->tcfp_mtu == 0) { + police->tcfp_mtu = ~0; + if (police->tcfp_R_tab) + police->tcfp_mtu = 255<tcfp_R_tab->rate.cell_log; } - if (p->P_tab) - p->ptoks = L2T_P(p, p->mtu); - PSCHED_GET_TIME(p->t_c); - p->index = parm->index ? : tcf_police_new_index(); - p->action = parm->action; + if (police->tcfp_P_tab) + police->tcfp_ptoks = L2T_P(police, police->tcfp_mtu); + PSCHED_GET_TIME(police->tcfp_t_c); + police->tcf_index = parm->index ? parm->index : + tcf_police_new_index(); + police->tcf_action = parm->action; #ifdef CONFIG_NET_ESTIMATOR if (est) - gen_new_estimator(&p->bstats, &p->rate_est, p->stats_lock, est); + gen_new_estimator(&police->tcf_bstats, &police->tcf_rate_est, + police->tcf_stats_lock, est); #endif - h = tcf_police_hash(p->index); + h = tcf_hash(police->tcf_index, POL_TAB_MASK); write_lock_bh(&police_lock); - p->next = tcf_police_ht[h]; - tcf_police_ht[h] = p; + police->tcf_next = tcf_police_ht[h]; + tcf_police_ht[h] = &police->common; write_unlock_bh(&police_lock); - return p; + return police; failure: - if (p->R_tab) - qdisc_put_rtab(p->R_tab); - kfree(p); + if (police->tcfp_R_tab) + qdisc_put_rtab(police->tcfp_R_tab); + kfree(police); return NULL; } -int tcf_police(struct sk_buff *skb, struct tcf_police *p) +int tcf_police(struct sk_buff *skb, struct tcf_police *police) { psched_time_t now; long toks; long ptoks = 0; - spin_lock(&p->lock); + spin_lock(&police->tcf_lock); - p->bstats.bytes += skb->len; - p->bstats.packets++; + police->tcf_bstats.bytes += skb->len; + police->tcf_bstats.packets++; #ifdef CONFIG_NET_ESTIMATOR - if (p->ewma_rate && p->rate_est.bps >= p->ewma_rate) { - p->qstats.overlimits++; - spin_unlock(&p->lock); - return p->action; + if (police->tcfp_ewma_rate && + police->tcf_rate_est.bps >= police->tcfp_ewma_rate) { + police->tcf_qstats.overlimits++; + spin_unlock(&police->tcf_lock); + return police->tcf_action; } #endif - - if (skb->len <= p->mtu) { - if (p->R_tab == NULL) { - spin_unlock(&p->lock); - return p->result; + if (skb->len <= police->tcfp_mtu) { + if (police->tcfp_R_tab == NULL) { + spin_unlock(&police->tcf_lock); + return police->tcfp_result; } PSCHED_GET_TIME(now); - - toks = PSCHED_TDIFF_SAFE(now, p->t_c, p->burst); - - if (p->P_tab) { - ptoks = toks + p->ptoks; - if (ptoks > (long)L2T_P(p, p->mtu)) - ptoks = (long)L2T_P(p, p->mtu); - ptoks -= L2T_P(p, skb->len); + toks = PSCHED_TDIFF_SAFE(now, police->tcfp_t_c, + police->tcfp_burst); + if (police->tcfp_P_tab) { + ptoks = toks + police->tcfp_ptoks; + if (ptoks > (long)L2T_P(police, police->tcfp_mtu)) + ptoks = (long)L2T_P(police, police->tcfp_mtu); + ptoks -= L2T_P(police, skb->len); } - toks += p->toks; - if (toks > (long)p->burst) - toks = p->burst; - toks -= L2T(p, skb->len); - + toks += police->tcfp_toks; + if (toks > (long)police->tcfp_burst) + toks = police->tcfp_burst; + toks -= L2T(police, skb->len); if ((toks|ptoks) >= 0) { - p->t_c = now; - p->toks = toks; - p->ptoks = ptoks; - spin_unlock(&p->lock); - return p->result; + police->tcfp_t_c = now; + police->tcfp_toks = toks; + police->tcfp_ptoks = ptoks; + spin_unlock(&police->tcf_lock); + return police->tcfp_result; } } - p->qstats.overlimits++; - spin_unlock(&p->lock); - return p->action; + police->tcf_qstats.overlimits++; + spin_unlock(&police->tcf_lock); + return police->tcf_action; } EXPORT_SYMBOL(tcf_police); -int tcf_police_dump(struct sk_buff *skb, struct tcf_police *p) +int tcf_police_dump(struct sk_buff *skb, struct tcf_police *police) { - unsigned char *b = skb->tail; + unsigned char *b = skb->tail; struct tc_police opt; - opt.index = p->index; - opt.action = p->action; - opt.mtu = p->mtu; - opt.burst = p->burst; - if (p->R_tab) - opt.rate = p->R_tab->rate; + opt.index = police->tcf_index; + opt.action = police->tcf_action; + opt.mtu = police->tcfp_mtu; + opt.burst = police->tcfp_burst; + if (police->tcfp_R_tab) + opt.rate = police->tcfp_R_tab->rate; else memset(&opt.rate, 0, sizeof(opt.rate)); - if (p->P_tab) - opt.peakrate = p->P_tab->rate; + if (police->tcfp_P_tab) + opt.peakrate = police->tcfp_P_tab->rate; else memset(&opt.peakrate, 0, sizeof(opt.peakrate)); RTA_PUT(skb, TCA_POLICE_TBF, sizeof(opt), &opt); - if (p->result) - RTA_PUT(skb, TCA_POLICE_RESULT, sizeof(int), &p->result); + if (police->tcfp_result) + RTA_PUT(skb, TCA_POLICE_RESULT, sizeof(int), + &police->tcfp_result); #ifdef CONFIG_NET_ESTIMATOR - if (p->ewma_rate) - RTA_PUT(skb, TCA_POLICE_AVRATE, 4, &p->ewma_rate); + if (police->tcfp_ewma_rate) + RTA_PUT(skb, TCA_POLICE_AVRATE, 4, &police->tcfp_ewma_rate); #endif return skb->len; @@ -574,19 +585,20 @@ rtattr_failure: return -1; } -int tcf_police_dump_stats(struct sk_buff *skb, struct tcf_police *p) +int tcf_police_dump_stats(struct sk_buff *skb, struct tcf_police *police) { struct gnet_dump d; if (gnet_stats_start_copy_compat(skb, TCA_STATS2, TCA_STATS, - TCA_XSTATS, p->stats_lock, &d) < 0) + TCA_XSTATS, police->tcf_stats_lock, + &d) < 0) goto errout; - if (gnet_stats_copy_basic(&d, &p->bstats) < 0 || + if (gnet_stats_copy_basic(&d, &police->tcf_bstats) < 0 || #ifdef CONFIG_NET_ESTIMATOR - gnet_stats_copy_rate_est(&d, &p->rate_est) < 0 || + gnet_stats_copy_rate_est(&d, &police->tcf_rate_est) < 0 || #endif - gnet_stats_copy_queue(&d, &p->qstats) < 0) + gnet_stats_copy_queue(&d, &police->tcf_qstats) < 0) goto errout; if (gnet_stats_finish_copy(&d) < 0) diff --git a/net/sched/act_simple.c b/net/sched/act_simple.c index 17105c8..8c1ab8a 100644 --- a/net/sched/act_simple.c +++ b/net/sched/act_simple.c @@ -20,54 +20,175 @@ #define TCA_ACT_SIMP 22 -/* XXX: Hide all these common elements under some macro - * probably -*/ #include #include -/* use generic hash table with 8 buckets */ -#define MY_TAB_SIZE 8 -#define MY_TAB_MASK (MY_TAB_SIZE - 1) -static u32 idx_gen; -static struct tcf_defact *tcf_simp_ht[MY_TAB_SIZE]; +#define SIMP_TAB_MASK 7 +static struct tcf_common *tcf_simp_ht[SIMP_TAB_MASK + 1]; +static u32 simp_idx_gen; static DEFINE_RWLOCK(simp_lock); -/* override the defaults */ -#define tcf_st tcf_defact -#define tc_st tc_defact -#define tcf_t_lock simp_lock -#define tcf_ht tcf_simp_ht - -#define CONFIG_NET_ACT_INIT 1 -#include -#include +struct tcf_hashinfo simp_hash_info = { + .htab = tcf_simp_ht, + .hmask = SIMP_TAB_MASK, + .lock = &simp_lock, +}; static int tcf_simp(struct sk_buff *skb, struct tc_action *a, struct tcf_result *res) { - struct tcf_defact *p = PRIV(a, defact); + struct tcf_defact *d = a->priv; - spin_lock(&p->lock); - p->tm.lastuse = jiffies; - p->bstats.bytes += skb->len; - p->bstats.packets++; + spin_lock(&d->tcf_lock); + d->tcf_tm.lastuse = jiffies; + d->tcf_bstats.bytes += skb->len; + d->tcf_bstats.packets++; /* print policy string followed by _ then packet count * Example if this was the 3rd packet and the string was "hello" * then it would look like "hello_3" (without quotes) **/ - printk("simple: %s_%d\n", (char *)p->defdata, p->bstats.packets); - spin_unlock(&p->lock); - return p->action; + printk("simple: %s_%d\n", + (char *)d->tcfd_defdata, d->tcf_bstats.packets); + spin_unlock(&d->tcf_lock); + return d->tcf_action; +} + +static int tcf_simp_release(struct tcf_defact *d, int bind) +{ + int ret = 0; + if (d) { + if (bind) + d->tcf_bindcnt--; + d->tcf_refcnt--; + if (d->tcf_bindcnt <= 0 && d->tcf_refcnt <= 0) { + kfree(d->tcfd_defdata); + tcf_hash_destroy(&d->common, &simp_hash_info); + ret = 1; + } + } + return ret; +} + +static int alloc_defdata(struct tcf_defact *d, u32 datalen, void *defdata) +{ + d->tcfd_defdata = kmalloc(datalen, GFP_KERNEL); + if (unlikely(!d->tcfd_defdata)) + return -ENOMEM; + d->tcfd_datalen = datalen; + memcpy(d->tcfd_defdata, defdata, datalen); + return 0; +} + +static int realloc_defdata(struct tcf_defact *d, u32 datalen, void *defdata) +{ + kfree(d->tcfd_defdata); + return alloc_defdata(d, datalen, defdata); +} + +static int tcf_simp_init(struct rtattr *rta, struct rtattr *est, + struct tc_action *a, int ovr, int bind) +{ + struct rtattr *tb[TCA_DEF_MAX]; + struct tc_defact *parm; + struct tcf_defact *d; + struct tcf_common *pc; + void *defdata; + u32 datalen = 0; + int ret = 0; + + if (rta == NULL || rtattr_parse_nested(tb, TCA_DEF_MAX, rta) < 0) + return -EINVAL; + + if (tb[TCA_DEF_PARMS - 1] == NULL || + RTA_PAYLOAD(tb[TCA_DEF_PARMS - 1]) < sizeof(*parm)) + return -EINVAL; + + parm = RTA_DATA(tb[TCA_DEF_PARMS - 1]); + defdata = RTA_DATA(tb[TCA_DEF_DATA - 1]); + if (defdata == NULL) + return -EINVAL; + + datalen = RTA_PAYLOAD(tb[TCA_DEF_DATA - 1]); + if (datalen <= 0) + return -EINVAL; + + pc = tcf_hash_check(parm->index, a, bind, &simp_hash_info); + if (!pc) { + pc = tcf_hash_create(parm->index, est, a, sizeof(*d), bind, + &simp_idx_gen, &simp_hash_info); + if (unlikely(!pc)) + return -ENOMEM; + + d = to_defact(pc); + ret = alloc_defdata(d, datalen, defdata); + if (ret < 0) { + kfree(pc); + return ret; + } + ret = ACT_P_CREATED; + } else { + d = to_defact(pc); + if (!ovr) { + tcf_simp_release(d, bind); + return -EEXIST; + } + realloc_defdata(d, datalen, defdata); + } + + spin_lock_bh(&d->tcf_lock); + d->tcf_action = parm->action; + spin_unlock_bh(&d->tcf_lock); + + if (ret == ACT_P_CREATED) + tcf_hash_insert(pc, &simp_hash_info); + return ret; +} + +static inline int tcf_simp_cleanup(struct tc_action *a, int bind) +{ + struct tcf_defact *d = a->priv; + + if (d) + return tcf_simp_release(d, bind); + return 0; +} + +static inline int tcf_simp_dump(struct sk_buff *skb, struct tc_action *a, + int bind, int ref) +{ + unsigned char *b = skb->tail; + struct tcf_defact *d = a->priv; + struct tc_defact opt; + struct tcf_t t; + + opt.index = d->tcf_index; + opt.refcnt = d->tcf_refcnt - ref; + opt.bindcnt = d->tcf_bindcnt - bind; + opt.action = d->tcf_action; + RTA_PUT(skb, TCA_DEF_PARMS, sizeof(opt), &opt); + RTA_PUT(skb, TCA_DEF_DATA, d->tcfd_datalen, d->tcfd_defdata); + t.install = jiffies_to_clock_t(jiffies - d->tcf_tm.install); + t.lastuse = jiffies_to_clock_t(jiffies - d->tcf_tm.lastuse); + t.expires = jiffies_to_clock_t(d->tcf_tm.expires); + RTA_PUT(skb, TCA_DEF_TM, sizeof(t), &t); + return skb->len; + +rtattr_failure: + skb_trim(skb, b - skb->data); + return -1; } static struct tc_action_ops act_simp_ops = { - .kind = "simple", - .type = TCA_ACT_SIMP, - .capab = TCA_CAP_NONE, - .owner = THIS_MODULE, - .act = tcf_simp, - tca_use_default_ops + .kind = "simple", + .hinfo = &simp_hash_info, + .type = TCA_ACT_SIMP, + .capab = TCA_CAP_NONE, + .owner = THIS_MODULE, + .act = tcf_simp, + .dump = tcf_simp_dump, + .cleanup = tcf_simp_cleanup, + .init = tcf_simp_init, + .walk = tcf_generic_walker, }; MODULE_AUTHOR("Jamal Hadi Salim(2005)"); -- cgit v0.10.2 From e0a1ad73d34fd6dfdb630479400511e9879069c0 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 22 Aug 2006 00:00:21 -0700 Subject: [IPv6] route: Simplify ip6_del_rt() Provide a simple ip6_del_rt() for the majority of users and an alternative for the exception via netlink. Avoids code obfuscation. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h index 01bfe40..a7e6086 100644 --- a/include/net/ip6_route.h +++ b/include/net/ip6_route.h @@ -69,10 +69,7 @@ extern int ip6_ins_rt(struct rt6_info *, struct nlmsghdr *, void *rtattr, struct netlink_skb_parms *req); -extern int ip6_del_rt(struct rt6_info *, - struct nlmsghdr *, - void *rtattr, - struct netlink_skb_parms *req); +extern int ip6_del_rt(struct rt6_info *); extern int ip6_rt_addr_add(struct in6_addr *addr, struct net_device *dev, diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index f1ede90..27f2e33 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -736,7 +736,7 @@ static void ipv6_del_addr(struct inet6_ifaddr *ifp) if (rt && ((rt->rt6i_flags & (RTF_GATEWAY | RTF_DEFAULT)) == 0)) { if (onlink == 0) { - ip6_del_rt(rt, NULL, NULL, NULL); + ip6_del_rt(rt); rt = NULL; } else if (!(rt->rt6i_flags & RTF_EXPIRES)) { rt->rt6i_expires = expires; @@ -1662,7 +1662,7 @@ void addrconf_prefix_rcv(struct net_device *dev, u8 *opt, int len) if (rt && ((rt->rt6i_flags & (RTF_GATEWAY | RTF_DEFAULT)) == 0)) { if (rt->rt6i_flags&RTF_EXPIRES) { if (valid_lft == 0) { - ip6_del_rt(rt, NULL, NULL, NULL); + ip6_del_rt(rt); rt = NULL; } else { rt->rt6i_expires = jiffies + rt_expires; @@ -3557,7 +3557,7 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) addrconf_leave_anycast(ifp); addrconf_leave_solict(ifp->idev, &ifp->addr); dst_hold(&ifp->rt->u.dst); - if (ip6_del_rt(ifp->rt, NULL, NULL, NULL)) + if (ip6_del_rt(ifp->rt)) dst_free(&ifp->rt->u.dst); break; } diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c index 5743e8b..419d651 100644 --- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -961,7 +961,7 @@ static void ndisc_recv_na(struct sk_buff *skb) struct rt6_info *rt; rt = rt6_get_dflt_router(saddr, dev); if (rt) - ip6_del_rt(rt, NULL, NULL, NULL); + ip6_del_rt(rt); } out: @@ -1114,7 +1114,7 @@ static void ndisc_router_discovery(struct sk_buff *skb) if (rt && lifetime == 0) { neigh_clone(neigh); - ip6_del_rt(rt, NULL, NULL, NULL); + ip6_del_rt(rt); rt = NULL; } diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 1aca787..8d511de 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -457,7 +457,7 @@ int rt6_route_rcv(struct net_device *dev, u8 *opt, int len, rt = rt6_get_route_info(prefix, rinfo->prefix_len, gwaddr, dev->ifindex); if (rt && !lifetime) { - ip6_del_rt(rt, NULL, NULL, NULL); + ip6_del_rt(rt); rt = NULL; } @@ -813,7 +813,7 @@ static struct dst_entry *ip6_negative_advice(struct dst_entry *dst) if (rt) { if (rt->rt6i_flags & RTF_CACHE) - ip6_del_rt(rt, NULL, NULL, NULL); + ip6_del_rt(rt); else dst_release(dst); } @@ -1218,7 +1218,8 @@ out: return err; } -int ip6_del_rt(struct rt6_info *rt, struct nlmsghdr *nlh, void *_rtattr, struct netlink_skb_parms *req) +static int __ip6_del_rt(struct rt6_info *rt, struct nlmsghdr *nlh, + void *_rtattr, struct netlink_skb_parms *req) { int err; struct fib6_table *table; @@ -1237,6 +1238,11 @@ int ip6_del_rt(struct rt6_info *rt, struct nlmsghdr *nlh, void *_rtattr, struct return err; } +int ip6_del_rt(struct rt6_info *rt) +{ + return __ip6_del_rt(rt, NULL, NULL, NULL); +} + static int ip6_route_del(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, void *_rtattr, struct netlink_skb_parms *req, u32 table_id) @@ -1271,7 +1277,7 @@ static int ip6_route_del(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, dst_hold(&rt->u.dst); read_unlock_bh(&table->tb6_lock); - return ip6_del_rt(rt, nlh, _rtattr, req); + return __ip6_del_rt(rt, nlh, _rtattr, req); } } read_unlock_bh(&table->tb6_lock); @@ -1395,7 +1401,7 @@ restart: call_netevent_notifiers(NETEVENT_REDIRECT, &netevent); if (rt->rt6i_flags&RTF_CACHE) { - ip6_del_rt(rt, NULL, NULL, NULL); + ip6_del_rt(rt); return; } @@ -1631,7 +1637,7 @@ restart: if (rt->rt6i_flags & (RTF_DEFAULT | RTF_ADDRCONF)) { dst_hold(&rt->u.dst); read_unlock_bh(&table->tb6_lock); - ip6_del_rt(rt, NULL, NULL, NULL); + ip6_del_rt(rt); goto restart; } } -- cgit v0.10.2 From 40e22e8f3d4d4f1ff68fb03683f007c53ee8b348 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 22 Aug 2006 00:00:45 -0700 Subject: [IPv6] route: Simplify ip6_ins_rt() Provide a simple ip6_ins_rt() for the majority of users and an alternative for the exception via netlink. Avoids code obfuscation. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h index a7e6086..172c476 100644 --- a/include/net/ip6_route.h +++ b/include/net/ip6_route.h @@ -65,10 +65,7 @@ extern int ip6_route_add(struct in6_rtmsg *rtmsg, void *rtattr, struct netlink_skb_parms *req, u32 table_id); -extern int ip6_ins_rt(struct rt6_info *, - struct nlmsghdr *, - void *rtattr, - struct netlink_skb_parms *req); +extern int ip6_ins_rt(struct rt6_info *); extern int ip6_del_rt(struct rt6_info *); extern int ip6_rt_addr_add(struct in6_addr *addr, diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 27f2e33..aafba9e 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -3548,7 +3548,7 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) switch (event) { case RTM_NEWADDR: - ip6_ins_rt(ifp->rt, NULL, NULL, NULL); + ip6_ins_rt(ifp->rt); if (ifp->idev->cnf.forwarding) addrconf_join_anycast(ifp); break; diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c index f6881d7..abbc35a 100644 --- a/net/ipv6/anycast.c +++ b/net/ipv6/anycast.c @@ -335,7 +335,7 @@ int ipv6_dev_ac_inc(struct net_device *dev, struct in6_addr *addr) write_unlock_bh(&idev->lock); dst_hold(&rt->u.dst); - if (ip6_ins_rt(rt, NULL, NULL, NULL)) + if (ip6_ins_rt(rt)) dst_release(&rt->u.dst); addrconf_join_solict(dev, &aca->aca_addr); diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 8d511de..9ec348a 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -546,8 +546,8 @@ struct rt6_info *rt6_lookup(struct in6_addr *daddr, struct in6_addr *saddr, be destroyed. */ -int ip6_ins_rt(struct rt6_info *rt, struct nlmsghdr *nlh, - void *_rtattr, struct netlink_skb_parms *req) +static int __ip6_ins_rt(struct rt6_info *rt, struct nlmsghdr *nlh, + void *_rtattr, struct netlink_skb_parms *req) { int err; struct fib6_table *table; @@ -560,6 +560,11 @@ int ip6_ins_rt(struct rt6_info *rt, struct nlmsghdr *nlh, return err; } +int ip6_ins_rt(struct rt6_info *rt) +{ + return __ip6_ins_rt(rt, NULL, NULL, NULL); +} + static struct rt6_info *rt6_alloc_cow(struct rt6_info *ort, struct in6_addr *daddr, struct in6_addr *saddr) { @@ -657,7 +662,7 @@ restart: dst_hold(&rt->u.dst); if (nrt) { - err = ip6_ins_rt(nrt, NULL, NULL, NULL); + err = ip6_ins_rt(nrt); if (!err) goto out2; } @@ -752,7 +757,7 @@ restart: dst_hold(&rt->u.dst); if (nrt) { - err = ip6_ins_rt(nrt, NULL, NULL, NULL); + err = ip6_ins_rt(nrt); if (!err) goto out2; } @@ -1206,7 +1211,7 @@ install_route: rt->u.dst.dev = dev; rt->rt6i_idev = idev; rt->rt6i_table = table; - return ip6_ins_rt(rt, nlh, _rtattr, req); + return __ip6_ins_rt(rt, nlh, _rtattr, req); out: if (dev) @@ -1393,7 +1398,7 @@ restart: nrt->u.dst.metrics[RTAX_MTU-1] = ipv6_get_mtu(neigh->dev); nrt->u.dst.metrics[RTAX_ADVMSS-1] = ipv6_advmss(dst_mtu(&nrt->u.dst)); - if (ip6_ins_rt(nrt, NULL, NULL, NULL)) + if (ip6_ins_rt(nrt)) goto out; netevent.old = &rt->u.dst; @@ -1483,7 +1488,7 @@ void rt6_pmtu_discovery(struct in6_addr *daddr, struct in6_addr *saddr, dst_set_expires(&nrt->u.dst, ip6_rt_mtu_expires); nrt->rt6i_flags |= RTF_DYNAMIC|RTF_EXPIRES; - ip6_ins_rt(nrt, NULL, NULL, NULL); + ip6_ins_rt(nrt); } out: dst_release(&rt->u.dst); -- cgit v0.10.2 From 86872cb57925c46a6499887d77afb880a892c0ec Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 22 Aug 2006 00:01:08 -0700 Subject: [IPv6] route: FIB6 configuration using struct fib6_config Replaces the struct in6_rtmsg based interface orignating from the ioctl interface with a struct fib6_config based on. Allows changing the interface without breaking the ioctl interface and avoids passing on tons of parameters. The recently introduced struct nl_info is used to pass on netlink authorship information for notifications. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h index 69c4442..9610b887f 100644 --- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -16,14 +16,35 @@ #ifdef __KERNEL__ #include - -#include -#include #include #include +#include +#include +#include struct rt6_info; +struct fib6_config +{ + u32 fc_table; + u32 fc_metric; + int fc_dst_len; + int fc_src_len; + int fc_ifindex; + u32 fc_flags; + u32 fc_protocol; + + struct in6_addr fc_dst; + struct in6_addr fc_src; + struct in6_addr fc_gateway; + + unsigned long fc_expires; + struct nlattr *fc_mx; + int fc_mx_len; + + struct nl_info fc_nlinfo; +}; + struct fib6_node { struct fib6_node *parent; @@ -175,18 +196,13 @@ extern void fib6_clean_all(int (*func)(struct rt6_info *, void *arg), extern int fib6_add(struct fib6_node *root, struct rt6_info *rt, - struct nlmsghdr *nlh, - void *rtattr, - struct netlink_skb_parms *req); + struct nl_info *info); extern int fib6_del(struct rt6_info *rt, - struct nlmsghdr *nlh, - void *rtattr, - struct netlink_skb_parms *req); + struct nl_info *info); extern void inet6_rt_notify(int event, struct rt6_info *rt, - struct nlmsghdr *nlh, - struct netlink_skb_parms *req); + struct nl_info *info); extern void fib6_run_gc(unsigned long dummy); diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h index 172c476..3f170f6 100644 --- a/include/net/ip6_route.h +++ b/include/net/ip6_route.h @@ -60,11 +60,7 @@ extern void ip6_route_cleanup(void); extern int ipv6_route_ioctl(unsigned int cmd, void __user *arg); -extern int ip6_route_add(struct in6_rtmsg *rtmsg, - struct nlmsghdr *, - void *rtattr, - struct netlink_skb_parms *req, - u32 table_id); +extern int ip6_route_add(struct fib6_config *cfg); extern int ip6_ins_rt(struct rt6_info *); extern int ip6_del_rt(struct rt6_info *); diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index aafba9e..fc9cff3 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -1509,59 +1509,56 @@ static void addrconf_prefix_route(struct in6_addr *pfx, int plen, struct net_device *dev, unsigned long expires, u32 flags) { - struct in6_rtmsg rtmsg; + struct fib6_config cfg = { + .fc_table = RT6_TABLE_PREFIX, + .fc_metric = IP6_RT_PRIO_ADDRCONF, + .fc_ifindex = dev->ifindex, + .fc_expires = expires, + .fc_dst_len = plen, + .fc_flags = RTF_UP | flags, + }; - memset(&rtmsg, 0, sizeof(rtmsg)); - ipv6_addr_copy(&rtmsg.rtmsg_dst, pfx); - rtmsg.rtmsg_dst_len = plen; - rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; - rtmsg.rtmsg_ifindex = dev->ifindex; - rtmsg.rtmsg_info = expires; - rtmsg.rtmsg_flags = RTF_UP|flags; - rtmsg.rtmsg_type = RTMSG_NEWROUTE; + ipv6_addr_copy(&cfg.fc_dst, pfx); /* Prevent useless cloning on PtP SIT. This thing is done here expecting that the whole class of non-broadcast devices need not cloning. */ - if (dev->type == ARPHRD_SIT && (dev->flags&IFF_POINTOPOINT)) - rtmsg.rtmsg_flags |= RTF_NONEXTHOP; + if (dev->type == ARPHRD_SIT && (dev->flags & IFF_POINTOPOINT)) + cfg.fc_flags |= RTF_NONEXTHOP; - ip6_route_add(&rtmsg, NULL, NULL, NULL, RT6_TABLE_PREFIX); + ip6_route_add(&cfg); } /* Create "default" multicast route to the interface */ static void addrconf_add_mroute(struct net_device *dev) { - struct in6_rtmsg rtmsg; - - memset(&rtmsg, 0, sizeof(rtmsg)); - ipv6_addr_set(&rtmsg.rtmsg_dst, - htonl(0xFF000000), 0, 0, 0); - rtmsg.rtmsg_dst_len = 8; - rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; - rtmsg.rtmsg_ifindex = dev->ifindex; - rtmsg.rtmsg_flags = RTF_UP; - rtmsg.rtmsg_type = RTMSG_NEWROUTE; - ip6_route_add(&rtmsg, NULL, NULL, NULL, RT6_TABLE_LOCAL); + struct fib6_config cfg = { + .fc_table = RT6_TABLE_LOCAL, + .fc_metric = IP6_RT_PRIO_ADDRCONF, + .fc_ifindex = dev->ifindex, + .fc_dst_len = 8, + .fc_flags = RTF_UP, + }; + + ipv6_addr_set(&cfg.fc_dst, htonl(0xFF000000), 0, 0, 0); + + ip6_route_add(&cfg); } static void sit_route_add(struct net_device *dev) { - struct in6_rtmsg rtmsg; - - memset(&rtmsg, 0, sizeof(rtmsg)); - - rtmsg.rtmsg_type = RTMSG_NEWROUTE; - rtmsg.rtmsg_metric = IP6_RT_PRIO_ADDRCONF; + struct fib6_config cfg = { + .fc_table = RT6_TABLE_MAIN, + .fc_metric = IP6_RT_PRIO_ADDRCONF, + .fc_ifindex = dev->ifindex, + .fc_dst_len = 96, + .fc_flags = RTF_UP | RTF_NONEXTHOP, + }; /* prefix length - 96 bits "::d.d.d.d" */ - rtmsg.rtmsg_dst_len = 96; - rtmsg.rtmsg_flags = RTF_UP|RTF_NONEXTHOP; - rtmsg.rtmsg_ifindex = dev->ifindex; - - ip6_route_add(&rtmsg, NULL, NULL, NULL, RT6_TABLE_MAIN); + ip6_route_add(&cfg); } static void addrconf_add_lroute(struct net_device *dev) diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index be36f4a..667b1b1 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -610,7 +610,7 @@ insert_above: */ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt, - struct nlmsghdr *nlh, struct netlink_skb_parms *req) + struct nl_info *info) { struct rt6_info *iter = NULL; struct rt6_info **ins; @@ -665,7 +665,7 @@ out: *ins = rt; rt->rt6i_node = fn; atomic_inc(&rt->rt6i_ref); - inet6_rt_notify(RTM_NEWROUTE, rt, nlh, req); + inet6_rt_notify(RTM_NEWROUTE, rt, info); rt6_stats.fib_rt_entries++; if ((fn->fn_flags & RTN_RTINFO) == 0) { @@ -695,8 +695,7 @@ void fib6_force_start_gc(void) * with source addr info in sub-trees */ -int fib6_add(struct fib6_node *root, struct rt6_info *rt, - struct nlmsghdr *nlh, void *_rtattr, struct netlink_skb_parms *req) +int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info) { struct fib6_node *fn; int err = -ENOMEM; @@ -769,7 +768,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, } #endif - err = fib6_add_rt2node(fn, rt, nlh, req); + err = fib6_add_rt2node(fn, rt, info); if (err == 0) { fib6_start_gc(rt); @@ -1076,7 +1075,7 @@ static struct fib6_node * fib6_repair_tree(struct fib6_node *fn) } static void fib6_del_route(struct fib6_node *fn, struct rt6_info **rtp, - struct nlmsghdr *nlh, void *_rtattr, struct netlink_skb_parms *req) + struct nl_info *info) { struct fib6_walker_t *w; struct rt6_info *rt = *rtp; @@ -1132,11 +1131,11 @@ static void fib6_del_route(struct fib6_node *fn, struct rt6_info **rtp, if (atomic_read(&rt->rt6i_ref) != 1) BUG(); } - inet6_rt_notify(RTM_DELROUTE, rt, nlh, req); + inet6_rt_notify(RTM_DELROUTE, rt, info); rt6_release(rt); } -int fib6_del(struct rt6_info *rt, struct nlmsghdr *nlh, void *_rtattr, struct netlink_skb_parms *req) +int fib6_del(struct rt6_info *rt, struct nl_info *info) { struct fib6_node *fn = rt->rt6i_node; struct rt6_info **rtp; @@ -1161,7 +1160,7 @@ int fib6_del(struct rt6_info *rt, struct nlmsghdr *nlh, void *_rtattr, struct ne for (rtp = &fn->leaf; *rtp; rtp = &(*rtp)->u.next) { if (*rtp == rt) { - fib6_del_route(fn, rtp, nlh, _rtattr, req); + fib6_del_route(fn, rtp, info); return 0; } } @@ -1290,7 +1289,7 @@ static int fib6_clean_node(struct fib6_walker_t *w) res = c->func(rt, c->arg); if (res < 0) { w->leaf = rt; - res = fib6_del(rt, NULL, NULL, NULL); + res = fib6_del(rt, NULL); if (res) { #if RT6_DEBUG >= 2 printk(KERN_DEBUG "fib6_clean_node: del failed: rt=%p@%p err=%d\n", rt, rt->rt6i_node, res); diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 9ec348a..7bcffa6 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -546,15 +546,14 @@ struct rt6_info *rt6_lookup(struct in6_addr *daddr, struct in6_addr *saddr, be destroyed. */ -static int __ip6_ins_rt(struct rt6_info *rt, struct nlmsghdr *nlh, - void *_rtattr, struct netlink_skb_parms *req) +static int __ip6_ins_rt(struct rt6_info *rt, struct nl_info *info) { int err; struct fib6_table *table; table = rt->rt6i_table; write_lock_bh(&table->tb6_lock); - err = fib6_add(&table->tb6_root, rt, nlh, _rtattr, req); + err = fib6_add(&table->tb6_root, rt, info); write_unlock_bh(&table->tb6_lock); return err; @@ -562,7 +561,7 @@ static int __ip6_ins_rt(struct rt6_info *rt, struct nlmsghdr *nlh, int ip6_ins_rt(struct rt6_info *rt) { - return __ip6_ins_rt(rt, NULL, NULL, NULL); + return __ip6_ins_rt(rt, NULL); } static struct rt6_info *rt6_alloc_cow(struct rt6_info *ort, struct in6_addr *daddr, @@ -1014,30 +1013,24 @@ int ipv6_get_hoplimit(struct net_device *dev) * */ -int ip6_route_add(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, - void *_rtattr, struct netlink_skb_parms *req, - u32 table_id) +int ip6_route_add(struct fib6_config *cfg) { int err; - struct rtmsg *r; - struct rtattr **rta; struct rt6_info *rt = NULL; struct net_device *dev = NULL; struct inet6_dev *idev = NULL; struct fib6_table *table; int addr_type; - rta = (struct rtattr **) _rtattr; - - if (rtmsg->rtmsg_dst_len > 128 || rtmsg->rtmsg_src_len > 128) + if (cfg->fc_dst_len > 128 || cfg->fc_src_len > 128) return -EINVAL; #ifndef CONFIG_IPV6_SUBTREES - if (rtmsg->rtmsg_src_len) + if (cfg->fc_src_len) return -EINVAL; #endif - if (rtmsg->rtmsg_ifindex) { + if (cfg->fc_ifindex) { err = -ENODEV; - dev = dev_get_by_index(rtmsg->rtmsg_ifindex); + dev = dev_get_by_index(cfg->fc_ifindex); if (!dev) goto out; idev = in6_dev_get(dev); @@ -1045,10 +1038,10 @@ int ip6_route_add(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, goto out; } - if (rtmsg->rtmsg_metric == 0) - rtmsg->rtmsg_metric = IP6_RT_PRIO_USER; + if (cfg->fc_metric == 0) + cfg->fc_metric = IP6_RT_PRIO_USER; - table = fib6_new_table(table_id); + table = fib6_new_table(cfg->fc_table); if (table == NULL) { err = -ENOBUFS; goto out; @@ -1062,14 +1055,13 @@ int ip6_route_add(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, } rt->u.dst.obsolete = -1; - rt->rt6i_expires = jiffies + clock_t_to_jiffies(rtmsg->rtmsg_info); - if (nlh && (r = NLMSG_DATA(nlh))) { - rt->rt6i_protocol = r->rtm_protocol; - } else { - rt->rt6i_protocol = RTPROT_BOOT; - } + rt->rt6i_expires = jiffies + clock_t_to_jiffies(cfg->fc_expires); + + if (cfg->fc_protocol == RTPROT_UNSPEC) + cfg->fc_protocol = RTPROT_BOOT; + rt->rt6i_protocol = cfg->fc_protocol; - addr_type = ipv6_addr_type(&rtmsg->rtmsg_dst); + addr_type = ipv6_addr_type(&cfg->fc_dst); if (addr_type & IPV6_ADDR_MULTICAST) rt->u.dst.input = ip6_mc_input; @@ -1078,24 +1070,22 @@ int ip6_route_add(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, rt->u.dst.output = ip6_output; - ipv6_addr_prefix(&rt->rt6i_dst.addr, - &rtmsg->rtmsg_dst, rtmsg->rtmsg_dst_len); - rt->rt6i_dst.plen = rtmsg->rtmsg_dst_len; + ipv6_addr_prefix(&rt->rt6i_dst.addr, &cfg->fc_dst, cfg->fc_dst_len); + rt->rt6i_dst.plen = cfg->fc_dst_len; if (rt->rt6i_dst.plen == 128) rt->u.dst.flags = DST_HOST; #ifdef CONFIG_IPV6_SUBTREES - ipv6_addr_prefix(&rt->rt6i_src.addr, - &rtmsg->rtmsg_src, rtmsg->rtmsg_src_len); - rt->rt6i_src.plen = rtmsg->rtmsg_src_len; + ipv6_addr_prefix(&rt->rt6i_src.addr, &cfg->fc_src, cfg->fc_src_len); + rt->rt6i_src.plen = cfg->fc_src_len; #endif - rt->rt6i_metric = rtmsg->rtmsg_metric; + rt->rt6i_metric = cfg->fc_metric; /* We cannot add true routes via loopback here, they would result in kernel looping; promote them to reject routes */ - if ((rtmsg->rtmsg_flags&RTF_REJECT) || + if ((cfg->fc_flags & RTF_REJECT) || (dev && (dev->flags&IFF_LOOPBACK) && !(addr_type&IPV6_ADDR_LOOPBACK))) { /* hold loopback dev/idev if we haven't done so. */ if (dev != &loopback_dev) { @@ -1118,12 +1108,12 @@ int ip6_route_add(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, goto install_route; } - if (rtmsg->rtmsg_flags & RTF_GATEWAY) { + if (cfg->fc_flags & RTF_GATEWAY) { struct in6_addr *gw_addr; int gwa_type; - gw_addr = &rtmsg->rtmsg_gateway; - ipv6_addr_copy(&rt->rt6i_gateway, &rtmsg->rtmsg_gateway); + gw_addr = &cfg->fc_gateway; + ipv6_addr_copy(&rt->rt6i_gateway, gw_addr); gwa_type = ipv6_addr_type(gw_addr); if (gwa_type != (IPV6_ADDR_LINKLOCAL|IPV6_ADDR_UNICAST)) { @@ -1140,7 +1130,7 @@ int ip6_route_add(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, if (!(gwa_type&IPV6_ADDR_UNICAST)) goto out; - grt = rt6_lookup(gw_addr, NULL, rtmsg->rtmsg_ifindex, 1); + grt = rt6_lookup(gw_addr, NULL, cfg->fc_ifindex, 1); err = -EHOSTUNREACH; if (grt == NULL) @@ -1172,7 +1162,7 @@ int ip6_route_add(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, if (dev == NULL) goto out; - if (rtmsg->rtmsg_flags & (RTF_GATEWAY|RTF_NONEXTHOP)) { + if (cfg->fc_flags & (RTF_GATEWAY | RTF_NONEXTHOP)) { rt->rt6i_nexthop = __neigh_lookup_errno(&nd_tbl, &rt->rt6i_gateway, dev); if (IS_ERR(rt->rt6i_nexthop)) { err = PTR_ERR(rt->rt6i_nexthop); @@ -1181,24 +1171,24 @@ int ip6_route_add(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, } } - rt->rt6i_flags = rtmsg->rtmsg_flags; + rt->rt6i_flags = cfg->fc_flags; install_route: - if (rta && rta[RTA_METRICS-1]) { - int attrlen = RTA_PAYLOAD(rta[RTA_METRICS-1]); - struct rtattr *attr = RTA_DATA(rta[RTA_METRICS-1]); - - while (RTA_OK(attr, attrlen)) { - unsigned flavor = attr->rta_type; - if (flavor) { - if (flavor > RTAX_MAX) { + if (cfg->fc_mx) { + struct nlattr *nla; + int remaining; + + nla_for_each_attr(nla, cfg->fc_mx, cfg->fc_mx_len, remaining) { + int type = nla->nla_type; + + if (type) { + if (type > RTAX_MAX) { err = -EINVAL; goto out; } - rt->u.dst.metrics[flavor-1] = - *(u32 *)RTA_DATA(attr); + + rt->u.dst.metrics[type - 1] = nla_get_u32(nla); } - attr = RTA_NEXT(attr, attrlen); } } @@ -1211,7 +1201,7 @@ install_route: rt->u.dst.dev = dev; rt->rt6i_idev = idev; rt->rt6i_table = table; - return __ip6_ins_rt(rt, nlh, _rtattr, req); + return __ip6_ins_rt(rt, &cfg->fc_nlinfo); out: if (dev) @@ -1223,8 +1213,7 @@ out: return err; } -static int __ip6_del_rt(struct rt6_info *rt, struct nlmsghdr *nlh, - void *_rtattr, struct netlink_skb_parms *req) +static int __ip6_del_rt(struct rt6_info *rt, struct nl_info *info) { int err; struct fib6_table *table; @@ -1235,7 +1224,7 @@ static int __ip6_del_rt(struct rt6_info *rt, struct nlmsghdr *nlh, table = rt->rt6i_table; write_lock_bh(&table->tb6_lock); - err = fib6_del(rt, nlh, _rtattr, req); + err = fib6_del(rt, info); dst_release(&rt->u.dst); write_unlock_bh(&table->tb6_lock); @@ -1245,44 +1234,41 @@ static int __ip6_del_rt(struct rt6_info *rt, struct nlmsghdr *nlh, int ip6_del_rt(struct rt6_info *rt) { - return __ip6_del_rt(rt, NULL, NULL, NULL); + return __ip6_del_rt(rt, NULL); } -static int ip6_route_del(struct in6_rtmsg *rtmsg, struct nlmsghdr *nlh, - void *_rtattr, struct netlink_skb_parms *req, - u32 table_id) +static int ip6_route_del(struct fib6_config *cfg) { struct fib6_table *table; struct fib6_node *fn; struct rt6_info *rt; int err = -ESRCH; - table = fib6_get_table(table_id); + table = fib6_get_table(cfg->fc_table); if (table == NULL) return err; read_lock_bh(&table->tb6_lock); fn = fib6_locate(&table->tb6_root, - &rtmsg->rtmsg_dst, rtmsg->rtmsg_dst_len, - &rtmsg->rtmsg_src, rtmsg->rtmsg_src_len); + &cfg->fc_dst, cfg->fc_dst_len, + &cfg->fc_src, cfg->fc_src_len); if (fn) { for (rt = fn->leaf; rt; rt = rt->u.next) { - if (rtmsg->rtmsg_ifindex && + if (cfg->fc_ifindex && (rt->rt6i_dev == NULL || - rt->rt6i_dev->ifindex != rtmsg->rtmsg_ifindex)) + rt->rt6i_dev->ifindex != cfg->fc_ifindex)) continue; - if (rtmsg->rtmsg_flags&RTF_GATEWAY && - !ipv6_addr_equal(&rtmsg->rtmsg_gateway, &rt->rt6i_gateway)) + if (cfg->fc_flags & RTF_GATEWAY && + !ipv6_addr_equal(&cfg->fc_gateway, &rt->rt6i_gateway)) continue; - if (rtmsg->rtmsg_metric && - rtmsg->rtmsg_metric != rt->rt6i_metric) + if (cfg->fc_metric && cfg->fc_metric != rt->rt6i_metric) continue; dst_hold(&rt->u.dst); read_unlock_bh(&table->tb6_lock); - return __ip6_del_rt(rt, nlh, _rtattr, req); + return __ip6_del_rt(rt, &cfg->fc_nlinfo); } } read_unlock_bh(&table->tb6_lock); @@ -1565,21 +1551,23 @@ static struct rt6_info *rt6_add_route_info(struct in6_addr *prefix, int prefixle struct in6_addr *gwaddr, int ifindex, unsigned pref) { - struct in6_rtmsg rtmsg; + struct fib6_config cfg = { + .fc_table = RT6_TABLE_INFO, + .fc_metric = 1024, + .fc_ifindex = ifindex, + .fc_dst_len = prefixlen, + .fc_flags = RTF_GATEWAY | RTF_ADDRCONF | RTF_ROUTEINFO | + RTF_UP | RTF_PREF(pref), + }; + + ipv6_addr_copy(&cfg.fc_dst, prefix); + ipv6_addr_copy(&cfg.fc_gateway, gwaddr); - memset(&rtmsg, 0, sizeof(rtmsg)); - rtmsg.rtmsg_type = RTMSG_NEWROUTE; - ipv6_addr_copy(&rtmsg.rtmsg_dst, prefix); - rtmsg.rtmsg_dst_len = prefixlen; - ipv6_addr_copy(&rtmsg.rtmsg_gateway, gwaddr); - rtmsg.rtmsg_metric = 1024; - rtmsg.rtmsg_flags = RTF_GATEWAY | RTF_ADDRCONF | RTF_ROUTEINFO | RTF_UP | RTF_PREF(pref); /* We should treat it as a default route if prefix length is 0. */ if (!prefixlen) - rtmsg.rtmsg_flags |= RTF_DEFAULT; - rtmsg.rtmsg_ifindex = ifindex; + cfg.fc_flags |= RTF_DEFAULT; - ip6_route_add(&rtmsg, NULL, NULL, NULL, RT6_TABLE_INFO); + ip6_route_add(&cfg); return rt6_get_route_info(prefix, prefixlen, gwaddr, ifindex); } @@ -1611,18 +1599,18 @@ struct rt6_info *rt6_add_dflt_router(struct in6_addr *gwaddr, struct net_device *dev, unsigned int pref) { - struct in6_rtmsg rtmsg; + struct fib6_config cfg = { + .fc_table = RT6_TABLE_DFLT, + .fc_metric = 1024, + .fc_ifindex = dev->ifindex, + .fc_flags = RTF_GATEWAY | RTF_ADDRCONF | RTF_DEFAULT | + RTF_UP | RTF_EXPIRES | RTF_PREF(pref), + }; - memset(&rtmsg, 0, sizeof(struct in6_rtmsg)); - rtmsg.rtmsg_type = RTMSG_NEWROUTE; - ipv6_addr_copy(&rtmsg.rtmsg_gateway, gwaddr); - rtmsg.rtmsg_metric = 1024; - rtmsg.rtmsg_flags = RTF_GATEWAY | RTF_ADDRCONF | RTF_DEFAULT | RTF_UP | RTF_EXPIRES | - RTF_PREF(pref); + ipv6_addr_copy(&cfg.fc_gateway, gwaddr); - rtmsg.rtmsg_ifindex = dev->ifindex; + ip6_route_add(&cfg); - ip6_route_add(&rtmsg, NULL, NULL, NULL, RT6_TABLE_DFLT); return rt6_get_dflt_router(gwaddr, dev); } @@ -1649,8 +1637,27 @@ restart: read_unlock_bh(&table->tb6_lock); } +static void rtmsg_to_fib6_config(struct in6_rtmsg *rtmsg, + struct fib6_config *cfg) +{ + memset(cfg, 0, sizeof(*cfg)); + + cfg->fc_table = RT6_TABLE_MAIN; + cfg->fc_ifindex = rtmsg->rtmsg_ifindex; + cfg->fc_metric = rtmsg->rtmsg_metric; + cfg->fc_expires = rtmsg->rtmsg_info; + cfg->fc_dst_len = rtmsg->rtmsg_dst_len; + cfg->fc_src_len = rtmsg->rtmsg_src_len; + cfg->fc_flags = rtmsg->rtmsg_flags; + + ipv6_addr_copy(&cfg->fc_dst, &rtmsg->rtmsg_dst); + ipv6_addr_copy(&cfg->fc_src, &rtmsg->rtmsg_src); + ipv6_addr_copy(&cfg->fc_gateway, &rtmsg->rtmsg_gateway); +} + int ipv6_route_ioctl(unsigned int cmd, void __user *arg) { + struct fib6_config cfg; struct in6_rtmsg rtmsg; int err; @@ -1663,16 +1670,16 @@ int ipv6_route_ioctl(unsigned int cmd, void __user *arg) sizeof(struct in6_rtmsg)); if (err) return -EFAULT; - + + rtmsg_to_fib6_config(&rtmsg, &cfg); + rtnl_lock(); switch (cmd) { case SIOCADDRT: - err = ip6_route_add(&rtmsg, NULL, NULL, NULL, - RT6_TABLE_MAIN); + err = ip6_route_add(&cfg); break; case SIOCDELRT: - err = ip6_route_del(&rtmsg, NULL, NULL, NULL, - RT6_TABLE_MAIN); + err = ip6_route_del(&cfg); break; default: err = -EINVAL; @@ -1823,66 +1830,104 @@ void rt6_mtu_change(struct net_device *dev, unsigned mtu) fib6_clean_all(rt6_mtu_change_route, 0, &arg); } -static int inet6_rtm_to_rtmsg(struct rtmsg *r, struct rtattr **rta, - struct in6_rtmsg *rtmsg) +static struct nla_policy rtm_ipv6_policy[RTA_MAX+1] __read_mostly = { + [RTA_GATEWAY] = { .minlen = sizeof(struct in6_addr) }, + [RTA_OIF] = { .type = NLA_U32 }, + [RTA_PRIORITY] = { .type = NLA_U32 }, + [RTA_METRICS] = { .type = NLA_NESTED }, +}; + +static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh, + struct fib6_config *cfg) { - memset(rtmsg, 0, sizeof(*rtmsg)); + struct rtmsg *rtm; + struct nlattr *tb[RTA_MAX+1]; + int err; + + err = nlmsg_parse(nlh, sizeof(*rtm), tb, RTA_MAX, rtm_ipv6_policy); + if (err < 0) + goto errout; - rtmsg->rtmsg_dst_len = r->rtm_dst_len; - rtmsg->rtmsg_src_len = r->rtm_src_len; - rtmsg->rtmsg_flags = RTF_UP; - if (r->rtm_type == RTN_UNREACHABLE) - rtmsg->rtmsg_flags |= RTF_REJECT; + err = -EINVAL; + rtm = nlmsg_data(nlh); + memset(cfg, 0, sizeof(*cfg)); - if (rta[RTA_GATEWAY-1]) { - if (rta[RTA_GATEWAY-1]->rta_len != RTA_LENGTH(16)) - return -EINVAL; - memcpy(&rtmsg->rtmsg_gateway, RTA_DATA(rta[RTA_GATEWAY-1]), 16); - rtmsg->rtmsg_flags |= RTF_GATEWAY; - } - if (rta[RTA_DST-1]) { - if (RTA_PAYLOAD(rta[RTA_DST-1]) < ((r->rtm_dst_len+7)>>3)) - return -EINVAL; - memcpy(&rtmsg->rtmsg_dst, RTA_DATA(rta[RTA_DST-1]), ((r->rtm_dst_len+7)>>3)); + cfg->fc_table = rtm->rtm_table; + cfg->fc_dst_len = rtm->rtm_dst_len; + cfg->fc_src_len = rtm->rtm_src_len; + cfg->fc_flags = RTF_UP; + cfg->fc_protocol = rtm->rtm_protocol; + + if (rtm->rtm_type == RTN_UNREACHABLE) + cfg->fc_flags |= RTF_REJECT; + + cfg->fc_nlinfo.pid = NETLINK_CB(skb).pid; + cfg->fc_nlinfo.nlh = nlh; + + if (tb[RTA_GATEWAY]) { + nla_memcpy(&cfg->fc_gateway, tb[RTA_GATEWAY], 16); + cfg->fc_flags |= RTF_GATEWAY; } - if (rta[RTA_SRC-1]) { - if (RTA_PAYLOAD(rta[RTA_SRC-1]) < ((r->rtm_src_len+7)>>3)) - return -EINVAL; - memcpy(&rtmsg->rtmsg_src, RTA_DATA(rta[RTA_SRC-1]), ((r->rtm_src_len+7)>>3)); + + if (tb[RTA_DST]) { + int plen = (rtm->rtm_dst_len + 7) >> 3; + + if (nla_len(tb[RTA_DST]) < plen) + goto errout; + + nla_memcpy(&cfg->fc_dst, tb[RTA_DST], plen); } - if (rta[RTA_OIF-1]) { - if (rta[RTA_OIF-1]->rta_len != RTA_LENGTH(sizeof(int))) - return -EINVAL; - memcpy(&rtmsg->rtmsg_ifindex, RTA_DATA(rta[RTA_OIF-1]), sizeof(int)); + + if (tb[RTA_SRC]) { + int plen = (rtm->rtm_src_len + 7) >> 3; + + if (nla_len(tb[RTA_SRC]) < plen) + goto errout; + + nla_memcpy(&cfg->fc_src, tb[RTA_SRC], plen); } - if (rta[RTA_PRIORITY-1]) { - if (rta[RTA_PRIORITY-1]->rta_len != RTA_LENGTH(4)) - return -EINVAL; - memcpy(&rtmsg->rtmsg_metric, RTA_DATA(rta[RTA_PRIORITY-1]), 4); + + if (tb[RTA_OIF]) + cfg->fc_ifindex = nla_get_u32(tb[RTA_OIF]); + + if (tb[RTA_PRIORITY]) + cfg->fc_metric = nla_get_u32(tb[RTA_PRIORITY]); + + if (tb[RTA_METRICS]) { + cfg->fc_mx = nla_data(tb[RTA_METRICS]); + cfg->fc_mx_len = nla_len(tb[RTA_METRICS]); } - return 0; + + if (tb[RTA_TABLE]) + cfg->fc_table = nla_get_u32(tb[RTA_TABLE]); + + err = 0; +errout: + return err; } int inet6_rtm_delroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) { - struct rtmsg *r = NLMSG_DATA(nlh); - struct in6_rtmsg rtmsg; + struct fib6_config cfg; + int err; - if (inet6_rtm_to_rtmsg(r, arg, &rtmsg)) - return -EINVAL; - return ip6_route_del(&rtmsg, nlh, arg, &NETLINK_CB(skb), - rtm_get_table(arg, r->rtm_table)); + err = rtm_to_fib6_config(skb, nlh, &cfg); + if (err < 0) + return err; + + return ip6_route_del(&cfg); } int inet6_rtm_newroute(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) { - struct rtmsg *r = NLMSG_DATA(nlh); - struct in6_rtmsg rtmsg; + struct fib6_config cfg; + int err; - if (inet6_rtm_to_rtmsg(r, arg, &rtmsg)) - return -EINVAL; - return ip6_route_add(&rtmsg, nlh, arg, &NETLINK_CB(skb), - rtm_get_table(arg, r->rtm_table)); + err = rtm_to_fib6_config(skb, nlh, &cfg); + if (err < 0) + return err; + + return ip6_route_add(&cfg); } static int rt6_fill_node(struct sk_buff *skb, struct rt6_info *rt, @@ -2063,15 +2108,21 @@ out_free: goto out; } -void inet6_rt_notify(int event, struct rt6_info *rt, struct nlmsghdr *nlh, - struct netlink_skb_parms *req) +void inet6_rt_notify(int event, struct rt6_info *rt, struct nl_info *info) { struct sk_buff *skb; - u32 pid = req ? req->pid : 0; - u32 seq = nlh ? nlh->nlmsg_seq : 0; + u32 pid = 0, seq = 0; + struct nlmsghdr *nlh = NULL; int payload = sizeof(struct rtmsg) + 256; int err = -ENOBUFS; + if (info) { + pid = info->pid; + nlh = info->nlh; + if (nlh) + seq = nlh->nlmsg_seq; + } + skb = nlmsg_new(nlmsg_total_size(payload), gfp_any()); if (skb == NULL) goto errout; -- cgit v0.10.2 From 2d7202bfdd28687073f5efef8d2f51bbab0af867 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 22 Aug 2006 00:01:27 -0700 Subject: [IPv6] route: Convert FIB6 dumping to use new netlink api Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index dfc5826..eeff0b2 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -188,22 +188,27 @@ void rtnl_set_sk_err(u32 group, int error) int rtnetlink_put_metrics(struct sk_buff *skb, u32 *metrics) { - struct rtattr *mx = (struct rtattr*)skb->tail; - int i; + struct nlattr *mx; + int i, valid = 0; + + mx = nla_nest_start(skb, RTA_METRICS); + if (mx == NULL) + return -ENOBUFS; - RTA_PUT(skb, RTA_METRICS, 0, NULL); - for (i=0; irta_len = skb->tail - (u8*)mx; - if (mx->rta_len == RTA_LENGTH(0)) - skb_trim(skb, (u8*)mx - skb->data); - return 0; -rtattr_failure: - skb_trim(skb, (u8*)mx - skb->data); - return -1; + if (!valid) + goto nla_put_failure; + + return nla_nest_end(skb, mx); + +nla_put_failure: + return nla_nest_cancel(skb, mx); } diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 7bcffa6..f0a66de 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1936,8 +1936,7 @@ static int rt6_fill_node(struct sk_buff *skb, struct rt6_info *rt, int prefix, unsigned int flags) { struct rtmsg *rtm; - struct nlmsghdr *nlh; - unsigned char *b = skb->tail; + struct nlmsghdr *nlh; struct rta_cacheinfo ci; u32 table; @@ -1948,8 +1947,11 @@ static int rt6_fill_node(struct sk_buff *skb, struct rt6_info *rt, } } - nlh = NLMSG_NEW(skb, pid, seq, type, sizeof(*rtm), flags); - rtm = NLMSG_DATA(nlh); + nlh = nlmsg_put(skb, pid, seq, type, sizeof(*rtm), flags); + if (nlh == NULL) + return -ENOBUFS; + + rtm = nlmsg_data(nlh); rtm->rtm_family = AF_INET6; rtm->rtm_dst_len = rt->rt6i_dst.plen; rtm->rtm_src_len = rt->rt6i_src.plen; @@ -1959,7 +1961,7 @@ static int rt6_fill_node(struct sk_buff *skb, struct rt6_info *rt, else table = RT6_TABLE_UNSPEC; rtm->rtm_table = table; - RTA_PUT_U32(skb, RTA_TABLE, table); + NLA_PUT_U32(skb, RTA_TABLE, table); if (rt->rt6i_flags&RTF_REJECT) rtm->rtm_type = RTN_UNREACHABLE; else if (rt->rt6i_dev && (rt->rt6i_dev->flags&IFF_LOOPBACK)) @@ -1980,31 +1982,35 @@ static int rt6_fill_node(struct sk_buff *skb, struct rt6_info *rt, rtm->rtm_flags |= RTM_F_CLONED; if (dst) { - RTA_PUT(skb, RTA_DST, 16, dst); + NLA_PUT(skb, RTA_DST, 16, dst); rtm->rtm_dst_len = 128; } else if (rtm->rtm_dst_len) - RTA_PUT(skb, RTA_DST, 16, &rt->rt6i_dst.addr); + NLA_PUT(skb, RTA_DST, 16, &rt->rt6i_dst.addr); #ifdef CONFIG_IPV6_SUBTREES if (src) { - RTA_PUT(skb, RTA_SRC, 16, src); + NLA_PUT(skb, RTA_SRC, 16, src); rtm->rtm_src_len = 128; } else if (rtm->rtm_src_len) - RTA_PUT(skb, RTA_SRC, 16, &rt->rt6i_src.addr); + NLA_PUT(skb, RTA_SRC, 16, &rt->rt6i_src.addr); #endif if (iif) - RTA_PUT(skb, RTA_IIF, 4, &iif); + NLA_PUT_U32(skb, RTA_IIF, iif); else if (dst) { struct in6_addr saddr_buf; if (ipv6_get_saddr(&rt->u.dst, dst, &saddr_buf) == 0) - RTA_PUT(skb, RTA_PREFSRC, 16, &saddr_buf); + NLA_PUT(skb, RTA_PREFSRC, 16, &saddr_buf); } + if (rtnetlink_put_metrics(skb, rt->u.dst.metrics) < 0) - goto rtattr_failure; + goto nla_put_failure; + if (rt->u.dst.neighbour) - RTA_PUT(skb, RTA_GATEWAY, 16, &rt->u.dst.neighbour->primary_key); + NLA_PUT(skb, RTA_GATEWAY, 16, &rt->u.dst.neighbour->primary_key); + if (rt->u.dst.dev) - RTA_PUT(skb, RTA_OIF, sizeof(int), &rt->rt6i_dev->ifindex); - RTA_PUT(skb, RTA_PRIORITY, 4, &rt->rt6i_metric); + NLA_PUT_U32(skb, RTA_OIF, rt->rt6i_dev->ifindex); + + NLA_PUT_U32(skb, RTA_PRIORITY, rt->rt6i_metric); ci.rta_lastuse = jiffies_to_clock_t(jiffies - rt->u.dst.lastuse); if (rt->rt6i_expires) ci.rta_expires = jiffies_to_clock_t(rt->rt6i_expires - jiffies); @@ -2016,14 +2022,12 @@ static int rt6_fill_node(struct sk_buff *skb, struct rt6_info *rt, ci.rta_id = 0; ci.rta_ts = 0; ci.rta_tsage = 0; - RTA_PUT(skb, RTA_CACHEINFO, sizeof(ci), &ci); - nlh->nlmsg_len = skb->tail - b; - return skb->len; + NLA_PUT(skb, RTA_CACHEINFO, sizeof(ci), &ci); + + return nlmsg_end(skb, nlh); -nlmsg_failure: -rtattr_failure: - skb_trim(skb, b - skb->data); - return -1; +nla_put_failure: + return nlmsg_cancel(skb, nlh); } int rt6_dump_route(struct rt6_info *rt, void *p_arg) @@ -2031,8 +2035,8 @@ int rt6_dump_route(struct rt6_info *rt, void *p_arg) struct rt6_rtnl_dump_arg *arg = (struct rt6_rtnl_dump_arg *) p_arg; int prefix; - if (arg->cb->nlh->nlmsg_len >= NLMSG_LENGTH(sizeof(struct rtmsg))) { - struct rtmsg *rtm = NLMSG_DATA(arg->cb->nlh); + if (nlmsg_len(arg->cb->nlh) >= sizeof(struct rtmsg)) { + struct rtmsg *rtm = nlmsg_data(arg->cb->nlh); prefix = (rtm->rtm_flags & RTM_F_PREFIX) != 0; } else prefix = 0; -- cgit v0.10.2 From ab364a6f96bad9625bdb97b5688c76c44eb1e96e Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 22 Aug 2006 00:01:47 -0700 Subject: [IPv6] route: Convert GETROUTE to use new netlink api Fixes various unvalidated netlink attributes causing memory corruptions when left empty by userspace applications. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv6/route.c b/net/ipv6/route.c index f0a66de..5d6e908 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1833,6 +1833,7 @@ void rt6_mtu_change(struct net_device *dev, unsigned mtu) static struct nla_policy rtm_ipv6_policy[RTA_MAX+1] __read_mostly = { [RTA_GATEWAY] = { .minlen = sizeof(struct in6_addr) }, [RTA_OIF] = { .type = NLA_U32 }, + [RTA_IIF] = { .type = NLA_U32 }, [RTA_PRIORITY] = { .type = NLA_U32 }, [RTA_METRICS] = { .type = NLA_NESTED }, }; @@ -2048,68 +2049,75 @@ int rt6_dump_route(struct rt6_info *rt, void *p_arg) int inet6_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr* nlh, void *arg) { - struct rtattr **rta = arg; - int iif = 0; - int err = -ENOBUFS; + struct nlattr *tb[RTA_MAX+1]; + struct rt6_info *rt; struct sk_buff *skb; + struct rtmsg *rtm; struct flowi fl; - struct rt6_info *rt; - - skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL); - if (skb == NULL) - goto out; + int err, iif = 0; - /* Reserve room for dummy headers, this skb can pass - through good chunk of routing engine. - */ - skb->mac.raw = skb->data; - skb_reserve(skb, MAX_HEADER + sizeof(struct ipv6hdr)); + err = nlmsg_parse(nlh, sizeof(*rtm), tb, RTA_MAX, rtm_ipv6_policy); + if (err < 0) + goto errout; + err = -EINVAL; memset(&fl, 0, sizeof(fl)); - if (rta[RTA_SRC-1]) - ipv6_addr_copy(&fl.fl6_src, - (struct in6_addr*)RTA_DATA(rta[RTA_SRC-1])); - if (rta[RTA_DST-1]) - ipv6_addr_copy(&fl.fl6_dst, - (struct in6_addr*)RTA_DATA(rta[RTA_DST-1])); - if (rta[RTA_IIF-1]) - memcpy(&iif, RTA_DATA(rta[RTA_IIF-1]), sizeof(int)); + if (tb[RTA_SRC]) { + if (nla_len(tb[RTA_SRC]) < sizeof(struct in6_addr)) + goto errout; + + ipv6_addr_copy(&fl.fl6_src, nla_data(tb[RTA_SRC])); + } + + if (tb[RTA_DST]) { + if (nla_len(tb[RTA_DST]) < sizeof(struct in6_addr)) + goto errout; + + ipv6_addr_copy(&fl.fl6_dst, nla_data(tb[RTA_DST])); + } + + if (tb[RTA_IIF]) + iif = nla_get_u32(tb[RTA_IIF]); + + if (tb[RTA_OIF]) + fl.oif = nla_get_u32(tb[RTA_OIF]); if (iif) { struct net_device *dev; dev = __dev_get_by_index(iif); if (!dev) { err = -ENODEV; - goto out_free; + goto errout; } } - fl.oif = 0; - if (rta[RTA_OIF-1]) - memcpy(&fl.oif, RTA_DATA(rta[RTA_OIF-1]), sizeof(int)); + skb = alloc_skb(NLMSG_GOODSIZE, GFP_KERNEL); + if (skb == NULL) { + err = -ENOBUFS; + goto errout; + } - rt = (struct rt6_info*)ip6_route_output(NULL, &fl); + /* Reserve room for dummy headers, this skb can pass + through good chunk of routing engine. + */ + skb->mac.raw = skb->data; + skb_reserve(skb, MAX_HEADER + sizeof(struct ipv6hdr)); + rt = (struct rt6_info*) ip6_route_output(NULL, &fl); skb->dst = &rt->u.dst; - NETLINK_CB(skb).dst_pid = NETLINK_CB(in_skb).pid; - err = rt6_fill_node(skb, rt, - &fl.fl6_dst, &fl.fl6_src, - iif, + err = rt6_fill_node(skb, rt, &fl.fl6_dst, &fl.fl6_src, iif, RTM_NEWROUTE, NETLINK_CB(in_skb).pid, nlh->nlmsg_seq, 0, 0); if (err < 0) { - err = -EMSGSIZE; - goto out_free; + kfree_skb(skb); + goto errout; } err = rtnl_unicast(skb, NETLINK_CB(in_skb).pid); -out: +errout: return err; -out_free: - kfree_skb(skb); - goto out; } void inet6_rt_notify(int event, struct rt6_info *rt, struct nl_info *info) -- cgit v0.10.2 From 72d3b2c970a2d5d2ccb1d1cab4fb76663c4f2e49 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Tue, 22 Aug 2006 00:13:07 -0700 Subject: [IPV6]: Fixup ip6_del_rt() call for new args. Signed-off-by: David S. Miller diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c index abbc35a..b80fc50 100644 --- a/net/ipv6/anycast.c +++ b/net/ipv6/anycast.c @@ -378,7 +378,7 @@ int __ipv6_dev_ac_dec(struct inet6_dev *idev, struct in6_addr *addr) addrconf_leave_solict(idev, &aca->aca_addr); dst_hold(&aca->aca_rt->u.dst); - if (ip6_del_rt(aca->aca_rt, NULL, NULL, NULL)) + if (ip6_del_rt(aca->aca_rt)) dst_free(&aca->aca_rt->u.dst); else dst_release(&aca->aca_rt->u.dst); -- cgit v0.10.2 From ac0b04627269ff16c3c7ab854a65fe6780c6e3e5 Mon Sep 17 00:00:00 2001 From: Sridhar Samudrala Date: Tue, 22 Aug 2006 00:15:33 -0700 Subject: [SCTP]: Extend /proc/net/sctp/snmp to provide more statistics. This patch adds more statistics info under /proc/net/sctp/snmp that should be useful for debugging. The additional events that are counted now include timer expirations, retransmits, packet and data chunk discards. The Data chunk discards include all the cases where a data chunk is discarded including high tsn, bad stream, dup tsn and the most useful one(out of receive buffer/rwnd). Also moved the SCTP MIB data structures from the generic include directories to include/sctp/sctp.h. Signed-off-by: Sridhar Samudrala Signed-off-by: David S. Miller diff --git a/include/linux/snmp.h b/include/linux/snmp.h index 3015655..854aa6b 100644 --- a/include/linux/snmp.h +++ b/include/linux/snmp.h @@ -160,39 +160,6 @@ enum __UDP_MIB_MAX }; -/* sctp mib definitions */ -/* - * draft-ietf-sigtran-sctp-mib-07.txt - */ -enum -{ - SCTP_MIB_NUM = 0, - SCTP_MIB_CURRESTAB, /* CurrEstab */ - SCTP_MIB_ACTIVEESTABS, /* ActiveEstabs */ - SCTP_MIB_PASSIVEESTABS, /* PassiveEstabs */ - SCTP_MIB_ABORTEDS, /* Aborteds */ - SCTP_MIB_SHUTDOWNS, /* Shutdowns */ - SCTP_MIB_OUTOFBLUES, /* OutOfBlues */ - SCTP_MIB_CHECKSUMERRORS, /* ChecksumErrors */ - SCTP_MIB_OUTCTRLCHUNKS, /* OutCtrlChunks */ - SCTP_MIB_OUTORDERCHUNKS, /* OutOrderChunks */ - SCTP_MIB_OUTUNORDERCHUNKS, /* OutUnorderChunks */ - SCTP_MIB_INCTRLCHUNKS, /* InCtrlChunks */ - SCTP_MIB_INORDERCHUNKS, /* InOrderChunks */ - SCTP_MIB_INUNORDERCHUNKS, /* InUnorderChunks */ - SCTP_MIB_FRAGUSRMSGS, /* FragUsrMsgs */ - SCTP_MIB_REASMUSRMSGS, /* ReasmUsrMsgs */ - SCTP_MIB_OUTSCTPPACKS, /* OutSCTPPacks */ - SCTP_MIB_INSCTPPACKS, /* InSCTPPacks */ - SCTP_MIB_RTOALGORITHM, /* RtoAlgorithm */ - SCTP_MIB_RTOMIN, /* RtoMin */ - SCTP_MIB_RTOMAX, /* RtoMax */ - SCTP_MIB_RTOINITIAL, /* RtoInitial */ - SCTP_MIB_VALCOOKIELIFE, /* ValCookieLife */ - SCTP_MIB_MAXINITRETR, /* MaxInitRetr */ - __SCTP_MIB_MAX -}; - /* linux mib definitions */ enum { diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h index 1c1abce..e274fd4 100644 --- a/include/net/sctp/sctp.h +++ b/include/net/sctp/sctp.h @@ -216,6 +216,50 @@ DECLARE_SNMP_STAT(struct sctp_mib, sctp_statistics); #endif /* !TEST_FRAME */ +/* sctp mib definitions */ +enum +{ + SCTP_MIB_NUM = 0, + SCTP_MIB_CURRESTAB, /* CurrEstab */ + SCTP_MIB_ACTIVEESTABS, /* ActiveEstabs */ + SCTP_MIB_PASSIVEESTABS, /* PassiveEstabs */ + SCTP_MIB_ABORTEDS, /* Aborteds */ + SCTP_MIB_SHUTDOWNS, /* Shutdowns */ + SCTP_MIB_OUTOFBLUES, /* OutOfBlues */ + SCTP_MIB_CHECKSUMERRORS, /* ChecksumErrors */ + SCTP_MIB_OUTCTRLCHUNKS, /* OutCtrlChunks */ + SCTP_MIB_OUTORDERCHUNKS, /* OutOrderChunks */ + SCTP_MIB_OUTUNORDERCHUNKS, /* OutUnorderChunks */ + SCTP_MIB_INCTRLCHUNKS, /* InCtrlChunks */ + SCTP_MIB_INORDERCHUNKS, /* InOrderChunks */ + SCTP_MIB_INUNORDERCHUNKS, /* InUnorderChunks */ + SCTP_MIB_FRAGUSRMSGS, /* FragUsrMsgs */ + SCTP_MIB_REASMUSRMSGS, /* ReasmUsrMsgs */ + SCTP_MIB_OUTSCTPPACKS, /* OutSCTPPacks */ + SCTP_MIB_INSCTPPACKS, /* InSCTPPacks */ + SCTP_MIB_T1_INIT_EXPIREDS, + SCTP_MIB_T1_COOKIE_EXPIREDS, + SCTP_MIB_T2_SHUTDOWN_EXPIREDS, + SCTP_MIB_T3_RTX_EXPIREDS, + SCTP_MIB_T4_RTO_EXPIREDS, + SCTP_MIB_T5_SHUTDOWN_GUARD_EXPIREDS, + SCTP_MIB_DELAY_SACK_EXPIREDS, + SCTP_MIB_AUTOCLOSE_EXPIREDS, + SCTP_MIB_T3_RETRANSMITS, + SCTP_MIB_PMTUD_RETRANSMITS, + SCTP_MIB_FAST_RETRANSMITS, + SCTP_MIB_IN_PKT_SOFTIRQ, + SCTP_MIB_IN_PKT_BACKLOG, + SCTP_MIB_IN_PKT_DISCARDS, + SCTP_MIB_IN_DATA_CHUNK_DISCARDS, + __SCTP_MIB_MAX +}; + +#define SCTP_MIB_MAX __SCTP_MIB_MAX +struct sctp_mib { + unsigned long mibs[SCTP_MIB_MAX]; +} __SNMP_MIB_ALIGN__; + /* Print debugging messages. */ #if SCTP_DEBUG diff --git a/include/net/snmp.h b/include/net/snmp.h index a36bed8..464970e 100644 --- a/include/net/snmp.h +++ b/include/net/snmp.h @@ -100,12 +100,6 @@ struct udp_mib { unsigned long mibs[UDP_MIB_MAX]; } __SNMP_MIB_ALIGN__; -/* SCTP */ -#define SCTP_MIB_MAX __SCTP_MIB_MAX -struct sctp_mib { - unsigned long mibs[SCTP_MIB_MAX]; -} __SNMP_MIB_ALIGN__; - /* Linux */ #define LINUX_MIB_MAX __LINUX_MIB_MAX struct linux_mib { diff --git a/net/sctp/input.c b/net/sctp/input.c index 42b66e7..8a34d95 100644 --- a/net/sctp/input.c +++ b/net/sctp/input.c @@ -255,10 +255,13 @@ int sctp_rcv(struct sk_buff *skb) */ sctp_bh_lock_sock(sk); - if (sock_owned_by_user(sk)) + if (sock_owned_by_user(sk)) { + SCTP_INC_STATS_BH(SCTP_MIB_IN_PKT_BACKLOG); sctp_add_backlog(sk, skb); - else + } else { + SCTP_INC_STATS_BH(SCTP_MIB_IN_PKT_SOFTIRQ); sctp_inq_push(&chunk->rcvr->inqueue, chunk); + } sctp_bh_unlock_sock(sk); @@ -271,6 +274,7 @@ int sctp_rcv(struct sk_buff *skb) return 0; discard_it: + SCTP_INC_STATS_BH(SCTP_MIB_IN_PKT_DISCARDS); kfree_skb(skb); return 0; diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c index cf0c767..cf6deed 100644 --- a/net/sctp/inqueue.c +++ b/net/sctp/inqueue.c @@ -87,7 +87,7 @@ void sctp_inq_free(struct sctp_inq *queue) /* Put a new packet in an SCTP inqueue. * We assume that packet->sctp_hdr is set and in host byte order. */ -void sctp_inq_push(struct sctp_inq *q, struct sctp_chunk *packet) +void sctp_inq_push(struct sctp_inq *q, struct sctp_chunk *chunk) { /* Directly call the packet handling routine. */ @@ -96,7 +96,7 @@ void sctp_inq_push(struct sctp_inq *q, struct sctp_chunk *packet) * Eventually, we should clean up inqueue to not rely * on the BH related data structures. */ - list_add_tail(&packet->list, &q->in_chunk_list); + list_add_tail(&chunk->list, &q->in_chunk_list); q->immediate.func(q->immediate.data); } diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c index 30b710c..37074a3 100644 --- a/net/sctp/outqueue.c +++ b/net/sctp/outqueue.c @@ -467,6 +467,7 @@ void sctp_retransmit(struct sctp_outq *q, struct sctp_transport *transport, switch(reason) { case SCTP_RTXR_T3_RTX: + SCTP_INC_STATS(SCTP_MIB_T3_RETRANSMITS); sctp_transport_lower_cwnd(transport, SCTP_LOWER_CWND_T3_RTX); /* Update the retran path if the T3-rtx timer has expired for * the current retran path. @@ -475,12 +476,15 @@ void sctp_retransmit(struct sctp_outq *q, struct sctp_transport *transport, sctp_assoc_update_retran_path(transport->asoc); break; case SCTP_RTXR_FAST_RTX: + SCTP_INC_STATS(SCTP_MIB_FAST_RETRANSMITS); sctp_transport_lower_cwnd(transport, SCTP_LOWER_CWND_FAST_RTX); fast_retransmit = 1; break; case SCTP_RTXR_PMTUD: - default: + SCTP_INC_STATS(SCTP_MIB_PMTUD_RETRANSMITS); break; + default: + BUG(); } sctp_retransmit_mark(q, transport, fast_retransmit); diff --git a/net/sctp/proc.c b/net/sctp/proc.c index 5b3b0e0..a356d8d 100644 --- a/net/sctp/proc.c +++ b/net/sctp/proc.c @@ -57,6 +57,21 @@ static struct snmp_mib sctp_snmp_list[] = { SNMP_MIB_ITEM("SctpReasmUsrMsgs", SCTP_MIB_REASMUSRMSGS), SNMP_MIB_ITEM("SctpOutSCTPPacks", SCTP_MIB_OUTSCTPPACKS), SNMP_MIB_ITEM("SctpInSCTPPacks", SCTP_MIB_INSCTPPACKS), + SNMP_MIB_ITEM("SctpT1InitExpireds", SCTP_MIB_T1_INIT_EXPIREDS), + SNMP_MIB_ITEM("SctpT1CookieExpireds", SCTP_MIB_T1_COOKIE_EXPIREDS), + SNMP_MIB_ITEM("SctpT2ShutdownExpireds", SCTP_MIB_T2_SHUTDOWN_EXPIREDS), + SNMP_MIB_ITEM("SctpT3RtxExpireds", SCTP_MIB_T3_RTX_EXPIREDS), + SNMP_MIB_ITEM("SctpT4RtoExpireds", SCTP_MIB_T4_RTO_EXPIREDS), + SNMP_MIB_ITEM("SctpT5ShutdownGuardExpireds", SCTP_MIB_T5_SHUTDOWN_GUARD_EXPIREDS), + SNMP_MIB_ITEM("SctpDelaySackExpireds", SCTP_MIB_DELAY_SACK_EXPIREDS), + SNMP_MIB_ITEM("SctpAutocloseExpireds", SCTP_MIB_AUTOCLOSE_EXPIREDS), + SNMP_MIB_ITEM("SctpT3Retransmits", SCTP_MIB_T3_RETRANSMITS), + SNMP_MIB_ITEM("SctpPmtudRetransmits", SCTP_MIB_PMTUD_RETRANSMITS), + SNMP_MIB_ITEM("SctpFastRetransmits", SCTP_MIB_FAST_RETRANSMITS), + SNMP_MIB_ITEM("SctpInPktSoftirq", SCTP_MIB_IN_PKT_SOFTIRQ), + SNMP_MIB_ITEM("SctpInPktBacklog", SCTP_MIB_IN_PKT_BACKLOG), + SNMP_MIB_ITEM("SctpInPktDiscards", SCTP_MIB_IN_PKT_DISCARDS), + SNMP_MIB_ITEM("SctpInDataChunkDiscards", SCTP_MIB_IN_DATA_CHUNK_DISCARDS), SNMP_MIB_SENTINEL }; @@ -328,8 +343,8 @@ static int sctp_assocs_seq_show(struct seq_file *seq, void *v) "%8p %8p %-3d %-3d %-2d %-4d %4d %8d %8d %7d %5lu %-5d %5d ", assoc, sk, sctp_sk(sk)->type, sk->sk_state, assoc->state, hash, assoc->assoc_id, - (sk->sk_rcvbuf - assoc->rwnd), assoc->sndbuf_used, + (sk->sk_rcvbuf - assoc->rwnd), sock_i_uid(sk), sock_i_ino(sk), epb->bind_addr.port, assoc->peer.port); diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c index 5b5ae79..32f57f4 100644 --- a/net/sctp/sm_statefuns.c +++ b/net/sctp/sm_statefuns.c @@ -2663,9 +2663,11 @@ sctp_disposition_t sctp_sf_eat_data_6_2(const struct sctp_endpoint *ep, break; case SCTP_IERROR_HIGH_TSN: case SCTP_IERROR_BAD_STREAM: + SCTP_INC_STATS(SCTP_MIB_IN_DATA_CHUNK_DISCARDS); goto discard_noforce; case SCTP_IERROR_DUP_TSN: case SCTP_IERROR_IGNORE_TSN: + SCTP_INC_STATS(SCTP_MIB_IN_DATA_CHUNK_DISCARDS); goto discard_force; case SCTP_IERROR_NO_DATA: goto consume; @@ -3652,6 +3654,7 @@ sctp_disposition_t sctp_sf_pdiscard(const struct sctp_endpoint *ep, void *arg, sctp_cmd_seq_t *commands) { + SCTP_INC_STATS(SCTP_MIB_IN_PKT_DISCARDS); sctp_add_cmd_sf(commands, SCTP_CMD_DISCARD_PACKET, SCTP_NULL()); return SCTP_DISPOSITION_CONSUME; @@ -4548,6 +4551,8 @@ sctp_disposition_t sctp_sf_do_6_3_3_rtx(const struct sctp_endpoint *ep, { struct sctp_transport *transport = arg; + SCTP_INC_STATS(SCTP_MIB_T3_RTX_EXPIREDS); + if (asoc->overall_error_count >= asoc->max_retrans) { sctp_add_cmd_sf(commands, SCTP_CMD_SET_SK_ERR, SCTP_ERROR(ETIMEDOUT)); @@ -4616,6 +4621,7 @@ sctp_disposition_t sctp_sf_do_6_2_sack(const struct sctp_endpoint *ep, void *arg, sctp_cmd_seq_t *commands) { + SCTP_INC_STATS(SCTP_MIB_DELAY_SACK_EXPIREDS); sctp_add_cmd_sf(commands, SCTP_CMD_GEN_SACK, SCTP_FORCE()); return SCTP_DISPOSITION_CONSUME; } @@ -4650,6 +4656,7 @@ sctp_disposition_t sctp_sf_t1_init_timer_expire(const struct sctp_endpoint *ep, int attempts = asoc->init_err_counter + 1; SCTP_DEBUG_PRINTK("Timer T1 expired (INIT).\n"); + SCTP_INC_STATS(SCTP_MIB_T1_INIT_EXPIREDS); if (attempts <= asoc->max_init_attempts) { bp = (struct sctp_bind_addr *) &asoc->base.bind_addr; @@ -4709,6 +4716,7 @@ sctp_disposition_t sctp_sf_t1_cookie_timer_expire(const struct sctp_endpoint *ep int attempts = asoc->init_err_counter + 1; SCTP_DEBUG_PRINTK("Timer T1 expired (COOKIE-ECHO).\n"); + SCTP_INC_STATS(SCTP_MIB_T1_COOKIE_EXPIREDS); if (attempts <= asoc->max_init_attempts) { repl = sctp_make_cookie_echo(asoc, NULL); @@ -4753,6 +4761,8 @@ sctp_disposition_t sctp_sf_t2_timer_expire(const struct sctp_endpoint *ep, struct sctp_chunk *reply = NULL; SCTP_DEBUG_PRINTK("Timer T2 expired.\n"); + SCTP_INC_STATS(SCTP_MIB_T2_SHUTDOWN_EXPIREDS); + if (asoc->overall_error_count >= asoc->max_retrans) { sctp_add_cmd_sf(commands, SCTP_CMD_SET_SK_ERR, SCTP_ERROR(ETIMEDOUT)); @@ -4814,6 +4824,8 @@ sctp_disposition_t sctp_sf_t4_timer_expire( struct sctp_chunk *chunk = asoc->addip_last_asconf; struct sctp_transport *transport = chunk->transport; + SCTP_INC_STATS(SCTP_MIB_T4_RTO_EXPIREDS); + /* ADDIP 4.1 B1) Increment the error counters and perform path failure * detection on the appropriate destination address as defined in * RFC2960 [5] section 8.1 and 8.2. @@ -4880,6 +4892,7 @@ sctp_disposition_t sctp_sf_t5_timer_expire(const struct sctp_endpoint *ep, struct sctp_chunk *reply = NULL; SCTP_DEBUG_PRINTK("Timer T5 expired.\n"); + SCTP_INC_STATS(SCTP_MIB_T5_SHUTDOWN_GUARD_EXPIREDS); reply = sctp_make_abort(asoc, NULL, 0); if (!reply) @@ -4910,6 +4923,8 @@ sctp_disposition_t sctp_sf_autoclose_timer_expire( { int disposition; + SCTP_INC_STATS(SCTP_MIB_AUTOCLOSE_EXPIREDS); + /* From 9.2 Shutdown of an Association * Upon receipt of the SHUTDOWN primitive from its upper * layer, the endpoint enters SHUTDOWN-PENDING state and -- cgit v0.10.2 From df7deeb5402087ea0387173aaf067d37a264a8f0 Mon Sep 17 00:00:00 2001 From: Vladislav Yasevich Date: Tue, 22 Aug 2006 00:19:51 -0700 Subject: [SCTP]: Cleanup nomem handling in the state functions. This patch cleans up the "nomem" conditions that may occur during the processing by the state machine functions. In most cases we delay adding side-effect commands until all memory allocations are done. Signed-off-by: Vladislav Yasevich Signed-off-by: Sridhar Samudrala Signed-off-by: David S. Miller diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c index 32f57f4..1c42fe9 100644 --- a/net/sctp/sm_statefuns.c +++ b/net/sctp/sm_statefuns.c @@ -187,10 +187,9 @@ sctp_disposition_t sctp_sf_do_4_C(const struct sctp_endpoint *ep, */ ev = sctp_ulpevent_make_assoc_change(asoc, 0, SCTP_SHUTDOWN_COMP, 0, 0, 0, GFP_ATOMIC); - if (!ev) - goto nomem; - - sctp_add_cmd_sf(commands, SCTP_CMD_EVENT_ULP, SCTP_ULPEVENT(ev)); + if (ev) + sctp_add_cmd_sf(commands, SCTP_CMD_EVENT_ULP, + SCTP_ULPEVENT(ev)); /* Upon reception of the SHUTDOWN COMPLETE chunk the endpoint * will verify that it is in SHUTDOWN-ACK-SENT state, if it is @@ -215,9 +214,6 @@ sctp_disposition_t sctp_sf_do_4_C(const struct sctp_endpoint *ep, sctp_add_cmd_sf(commands, SCTP_CMD_DELETE_TCB, SCTP_NULL()); return SCTP_DISPOSITION_DELETE_TCB; - -nomem: - return SCTP_DISPOSITION_NOMEM; } /* @@ -347,8 +343,6 @@ sctp_disposition_t sctp_sf_do_5_1B_init(const struct sctp_endpoint *ep, GFP_ATOMIC)) goto nomem_init; - sctp_add_cmd_sf(commands, SCTP_CMD_NEW_ASOC, SCTP_ASOC(new_asoc)); - /* B) "Z" shall respond immediately with an INIT ACK chunk. */ /* If there are errors need to be reported for unknown parameters, @@ -360,11 +354,11 @@ sctp_disposition_t sctp_sf_do_5_1B_init(const struct sctp_endpoint *ep, sizeof(sctp_chunkhdr_t); if (sctp_assoc_set_bind_addr_from_ep(new_asoc, GFP_ATOMIC) < 0) - goto nomem_ack; + goto nomem_init; repl = sctp_make_init_ack(new_asoc, chunk, GFP_ATOMIC, len); if (!repl) - goto nomem_ack; + goto nomem_init; /* If there are errors need to be reported for unknown parameters, * include them in the outgoing INIT ACK as "Unrecognized parameter" @@ -388,6 +382,8 @@ sctp_disposition_t sctp_sf_do_5_1B_init(const struct sctp_endpoint *ep, sctp_chunk_free(err_chunk); } + sctp_add_cmd_sf(commands, SCTP_CMD_NEW_ASOC, SCTP_ASOC(new_asoc)); + sctp_add_cmd_sf(commands, SCTP_CMD_REPLY, SCTP_CHUNK(repl)); /* @@ -400,12 +396,11 @@ sctp_disposition_t sctp_sf_do_5_1B_init(const struct sctp_endpoint *ep, return SCTP_DISPOSITION_DELETE_TCB; -nomem_ack: - if (err_chunk) - sctp_chunk_free(err_chunk); nomem_init: sctp_association_free(new_asoc); nomem: + if (err_chunk) + sctp_chunk_free(err_chunk); return SCTP_DISPOSITION_NOMEM; } @@ -600,7 +595,7 @@ sctp_disposition_t sctp_sf_do_5_1D_ce(const struct sctp_endpoint *ep, struct sctp_association *new_asoc; sctp_init_chunk_t *peer_init; struct sctp_chunk *repl; - struct sctp_ulpevent *ev; + struct sctp_ulpevent *ev, *ai_ev = NULL; int error = 0; struct sctp_chunk *err_chk_p; @@ -659,20 +654,10 @@ sctp_disposition_t sctp_sf_do_5_1D_ce(const struct sctp_endpoint *ep, }; } - sctp_add_cmd_sf(commands, SCTP_CMD_NEW_ASOC, SCTP_ASOC(new_asoc)); - sctp_add_cmd_sf(commands, SCTP_CMD_NEW_STATE, - SCTP_STATE(SCTP_STATE_ESTABLISHED)); - SCTP_INC_STATS(SCTP_MIB_CURRESTAB); - SCTP_INC_STATS(SCTP_MIB_PASSIVEESTABS); - sctp_add_cmd_sf(commands, SCTP_CMD_HB_TIMERS_START, SCTP_NULL()); - if (new_asoc->autoclose) - sctp_add_cmd_sf(commands, SCTP_CMD_TIMER_START, - SCTP_TO(SCTP_EVENT_TIMEOUT_AUTOCLOSE)); - - sctp_add_cmd_sf(commands, SCTP_CMD_TRANSMIT, SCTP_NULL()); - - /* Re-build the bind address for the association is done in + /* Delay state machine commands until later. + * + * Re-build the bind address for the association is done in * the sctp_unpack_cookie() already. */ /* This is a brand-new association, so these are not yet side @@ -687,9 +672,7 @@ sctp_disposition_t sctp_sf_do_5_1D_ce(const struct sctp_endpoint *ep, repl = sctp_make_cookie_ack(new_asoc, chunk); if (!repl) - goto nomem_repl; - - sctp_add_cmd_sf(commands, SCTP_CMD_REPLY, SCTP_CHUNK(repl)); + goto nomem_init; /* RFC 2960 5.1 Normal Establishment of an Association * @@ -704,28 +687,53 @@ sctp_disposition_t sctp_sf_do_5_1D_ce(const struct sctp_endpoint *ep, if (!ev) goto nomem_ev; - sctp_add_cmd_sf(commands, SCTP_CMD_EVENT_ULP, SCTP_ULPEVENT(ev)); - /* Sockets API Draft Section 5.3.1.6 * When a peer sends a Adaption Layer Indication parameter , SCTP * delivers this notification to inform the application that of the * peers requested adaption layer. */ if (new_asoc->peer.adaption_ind) { - ev = sctp_ulpevent_make_adaption_indication(new_asoc, + ai_ev = sctp_ulpevent_make_adaption_indication(new_asoc, GFP_ATOMIC); - if (!ev) - goto nomem_ev; + if (!ai_ev) + goto nomem_aiev; + } + + /* Add all the state machine commands now since we've created + * everything. This way we don't introduce memory corruptions + * during side-effect processing and correclty count established + * associations. + */ + sctp_add_cmd_sf(commands, SCTP_CMD_NEW_ASOC, SCTP_ASOC(new_asoc)); + sctp_add_cmd_sf(commands, SCTP_CMD_NEW_STATE, + SCTP_STATE(SCTP_STATE_ESTABLISHED)); + SCTP_INC_STATS(SCTP_MIB_CURRESTAB); + SCTP_INC_STATS(SCTP_MIB_PASSIVEESTABS); + sctp_add_cmd_sf(commands, SCTP_CMD_HB_TIMERS_START, SCTP_NULL()); + + if (new_asoc->autoclose) + sctp_add_cmd_sf(commands, SCTP_CMD_TIMER_START, + SCTP_TO(SCTP_EVENT_TIMEOUT_AUTOCLOSE)); + + sctp_add_cmd_sf(commands, SCTP_CMD_TRANSMIT, SCTP_NULL()); + /* This will send the COOKIE ACK */ + sctp_add_cmd_sf(commands, SCTP_CMD_REPLY, SCTP_CHUNK(repl)); + + /* Queue the ASSOC_CHANGE event */ + sctp_add_cmd_sf(commands, SCTP_CMD_EVENT_ULP, SCTP_ULPEVENT(ev)); + + /* Send up the Adaptation Layer Indication event */ + if (ai_ev) sctp_add_cmd_sf(commands, SCTP_CMD_EVENT_ULP, - SCTP_ULPEVENT(ev)); - } + SCTP_ULPEVENT(ai_ev)); return SCTP_DISPOSITION_CONSUME; +nomem_aiev: + sctp_ulpevent_free(ev); nomem_ev: sctp_chunk_free(repl); -nomem_repl: nomem_init: sctp_association_free(new_asoc); nomem: @@ -1360,10 +1368,8 @@ static sctp_disposition_t sctp_sf_do_unexpected_init( if (!sctp_process_init(new_asoc, chunk->chunk_hdr->type, sctp_source(chunk), (sctp_init_chunk_t *)chunk->chunk_hdr, - GFP_ATOMIC)) { - retval = SCTP_DISPOSITION_NOMEM; - goto nomem_init; - } + GFP_ATOMIC)) + goto nomem; /* Make sure no new addresses are being added during the * restart. Do not do this check for COOKIE-WAIT state, @@ -1374,7 +1380,7 @@ static sctp_disposition_t sctp_sf_do_unexpected_init( if (!sctp_sf_check_restart_addrs(new_asoc, asoc, chunk, commands)) { retval = SCTP_DISPOSITION_CONSUME; - goto cleanup_asoc; + goto nomem_retval; } } @@ -1430,17 +1436,17 @@ static sctp_disposition_t sctp_sf_do_unexpected_init( sctp_add_cmd_sf(commands, SCTP_CMD_DELETE_TCB, SCTP_NULL()); retval = SCTP_DISPOSITION_CONSUME; + return retval; + +nomem: + retval = SCTP_DISPOSITION_NOMEM; +nomem_retval: + if (new_asoc) + sctp_association_free(new_asoc); cleanup: if (err_chunk) sctp_chunk_free(err_chunk); return retval; -nomem: - retval = SCTP_DISPOSITION_NOMEM; - goto cleanup; -nomem_init: -cleanup_asoc: - sctp_association_free(new_asoc); - goto cleanup; } /* @@ -1611,15 +1617,10 @@ static sctp_disposition_t sctp_sf_do_dupcook_a(const struct sctp_endpoint *ep, */ sctp_add_cmd_sf(commands, SCTP_CMD_PURGE_OUTQUEUE, SCTP_NULL()); - /* Update the content of current association. */ - sctp_add_cmd_sf(commands, SCTP_CMD_UPDATE_ASSOC, SCTP_ASOC(new_asoc)); - repl = sctp_make_cookie_ack(new_asoc, chunk); if (!repl) goto nomem; - sctp_add_cmd_sf(commands, SCTP_CMD_REPLY, SCTP_CHUNK(repl)); - /* Report association restart to upper layer. */ ev = sctp_ulpevent_make_assoc_change(asoc, 0, SCTP_RESTART, 0, new_asoc->c.sinit_num_ostreams, @@ -1628,6 +1629,9 @@ static sctp_disposition_t sctp_sf_do_dupcook_a(const struct sctp_endpoint *ep, if (!ev) goto nomem_ev; + /* Update the content of current association. */ + sctp_add_cmd_sf(commands, SCTP_CMD_UPDATE_ASSOC, SCTP_ASOC(new_asoc)); + sctp_add_cmd_sf(commands, SCTP_CMD_REPLY, SCTP_CHUNK(repl)); sctp_add_cmd_sf(commands, SCTP_CMD_EVENT_ULP, SCTP_ULPEVENT(ev)); return SCTP_DISPOSITION_CONSUME; @@ -1751,7 +1755,7 @@ static sctp_disposition_t sctp_sf_do_dupcook_d(const struct sctp_endpoint *ep, sctp_cmd_seq_t *commands, struct sctp_association *new_asoc) { - struct sctp_ulpevent *ev = NULL; + struct sctp_ulpevent *ev = NULL, *ai_ev = NULL; struct sctp_chunk *repl; /* Clarification from Implementor's Guide: @@ -1778,29 +1782,25 @@ static sctp_disposition_t sctp_sf_do_dupcook_d(const struct sctp_endpoint *ep, * SCTP user upon reception of a valid COOKIE * ECHO chunk. */ - ev = sctp_ulpevent_make_assoc_change(new_asoc, 0, + ev = sctp_ulpevent_make_assoc_change(asoc, 0, SCTP_COMM_UP, 0, - new_asoc->c.sinit_num_ostreams, - new_asoc->c.sinit_max_instreams, + asoc->c.sinit_num_ostreams, + asoc->c.sinit_max_instreams, GFP_ATOMIC); if (!ev) goto nomem; - sctp_add_cmd_sf(commands, SCTP_CMD_EVENT_ULP, - SCTP_ULPEVENT(ev)); /* Sockets API Draft Section 5.3.1.6 * When a peer sends a Adaption Layer Indication parameter, * SCTP delivers this notification to inform the application * that of the peers requested adaption layer. */ - if (new_asoc->peer.adaption_ind) { - ev = sctp_ulpevent_make_adaption_indication(new_asoc, + if (asoc->peer.adaption_ind) { + ai_ev = sctp_ulpevent_make_adaption_indication(asoc, GFP_ATOMIC); - if (!ev) + if (!ai_ev) goto nomem; - sctp_add_cmd_sf(commands, SCTP_CMD_EVENT_ULP, - SCTP_ULPEVENT(ev)); } } sctp_add_cmd_sf(commands, SCTP_CMD_TRANSMIT, SCTP_NULL()); @@ -1809,12 +1809,21 @@ static sctp_disposition_t sctp_sf_do_dupcook_d(const struct sctp_endpoint *ep, if (!repl) goto nomem; + if (ev) + sctp_add_cmd_sf(commands, SCTP_CMD_EVENT_ULP, + SCTP_ULPEVENT(ev)); + if (ai_ev) + sctp_add_cmd_sf(commands, SCTP_CMD_EVENT_ULP, + SCTP_ULPEVENT(ai_ev)); + sctp_add_cmd_sf(commands, SCTP_CMD_REPLY, SCTP_CHUNK(repl)); sctp_add_cmd_sf(commands, SCTP_CMD_TRANSMIT, SCTP_NULL()); return SCTP_DISPOSITION_CONSUME; nomem: + if (ai_ev) + sctp_ulpevent_free(ai_ev); if (ev) sctp_ulpevent_free(ev); return SCTP_DISPOSITION_NOMEM; @@ -3019,7 +3028,6 @@ sctp_disposition_t sctp_sf_do_9_2_final(const struct sctp_endpoint *ep, if (!sctp_chunk_length_valid(chunk, sizeof(sctp_chunkhdr_t))) return sctp_sf_violation_chunklen(ep, asoc, type, arg, commands); - /* 10.2 H) SHUTDOWN COMPLETE notification * * When SCTP completes the shutdown procedures (section 9.2) this @@ -3030,6 +3038,14 @@ sctp_disposition_t sctp_sf_do_9_2_final(const struct sctp_endpoint *ep, if (!ev) goto nomem; + /* ...send a SHUTDOWN COMPLETE chunk to its peer, */ + reply = sctp_make_shutdown_complete(asoc, chunk); + if (!reply) + goto nomem_chunk; + + /* Do all the commands now (after allocation), so that we + * have consistent state if memory allocation failes + */ sctp_add_cmd_sf(commands, SCTP_CMD_EVENT_ULP, SCTP_ULPEVENT(ev)); /* Upon the receipt of the SHUTDOWN ACK, the SHUTDOWN sender shall @@ -3041,11 +3057,6 @@ sctp_disposition_t sctp_sf_do_9_2_final(const struct sctp_endpoint *ep, sctp_add_cmd_sf(commands, SCTP_CMD_TIMER_STOP, SCTP_TO(SCTP_EVENT_TIMEOUT_T5_SHUTDOWN_GUARD)); - /* ...send a SHUTDOWN COMPLETE chunk to its peer, */ - reply = sctp_make_shutdown_complete(asoc, chunk); - if (!reply) - goto nomem; - sctp_add_cmd_sf(commands, SCTP_CMD_NEW_STATE, SCTP_STATE(SCTP_STATE_CLOSED)); SCTP_INC_STATS(SCTP_MIB_SHUTDOWNS); @@ -3056,6 +3067,8 @@ sctp_disposition_t sctp_sf_do_9_2_final(const struct sctp_endpoint *ep, sctp_add_cmd_sf(commands, SCTP_CMD_DELETE_TCB, SCTP_NULL()); return SCTP_DISPOSITION_DELETE_TCB; +nomem_chunk: + sctp_ulpevent_free(ev); nomem: return SCTP_DISPOSITION_NOMEM; } -- cgit v0.10.2 From eb5fa39f5ef490c72901b547ac5e7211efd47d56 Mon Sep 17 00:00:00 2001 From: Vladislav Yasevich Date: Tue, 22 Aug 2006 00:23:13 -0700 Subject: [SCTP]: Fix IPv6 address flag setting when doing peel-off/accept. During accept/peeloff we try to copy the list of bound addresses from the original endpoint to the new one. However, we forgot to set the flag to say that IPv6 is allowed on the new endpoint. Signed-off-by: Vladislav Yasevich Signed-off-by: Sridhar Samudrala Signed-off-by: David S. Miller diff --git a/net/sctp/socket.c b/net/sctp/socket.c index 85caf79..30d2dbe 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -5619,6 +5619,8 @@ static void sctp_sock_migrate(struct sock *oldsk, struct sock *newsk, /* Copy the bind_addr list from the original endpoint to the new * endpoint so that we can handle restarts properly */ + if (PF_INET6 == assoc->base.sk->sk_family) + flags = SCTP_ADDR6_ALLOWED; if (assoc->peer.ipv4_address) flags |= SCTP_ADDR4_PEERSUPP; if (assoc->peer.ipv6_address) -- cgit v0.10.2 From 8abfedd889e46ad4977dfcdab737edf5c5803c62 Mon Sep 17 00:00:00 2001 From: Sridhar Samudrala Date: Tue, 22 Aug 2006 00:24:09 -0700 Subject: [SCTP]: Use the flags value that is passed as an arg to sctp_accept. No need to do multiple dereferences - sk->sk_socket->file->f_flags Signed-off-by: Sridhar Samudrala Signed-off-by: David S. Miller diff --git a/net/sctp/socket.c b/net/sctp/socket.c index 30d2dbe..3b6e82c 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -2970,7 +2970,7 @@ SCTP_STATIC struct sock *sctp_accept(struct sock *sk, int flags, int *err) goto out; } - timeo = sock_rcvtimeo(sk, sk->sk_socket->file->f_flags & O_NONBLOCK); + timeo = sock_rcvtimeo(sk, flags & O_NONBLOCK); error = sctp_wait_for_accept(sk, timeo); if (error) -- cgit v0.10.2 From 131852176c1f5b4350b4af811d1836db387d0c61 Mon Sep 17 00:00:00 2001 From: Henrik Kretzschmar Date: Tue, 22 Aug 2006 00:28:33 -0700 Subject: [TG3]: Convert the pci_device_id table to PCI_DEVICE() Convert the pci_device_ids to PCI_DEVICE() macro. Saves 1.5k in the sourcefile. Signed-off-by: Henrik Kretzschmar Acked-by: Michael Chan Signed-off-by: David S. Miller diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c index 34078a7..fb70261 100644 --- a/drivers/net/tg3.c +++ b/drivers/net/tg3.c @@ -149,117 +149,62 @@ module_param(tg3_debug, int, 0); MODULE_PARM_DESC(tg3_debug, "Tigon3 bitmapped debugging message enable value"); static struct pci_device_id tg3_pci_tbl[] = { - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5700, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5701, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5702, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5703, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5704, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5702FE, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5705, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5705_2, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5705M, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5705M_2, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5702X, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5703X, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5704S, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5702A3, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5703A3, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5782, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5788, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5789, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5901, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5901_2, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5704S_2, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5705F, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5720, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5721, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5750, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5751, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5750M, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5751M, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5751F, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5752, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5752M, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5753, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5753M, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5753F, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5754, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5754M, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5755, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5755M, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5786, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5787, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5787M, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5714, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5714S, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5715, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5715S, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5780, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5780S, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5781, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_SYSKONNECT, PCI_DEVICE_ID_SYSKONNECT_9DXX, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_SYSKONNECT, PCI_DEVICE_ID_SYSKONNECT_9MXX, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_ALTIMA, PCI_DEVICE_ID_ALTIMA_AC1000, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_ALTIMA, PCI_DEVICE_ID_ALTIMA_AC1001, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_ALTIMA, PCI_DEVICE_ID_ALTIMA_AC1003, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_ALTIMA, PCI_DEVICE_ID_ALTIMA_AC9100, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_TIGON3, - PCI_ANY_ID, PCI_ANY_ID, 0, 0, 0UL }, - { 0, } + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5700)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5701)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5702)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5703)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5704)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5702FE)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5705)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5705_2)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5705M)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5705M_2)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5702X)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5703X)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5704S)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5702A3)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5703A3)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5782)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5788)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5789)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5901)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5901_2)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5704S_2)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5705F)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5720)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5721)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5750)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5751)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5750M)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5751M)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5751F)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5752)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5752M)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5753)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5753M)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5753F)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5754)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5754M)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5755)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5755M)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5786)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5787)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5787M)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5714)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5714S)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5715)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5715S)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5780)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5780S)}, + {PCI_DEVICE(PCI_VENDOR_ID_BROADCOM, PCI_DEVICE_ID_TIGON3_5781)}, + {PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, PCI_DEVICE_ID_SYSKONNECT_9DXX)}, + {PCI_DEVICE(PCI_VENDOR_ID_SYSKONNECT, PCI_DEVICE_ID_SYSKONNECT_9MXX)}, + {PCI_DEVICE(PCI_VENDOR_ID_ALTIMA, PCI_DEVICE_ID_ALTIMA_AC1000)}, + {PCI_DEVICE(PCI_VENDOR_ID_ALTIMA, PCI_DEVICE_ID_ALTIMA_AC1001)}, + {PCI_DEVICE(PCI_VENDOR_ID_ALTIMA, PCI_DEVICE_ID_ALTIMA_AC1003)}, + {PCI_DEVICE(PCI_VENDOR_ID_ALTIMA, PCI_DEVICE_ID_ALTIMA_AC9100)}, + {PCI_DEVICE(PCI_VENDOR_ID_APPLE, PCI_DEVICE_ID_APPLE_TIGON3)}, + {} }; MODULE_DEVICE_TABLE(pci, tg3_pci_tbl); -- cgit v0.10.2 From 9ba1627617d396135a4d679542a3623d5819e628 Mon Sep 17 00:00:00 2001 From: Yasuyuki Kozakai Date: Tue, 22 Aug 2006 00:29:37 -0700 Subject: [NETFILTER]: x_tables: replace IPv4 dscp match by address family independent version This replaces IPv4 dscp match by address family independent version. This also - utilizes dsfield.h to get the DS field in IPv4/IPv6 header, and - checks for the DSCP value from user space. - fixes Kconfig help text. Signed-off-by: Yasuyuki Kozakai Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter/xt_dscp.h b/include/linux/netfilter/xt_dscp.h new file mode 100644 index 0000000..1da61e6 --- /dev/null +++ b/include/linux/netfilter/xt_dscp.h @@ -0,0 +1,23 @@ +/* x_tables module for matching the IPv4/IPv6 DSCP field + * + * (C) 2002 Harald Welte + * This software is distributed under GNU GPL v2, 1991 + * + * See RFC2474 for a description of the DSCP field within the IP Header. + * + * xt_dscp.h,v 1.3 2002/08/05 19:00:21 laforge Exp +*/ +#ifndef _XT_DSCP_H +#define _XT_DSCP_H + +#define XT_DSCP_MASK 0xfc /* 11111100 */ +#define XT_DSCP_SHIFT 2 +#define XT_DSCP_MAX 0x3f /* 00111111 */ + +/* match info */ +struct xt_dscp_info { + u_int8_t dscp; + u_int8_t invert; +}; + +#endif /* _XT_DSCP_H */ diff --git a/include/linux/netfilter_ipv4/ipt_dscp.h b/include/linux/netfilter_ipv4/ipt_dscp.h index 2fa6dfe..4b82ca9 100644 --- a/include/linux/netfilter_ipv4/ipt_dscp.h +++ b/include/linux/netfilter_ipv4/ipt_dscp.h @@ -10,14 +10,12 @@ #ifndef _IPT_DSCP_H #define _IPT_DSCP_H -#define IPT_DSCP_MASK 0xfc /* 11111100 */ -#define IPT_DSCP_SHIFT 2 -#define IPT_DSCP_MAX 0x3f /* 00111111 */ +#include -/* match info */ -struct ipt_dscp_info { - u_int8_t dscp; - u_int8_t invert; -}; +#define IPT_DSCP_MASK XT_DSCP_MASK +#define IPT_DSCP_SHIFT XT_DSCP_SHIFT +#define IPT_DSCP_MAX XT_DSCP_MAX + +#define ipt_dscp_info xt_dscp_info #endif /* _IPT_DSCP_H */ diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig index ef0b5aa..d88d71d 100644 --- a/net/ipv4/netfilter/Kconfig +++ b/net/ipv4/netfilter/Kconfig @@ -278,17 +278,6 @@ config IP_NF_MATCH_ECN To compile it as a module, choose M here. If unsure, say N. -config IP_NF_MATCH_DSCP - tristate "DSCP match support" - depends on IP_NF_IPTABLES - help - This option adds a `DSCP' match, which allows you to match against - the IPv4 header DSCP field (DSCP codepoint). - - The DSCP codepoint can have any value between 0x0 and 0x4f. - - To compile it as a module, choose M here. If unsure, say N. - config IP_NF_MATCH_AH tristate "AH match support" depends on IP_NF_IPTABLES diff --git a/net/ipv4/netfilter/Makefile b/net/ipv4/netfilter/Makefile index 3ded4a3..b946b0f 100644 --- a/net/ipv4/netfilter/Makefile +++ b/net/ipv4/netfilter/Makefile @@ -59,7 +59,6 @@ obj-$(CONFIG_IP_NF_MATCH_OWNER) += ipt_owner.o obj-$(CONFIG_IP_NF_MATCH_TOS) += ipt_tos.o obj-$(CONFIG_IP_NF_MATCH_RECENT) += ipt_recent.o obj-$(CONFIG_IP_NF_MATCH_ECN) += ipt_ecn.o -obj-$(CONFIG_IP_NF_MATCH_DSCP) += ipt_dscp.o obj-$(CONFIG_IP_NF_MATCH_AH) += ipt_ah.o obj-$(CONFIG_IP_NF_MATCH_TTL) += ipt_ttl.o obj-$(CONFIG_IP_NF_MATCH_ADDRTYPE) += ipt_addrtype.o diff --git a/net/ipv4/netfilter/ipt_dscp.c b/net/ipv4/netfilter/ipt_dscp.c deleted file mode 100644 index 4717759..0000000 --- a/net/ipv4/netfilter/ipt_dscp.c +++ /dev/null @@ -1,54 +0,0 @@ -/* IP tables module for matching the value of the IPv4 DSCP field - * - * ipt_dscp.c,v 1.3 2002/08/05 19:00:21 laforge Exp - * - * (C) 2002 by Harald Welte - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - */ - -#include -#include - -#include -#include - -MODULE_AUTHOR("Harald Welte "); -MODULE_DESCRIPTION("iptables DSCP matching module"); -MODULE_LICENSE("GPL"); - -static int match(const struct sk_buff *skb, - const struct net_device *in, const struct net_device *out, - const struct xt_match *match, const void *matchinfo, - int offset, unsigned int protoff, int *hotdrop) -{ - const struct ipt_dscp_info *info = matchinfo; - const struct iphdr *iph = skb->nh.iph; - - u_int8_t sh_dscp = ((info->dscp << IPT_DSCP_SHIFT) & IPT_DSCP_MASK); - - return ((iph->tos&IPT_DSCP_MASK) == sh_dscp) ^ info->invert; -} - -static struct ipt_match dscp_match = { - .name = "dscp", - .match = match, - .matchsize = sizeof(struct ipt_dscp_info), - .me = THIS_MODULE, -}; - -static int __init ipt_dscp_init(void) -{ - return ipt_register_match(&dscp_match); -} - -static void __exit ipt_dscp_fini(void) -{ - ipt_unregister_match(&dscp_match); - -} - -module_init(ipt_dscp_init); -module_exit(ipt_dscp_fini); diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig index a9894ddf..f781405 100644 --- a/net/netfilter/Kconfig +++ b/net/netfilter/Kconfig @@ -263,6 +263,17 @@ config NETFILTER_XT_MATCH_DCCP If you want to compile it as a module, say M here and read . If unsure, say `N'. +config NETFILTER_XT_MATCH_DSCP + tristate '"DSCP" match support' + depends on NETFILTER_XTABLES + help + This option adds a `DSCP' match, which allows you to match against + the IPv4/IPv6 header DSCP field (differentiated services codepoint). + + The DSCP field can have any value between 0x0 and 0x3f inclusive. + + To compile it as a module, choose M here. If unsure, say N. + config NETFILTER_XT_MATCH_ESP tristate '"ESP" match support' depends on NETFILTER_XTABLES diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile index 6fa4b75..0b8a70c 100644 --- a/net/netfilter/Makefile +++ b/net/netfilter/Makefile @@ -37,6 +37,7 @@ obj-$(CONFIG_NETFILTER_XT_MATCH_CONNBYTES) += xt_connbytes.o obj-$(CONFIG_NETFILTER_XT_MATCH_CONNMARK) += xt_connmark.o obj-$(CONFIG_NETFILTER_XT_MATCH_CONNTRACK) += xt_conntrack.o obj-$(CONFIG_NETFILTER_XT_MATCH_DCCP) += xt_dccp.o +obj-$(CONFIG_NETFILTER_XT_MATCH_DSCP) += xt_dscp.o obj-$(CONFIG_NETFILTER_XT_MATCH_ESP) += xt_esp.o obj-$(CONFIG_NETFILTER_XT_MATCH_HELPER) += xt_helper.o obj-$(CONFIG_NETFILTER_XT_MATCH_LENGTH) += xt_length.o diff --git a/net/netfilter/xt_dscp.c b/net/netfilter/xt_dscp.c new file mode 100644 index 0000000..82e250d --- /dev/null +++ b/net/netfilter/xt_dscp.c @@ -0,0 +1,113 @@ +/* IP tables module for matching the value of the IPv4/IPv6 DSCP field + * + * xt_dscp.c,v 1.3 2002/08/05 19:00:21 laforge Exp + * + * (C) 2002 by Harald Welte + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#include +#include +#include +#include +#include + +#include +#include + +MODULE_AUTHOR("Harald Welte "); +MODULE_DESCRIPTION("x_tables DSCP matching module"); +MODULE_LICENSE("GPL"); +MODULE_ALIAS("ipt_dscp"); +MODULE_ALIAS("ip6t_dscp"); + +static int match(const struct sk_buff *skb, + const struct net_device *in, + const struct net_device *out, + const struct xt_match *match, + const void *matchinfo, + int offset, + unsigned int protoff, + int *hotdrop) +{ + const struct xt_dscp_info *info = matchinfo; + u_int8_t dscp = ipv4_get_dsfield(skb->nh.iph) >> XT_DSCP_SHIFT; + + return (dscp == info->dscp) ^ !!info->invert; +} + +static int match6(const struct sk_buff *skb, + const struct net_device *in, + const struct net_device *out, + const struct xt_match *match, + const void *matchinfo, + int offset, + unsigned int protoff, + int *hotdrop) +{ + const struct xt_dscp_info *info = matchinfo; + u_int8_t dscp = ipv6_get_dsfield(skb->nh.ipv6h) >> XT_DSCP_SHIFT; + + return (dscp == info->dscp) ^ !!info->invert; +} + +static int checkentry(const char *tablename, + const void *info, + const struct xt_match *match, + void *matchinfo, + unsigned int matchsize, + unsigned int hook_mask) +{ + const u_int8_t dscp = ((struct xt_dscp_info *)matchinfo)->dscp; + + if (dscp > XT_DSCP_MAX) { + printk(KERN_ERR "xt_dscp: dscp %x out of range\n", dscp); + return 0; + } + + return 1; +} + +static struct xt_match dscp_match = { + .name = "dscp", + .match = match, + .checkentry = checkentry, + .matchsize = sizeof(struct xt_dscp_info), + .family = AF_INET, + .me = THIS_MODULE, +}; + +static struct xt_match dscp6_match = { + .name = "dscp", + .match = match6, + .checkentry = checkentry, + .matchsize = sizeof(struct xt_dscp_info), + .family = AF_INET6, + .me = THIS_MODULE, +}; + +static int __init xt_dscp_match_init(void) +{ + int ret; + ret = xt_register_match(&dscp_match); + if (ret) + return ret; + + ret = xt_register_match(&dscp6_match); + if (ret) + xt_unregister_match(&dscp_match); + + return ret; +} + +static void __exit xt_dscp_match_fini(void) +{ + xt_unregister_match(&dscp_match); + xt_unregister_match(&dscp6_match); +} + +module_init(xt_dscp_match_init); +module_exit(xt_dscp_match_fini); -- cgit v0.10.2 From a468701db58a8b3e08e3f55fa6ac66db42014922 Mon Sep 17 00:00:00 2001 From: Yasuyuki Kozakai Date: Tue, 22 Aug 2006 00:30:26 -0700 Subject: [NETFILTER]: x_tables: replace IPv4 DSCP target by address family independent version This replaces IPv4 DSCP target by address family independent version. This also - utilizes dsfield.h to get/mangle DS field in IPv4/IPv6 header - fixes Kconfig help text. Signed-off-by: Yasuyuki Kozakai Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter/xt_DSCP.h b/include/linux/netfilter/xt_DSCP.h new file mode 100644 index 0000000..3c7c963 --- /dev/null +++ b/include/linux/netfilter/xt_DSCP.h @@ -0,0 +1,20 @@ +/* x_tables module for setting the IPv4/IPv6 DSCP field + * + * (C) 2002 Harald Welte + * based on ipt_FTOS.c (C) 2000 by Matthew G. Marsh + * This software is distributed under GNU GPL v2, 1991 + * + * See RFC2474 for a description of the DSCP field within the IP Header. + * + * xt_DSCP.h,v 1.7 2002/03/14 12:03:13 laforge Exp +*/ +#ifndef _XT_DSCP_TARGET_H +#define _XT_DSCP_TARGET_H +#include + +/* target info */ +struct xt_DSCP_info { + u_int8_t dscp; +}; + +#endif /* _XT_DSCP_TARGET_H */ diff --git a/include/linux/netfilter_ipv4/ipt_DSCP.h b/include/linux/netfilter_ipv4/ipt_DSCP.h index b30f510..3491e52 100644 --- a/include/linux/netfilter_ipv4/ipt_DSCP.h +++ b/include/linux/netfilter_ipv4/ipt_DSCP.h @@ -11,10 +11,8 @@ #ifndef _IPT_DSCP_TARGET_H #define _IPT_DSCP_TARGET_H #include +#include -/* target info */ -struct ipt_DSCP_info { - u_int8_t dscp; -}; +#define ipt_DSCP_info xt_DSCP_info #endif /* _IPT_DSCP_TARGET_H */ diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig index d88d71d..a55b8ff 100644 --- a/net/ipv4/netfilter/Kconfig +++ b/net/ipv4/netfilter/Kconfig @@ -557,17 +557,6 @@ config IP_NF_TARGET_ECN To compile it as a module, choose M here. If unsure, say N. -config IP_NF_TARGET_DSCP - tristate "DSCP target support" - depends on IP_NF_MANGLE - help - This option adds a `DSCP' match, which allows you to match against - the IPv4 header DSCP field (DSCP codepoint). - - The DSCP codepoint can have any value between 0x0 and 0x4f. - - To compile it as a module, choose M here. If unsure, say N. - config IP_NF_TARGET_TTL tristate 'TTL target support' depends on IP_NF_MANGLE diff --git a/net/ipv4/netfilter/Makefile b/net/ipv4/netfilter/Makefile index b946b0f..09aaed1 100644 --- a/net/ipv4/netfilter/Makefile +++ b/net/ipv4/netfilter/Makefile @@ -67,7 +67,6 @@ obj-$(CONFIG_IP_NF_MATCH_ADDRTYPE) += ipt_addrtype.o obj-$(CONFIG_IP_NF_TARGET_REJECT) += ipt_REJECT.o obj-$(CONFIG_IP_NF_TARGET_TOS) += ipt_TOS.o obj-$(CONFIG_IP_NF_TARGET_ECN) += ipt_ECN.o -obj-$(CONFIG_IP_NF_TARGET_DSCP) += ipt_DSCP.o obj-$(CONFIG_IP_NF_TARGET_MASQUERADE) += ipt_MASQUERADE.o obj-$(CONFIG_IP_NF_TARGET_REDIRECT) += ipt_REDIRECT.o obj-$(CONFIG_IP_NF_TARGET_NETMAP) += ipt_NETMAP.o diff --git a/net/ipv4/netfilter/ipt_DSCP.c b/net/ipv4/netfilter/ipt_DSCP.c deleted file mode 100644 index c8e9712..0000000 --- a/net/ipv4/netfilter/ipt_DSCP.c +++ /dev/null @@ -1,96 +0,0 @@ -/* iptables module for setting the IPv4 DSCP field, Version 1.8 - * - * (C) 2002 by Harald Welte - * based on ipt_FTOS.c (C) 2000 by Matthew G. Marsh - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - * - * See RFC2474 for a description of the DSCP field within the IP Header. - * - * ipt_DSCP.c,v 1.8 2002/08/06 18:41:57 laforge Exp -*/ - -#include -#include -#include -#include - -#include -#include - -MODULE_AUTHOR("Harald Welte "); -MODULE_DESCRIPTION("iptables DSCP modification module"); -MODULE_LICENSE("GPL"); - -static unsigned int -target(struct sk_buff **pskb, - const struct net_device *in, - const struct net_device *out, - unsigned int hooknum, - const struct xt_target *target, - const void *targinfo, - void *userinfo) -{ - const struct ipt_DSCP_info *dinfo = targinfo; - u_int8_t sh_dscp = ((dinfo->dscp << IPT_DSCP_SHIFT) & IPT_DSCP_MASK); - - - if (((*pskb)->nh.iph->tos & IPT_DSCP_MASK) != sh_dscp) { - u_int16_t diffs[2]; - - if (!skb_make_writable(pskb, sizeof(struct iphdr))) - return NF_DROP; - - diffs[0] = htons((*pskb)->nh.iph->tos) ^ 0xFFFF; - (*pskb)->nh.iph->tos = ((*pskb)->nh.iph->tos & ~IPT_DSCP_MASK) - | sh_dscp; - diffs[1] = htons((*pskb)->nh.iph->tos); - (*pskb)->nh.iph->check - = csum_fold(csum_partial((char *)diffs, - sizeof(diffs), - (*pskb)->nh.iph->check - ^ 0xFFFF)); - } - return IPT_CONTINUE; -} - -static int -checkentry(const char *tablename, - const void *e_void, - const struct xt_target *target, - void *targinfo, - unsigned int targinfosize, - unsigned int hook_mask) -{ - const u_int8_t dscp = ((struct ipt_DSCP_info *)targinfo)->dscp; - - if ((dscp > IPT_DSCP_MAX)) { - printk(KERN_WARNING "DSCP: dscp %x out of range\n", dscp); - return 0; - } - return 1; -} - -static struct ipt_target ipt_dscp_reg = { - .name = "DSCP", - .target = target, - .targetsize = sizeof(struct ipt_DSCP_info), - .table = "mangle", - .checkentry = checkentry, - .me = THIS_MODULE, -}; - -static int __init ipt_dscp_init(void) -{ - return ipt_register_target(&ipt_dscp_reg); -} - -static void __exit ipt_dscp_fini(void) -{ - ipt_unregister_target(&ipt_dscp_reg); -} - -module_init(ipt_dscp_init); -module_exit(ipt_dscp_fini); diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig index f781405..0a28d2c 100644 --- a/net/netfilter/Kconfig +++ b/net/netfilter/Kconfig @@ -148,6 +148,18 @@ config NETFILTER_XT_TARGET_CONNMARK . The module will be called ipt_CONNMARK.o. If unsure, say `N'. +config NETFILTER_XT_TARGET_DSCP + tristate '"DSCP" target support' + depends on NETFILTER_XTABLES + depends on IP_NF_MANGLE || IP6_NF_MANGLE + help + This option adds a `DSCP' target, which allows you to manipulate + the IPv4/IPv6 header DSCP field (differentiated services codepoint). + + The DSCP field can have any value between 0x0 and 0x3f inclusive. + + To compile it as a module, choose M here. If unsure, say N. + config NETFILTER_XT_TARGET_MARK tristate '"MARK" target support' depends on NETFILTER_XTABLES diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile index 0b8a70c..a74be49 100644 --- a/net/netfilter/Makefile +++ b/net/netfilter/Makefile @@ -25,6 +25,7 @@ obj-$(CONFIG_NETFILTER_XTABLES) += x_tables.o xt_tcpudp.o # targets obj-$(CONFIG_NETFILTER_XT_TARGET_CLASSIFY) += xt_CLASSIFY.o obj-$(CONFIG_NETFILTER_XT_TARGET_CONNMARK) += xt_CONNMARK.o +obj-$(CONFIG_NETFILTER_XT_TARGET_DSCP) += xt_DSCP.o obj-$(CONFIG_NETFILTER_XT_TARGET_MARK) += xt_MARK.o obj-$(CONFIG_NETFILTER_XT_TARGET_NFQUEUE) += xt_NFQUEUE.o obj-$(CONFIG_NETFILTER_XT_TARGET_NOTRACK) += xt_NOTRACK.o diff --git a/net/netfilter/xt_DSCP.c b/net/netfilter/xt_DSCP.c new file mode 100644 index 0000000..79df816 --- /dev/null +++ b/net/netfilter/xt_DSCP.c @@ -0,0 +1,130 @@ +/* x_tables module for setting the IPv4/IPv6 DSCP field, Version 1.8 + * + * (C) 2002 by Harald Welte + * based on ipt_FTOS.c (C) 2000 by Matthew G. Marsh + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + * + * See RFC2474 for a description of the DSCP field within the IP Header. + * + * xt_DSCP.c,v 1.8 2002/08/06 18:41:57 laforge Exp +*/ + +#include +#include +#include +#include +#include + +#include +#include + +MODULE_AUTHOR("Harald Welte "); +MODULE_DESCRIPTION("x_tables DSCP modification module"); +MODULE_LICENSE("GPL"); +MODULE_ALIAS("ipt_DSCP"); +MODULE_ALIAS("ip6t_DSCP"); + +static unsigned int target(struct sk_buff **pskb, + const struct net_device *in, + const struct net_device *out, + unsigned int hooknum, + const struct xt_target *target, + const void *targinfo, + void *userinfo) +{ + const struct xt_DSCP_info *dinfo = targinfo; + u_int8_t dscp = ipv4_get_dsfield((*pskb)->nh.iph) >> XT_DSCP_SHIFT; + + if (dscp != dinfo->dscp) { + if (!skb_make_writable(pskb, sizeof(struct iphdr))) + return NF_DROP; + + ipv4_change_dsfield((*pskb)->nh.iph, (__u8)(~XT_DSCP_MASK), + dinfo->dscp << XT_DSCP_SHIFT); + + } + return XT_CONTINUE; +} + +static unsigned int target6(struct sk_buff **pskb, + const struct net_device *in, + const struct net_device *out, + unsigned int hooknum, + const struct xt_target *target, + const void *targinfo, + void *userinfo) +{ + const struct xt_DSCP_info *dinfo = targinfo; + u_int8_t dscp = ipv6_get_dsfield((*pskb)->nh.ipv6h) >> XT_DSCP_SHIFT; + + if (dscp != dinfo->dscp) { + if (!skb_make_writable(pskb, sizeof(struct ipv6hdr))) + return NF_DROP; + + ipv6_change_dsfield((*pskb)->nh.ipv6h, (__u8)(~XT_DSCP_MASK), + dinfo->dscp << XT_DSCP_SHIFT); + } + return XT_CONTINUE; +} + +static int checkentry(const char *tablename, + const void *e_void, + const struct xt_target *target, + void *targinfo, + unsigned int targinfosize, + unsigned int hook_mask) +{ + const u_int8_t dscp = ((struct xt_DSCP_info *)targinfo)->dscp; + + if ((dscp > XT_DSCP_MAX)) { + printk(KERN_WARNING "DSCP: dscp %x out of range\n", dscp); + return 0; + } + return 1; +} + +static struct xt_target xt_dscp_reg = { + .name = "DSCP", + .target = target, + .targetsize = sizeof(struct xt_DSCP_info), + .table = "mangle", + .checkentry = checkentry, + .family = AF_INET, + .me = THIS_MODULE, +}; + +static struct xt_target xt_dscp6_reg = { + .name = "DSCP", + .target = target6, + .targetsize = sizeof(struct xt_DSCP_info), + .table = "mangle", + .checkentry = checkentry, + .family = AF_INET6, + .me = THIS_MODULE, +}; + +static int __init xt_dscp_target_init(void) +{ + int ret; + ret = xt_register_target(&xt_dscp_reg); + if (ret) + return ret; + + ret = xt_register_target(&xt_dscp6_reg); + if (ret) + xt_unregister_target(&xt_dscp_reg); + + return ret; +} + +static void __exit xt_dscp_target_fini(void) +{ + xt_unregister_target(&xt_dscp_reg); + xt_unregister_target(&xt_dscp6_reg); +} + +module_init(xt_dscp_target_init); +module_exit(xt_dscp_target_fini); -- cgit v0.10.2 From b93ff78317c0b8f42830e2bb13dd8df596232528 Mon Sep 17 00:00:00 2001 From: Daniel De Graaf Date: Tue, 22 Aug 2006 00:30:55 -0700 Subject: [NETFILTER]: ipt_recent: add module parameter for changing ownership of /proc/net/ipt_recent/* Signed-off-by: Daniel De Graaf Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ipt_recent.c b/net/ipv4/netfilter/ipt_recent.c index 61a2139..682c094 100644 --- a/net/ipv4/netfilter/ipt_recent.c +++ b/net/ipv4/netfilter/ipt_recent.c @@ -35,14 +35,20 @@ static unsigned int ip_list_tot = 100; static unsigned int ip_pkt_list_tot = 20; static unsigned int ip_list_hash_size = 0; static unsigned int ip_list_perms = 0644; +static unsigned int ip_list_uid = 0; +static unsigned int ip_list_gid = 0; module_param(ip_list_tot, uint, 0400); module_param(ip_pkt_list_tot, uint, 0400); module_param(ip_list_hash_size, uint, 0400); module_param(ip_list_perms, uint, 0400); +module_param(ip_list_uid, uint, 0400); +module_param(ip_list_gid, uint, 0400); MODULE_PARM_DESC(ip_list_tot, "number of IPs to remember per list"); MODULE_PARM_DESC(ip_pkt_list_tot, "number of packets per IP to remember (max. 255)"); MODULE_PARM_DESC(ip_list_hash_size, "size of hash table used to look up IPs"); MODULE_PARM_DESC(ip_list_perms, "permissions on /proc/net/ipt_recent/* files"); +MODULE_PARM_DESC(ip_list_uid,"owner of /proc/net/ipt_recent/* files"); +MODULE_PARM_DESC(ip_list_gid,"owning group of /proc/net/ipt_recent/* files"); struct recent_entry { @@ -274,6 +280,8 @@ ipt_recent_checkentry(const char *tablename, const void *ip, goto out; } t->proc->proc_fops = &recent_fops; + t->proc->uid = ip_list_uid; + t->proc->gid = ip_list_gid; t->proc->data = t; #endif spin_lock_bh(&recent_lock); -- cgit v0.10.2 From 2521c12cf1a29f6c380b13ca32a38175f6beed08 Mon Sep 17 00:00:00 2001 From: Pablo Neira Ayuso Date: Tue, 22 Aug 2006 00:31:24 -0700 Subject: [NETFILTER]: conntrack: introduce connection mark event This patch introduces the mark event. ctnetlink can use this to know if the mark needs to be dumped. Signed-off-by: Pablo Neira Ayuso Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter/nf_conntrack_common.h b/include/linux/netfilter/nf_conntrack_common.h index d2e4bd7..9e0dae0 100644 --- a/include/linux/netfilter/nf_conntrack_common.h +++ b/include/linux/netfilter/nf_conntrack_common.h @@ -125,6 +125,10 @@ enum ip_conntrack_events /* Counter highest bit has been set */ IPCT_COUNTER_FILLING_BIT = 11, IPCT_COUNTER_FILLING = (1 << IPCT_COUNTER_FILLING_BIT), + + /* Mark is set */ + IPCT_MARK_BIT = 12, + IPCT_MARK = (1 << IPCT_MARK_BIT), }; enum ip_conntrack_expect_events { diff --git a/net/netfilter/xt_CONNMARK.c b/net/netfilter/xt_CONNMARK.c index 60c375d..784482b 100644 --- a/net/netfilter/xt_CONNMARK.c +++ b/net/netfilter/xt_CONNMARK.c @@ -52,13 +52,25 @@ target(struct sk_buff **pskb, switch(markinfo->mode) { case XT_CONNMARK_SET: newmark = (*ctmark & ~markinfo->mask) | markinfo->mark; - if (newmark != *ctmark) + if (newmark != *ctmark) { *ctmark = newmark; +#ifdef CONFIG_IP_NF_CONNTRACK_EVENTS + ip_conntrack_event_cache(IPCT_MARK, *pskb); +#else + nf_conntrack_event_cache(IPCT_MARK, *pskb); +#endif + } break; case XT_CONNMARK_SAVE: newmark = (*ctmark & ~markinfo->mask) | ((*pskb)->nfmark & markinfo->mask); - if (*ctmark != newmark) + if (*ctmark != newmark) { *ctmark = newmark; +#ifdef CONFIG_IP_NF_CONNTRACK_EVENTS + ip_conntrack_event_cache(IPCT_MARK, *pskb); +#else + nf_conntrack_event_cache(IPCT_MARK, *pskb); +#endif + } break; case XT_CONNMARK_RESTORE: nfmark = (*pskb)->nfmark; -- cgit v0.10.2 From b9a37e0c81c498be2db9f52063c53e55d76c815e Mon Sep 17 00:00:00 2001 From: Pablo Neira Ayuso Date: Tue, 22 Aug 2006 00:31:49 -0700 Subject: [NETFILTER]: ctnetlink: dump connection mark ctnetlink dumps the mark iif the event mark happened Signed-off-by: Pablo Neira Ayuso Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_netlink.c b/net/ipv4/netfilter/ip_conntrack_netlink.c index 0d4cc92..38708e6 100644 --- a/net/ipv4/netfilter/ip_conntrack_netlink.c +++ b/net/ipv4/netfilter/ip_conntrack_netlink.c @@ -385,6 +385,10 @@ static int ctnetlink_conntrack_event(struct notifier_block *this, ctnetlink_dump_counters(skb, ct, IP_CT_DIR_REPLY) < 0) goto nfattr_failure; + if (events & IPCT_MARK + && ctnetlink_dump_mark(skb, ct) < 0) + goto nfattr_failure; + nlh->nlmsg_len = skb->tail - b; nfnetlink_send(skb, 0, group, 0); return NOTIFY_DONE; diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index 6527d4e..aa0148f 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -395,6 +395,10 @@ static int ctnetlink_conntrack_event(struct notifier_block *this, ctnetlink_dump_counters(skb, ct, IP_CT_DIR_REPLY) < 0) goto nfattr_failure; + if (events & IPCT_MARK + && ctnetlink_dump_mark(skb, ct) < 0) + goto nfattr_failure; + nlh->nlmsg_len = skb->tail - b; nfnetlink_send(skb, 0, group, 0); return NOTIFY_DONE; -- cgit v0.10.2 From b3a27bfba51d445784eb0cd6451b73a73fb69cf9 Mon Sep 17 00:00:00 2001 From: Pablo Neira Ayuso Date: Tue, 22 Aug 2006 00:32:05 -0700 Subject: [NETFILTER]: ctnetlink: check for listeners before sending expectation events This patch uses nfnetlink_has_listeners to check for listeners in userspace. Signed-off-by: Pablo Neira Ayuso Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_netlink.c b/net/ipv4/netfilter/ip_conntrack_netlink.c index 38708e6..ef84f43 100644 --- a/net/ipv4/netfilter/ip_conntrack_netlink.c +++ b/net/ipv4/netfilter/ip_conntrack_netlink.c @@ -1257,6 +1257,9 @@ static int ctnetlink_expect_event(struct notifier_block *this, } else return NOTIFY_DONE; + if (!nfnetlink_has_listeners(NFNLGRP_CONNTRACK_EXP_NEW)) + return NOTIFY_DONE; + skb = alloc_skb(NLMSG_GOODSIZE, GFP_ATOMIC); if (!skb) return NOTIFY_DONE; diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index aa0148f..dc4f081 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -1278,6 +1278,9 @@ static int ctnetlink_expect_event(struct notifier_block *this, } else return NOTIFY_DONE; + if (!nfnetlink_has_listeners(NFNLGRP_CONNTRACK_EXP_NEW)) + return NOTIFY_DONE; + skb = alloc_skb(NLMSG_GOODSIZE, GFP_ATOMIC); if (!skb) return NOTIFY_DONE; -- cgit v0.10.2 From 1a31526baeed30aaa70503cee0ab281f78cae0d6 Mon Sep 17 00:00:00 2001 From: Pablo Neira Ayuso Date: Tue, 22 Aug 2006 00:32:23 -0700 Subject: [NETFILTER]: ctnetlink: remove impossible events tests for updates IPCT_HELPER and IPCT_NATINFO bits are never set on updates. Signed-off-by: Pablo Neira Ayuso Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_netlink.c b/net/ipv4/netfilter/ip_conntrack_netlink.c index ef84f43..a20b0e3 100644 --- a/net/ipv4/netfilter/ip_conntrack_netlink.c +++ b/net/ipv4/netfilter/ip_conntrack_netlink.c @@ -329,11 +329,7 @@ static int ctnetlink_conntrack_event(struct notifier_block *this, /* dump everything */ events = ~0UL; group = NFNLGRP_CONNTRACK_NEW; - } else if (events & (IPCT_STATUS | - IPCT_PROTOINFO | - IPCT_HELPER | - IPCT_HELPINFO | - IPCT_NATINFO)) { + } else if (events & (IPCT_STATUS | IPCT_PROTOINFO)) { type = IPCTNL_MSG_CT_NEW; group = NFNLGRP_CONNTRACK_UPDATE; } else diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index dc4f081..8cd85cf 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -339,11 +339,7 @@ static int ctnetlink_conntrack_event(struct notifier_block *this, /* dump everything */ events = ~0UL; group = NFNLGRP_CONNTRACK_NEW; - } else if (events & (IPCT_STATUS | - IPCT_PROTOINFO | - IPCT_HELPER | - IPCT_HELPINFO | - IPCT_NATINFO)) { + } else if (events & (IPCT_STATUS | IPCT_PROTOINFO)) { type = IPCTNL_MSG_CT_NEW; group = NFNLGRP_CONNTRACK_UPDATE; } else -- cgit v0.10.2 From 1158ba27bec6d1a20999099a938908cf85f47640 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Tue, 22 Aug 2006 00:32:47 -0700 Subject: [NETFILTER]: nfnetlink_queue: fix typo in error message Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c index eddfbe4..8eb2473 100644 --- a/net/netfilter/nfnetlink_queue.c +++ b/net/netfilter/nfnetlink_queue.c @@ -584,7 +584,7 @@ nfqnl_enqueue_packet(struct sk_buff *skb, struct nf_info *info, queue->queue_dropped++; status = -ENOSPC; if (net_ratelimit()) - printk(KERN_WARNING "ip_queue: full at %d entries, " + printk(KERN_WARNING "nf_queue: full at %d entries, " "dropping packets(s). Dropped: %d\n", queue->queue_total, queue->queue_dropped); goto err_out_free_nskb; @@ -635,7 +635,7 @@ nfqnl_mangle(void *data, int data_len, struct nfqnl_queue_entry *e) diff, GFP_ATOMIC); if (newskb == NULL) { - printk(KERN_WARNING "ip_queue: OOM " + printk(KERN_WARNING "nf_queue: OOM " "in mangle, dropping packet\n"); return -ENOMEM; } -- cgit v0.10.2 From da878c8e5aae3eeceeee7af8d52633d7bc125edf Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Tue, 22 Aug 2006 00:33:09 -0700 Subject: [NETFILTER]: replace open coded checksum updates Replace open coded checksum update by nf_csum_update calls and clean up the surrounding code a bit. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ipt_ECN.c b/net/ipv4/netfilter/ipt_ECN.c index 35916c7..7e30e6d 100644 --- a/net/ipv4/netfilter/ipt_ECN.c +++ b/net/ipv4/netfilter/ipt_ECN.c @@ -27,22 +27,18 @@ MODULE_DESCRIPTION("iptables ECN modification module"); static inline int set_ect_ip(struct sk_buff **pskb, const struct ipt_ECN_info *einfo) { - if (((*pskb)->nh.iph->tos & IPT_ECN_IP_MASK) - != (einfo->ip_ect & IPT_ECN_IP_MASK)) { - u_int16_t diffs[2]; + struct iphdr *iph = (*pskb)->nh.iph; + u_int16_t oldtos; + if ((iph->tos & IPT_ECN_IP_MASK) != (einfo->ip_ect & IPT_ECN_IP_MASK)) { if (!skb_make_writable(pskb, sizeof(struct iphdr))) return 0; - - diffs[0] = htons((*pskb)->nh.iph->tos) ^ 0xFFFF; - (*pskb)->nh.iph->tos &= ~IPT_ECN_IP_MASK; - (*pskb)->nh.iph->tos |= (einfo->ip_ect & IPT_ECN_IP_MASK); - diffs[1] = htons((*pskb)->nh.iph->tos); - (*pskb)->nh.iph->check - = csum_fold(csum_partial((char *)diffs, - sizeof(diffs), - (*pskb)->nh.iph->check - ^0xFFFF)); + iph = (*pskb)->nh.iph; + oldtos = iph->tos; + iph->tos &= ~IPT_ECN_IP_MASK; + iph->tos |= (einfo->ip_ect & IPT_ECN_IP_MASK); + iph->check = nf_csum_update(oldtos ^ 0xFFFF, iph->tos, + iph->check); } return 1; } diff --git a/net/ipv4/netfilter/ipt_TOS.c b/net/ipv4/netfilter/ipt_TOS.c index 1c7a5ca..52e9d70 100644 --- a/net/ipv4/netfilter/ipt_TOS.c +++ b/net/ipv4/netfilter/ipt_TOS.c @@ -30,23 +30,17 @@ target(struct sk_buff **pskb, void *userinfo) { const struct ipt_tos_target_info *tosinfo = targinfo; + struct iphdr *iph = (*pskb)->nh.iph; + u_int16_t oldtos; - if (((*pskb)->nh.iph->tos & IPTOS_TOS_MASK) != tosinfo->tos) { - u_int16_t diffs[2]; - + if ((iph->tos & IPTOS_TOS_MASK) != tosinfo->tos) { if (!skb_make_writable(pskb, sizeof(struct iphdr))) return NF_DROP; - - diffs[0] = htons((*pskb)->nh.iph->tos) ^ 0xFFFF; - (*pskb)->nh.iph->tos - = ((*pskb)->nh.iph->tos & IPTOS_PREC_MASK) - | tosinfo->tos; - diffs[1] = htons((*pskb)->nh.iph->tos); - (*pskb)->nh.iph->check - = csum_fold(csum_partial((char *)diffs, - sizeof(diffs), - (*pskb)->nh.iph->check - ^0xFFFF)); + iph = (*pskb)->nh.iph; + oldtos = iph->tos; + iph->tos = (iph->tos & IPTOS_PREC_MASK) | tosinfo->tos; + iph->check = nf_csum_update(oldtos ^ 0xFFFF, iph->tos, + iph->check); } return IPT_CONTINUE; } diff --git a/net/ipv4/netfilter/ipt_TTL.c b/net/ipv4/netfilter/ipt_TTL.c index f48892a..2afb2a8 100644 --- a/net/ipv4/netfilter/ipt_TTL.c +++ b/net/ipv4/netfilter/ipt_TTL.c @@ -27,7 +27,6 @@ ipt_ttl_target(struct sk_buff **pskb, { struct iphdr *iph; const struct ipt_TTL_info *info = targinfo; - u_int16_t diffs[2]; int new_ttl; if (!skb_make_writable(pskb, (*pskb)->len)) @@ -55,12 +54,10 @@ ipt_ttl_target(struct sk_buff **pskb, } if (new_ttl != iph->ttl) { - diffs[0] = htons(((unsigned)iph->ttl) << 8) ^ 0xFFFF; + iph->check = nf_csum_update((iph->ttl << 8) ^ 0xFFFF, + new_ttl << 8, + iph->check); iph->ttl = new_ttl; - diffs[1] = htons(((unsigned)iph->ttl) << 8); - iph->check = csum_fold(csum_partial((char *)diffs, - sizeof(diffs), - iph->check^0xFFFF)); } return IPT_CONTINUE; -- cgit v0.10.2 From 90528e6fe92ee1a353d6a639930e7d70d85b5c85 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Tue, 22 Aug 2006 00:33:26 -0700 Subject: [NETFILTER]: xt_CONNMARK: use tabs for indentation Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/netfilter/xt_CONNMARK.c b/net/netfilter/xt_CONNMARK.c index 784482b..19989a9 100644 --- a/net/netfilter/xt_CONNMARK.c +++ b/net/netfilter/xt_CONNMARK.c @@ -49,36 +49,37 @@ target(struct sk_buff **pskb, u_int32_t *ctmark = nf_ct_get_mark(*pskb, &ctinfo); if (ctmark) { - switch(markinfo->mode) { - case XT_CONNMARK_SET: - newmark = (*ctmark & ~markinfo->mask) | markinfo->mark; - if (newmark != *ctmark) { - *ctmark = newmark; + switch(markinfo->mode) { + case XT_CONNMARK_SET: + newmark = (*ctmark & ~markinfo->mask) | markinfo->mark; + if (newmark != *ctmark) { + *ctmark = newmark; #ifdef CONFIG_IP_NF_CONNTRACK_EVENTS - ip_conntrack_event_cache(IPCT_MARK, *pskb); + ip_conntrack_event_cache(IPCT_MARK, *pskb); #else - nf_conntrack_event_cache(IPCT_MARK, *pskb); + nf_conntrack_event_cache(IPCT_MARK, *pskb); #endif } - break; - case XT_CONNMARK_SAVE: - newmark = (*ctmark & ~markinfo->mask) | ((*pskb)->nfmark & markinfo->mask); - if (*ctmark != newmark) { - *ctmark = newmark; + break; + case XT_CONNMARK_SAVE: + newmark = (*ctmark & ~markinfo->mask) | + ((*pskb)->nfmark & markinfo->mask); + if (*ctmark != newmark) { + *ctmark = newmark; #ifdef CONFIG_IP_NF_CONNTRACK_EVENTS - ip_conntrack_event_cache(IPCT_MARK, *pskb); + ip_conntrack_event_cache(IPCT_MARK, *pskb); #else - nf_conntrack_event_cache(IPCT_MARK, *pskb); + nf_conntrack_event_cache(IPCT_MARK, *pskb); #endif + } + break; + case XT_CONNMARK_RESTORE: + nfmark = (*pskb)->nfmark; + diff = (*ctmark ^ nfmark) & markinfo->mask; + if (diff != 0) + (*pskb)->nfmark = nfmark ^ diff; + break; } - break; - case XT_CONNMARK_RESTORE: - nfmark = (*pskb)->nfmark; - diff = (*ctmark ^ nfmark) & markinfo->mask; - if (diff != 0) - (*pskb)->nfmark = nfmark ^ diff; - break; - } } return XT_CONTINUE; @@ -95,17 +96,17 @@ checkentry(const char *tablename, struct xt_connmark_target_info *matchinfo = targinfo; if (matchinfo->mode == XT_CONNMARK_RESTORE) { - if (strcmp(tablename, "mangle") != 0) { - printk(KERN_WARNING "CONNMARK: restore can only be called from \"mangle\" table, not \"%s\"\n", tablename); - return 0; - } + if (strcmp(tablename, "mangle") != 0) { + printk(KERN_WARNING "CONNMARK: restore can only be " + "called from \"mangle\" table, not \"%s\"\n", + tablename); + return 0; + } } - if (matchinfo->mark > 0xffffffff || matchinfo->mask > 0xffffffff) { printk(KERN_WARNING "CONNMARK: Only supports 32bit mark\n"); return 0; } - return 1; } -- cgit v0.10.2 From 52d9c42ef2563d2c420eb23b96bf5a4cae9e167b Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Tue, 22 Aug 2006 00:33:45 -0700 Subject: [NETFILTER]: x_tables: add helpers for mass match/target registration Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index 48cc32d8..9a99124 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -290,8 +290,13 @@ struct xt_table_info extern int xt_register_target(struct xt_target *target); extern void xt_unregister_target(struct xt_target *target); +extern int xt_register_targets(struct xt_target *target, unsigned int n); +extern void xt_unregister_targets(struct xt_target *target, unsigned int n); + extern int xt_register_match(struct xt_match *target); extern void xt_unregister_match(struct xt_match *target); +extern int xt_register_matches(struct xt_match *match, unsigned int n); +extern void xt_unregister_matches(struct xt_match *match, unsigned int n); extern int xt_check_match(const struct xt_match *match, unsigned short family, unsigned int size, const char *table, unsigned int hook, diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index 174e8f9..8037ba6 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -87,6 +87,36 @@ xt_unregister_target(struct xt_target *target) EXPORT_SYMBOL(xt_unregister_target); int +xt_register_targets(struct xt_target *target, unsigned int n) +{ + unsigned int i; + int err = 0; + + for (i = 0; i < n; i++) { + err = xt_register_target(&target[i]); + if (err) + goto err; + } + return err; + +err: + if (i > 0) + xt_unregister_targets(target, i); + return err; +} +EXPORT_SYMBOL(xt_register_targets); + +void +xt_unregister_targets(struct xt_target *target, unsigned int n) +{ + unsigned int i; + + for (i = 0; i < n; i++) + xt_unregister_target(&target[i]); +} +EXPORT_SYMBOL(xt_unregister_targets); + +int xt_register_match(struct xt_match *match) { int ret, af = match->family; @@ -113,6 +143,36 @@ xt_unregister_match(struct xt_match *match) } EXPORT_SYMBOL(xt_unregister_match); +int +xt_register_matches(struct xt_match *match, unsigned int n) +{ + unsigned int i; + int err = 0; + + for (i = 0; i < n; i++) { + err = xt_register_match(&match[i]); + if (err) + goto err; + } + return err; + +err: + if (i > 0) + xt_unregister_matches(match, i); + return err; +} +EXPORT_SYMBOL(xt_register_matches); + +void +xt_unregister_matches(struct xt_match *match, unsigned int n) +{ + unsigned int i; + + for (i = 0; i < n; i++) + xt_unregister_match(&match[i]); +} +EXPORT_SYMBOL(xt_unregister_matches); + /* * These are weird, but module loading must not be done with mutex -- cgit v0.10.2 From 4470bbc749e5551cce914529309456f631e25120 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Tue, 22 Aug 2006 00:34:04 -0700 Subject: [NETFILTER]: x_tables: make use of mass registation helpers Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv6/netfilter/ip6t_REJECT.c b/net/ipv6/netfilter/ip6t_REJECT.c index c4eba1a..7929ff4 100644 --- a/net/ipv6/netfilter/ip6t_REJECT.c +++ b/net/ipv6/netfilter/ip6t_REJECT.c @@ -257,9 +257,7 @@ static struct ip6t_target ip6t_reject_reg = { static int __init ip6t_reject_init(void) { - if (ip6t_register_target(&ip6t_reject_reg)) - return -EINVAL; - return 0; + return ip6t_register_target(&ip6t_reject_reg); } static void __exit ip6t_reject_fini(void) diff --git a/net/netfilter/xt_CLASSIFY.c b/net/netfilter/xt_CLASSIFY.c index e54e577..1f92edd 100644 --- a/net/netfilter/xt_CLASSIFY.c +++ b/net/netfilter/xt_CLASSIFY.c @@ -40,47 +40,41 @@ target(struct sk_buff **pskb, return XT_CONTINUE; } -static struct xt_target classify_reg = { - .name = "CLASSIFY", - .target = target, - .targetsize = sizeof(struct xt_classify_target_info), - .table = "mangle", - .hooks = (1 << NF_IP_LOCAL_OUT) | (1 << NF_IP_FORWARD) | - (1 << NF_IP_POST_ROUTING), - .family = AF_INET, - .me = THIS_MODULE, +static struct xt_target xt_classify_target[] = { + { + .family = AF_INET, + .name = "CLASSIFY", + .target = target, + .targetsize = sizeof(struct xt_classify_target_info), + .table = "mangle", + .hooks = (1 << NF_IP_LOCAL_OUT) | + (1 << NF_IP_FORWARD) | + (1 << NF_IP_POST_ROUTING), + .me = THIS_MODULE, + }, + { + .name = "CLASSIFY", + .family = AF_INET6, + .target = target, + .targetsize = sizeof(struct xt_classify_target_info), + .table = "mangle", + .hooks = (1 << NF_IP_LOCAL_OUT) | + (1 << NF_IP_FORWARD) | + (1 << NF_IP_POST_ROUTING), + .me = THIS_MODULE, + }, }; -static struct xt_target classify6_reg = { - .name = "CLASSIFY", - .target = target, - .targetsize = sizeof(struct xt_classify_target_info), - .table = "mangle", - .hooks = (1 << NF_IP_LOCAL_OUT) | (1 << NF_IP_FORWARD) | - (1 << NF_IP_POST_ROUTING), - .family = AF_INET6, - .me = THIS_MODULE, -}; - static int __init xt_classify_init(void) { - int ret; - - ret = xt_register_target(&classify_reg); - if (ret) - return ret; - - ret = xt_register_target(&classify6_reg); - if (ret) - xt_unregister_target(&classify_reg); - - return ret; + return xt_register_targets(xt_classify_target, + ARRAY_SIZE(xt_classify_target)); } static void __exit xt_classify_fini(void) { - xt_unregister_target(&classify_reg); - xt_unregister_target(&classify6_reg); + xt_unregister_targets(xt_classify_target, + ARRAY_SIZE(xt_classify_target)); } module_init(xt_classify_init); diff --git a/net/netfilter/xt_CONNMARK.c b/net/netfilter/xt_CONNMARK.c index 19989a9..e577356 100644 --- a/net/netfilter/xt_CONNMARK.c +++ b/net/netfilter/xt_CONNMARK.c @@ -110,45 +110,36 @@ checkentry(const char *tablename, return 1; } -static struct xt_target connmark_reg = { - .name = "CONNMARK", - .target = target, - .targetsize = sizeof(struct xt_connmark_target_info), - .checkentry = checkentry, - .family = AF_INET, - .me = THIS_MODULE -}; - -static struct xt_target connmark6_reg = { - .name = "CONNMARK", - .target = target, - .targetsize = sizeof(struct xt_connmark_target_info), - .checkentry = checkentry, - .family = AF_INET6, - .me = THIS_MODULE +static struct xt_target xt_connmark_target[] = { + { + .name = "CONNMARK", + .family = AF_INET, + .checkentry = checkentry, + .target = target, + .targetsize = sizeof(struct xt_connmark_target_info), + .me = THIS_MODULE + }, + { + .name = "CONNMARK", + .family = AF_INET6, + .checkentry = checkentry, + .target = target, + .targetsize = sizeof(struct xt_connmark_target_info), + .me = THIS_MODULE + }, }; static int __init xt_connmark_init(void) { - int ret; - need_conntrack(); - - ret = xt_register_target(&connmark_reg); - if (ret) - return ret; - - ret = xt_register_target(&connmark6_reg); - if (ret) - xt_unregister_target(&connmark_reg); - - return ret; + return xt_register_targets(xt_connmark_target, + ARRAY_SIZE(xt_connmark_target)); } static void __exit xt_connmark_fini(void) { - xt_unregister_target(&connmark_reg); - xt_unregister_target(&connmark6_reg); + xt_unregister_targets(xt_connmark_target, + ARRAY_SIZE(xt_connmark_target)); } module_init(xt_connmark_init); diff --git a/net/netfilter/xt_CONNSECMARK.c b/net/netfilter/xt_CONNSECMARK.c index 8c011e0..48f7fc3 100644 --- a/net/netfilter/xt_CONNSECMARK.c +++ b/net/netfilter/xt_CONNSECMARK.c @@ -106,49 +106,38 @@ static int checkentry(const char *tablename, const void *entry, return 1; } -static struct xt_target ipt_connsecmark_reg = { - .name = "CONNSECMARK", - .target = target, - .targetsize = sizeof(struct xt_connsecmark_target_info), - .table = "mangle", - .checkentry = checkentry, - .me = THIS_MODULE, - .family = AF_INET, - .revision = 0, -}; - -static struct xt_target ip6t_connsecmark_reg = { - .name = "CONNSECMARK", - .target = target, - .targetsize = sizeof(struct xt_connsecmark_target_info), - .table = "mangle", - .checkentry = checkentry, - .me = THIS_MODULE, - .family = AF_INET6, - .revision = 0, +static struct xt_target xt_connsecmark_target[] = { + { + .name = "CONNSECMARK", + .family = AF_INET, + .checkentry = checkentry, + .target = target, + .targetsize = sizeof(struct xt_connsecmark_target_info), + .table = "mangle", + .me = THIS_MODULE, + }, + { + .name = "CONNSECMARK", + .family = AF_INET6, + .checkentry = checkentry, + .target = target, + .targetsize = sizeof(struct xt_connsecmark_target_info), + .table = "mangle", + .me = THIS_MODULE, + }, }; static int __init xt_connsecmark_init(void) { - int err; - need_conntrack(); - - err = xt_register_target(&ipt_connsecmark_reg); - if (err) - return err; - - err = xt_register_target(&ip6t_connsecmark_reg); - if (err) - xt_unregister_target(&ipt_connsecmark_reg); - - return err; + return xt_register_targets(xt_connsecmark_targets, + ARRAY_SIZE(xt_connsecmark_targets)); } static void __exit xt_connsecmark_fini(void) { - xt_unregister_target(&ip6t_connsecmark_reg); - xt_unregister_target(&ipt_connsecmark_reg); + xt_unregister_targets(xt_connsecmark_targets, + ARRAY_SIZE(xt_connsecmark_targets)); } module_init(xt_connsecmark_init); diff --git a/net/netfilter/xt_DSCP.c b/net/netfilter/xt_DSCP.c index 79df816..a1cd972 100644 --- a/net/netfilter/xt_DSCP.c +++ b/net/netfilter/xt_DSCP.c @@ -86,44 +86,35 @@ static int checkentry(const char *tablename, return 1; } -static struct xt_target xt_dscp_reg = { - .name = "DSCP", - .target = target, - .targetsize = sizeof(struct xt_DSCP_info), - .table = "mangle", - .checkentry = checkentry, - .family = AF_INET, - .me = THIS_MODULE, -}; - -static struct xt_target xt_dscp6_reg = { - .name = "DSCP", - .target = target6, - .targetsize = sizeof(struct xt_DSCP_info), - .table = "mangle", - .checkentry = checkentry, - .family = AF_INET6, - .me = THIS_MODULE, +static struct xt_target xt_dscp_target[] = { + { + .name = "DSCP", + .family = AF_INET, + .checkentry = checkentry, + .target = target, + .targetsize = sizeof(struct xt_DSCP_info), + .table = "mangle", + .me = THIS_MODULE, + }, + { + .name = "DSCP", + .family = AF_INET6, + .checkentry = checkentry, + .target = target6, + .targetsize = sizeof(struct xt_DSCP_info), + .table = "mangle", + .me = THIS_MODULE, + }, }; static int __init xt_dscp_target_init(void) { - int ret; - ret = xt_register_target(&xt_dscp_reg); - if (ret) - return ret; - - ret = xt_register_target(&xt_dscp6_reg); - if (ret) - xt_unregister_target(&xt_dscp_reg); - - return ret; + return xt_register_targets(xt_dscp_target, ARRAY_SIZE(xt_dscp_target)); } static void __exit xt_dscp_target_fini(void) { - xt_unregister_target(&xt_dscp_reg); - xt_unregister_target(&xt_dscp6_reg); + xt_unregister_targets(xt_dscp_target, ARRAY_SIZE(xt_dscp_target)); } module_init(xt_dscp_target_init); diff --git a/net/netfilter/xt_MARK.c b/net/netfilter/xt_MARK.c index ee9c34e..0a61272 100644 --- a/net/netfilter/xt_MARK.c +++ b/net/netfilter/xt_MARK.c @@ -112,65 +112,47 @@ checkentry_v1(const char *tablename, return 1; } -static struct xt_target ipt_mark_reg_v0 = { - .name = "MARK", - .target = target_v0, - .targetsize = sizeof(struct xt_mark_target_info), - .table = "mangle", - .checkentry = checkentry_v0, - .me = THIS_MODULE, - .family = AF_INET, - .revision = 0, -}; - -static struct xt_target ipt_mark_reg_v1 = { - .name = "MARK", - .target = target_v1, - .targetsize = sizeof(struct xt_mark_target_info_v1), - .table = "mangle", - .checkentry = checkentry_v1, - .me = THIS_MODULE, - .family = AF_INET, - .revision = 1, -}; - -static struct xt_target ip6t_mark_reg_v0 = { - .name = "MARK", - .target = target_v0, - .targetsize = sizeof(struct xt_mark_target_info), - .table = "mangle", - .checkentry = checkentry_v0, - .me = THIS_MODULE, - .family = AF_INET6, - .revision = 0, +static struct xt_target xt_mark_target[] = { + { + .name = "MARK", + .family = AF_INET, + .revision = 0, + .checkentry = checkentry_v0, + .target = target_v0, + .targetsize = sizeof(struct xt_mark_target_info), + .table = "mangle", + .me = THIS_MODULE, + }, + { + .name = "MARK", + .family = AF_INET, + .revision = 1, + .checkentry = checkentry_v1, + .target = target_v1, + .targetsize = sizeof(struct xt_mark_target_info_v1), + .table = "mangle", + .me = THIS_MODULE, + }, + { + .name = "MARK", + .family = AF_INET6, + .revision = 0, + .checkentry = checkentry_v0, + .target = target_v0, + .targetsize = sizeof(struct xt_mark_target_info), + .table = "mangle", + .me = THIS_MODULE, + }, }; static int __init xt_mark_init(void) { - int err; - - err = xt_register_target(&ipt_mark_reg_v0); - if (err) - return err; - - err = xt_register_target(&ipt_mark_reg_v1); - if (err) - xt_unregister_target(&ipt_mark_reg_v0); - - err = xt_register_target(&ip6t_mark_reg_v0); - if (err) { - xt_unregister_target(&ipt_mark_reg_v0); - xt_unregister_target(&ipt_mark_reg_v1); - } - - return err; + return xt_register_targets(xt_mark_target, ARRAY_SIZE(xt_mark_target)); } static void __exit xt_mark_fini(void) { - xt_unregister_target(&ipt_mark_reg_v0); - xt_unregister_target(&ipt_mark_reg_v1); - xt_unregister_target(&ip6t_mark_reg_v0); + xt_unregister_targets(xt_mark_target, ARRAY_SIZE(xt_mark_target)); } module_init(xt_mark_init); diff --git a/net/netfilter/xt_NFQUEUE.c b/net/netfilter/xt_NFQUEUE.c index 86ccceb..7b98228 100644 --- a/net/netfilter/xt_NFQUEUE.c +++ b/net/netfilter/xt_NFQUEUE.c @@ -37,57 +37,39 @@ target(struct sk_buff **pskb, return NF_QUEUE_NR(tinfo->queuenum); } -static struct xt_target ipt_NFQ_reg = { - .name = "NFQUEUE", - .target = target, - .targetsize = sizeof(struct xt_NFQ_info), - .family = AF_INET, - .me = THIS_MODULE, -}; - -static struct xt_target ip6t_NFQ_reg = { - .name = "NFQUEUE", - .target = target, - .targetsize = sizeof(struct xt_NFQ_info), - .family = AF_INET6, - .me = THIS_MODULE, -}; - -static struct xt_target arpt_NFQ_reg = { - .name = "NFQUEUE", - .target = target, - .targetsize = sizeof(struct xt_NFQ_info), - .family = NF_ARP, - .me = THIS_MODULE, +static struct xt_target xt_nfqueue_target[] = { + { + .name = "NFQUEUE", + .family = AF_INET, + .target = target, + .targetsize = sizeof(struct xt_NFQ_info), + .me = THIS_MODULE, + }, + { + .name = "NFQUEUE", + .family = AF_INET6, + .target = target, + .targetsize = sizeof(struct xt_NFQ_info), + .me = THIS_MODULE, + }, + { + .name = "NFQUEUE", + .family = NF_ARP, + .target = target, + .targetsize = sizeof(struct xt_NFQ_info), + .me = THIS_MODULE, + }, }; static int __init xt_nfqueue_init(void) { - int ret; - ret = xt_register_target(&ipt_NFQ_reg); - if (ret) - return ret; - ret = xt_register_target(&ip6t_NFQ_reg); - if (ret) - goto out_ip; - ret = xt_register_target(&arpt_NFQ_reg); - if (ret) - goto out_ip6; - - return ret; -out_ip6: - xt_unregister_target(&ip6t_NFQ_reg); -out_ip: - xt_unregister_target(&ipt_NFQ_reg); - - return ret; + return xt_register_targets(xt_nfqueue_target, + ARRAY_SIZE(xt_nfqueue_target)); } static void __exit xt_nfqueue_fini(void) { - xt_unregister_target(&arpt_NFQ_reg); - xt_unregister_target(&ip6t_NFQ_reg); - xt_unregister_target(&ipt_NFQ_reg); + xt_register_targets(xt_nfqueue_target, ARRAY_SIZE(xt_nfqueue_target)); } module_init(xt_nfqueue_init); diff --git a/net/netfilter/xt_NOTRACK.c b/net/netfilter/xt_NOTRACK.c index 98f4b53..cab881d 100644 --- a/net/netfilter/xt_NOTRACK.c +++ b/net/netfilter/xt_NOTRACK.c @@ -34,43 +34,32 @@ target(struct sk_buff **pskb, return XT_CONTINUE; } -static struct xt_target notrack_reg = { - .name = "NOTRACK", - .target = target, - .targetsize = 0, - .table = "raw", - .family = AF_INET, - .me = THIS_MODULE, -}; - -static struct xt_target notrack6_reg = { - .name = "NOTRACK", - .target = target, - .targetsize = 0, - .table = "raw", - .family = AF_INET6, - .me = THIS_MODULE, +static struct xt_target xt_notrack_target[] = { + { + .name = "NOTRACK", + .family = AF_INET, + .target = target, + .table = "raw", + .me = THIS_MODULE, + }, + { + .name = "NOTRACK", + .family = AF_INET6, + .target = target, + .table = "raw", + .me = THIS_MODULE, + }, }; static int __init xt_notrack_init(void) { - int ret; - - ret = xt_register_target(¬rack_reg); - if (ret) - return ret; - - ret = xt_register_target(¬rack6_reg); - if (ret) - xt_unregister_target(¬rack_reg); - - return ret; + return xt_register_targets(xt_notrack_target, + ARRAY_SIZE(xt_notrack_target)); } static void __exit xt_notrack_fini(void) { - xt_unregister_target(¬rack6_reg); - xt_unregister_target(¬rack_reg); + xt_unregister_targets(xt_notrack_target, ARRAY_SIZE(xt_notrack_target)); } module_init(xt_notrack_init); diff --git a/net/netfilter/xt_SECMARK.c b/net/netfilter/xt_SECMARK.c index de9537a..4300988 100644 --- a/net/netfilter/xt_SECMARK.c +++ b/net/netfilter/xt_SECMARK.c @@ -111,47 +111,36 @@ static int checkentry(const char *tablename, const void *entry, return 1; } -static struct xt_target ipt_secmark_reg = { - .name = "SECMARK", - .target = target, - .targetsize = sizeof(struct xt_secmark_target_info), - .table = "mangle", - .checkentry = checkentry, - .me = THIS_MODULE, - .family = AF_INET, - .revision = 0, -}; - -static struct xt_target ip6t_secmark_reg = { - .name = "SECMARK", - .target = target, - .targetsize = sizeof(struct xt_secmark_target_info), - .table = "mangle", - .checkentry = checkentry, - .me = THIS_MODULE, - .family = AF_INET6, - .revision = 0, +static struct xt_target xt_secmark_target = { + { + .name = "SECMARK", + .family = AF_INET, + .checkentry = checkentry, + .target = target, + .targetsize = sizeof(struct xt_secmark_target_info), + .table = "mangle", + .me = THIS_MODULE, + }, + { + .name = "SECMARK", + .family = AF_INET6, + .checkentry = checkentry, + .target = target, + .targetsize = sizeof(struct xt_secmark_target_info), + .table = "mangle", + .me = THIS_MODULE, + }, }; static int __init xt_secmark_init(void) { - int err; - - err = xt_register_target(&ipt_secmark_reg); - if (err) - return err; - - err = xt_register_target(&ip6t_secmark_reg); - if (err) - xt_unregister_target(&ipt_secmark_reg); - - return err; + return xt_register_targets(xt_secmark_target, + ARRAY_SIZE(xt_secmark_target)); } static void __exit xt_secmark_fini(void) { - xt_unregister_target(&ip6t_secmark_reg); - xt_unregister_target(&ipt_secmark_reg); + xt_unregister_targets(xt_secmark_target, ARRAY_SIZE(xt_secmark_target)); } module_init(xt_secmark_init); diff --git a/net/netfilter/xt_comment.c b/net/netfilter/xt_comment.c index 197609c..7db492d 100644 --- a/net/netfilter/xt_comment.c +++ b/net/netfilter/xt_comment.c @@ -29,41 +29,32 @@ match(const struct sk_buff *skb, return 1; } -static struct xt_match comment_match = { - .name = "comment", - .match = match, - .matchsize = sizeof(struct xt_comment_info), - .family = AF_INET, - .me = THIS_MODULE -}; - -static struct xt_match comment6_match = { - .name = "comment", - .match = match, - .matchsize = sizeof(struct xt_comment_info), - .family = AF_INET6, - .me = THIS_MODULE +static struct xt_match xt_comment_match[] = { + { + .name = "comment", + .family = AF_INET, + .match = match, + .matchsize = sizeof(struct xt_comment_info), + .me = THIS_MODULE + }, + { + .name = "comment", + .family = AF_INET6, + .match = match, + .matchsize = sizeof(struct xt_comment_info), + .me = THIS_MODULE + }, }; static int __init xt_comment_init(void) { - int ret; - - ret = xt_register_match(&comment_match); - if (ret) - return ret; - - ret = xt_register_match(&comment6_match); - if (ret) - xt_unregister_match(&comment_match); - - return ret; + return xt_register_matches(xt_comment_match, + ARRAY_SIZE(xt_comment_match)); } static void __exit xt_comment_fini(void) { - xt_unregister_match(&comment_match); - xt_unregister_match(&comment6_match); + xt_unregister_matches(xt_comment_match, ARRAY_SIZE(xt_comment_match)); } module_init(xt_comment_init); diff --git a/net/netfilter/xt_connbytes.c b/net/netfilter/xt_connbytes.c index 1396fe2..2d49948 100644 --- a/net/netfilter/xt_connbytes.c +++ b/net/netfilter/xt_connbytes.c @@ -143,40 +143,35 @@ static int check(const char *tablename, return 1; } -static struct xt_match connbytes_match = { - .name = "connbytes", - .match = match, - .checkentry = check, - .matchsize = sizeof(struct xt_connbytes_info), - .family = AF_INET, - .me = THIS_MODULE -}; -static struct xt_match connbytes6_match = { - .name = "connbytes", - .match = match, - .checkentry = check, - .matchsize = sizeof(struct xt_connbytes_info), - .family = AF_INET6, - .me = THIS_MODULE +static struct xt_match xt_connbytes_match = { + { + .name = "connbytes", + .family = AF_INET, + .checkentry = check, + .match = match, + .matchsize = sizeof(struct xt_connbytes_info), + .me = THIS_MODULE + }, + { + .name = "connbytes", + .family = AF_INET6, + .checkentry = check, + .match = match, + .matchsize = sizeof(struct xt_connbytes_info), + .me = THIS_MODULE + }, }; static int __init xt_connbytes_init(void) { - int ret; - ret = xt_register_match(&connbytes_match); - if (ret) - return ret; - - ret = xt_register_match(&connbytes6_match); - if (ret) - xt_unregister_match(&connbytes_match); - return ret; + return xt_register_matches(xt_connbytes_match, + ARRAY_SIZE(xt_connbytes_match)); } static void __exit xt_connbytes_fini(void) { - xt_unregister_match(&connbytes_match); - xt_unregister_match(&connbytes6_match); + xt_unregister_matches(xt_connbytes_match, + ARRAY_SIZE(xt_connbytes_match)); } module_init(xt_connbytes_init); diff --git a/net/netfilter/xt_connmark.c b/net/netfilter/xt_connmark.c index 56324c8a..a97b2d4 100644 --- a/net/netfilter/xt_connmark.c +++ b/net/netfilter/xt_connmark.c @@ -82,46 +82,37 @@ destroy(const struct xt_match *match, void *matchinfo, unsigned int matchsize) #endif } -static struct xt_match connmark_match = { - .name = "connmark", - .match = match, - .matchsize = sizeof(struct xt_connmark_info), - .checkentry = checkentry, - .destroy = destroy, - .family = AF_INET, - .me = THIS_MODULE -}; - -static struct xt_match connmark6_match = { - .name = "connmark", - .match = match, - .matchsize = sizeof(struct xt_connmark_info), - .checkentry = checkentry, - .destroy = destroy, - .family = AF_INET6, - .me = THIS_MODULE +static struct xt_match xt_connmark_match[] = { + { + .name = "connmark", + .family = AF_INET, + .checkentry = checkentry, + .match = match, + .destroy = destroy, + .matchsize = sizeof(struct xt_connmark_info), + .me = THIS_MODULE + }, + { + .name = "connmark", + .family = AF_INET6, + .checkentry = checkentry, + .match = match, + .destroy = destroy, + .matchsize = sizeof(struct xt_connmark_info), + .me = THIS_MODULE + }, }; static int __init xt_connmark_init(void) { - int ret; - need_conntrack(); - - ret = xt_register_match(&connmark_match); - if (ret) - return ret; - - ret = xt_register_match(&connmark6_match); - if (ret) - xt_unregister_match(&connmark_match); - return ret; + return xt_register_matches(xt_connmark_match, + ARRAY_SIZE(xt_connmark_match)); } static void __exit xt_connmark_fini(void) { - xt_unregister_match(&connmark6_match); - xt_unregister_match(&connmark_match); + xt_register_matches(xt_connmark_match, ARRAY_SIZE(xt_connmark_match)); } module_init(xt_connmark_init); diff --git a/net/netfilter/xt_conntrack.c b/net/netfilter/xt_conntrack.c index 145489a..1540885 100644 --- a/net/netfilter/xt_conntrack.c +++ b/net/netfilter/xt_conntrack.c @@ -241,11 +241,8 @@ static struct xt_match conntrack_match = { static int __init xt_conntrack_init(void) { - int ret; need_conntrack(); - ret = xt_register_match(&conntrack_match); - - return ret; + return xt_register_match(&conntrack_match); } static void __exit xt_conntrack_fini(void) diff --git a/net/netfilter/xt_dccp.c b/net/netfilter/xt_dccp.c index 2e2f825..5ca6f52 100644 --- a/net/netfilter/xt_dccp.c +++ b/net/netfilter/xt_dccp.c @@ -141,27 +141,26 @@ checkentry(const char *tablename, && !(info->invflags & ~info->flags); } -static struct xt_match dccp_match = -{ - .name = "dccp", - .match = match, - .matchsize = sizeof(struct xt_dccp_info), - .proto = IPPROTO_DCCP, - .checkentry = checkentry, - .family = AF_INET, - .me = THIS_MODULE, +static struct xt_match xt_dccp_match[] = { + { + .name = "dccp", + .family = AF_INET, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_dccp_info), + .proto = IPPROTO_DCCP, + .me = THIS_MODULE, + }, + { + .name = "dccp", + .family = AF_INET6, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_dccp_info), + .proto = IPPROTO_DCCP, + .me = THIS_MODULE, + }, }; -static struct xt_match dccp6_match = -{ - .name = "dccp", - .match = match, - .matchsize = sizeof(struct xt_dccp_info), - .proto = IPPROTO_DCCP, - .checkentry = checkentry, - .family = AF_INET6, - .me = THIS_MODULE, -}; - static int __init xt_dccp_init(void) { @@ -173,27 +172,19 @@ static int __init xt_dccp_init(void) dccp_optbuf = kmalloc(256 * 4, GFP_KERNEL); if (!dccp_optbuf) return -ENOMEM; - ret = xt_register_match(&dccp_match); + ret = xt_register_matches(xt_dccp_match, ARRAY_SIZE(xt_dccp_match)); if (ret) goto out_kfree; - ret = xt_register_match(&dccp6_match); - if (ret) - goto out_unreg; - return ret; -out_unreg: - xt_unregister_match(&dccp_match); out_kfree: kfree(dccp_optbuf); - return ret; } static void __exit xt_dccp_fini(void) { - xt_unregister_match(&dccp6_match); - xt_unregister_match(&dccp_match); + xt_unregister_matches(xt_dccp_match, ARRAY_SIZE(xt_dccp_match)); kfree(dccp_optbuf); } diff --git a/net/netfilter/xt_dscp.c b/net/netfilter/xt_dscp.c index 82e250d..d84075c3 100644 --- a/net/netfilter/xt_dscp.c +++ b/net/netfilter/xt_dscp.c @@ -71,42 +71,33 @@ static int checkentry(const char *tablename, return 1; } -static struct xt_match dscp_match = { - .name = "dscp", - .match = match, - .checkentry = checkentry, - .matchsize = sizeof(struct xt_dscp_info), - .family = AF_INET, - .me = THIS_MODULE, -}; - -static struct xt_match dscp6_match = { - .name = "dscp", - .match = match6, - .checkentry = checkentry, - .matchsize = sizeof(struct xt_dscp_info), - .family = AF_INET6, - .me = THIS_MODULE, +static struct xt_match xt_dscp_match[] = { + { + .name = "dscp", + .family = AF_INET, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_dscp_info), + .me = THIS_MODULE, + }, + { + .name = "dscp", + .family = AF_INET6, + .checkentry = checkentry, + .match = match6, + .matchsize = sizeof(struct xt_dscp_info), + .me = THIS_MODULE, + }, }; static int __init xt_dscp_match_init(void) { - int ret; - ret = xt_register_match(&dscp_match); - if (ret) - return ret; - - ret = xt_register_match(&dscp6_match); - if (ret) - xt_unregister_match(&dscp_match); - - return ret; + return xt_register_matches(xt_dscp_match, ARRAY_SIZE(xt_dscp_match)); } static void __exit xt_dscp_match_fini(void) { - xt_unregister_match(&dscp_match); - xt_unregister_match(&dscp6_match); + xt_unregister_matches(xt_dscp_match, ARRAY_SIZE(xt_dscp_match)); } module_init(xt_dscp_match_init); diff --git a/net/netfilter/xt_esp.c b/net/netfilter/xt_esp.c index 9dad628..7b19bc9 100644 --- a/net/netfilter/xt_esp.c +++ b/net/netfilter/xt_esp.c @@ -92,44 +92,35 @@ checkentry(const char *tablename, return 1; } -static struct xt_match esp_match = { - .name = "esp", - .family = AF_INET, - .proto = IPPROTO_ESP, - .match = &match, - .matchsize = sizeof(struct xt_esp), - .checkentry = &checkentry, - .me = THIS_MODULE, -}; - -static struct xt_match esp6_match = { - .name = "esp", - .family = AF_INET6, - .proto = IPPROTO_ESP, - .match = &match, - .matchsize = sizeof(struct xt_esp), - .checkentry = &checkentry, - .me = THIS_MODULE, +static struct xt_match xt_esp_match[] = { + { + .name = "esp", + .family = AF_INET, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_esp), + .proto = IPPROTO_ESP, + .me = THIS_MODULE, + }, + { + .name = "esp", + .family = AF_INET6, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_esp), + .proto = IPPROTO_ESP, + .me = THIS_MODULE, + }, }; static int __init xt_esp_init(void) { - int ret; - ret = xt_register_match(&esp_match); - if (ret) - return ret; - - ret = xt_register_match(&esp6_match); - if (ret) - xt_unregister_match(&esp_match); - - return ret; + return xt_register_matches(xt_esp_match, ARRAY_SIZE(xt_esp_match)); } static void __exit xt_esp_cleanup(void) { - xt_unregister_match(&esp_match); - xt_unregister_match(&esp6_match); + xt_unregister_matches(xt_esp_match, ARRAY_SIZE(xt_esp_match)); } module_init(xt_esp_init); diff --git a/net/netfilter/xt_helper.c b/net/netfilter/xt_helper.c index 799c2a4..db453a7 100644 --- a/net/netfilter/xt_helper.c +++ b/net/netfilter/xt_helper.c @@ -163,45 +163,37 @@ destroy(const struct xt_match *match, void *matchinfo, unsigned int matchsize) #endif } -static struct xt_match helper_match = { - .name = "helper", - .match = match, - .matchsize = sizeof(struct xt_helper_info), - .checkentry = check, - .destroy = destroy, - .family = AF_INET, - .me = THIS_MODULE, -}; -static struct xt_match helper6_match = { - .name = "helper", - .match = match, - .matchsize = sizeof(struct xt_helper_info), - .checkentry = check, - .destroy = destroy, - .family = AF_INET6, - .me = THIS_MODULE, +static struct xt_match xt_helper_match[] = { + { + .name = "helper", + .family = AF_INET, + .checkentry = check, + .match = match, + .destroy = destroy, + .matchsize = sizeof(struct xt_helper_info), + .me = THIS_MODULE, + }, + { + .name = "helper", + .family = AF_INET6, + .checkentry = check, + .match = match, + .destroy = destroy, + .matchsize = sizeof(struct xt_helper_info), + .me = THIS_MODULE, + }, }; static int __init xt_helper_init(void) { - int ret; need_conntrack(); - - ret = xt_register_match(&helper_match); - if (ret < 0) - return ret; - - ret = xt_register_match(&helper6_match); - if (ret < 0) - xt_unregister_match(&helper_match); - - return ret; + return xt_register_matches(xt_helper_match, + ARRAY_SIZE(xt_helper_match)); } static void __exit xt_helper_fini(void) { - xt_unregister_match(&helper_match); - xt_unregister_match(&helper6_match); + xt_unregister_matches(xt_helper_match, ARRAY_SIZE(xt_helper_match)); } module_init(xt_helper_init); diff --git a/net/netfilter/xt_length.c b/net/netfilter/xt_length.c index 109132c..67fd30d 100644 --- a/net/netfilter/xt_length.c +++ b/net/netfilter/xt_length.c @@ -52,39 +52,32 @@ match6(const struct sk_buff *skb, return (pktlen >= info->min && pktlen <= info->max) ^ info->invert; } -static struct xt_match length_match = { - .name = "length", - .match = match, - .matchsize = sizeof(struct xt_length_info), - .family = AF_INET, - .me = THIS_MODULE, -}; - -static struct xt_match length6_match = { - .name = "length", - .match = match6, - .matchsize = sizeof(struct xt_length_info), - .family = AF_INET6, - .me = THIS_MODULE, +static struct xt_match xt_length_match[] = { + { + .name = "length", + .family = AF_INET, + .match = match, + .matchsize = sizeof(struct xt_length_info), + .me = THIS_MODULE, + }, + { + .name = "length", + .family = AF_INET6, + .match = match6, + .matchsize = sizeof(struct xt_length_info), + .me = THIS_MODULE, + }, }; static int __init xt_length_init(void) { - int ret; - ret = xt_register_match(&length_match); - if (ret) - return ret; - ret = xt_register_match(&length6_match); - if (ret) - xt_unregister_match(&length_match); - - return ret; + return xt_register_matches(xt_length_match, + ARRAY_SIZE(xt_length_match)); } static void __exit xt_length_fini(void) { - xt_unregister_match(&length_match); - xt_unregister_match(&length6_match); + xt_unregister_matches(xt_length_match, ARRAY_SIZE(xt_length_match)); } module_init(xt_length_init); diff --git a/net/netfilter/xt_limit.c b/net/netfilter/xt_limit.c index ce7fdb7..e8d5e7a 100644 --- a/net/netfilter/xt_limit.c +++ b/net/netfilter/xt_limit.c @@ -136,42 +136,33 @@ ipt_limit_checkentry(const char *tablename, return 1; } -static struct xt_match ipt_limit_reg = { - .name = "limit", - .match = ipt_limit_match, - .matchsize = sizeof(struct xt_rateinfo), - .checkentry = ipt_limit_checkentry, - .family = AF_INET, - .me = THIS_MODULE, -}; -static struct xt_match limit6_reg = { - .name = "limit", - .match = ipt_limit_match, - .matchsize = sizeof(struct xt_rateinfo), - .checkentry = ipt_limit_checkentry, - .family = AF_INET6, - .me = THIS_MODULE, +static struct xt_match xt_limit_match[] = { + { + .name = "limit", + .family = AF_INET, + .checkentry = ipt_limit_checkentry, + .match = ipt_limit_match, + .matchsize = sizeof(struct xt_rateinfo), + .me = THIS_MODULE, + }, + { + .name = "limit", + .family = AF_INET6, + .checkentry = ipt_limit_checkentry, + .match = ipt_limit_match, + .matchsize = sizeof(struct xt_rateinfo), + .me = THIS_MODULE, + }, }; static int __init xt_limit_init(void) { - int ret; - - ret = xt_register_match(&ipt_limit_reg); - if (ret) - return ret; - - ret = xt_register_match(&limit6_reg); - if (ret) - xt_unregister_match(&ipt_limit_reg); - - return ret; + return xt_register_matches(xt_limit_match, ARRAY_SIZE(xt_limit_match)); } static void __exit xt_limit_fini(void) { - xt_unregister_match(&ipt_limit_reg); - xt_unregister_match(&limit6_reg); + xt_unregister_matches(xt_limit_match, ARRAY_SIZE(xt_limit_match)); } module_init(xt_limit_init); diff --git a/net/netfilter/xt_mac.c b/net/netfilter/xt_mac.c index 356290f..425fc21 100644 --- a/net/netfilter/xt_mac.c +++ b/net/netfilter/xt_mac.c @@ -43,43 +43,37 @@ match(const struct sk_buff *skb, ^ info->invert)); } -static struct xt_match mac_match = { - .name = "mac", - .match = match, - .matchsize = sizeof(struct xt_mac_info), - .hooks = (1 << NF_IP_PRE_ROUTING) | (1 << NF_IP_LOCAL_IN) | - (1 << NF_IP_FORWARD), - .family = AF_INET, - .me = THIS_MODULE, -}; -static struct xt_match mac6_match = { - .name = "mac", - .match = match, - .matchsize = sizeof(struct xt_mac_info), - .hooks = (1 << NF_IP_PRE_ROUTING) | (1 << NF_IP_LOCAL_IN) | - (1 << NF_IP_FORWARD), - .family = AF_INET6, - .me = THIS_MODULE, +static struct xt_match xt_mac_match[] = { + { + .name = "mac", + .family = AF_INET, + .match = match, + .matchsize = sizeof(struct xt_mac_info), + .hooks = (1 << NF_IP_PRE_ROUTING) | + (1 << NF_IP_LOCAL_IN) | + (1 << NF_IP_FORWARD), + .me = THIS_MODULE, + }, + { + .name = "mac", + .family = AF_INET6, + .match = match, + .matchsize = sizeof(struct xt_mac_info), + .hooks = (1 << NF_IP_PRE_ROUTING) | + (1 << NF_IP_LOCAL_IN) | + (1 << NF_IP_FORWARD), + .me = THIS_MODULE, + }, }; static int __init xt_mac_init(void) { - int ret; - ret = xt_register_match(&mac_match); - if (ret) - return ret; - - ret = xt_register_match(&mac6_match); - if (ret) - xt_unregister_match(&mac_match); - - return ret; + return xt_register_matches(xt_mac_match, ARRAY_SIZE(xt_mac_match)); } static void __exit xt_mac_fini(void) { - xt_unregister_match(&mac_match); - xt_unregister_match(&mac6_match); + xt_unregister_matches(xt_mac_match, ARRAY_SIZE(xt_mac_match)); } module_init(xt_mac_init); diff --git a/net/netfilter/xt_mark.c b/net/netfilter/xt_mark.c index 876bc57..39f9b079 100644 --- a/net/netfilter/xt_mark.c +++ b/net/netfilter/xt_mark.c @@ -51,42 +51,33 @@ checkentry(const char *tablename, return 1; } -static struct xt_match mark_match = { - .name = "mark", - .match = match, - .matchsize = sizeof(struct xt_mark_info), - .checkentry = checkentry, - .family = AF_INET, - .me = THIS_MODULE, -}; - -static struct xt_match mark6_match = { - .name = "mark", - .match = match, - .matchsize = sizeof(struct xt_mark_info), - .checkentry = checkentry, - .family = AF_INET6, - .me = THIS_MODULE, +static struct xt_match xt_mark_match[] = { + { + .name = "mark", + .family = AF_INET, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_mark_info), + .me = THIS_MODULE, + }, + { + .name = "mark", + .family = AF_INET6, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_mark_info), + .me = THIS_MODULE, + }, }; static int __init xt_mark_init(void) { - int ret; - ret = xt_register_match(&mark_match); - if (ret) - return ret; - - ret = xt_register_match(&mark6_match); - if (ret) - xt_unregister_match(&mark_match); - - return ret; + return xt_register_matches(xt_mark_match, ARRAY_SIZE(xt_mark_match)); } static void __exit xt_mark_fini(void) { - xt_unregister_match(&mark_match); - xt_unregister_match(&mark6_match); + xt_unregister_matches(xt_mark_match, ARRAY_SIZE(xt_mark_match)); } module_init(xt_mark_init); diff --git a/net/netfilter/xt_multiport.c b/net/netfilter/xt_multiport.c index 1ff0a25..e74f9bb 100644 --- a/net/netfilter/xt_multiport.c +++ b/net/netfilter/xt_multiport.c @@ -231,84 +231,55 @@ checkentry6_v1(const char *tablename, multiinfo->count); } -static struct xt_match multiport_match = { - .name = "multiport", - .revision = 0, - .matchsize = sizeof(struct xt_multiport), - .match = &match, - .checkentry = &checkentry, - .family = AF_INET, - .me = THIS_MODULE, -}; - -static struct xt_match multiport_match_v1 = { - .name = "multiport", - .revision = 1, - .matchsize = sizeof(struct xt_multiport_v1), - .match = &match_v1, - .checkentry = &checkentry_v1, - .family = AF_INET, - .me = THIS_MODULE, -}; - -static struct xt_match multiport6_match = { - .name = "multiport", - .revision = 0, - .matchsize = sizeof(struct xt_multiport), - .match = &match, - .checkentry = &checkentry6, - .family = AF_INET6, - .me = THIS_MODULE, -}; - -static struct xt_match multiport6_match_v1 = { - .name = "multiport", - .revision = 1, - .matchsize = sizeof(struct xt_multiport_v1), - .match = &match_v1, - .checkentry = &checkentry6_v1, - .family = AF_INET6, - .me = THIS_MODULE, +static struct xt_match xt_multiport_match[] = { + { + .name = "multiport", + .family = AF_INET, + .revision = 0, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_multiport), + .me = THIS_MODULE, + }, + { + .name = "multiport", + .family = AF_INET, + .revision = 1, + .checkentry = checkentry_v1, + .match = match_v1, + .matchsize = sizeof(struct xt_multiport_v1), + .me = THIS_MODULE, + }, + { + .name = "multiport", + .family = AF_INET6, + .revision = 0, + .checkentry = checkentry6, + .match = match, + .matchsize = sizeof(struct xt_multiport), + .me = THIS_MODULE, + }, + { + .name = "multiport", + .family = AF_INET6, + .revision = 1, + .checkentry = checkentry6_v1, + .match = match_v1, + .matchsize = sizeof(struct xt_multiport_v1), + .me = THIS_MODULE, + }, }; static int __init xt_multiport_init(void) { - int ret; - - ret = xt_register_match(&multiport_match); - if (ret) - goto out; - - ret = xt_register_match(&multiport_match_v1); - if (ret) - goto out_unreg_multi_v0; - - ret = xt_register_match(&multiport6_match); - if (ret) - goto out_unreg_multi_v1; - - ret = xt_register_match(&multiport6_match_v1); - if (ret) - goto out_unreg_multi6_v0; - - return ret; - -out_unreg_multi6_v0: - xt_unregister_match(&multiport6_match); -out_unreg_multi_v1: - xt_unregister_match(&multiport_match_v1); -out_unreg_multi_v0: - xt_unregister_match(&multiport_match); -out: - return ret; + return xt_register_matches(xt_multiport_match, + ARRAY_SIZE(xt_multiport_match)); } static void __exit xt_multiport_fini(void) { - xt_unregister_match(&multiport_match); - xt_unregister_match(&multiport_match_v1); - xt_unregister_match(&multiport6_match); - xt_unregister_match(&multiport6_match_v1); + xt_unregister_matches(xt_multiport_match, + ARRAY_SIZE(xt_multiport_match)); } module_init(xt_multiport_init); diff --git a/net/netfilter/xt_physdev.c b/net/netfilter/xt_physdev.c index 63a9654..af3d70f 100644 --- a/net/netfilter/xt_physdev.c +++ b/net/netfilter/xt_physdev.c @@ -132,43 +132,34 @@ checkentry(const char *tablename, return 1; } -static struct xt_match physdev_match = { - .name = "physdev", - .match = match, - .matchsize = sizeof(struct xt_physdev_info), - .checkentry = checkentry, - .family = AF_INET, - .me = THIS_MODULE, -}; - -static struct xt_match physdev6_match = { - .name = "physdev", - .match = match, - .matchsize = sizeof(struct xt_physdev_info), - .checkentry = checkentry, - .family = AF_INET6, - .me = THIS_MODULE, +static struct xt_match xt_physdev_match[] = { + { + .name = "physdev", + .family = AF_INET, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_physdev_info), + .me = THIS_MODULE, + }, + { + .name = "physdev", + .family = AF_INET6, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_physdev_info), + .me = THIS_MODULE, + }, }; static int __init xt_physdev_init(void) { - int ret; - - ret = xt_register_match(&physdev_match); - if (ret < 0) - return ret; - - ret = xt_register_match(&physdev6_match); - if (ret < 0) - xt_unregister_match(&physdev_match); - - return ret; + return xt_register_matches(xt_physdev_match, + ARRAY_SIZE(xt_physdev_match)); } static void __exit xt_physdev_fini(void) { - xt_unregister_match(&physdev_match); - xt_unregister_match(&physdev6_match); + xt_unregister_matches(xt_physdev_match, ARRAY_SIZE(xt_physdev_match)); } module_init(xt_physdev_init); diff --git a/net/netfilter/xt_pkttype.c b/net/netfilter/xt_pkttype.c index d2f5320..16e7b08 100644 --- a/net/netfilter/xt_pkttype.c +++ b/net/netfilter/xt_pkttype.c @@ -43,40 +43,32 @@ static int match(const struct sk_buff *skb, return (type == info->pkttype) ^ info->invert; } -static struct xt_match pkttype_match = { - .name = "pkttype", - .match = match, - .matchsize = sizeof(struct xt_pkttype_info), - .family = AF_INET, - .me = THIS_MODULE, -}; - -static struct xt_match pkttype6_match = { - .name = "pkttype", - .match = match, - .matchsize = sizeof(struct xt_pkttype_info), - .family = AF_INET6, - .me = THIS_MODULE, +static struct xt_match xt_pkttype_match[] = { + { + .name = "pkttype", + .family = AF_INET, + .match = match, + .matchsize = sizeof(struct xt_pkttype_info), + .me = THIS_MODULE, + }, + { + .name = "pkttype", + .family = AF_INET6, + .match = match, + .matchsize = sizeof(struct xt_pkttype_info), + .me = THIS_MODULE, + }, }; static int __init xt_pkttype_init(void) { - int ret; - ret = xt_register_match(&pkttype_match); - if (ret) - return ret; - - ret = xt_register_match(&pkttype6_match); - if (ret) - xt_unregister_match(&pkttype_match); - - return ret; + return xt_register_matches(xt_pkttype_match, + ARRAY_SIZE(xt_pkttype_match)); } static void __exit xt_pkttype_fini(void) { - xt_unregister_match(&pkttype_match); - xt_unregister_match(&pkttype6_match); + xt_unregister_matches(xt_pkttype_match, ARRAY_SIZE(xt_pkttype_match)); } module_init(xt_pkttype_init); diff --git a/net/netfilter/xt_policy.c b/net/netfilter/xt_policy.c index ba1ca03..f5639c4 100644 --- a/net/netfilter/xt_policy.c +++ b/net/netfilter/xt_policy.c @@ -165,43 +165,36 @@ static int checkentry(const char *tablename, const void *ip_void, return 1; } -static struct xt_match policy_match = { - .name = "policy", - .family = AF_INET, - .match = match, - .matchsize = sizeof(struct xt_policy_info), - .checkentry = checkentry, - .family = AF_INET, - .me = THIS_MODULE, -}; - -static struct xt_match policy6_match = { - .name = "policy", - .family = AF_INET6, - .match = match, - .matchsize = sizeof(struct xt_policy_info), - .checkentry = checkentry, - .family = AF_INET6, - .me = THIS_MODULE, +static struct xt_match xt_policy_match[] = { + { + .name = "policy", + .family = AF_INET, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_policy_info), + .family = AF_INET, + .me = THIS_MODULE, + }, + { + .name = "policy", + .family = AF_INET6, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_policy_info), + .family = AF_INET6, + .me = THIS_MODULE, + }, }; static int __init init(void) { - int ret; - - ret = xt_register_match(&policy_match); - if (ret) - return ret; - ret = xt_register_match(&policy6_match); - if (ret) - xt_unregister_match(&policy_match); - return ret; + return xt_register_matches(xt_policy_match, + ARRAY_SIZE(xt_policy_match)); } static void __exit fini(void) { - xt_unregister_match(&policy6_match); - xt_unregister_match(&policy_match); + xt_unregister_matches(xt_policy_match, ARRAY_SIZE(xt_policy_match)); } module_init(init); diff --git a/net/netfilter/xt_quota.c b/net/netfilter/xt_quota.c index be8d3c2..cc44f87 100644 --- a/net/netfilter/xt_quota.c +++ b/net/netfilter/xt_quota.c @@ -52,46 +52,33 @@ checkentry(const char *tablename, const void *entry, return 1; } -static struct xt_match quota_match = { - .name = "quota", - .family = AF_INET, - .match = match, - .matchsize = sizeof(struct xt_quota_info), - .checkentry = checkentry, - .me = THIS_MODULE -}; - -static struct xt_match quota_match6 = { - .name = "quota", - .family = AF_INET6, - .match = match, - .matchsize = sizeof(struct xt_quota_info), - .checkentry = checkentry, - .me = THIS_MODULE +static struct xt_match xt_quota_match[] = { + { + .name = "quota", + .family = AF_INET, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_quota_info), + .me = THIS_MODULE + }, + { + .name = "quota", + .family = AF_INET6, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_quota_info), + .me = THIS_MODULE + }, }; static int __init xt_quota_init(void) { - int ret; - - ret = xt_register_match("a_match); - if (ret) - goto err1; - ret = xt_register_match("a_match6); - if (ret) - goto err2; - return ret; - -err2: - xt_unregister_match("a_match); -err1: - return ret; + return xt_register_matches(xt_quota_match, ARRAY_SIZE(xt_quota_match)); } static void __exit xt_quota_fini(void) { - xt_unregister_match("a_match6); - xt_unregister_match("a_match); + xt_unregister_matches(xt_quota_match, ARRAY_SIZE(xt_quota_match)); } module_init(xt_quota_init); diff --git a/net/netfilter/xt_sctp.c b/net/netfilter/xt_sctp.c index 843383e..5628621 100644 --- a/net/netfilter/xt_sctp.c +++ b/net/netfilter/xt_sctp.c @@ -178,44 +178,35 @@ checkentry(const char *tablename, | SCTP_CHUNK_MATCH_ONLY))); } -static struct xt_match sctp_match = { - .name = "sctp", - .match = match, - .matchsize = sizeof(struct xt_sctp_info), - .proto = IPPROTO_SCTP, - .checkentry = checkentry, - .family = AF_INET, - .me = THIS_MODULE -}; - -static struct xt_match sctp6_match = { - .name = "sctp", - .match = match, - .matchsize = sizeof(struct xt_sctp_info), - .proto = IPPROTO_SCTP, - .checkentry = checkentry, - .family = AF_INET6, - .me = THIS_MODULE +static struct xt_match xt_sctp_match[] = { + { + .name = "sctp", + .family = AF_INET, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_sctp_info), + .proto = IPPROTO_SCTP, + .me = THIS_MODULE + }, + { + .name = "sctp", + .family = AF_INET6, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_sctp_info), + .proto = IPPROTO_SCTP, + .me = THIS_MODULE + }, }; static int __init xt_sctp_init(void) { - int ret; - ret = xt_register_match(&sctp_match); - if (ret) - return ret; - - ret = xt_register_match(&sctp6_match); - if (ret) - xt_unregister_match(&sctp_match); - - return ret; + return xt_register_matches(xt_sctp_match, ARRAY_SIZE(xt_sctp_match)); } static void __exit xt_sctp_fini(void) { - xt_unregister_match(&sctp6_match); - xt_unregister_match(&sctp_match); + xt_unregister_matches(xt_sctp_match, ARRAY_SIZE(xt_sctp_match)); } module_init(xt_sctp_init); diff --git a/net/netfilter/xt_state.c b/net/netfilter/xt_state.c index f9e304d..5f9492e 100644 --- a/net/netfilter/xt_state.c +++ b/net/netfilter/xt_state.c @@ -69,47 +69,36 @@ destroy(const struct xt_match *match, void *matchinfo, unsigned int matchsize) #endif } -static struct xt_match state_match = { - .name = "state", - .match = match, - .checkentry = check, - .destroy = destroy, - .matchsize = sizeof(struct xt_state_info), - .family = AF_INET, - .me = THIS_MODULE, -}; - -static struct xt_match state6_match = { - .name = "state", - .match = match, - .checkentry = check, - .destroy = destroy, - .matchsize = sizeof(struct xt_state_info), - .family = AF_INET6, - .me = THIS_MODULE, +static struct xt_match xt_state_match[] = { + { + .name = "state", + .family = AF_INET, + .checkentry = check, + .match = match, + .destroy = destroy, + .matchsize = sizeof(struct xt_state_info), + .me = THIS_MODULE, + }, + { + .name = "state", + .family = AF_INET6, + .checkentry = check, + .match = match, + .destroy = destroy, + .matchsize = sizeof(struct xt_state_info), + .me = THIS_MODULE, + }, }; static int __init xt_state_init(void) { - int ret; - need_conntrack(); - - ret = xt_register_match(&state_match); - if (ret < 0) - return ret; - - ret = xt_register_match(&state6_match); - if (ret < 0) - xt_unregister_match(&state_match); - - return ret; + return xt_register_matches(xt_state_match, ARRAY_SIZE(xt_state_match)); } static void __exit xt_state_fini(void) { - xt_unregister_match(&state_match); - xt_unregister_match(&state6_match); + xt_unregister_matches(xt_state_match, ARRAY_SIZE(xt_state_match)); } module_init(xt_state_init); diff --git a/net/netfilter/xt_statistic.c b/net/netfilter/xt_statistic.c index de1037f..5181630 100644 --- a/net/netfilter/xt_statistic.c +++ b/net/netfilter/xt_statistic.c @@ -66,46 +66,35 @@ checkentry(const char *tablename, const void *entry, return 1; } -static struct xt_match statistic_match = { - .name = "statistic", - .match = match, - .matchsize = sizeof(struct xt_statistic_info), - .checkentry = checkentry, - .family = AF_INET, - .me = THIS_MODULE, -}; - -static struct xt_match statistic_match6 = { - .name = "statistic", - .match = match, - .matchsize = sizeof(struct xt_statistic_info), - .checkentry = checkentry, - .family = AF_INET6, - .me = THIS_MODULE, +static struct xt_match xt_statistic_match[] = { + { + .name = "statistic", + .family = AF_INET, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_statistic_info), + .me = THIS_MODULE, + }, + { + .name = "statistic", + .family = AF_INET6, + .checkentry = checkentry, + .match = match, + .matchsize = sizeof(struct xt_statistic_info), + .me = THIS_MODULE, + }, }; static int __init xt_statistic_init(void) { - int ret; - - ret = xt_register_match(&statistic_match); - if (ret) - goto err1; - - ret = xt_register_match(&statistic_match6); - if (ret) - goto err2; - return ret; -err2: - xt_unregister_match(&statistic_match); -err1: - return ret; + return xt_register_matches(xt_statistic_match, + ARRAY_SIZE(xt_statistic_match)); } static void __exit xt_statistic_fini(void) { - xt_unregister_match(&statistic_match6); - xt_unregister_match(&statistic_match); + xt_unregister_matches(xt_statistic_match, + ARRAY_SIZE(xt_statistic_match)); } module_init(xt_statistic_init); diff --git a/net/netfilter/xt_string.c b/net/netfilter/xt_string.c index 275330f..1a1c1d1 100644 --- a/net/netfilter/xt_string.c +++ b/net/netfilter/xt_string.c @@ -75,43 +75,35 @@ static void destroy(const struct xt_match *match, void *matchinfo, textsearch_destroy(STRING_TEXT_PRIV(matchinfo)->config); } -static struct xt_match string_match = { - .name = "string", - .match = match, - .matchsize = sizeof(struct xt_string_info), - .checkentry = checkentry, - .destroy = destroy, - .family = AF_INET, - .me = THIS_MODULE -}; -static struct xt_match string6_match = { - .name = "string", - .match = match, - .matchsize = sizeof(struct xt_string_info), - .checkentry = checkentry, - .destroy = destroy, - .family = AF_INET6, - .me = THIS_MODULE +static struct xt_match xt_string_match[] = { + { + .name = "string", + .family = AF_INET, + .checkentry = checkentry, + .match = match, + .destroy = destroy, + .matchsize = sizeof(struct xt_string_info), + .me = THIS_MODULE + }, + { + .name = "string", + .family = AF_INET6, + .checkentry = checkentry, + .match = match, + .destroy = destroy, + .matchsize = sizeof(struct xt_string_info), + .me = THIS_MODULE + }, }; static int __init xt_string_init(void) { - int ret; - - ret = xt_register_match(&string_match); - if (ret) - return ret; - ret = xt_register_match(&string6_match); - if (ret) - xt_unregister_match(&string_match); - - return ret; + return xt_register_matches(xt_string_match, ARRAY_SIZE(xt_string_match)); } static void __exit xt_string_fini(void) { - xt_unregister_match(&string_match); - xt_unregister_match(&string6_match); + xt_unregister_matches(xt_string_match, ARRAY_SIZE(xt_string_match)); } module_init(xt_string_init); diff --git a/net/netfilter/xt_tcpmss.c b/net/netfilter/xt_tcpmss.c index cf7d335c..7baa9eb 100644 --- a/net/netfilter/xt_tcpmss.c +++ b/net/netfilter/xt_tcpmss.c @@ -93,43 +93,34 @@ match(const struct sk_buff *skb, info->invert, hotdrop); } -static struct xt_match tcpmss_match = { - .name = "tcpmss", - .match = match, - .matchsize = sizeof(struct xt_tcpmss_match_info), - .proto = IPPROTO_TCP, - .family = AF_INET, - .me = THIS_MODULE, +static struct xt_match xt_tcpmss_match[] = { + { + .name = "tcpmss", + .family = AF_INET, + .match = match, + .matchsize = sizeof(struct xt_tcpmss_match_info), + .proto = IPPROTO_TCP, + .me = THIS_MODULE, + }, + { + .name = "tcpmss", + .family = AF_INET6, + .match = match, + .matchsize = sizeof(struct xt_tcpmss_match_info), + .proto = IPPROTO_TCP, + .me = THIS_MODULE, + }, }; -static struct xt_match tcpmss6_match = { - .name = "tcpmss", - .match = match, - .matchsize = sizeof(struct xt_tcpmss_match_info), - .proto = IPPROTO_TCP, - .family = AF_INET6, - .me = THIS_MODULE, -}; - - static int __init xt_tcpmss_init(void) { - int ret; - ret = xt_register_match(&tcpmss_match); - if (ret) - return ret; - - ret = xt_register_match(&tcpmss6_match); - if (ret) - xt_unregister_match(&tcpmss_match); - - return ret; + return xt_register_matches(xt_tcpmss_match, + ARRAY_SIZE(xt_tcpmss_match)); } static void __exit xt_tcpmss_fini(void) { - xt_unregister_match(&tcpmss6_match); - xt_unregister_match(&tcpmss_match); + xt_unregister_matches(xt_tcpmss_match, ARRAY_SIZE(xt_tcpmss_match)); } module_init(xt_tcpmss_init); diff --git a/net/netfilter/xt_tcpudp.c b/net/netfilter/xt_tcpudp.c index a9a63aa..54aab05 100644 --- a/net/netfilter/xt_tcpudp.c +++ b/net/netfilter/xt_tcpudp.c @@ -199,81 +199,54 @@ udp_checkentry(const char *tablename, return !(udpinfo->invflags & ~XT_UDP_INV_MASK); } -static struct xt_match tcp_matchstruct = { - .name = "tcp", - .match = tcp_match, - .matchsize = sizeof(struct xt_tcp), - .proto = IPPROTO_TCP, - .family = AF_INET, - .checkentry = tcp_checkentry, - .me = THIS_MODULE, -}; - -static struct xt_match tcp6_matchstruct = { - .name = "tcp", - .match = tcp_match, - .matchsize = sizeof(struct xt_tcp), - .proto = IPPROTO_TCP, - .family = AF_INET6, - .checkentry = tcp_checkentry, - .me = THIS_MODULE, -}; - -static struct xt_match udp_matchstruct = { - .name = "udp", - .match = udp_match, - .matchsize = sizeof(struct xt_udp), - .proto = IPPROTO_UDP, - .family = AF_INET, - .checkentry = udp_checkentry, - .me = THIS_MODULE, -}; -static struct xt_match udp6_matchstruct = { - .name = "udp", - .match = udp_match, - .matchsize = sizeof(struct xt_udp), - .proto = IPPROTO_UDP, - .family = AF_INET6, - .checkentry = udp_checkentry, - .me = THIS_MODULE, +static struct xt_match xt_tcpudp_match[] = { + { + .name = "tcp", + .family = AF_INET, + .checkentry = tcp_checkentry, + .match = tcp_match, + .matchsize = sizeof(struct xt_tcp), + .proto = IPPROTO_TCP, + .me = THIS_MODULE, + }, + { + .name = "tcp", + .family = AF_INET6, + .checkentry = tcp_checkentry, + .match = tcp_match, + .matchsize = sizeof(struct xt_tcp), + .proto = IPPROTO_TCP, + .me = THIS_MODULE, + }, + { + .name = "udp", + .family = AF_INET, + .checkentry = udp_checkentry, + .match = udp_match, + .matchsize = sizeof(struct xt_udp), + .proto = IPPROTO_UDP, + .me = THIS_MODULE, + }, + { + .name = "udp", + .family = AF_INET6, + .checkentry = udp_checkentry, + .match = udp_match, + .matchsize = sizeof(struct xt_udp), + .proto = IPPROTO_UDP, + .me = THIS_MODULE, + }, }; static int __init xt_tcpudp_init(void) { - int ret; - ret = xt_register_match(&tcp_matchstruct); - if (ret) - return ret; - - ret = xt_register_match(&tcp6_matchstruct); - if (ret) - goto out_unreg_tcp; - - ret = xt_register_match(&udp_matchstruct); - if (ret) - goto out_unreg_tcp6; - - ret = xt_register_match(&udp6_matchstruct); - if (ret) - goto out_unreg_udp; - - return ret; - -out_unreg_udp: - xt_unregister_match(&udp_matchstruct); -out_unreg_tcp6: - xt_unregister_match(&tcp6_matchstruct); -out_unreg_tcp: - xt_unregister_match(&tcp_matchstruct); - return ret; + return xt_register_matches(xt_tcpudp_match, + ARRAY_SIZE(xt_tcpudp_match)); } static void __exit xt_tcpudp_fini(void) { - xt_unregister_match(&udp6_matchstruct); - xt_unregister_match(&udp_matchstruct); - xt_unregister_match(&tcp6_matchstruct); - xt_unregister_match(&tcp_matchstruct); + xt_unregister_matches(xt_tcpudp_match, ARRAY_SIZE(xt_tcpudp_match)); } module_init(xt_tcpudp_init); -- cgit v0.10.2 From fe1cb10873b44cf89082465823ee6d4d4ac63ad7 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Tue, 22 Aug 2006 00:35:47 -0700 Subject: [NETFILTER]: x_tables: remove unused argument to target functions Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index 9a99124..9cef0e9 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -211,8 +211,7 @@ struct xt_target const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userdata); + const void *targinfo); /* Called when user tries to insert an entry of this type: hook_mask is a bitmask of hooks from which it can be diff --git a/include/linux/netfilter_arp/arp_tables.h b/include/linux/netfilter_arp/arp_tables.h index 62cc27d..149e87c 100644 --- a/include/linux/netfilter_arp/arp_tables.h +++ b/include/linux/netfilter_arp/arp_tables.h @@ -248,8 +248,7 @@ extern unsigned int arpt_do_table(struct sk_buff **pskb, unsigned int hook, const struct net_device *in, const struct net_device *out, - struct arpt_table *table, - void *userdata); + struct arpt_table *table); #define ARPT_ALIGN(s) (((s) + (__alignof__(struct arpt_entry)-1)) & ~(__alignof__(struct arpt_entry)-1)) #endif /*__KERNEL__*/ diff --git a/include/linux/netfilter_ipv4/ip_tables.h b/include/linux/netfilter_ipv4/ip_tables.h index c0dac16..a536bbd 100644 --- a/include/linux/netfilter_ipv4/ip_tables.h +++ b/include/linux/netfilter_ipv4/ip_tables.h @@ -312,8 +312,7 @@ extern unsigned int ipt_do_table(struct sk_buff **pskb, unsigned int hook, const struct net_device *in, const struct net_device *out, - struct ipt_table *table, - void *userdata); + struct ipt_table *table); #define IPT_ALIGN(s) XT_ALIGN(s) diff --git a/include/linux/netfilter_ipv6/ip6_tables.h b/include/linux/netfilter_ipv6/ip6_tables.h index d0d5d1e..d7a8e9c0 100644 --- a/include/linux/netfilter_ipv6/ip6_tables.h +++ b/include/linux/netfilter_ipv6/ip6_tables.h @@ -300,8 +300,7 @@ extern unsigned int ip6t_do_table(struct sk_buff **pskb, unsigned int hook, const struct net_device *in, const struct net_device *out, - struct ip6t_table *table, - void *userdata); + struct ip6t_table *table); /* Check for an extension */ extern int ip6t_ext_hdr(u8 nexthdr); diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index 8d1d7a6..c6bd270 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -208,8 +208,7 @@ static unsigned int arpt_error(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { if (net_ratelimit()) printk("arp_tables: error: '%s'\n", (char *)targinfo); @@ -226,8 +225,7 @@ unsigned int arpt_do_table(struct sk_buff **pskb, unsigned int hook, const struct net_device *in, const struct net_device *out, - struct arpt_table *table, - void *userdata) + struct arpt_table *table) { static const char nulldevname[IFNAMSIZ]; unsigned int verdict = NF_DROP; @@ -302,8 +300,7 @@ unsigned int arpt_do_table(struct sk_buff **pskb, in, out, hook, t->u.kernel.target, - t->data, - userdata); + t->data); /* Target might have changed stuff. */ arp = (*pskb)->nh.arph; diff --git a/net/ipv4/netfilter/arpt_mangle.c b/net/ipv4/netfilter/arpt_mangle.c index a58325c..05fb242 100644 --- a/net/ipv4/netfilter/arpt_mangle.c +++ b/net/ipv4/netfilter/arpt_mangle.c @@ -11,7 +11,7 @@ static unsigned int target(struct sk_buff **pskb, const struct net_device *in, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, void *userinfo) + const void *targinfo) { const struct arpt_mangle *mangle = targinfo; struct arphdr *arp; diff --git a/net/ipv4/netfilter/arptable_filter.c b/net/ipv4/netfilter/arptable_filter.c index d7c472f..7edea2a 100644 --- a/net/ipv4/netfilter/arptable_filter.c +++ b/net/ipv4/netfilter/arptable_filter.c @@ -155,7 +155,7 @@ static unsigned int arpt_hook(unsigned int hook, const struct net_device *out, int (*okfn)(struct sk_buff *)) { - return arpt_do_table(pskb, hook, in, out, &packet_filter, NULL); + return arpt_do_table(pskb, hook, in, out, &packet_filter); } static struct nf_hook_ops arpt_ops[] = { diff --git a/net/ipv4/netfilter/ip_nat_rule.c b/net/ipv4/netfilter/ip_nat_rule.c index 1aba926..1aa0e4f 100644 --- a/net/ipv4/netfilter/ip_nat_rule.c +++ b/net/ipv4/netfilter/ip_nat_rule.c @@ -104,8 +104,7 @@ static unsigned int ipt_snat_target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct ipt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { struct ip_conntrack *ct; enum ip_conntrack_info ctinfo; @@ -147,8 +146,7 @@ static unsigned int ipt_dnat_target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct ipt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { struct ip_conntrack *ct; enum ip_conntrack_info ctinfo; @@ -255,7 +253,7 @@ int ip_nat_rule_find(struct sk_buff **pskb, { int ret; - ret = ipt_do_table(pskb, hooknum, in, out, &nat_table, NULL); + ret = ipt_do_table(pskb, hooknum, in, out, &nat_table); if (ret == NF_ACCEPT) { if (!ip_nat_initialized(ct, HOOK2MANIP(hooknum))) diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index 048514f..8ce5b6f 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -180,8 +180,7 @@ ipt_error(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { if (net_ratelimit()) printk("ip_tables: error: `%s'\n", (char *)targinfo); @@ -217,8 +216,7 @@ ipt_do_table(struct sk_buff **pskb, unsigned int hook, const struct net_device *in, const struct net_device *out, - struct ipt_table *table, - void *userdata) + struct ipt_table *table) { static const char nulldevname[IFNAMSIZ] __attribute__((aligned(sizeof(long)))); u_int16_t offset; @@ -308,8 +306,7 @@ ipt_do_table(struct sk_buff **pskb, in, out, hook, t->u.kernel.target, - t->data, - userdata); + t->data); #ifdef CONFIG_NETFILTER_DEBUG if (((struct ipt_entry *)table_base)->comefrom diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c index d994c5f..a08383c 100644 --- a/net/ipv4/netfilter/ipt_CLUSTERIP.c +++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c @@ -302,8 +302,7 @@ target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { const struct ipt_clusterip_tgt_info *cipinfo = targinfo; enum ip_conntrack_info ctinfo; diff --git a/net/ipv4/netfilter/ipt_ECN.c b/net/ipv4/netfilter/ipt_ECN.c index 7e30e6d..1c3da4a 100644 --- a/net/ipv4/netfilter/ipt_ECN.c +++ b/net/ipv4/netfilter/ipt_ECN.c @@ -85,8 +85,7 @@ target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { const struct ipt_ECN_info *einfo = targinfo; diff --git a/net/ipv4/netfilter/ipt_LOG.c b/net/ipv4/netfilter/ipt_LOG.c index b98f7b0..a8d356c 100644 --- a/net/ipv4/netfilter/ipt_LOG.c +++ b/net/ipv4/netfilter/ipt_LOG.c @@ -416,8 +416,7 @@ ipt_log_target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { const struct ipt_log_info *loginfo = targinfo; struct nf_loginfo li; diff --git a/net/ipv4/netfilter/ipt_MASQUERADE.c b/net/ipv4/netfilter/ipt_MASQUERADE.c index ebd94f2..9659793 100644 --- a/net/ipv4/netfilter/ipt_MASQUERADE.c +++ b/net/ipv4/netfilter/ipt_MASQUERADE.c @@ -64,8 +64,7 @@ masquerade_target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { struct ip_conntrack *ct; enum ip_conntrack_info ctinfo; diff --git a/net/ipv4/netfilter/ipt_NETMAP.c b/net/ipv4/netfilter/ipt_NETMAP.c index 736c4b5..fd5e74a 100644 --- a/net/ipv4/netfilter/ipt_NETMAP.c +++ b/net/ipv4/netfilter/ipt_NETMAP.c @@ -55,8 +55,7 @@ target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { struct ip_conntrack *ct; enum ip_conntrack_info ctinfo; diff --git a/net/ipv4/netfilter/ipt_REDIRECT.c b/net/ipv4/netfilter/ipt_REDIRECT.c index f290463..839fe99 100644 --- a/net/ipv4/netfilter/ipt_REDIRECT.c +++ b/net/ipv4/netfilter/ipt_REDIRECT.c @@ -58,8 +58,7 @@ redirect_target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { struct ip_conntrack *ct; enum ip_conntrack_info ctinfo; diff --git a/net/ipv4/netfilter/ipt_REJECT.c b/net/ipv4/netfilter/ipt_REJECT.c index 95c6662..1dfd8e5 100644 --- a/net/ipv4/netfilter/ipt_REJECT.c +++ b/net/ipv4/netfilter/ipt_REJECT.c @@ -228,8 +228,7 @@ static unsigned int reject(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { const struct ipt_reject_info *reject = targinfo; diff --git a/net/ipv4/netfilter/ipt_SAME.c b/net/ipv4/netfilter/ipt_SAME.c index 7169b09..cf80174 100644 --- a/net/ipv4/netfilter/ipt_SAME.c +++ b/net/ipv4/netfilter/ipt_SAME.c @@ -133,8 +133,7 @@ same_target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { struct ip_conntrack *ct; enum ip_conntrack_info ctinfo; diff --git a/net/ipv4/netfilter/ipt_TCPMSS.c b/net/ipv4/netfilter/ipt_TCPMSS.c index 0fce85e..6d668dc 100644 --- a/net/ipv4/netfilter/ipt_TCPMSS.c +++ b/net/ipv4/netfilter/ipt_TCPMSS.c @@ -41,8 +41,7 @@ ipt_tcpmss_target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { const struct ipt_tcpmss_info *tcpmssinfo = targinfo; struct tcphdr *tcph; diff --git a/net/ipv4/netfilter/ipt_TOS.c b/net/ipv4/netfilter/ipt_TOS.c index 52e9d70..043df01 100644 --- a/net/ipv4/netfilter/ipt_TOS.c +++ b/net/ipv4/netfilter/ipt_TOS.c @@ -26,8 +26,7 @@ target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { const struct ipt_tos_target_info *tosinfo = targinfo; struct iphdr *iph = (*pskb)->nh.iph; diff --git a/net/ipv4/netfilter/ipt_TTL.c b/net/ipv4/netfilter/ipt_TTL.c index 2afb2a8..1640071 100644 --- a/net/ipv4/netfilter/ipt_TTL.c +++ b/net/ipv4/netfilter/ipt_TTL.c @@ -23,7 +23,7 @@ static unsigned int ipt_ttl_target(struct sk_buff **pskb, const struct net_device *in, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, void *userinfo) + const void *targinfo) { struct iphdr *iph; const struct ipt_TTL_info *info = targinfo; diff --git a/net/ipv4/netfilter/ipt_ULOG.c b/net/ipv4/netfilter/ipt_ULOG.c index d46fd67..4c5f0a1 100644 --- a/net/ipv4/netfilter/ipt_ULOG.c +++ b/net/ipv4/netfilter/ipt_ULOG.c @@ -308,7 +308,7 @@ static unsigned int ipt_ulog_target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, void *userinfo) + const void *targinfo) { struct ipt_ulog_info *loginfo = (struct ipt_ulog_info *) targinfo; diff --git a/net/ipv4/netfilter/iptable_filter.c b/net/ipv4/netfilter/iptable_filter.c index 7f41748..e2e7dd8 100644 --- a/net/ipv4/netfilter/iptable_filter.c +++ b/net/ipv4/netfilter/iptable_filter.c @@ -90,7 +90,7 @@ ipt_hook(unsigned int hook, const struct net_device *out, int (*okfn)(struct sk_buff *)) { - return ipt_do_table(pskb, hook, in, out, &packet_filter, NULL); + return ipt_do_table(pskb, hook, in, out, &packet_filter); } static unsigned int @@ -108,7 +108,7 @@ ipt_local_out_hook(unsigned int hook, return NF_ACCEPT; } - return ipt_do_table(pskb, hook, in, out, &packet_filter, NULL); + return ipt_do_table(pskb, hook, in, out, &packet_filter); } static struct nf_hook_ops ipt_ops[] = { diff --git a/net/ipv4/netfilter/iptable_mangle.c b/net/ipv4/netfilter/iptable_mangle.c index 4e7998b..79336cb 100644 --- a/net/ipv4/netfilter/iptable_mangle.c +++ b/net/ipv4/netfilter/iptable_mangle.c @@ -119,7 +119,7 @@ ipt_route_hook(unsigned int hook, const struct net_device *out, int (*okfn)(struct sk_buff *)) { - return ipt_do_table(pskb, hook, in, out, &packet_mangler, NULL); + return ipt_do_table(pskb, hook, in, out, &packet_mangler); } static unsigned int @@ -148,7 +148,7 @@ ipt_local_hook(unsigned int hook, daddr = (*pskb)->nh.iph->daddr; tos = (*pskb)->nh.iph->tos; - ret = ipt_do_table(pskb, hook, in, out, &packet_mangler, NULL); + ret = ipt_do_table(pskb, hook, in, out, &packet_mangler); /* Reroute for ANY change. */ if (ret != NF_DROP && ret != NF_STOLEN && ret != NF_QUEUE && ((*pskb)->nh.iph->saddr != saddr diff --git a/net/ipv4/netfilter/iptable_raw.c b/net/ipv4/netfilter/iptable_raw.c index 7912cce..bcbeb4a 100644 --- a/net/ipv4/netfilter/iptable_raw.c +++ b/net/ipv4/netfilter/iptable_raw.c @@ -95,7 +95,7 @@ ipt_hook(unsigned int hook, const struct net_device *out, int (*okfn)(struct sk_buff *)) { - return ipt_do_table(pskb, hook, in, out, &packet_raw, NULL); + return ipt_do_table(pskb, hook, in, out, &packet_raw); } /* 'raw' is the very first table. */ diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index c9d6b23..38cd7ff 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -220,8 +220,7 @@ ip6t_error(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { if (net_ratelimit()) printk("ip6_tables: error: `%s'\n", (char *)targinfo); @@ -258,8 +257,7 @@ ip6t_do_table(struct sk_buff **pskb, unsigned int hook, const struct net_device *in, const struct net_device *out, - struct xt_table *table, - void *userdata) + struct xt_table *table) { static const char nulldevname[IFNAMSIZ] __attribute__((aligned(sizeof(long)))); int offset = 0; @@ -349,8 +347,7 @@ ip6t_do_table(struct sk_buff **pskb, in, out, hook, t->u.kernel.target, - t->data, - userdata); + t->data); #ifdef CONFIG_NETFILTER_DEBUG if (((struct ip6t_entry *)table_base)->comefrom diff --git a/net/ipv6/netfilter/ip6t_HL.c b/net/ipv6/netfilter/ip6t_HL.c index b8eff8e..c85d124 100644 --- a/net/ipv6/netfilter/ip6t_HL.c +++ b/net/ipv6/netfilter/ip6t_HL.c @@ -22,7 +22,7 @@ static unsigned int ip6t_hl_target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, void *userinfo) + const void *targinfo) { struct ipv6hdr *ip6h; const struct ip6t_HL_info *info = targinfo; diff --git a/net/ipv6/netfilter/ip6t_LOG.c b/net/ipv6/netfilter/ip6t_LOG.c index 73c6300..acb9173 100644 --- a/net/ipv6/netfilter/ip6t_LOG.c +++ b/net/ipv6/netfilter/ip6t_LOG.c @@ -427,8 +427,7 @@ ip6t_log_target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { const struct ip6t_log_info *loginfo = targinfo; struct nf_loginfo li; diff --git a/net/ipv6/netfilter/ip6t_REJECT.c b/net/ipv6/netfilter/ip6t_REJECT.c index 7929ff4..343acd3 100644 --- a/net/ipv6/netfilter/ip6t_REJECT.c +++ b/net/ipv6/netfilter/ip6t_REJECT.c @@ -180,8 +180,7 @@ static unsigned int reject6_target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { const struct ip6t_reject_info *reject = targinfo; diff --git a/net/ipv6/netfilter/ip6table_filter.c b/net/ipv6/netfilter/ip6table_filter.c index 60976c0..2fc07c7 100644 --- a/net/ipv6/netfilter/ip6table_filter.c +++ b/net/ipv6/netfilter/ip6table_filter.c @@ -108,7 +108,7 @@ ip6t_hook(unsigned int hook, const struct net_device *out, int (*okfn)(struct sk_buff *)) { - return ip6t_do_table(pskb, hook, in, out, &packet_filter, NULL); + return ip6t_do_table(pskb, hook, in, out, &packet_filter); } static unsigned int @@ -128,7 +128,7 @@ ip6t_local_out_hook(unsigned int hook, } #endif - return ip6t_do_table(pskb, hook, in, out, &packet_filter, NULL); + return ip6t_do_table(pskb, hook, in, out, &packet_filter); } static struct nf_hook_ops ip6t_ops[] = { diff --git a/net/ipv6/netfilter/ip6table_mangle.c b/net/ipv6/netfilter/ip6table_mangle.c index 03a13ea..32db04f 100644 --- a/net/ipv6/netfilter/ip6table_mangle.c +++ b/net/ipv6/netfilter/ip6table_mangle.c @@ -138,7 +138,7 @@ ip6t_route_hook(unsigned int hook, const struct net_device *out, int (*okfn)(struct sk_buff *)) { - return ip6t_do_table(pskb, hook, in, out, &packet_mangler, NULL); + return ip6t_do_table(pskb, hook, in, out, &packet_mangler); } static unsigned int @@ -174,7 +174,7 @@ ip6t_local_hook(unsigned int hook, /* flowlabel and prio (includes version, which shouldn't change either */ flowlabel = *((u_int32_t *) (*pskb)->nh.ipv6h); - ret = ip6t_do_table(pskb, hook, in, out, &packet_mangler, NULL); + ret = ip6t_do_table(pskb, hook, in, out, &packet_mangler); if (ret != NF_DROP && ret != NF_STOLEN && (memcmp(&(*pskb)->nh.ipv6h->saddr, &saddr, sizeof(saddr)) diff --git a/net/ipv6/netfilter/ip6table_raw.c b/net/ipv6/netfilter/ip6table_raw.c index 61a7c58..b4154da 100644 --- a/net/ipv6/netfilter/ip6table_raw.c +++ b/net/ipv6/netfilter/ip6table_raw.c @@ -122,7 +122,7 @@ ip6t_hook(unsigned int hook, const struct net_device *out, int (*okfn)(struct sk_buff *)) { - return ip6t_do_table(pskb, hook, in, out, &packet_raw, NULL); + return ip6t_do_table(pskb, hook, in, out, &packet_raw); } static struct nf_hook_ops ip6t_ops[] = { diff --git a/net/netfilter/xt_CLASSIFY.c b/net/netfilter/xt_CLASSIFY.c index 1f92edd..50de965 100644 --- a/net/netfilter/xt_CLASSIFY.c +++ b/net/netfilter/xt_CLASSIFY.c @@ -29,8 +29,7 @@ target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { const struct xt_classify_target_info *clinfo = targinfo; diff --git a/net/netfilter/xt_CONNMARK.c b/net/netfilter/xt_CONNMARK.c index e577356..c2125f6 100644 --- a/net/netfilter/xt_CONNMARK.c +++ b/net/netfilter/xt_CONNMARK.c @@ -38,8 +38,7 @@ target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { const struct xt_connmark_target_info *markinfo = targinfo; u_int32_t diff; diff --git a/net/netfilter/xt_CONNSECMARK.c b/net/netfilter/xt_CONNSECMARK.c index 48f7fc3..4b9cc65 100644 --- a/net/netfilter/xt_CONNSECMARK.c +++ b/net/netfilter/xt_CONNSECMARK.c @@ -66,7 +66,7 @@ static void secmark_restore(struct sk_buff *skb) static unsigned int target(struct sk_buff **pskb, const struct net_device *in, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, void *userinfo) + const void *targinfo) { struct sk_buff *skb = *pskb; const struct xt_connsecmark_target_info *info = targinfo; diff --git a/net/netfilter/xt_DSCP.c b/net/netfilter/xt_DSCP.c index a1cd972..9d23c95 100644 --- a/net/netfilter/xt_DSCP.c +++ b/net/netfilter/xt_DSCP.c @@ -32,8 +32,7 @@ static unsigned int target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { const struct xt_DSCP_info *dinfo = targinfo; u_int8_t dscp = ipv4_get_dsfield((*pskb)->nh.iph) >> XT_DSCP_SHIFT; @@ -54,8 +53,7 @@ static unsigned int target6(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { const struct xt_DSCP_info *dinfo = targinfo; u_int8_t dscp = ipv6_get_dsfield((*pskb)->nh.ipv6h) >> XT_DSCP_SHIFT; diff --git a/net/netfilter/xt_MARK.c b/net/netfilter/xt_MARK.c index 0a61272..95a171c 100644 --- a/net/netfilter/xt_MARK.c +++ b/net/netfilter/xt_MARK.c @@ -27,8 +27,7 @@ target_v0(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { const struct xt_mark_target_info *markinfo = targinfo; @@ -44,8 +43,7 @@ target_v1(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { const struct xt_mark_target_info_v1 *markinfo = targinfo; int mark = 0; diff --git a/net/netfilter/xt_NFQUEUE.c b/net/netfilter/xt_NFQUEUE.c index 7b98228..db9b896 100644 --- a/net/netfilter/xt_NFQUEUE.c +++ b/net/netfilter/xt_NFQUEUE.c @@ -29,8 +29,7 @@ target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { const struct xt_NFQ_info *tinfo = targinfo; diff --git a/net/netfilter/xt_NOTRACK.c b/net/netfilter/xt_NOTRACK.c index cab881d..6d00dca 100644 --- a/net/netfilter/xt_NOTRACK.c +++ b/net/netfilter/xt_NOTRACK.c @@ -16,8 +16,7 @@ target(struct sk_buff **pskb, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, - void *userinfo) + const void *targinfo) { /* Previously seen (loopback)? Ignore. */ if ((*pskb)->nfct != NULL) diff --git a/net/netfilter/xt_SECMARK.c b/net/netfilter/xt_SECMARK.c index 4300988..8a04dcf 100644 --- a/net/netfilter/xt_SECMARK.c +++ b/net/netfilter/xt_SECMARK.c @@ -31,7 +31,7 @@ static u8 mode; static unsigned int target(struct sk_buff **pskb, const struct net_device *in, const struct net_device *out, unsigned int hooknum, const struct xt_target *target, - const void *targinfo, void *userinfo) + const void *targinfo) { u32 secmark = 0; const struct xt_secmark_target_info *info = targinfo; diff --git a/net/netfilter/xt_connbytes.c b/net/netfilter/xt_connbytes.c index 2d49948..d725e8b 100644 --- a/net/netfilter/xt_connbytes.c +++ b/net/netfilter/xt_connbytes.c @@ -143,7 +143,7 @@ static int check(const char *tablename, return 1; } -static struct xt_match xt_connbytes_match = { +static struct xt_match xt_connbytes_match[] = { { .name = "connbytes", .family = AF_INET, diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c index 224c078..45a3143 100644 --- a/net/sched/act_ipt.c +++ b/net/sched/act_ipt.c @@ -222,7 +222,7 @@ static int tcf_ipt(struct sk_buff *skb, struct tc_action *a, ret = ipt->tcfi_t->u.kernel.target->target(&skb, skb->dev, NULL, ipt->tcfi_hook, ipt->tcfi_t->u.kernel.target, - ipt->tcfi_t->data, NULL); + ipt->tcfi_t->data); switch (ret) { case NF_ACCEPT: result = TC_ACT_OK; -- cgit v0.10.2 From efa741656e9ebf5fd6e0432b0d1b3c7f156392d3 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Tue, 22 Aug 2006 00:36:37 -0700 Subject: [NETFILTER]: x_tables: remove unused size argument to check/destroy functions The size is verified by x_tables and isn't needed by the modules anymore. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index 9cef0e9..9d97102 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -174,12 +174,10 @@ struct xt_match const void *ip, const struct xt_match *match, void *matchinfo, - unsigned int matchinfosize, unsigned int hook_mask); /* Called when entry of this type deleted. */ - void (*destroy)(const struct xt_match *match, void *matchinfo, - unsigned int matchinfosize); + void (*destroy)(const struct xt_match *match, void *matchinfo); /* Called when userspace align differs from kernel space one */ int (*compat)(void *match, void **dstptr, int *size, int convert); @@ -221,12 +219,10 @@ struct xt_target const void *entry, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask); /* Called when entry of this type deleted. */ - void (*destroy)(const struct xt_target *target, void *targinfo, - unsigned int targinfosize); + void (*destroy)(const struct xt_target *target, void *targinfo); /* Called when userspace align differs from kernel space one */ int (*compat)(void *target, void **dstptr, int *size, int convert); diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index c6bd270..4f10b06 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -491,8 +491,6 @@ static inline int check_entry(struct arpt_entry *e, const char *name, unsigned i } } else if (t->u.kernel.target->checkentry && !t->u.kernel.target->checkentry(name, e, target, t->data, - t->u.target_size - - sizeof(*t), e->comefrom)) { duprintf("arp_tables: check failed for `%s'.\n", t->u.kernel.target->name); @@ -559,8 +557,7 @@ static inline int cleanup_entry(struct arpt_entry *e, unsigned int *i) t = arpt_get_target(e); if (t->u.kernel.target->destroy) - t->u.kernel.target->destroy(t->u.kernel.target, t->data, - t->u.target_size - sizeof(*t)); + t->u.kernel.target->destroy(t->u.kernel.target, t->data); module_put(t->u.kernel.target->me); return 0; } diff --git a/net/ipv4/netfilter/arpt_mangle.c b/net/ipv4/netfilter/arpt_mangle.c index 05fb242..d12b1df 100644 --- a/net/ipv4/netfilter/arpt_mangle.c +++ b/net/ipv4/netfilter/arpt_mangle.c @@ -67,7 +67,7 @@ target(struct sk_buff **pskb, static int checkentry(const char *tablename, const void *e, const struct xt_target *target, - void *targinfo, unsigned int targinfosize, unsigned int hook_mask) + void *targinfo, unsigned int hook_mask) { const struct arpt_mangle *mangle = targinfo; diff --git a/net/ipv4/netfilter/ip_nat_rule.c b/net/ipv4/netfilter/ip_nat_rule.c index 1aa0e4f..e59f5a8 100644 --- a/net/ipv4/netfilter/ip_nat_rule.c +++ b/net/ipv4/netfilter/ip_nat_rule.c @@ -172,7 +172,6 @@ static int ipt_snat_checkentry(const char *tablename, const void *entry, const struct ipt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { struct ip_nat_multi_range_compat *mr = targinfo; @@ -189,7 +188,6 @@ static int ipt_dnat_checkentry(const char *tablename, const void *entry, const struct ipt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { struct ip_nat_multi_range_compat *mr = targinfo; diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index 8ce5b6f..a0f3680 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -464,8 +464,7 @@ cleanup_match(struct ipt_entry_match *m, unsigned int *i) return 1; if (m->u.kernel.match->destroy) - m->u.kernel.match->destroy(m->u.kernel.match, m->data, - m->u.match_size - sizeof(*m)); + m->u.kernel.match->destroy(m->u.kernel.match, m->data); module_put(m->u.kernel.match->me); return 0; } @@ -518,7 +517,6 @@ check_match(struct ipt_entry_match *m, if (m->u.kernel.match->checkentry && !m->u.kernel.match->checkentry(name, ip, match, m->data, - m->u.match_size - sizeof(*m), hookmask)) { duprintf("ip_tables: check failed for `%s'.\n", m->u.kernel.match->name); @@ -579,8 +577,6 @@ check_entry(struct ipt_entry *e, const char *name, unsigned int size, } } else if (t->u.kernel.target->checkentry && !t->u.kernel.target->checkentry(name, e, target, t->data, - t->u.target_size - - sizeof(*t), e->comefrom)) { duprintf("ip_tables: check failed for `%s'.\n", t->u.kernel.target->name); @@ -652,8 +648,7 @@ cleanup_entry(struct ipt_entry *e, unsigned int *i) IPT_MATCH_ITERATE(e, cleanup_match, NULL); t = ipt_get_target(e); if (t->u.kernel.target->destroy) - t->u.kernel.target->destroy(t->u.kernel.target, t->data, - t->u.target_size - sizeof(*t)); + t->u.kernel.target->destroy(t->u.kernel.target, t->data); module_put(t->u.kernel.target->me); return 0; } @@ -1599,7 +1594,6 @@ static inline int compat_copy_match_from_user(struct ipt_entry_match *m, if (m->u.kernel.match->checkentry && !m->u.kernel.match->checkentry(name, ip, match, dm->data, - dm->u.match_size - sizeof(*dm), hookmask)) { duprintf("ip_tables: check failed for `%s'.\n", m->u.kernel.match->name); @@ -1658,8 +1652,7 @@ static int compat_copy_entry_from_user(struct ipt_entry *e, void **dstptr, goto out; } else if (t->u.kernel.target->checkentry && !t->u.kernel.target->checkentry(name, de, target, - t->data, t->u.target_size - sizeof(*t), - de->comefrom)) { + t->data, de->comefrom)) { duprintf("ip_tables: compat: check failed for `%s'.\n", t->u.kernel.target->name); goto out; @@ -2182,7 +2175,6 @@ icmp_checkentry(const char *tablename, const void *info, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct ipt_icmp *icmpinfo = matchinfo; diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c index a08383c..4158966 100644 --- a/net/ipv4/netfilter/ipt_CLUSTERIP.c +++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c @@ -372,7 +372,6 @@ checkentry(const char *tablename, const void *e_void, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { struct ipt_clusterip_tgt_info *cipinfo = targinfo; @@ -449,8 +448,7 @@ checkentry(const char *tablename, } /* drop reference count of cluster config when rule is deleted */ -static void destroy(const struct xt_target *target, void *targinfo, - unsigned int targinfosize) +static void destroy(const struct xt_target *target, void *targinfo) { struct ipt_clusterip_tgt_info *cipinfo = targinfo; diff --git a/net/ipv4/netfilter/ipt_ECN.c b/net/ipv4/netfilter/ipt_ECN.c index 1c3da4a..23f9c7e 100644 --- a/net/ipv4/netfilter/ipt_ECN.c +++ b/net/ipv4/netfilter/ipt_ECN.c @@ -106,7 +106,6 @@ checkentry(const char *tablename, const void *e_void, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { const struct ipt_ECN_info *einfo = (struct ipt_ECN_info *)targinfo; diff --git a/net/ipv4/netfilter/ipt_LOG.c b/net/ipv4/netfilter/ipt_LOG.c index a8d356c..7dc820d 100644 --- a/net/ipv4/netfilter/ipt_LOG.c +++ b/net/ipv4/netfilter/ipt_LOG.c @@ -439,7 +439,6 @@ static int ipt_log_checkentry(const char *tablename, const void *e, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { const struct ipt_log_info *loginfo = targinfo; diff --git a/net/ipv4/netfilter/ipt_MASQUERADE.c b/net/ipv4/netfilter/ipt_MASQUERADE.c index 9659793..bc65168 100644 --- a/net/ipv4/netfilter/ipt_MASQUERADE.c +++ b/net/ipv4/netfilter/ipt_MASQUERADE.c @@ -42,7 +42,6 @@ masquerade_check(const char *tablename, const void *e, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { const struct ip_nat_multi_range_compat *mr = targinfo; diff --git a/net/ipv4/netfilter/ipt_NETMAP.c b/net/ipv4/netfilter/ipt_NETMAP.c index fd5e74a..beb2914 100644 --- a/net/ipv4/netfilter/ipt_NETMAP.c +++ b/net/ipv4/netfilter/ipt_NETMAP.c @@ -33,7 +33,6 @@ check(const char *tablename, const void *e, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { const struct ip_nat_multi_range_compat *mr = targinfo; diff --git a/net/ipv4/netfilter/ipt_REDIRECT.c b/net/ipv4/netfilter/ipt_REDIRECT.c index 839fe99..f03d436 100644 --- a/net/ipv4/netfilter/ipt_REDIRECT.c +++ b/net/ipv4/netfilter/ipt_REDIRECT.c @@ -36,7 +36,6 @@ redirect_check(const char *tablename, const void *e, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { const struct ip_nat_multi_range_compat *mr = targinfo; diff --git a/net/ipv4/netfilter/ipt_REJECT.c b/net/ipv4/netfilter/ipt_REJECT.c index 1dfd8e5..b81821e 100644 --- a/net/ipv4/netfilter/ipt_REJECT.c +++ b/net/ipv4/netfilter/ipt_REJECT.c @@ -276,7 +276,6 @@ static int check(const char *tablename, const void *e_void, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { const struct ipt_reject_info *rejinfo = targinfo; diff --git a/net/ipv4/netfilter/ipt_SAME.c b/net/ipv4/netfilter/ipt_SAME.c index cf80174..efbcb11 100644 --- a/net/ipv4/netfilter/ipt_SAME.c +++ b/net/ipv4/netfilter/ipt_SAME.c @@ -52,7 +52,6 @@ same_check(const char *tablename, const void *e, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { unsigned int count, countess, rangeip, index = 0; @@ -116,8 +115,7 @@ same_check(const char *tablename, } static void -same_destroy(const struct xt_target *target, void *targinfo, - unsigned int targinfosize) +same_destroy(const struct xt_target *target, void *targinfo) { struct ipt_same_info *mr = targinfo; diff --git a/net/ipv4/netfilter/ipt_TCPMSS.c b/net/ipv4/netfilter/ipt_TCPMSS.c index 6d668dc..ac8a35e 100644 --- a/net/ipv4/netfilter/ipt_TCPMSS.c +++ b/net/ipv4/netfilter/ipt_TCPMSS.c @@ -207,7 +207,6 @@ ipt_tcpmss_checkentry(const char *tablename, const void *e_void, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { const struct ipt_tcpmss_info *tcpmssinfo = targinfo; diff --git a/net/ipv4/netfilter/ipt_TOS.c b/net/ipv4/netfilter/ipt_TOS.c index 043df01..471a4c4 100644 --- a/net/ipv4/netfilter/ipt_TOS.c +++ b/net/ipv4/netfilter/ipt_TOS.c @@ -49,7 +49,6 @@ checkentry(const char *tablename, const void *e_void, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { const u_int8_t tos = ((struct ipt_tos_target_info *)targinfo)->tos; diff --git a/net/ipv4/netfilter/ipt_TTL.c b/net/ipv4/netfilter/ipt_TTL.c index 1640071..214d9d9 100644 --- a/net/ipv4/netfilter/ipt_TTL.c +++ b/net/ipv4/netfilter/ipt_TTL.c @@ -67,7 +67,6 @@ static int ipt_ttl_checkentry(const char *tablename, const void *e, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { struct ipt_TTL_info *info = targinfo; diff --git a/net/ipv4/netfilter/ipt_ULOG.c b/net/ipv4/netfilter/ipt_ULOG.c index 4c5f0a1..2b104ea 100644 --- a/net/ipv4/netfilter/ipt_ULOG.c +++ b/net/ipv4/netfilter/ipt_ULOG.c @@ -346,7 +346,6 @@ static int ipt_ulog_checkentry(const char *tablename, const void *e, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hookmask) { struct ipt_ulog_info *loginfo = (struct ipt_ulog_info *) targinfo; diff --git a/net/ipv4/netfilter/ipt_ah.c b/net/ipv4/netfilter/ipt_ah.c index 2927135..1798f86 100644 --- a/net/ipv4/netfilter/ipt_ah.c +++ b/net/ipv4/netfilter/ipt_ah.c @@ -74,7 +74,6 @@ checkentry(const char *tablename, const void *ip_void, const struct xt_match *match, void *matchinfo, - unsigned int matchinfosize, unsigned int hook_mask) { const struct ipt_ah *ahinfo = matchinfo; diff --git a/net/ipv4/netfilter/ipt_ecn.c b/net/ipv4/netfilter/ipt_ecn.c index b282504..dafbdec 100644 --- a/net/ipv4/netfilter/ipt_ecn.c +++ b/net/ipv4/netfilter/ipt_ecn.c @@ -88,8 +88,7 @@ static int match(const struct sk_buff *skb, static int checkentry(const char *tablename, const void *ip_void, const struct xt_match *match, - void *matchinfo, unsigned int matchsize, - unsigned int hook_mask) + void *matchinfo, unsigned int hook_mask) { const struct ipt_ecn_info *info = matchinfo; const struct ipt_ip *ip = ip_void; diff --git a/net/ipv4/netfilter/ipt_hashlimit.c b/net/ipv4/netfilter/ipt_hashlimit.c index 3bd2368..b5b74b0 100644 --- a/net/ipv4/netfilter/ipt_hashlimit.c +++ b/net/ipv4/netfilter/ipt_hashlimit.c @@ -478,7 +478,6 @@ hashlimit_checkentry(const char *tablename, const void *inf, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { struct ipt_hashlimit_info *r = matchinfo; @@ -529,8 +528,7 @@ hashlimit_checkentry(const char *tablename, } static void -hashlimit_destroy(const struct xt_match *match, void *matchinfo, - unsigned int matchsize) +hashlimit_destroy(const struct xt_match *match, void *matchinfo) { struct ipt_hashlimit_info *r = matchinfo; diff --git a/net/ipv4/netfilter/ipt_owner.c b/net/ipv4/netfilter/ipt_owner.c index 5ac6ac0..78c336f1 100644 --- a/net/ipv4/netfilter/ipt_owner.c +++ b/net/ipv4/netfilter/ipt_owner.c @@ -56,7 +56,6 @@ checkentry(const char *tablename, const void *ip, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct ipt_owner_info *info = matchinfo; diff --git a/net/ipv4/netfilter/ipt_recent.c b/net/ipv4/netfilter/ipt_recent.c index 682c094..32ae8d7 100644 --- a/net/ipv4/netfilter/ipt_recent.c +++ b/net/ipv4/netfilter/ipt_recent.c @@ -238,7 +238,7 @@ out: static int ipt_recent_checkentry(const char *tablename, const void *ip, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) + unsigned int hook_mask) { const struct ipt_recent_info *info = matchinfo; struct recent_table *t; @@ -294,8 +294,7 @@ out: } static void -ipt_recent_destroy(const struct xt_match *match, void *matchinfo, - unsigned int matchsize) +ipt_recent_destroy(const struct xt_match *match, void *matchinfo) { const struct ipt_recent_info *info = matchinfo; struct recent_table *t; diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index 38cd7ff..d1c3153 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -504,8 +504,7 @@ cleanup_match(struct ip6t_entry_match *m, unsigned int *i) return 1; if (m->u.kernel.match->destroy) - m->u.kernel.match->destroy(m->u.kernel.match, m->data, - m->u.match_size - sizeof(*m)); + m->u.kernel.match->destroy(m->u.kernel.match, m->data); module_put(m->u.kernel.match->me); return 0; } @@ -558,7 +557,6 @@ check_match(struct ip6t_entry_match *m, if (m->u.kernel.match->checkentry && !m->u.kernel.match->checkentry(name, ipv6, match, m->data, - m->u.match_size - sizeof(*m), hookmask)) { duprintf("ip_tables: check failed for `%s'.\n", m->u.kernel.match->name); @@ -619,8 +617,6 @@ check_entry(struct ip6t_entry *e, const char *name, unsigned int size, } } else if (t->u.kernel.target->checkentry && !t->u.kernel.target->checkentry(name, e, target, t->data, - t->u.target_size - - sizeof(*t), e->comefrom)) { duprintf("ip_tables: check failed for `%s'.\n", t->u.kernel.target->name); @@ -692,8 +688,7 @@ cleanup_entry(struct ip6t_entry *e, unsigned int *i) IP6T_MATCH_ITERATE(e, cleanup_match, NULL); t = ip6t_get_target(e); if (t->u.kernel.target->destroy) - t->u.kernel.target->destroy(t->u.kernel.target, t->data, - t->u.target_size - sizeof(*t)); + t->u.kernel.target->destroy(t->u.kernel.target, t->data); module_put(t->u.kernel.target->me); return 0; } @@ -1349,7 +1344,6 @@ icmp6_checkentry(const char *tablename, const void *entry, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct ip6t_icmp *icmpinfo = matchinfo; diff --git a/net/ipv6/netfilter/ip6t_HL.c b/net/ipv6/netfilter/ip6t_HL.c index c85d124..e54ea92 100644 --- a/net/ipv6/netfilter/ip6t_HL.c +++ b/net/ipv6/netfilter/ip6t_HL.c @@ -66,7 +66,6 @@ static int ip6t_hl_checkentry(const char *tablename, const void *entry, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { struct ip6t_HL_info *info = targinfo; diff --git a/net/ipv6/netfilter/ip6t_LOG.c b/net/ipv6/netfilter/ip6t_LOG.c index acb9173..0cf537d 100644 --- a/net/ipv6/netfilter/ip6t_LOG.c +++ b/net/ipv6/netfilter/ip6t_LOG.c @@ -451,7 +451,6 @@ static int ip6t_log_checkentry(const char *tablename, const void *entry, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { const struct ip6t_log_info *loginfo = targinfo; diff --git a/net/ipv6/netfilter/ip6t_REJECT.c b/net/ipv6/netfilter/ip6t_REJECT.c index 343acd3..311eae8 100644 --- a/net/ipv6/netfilter/ip6t_REJECT.c +++ b/net/ipv6/netfilter/ip6t_REJECT.c @@ -223,7 +223,6 @@ static int check(const char *tablename, const void *entry, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { const struct ip6t_reject_info *rejinfo = targinfo; diff --git a/net/ipv6/netfilter/ip6t_ah.c b/net/ipv6/netfilter/ip6t_ah.c index 2f7bb20..ec1b160 100644 --- a/net/ipv6/netfilter/ip6t_ah.c +++ b/net/ipv6/netfilter/ip6t_ah.c @@ -102,7 +102,6 @@ checkentry(const char *tablename, const void *entry, const struct xt_match *match, void *matchinfo, - unsigned int matchinfosize, unsigned int hook_mask) { const struct ip6t_ah *ahinfo = matchinfo; diff --git a/net/ipv6/netfilter/ip6t_dst.c b/net/ipv6/netfilter/ip6t_dst.c index 9422413..223c335 100644 --- a/net/ipv6/netfilter/ip6t_dst.c +++ b/net/ipv6/netfilter/ip6t_dst.c @@ -182,7 +182,6 @@ checkentry(const char *tablename, const void *info, const struct xt_match *match, void *matchinfo, - unsigned int matchinfosize, unsigned int hook_mask) { const struct ip6t_opts *optsinfo = matchinfo; diff --git a/net/ipv6/netfilter/ip6t_frag.c b/net/ipv6/netfilter/ip6t_frag.c index 06768c8..78d9c8b 100644 --- a/net/ipv6/netfilter/ip6t_frag.c +++ b/net/ipv6/netfilter/ip6t_frag.c @@ -119,7 +119,6 @@ checkentry(const char *tablename, const void *ip, const struct xt_match *match, void *matchinfo, - unsigned int matchinfosize, unsigned int hook_mask) { const struct ip6t_frag *fraginfo = matchinfo; diff --git a/net/ipv6/netfilter/ip6t_hbh.c b/net/ipv6/netfilter/ip6t_hbh.c index 374f1be..72defc8 100644 --- a/net/ipv6/netfilter/ip6t_hbh.c +++ b/net/ipv6/netfilter/ip6t_hbh.c @@ -182,7 +182,6 @@ checkentry(const char *tablename, const void *entry, const struct xt_match *match, void *matchinfo, - unsigned int matchinfosize, unsigned int hook_mask) { const struct ip6t_opts *optsinfo = matchinfo; diff --git a/net/ipv6/netfilter/ip6t_ipv6header.c b/net/ipv6/netfilter/ip6t_ipv6header.c index 9375eeb..3093c39 100644 --- a/net/ipv6/netfilter/ip6t_ipv6header.c +++ b/net/ipv6/netfilter/ip6t_ipv6header.c @@ -128,7 +128,6 @@ ipv6header_checkentry(const char *tablename, const void *ip, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct ip6t_ipv6header_info *info = matchinfo; diff --git a/net/ipv6/netfilter/ip6t_owner.c b/net/ipv6/netfilter/ip6t_owner.c index 5d04799..4eb9bbc 100644 --- a/net/ipv6/netfilter/ip6t_owner.c +++ b/net/ipv6/netfilter/ip6t_owner.c @@ -57,7 +57,6 @@ checkentry(const char *tablename, const void *ip, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct ip6t_owner_info *info = matchinfo; diff --git a/net/ipv6/netfilter/ip6t_rt.c b/net/ipv6/netfilter/ip6t_rt.c index fbb0184..bcb2e16 100644 --- a/net/ipv6/netfilter/ip6t_rt.c +++ b/net/ipv6/netfilter/ip6t_rt.c @@ -197,7 +197,6 @@ checkentry(const char *tablename, const void *entry, const struct xt_match *match, void *matchinfo, - unsigned int matchinfosize, unsigned int hook_mask) { const struct ip6t_rt *rtinfo = matchinfo; diff --git a/net/netfilter/xt_CONNMARK.c b/net/netfilter/xt_CONNMARK.c index c2125f6..0e4249d 100644 --- a/net/netfilter/xt_CONNMARK.c +++ b/net/netfilter/xt_CONNMARK.c @@ -89,7 +89,6 @@ checkentry(const char *tablename, const void *entry, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { struct xt_connmark_target_info *matchinfo = targinfo; diff --git a/net/netfilter/xt_CONNSECMARK.c b/net/netfilter/xt_CONNSECMARK.c index 4b9cc65..4b0e14b 100644 --- a/net/netfilter/xt_CONNSECMARK.c +++ b/net/netfilter/xt_CONNSECMARK.c @@ -89,7 +89,7 @@ static unsigned int target(struct sk_buff **pskb, const struct net_device *in, static int checkentry(const char *tablename, const void *entry, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) + unsigned int hook_mask) { struct xt_connsecmark_target_info *info = targinfo; diff --git a/net/netfilter/xt_DSCP.c b/net/netfilter/xt_DSCP.c index 9d23c95..a7cc75a 100644 --- a/net/netfilter/xt_DSCP.c +++ b/net/netfilter/xt_DSCP.c @@ -72,7 +72,6 @@ static int checkentry(const char *tablename, const void *e_void, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { const u_int8_t dscp = ((struct xt_DSCP_info *)targinfo)->dscp; diff --git a/net/netfilter/xt_MARK.c b/net/netfilter/xt_MARK.c index 95a171c..782f8d8 100644 --- a/net/netfilter/xt_MARK.c +++ b/net/netfilter/xt_MARK.c @@ -74,7 +74,6 @@ checkentry_v0(const char *tablename, const void *entry, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { struct xt_mark_target_info *markinfo = targinfo; @@ -91,7 +90,6 @@ checkentry_v1(const char *tablename, const void *entry, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) { struct xt_mark_target_info_v1 *markinfo = targinfo; diff --git a/net/netfilter/xt_SECMARK.c b/net/netfilter/xt_SECMARK.c index 8a04dcf..451b67c 100644 --- a/net/netfilter/xt_SECMARK.c +++ b/net/netfilter/xt_SECMARK.c @@ -85,7 +85,7 @@ static int checkentry_selinux(struct xt_secmark_target_info *info) static int checkentry(const char *tablename, const void *entry, const struct xt_target *target, void *targinfo, - unsigned int targinfosize, unsigned int hook_mask) + unsigned int hook_mask) { struct xt_secmark_target_info *info = targinfo; diff --git a/net/netfilter/xt_connbytes.c b/net/netfilter/xt_connbytes.c index d725e8b..dcc497e 100644 --- a/net/netfilter/xt_connbytes.c +++ b/net/netfilter/xt_connbytes.c @@ -125,7 +125,6 @@ static int check(const char *tablename, const void *ip, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct xt_connbytes_info *sinfo = matchinfo; diff --git a/net/netfilter/xt_connmark.c b/net/netfilter/xt_connmark.c index a97b2d4..c9104d0 100644 --- a/net/netfilter/xt_connmark.c +++ b/net/netfilter/xt_connmark.c @@ -55,7 +55,6 @@ checkentry(const char *tablename, const void *ip, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { struct xt_connmark_info *cm = matchinfo; @@ -75,7 +74,7 @@ checkentry(const char *tablename, } static void -destroy(const struct xt_match *match, void *matchinfo, unsigned int matchsize) +destroy(const struct xt_match *match, void *matchinfo) { #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE) nf_ct_l3proto_module_put(match->family); diff --git a/net/netfilter/xt_conntrack.c b/net/netfilter/xt_conntrack.c index 1540885..39c57e9 100644 --- a/net/netfilter/xt_conntrack.c +++ b/net/netfilter/xt_conntrack.c @@ -208,7 +208,6 @@ checkentry(const char *tablename, const void *ip, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE) @@ -222,7 +221,7 @@ checkentry(const char *tablename, } static void -destroy(const struct xt_match *match, void *matchinfo, unsigned int matchsize) +destroy(const struct xt_match *match, void *matchinfo) { #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE) nf_ct_l3proto_module_put(match->family); diff --git a/net/netfilter/xt_dccp.c b/net/netfilter/xt_dccp.c index 5ca6f52..3e6cf43 100644 --- a/net/netfilter/xt_dccp.c +++ b/net/netfilter/xt_dccp.c @@ -131,7 +131,6 @@ checkentry(const char *tablename, const void *inf, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct xt_dccp_info *info = matchinfo; diff --git a/net/netfilter/xt_dscp.c b/net/netfilter/xt_dscp.c index d84075c3..26c7f4a 100644 --- a/net/netfilter/xt_dscp.c +++ b/net/netfilter/xt_dscp.c @@ -58,7 +58,6 @@ static int checkentry(const char *tablename, const void *info, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const u_int8_t dscp = ((struct xt_dscp_info *)matchinfo)->dscp; diff --git a/net/netfilter/xt_esp.c b/net/netfilter/xt_esp.c index 7b19bc9..7c95f14 100644 --- a/net/netfilter/xt_esp.c +++ b/net/netfilter/xt_esp.c @@ -79,7 +79,6 @@ checkentry(const char *tablename, const void *ip_void, const struct xt_match *match, void *matchinfo, - unsigned int matchinfosize, unsigned int hook_mask) { const struct xt_esp *espinfo = matchinfo; diff --git a/net/netfilter/xt_helper.c b/net/netfilter/xt_helper.c index db453a7..5d7818b 100644 --- a/net/netfilter/xt_helper.c +++ b/net/netfilter/xt_helper.c @@ -139,7 +139,6 @@ static int check(const char *tablename, const void *inf, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { struct xt_helper_info *info = matchinfo; @@ -156,7 +155,7 @@ static int check(const char *tablename, } static void -destroy(const struct xt_match *match, void *matchinfo, unsigned int matchsize) +destroy(const struct xt_match *match, void *matchinfo) { #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE) nf_ct_l3proto_module_put(match->family); diff --git a/net/netfilter/xt_limit.c b/net/netfilter/xt_limit.c index e8d5e7a..b9c9ff3 100644 --- a/net/netfilter/xt_limit.c +++ b/net/netfilter/xt_limit.c @@ -110,7 +110,6 @@ ipt_limit_checkentry(const char *tablename, const void *inf, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { struct xt_rateinfo *r = matchinfo; diff --git a/net/netfilter/xt_mark.c b/net/netfilter/xt_mark.c index 39f9b079..e8059cd 100644 --- a/net/netfilter/xt_mark.c +++ b/net/netfilter/xt_mark.c @@ -39,7 +39,6 @@ checkentry(const char *tablename, const void *entry, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct xt_mark_info *minfo = matchinfo; diff --git a/net/netfilter/xt_multiport.c b/net/netfilter/xt_multiport.c index e74f9bb..d3aefd3 100644 --- a/net/netfilter/xt_multiport.c +++ b/net/netfilter/xt_multiport.c @@ -176,7 +176,6 @@ checkentry(const char *tablename, const void *info, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct ipt_ip *ip = info; @@ -191,7 +190,6 @@ checkentry_v1(const char *tablename, const void *info, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct ipt_ip *ip = info; @@ -206,7 +204,6 @@ checkentry6(const char *tablename, const void *info, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct ip6t_ip6 *ip = info; @@ -221,7 +218,6 @@ checkentry6_v1(const char *tablename, const void *info, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct ip6t_ip6 *ip = info; diff --git a/net/netfilter/xt_physdev.c b/net/netfilter/xt_physdev.c index af3d70f..fd8f954 100644 --- a/net/netfilter/xt_physdev.c +++ b/net/netfilter/xt_physdev.c @@ -106,7 +106,6 @@ checkentry(const char *tablename, const void *ip, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct xt_physdev_info *info = matchinfo; diff --git a/net/netfilter/xt_policy.c b/net/netfilter/xt_policy.c index f5639c4..e9d8137 100644 --- a/net/netfilter/xt_policy.c +++ b/net/netfilter/xt_policy.c @@ -135,8 +135,7 @@ static int match(const struct sk_buff *skb, static int checkentry(const char *tablename, const void *ip_void, const struct xt_match *match, - void *matchinfo, unsigned int matchsize, - unsigned int hook_mask) + void *matchinfo, unsigned int hook_mask) { struct xt_policy_info *info = matchinfo; diff --git a/net/netfilter/xt_quota.c b/net/netfilter/xt_quota.c index cc44f87..b75fa2c 100644 --- a/net/netfilter/xt_quota.c +++ b/net/netfilter/xt_quota.c @@ -41,7 +41,7 @@ match(const struct sk_buff *skb, static int checkentry(const char *tablename, const void *entry, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) + unsigned int hook_mask) { struct xt_quota_info *q = (struct xt_quota_info *)matchinfo; diff --git a/net/netfilter/xt_sctp.c b/net/netfilter/xt_sctp.c index 5628621..7956aca 100644 --- a/net/netfilter/xt_sctp.c +++ b/net/netfilter/xt_sctp.c @@ -163,7 +163,6 @@ checkentry(const char *tablename, const void *inf, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct xt_sctp_info *info = matchinfo; diff --git a/net/netfilter/xt_state.c b/net/netfilter/xt_state.c index 5f9492e..d9010b1 100644 --- a/net/netfilter/xt_state.c +++ b/net/netfilter/xt_state.c @@ -48,7 +48,6 @@ static int check(const char *tablename, const void *inf, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE) @@ -62,7 +61,7 @@ static int check(const char *tablename, } static void -destroy(const struct xt_match *match, void *matchinfo, unsigned int matchsize) +destroy(const struct xt_match *match, void *matchinfo) { #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE) nf_ct_l3proto_module_put(match->family); diff --git a/net/netfilter/xt_statistic.c b/net/netfilter/xt_statistic.c index 5181630..091a9f8 100644 --- a/net/netfilter/xt_statistic.c +++ b/net/netfilter/xt_statistic.c @@ -55,7 +55,7 @@ match(const struct sk_buff *skb, static int checkentry(const char *tablename, const void *entry, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) + unsigned int hook_mask) { struct xt_statistic_info *info = (struct xt_statistic_info *)matchinfo; diff --git a/net/netfilter/xt_string.c b/net/netfilter/xt_string.c index 1a1c1d1..4453252 100644 --- a/net/netfilter/xt_string.c +++ b/net/netfilter/xt_string.c @@ -46,7 +46,6 @@ static int checkentry(const char *tablename, const void *ip, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { struct xt_string_info *conf = matchinfo; @@ -69,8 +68,7 @@ static int checkentry(const char *tablename, return 1; } -static void destroy(const struct xt_match *match, void *matchinfo, - unsigned int matchsize) +static void destroy(const struct xt_match *match, void *matchinfo) { textsearch_destroy(STRING_TEXT_PRIV(matchinfo)->config); } diff --git a/net/netfilter/xt_tcpudp.c b/net/netfilter/xt_tcpudp.c index 54aab05..e76a68e 100644 --- a/net/netfilter/xt_tcpudp.c +++ b/net/netfilter/xt_tcpudp.c @@ -141,7 +141,6 @@ tcp_checkentry(const char *tablename, const void *info, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct xt_tcp *tcpinfo = matchinfo; @@ -190,7 +189,6 @@ udp_checkentry(const char *tablename, const void *info, const struct xt_match *match, void *matchinfo, - unsigned int matchsize, unsigned int hook_mask) { const struct xt_tcp *udpinfo = matchinfo; diff --git a/net/sched/act_ipt.c b/net/sched/act_ipt.c index 45a3143..d8c9310 100644 --- a/net/sched/act_ipt.c +++ b/net/sched/act_ipt.c @@ -69,7 +69,6 @@ static int ipt_init_target(struct ipt_entry_target *t, char *table, unsigned int if (t->u.kernel.target->checkentry && !t->u.kernel.target->checkentry(table, NULL, t->u.kernel.target, t->data, - t->u.target_size - sizeof(*t), hook)) { module_put(t->u.kernel.target->me); ret = -EINVAL; @@ -81,8 +80,7 @@ static int ipt_init_target(struct ipt_entry_target *t, char *table, unsigned int static void ipt_destroy_target(struct ipt_entry_target *t) { if (t->u.kernel.target->destroy) - t->u.kernel.target->destroy(t->u.kernel.target, t->data, - t->u.target_size - sizeof(*t)); + t->u.kernel.target->destroy(t->u.kernel.target, t->data); module_put(t->u.kernel.target->me); } -- cgit v0.10.2 From 53e26658282373b84ba85a0c9807cb762f7738a6 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Tue, 22 Aug 2006 00:43:20 -0700 Subject: [NETFILTER]: nfnetlink: remove unnecessary packed attributes Remove unnecessary packed attributes in nfnetlink structures. Unfortunately in a few cases they have to stay to avoid changing structure sizes. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter/nfnetlink.h b/include/linux/netfilter/nfnetlink.h index 9f5b12c..6d8e3e5 100644 --- a/include/linux/netfilter/nfnetlink.h +++ b/include/linux/netfilter/nfnetlink.h @@ -43,7 +43,7 @@ struct nfattr u_int16_t nfa_len; u_int16_t nfa_type; /* we use 15 bits for the type, and the highest * bit to indicate whether the payload is nested */ -} __attribute__ ((packed)); +}; /* FIXME: Apart from NFNL_NFA_NESTED shamelessly copy and pasted from * rtnetlink.h, it's time to put this in a generic file */ @@ -79,7 +79,7 @@ struct nfgenmsg { u_int8_t nfgen_family; /* AF_xxx */ u_int8_t version; /* nfnetlink version */ u_int16_t res_id; /* resource id */ -} __attribute__ ((packed)); +}; #define NFNETLINK_V0 0 diff --git a/include/linux/netfilter/nfnetlink_log.h b/include/linux/netfilter/nfnetlink_log.h index a7497c7..87b92f8 100644 --- a/include/linux/netfilter/nfnetlink_log.h +++ b/include/linux/netfilter/nfnetlink_log.h @@ -19,18 +19,18 @@ struct nfulnl_msg_packet_hdr { u_int16_t hw_protocol; /* hw protocol (network order) */ u_int8_t hook; /* netfilter hook */ u_int8_t _pad; -} __attribute__ ((packed)); +}; struct nfulnl_msg_packet_hw { u_int16_t hw_addrlen; u_int16_t _pad; u_int8_t hw_addr[8]; -} __attribute__ ((packed)); +}; struct nfulnl_msg_packet_timestamp { aligned_u64 sec; aligned_u64 usec; -} __attribute__ ((packed)); +}; #define NFULNL_PREFIXLEN 30 /* just like old log target */ diff --git a/include/linux/netfilter/nfnetlink_queue.h b/include/linux/netfilter/nfnetlink_queue.h index 9e77437..36af036 100644 --- a/include/linux/netfilter/nfnetlink_queue.h +++ b/include/linux/netfilter/nfnetlink_queue.h @@ -22,12 +22,12 @@ struct nfqnl_msg_packet_hw { u_int16_t hw_addrlen; u_int16_t _pad; u_int8_t hw_addr[8]; -} __attribute__ ((packed)); +}; struct nfqnl_msg_packet_timestamp { aligned_u64 sec; aligned_u64 usec; -} __attribute__ ((packed)); +}; enum nfqnl_attr_type { NFQA_UNSPEC, @@ -49,7 +49,7 @@ enum nfqnl_attr_type { struct nfqnl_msg_verdict_hdr { u_int32_t verdict; u_int32_t id; -} __attribute__ ((packed)); +}; enum nfqnl_msg_config_cmds { @@ -64,7 +64,7 @@ struct nfqnl_msg_config_cmd { u_int8_t command; /* nfqnl_msg_config_cmds */ u_int8_t _pad; u_int16_t pf; /* AF_xxx for PF_[UN]BIND */ -} __attribute__ ((packed)); +}; enum nfqnl_config_mode { NFQNL_COPY_NONE, -- cgit v0.10.2 From 91270cf81765152f6e77953440beb4d3b34a71b5 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Tue, 22 Aug 2006 00:43:38 -0700 Subject: [NETFILTER]: x_tables: add data member to struct xt_match Shared match functions can use this to make runtime decisions basen on the used match. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index 9d97102..03d1027 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -185,6 +185,9 @@ struct xt_match /* Set this to THIS_MODULE if you are a module, otherwise NULL */ struct module *me; + /* Free to use by each match */ + unsigned long data; + char *table; unsigned int matchsize; unsigned int hooks; -- cgit v0.10.2 From 5fa2a7601f994bdd034e871b7ea1abd6969fbb6c Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Tue, 22 Aug 2006 00:43:55 -0700 Subject: [NETFILTER]: ip6_tables: consolidate dst and hbh matches The matches are identical besides one looking for NEXTHDR_HOP, the other for NEXTHDR_DEST. Remove ip6t_dst.c and handle both in ip6t_hbh.c. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv6/netfilter/Makefile b/net/ipv6/netfilter/Makefile index eeeb57d..ac1dfeb 100644 --- a/net/ipv6/netfilter/Makefile +++ b/net/ipv6/netfilter/Makefile @@ -5,7 +5,7 @@ # Link order matters here. obj-$(CONFIG_IP6_NF_IPTABLES) += ip6_tables.o obj-$(CONFIG_IP6_NF_MATCH_RT) += ip6t_rt.o -obj-$(CONFIG_IP6_NF_MATCH_OPTS) += ip6t_hbh.o ip6t_dst.o +obj-$(CONFIG_IP6_NF_MATCH_OPTS) += ip6t_hbh.o obj-$(CONFIG_IP6_NF_MATCH_IPV6HEADER) += ip6t_ipv6header.o obj-$(CONFIG_IP6_NF_MATCH_FRAG) += ip6t_frag.o obj-$(CONFIG_IP6_NF_MATCH_AH) += ip6t_ah.o diff --git a/net/ipv6/netfilter/ip6t_dst.c b/net/ipv6/netfilter/ip6t_dst.c deleted file mode 100644 index 223c335..0000000 --- a/net/ipv6/netfilter/ip6t_dst.c +++ /dev/null @@ -1,219 +0,0 @@ -/* Kernel module to match Hop-by-Hop and Destination parameters. */ - -/* (C) 2001-2002 Andras Kis-Szabo - * - * This program is free software; you can redistribute it and/or modify - * it under the terms of the GNU General Public License version 2 as - * published by the Free Software Foundation. - */ - -#include -#include -#include -#include -#include -#include - -#include - -#include -#include - -#define HOPBYHOP 0 - -MODULE_LICENSE("GPL"); -#if HOPBYHOP -MODULE_DESCRIPTION("IPv6 HbH match"); -#else -MODULE_DESCRIPTION("IPv6 DST match"); -#endif -MODULE_AUTHOR("Andras Kis-Szabo "); - -#if 0 -#define DEBUGP printk -#else -#define DEBUGP(format, args...) -#endif - -/* - * (Type & 0xC0) >> 6 - * 0 -> ignorable - * 1 -> must drop the packet - * 2 -> send ICMP PARM PROB regardless and drop packet - * 3 -> Send ICMP if not a multicast address and drop packet - * (Type & 0x20) >> 5 - * 0 -> invariant - * 1 -> can change the routing - * (Type & 0x1F) Type - * 0 -> Pad1 (only 1 byte!) - * 1 -> PadN LENGTH info (total length = length + 2) - * C0 | 2 -> JUMBO 4 x x x x ( xxxx > 64k ) - * 5 -> RTALERT 2 x x - */ - -static int -match(const struct sk_buff *skb, - const struct net_device *in, - const struct net_device *out, - const struct xt_match *match, - const void *matchinfo, - int offset, - unsigned int protoff, - int *hotdrop) -{ - struct ipv6_opt_hdr _optsh, *oh; - const struct ip6t_opts *optinfo = matchinfo; - unsigned int temp; - unsigned int ptr; - unsigned int hdrlen = 0; - unsigned int ret = 0; - u8 _opttype, *tp = NULL; - u8 _optlen, *lp = NULL; - unsigned int optlen; - -#if HOPBYHOP - if (ipv6_find_hdr(skb, &ptr, NEXTHDR_HOP, NULL) < 0) -#else - if (ipv6_find_hdr(skb, &ptr, NEXTHDR_DEST, NULL) < 0) -#endif - return 0; - - oh = skb_header_pointer(skb, ptr, sizeof(_optsh), &_optsh); - if (oh == NULL) { - *hotdrop = 1; - return 0; - } - - hdrlen = ipv6_optlen(oh); - if (skb->len - ptr < hdrlen) { - /* Packet smaller than it's length field */ - return 0; - } - - DEBUGP("IPv6 OPTS LEN %u %u ", hdrlen, oh->hdrlen); - - DEBUGP("len %02X %04X %02X ", - optinfo->hdrlen, hdrlen, - (!(optinfo->flags & IP6T_OPTS_LEN) || - ((optinfo->hdrlen == hdrlen) ^ - !!(optinfo->invflags & IP6T_OPTS_INV_LEN)))); - - ret = (oh != NULL) && - (!(optinfo->flags & IP6T_OPTS_LEN) || - ((optinfo->hdrlen == hdrlen) ^ - !!(optinfo->invflags & IP6T_OPTS_INV_LEN))); - - ptr += 2; - hdrlen -= 2; - if (!(optinfo->flags & IP6T_OPTS_OPTS)) { - return ret; - } else if (optinfo->flags & IP6T_OPTS_NSTRICT) { - DEBUGP("Not strict - not implemented"); - } else { - DEBUGP("Strict "); - DEBUGP("#%d ", optinfo->optsnr); - for (temp = 0; temp < optinfo->optsnr; temp++) { - /* type field exists ? */ - if (hdrlen < 1) - break; - tp = skb_header_pointer(skb, ptr, sizeof(_opttype), - &_opttype); - if (tp == NULL) - break; - - /* Type check */ - if (*tp != (optinfo->opts[temp] & 0xFF00) >> 8) { - DEBUGP("Tbad %02X %02X\n", - *tp, - (optinfo->opts[temp] & 0xFF00) >> 8); - return 0; - } else { - DEBUGP("Tok "); - } - /* Length check */ - if (*tp) { - u16 spec_len; - - /* length field exists ? */ - if (hdrlen < 2) - break; - lp = skb_header_pointer(skb, ptr + 1, - sizeof(_optlen), - &_optlen); - if (lp == NULL) - break; - spec_len = optinfo->opts[temp] & 0x00FF; - - if (spec_len != 0x00FF && spec_len != *lp) { - DEBUGP("Lbad %02X %04X\n", *lp, - spec_len); - return 0; - } - DEBUGP("Lok "); - optlen = *lp + 2; - } else { - DEBUGP("Pad1\n"); - optlen = 1; - } - - /* Step to the next */ - DEBUGP("len%04X \n", optlen); - - if ((ptr > skb->len - optlen || hdrlen < optlen) && - (temp < optinfo->optsnr - 1)) { - DEBUGP("new pointer is too large! \n"); - break; - } - ptr += optlen; - hdrlen -= optlen; - } - if (temp == optinfo->optsnr) - return ret; - else - return 0; - } - - return 0; -} - -/* Called when user tries to insert an entry of this type. */ -static int -checkentry(const char *tablename, - const void *info, - const struct xt_match *match, - void *matchinfo, - unsigned int hook_mask) -{ - const struct ip6t_opts *optsinfo = matchinfo; - - if (optsinfo->invflags & ~IP6T_OPTS_INV_MASK) { - DEBUGP("ip6t_opts: unknown flags %X\n", optsinfo->invflags); - return 0; - } - return 1; -} - -static struct ip6t_match opts_match = { -#if HOPBYHOP - .name = "hbh", -#else - .name = "dst", -#endif - .match = match, - .matchsize = sizeof(struct ip6t_opts), - .checkentry = checkentry, - .me = THIS_MODULE, -}; - -static int __init ip6t_dst_init(void) -{ - return ip6t_register_match(&opts_match); -} - -static void __exit ip6t_dst_fini(void) -{ - ip6t_unregister_match(&opts_match); -} - -module_init(ip6t_dst_init); -module_exit(ip6t_dst_fini); diff --git a/net/ipv6/netfilter/ip6t_hbh.c b/net/ipv6/netfilter/ip6t_hbh.c index 72defc8..d32a205 100644 --- a/net/ipv6/netfilter/ip6t_hbh.c +++ b/net/ipv6/netfilter/ip6t_hbh.c @@ -19,15 +19,10 @@ #include #include -#define HOPBYHOP 1 - MODULE_LICENSE("GPL"); -#if HOPBYHOP -MODULE_DESCRIPTION("IPv6 HbH match"); -#else -MODULE_DESCRIPTION("IPv6 DST match"); -#endif +MODULE_DESCRIPTION("IPv6 opts match"); MODULE_AUTHOR("Andras Kis-Szabo "); +MODULE_ALIAS("ip6t_dst"); #if 0 #define DEBUGP printk @@ -71,11 +66,7 @@ match(const struct sk_buff *skb, u8 _optlen, *lp = NULL; unsigned int optlen; -#if HOPBYHOP - if (ipv6_find_hdr(skb, &ptr, NEXTHDR_HOP, NULL) < 0) -#else - if (ipv6_find_hdr(skb, &ptr, NEXTHDR_DEST, NULL) < 0) -#endif + if (ipv6_find_hdr(skb, &ptr, match->data, NULL) < 0) return 0; oh = skb_header_pointer(skb, ptr, sizeof(_optsh), &_optsh); @@ -193,26 +184,35 @@ checkentry(const char *tablename, return 1; } -static struct ip6t_match opts_match = { -#if HOPBYHOP - .name = "hbh", -#else - .name = "dst", -#endif - .match = match, - .matchsize = sizeof(struct ip6t_opts), - .checkentry = checkentry, - .me = THIS_MODULE, +static struct xt_match opts_match[] = { + { + .name = "hbh", + .family = AF_INET6, + .match = match, + .matchsize = sizeof(struct ip6t_opts), + .checkentry = checkentry, + .me = THIS_MODULE, + .data = NEXTHDR_HOP, + }, + { + .name = "dst", + .family = AF_INET6, + .match = match, + .matchsize = sizeof(struct ip6t_opts), + .checkentry = checkentry, + .me = THIS_MODULE, + .data = NEXTHDR_DEST, + }, }; static int __init ip6t_hbh_init(void) { - return ip6t_register_match(&opts_match); + return xt_register_matches(opts_match, ARRAY_SIZE(opts_match)); } static void __exit ip6t_hbh_fini(void) { - ip6t_unregister_match(&opts_match); + xt_unregister_matches(opts_match, ARRAY_SIZE(opts_match)); } module_init(ip6t_hbh_init); -- cgit v0.10.2 From ce556b3a591fff3bebf8c5590a86aa98e1b2f153 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Tue, 22 Aug 2006 00:44:14 -0700 Subject: [NETFILTER]: xt_tcpmss: minor cleanups - remove unused define - remove useless wrapper function - use new line for expression after condition Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/netfilter/xt_tcpmss.c b/net/netfilter/xt_tcpmss.c index 7baa9eb..a3682fe 100644 --- a/net/netfilter/xt_tcpmss.c +++ b/net/netfilter/xt_tcpmss.c @@ -18,21 +18,22 @@ #include #include -#define TH_SYN 0x02 - MODULE_LICENSE("GPL"); MODULE_AUTHOR("Marc Boucher "); MODULE_DESCRIPTION("iptables TCP MSS match module"); MODULE_ALIAS("ipt_tcpmss"); -/* Returns 1 if the mss option is set and matched by the range, 0 otherwise */ -static inline int -mssoption_match(u_int16_t min, u_int16_t max, - const struct sk_buff *skb, - unsigned int protoff, - int invert, - int *hotdrop) +static int +match(const struct sk_buff *skb, + const struct net_device *in, + const struct net_device *out, + const struct xt_match *match, + const void *matchinfo, + int offset, + unsigned int protoff, + int *hotdrop) { + const struct xt_tcpmss_match_info *info = matchinfo; struct tcphdr _tcph, *th; /* tcp.doff is only 4 bits, ie. max 15 * 4 bytes */ u8 _opt[15 * 4 - sizeof(_tcph)], *op; @@ -64,35 +65,22 @@ mssoption_match(u_int16_t min, u_int16_t max, mssval = (op[i+2] << 8) | op[i+3]; - return (mssval >= min && mssval <= max) ^ invert; + return (mssval >= info->mss_min && + mssval <= info->mss_max) ^ info->invert; } - if (op[i] < 2) i++; - else i += op[i+1]?:1; + if (op[i] < 2) + i++; + else + i += op[i+1] ? : 1; } out: - return invert; + return info->invert; - dropit: +dropit: *hotdrop = 1; return 0; } -static int -match(const struct sk_buff *skb, - const struct net_device *in, - const struct net_device *out, - const struct xt_match *match, - const void *matchinfo, - int offset, - unsigned int protoff, - int *hotdrop) -{ - const struct xt_tcpmss_match_info *info = matchinfo; - - return mssoption_match(info->mss_min, info->mss_max, skb, protoff, - info->invert, hotdrop); -} - static struct xt_match xt_tcpmss_match[] = { { .name = "tcpmss", -- cgit v0.10.2 From 3fd091e73b81f131e1567c4d4a1ec042940bf2f7 Mon Sep 17 00:00:00 2001 From: Vladislav Yasevich Date: Tue, 22 Aug 2006 13:29:17 -0700 Subject: [SCTP]: Remove multiple levels of msecs to jiffies conversions. The SCTP sysctl entries are displayed in milliseconds, but stored internally in jiffies. This results in multiple levels of msecs to jiffies conversion and as a result produces a truncation error. This patch makes things consistent in that we store and display defaults in milliseconds and only convert once for use by association. This patch also adds some sane min/max values so that we don't go off the deep end. Signed-off-by: Vladislav Yasevich Signed-off-by: Sridhar Samudrala Signed-off-by: David S. Miller diff --git a/include/net/sctp/constants.h b/include/net/sctp/constants.h index 57166bf..6c632e2 100644 --- a/include/net/sctp/constants.h +++ b/include/net/sctp/constants.h @@ -264,10 +264,10 @@ enum { SCTP_MAX_DUP_TSNS = 16 }; enum { SCTP_MAX_GABS = 16 }; /* Heartbeat interval - 30 secs */ -#define SCTP_DEFAULT_TIMEOUT_HEARTBEAT (30 * HZ) +#define SCTP_DEFAULT_TIMEOUT_HEARTBEAT (30*1000) /* Delayed sack timer - 200ms */ -#define SCTP_DEFAULT_TIMEOUT_SACK ((200 * HZ) / 1000) +#define SCTP_DEFAULT_TIMEOUT_SACK (200) /* RTO.Initial - 3 seconds * RTO.Min - 1 second @@ -275,9 +275,9 @@ enum { SCTP_MAX_GABS = 16 }; * RTO.Alpha - 1/8 * RTO.Beta - 1/4 */ -#define SCTP_RTO_INITIAL (3 * HZ) -#define SCTP_RTO_MIN (1 * HZ) -#define SCTP_RTO_MAX (60 * HZ) +#define SCTP_RTO_INITIAL (3 * 1000) +#define SCTP_RTO_MIN (1 * 1000) +#define SCTP_RTO_MAX (60 * 1000) #define SCTP_RTO_ALPHA 3 /* 1/8 when converted to right shifts. */ #define SCTP_RTO_BETA 2 /* 1/4 when converted to right shifts. */ @@ -290,8 +290,7 @@ enum { SCTP_MAX_GABS = 16 }; #define SCTP_DEF_MAX_INIT 6 #define SCTP_DEF_MAX_SEND 10 -#define SCTP_DEFAULT_COOKIE_LIFE_SEC 60 /* seconds */ -#define SCTP_DEFAULT_COOKIE_LIFE_USEC 0 /* microseconds */ +#define SCTP_DEFAULT_COOKIE_LIFE (60 * 1000) /* 60 seconds */ #define SCTP_DEFAULT_MINWINDOW 1500 /* default minimum rwnd size */ #define SCTP_DEFAULT_MAXWINDOW 65535 /* default rwnd size */ diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h index 0412e73..c6d93bb 100644 --- a/include/net/sctp/structs.h +++ b/include/net/sctp/structs.h @@ -128,9 +128,9 @@ extern struct sctp_globals { * RTO.Alpha - 1/8 (3 when converted to right shifts.) * RTO.Beta - 1/4 (2 when converted to right shifts.) */ - unsigned long rto_initial; - unsigned long rto_min; - unsigned long rto_max; + unsigned int rto_initial; + unsigned int rto_min; + unsigned int rto_max; /* Note: rto_alpha and rto_beta are really defined as inverse * powers of two to facilitate integer operations. @@ -145,13 +145,13 @@ extern struct sctp_globals { int cookie_preserve_enable; /* Valid.Cookie.Life - 60 seconds */ - unsigned long valid_cookie_life; + unsigned int valid_cookie_life; /* Delayed SACK timeout 200ms default*/ - unsigned long sack_timeout; + unsigned int sack_timeout; /* HB.interval - 30 seconds */ - unsigned long hb_interval; + unsigned int hb_interval; /* Association.Max.Retrans - 10 attempts * Path.Max.Retrans - 5 attempts (per destination address) diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c index 1ab03a2..5692ef5 100644 --- a/net/sctp/protocol.c +++ b/net/sctp/protocol.c @@ -1049,7 +1049,7 @@ SCTP_STATIC __init int sctp_init(void) sctp_rto_beta = SCTP_RTO_BETA; /* Valid.Cookie.Life - 60 seconds */ - sctp_valid_cookie_life = 60 * HZ; + sctp_valid_cookie_life = SCTP_DEFAULT_COOKIE_LIFE; /* Whether Cookie Preservative is enabled(1) or not(0) */ sctp_cookie_preserve_enable = 1; diff --git a/net/sctp/socket.c b/net/sctp/socket.c index 3b6e82c..7c1dbb1 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -3045,14 +3045,14 @@ SCTP_STATIC int sctp_init_sock(struct sock *sk) sp->initmsg.sinit_num_ostreams = sctp_max_outstreams; sp->initmsg.sinit_max_instreams = sctp_max_instreams; sp->initmsg.sinit_max_attempts = sctp_max_retrans_init; - sp->initmsg.sinit_max_init_timeo = jiffies_to_msecs(sctp_rto_max); + sp->initmsg.sinit_max_init_timeo = sctp_rto_max; /* Initialize default RTO related parameters. These parameters can * be modified for with the SCTP_RTOINFO socket option. */ - sp->rtoinfo.srto_initial = jiffies_to_msecs(sctp_rto_initial); - sp->rtoinfo.srto_max = jiffies_to_msecs(sctp_rto_max); - sp->rtoinfo.srto_min = jiffies_to_msecs(sctp_rto_min); + sp->rtoinfo.srto_initial = sctp_rto_initial; + sp->rtoinfo.srto_max = sctp_rto_max; + sp->rtoinfo.srto_min = sctp_rto_min; /* Initialize default association related parameters. These parameters * can be modified with the SCTP_ASSOCINFO socket option. @@ -3061,8 +3061,7 @@ SCTP_STATIC int sctp_init_sock(struct sock *sk) sp->assocparams.sasoc_number_peer_destinations = 0; sp->assocparams.sasoc_peer_rwnd = 0; sp->assocparams.sasoc_local_rwnd = 0; - sp->assocparams.sasoc_cookie_life = - jiffies_to_msecs(sctp_valid_cookie_life); + sp->assocparams.sasoc_cookie_life = sctp_valid_cookie_life; /* Initialize default event subscriptions. By default, all the * options are off. @@ -3072,10 +3071,10 @@ SCTP_STATIC int sctp_init_sock(struct sock *sk) /* Default Peer Address Parameters. These defaults can * be modified via SCTP_PEER_ADDR_PARAMS */ - sp->hbinterval = jiffies_to_msecs(sctp_hb_interval); + sp->hbinterval = sctp_hb_interval; sp->pathmaxrxt = sctp_max_retrans_path; sp->pathmtu = 0; // allow default discovery - sp->sackdelay = jiffies_to_msecs(sctp_sack_timeout); + sp->sackdelay = sctp_sack_timeout; sp->param_flags = SPP_HB_ENABLE | SPP_PMTUD_ENABLE | SPP_SACKDELAY_ENABLE; diff --git a/net/sctp/sysctl.c b/net/sctp/sysctl.c index dc6f3ff..633cd17 100644 --- a/net/sctp/sysctl.c +++ b/net/sctp/sysctl.c @@ -45,9 +45,10 @@ #include #include -static ctl_handler sctp_sysctl_jiffies_ms; -static long rto_timer_min = 1; -static long rto_timer_max = 86400000; /* One day */ +static int zero = 0; +static int one = 1; +static int timer_max = 86400000; /* ms in one day */ +static int int_max = INT_MAX; static long sack_timer_min = 1; static long sack_timer_max = 500; @@ -56,45 +57,45 @@ static ctl_table sctp_table[] = { .ctl_name = NET_SCTP_RTO_INITIAL, .procname = "rto_initial", .data = &sctp_rto_initial, - .maxlen = sizeof(long), + .maxlen = sizeof(unsigned int), .mode = 0644, - .proc_handler = &proc_doulongvec_ms_jiffies_minmax, - .strategy = &sctp_sysctl_jiffies_ms, - .extra1 = &rto_timer_min, - .extra2 = &rto_timer_max + .proc_handler = &proc_dointvec_minmax, + .strategy = &sysctl_intvec, + .extra1 = &one, + .extra2 = &timer_max }, { .ctl_name = NET_SCTP_RTO_MIN, .procname = "rto_min", .data = &sctp_rto_min, - .maxlen = sizeof(long), + .maxlen = sizeof(unsigned int), .mode = 0644, - .proc_handler = &proc_doulongvec_ms_jiffies_minmax, - .strategy = &sctp_sysctl_jiffies_ms, - .extra1 = &rto_timer_min, - .extra2 = &rto_timer_max + .proc_handler = &proc_dointvec_minmax, + .strategy = &sysctl_intvec, + .extra1 = &one, + .extra2 = &timer_max }, { .ctl_name = NET_SCTP_RTO_MAX, .procname = "rto_max", .data = &sctp_rto_max, - .maxlen = sizeof(long), + .maxlen = sizeof(unsigned int), .mode = 0644, - .proc_handler = &proc_doulongvec_ms_jiffies_minmax, - .strategy = &sctp_sysctl_jiffies_ms, - .extra1 = &rto_timer_min, - .extra2 = &rto_timer_max + .proc_handler = &proc_dointvec_minmax, + .strategy = &sysctl_intvec, + .extra1 = &one, + .extra2 = &timer_max }, { .ctl_name = NET_SCTP_VALID_COOKIE_LIFE, .procname = "valid_cookie_life", .data = &sctp_valid_cookie_life, - .maxlen = sizeof(long), + .maxlen = sizeof(unsigned int), .mode = 0644, - .proc_handler = &proc_doulongvec_ms_jiffies_minmax, - .strategy = &sctp_sysctl_jiffies_ms, - .extra1 = &rto_timer_min, - .extra2 = &rto_timer_max + .proc_handler = &proc_dointvec_minmax, + .strategy = &sysctl_intvec, + .extra1 = &one, + .extra2 = &timer_max }, { .ctl_name = NET_SCTP_MAX_BURST, @@ -102,7 +103,10 @@ static ctl_table sctp_table[] = { .data = &sctp_max_burst, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = &proc_dointvec + .proc_handler = &proc_dointvec_minmax, + .strategy = &sysctl_intvec, + .extra1 = &zero, + .extra2 = &int_max }, { .ctl_name = NET_SCTP_ASSOCIATION_MAX_RETRANS, @@ -110,7 +114,10 @@ static ctl_table sctp_table[] = { .data = &sctp_max_retrans_association, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = &proc_dointvec + .proc_handler = &proc_dointvec_minmax, + .strategy = &sysctl_intvec, + .extra1 = &one, + .extra2 = &int_max }, { .ctl_name = NET_SCTP_SNDBUF_POLICY, @@ -118,7 +125,8 @@ static ctl_table sctp_table[] = { .data = &sctp_sndbuf_policy, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = &proc_dointvec + .proc_handler = &proc_dointvec, + .strategy = &sysctl_intvec }, { .ctl_name = NET_SCTP_RCVBUF_POLICY, @@ -126,7 +134,8 @@ static ctl_table sctp_table[] = { .data = &sctp_rcvbuf_policy, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = &proc_dointvec + .proc_handler = &proc_dointvec, + .strategy = &sysctl_intvec }, { .ctl_name = NET_SCTP_PATH_MAX_RETRANS, @@ -134,7 +143,10 @@ static ctl_table sctp_table[] = { .data = &sctp_max_retrans_path, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = &proc_dointvec + .proc_handler = &proc_dointvec_minmax, + .strategy = &sysctl_intvec, + .extra1 = &one, + .extra2 = &int_max }, { .ctl_name = NET_SCTP_MAX_INIT_RETRANSMITS, @@ -142,18 +154,21 @@ static ctl_table sctp_table[] = { .data = &sctp_max_retrans_init, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = &proc_dointvec + .proc_handler = &proc_dointvec_minmax, + .strategy = &sysctl_intvec, + .extra1 = &one, + .extra2 = &int_max }, { .ctl_name = NET_SCTP_HB_INTERVAL, .procname = "hb_interval", .data = &sctp_hb_interval, - .maxlen = sizeof(long), + .maxlen = sizeof(unsigned int), .mode = 0644, - .proc_handler = &proc_doulongvec_ms_jiffies_minmax, - .strategy = &sctp_sysctl_jiffies_ms, - .extra1 = &rto_timer_min, - .extra2 = &rto_timer_max + .proc_handler = &proc_dointvec_minmax, + .strategy = &sysctl_intvec, + .extra1 = &one, + .extra2 = &timer_max }, { .ctl_name = NET_SCTP_PRESERVE_ENABLE, @@ -161,23 +176,26 @@ static ctl_table sctp_table[] = { .data = &sctp_cookie_preserve_enable, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = &proc_dointvec + .proc_handler = &proc_dointvec, + .strategy = &sysctl_intvec }, { .ctl_name = NET_SCTP_RTO_ALPHA, .procname = "rto_alpha_exp_divisor", .data = &sctp_rto_alpha, .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = &proc_dointvec + .mode = 0444, + .proc_handler = &proc_dointvec, + .strategy = &sysctl_intvec }, { .ctl_name = NET_SCTP_RTO_BETA, .procname = "rto_beta_exp_divisor", .data = &sctp_rto_beta, .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = &proc_dointvec + .mode = 0444, + .proc_handler = &proc_dointvec, + .strategy = &sysctl_intvec }, { .ctl_name = NET_SCTP_ADDIP_ENABLE, @@ -185,7 +203,8 @@ static ctl_table sctp_table[] = { .data = &sctp_addip_enable, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = &proc_dointvec + .proc_handler = &proc_dointvec, + .strategy = &sysctl_intvec }, { .ctl_name = NET_SCTP_PRSCTP_ENABLE, @@ -193,7 +212,8 @@ static ctl_table sctp_table[] = { .data = &sctp_prsctp_enable, .maxlen = sizeof(int), .mode = 0644, - .proc_handler = &proc_dointvec + .proc_handler = &proc_dointvec, + .strategy = &sysctl_intvec }, { .ctl_name = NET_SCTP_SACK_TIMEOUT, @@ -201,8 +221,8 @@ static ctl_table sctp_table[] = { .data = &sctp_sack_timeout, .maxlen = sizeof(long), .mode = 0644, - .proc_handler = &proc_doulongvec_ms_jiffies_minmax, - .strategy = &sctp_sysctl_jiffies_ms, + .proc_handler = &proc_dointvec_minmax, + .strategy = &sysctl_intvec, .extra1 = &sack_timer_min, .extra2 = &sack_timer_max, }, @@ -242,37 +262,3 @@ void sctp_sysctl_unregister(void) { unregister_sysctl_table(sctp_sysctl_header); } - -/* Strategy function to convert jiffies to milliseconds. */ -static int sctp_sysctl_jiffies_ms(ctl_table *table, int __user *name, int nlen, - void __user *oldval, size_t __user *oldlenp, - void __user *newval, size_t newlen, void **context) { - - if (oldval) { - size_t olen; - - if (oldlenp) { - if (get_user(olen, oldlenp)) - return -EFAULT; - - if (olen != sizeof (int)) - return -EINVAL; - } - if (put_user((*(int *)(table->data) * 1000) / HZ, - (int __user *)oldval) || - (oldlenp && put_user(sizeof (int), oldlenp))) - return -EFAULT; - } - if (newval && newlen) { - int new; - - if (newlen != sizeof (int)) - return -EINVAL; - - if (get_user(new, (int __user *)newval)) - return -EFAULT; - - *(int *)(table->data) = (new * HZ) / 1000; - } - return 1; -} diff --git a/net/sctp/transport.c b/net/sctp/transport.c index 2763aa9..3e5936a 100644 --- a/net/sctp/transport.c +++ b/net/sctp/transport.c @@ -75,7 +75,7 @@ static struct sctp_transport *sctp_transport_init(struct sctp_transport *peer, * parameter 'RTO.Initial'. */ peer->rtt = 0; - peer->rto = sctp_rto_initial; + peer->rto = msecs_to_jiffies(sctp_rto_initial); peer->rttvar = 0; peer->srtt = 0; peer->rto_pending = 0; -- cgit v0.10.2 From 2809486424df58043b380aeb9d7f402c031c46f6 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Tue, 22 Aug 2006 13:52:17 -0700 Subject: [NETFILTER]: x_tables: Fix typos after conversion to use mass registation helper Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/netfilter/xt_CONNSECMARK.c b/net/netfilter/xt_CONNSECMARK.c index 4b0e14b..4673862 100644 --- a/net/netfilter/xt_CONNSECMARK.c +++ b/net/netfilter/xt_CONNSECMARK.c @@ -130,14 +130,14 @@ static struct xt_target xt_connsecmark_target[] = { static int __init xt_connsecmark_init(void) { need_conntrack(); - return xt_register_targets(xt_connsecmark_targets, - ARRAY_SIZE(xt_connsecmark_targets)); + return xt_register_targets(xt_connsecmark_target, + ARRAY_SIZE(xt_connsecmark_target)); } static void __exit xt_connsecmark_fini(void) { - xt_unregister_targets(xt_connsecmark_targets, - ARRAY_SIZE(xt_connsecmark_targets)); + xt_unregister_targets(xt_connsecmark_target, + ARRAY_SIZE(xt_connsecmark_target)); } module_init(xt_connsecmark_init); diff --git a/net/netfilter/xt_SECMARK.c b/net/netfilter/xt_SECMARK.c index 451b67c..add7521 100644 --- a/net/netfilter/xt_SECMARK.c +++ b/net/netfilter/xt_SECMARK.c @@ -111,7 +111,7 @@ static int checkentry(const char *tablename, const void *entry, return 1; } -static struct xt_target xt_secmark_target = { +static struct xt_target xt_secmark_target[] = { { .name = "SECMARK", .family = AF_INET, -- cgit v0.10.2 From a57d27fc7107ddcc655ba2812cfebfce3163fd62 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Tue, 22 Aug 2006 22:20:14 -0700 Subject: [RTNETLINK]: Don't return error on no-metrics. Instead just cancel the nested attribute and return 0. Signed-off-by: David S. Miller diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index eeff0b2..8f22549 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -202,8 +202,10 @@ int rtnetlink_put_metrics(struct sk_buff *skb, u32 *metrics) } } - if (!valid) - goto nla_put_failure; + if (!valid) { + nla_nest_cancel(skb, mx); + return 0; + } return nla_nest_end(skb, mx); -- cgit v0.10.2 From 5e032e32ecc2e6cb0385dc115ca9bfe5e19a9539 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:12:24 -0700 Subject: [IPV6] NDISC: Take source address into account for redirects. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: Ville Nuorvala Signed-off-by: David S. Miller diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h index 3f170f6..249ce45 100644 --- a/include/net/ip6_route.h +++ b/include/net/ip6_route.h @@ -110,6 +110,7 @@ extern int rt6_route_rcv(struct net_device *dev, struct in6_addr *gwaddr); extern void rt6_redirect(struct in6_addr *dest, + struct in6_addr *src, struct in6_addr *saddr, struct neighbour *neigh, u8 *lladdr, diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c index 419d651..32f28de 100644 --- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -1346,7 +1346,8 @@ static void ndisc_redirect_rcv(struct sk_buff *skb) neigh = __neigh_lookup(&nd_tbl, target, skb->dev, 1); if (neigh) { - rt6_redirect(dest, &skb->nh.ipv6h->saddr, neigh, lladdr, + rt6_redirect(dest, &skb->nh.ipv6h->daddr, + &skb->nh.ipv6h->saddr, neigh, lladdr, on_link); neigh_release(neigh); } diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 5d6e908..a9b08a2 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1279,7 +1279,8 @@ static int ip6_route_del(struct fib6_config *cfg) /* * Handle redirects */ -void rt6_redirect(struct in6_addr *dest, struct in6_addr *saddr, +void rt6_redirect(struct in6_addr *dest, struct in6_addr *src, + struct in6_addr *saddr, struct neighbour *neigh, u8 *lladdr, int on_link) { struct rt6_info *rt, *nrt = NULL; @@ -1304,7 +1305,7 @@ void rt6_redirect(struct in6_addr *dest, struct in6_addr *saddr, */ read_lock_bh(&table->tb6_lock); - fn = fib6_lookup(&table->tb6_root, dest, NULL); + fn = fib6_lookup(&table->tb6_root, dest, src); restart: for (rt = fn->leaf; rt; rt = rt->u.next) { /* -- cgit v0.10.2 From a6279458c534d01ccc39498aba61c93083ee0372 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:18:26 -0700 Subject: [IPV6] NDISC: Search over all possible rules on receipt of redirect. Split up function for finding routes for redirects. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/route.c b/net/ipv6/route.c index a9b08a2..8d00a9d 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1279,19 +1279,18 @@ static int ip6_route_del(struct fib6_config *cfg) /* * Handle redirects */ -void rt6_redirect(struct in6_addr *dest, struct in6_addr *src, - struct in6_addr *saddr, - struct neighbour *neigh, u8 *lladdr, int on_link) +struct ip6rd_flowi { + struct flowi fl; + struct in6_addr gateway; +}; + +static struct rt6_info *__ip6_route_redirect(struct fib6_table *table, + struct flowi *fl, + int flags) { - struct rt6_info *rt, *nrt = NULL; + struct ip6rd_flowi *rdfl = (struct ip6rd_flowi *)fl; + struct rt6_info *rt; struct fib6_node *fn; - struct fib6_table *table; - struct netevent_redirect netevent; - - /* TODO: Very lazy, might need to check all tables */ - table = fib6_get_table(RT6_TABLE_MAIN); - if (table == NULL) - return; /* * Get the "current" route for this destination and @@ -1305,7 +1304,7 @@ void rt6_redirect(struct in6_addr *dest, struct in6_addr *src, */ read_lock_bh(&table->tb6_lock); - fn = fib6_lookup(&table->tb6_root, dest, src); + fn = fib6_lookup(&table->tb6_root, &fl->fl6_dst, &fl->fl6_src); restart: for (rt = fn->leaf; rt; rt = rt->u.next) { /* @@ -1320,29 +1319,67 @@ restart: continue; if (!(rt->rt6i_flags & RTF_GATEWAY)) continue; - if (neigh->dev != rt->rt6i_dev) + if (fl->oif != rt->rt6i_dev->ifindex) continue; - if (!ipv6_addr_equal(saddr, &rt->rt6i_gateway)) + if (!ipv6_addr_equal(&rdfl->gateway, &rt->rt6i_gateway)) continue; break; } - if (rt) - dst_hold(&rt->u.dst); - else if (rt6_need_strict(dest)) { - while ((fn = fn->parent) != NULL) { - if (fn->fn_flags & RTN_ROOT) - break; - if (fn->fn_flags & RTN_RTINFO) - goto restart; + + if (!rt) { + if (rt6_need_strict(&fl->fl6_dst)) { + while ((fn = fn->parent) != NULL) { + if (fn->fn_flags & RTN_ROOT) + break; + if (fn->fn_flags & RTN_RTINFO) + goto restart; + } } + rt = &ip6_null_entry; } + dst_hold(&rt->u.dst); + read_unlock_bh(&table->tb6_lock); - if (!rt) { + return rt; +}; + +static struct rt6_info *ip6_route_redirect(struct in6_addr *dest, + struct in6_addr *src, + struct in6_addr *gateway, + struct net_device *dev) +{ + struct ip6rd_flowi rdfl = { + .fl = { + .oif = dev->ifindex, + .nl_u = { + .ip6_u = { + .daddr = *dest, + .saddr = *src, + }, + }, + }, + .gateway = *gateway, + }; + int flags = rt6_need_strict(dest) ? RT6_F_STRICT : 0; + + return (struct rt6_info *)fib6_rule_lookup((struct flowi *)&rdfl, flags, __ip6_route_redirect); +} + +void rt6_redirect(struct in6_addr *dest, struct in6_addr *src, + struct in6_addr *saddr, + struct neighbour *neigh, u8 *lladdr, int on_link) +{ + struct rt6_info *rt, *nrt = NULL; + struct netevent_redirect netevent; + + rt = ip6_route_redirect(dest, src, saddr, neigh->dev); + + if (rt == &ip6_null_entry) { if (net_ratelimit()) printk(KERN_DEBUG "rt6_redirect: source isn't a valid nexthop " "for redirect target\n"); - return; + goto out; } /* -- cgit v0.10.2 From af184765848c280c7e6190f45c827c5ea3881126 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:18:57 -0700 Subject: [IPV6] NDISC: Initialize fl with outbound interface to lookup rules properly. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: Ville Nuorvala Signed-off-by: David S. Miller diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c index 32f28de..ed01f9a 100644 --- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -412,7 +412,8 @@ static void pndisc_destructor(struct pneigh_entry *n) */ static inline void ndisc_flow_init(struct flowi *fl, u8 type, - struct in6_addr *saddr, struct in6_addr *daddr) + struct in6_addr *saddr, struct in6_addr *daddr, + int oif) { memset(fl, 0, sizeof(*fl)); ipv6_addr_copy(&fl->fl6_src, saddr); @@ -420,6 +421,7 @@ static inline void ndisc_flow_init(struct flowi *fl, u8 type, fl->proto = IPPROTO_ICMPV6; fl->fl_icmp_type = type; fl->fl_icmp_code = 0; + fl->oif = oif; security_sk_classify_flow(ndisc_socket->sk, fl); } @@ -452,7 +454,8 @@ static void ndisc_send_na(struct net_device *dev, struct neighbour *neigh, src_addr = &tmpaddr; } - ndisc_flow_init(&fl, NDISC_NEIGHBOUR_ADVERTISEMENT, src_addr, daddr); + ndisc_flow_init(&fl, NDISC_NEIGHBOUR_ADVERTISEMENT, src_addr, daddr, + dev->ifindex); dst = ndisc_dst_alloc(dev, neigh, daddr, ip6_output); if (!dst) @@ -542,7 +545,8 @@ void ndisc_send_ns(struct net_device *dev, struct neighbour *neigh, saddr = &addr_buf; } - ndisc_flow_init(&fl, NDISC_NEIGHBOUR_SOLICITATION, saddr, daddr); + ndisc_flow_init(&fl, NDISC_NEIGHBOUR_SOLICITATION, saddr, daddr, + dev->ifindex); dst = ndisc_dst_alloc(dev, neigh, daddr, ip6_output); if (!dst) @@ -617,7 +621,8 @@ void ndisc_send_rs(struct net_device *dev, struct in6_addr *saddr, int len; int err; - ndisc_flow_init(&fl, NDISC_ROUTER_SOLICITATION, saddr, daddr); + ndisc_flow_init(&fl, NDISC_ROUTER_SOLICITATION, saddr, daddr, + dev->ifindex); dst = ndisc_dst_alloc(dev, NULL, daddr, ip6_output); if (!dst) @@ -1383,7 +1388,8 @@ void ndisc_send_redirect(struct sk_buff *skb, struct neighbour *neigh, return; } - ndisc_flow_init(&fl, NDISC_REDIRECT, &saddr_buf, &skb->nh.ipv6h->saddr); + ndisc_flow_init(&fl, NDISC_REDIRECT, &saddr_buf, &skb->nh.ipv6h->saddr, + dev->ifindex); dst = ip6_route_output(NULL, &fl); if (dst == NULL) -- cgit v0.10.2 From cf6b1982599cbb60f410adeda659b0b29cdf7ad7 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:19:18 -0700 Subject: [IPV6] ROUTE: Introduce a helper to check route validity. Signed-off-by: YOSHIFUJI Hideaki Acked-by: Ville Nuorvala Signed-off-by: David S. Miller diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 65514f2..0a18cb6 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -726,6 +726,14 @@ fail: return err; } +static inline int ip6_rt_check(struct rt6key *rt_key, + struct in6_addr *fl_addr, + struct in6_addr *addr_cache) +{ + return ((rt_key->plen != 128 || !ipv6_addr_equal(fl_addr, &rt_key->addr)) && + (addr_cache == NULL || !ipv6_addr_equal(fl_addr, addr_cache))); +} + static struct dst_entry *ip6_sk_dst_check(struct sock *sk, struct dst_entry *dst, struct flowi *fl) @@ -741,8 +749,8 @@ static struct dst_entry *ip6_sk_dst_check(struct sock *sk, * that we do not support routing by source, TOS, * and MSG_DONTROUTE --ANK (980726) * - * 1. If route was host route, check that - * cached destination is current. + * 1. ip6_rt_check(): If route was host route, + * check that cached destination is current. * If it is network route, we still may * check its validity using saved pointer * to the last used address: daddr_cache. @@ -753,11 +761,8 @@ static struct dst_entry *ip6_sk_dst_check(struct sock *sk, * sockets. * 2. oif also should be the same. */ - if (((rt->rt6i_dst.plen != 128 || - !ipv6_addr_equal(&fl->fl6_dst, &rt->rt6i_dst.addr)) - && (np->daddr_cache == NULL || - !ipv6_addr_equal(&fl->fl6_dst, np->daddr_cache))) - || (fl->oif && fl->oif != dst->dev->ifindex)) { + if (ip6_rt_check(&rt->rt6i_dst, &fl->fl6_dst, np->daddr_cache) || + (fl->oif && fl->oif != dst->dev->ifindex)) { dst_release(dst); dst = NULL; } -- cgit v0.10.2 From 8e1ef0a95b87e8b4292b2ba733e8cb854ea2d2fe Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Tue, 29 Aug 2006 17:15:09 -0700 Subject: [IPV6]: Cache source address as well in ipv6_pinfo{}. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: Ville Nuorvala Signed-off-by: David S. Miller diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h index 297853c..02d14a3 100644 --- a/include/linux/ipv6.h +++ b/include/linux/ipv6.h @@ -242,6 +242,9 @@ struct ipv6_pinfo { struct in6_addr rcv_saddr; struct in6_addr daddr; struct in6_addr *daddr_cache; +#ifdef CONFIG_IPV6_SUBTREES + struct in6_addr *saddr_cache; +#endif __u32 flow_label; __u32 frag_size; diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h index 249ce45..0d40f84 100644 --- a/include/net/ip6_route.h +++ b/include/net/ip6_route.h @@ -144,21 +144,24 @@ extern rwlock_t rt6_lock; * Store a destination cache entry in a socket */ static inline void __ip6_dst_store(struct sock *sk, struct dst_entry *dst, - struct in6_addr *daddr) + struct in6_addr *daddr, struct in6_addr *saddr) { struct ipv6_pinfo *np = inet6_sk(sk); struct rt6_info *rt = (struct rt6_info *) dst; sk_setup_caps(sk, dst); np->daddr_cache = daddr; +#ifdef CONFIG_IPV6_SUBTREES + np->saddr_cache = saddr; +#endif np->dst_cookie = rt->rt6i_node ? rt->rt6i_node->fn_sernum : 0; } static inline void ip6_dst_store(struct sock *sk, struct dst_entry *dst, - struct in6_addr *daddr) + struct in6_addr *daddr, struct in6_addr *saddr) { write_lock(&sk->sk_dst_lock); - __ip6_dst_store(sk, dst, daddr); + __ip6_dst_store(sk, dst, daddr, saddr); write_unlock(&sk->sk_dst_lock); } diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c index 231bc7c..f9c5e12 100644 --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -231,7 +231,7 @@ static int dccp_v6_connect(struct sock *sk, struct sockaddr *uaddr, ipv6_addr_copy(&np->saddr, saddr); inet->rcv_saddr = LOOPBACK4_IPV6; - __ip6_dst_store(sk, dst, NULL); + __ip6_dst_store(sk, dst, NULL, NULL); icsk->icsk_ext_hdr_len = 0; if (np->opt != NULL) @@ -872,7 +872,7 @@ static struct sock *dccp_v6_request_recv_sock(struct sock *sk, * comment in that function for the gory details. -acme */ - __ip6_dst_store(newsk, dst, NULL); + __ip6_dst_store(newsk, dst, NULL, NULL); newsk->sk_route_caps = dst->dev->features & ~(NETIF_F_IP_CSUM | NETIF_F_TSO); newdp6 = (struct dccp6_sock *)newsk; diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 2ff600c..57ee5dd 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -659,7 +659,7 @@ int inet6_sk_rebuild_header(struct sock *sk) return err; } - __ip6_dst_store(sk, dst, NULL); + __ip6_dst_store(sk, dst, NULL, NULL); } return 0; diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c index c73508e..8561b9d 100644 --- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -193,7 +193,12 @@ ipv4_connected: ip6_dst_store(sk, dst, ipv6_addr_equal(&fl.fl6_dst, &np->daddr) ? - &np->daddr : NULL); + &np->daddr : NULL, +#ifdef CONFIG_IPV6_SUBTREES + ipv6_addr_equal(&fl.fl6_src, &np->saddr) ? + &np->saddr : +#endif + NULL); sk->sk_state = TCP_ESTABLISHED; out: diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c index 7a51a25..827f41d 100644 --- a/net/ipv6/inet6_connection_sock.c +++ b/net/ipv6/inet6_connection_sock.c @@ -186,7 +186,7 @@ int inet6_csk_xmit(struct sk_buff *skb, int ipfragok) return err; } - __ip6_dst_store(sk, dst, NULL); + __ip6_dst_store(sk, dst, NULL, NULL); } skb->dst = dst_clone(dst); diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 0a18cb6..2a376b7 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -762,6 +762,9 @@ static struct dst_entry *ip6_sk_dst_check(struct sock *sk, * 2. oif also should be the same. */ if (ip6_rt_check(&rt->rt6i_dst, &fl->fl6_dst, np->daddr_cache) || +#ifdef CONFIG_IPV6_SUBTREES + ip6_rt_check(&rt->rt6i_src, &fl->fl6_src, np->saddr_cache) || +#endif (fl->oif && fl->oif != dst->dev->ifindex)) { dst_release(dst); dst = NULL; diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 7f1b660..2b18918 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -272,7 +272,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr *uaddr, inet->rcv_saddr = LOOPBACK4_IPV6; sk->sk_gso_type = SKB_GSO_TCPV6; - __ip6_dst_store(sk, dst, NULL); + __ip6_dst_store(sk, dst, NULL, NULL); icsk->icsk_ext_hdr_len = 0; if (np->opt) @@ -954,7 +954,7 @@ static struct sock * tcp_v6_syn_recv_sock(struct sock *sk, struct sk_buff *skb, */ newsk->sk_gso_type = SKB_GSO_TCPV6; - __ip6_dst_store(newsk, dst, NULL); + __ip6_dst_store(newsk, dst, NULL, NULL); newtcp6sk = (struct tcp6_sock *)newsk; inet_sk(newsk)->pinet6 = &newtcp6sk->inet6; diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index eb9e1b3..b9cc55c 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -847,7 +847,12 @@ do_append_data: if (connected) { ip6_dst_store(sk, dst, ipv6_addr_equal(&fl->fl6_dst, &np->daddr) ? - &np->daddr : NULL); + &np->daddr : NULL, +#ifdef CONFIG_IPV6_SUBTREES + ipv6_addr_equal(&fl->fl6_src, &np->saddr) ? + &np->saddr : +#endif + NULL); } else { dst_release(dst); } -- cgit v0.10.2 From 66729e18df08ee20a9824148236b89f56371659e Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:20:34 -0700 Subject: [IPV6] ROUTE: Make sure we have fn->leaf when adding a node on subtree. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: Ville Nuorvala Signed-off-by: David S. Miller diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 667b1b1..11f9660 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -80,6 +80,7 @@ static DEFINE_RWLOCK(fib6_walker_lock); #endif static void fib6_prune_clones(struct fib6_node *fn, struct rt6_info *rt); +static struct rt6_info * fib6_find_prefix(struct fib6_node *fn); static struct fib6_node * fib6_repair_tree(struct fib6_node *fn); static int fib6_walk(struct fib6_walker_t *w); static int fib6_walk_continue(struct fib6_walker_t *w); @@ -697,7 +698,7 @@ void fib6_force_start_gc(void) int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info) { - struct fib6_node *fn; + struct fib6_node *fn, *pn = NULL; int err = -ENOMEM; fn = fib6_add_1(root, &rt->rt6i_dst.addr, sizeof(struct in6_addr), @@ -706,6 +707,8 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info) if (fn == NULL) goto out; + pn = fn; + #ifdef CONFIG_IPV6_SUBTREES if (rt->rt6i_src.plen) { struct fib6_node *sn; @@ -751,10 +754,6 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info) /* Now link new subtree to main tree */ sfn->parent = fn; fn->subtree = sfn; - if (fn->leaf == NULL) { - fn->leaf = rt; - atomic_inc(&rt->rt6i_ref); - } } else { sn = fib6_add_1(fn->subtree, &rt->rt6i_src.addr, sizeof(struct in6_addr), rt->rt6i_src.plen, @@ -764,6 +763,10 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info) goto st_failure; } + if (fn->leaf == NULL) { + fn->leaf = rt; + atomic_inc(&rt->rt6i_ref); + } fn = sn; } #endif @@ -777,8 +780,25 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info) } out: - if (err) + if (err) { +#ifdef CONFIG_IPV6_SUBTREES + /* + * If fib6_add_1 has cleared the old leaf pointer in the + * super-tree leaf node we have to find a new one for it. + */ + if (pn != fn && !pn->leaf && !(pn->fn_flags & RTN_RTINFO)) { + pn->leaf = fib6_find_prefix(pn); +#if RT6_DEBUG >= 2 + if (!pn->leaf) { + BUG_TRAP(pn->leaf != NULL); + pn->leaf = &ip6_null_entry; + } +#endif + atomic_inc(&pn->leaf->rt6i_ref); + } +#endif dst_free(&rt->u.dst); + } return err; #ifdef CONFIG_IPV6_SUBTREES -- cgit v0.10.2 From 2285adc1e6c9f964f9625e7edcd233fccd7a7c92 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:20:54 -0700 Subject: [IPV6] ROUTE: Prune clones from main tree as well. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: Ville Nuorvala Signed-off-by: David S. Miller diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 11f9660..35b91ff 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -776,7 +776,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info) if (err == 0) { fib6_start_gc(rt); if (!(rt->rt6i_flags&RTF_CACHE)) - fib6_prune_clones(fn, rt); + fib6_prune_clones(pn, rt); } out: -- cgit v0.10.2 From 3fc5e0440be7fab3abae4e801b0ef17e9b3b58c4 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:21:12 -0700 Subject: [IPV6] ROUTE: Fix looking up a route on subtree. Even on RTN_ROOT node, we need to process its subtree first. Fix NULL pointer dereference in fib6_locate(). Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: Ville Nuorvala Signed-off-by: David S. Miller diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 35b91ff..5408b64 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -850,33 +850,26 @@ static struct fib6_node * fib6_lookup_1(struct fib6_node *root, break; } - while ((fn->fn_flags & RTN_ROOT) == 0) { -#ifdef CONFIG_IPV6_SUBTREES - if (fn->subtree) { - struct fib6_node *st; - struct lookup_args *narg; - - narg = args + 1; - - if (narg->addr) { - st = fib6_lookup_1(fn->subtree, narg); - - if (st && !(st->fn_flags & RTN_ROOT)) - return st; - } - } -#endif - - if (fn->fn_flags & RTN_RTINFO) { + while(fn) { + if (SUBTREE(fn) || fn->fn_flags & RTN_RTINFO) { struct rt6key *key; key = (struct rt6key *) ((u8 *) fn->leaf + args->offset); - if (ipv6_prefix_equal(&key->addr, args->addr, key->plen)) - return fn; + if (ipv6_prefix_equal(&key->addr, args->addr, key->plen)) { +#ifdef CONFIG_IPV6_SUBTREES + if (fn->subtree) + fn = fib6_lookup_1(fn->subtree, args + 1); +#endif + if (!fn || fn->fn_flags & RTN_RTINFO) + return fn; + } } + if (fn->fn_flags & RTN_ROOT) + break; + fn = fn->parent; } @@ -953,10 +946,8 @@ struct fib6_node * fib6_locate(struct fib6_node *root, #ifdef CONFIG_IPV6_SUBTREES if (src_len) { BUG_TRAP(saddr!=NULL); - if (fn == NULL) - fn = fn->subtree; - if (fn) - fn = fib6_locate_1(fn, saddr, src_len, + if (fn && fn->subtree) + fn = fib6_locate_1(fn->subtree, saddr, src_len, offsetof(struct rt6_info, rt6i_src)); } #endif -- cgit v0.10.2 From 825e288ef4c55a379a97e104c825eb9b74874099 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:21:29 -0700 Subject: [IPV6] ROUTE: Make sure we do not exceed args in fib6_lookup_1(). Signed-off-by: YOSHIFUJI Hideaki Acked-by: Ville Nuorvala Signed-off-by: David S. Miller diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 5408b64..19ee737 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -829,6 +829,9 @@ static struct fib6_node * fib6_lookup_1(struct fib6_node *root, struct fib6_node *fn; int dir; + if (unlikely(args->offset == 0)) + return NULL; + /* * Descend on a tree */ @@ -879,16 +882,22 @@ static struct fib6_node * fib6_lookup_1(struct fib6_node *root, struct fib6_node * fib6_lookup(struct fib6_node *root, struct in6_addr *daddr, struct in6_addr *saddr) { - struct lookup_args args[2]; struct fib6_node *fn; - - args[0].offset = offsetof(struct rt6_info, rt6i_dst); - args[0].addr = daddr; - + struct lookup_args args[] = { + { + .offset = offsetof(struct rt6_info, rt6i_dst), + .addr = daddr, + }, #ifdef CONFIG_IPV6_SUBTREES - args[1].offset = offsetof(struct rt6_info, rt6i_src); - args[1].addr = saddr; + { + .offset = offsetof(struct rt6_info, rt6i_src), + .addr = saddr, + }, #endif + { + .offset = 0, /* sentinel */ + } + }; fn = fib6_lookup_1(root, args); -- cgit v0.10.2 From fefc2a6c201aeafc1d0329a140de502d49f69d04 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:21:50 -0700 Subject: [IPV6] ROUTE: Allow searching subtree only. Signed-off-by: YOSHIFUJI Hideaki Acked-by: Ville Nuorvala Signed-off-by: David S. Miller diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 19ee737..b706424 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -899,7 +899,7 @@ struct fib6_node * fib6_lookup(struct fib6_node *root, struct in6_addr *daddr, } }; - fn = fib6_lookup_1(root, args); + fn = fib6_lookup_1(root, daddr ? args : args + 1); if (fn == NULL || fn->fn_flags & RTN_TL_ROOT) fn = root; -- cgit v0.10.2 From 7fc33165a74301b2c5c90b2f2a1f6907cbd5c6f1 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:22:24 -0700 Subject: [IPV6] ROUTE: Put SUBTREE() as FIB6_SUBTREE() into ip6_fib.h for future use. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: Ville Nuorvala Signed-off-by: David S. Miller diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h index 9610b887f..6a3f26a 100644 --- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -60,6 +60,11 @@ struct fib6_node __u32 fn_sernum; }; +#ifndef CONFIG_IPV6_SUBTREES +#define FIB6_SUBTREE(fn) NULL +#else +#define FIB6_SUBTREE(fn) ((fn)->subtree) +#endif /* * routing information diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index b706424..6536e33 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -73,10 +73,8 @@ static DEFINE_RWLOCK(fib6_walker_lock); #ifdef CONFIG_IPV6_SUBTREES #define FWS_INIT FWS_S -#define SUBTREE(fn) ((fn)->subtree) #else #define FWS_INIT FWS_L -#define SUBTREE(fn) NULL #endif static void fib6_prune_clones(struct fib6_node *fn, struct rt6_info *rt); @@ -854,7 +852,7 @@ static struct fib6_node * fib6_lookup_1(struct fib6_node *root, } while(fn) { - if (SUBTREE(fn) || fn->fn_flags & RTN_RTINFO) { + if (FIB6_SUBTREE(fn) || fn->fn_flags & RTN_RTINFO) { struct rt6key *key; key = (struct rt6key *) ((u8 *) fn->leaf + @@ -985,7 +983,7 @@ static struct rt6_info * fib6_find_prefix(struct fib6_node *fn) if(fn->right) return fn->right->leaf; - fn = SUBTREE(fn); + fn = FIB6_SUBTREE(fn); } return NULL; } @@ -1016,7 +1014,7 @@ static struct fib6_node * fib6_repair_tree(struct fib6_node *fn) if (fn->right) child = fn->right, children |= 1; if (fn->left) child = fn->left, children |= 2; - if (children == 3 || SUBTREE(fn) + if (children == 3 || FIB6_SUBTREE(fn) #ifdef CONFIG_IPV6_SUBTREES /* Subtree root (i.e. fn) may have one child */ || (children && fn->fn_flags&RTN_ROOT) @@ -1035,9 +1033,9 @@ static struct fib6_node * fib6_repair_tree(struct fib6_node *fn) pn = fn->parent; #ifdef CONFIG_IPV6_SUBTREES - if (SUBTREE(pn) == fn) { + if (FIB6_SUBTREE(pn) == fn) { BUG_TRAP(fn->fn_flags&RTN_ROOT); - SUBTREE(pn) = NULL; + FIB6_SUBTREE(pn) = NULL; nstate = FWS_L; } else { BUG_TRAP(!(fn->fn_flags&RTN_ROOT)); @@ -1085,7 +1083,7 @@ static struct fib6_node * fib6_repair_tree(struct fib6_node *fn) read_unlock(&fib6_walker_lock); node_free(fn); - if (pn->fn_flags&RTN_RTINFO || SUBTREE(pn)) + if (pn->fn_flags&RTN_RTINFO || FIB6_SUBTREE(pn)) return pn; rt6_release(pn->leaf); @@ -1228,8 +1226,8 @@ static int fib6_walk_continue(struct fib6_walker_t *w) switch (w->state) { #ifdef CONFIG_IPV6_SUBTREES case FWS_S: - if (SUBTREE(fn)) { - w->node = SUBTREE(fn); + if (FIB6_SUBTREE(fn)) { + w->node = FIB6_SUBTREE(fn); continue; } w->state = FWS_L; @@ -1263,7 +1261,7 @@ static int fib6_walk_continue(struct fib6_walker_t *w) pn = fn->parent; w->node = pn; #ifdef CONFIG_IPV6_SUBTREES - if (SUBTREE(pn) == fn) { + if (FIB6_SUBTREE(pn) == fn) { BUG_TRAP(fn->fn_flags&RTN_ROOT); w->state = FWS_L; continue; -- cgit v0.10.2 From 982f56f3a9be4651520c0fdd3d80a5d02e95a178 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:22:39 -0700 Subject: [IPV6] ROUTE: Search subtree when backtracking. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: Ville Nuorvala Signed-off-by: David S. Miller diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 8d00a9d..bd4cf17 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -481,17 +481,23 @@ int rt6_route_rcv(struct net_device *dev, u8 *opt, int len, } #endif -#define BACKTRACK() \ -if (rt == &ip6_null_entry && flags & RT6_F_STRICT) { \ - while ((fn = fn->parent) != NULL) { \ - if (fn->fn_flags & RTN_TL_ROOT) { \ - dst_hold(&rt->u.dst); \ - goto out; \ +#define BACKTRACK(saddr) \ +do { \ + if (rt == &ip6_null_entry) { \ + struct fib6_node *pn; \ + while (fn) { \ + if (fn->fn_flags & RTN_TL_ROOT) \ + goto out; \ + pn = fn->parent; \ + if (FIB6_SUBTREE(pn) && FIB6_SUBTREE(pn) != fn) \ + fn = fib6_lookup(pn->subtree, NULL, saddr); \ + else \ + fn = pn; \ + if (fn->fn_flags & RTN_RTINFO) \ + goto restart; \ } \ - if (fn->fn_flags & RTN_RTINFO) \ - goto restart; \ } \ -} +} while(0) static struct rt6_info *ip6_pol_route_lookup(struct fib6_table *table, struct flowi *fl, int flags) @@ -504,7 +510,7 @@ static struct rt6_info *ip6_pol_route_lookup(struct fib6_table *table, restart: rt = fn->leaf; rt = rt6_device_match(rt, fl->oif, flags & RT6_F_STRICT); - BACKTRACK(); + BACKTRACK(&fl->fl6_src); dst_hold(&rt->u.dst); out: read_unlock_bh(&table->tb6_lock); @@ -638,7 +644,7 @@ restart_2: restart: rt = rt6_select(&fn->leaf, fl->iif, strict | reachable); - BACKTRACK(); + BACKTRACK(&fl->fl6_src); if (rt == &ip6_null_entry || rt->rt6i_flags & RTF_CACHE) goto out; @@ -733,7 +739,7 @@ restart_2: restart: rt = rt6_select(&fn->leaf, fl->oif, strict | reachable); - BACKTRACK(); + BACKTRACK(&fl->fl6_src); if (rt == &ip6_null_entry || rt->rt6i_flags & RTF_CACHE) goto out; -- cgit v0.10.2 From 150730d5a53b1bbb486101b2a5fb82ff0d3f916e Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:22:55 -0700 Subject: [IPV6] ROUTE: Purge clones on other trees when deleting a route. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: Ville Nuorvala diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index 6536e33..f0fdaf1 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -1169,8 +1169,18 @@ int fib6_del(struct rt6_info *rt, struct nl_info *info) BUG_TRAP(fn->fn_flags&RTN_RTINFO); - if (!(rt->rt6i_flags&RTF_CACHE)) - fib6_prune_clones(fn, rt); + if (!(rt->rt6i_flags&RTF_CACHE)) { + struct fib6_node *pn = fn; +#ifdef CONFIG_IPV6_SUBTREES + /* clones of this route might be in another subtree */ + if (rt->rt6i_src.plen) { + while (!(pn->fn_flags&RTN_ROOT)) + pn = pn->parent; + pn = pn->parent; + } +#endif + fib6_prune_clones(pn, rt); + } /* * Walk the leaf entries looking for ourself -- cgit v0.10.2 From cb15d9c224fcc03b32396c1c7416e777c2dcca34 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:23:11 -0700 Subject: [IPV6] NDISC: Search subtrees when backtracking on receipt of redirects. Signed-off-by: YOSHIFUJI Hideaki Acked-by: Ville Nuorvala diff --git a/net/ipv6/route.c b/net/ipv6/route.c index bd4cf17..fd626d4 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1332,17 +1332,10 @@ restart: break; } - if (!rt) { - if (rt6_need_strict(&fl->fl6_dst)) { - while ((fn = fn->parent) != NULL) { - if (fn->fn_flags & RTN_ROOT) - break; - if (fn->fn_flags & RTN_RTINFO) - goto restart; - } - } + if (!rt) rt = &ip6_null_entry; - } + BACKTRACK(&fl->fl6_src); +out: dst_hold(&rt->u.dst); read_unlock_bh(&table->tb6_lock); -- cgit v0.10.2 From c0bece9f2aec546c3750ae3972f80e024a923f34 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:23:25 -0700 Subject: [IPV6] ROUTE: Add credits about subtree fixes. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index f0fdaf1..fbca609 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -18,6 +18,7 @@ * Yuji SEKIYA @USAGI: Support default route on router node; * remove ip6_null_entry from the top of * routing table. + * Ville Nuorvala: Fixed routing subtrees. */ #include #include diff --git a/net/ipv6/route.c b/net/ipv6/route.c index fd626d4..fd6f2ec 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -22,6 +22,8 @@ * routers in REACHABLE, STALE, DELAY or PROBE states). * - always select the same router if it is (probably) * reachable. otherwise, round-robin the list. + * Ville Nuorvala + * Fixed routing subtrees. */ #include -- cgit v0.10.2 From 4e96c2b4180aff4f080b77314712073c6ca430e7 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:23:39 -0700 Subject: [IPV6] KCONFIG: Add subtrees support. This is for developers only. Based on MIPL2 kernel patch. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: Ville Nuorvala diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig index 36a6c2b..14f0b33 100644 --- a/net/ipv6/Kconfig +++ b/net/ipv6/Kconfig @@ -136,6 +136,20 @@ config IPV6_TUNNEL If unsure, say N. +config IPV6_SUBTREES + bool "IPv6: source address based routing" + depends on IPV6 && EXPERIMENTAL + ---help--- + Enable routing by source address or prefix. + + The destination address is still the primary routing key, so mixing + normal and source prefix specific routes in the same routing table + may sometimes lead to unintended routing behavior. This can be + avoided by defining different routing tables for the normal and + source prefix specific routes. + + If unsure, say N. + config IPV6_MULTIPLE_TABLES bool "IPv6: Multiple Routing Tables" depends on IPV6 && EXPERIMENTAL -- cgit v0.10.2 From 77d16f450ae0452d7d4b009f78debb1294fb435c Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Wed, 23 Aug 2006 17:25:05 -0700 Subject: [IPV6] ROUTE: Unify RT6_F_xxx and RT6_SELECT_F_xxx flags Unify RT6_F_xxx and RT6_SELECT_F_xxx flags into RT6_LOOKUP_F_xxx flags, and put them into ip6_route.h Signed-off-by: YOSHIFUJI Hideaki Acked-by: Ville Nuorvala diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h index 6a3f26a..e4438de 100644 --- a/include/net/ip6_fib.h +++ b/include/net/ip6_fib.h @@ -173,9 +173,6 @@ struct fib6_table { #define RT6_TABLE_LOCAL RT6_TABLE_MAIN #endif -#define RT6_F_STRICT 1 -#define RT6_F_HAS_SADDR 2 - typedef struct rt6_info *(*pol_lookup_t)(struct fib6_table *, struct flowi *, int); diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h index 0d40f84..2979095 100644 --- a/include/net/ip6_route.h +++ b/include/net/ip6_route.h @@ -32,6 +32,10 @@ struct route_info { #include #include +#define RT6_LOOKUP_F_IFACE 0x1 +#define RT6_LOOKUP_F_REACHABLE 0x2 +#define RT6_LOOKUP_F_HAS_SADDR 0x4 + struct pol_chain { int type; int priority; diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c index 2c4fbc8..7b4908c 100644 --- a/net/ipv6/fib6_rules.c +++ b/net/ipv6/fib6_rules.c @@ -117,7 +117,7 @@ static int fib6_rule_match(struct fib_rule *rule, struct flowi *fl, int flags) if (!ipv6_prefix_equal(&fl->fl6_dst, &r->dst.addr, r->dst.plen)) return 0; - if ((flags & RT6_F_HAS_SADDR) && + if ((flags & RT6_LOOKUP_F_HAS_SADDR) && !ipv6_prefix_equal(&fl->fl6_src, &r->src.addr, r->src.plen)) return 0; diff --git a/net/ipv6/route.c b/net/ipv6/route.c index fd6f2ec..2069128 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -76,9 +76,6 @@ #define CLONE_OFFLINK_ROUTE 0 -#define RT6_SELECT_F_IFACE 0x1 -#define RT6_SELECT_F_REACHABLE 0x2 - static int ip6_rt_max_size = 4096; static int ip6_rt_gc_min_interval = HZ / 2; static int ip6_rt_gc_timeout = 60*HZ; @@ -340,7 +337,7 @@ static int rt6_score_route(struct rt6_info *rt, int oif, int m, n; m = rt6_check_dev(rt, oif); - if (!m && (strict & RT6_SELECT_F_IFACE)) + if (!m && (strict & RT6_LOOKUP_F_IFACE)) return -1; #ifdef CONFIG_IPV6_ROUTER_PREF m |= IPV6_DECODE_PREF(IPV6_EXTRACT_PREF(rt->rt6i_flags)) << 2; @@ -348,7 +345,7 @@ static int rt6_score_route(struct rt6_info *rt, int oif, n = rt6_check_neigh(rt); if (n > 1) m |= 16; - else if (!n && strict & RT6_SELECT_F_REACHABLE) + else if (!n && strict & RT6_LOOKUP_F_REACHABLE) return -1; return m; } @@ -388,7 +385,7 @@ static struct rt6_info *rt6_select(struct rt6_info **head, int oif, } if (!match && - (strict & RT6_SELECT_F_REACHABLE) && + (strict & RT6_LOOKUP_F_REACHABLE) && last && last != rt0) { /* no entries matched; do round-robin */ static DEFINE_SPINLOCK(lock); @@ -511,7 +508,7 @@ static struct rt6_info *ip6_pol_route_lookup(struct fib6_table *table, fn = fib6_lookup(&table->tb6_root, &fl->fl6_dst, &fl->fl6_src); restart: rt = fn->leaf; - rt = rt6_device_match(rt, fl->oif, flags & RT6_F_STRICT); + rt = rt6_device_match(rt, fl->oif, flags); BACKTRACK(&fl->fl6_src); dst_hold(&rt->u.dst); out: @@ -537,7 +534,7 @@ struct rt6_info *rt6_lookup(struct in6_addr *daddr, struct in6_addr *saddr, }, }; struct dst_entry *dst; - int flags = strict ? RT6_F_STRICT : 0; + int flags = strict ? RT6_LOOKUP_F_IFACE : 0; dst = fib6_rule_lookup(&fl, flags, ip6_pol_route_lookup); if (dst->error == 0) @@ -633,10 +630,9 @@ static struct rt6_info *ip6_pol_route_input(struct fib6_table *table, int strict = 0; int attempts = 3; int err; - int reachable = RT6_SELECT_F_REACHABLE; + int reachable = RT6_LOOKUP_F_REACHABLE; - if (flags & RT6_F_STRICT) - strict = RT6_SELECT_F_IFACE; + strict |= flags & RT6_LOOKUP_F_IFACE; relookup: read_lock_bh(&table->tb6_lock); @@ -712,10 +708,7 @@ void ip6_route_input(struct sk_buff *skb) }, .proto = iph->nexthdr, }; - int flags = 0; - - if (rt6_need_strict(&iph->daddr)) - flags |= RT6_F_STRICT; + int flags = rt6_need_strict(&iph->daddr) ? RT6_LOOKUP_F_IFACE : 0; skb->dst = fib6_rule_lookup(&fl, flags, ip6_pol_route_input); } @@ -728,10 +721,9 @@ static struct rt6_info *ip6_pol_route_output(struct fib6_table *table, int strict = 0; int attempts = 3; int err; - int reachable = RT6_SELECT_F_REACHABLE; + int reachable = RT6_LOOKUP_F_REACHABLE; - if (flags & RT6_F_STRICT) - strict = RT6_SELECT_F_IFACE; + strict |= flags & RT6_LOOKUP_F_IFACE; relookup: read_lock_bh(&table->tb6_lock); @@ -797,7 +789,7 @@ struct dst_entry * ip6_route_output(struct sock *sk, struct flowi *fl) int flags = 0; if (rt6_need_strict(&fl->fl6_dst)) - flags |= RT6_F_STRICT; + flags |= RT6_LOOKUP_F_IFACE; return fib6_rule_lookup(fl, flags, ip6_pol_route_output); } @@ -1362,7 +1354,7 @@ static struct rt6_info *ip6_route_redirect(struct in6_addr *dest, }, .gateway = *gateway, }; - int flags = rt6_need_strict(dest) ? RT6_F_STRICT : 0; + int flags = rt6_need_strict(dest) ? RT6_LOOKUP_F_IFACE : 0; return (struct rt6_info *)fib6_rule_lookup((struct flowi *)&rdfl, flags, __ip6_route_redirect); } -- cgit v0.10.2 From 7e49e6de30efa716614e280d97963c570f3acf29 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Fri, 22 Sep 2006 15:05:15 -0700 Subject: [XFRM]: Add XFRM_MODE_xxx for future use. Transformation mode is used as either IPsec transport or tunnel. It is required to add two more items, route optimization and inbound trigger for Mobile IPv6. Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h index 46a15c7..5154064 100644 --- a/include/linux/xfrm.h +++ b/include/linux/xfrm.h @@ -120,7 +120,9 @@ enum #define XFRM_MODE_TRANSPORT 0 #define XFRM_MODE_TUNNEL 1 -#define XFRM_MODE_MAX 2 +#define XFRM_MODE_ROUTEOPTIMIZATION 2 +#define XFRM_MODE_IN_TRIGGER 3 +#define XFRM_MODE_MAX 4 /* Netlink configuration messages. */ enum { @@ -247,7 +249,7 @@ struct xfrm_usersa_info { __u32 seq; __u32 reqid; __u16 family; - __u8 mode; /* 0=transport,1=tunnel */ + __u8 mode; /* XFRM_MODE_xxx */ __u8 replay_window; __u8 flags; #define XFRM_STATE_NOECN 1 diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 00bf86e..7627956 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -298,7 +298,7 @@ struct xfrm_tmpl __u32 reqid; -/* Mode: transport/tunnel */ +/* Mode: transport, tunnel etc. */ __u8 mode; /* Sharing mode: unique, this session only, this user only etc. */ diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c index 008e69d..9954297 100644 --- a/net/ipv4/ah4.c +++ b/net/ipv4/ah4.c @@ -265,7 +265,7 @@ static int ah_init_state(struct xfrm_state *x) goto error; x->props.header_len = XFRM_ALIGN8(sizeof(struct ip_auth_hdr) + ahp->icv_trunc_len); - if (x->props.mode) + if (x->props.mode == XFRM_MODE_TUNNEL) x->props.header_len += sizeof(struct iphdr); x->data = ahp; diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c index b428489..e87377e 100644 --- a/net/ipv4/esp4.c +++ b/net/ipv4/esp4.c @@ -248,7 +248,7 @@ static int esp_input(struct xfrm_state *x, struct sk_buff *skb) * as per draft-ietf-ipsec-udp-encaps-06, * section 3.1.2 */ - if (!x->props.mode) + if (x->props.mode == XFRM_MODE_TRANSPORT) skb->ip_summed = CHECKSUM_UNNECESSARY; } @@ -267,7 +267,7 @@ static u32 esp4_get_max_size(struct xfrm_state *x, int mtu) struct esp_data *esp = x->data; u32 blksize = ALIGN(crypto_blkcipher_blocksize(esp->conf.tfm), 4); - if (x->props.mode) { + if (x->props.mode == XFRM_MODE_TUNNEL) { mtu = ALIGN(mtu + 2, blksize); } else { /* The worst case. */ @@ -383,7 +383,7 @@ static int esp_init_state(struct xfrm_state *x) if (crypto_blkcipher_setkey(tfm, esp->conf.key, esp->conf.key_len)) goto error; x->props.header_len = sizeof(struct ip_esp_hdr) + esp->conf.ivlen; - if (x->props.mode) + if (x->props.mode == XFRM_MODE_TUNNEL) x->props.header_len += sizeof(struct iphdr); if (x->encap) { struct xfrm_encap_tmpl *encap = x->encap; diff --git a/net/ipv4/ipcomp.c b/net/ipv4/ipcomp.c index 5bb9c9f..1734243 100644 --- a/net/ipv4/ipcomp.c +++ b/net/ipv4/ipcomp.c @@ -176,7 +176,7 @@ static int ipcomp_output(struct xfrm_state *x, struct sk_buff *skb) return 0; out_ok: - if (x->props.mode) + if (x->props.mode == XFRM_MODE_TUNNEL) ip_send_check(iph); return 0; } @@ -216,7 +216,7 @@ static struct xfrm_state *ipcomp_tunnel_create(struct xfrm_state *x) t->id.daddr.a4 = x->id.daddr.a4; memcpy(&t->sel, &x->sel, sizeof(t->sel)); t->props.family = AF_INET; - t->props.mode = 1; + t->props.mode = XFRM_MODE_TUNNEL; t->props.saddr.a4 = x->props.saddr.a4; t->props.flags = x->props.flags; @@ -416,7 +416,7 @@ static int ipcomp_init_state(struct xfrm_state *x) goto out; x->props.header_len = 0; - if (x->props.mode) + if (x->props.mode == XFRM_MODE_TUNNEL) x->props.header_len += sizeof(struct iphdr); mutex_lock(&ipcomp_resource_mutex); @@ -428,7 +428,7 @@ static int ipcomp_init_state(struct xfrm_state *x) goto error; mutex_unlock(&ipcomp_resource_mutex); - if (x->props.mode) { + if (x->props.mode == XFRM_MODE_TUNNEL) { err = ipcomp_tunnel_attach(x); if (err) goto error_tunnel; diff --git a/net/ipv4/xfrm4_input.c b/net/ipv4/xfrm4_input.c index 817ed84..040e847 100644 --- a/net/ipv4/xfrm4_input.c +++ b/net/ipv4/xfrm4_input.c @@ -106,7 +106,7 @@ int xfrm4_rcv_encap(struct sk_buff *skb, __u16 encap_type) if (x->mode->input(x, skb)) goto drop; - if (x->props.mode) { + if (x->props.mode == XFRM_MODE_TUNNEL) { decaps = 1; break; } diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c index 4a96a9e..5fd115f 100644 --- a/net/ipv4/xfrm4_output.c +++ b/net/ipv4/xfrm4_output.c @@ -54,7 +54,7 @@ static int xfrm4_output_one(struct sk_buff *skb) goto error_nolock; } - if (x->props.mode) { + if (x->props.mode == XFRM_MODE_TUNNEL) { err = xfrm4_tunnel_check_size(skb); if (err) goto error_nolock; @@ -85,7 +85,7 @@ static int xfrm4_output_one(struct sk_buff *skb) } dst = skb->dst; x = dst->xfrm; - } while (x && !x->props.mode); + } while (x && (x->props.mode != XFRM_MODE_TUNNEL)); IPCB(skb)->flags |= IPSKB_XFRM_TRANSFORMED; err = 0; diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c index 8f50eae..a5bed74 100644 --- a/net/ipv4/xfrm4_policy.c +++ b/net/ipv4/xfrm4_policy.c @@ -96,7 +96,7 @@ __xfrm4_bundle_create(struct xfrm_policy *policy, struct xfrm_state **xfrm, int dst1->next = dst_prev; dst_prev = dst1; - if (xfrm[i]->props.mode) { + if (xfrm[i]->props.mode != XFRM_MODE_TRANSPORT) { remote = xfrm[i]->id.daddr.a4; local = xfrm[i]->props.saddr.a4; tunnel = 1; diff --git a/net/ipv4/xfrm4_state.c b/net/ipv4/xfrm4_state.c index 81e1751..97b0c75 100644 --- a/net/ipv4/xfrm4_state.c +++ b/net/ipv4/xfrm4_state.c @@ -42,7 +42,7 @@ __xfrm4_init_tempsel(struct xfrm_state *x, struct flowi *fl, x->props.saddr = tmpl->saddr; if (x->props.saddr.a4 == 0) x->props.saddr.a4 = saddr->a4; - if (tmpl->mode && x->props.saddr.a4 == 0) { + if (tmpl->mode == XFRM_MODE_TUNNEL && x->props.saddr.a4 == 0) { struct rtable *rt; struct flowi fl_tunnel = { .nl_u = { diff --git a/net/ipv4/xfrm4_tunnel.c b/net/ipv4/xfrm4_tunnel.c index f8ceaa1..f110af5 100644 --- a/net/ipv4/xfrm4_tunnel.c +++ b/net/ipv4/xfrm4_tunnel.c @@ -28,7 +28,7 @@ static int ipip_xfrm_rcv(struct xfrm_state *x, struct sk_buff *skb) static int ipip_init_state(struct xfrm_state *x) { - if (!x->props.mode) + if (x->props.mode != XFRM_MODE_TUNNEL) return -EINVAL; if (x->encap) diff --git a/net/ipv6/ah6.c b/net/ipv6/ah6.c index 00ffa7b..60954fc 100644 --- a/net/ipv6/ah6.c +++ b/net/ipv6/ah6.c @@ -398,7 +398,7 @@ static int ah6_init_state(struct xfrm_state *x) goto error; x->props.header_len = XFRM_ALIGN8(sizeof(struct ipv6_auth_hdr) + ahp->icv_trunc_len); - if (x->props.mode) + if (x->props.mode == XFRM_MODE_TUNNEL) x->props.header_len += sizeof(struct ipv6hdr); x->data = ahp; diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c index 2ebfd28..2b8e52e 100644 --- a/net/ipv6/esp6.c +++ b/net/ipv6/esp6.c @@ -237,7 +237,7 @@ static u32 esp6_get_max_size(struct xfrm_state *x, int mtu) struct esp_data *esp = x->data; u32 blksize = ALIGN(crypto_blkcipher_blocksize(esp->conf.tfm), 4); - if (x->props.mode) { + if (x->props.mode == XFRM_MODE_TUNNEL) { mtu = ALIGN(mtu + 2, blksize); } else { /* The worst case. */ @@ -358,7 +358,7 @@ static int esp6_init_state(struct xfrm_state *x) if (crypto_blkcipher_setkey(tfm, esp->conf.key, esp->conf.key_len)) goto error; x->props.header_len = sizeof(struct ipv6_esp_hdr) + esp->conf.ivlen; - if (x->props.mode) + if (x->props.mode == XFRM_MODE_TUNNEL) x->props.header_len += sizeof(struct ipv6hdr); x->data = esp; return 0; diff --git a/net/ipv6/ipcomp6.c b/net/ipv6/ipcomp6.c index a81e9e9..19eba8d 100644 --- a/net/ipv6/ipcomp6.c +++ b/net/ipv6/ipcomp6.c @@ -212,7 +212,7 @@ static struct xfrm_state *ipcomp6_tunnel_create(struct xfrm_state *x) memcpy(t->id.daddr.a6, x->id.daddr.a6, sizeof(struct in6_addr)); memcpy(&t->sel, &x->sel, sizeof(t->sel)); t->props.family = AF_INET6; - t->props.mode = 1; + t->props.mode = XFRM_MODE_TUNNEL; memcpy(t->props.saddr.a6, x->props.saddr.a6, sizeof(struct in6_addr)); if (xfrm_init_state(t)) @@ -417,7 +417,7 @@ static int ipcomp6_init_state(struct xfrm_state *x) goto out; x->props.header_len = 0; - if (x->props.mode) + if (x->props.mode == XFRM_MODE_TUNNEL) x->props.header_len += sizeof(struct ipv6hdr); mutex_lock(&ipcomp6_resource_mutex); @@ -429,7 +429,7 @@ static int ipcomp6_init_state(struct xfrm_state *x) goto error; mutex_unlock(&ipcomp6_resource_mutex); - if (x->props.mode) { + if (x->props.mode == XFRM_MODE_TUNNEL) { err = ipcomp6_tunnel_attach(x); if (err) goto error_tunnel; diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c index 0405d74..ee2f6b3 100644 --- a/net/ipv6/xfrm6_input.c +++ b/net/ipv6/xfrm6_input.c @@ -72,7 +72,7 @@ int xfrm6_rcv_spi(struct sk_buff *skb, u32 spi) if (x->mode->input(x, skb)) goto drop; - if (x->props.mode) { /* XXX */ + if (x->props.mode == XFRM_MODE_TUNNEL) { /* XXX */ decaps = 1; break; } diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c index 6d11174..26f1886 100644 --- a/net/ipv6/xfrm6_output.c +++ b/net/ipv6/xfrm6_output.c @@ -47,7 +47,7 @@ static int xfrm6_output_one(struct sk_buff *skb) goto error_nolock; } - if (x->props.mode) { + if (x->props.mode == XFRM_MODE_TUNNEL) { err = xfrm6_tunnel_check_size(skb); if (err) goto error_nolock; @@ -80,7 +80,7 @@ static int xfrm6_output_one(struct sk_buff *skb) } dst = skb->dst; x = dst->xfrm; - } while (x && !x->props.mode); + } while (x && (x->props.mode != XFRM_MODE_TUNNEL)); IP6CB(skb)->flags |= IP6SKB_XFRM_TRANSFORMED; err = 0; diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c index 73cd250..81355bb 100644 --- a/net/ipv6/xfrm6_policy.c +++ b/net/ipv6/xfrm6_policy.c @@ -114,7 +114,7 @@ __xfrm6_bundle_create(struct xfrm_policy *policy, struct xfrm_state **xfrm, int dst1->next = dst_prev; dst_prev = dst1; - if (xfrm[i]->props.mode) { + if (xfrm[i]->props.mode != XFRM_MODE_TRANSPORT) { remote = (struct in6_addr*)&xfrm[i]->id.daddr; local = (struct in6_addr*)&xfrm[i]->props.saddr; tunnel = 1; diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c index b33296b..a1a1f54 100644 --- a/net/ipv6/xfrm6_state.c +++ b/net/ipv6/xfrm6_state.c @@ -42,7 +42,7 @@ __xfrm6_init_tempsel(struct xfrm_state *x, struct flowi *fl, memcpy(&x->props.saddr, &tmpl->saddr, sizeof(x->props.saddr)); if (ipv6_addr_any((struct in6_addr*)&x->props.saddr)) memcpy(&x->props.saddr, saddr, sizeof(x->props.saddr)); - if (tmpl->mode && ipv6_addr_any((struct in6_addr*)&x->props.saddr)) { + if (tmpl->mode == XFRM_MODE_TUNNEL && ipv6_addr_any((struct in6_addr*)&x->props.saddr)) { struct rt6_info *rt; struct flowi fl_tunnel = { .nl_u = { diff --git a/net/ipv6/xfrm6_tunnel.c b/net/ipv6/xfrm6_tunnel.c index c8f9369..59685ee 100644 --- a/net/ipv6/xfrm6_tunnel.c +++ b/net/ipv6/xfrm6_tunnel.c @@ -307,7 +307,7 @@ static int xfrm6_tunnel_err(struct sk_buff *skb, struct inet6_skb_parm *opt, static int xfrm6_tunnel_init_state(struct xfrm_state *x) { - if (!x->props.mode) + if (x->props.mode != XFRM_MODE_TUNNEL) return -EINVAL; if (x->encap) diff --git a/net/key/af_key.c b/net/key/af_key.c index 797c744..19e047b 100644 --- a/net/key/af_key.c +++ b/net/key/af_key.c @@ -1765,7 +1765,7 @@ parse_ipsecrequest(struct xfrm_policy *xp, struct sadb_x_ipsecrequest *rq) } /* addresses present only in tunnel mode */ - if (t->mode) { + if (t->mode == XFRM_MODE_TUNNEL) { switch (xp->family) { case AF_INET: sin = (void*)(rq+1); @@ -1997,7 +1997,7 @@ static void pfkey_xfrm_policy2msg(struct sk_buff *skb, struct xfrm_policy *xp, i int req_size; req_size = sizeof(struct sadb_x_ipsecrequest); - if (t->mode) + if (t->mode == XFRM_MODE_TUNNEL) req_size += 2*socklen; else size -= 2*socklen; @@ -2013,7 +2013,7 @@ static void pfkey_xfrm_policy2msg(struct sk_buff *skb, struct xfrm_policy *xp, i if (t->optional) rq->sadb_x_ipsecrequest_level = IPSEC_LEVEL_USE; rq->sadb_x_ipsecrequest_reqid = t->reqid; - if (t->mode) { + if (t->mode == XFRM_MODE_TUNNEL) { switch (xp->family) { case AF_INET: sin = (void*)(rq+1); diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 32c963c..a0d5897 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -779,7 +779,7 @@ xfrm_tmpl_resolve(struct xfrm_policy *policy, struct flowi *fl, xfrm_address_t *local = saddr; struct xfrm_tmpl *tmpl = &policy->xfrm_vec[i]; - if (tmpl->mode) { + if (tmpl->mode == XFRM_MODE_TUNNEL) { remote = &tmpl->id.daddr; local = &tmpl->saddr; } @@ -1005,7 +1005,8 @@ xfrm_state_ok(struct xfrm_tmpl *tmpl, struct xfrm_state *x, (x->props.reqid == tmpl->reqid || !tmpl->reqid) && x->props.mode == tmpl->mode && (tmpl->aalgos & (1<props.aalgo)) && - !(x->props.mode && xfrm_state_addr_cmp(tmpl, x, family)); + !(x->props.mode != XFRM_MODE_TRANSPORT && + xfrm_state_addr_cmp(tmpl, x, family)); } static inline int @@ -1015,14 +1016,14 @@ xfrm_policy_ok(struct xfrm_tmpl *tmpl, struct sec_path *sp, int start, int idx = start; if (tmpl->optional) { - if (!tmpl->mode) + if (tmpl->mode == XFRM_MODE_TRANSPORT) return start; } else start = -1; for (; idx < sp->len; idx++) { if (xfrm_state_ok(tmpl, sp->xvec[idx], family)) return ++idx; - if (sp->xvec[idx]->props.mode) + if (sp->xvec[idx]->props.mode != XFRM_MODE_TRANSPORT) break; } return start; @@ -1047,7 +1048,7 @@ EXPORT_SYMBOL(xfrm_decode_session); static inline int secpath_has_tunnel(struct sec_path *sp, int k) { for (; k < sp->len; k++) { - if (sp->xvec[k]->props.mode) + if (sp->xvec[k]->props.mode != XFRM_MODE_TRANSPORT) return 1; } diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index f70e158..0d580ac 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -174,8 +174,8 @@ static int verify_newsa_info(struct xfrm_usersa_info *p, err = -EINVAL; switch (p->mode) { - case 0: - case 1: + case XFRM_MODE_TRANSPORT: + case XFRM_MODE_TUNNEL: break; default: -- cgit v0.10.2 From 5794708f11551b6d19b10673abf4b0202f66b44d Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Fri, 22 Sep 2006 15:06:24 -0700 Subject: [XFRM]: Introduce a helper to compare id protocol. Put the helper to header for future use. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 7627956..5b364b0 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -9,6 +9,7 @@ #include #include #include +#include #include #include @@ -835,6 +836,11 @@ static inline int xfrm_state_kern(struct xfrm_state *x) return atomic_read(&x->tunnel_users); } +static inline int xfrm_id_proto_match(u8 proto, u8 userproto) +{ + return (userproto == IPSEC_PROTO_ANY || proto == userproto); +} + /* * xfrm algorithm information */ diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 1c79608..34c038c 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -294,7 +294,7 @@ void xfrm_state_flush(u8 proto) restart: list_for_each_entry(x, xfrm_state_bydst+i, bydst) { if (!xfrm_state_kern(x) && - (proto == IPSEC_PROTO_ANY || x->id.proto == proto)) { + xfrm_id_proto_match(x->id.proto, proto)) { xfrm_state_hold(x); spin_unlock_bh(&xfrm_state_lock); @@ -772,7 +772,7 @@ int xfrm_state_walk(u8 proto, int (*func)(struct xfrm_state *, int, void*), spin_lock_bh(&xfrm_state_lock); for (i = 0; i < XFRM_DST_HSIZE; i++) { list_for_each_entry(x, xfrm_state_bydst+i, bydst) { - if (proto == IPSEC_PROTO_ANY || x->id.proto == proto) + if (xfrm_id_proto_match(x->id.proto, proto)) count++; } } @@ -783,7 +783,7 @@ int xfrm_state_walk(u8 proto, int (*func)(struct xfrm_state *, int, void*), for (i = 0; i < XFRM_DST_HSIZE; i++) { list_for_each_entry(x, xfrm_state_bydst+i, bydst) { - if (proto != IPSEC_PROTO_ANY && x->id.proto != proto) + if (!xfrm_id_proto_match(x->id.proto, proto)) continue; err = func(x, --count, data); if (err) -- cgit v0.10.2 From dc00a525603650a1471c823a1e48c6505c2f9765 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 17:49:52 -0700 Subject: [XFRM] STATE: Allow non IPsec protocol. It will be added two more transformation protocols (routing header and destination options header) for Mobile IPv6. xfrm_id_proto_match() can be handle zero as all, IPSEC_PROTO_ANY as all IPsec and otherwise as exact one. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 5b364b0..2a7d213 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -838,7 +838,10 @@ static inline int xfrm_state_kern(struct xfrm_state *x) static inline int xfrm_id_proto_match(u8 proto, u8 userproto) { - return (userproto == IPSEC_PROTO_ANY || proto == userproto); + return (!userproto || proto == userproto || + (userproto == IPSEC_PROTO_ANY && (proto == IPPROTO_AH || + proto == IPPROTO_ESP || + proto == IPPROTO_COMP))); } /* diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 0d580ac..41f3d51 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -542,7 +542,7 @@ static int xfrm_dump_sa(struct sk_buff *skb, struct netlink_callback *cb) info.nlmsg_flags = NLM_F_MULTI; info.this_idx = 0; info.start_idx = cb->args[0]; - (void) xfrm_state_walk(IPSEC_PROTO_ANY, dump_one_state, &info); + (void) xfrm_state_walk(0, dump_one_state, &info); cb->args[0] = info.this_idx; return skb->len; -- cgit v0.10.2 From 622dc8281a80374873686514e46f852093d91106 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 17:52:01 -0700 Subject: [XFRM]: Expand XFRM_MAX_DEPTH for route optimization. XFRM_MAX_DEPTH is a limit of transformation states to be applied to the same flow. Two more extension headers are used by Mobile IPv6 transformation. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 2a7d213..aa3be68 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -314,7 +314,7 @@ struct xfrm_tmpl __u32 calgos; }; -#define XFRM_MAX_DEPTH 4 +#define XFRM_MAX_DEPTH 6 struct xfrm_policy { -- cgit v0.10.2 From 6c44e6b7ab500d7e3e3f406c83325671be51a752 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 17:53:57 -0700 Subject: [XFRM] STATE: Add source address list. Support source address based searching. Mobile IPv6 will use it. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index aa3be68..88145e3 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -95,6 +95,7 @@ struct xfrm_state { /* Note: bydst is re-used during gc */ struct list_head bydst; + struct list_head bysrc; struct list_head byspi; atomic_t refcnt; @@ -236,6 +237,7 @@ extern int __xfrm_state_delete(struct xfrm_state *x); struct xfrm_state_afinfo { unsigned short family; struct list_head *state_bydst; + struct list_head *state_bysrc; struct list_head *state_byspi; int (*init_flags)(struct xfrm_state *x); void (*init_tempsel)(struct xfrm_state *x, struct flowi *fl, @@ -421,6 +423,30 @@ unsigned xfrm_dst_hash(xfrm_address_t *addr, unsigned short family) } static __inline__ +unsigned __xfrm4_src_hash(xfrm_address_t *addr) +{ + return __xfrm4_dst_hash(addr); +} + +static __inline__ +unsigned __xfrm6_src_hash(xfrm_address_t *addr) +{ + return __xfrm6_dst_hash(addr); +} + +static __inline__ +unsigned xfrm_src_hash(xfrm_address_t *addr, unsigned short family) +{ + switch (family) { + case AF_INET: + return __xfrm4_src_hash(addr); + case AF_INET6: + return __xfrm6_src_hash(addr); + } + return 0; +} + +static __inline__ unsigned __xfrm4_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto) { unsigned h; diff --git a/net/ipv4/xfrm4_state.c b/net/ipv4/xfrm4_state.c index 97b0c75..c56b258 100644 --- a/net/ipv4/xfrm4_state.c +++ b/net/ipv4/xfrm4_state.c @@ -122,6 +122,9 @@ __xfrm4_find_acq(u8 mode, u32 reqid, u8 proto, add_timer(&x0->timer); xfrm_state_hold(x0); list_add_tail(&x0->bydst, xfrm4_state_afinfo.state_bydst+h); + h = __xfrm4_src_hash(saddr); + xfrm_state_hold(x0); + list_add_tail(&x0->bysrc, xfrm4_state_afinfo.state_bysrc+h); wake_up(&km_waitq); } if (x0) diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c index a1a1f54..2fb0785 100644 --- a/net/ipv6/xfrm6_state.c +++ b/net/ipv6/xfrm6_state.c @@ -126,6 +126,9 @@ __xfrm6_find_acq(u8 mode, u32 reqid, u8 proto, add_timer(&x0->timer); xfrm_state_hold(x0); list_add_tail(&x0->bydst, xfrm6_state_afinfo.state_bydst+h); + h = __xfrm6_src_hash(saddr); + xfrm_state_hold(x0); + list_add_tail(&x0->bysrc, xfrm6_state_afinfo.state_bysrc+h); wake_up(&km_waitq); } if (x0) diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 34c038c..2a99928 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -45,6 +45,7 @@ static DEFINE_SPINLOCK(xfrm_state_lock); * Also, it can be used by ah/esp icmp error handler to find offending SA. */ static struct list_head xfrm_state_bydst[XFRM_DST_HSIZE]; +static struct list_head xfrm_state_bysrc[XFRM_DST_HSIZE]; static struct list_head xfrm_state_byspi[XFRM_DST_HSIZE]; DECLARE_WAIT_QUEUE_HEAD(km_waitq); @@ -200,6 +201,7 @@ struct xfrm_state *xfrm_state_alloc(void) atomic_set(&x->refcnt, 1); atomic_set(&x->tunnel_users, 0); INIT_LIST_HEAD(&x->bydst); + INIT_LIST_HEAD(&x->bysrc); INIT_LIST_HEAD(&x->byspi); init_timer(&x->timer); x->timer.function = xfrm_timer_handler; @@ -240,6 +242,8 @@ int __xfrm_state_delete(struct xfrm_state *x) spin_lock(&xfrm_state_lock); list_del(&x->bydst); __xfrm_state_put(x); + list_del(&x->bysrc); + __xfrm_state_put(x); if (x->id.spi) { list_del(&x->byspi); __xfrm_state_put(x); @@ -415,6 +419,8 @@ xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t *saddr, x->km.state = XFRM_STATE_ACQ; list_add_tail(&x->bydst, xfrm_state_bydst+h); xfrm_state_hold(x); + list_add_tail(&x->bysrc, xfrm_state_bysrc+h); + xfrm_state_hold(x); if (x->id.spi) { h = xfrm_spi_hash(&x->id.daddr, x->id.spi, x->id.proto, family); list_add(&x->byspi, xfrm_state_byspi+h); @@ -448,11 +454,19 @@ static void __xfrm_state_insert(struct xfrm_state *x) list_add(&x->bydst, xfrm_state_bydst+h); xfrm_state_hold(x); - h = xfrm_spi_hash(&x->id.daddr, x->id.spi, x->id.proto, x->props.family); + h = xfrm_src_hash(&x->props.saddr, x->props.family); - list_add(&x->byspi, xfrm_state_byspi+h); + list_add(&x->bysrc, xfrm_state_bysrc+h); xfrm_state_hold(x); + if (xfrm_id_proto_match(x->id.proto, IPSEC_PROTO_ANY)) { + h = xfrm_spi_hash(&x->id.daddr, x->id.spi, x->id.proto, + x->props.family); + + list_add(&x->byspi, xfrm_state_byspi+h); + xfrm_state_hold(x); + } + if (!mod_timer(&x->timer, jiffies + HZ)) xfrm_state_hold(x); @@ -1075,6 +1089,7 @@ int xfrm_state_register_afinfo(struct xfrm_state_afinfo *afinfo) err = -ENOBUFS; else { afinfo->state_bydst = xfrm_state_bydst; + afinfo->state_bysrc = xfrm_state_bysrc; afinfo->state_byspi = xfrm_state_byspi; xfrm_state_afinfo[afinfo->family] = afinfo; } @@ -1097,6 +1112,7 @@ int xfrm_state_unregister_afinfo(struct xfrm_state_afinfo *afinfo) else { xfrm_state_afinfo[afinfo->family] = NULL; afinfo->state_byspi = NULL; + afinfo->state_bysrc = NULL; afinfo->state_bydst = NULL; } } @@ -1218,6 +1234,7 @@ void __init xfrm_state_init(void) for (i=0; i Date: Wed, 23 Aug 2006 17:56:04 -0700 Subject: [XFRM] STATE: Search by address using source address list. This is a support to search transformation states by its addresses by using source address list for Mobile IPv6 usage. To use it from user-space, it is also added a message type for source address as a xfrm state option. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h index 5154064..66343d3 100644 --- a/include/linux/xfrm.h +++ b/include/linux/xfrm.h @@ -234,6 +234,7 @@ enum xfrm_attr_type_t { XFRMA_REPLAY_VAL, XFRMA_REPLAY_THRESH, XFRMA_ETIMER_THRESH, + XFRMA_SRCADDR, /* xfrm_address_t */ __XFRMA_MAX #define XFRMA_MAX (__XFRMA_MAX - 1) diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 88145e3..d9c40e7 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -244,6 +244,7 @@ struct xfrm_state_afinfo { struct xfrm_tmpl *tmpl, xfrm_address_t *daddr, xfrm_address_t *saddr); struct xfrm_state *(*state_lookup)(xfrm_address_t *daddr, u32 spi, u8 proto); + struct xfrm_state *(*state_lookup_byaddr)(xfrm_address_t *daddr, xfrm_address_t *saddr, u8 proto); struct xfrm_state *(*find_acq)(u8 mode, u32 reqid, u8 proto, xfrm_address_t *daddr, xfrm_address_t *saddr, int create); @@ -937,6 +938,7 @@ extern void xfrm_state_insert(struct xfrm_state *x); extern int xfrm_state_add(struct xfrm_state *x); extern int xfrm_state_update(struct xfrm_state *x); extern struct xfrm_state *xfrm_state_lookup(xfrm_address_t *daddr, u32 spi, u8 proto, unsigned short family); +extern struct xfrm_state *xfrm_state_lookup_byaddr(xfrm_address_t *daddr, xfrm_address_t *saddr, u8 proto, unsigned short family); extern struct xfrm_state *xfrm_find_acq_byseq(u32 seq); extern int xfrm_state_delete(struct xfrm_state *x); extern void xfrm_state_flush(u8 proto); diff --git a/net/ipv4/xfrm4_state.c b/net/ipv4/xfrm4_state.c index c56b258..616be13 100644 --- a/net/ipv4/xfrm4_state.c +++ b/net/ipv4/xfrm4_state.c @@ -80,6 +80,14 @@ __xfrm4_state_lookup(xfrm_address_t *daddr, u32 spi, u8 proto) return NULL; } +/* placeholder until ipv4's code is written */ +static struct xfrm_state * +__xfrm4_state_lookup_byaddr(xfrm_address_t *daddr, xfrm_address_t *saddr, + u8 proto) +{ + return NULL; +} + static struct xfrm_state * __xfrm4_find_acq(u8 mode, u32 reqid, u8 proto, xfrm_address_t *daddr, xfrm_address_t *saddr, @@ -137,6 +145,7 @@ static struct xfrm_state_afinfo xfrm4_state_afinfo = { .init_flags = xfrm4_init_flags, .init_tempsel = __xfrm4_init_tempsel, .state_lookup = __xfrm4_state_lookup, + .state_lookup_byaddr = __xfrm4_state_lookup_byaddr, .find_acq = __xfrm4_find_acq, }; diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c index 2fb0785..9c95b9d 100644 --- a/net/ipv6/xfrm6_state.c +++ b/net/ipv6/xfrm6_state.c @@ -64,6 +64,26 @@ __xfrm6_init_tempsel(struct xfrm_state *x, struct flowi *fl, } static struct xfrm_state * +__xfrm6_state_lookup_byaddr(xfrm_address_t *daddr, xfrm_address_t *saddr, + u8 proto) +{ + struct xfrm_state *x = NULL; + unsigned h; + + h = __xfrm6_src_hash(saddr); + list_for_each_entry(x, xfrm6_state_afinfo.state_bysrc+h, bysrc) { + if (x->props.family == AF_INET6 && + ipv6_addr_equal((struct in6_addr *)daddr, (struct in6_addr *)x->id.daddr.a6) && + ipv6_addr_equal((struct in6_addr *)saddr, (struct in6_addr *)x->props.saddr.a6) && + proto == x->id.proto) { + xfrm_state_hold(x); + return x; + } + } + return NULL; +} + +static struct xfrm_state * __xfrm6_state_lookup(xfrm_address_t *daddr, u32 spi, u8 proto) { unsigned h = __xfrm6_spi_hash(daddr, spi, proto); @@ -140,6 +160,7 @@ static struct xfrm_state_afinfo xfrm6_state_afinfo = { .family = AF_INET6, .init_tempsel = __xfrm6_init_tempsel, .state_lookup = __xfrm6_state_lookup, + .state_lookup_byaddr = __xfrm6_state_lookup_byaddr, .find_acq = __xfrm6_find_acq, }; diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 2a99928..11f480b 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -487,6 +487,16 @@ void xfrm_state_insert(struct xfrm_state *x) } EXPORT_SYMBOL(xfrm_state_insert); +static inline struct xfrm_state * +__xfrm_state_locate(struct xfrm_state_afinfo *afinfo, struct xfrm_state *x, + int use_spi) +{ + if (use_spi) + return afinfo->state_lookup(&x->id.daddr, x->id.spi, x->id.proto); + else + return afinfo->state_lookup_byaddr(&x->id.daddr, &x->props.saddr, x->id.proto); +} + static struct xfrm_state *__xfrm_find_acq_byseq(u32 seq); int xfrm_state_add(struct xfrm_state *x) @@ -495,6 +505,7 @@ int xfrm_state_add(struct xfrm_state *x) struct xfrm_state *x1; int family; int err; + int use_spi = xfrm_id_proto_match(x->id.proto, IPSEC_PROTO_ANY); family = x->props.family; afinfo = xfrm_state_get_afinfo(family); @@ -503,7 +514,7 @@ int xfrm_state_add(struct xfrm_state *x) spin_lock_bh(&xfrm_state_lock); - x1 = afinfo->state_lookup(&x->id.daddr, x->id.spi, x->id.proto); + x1 = __xfrm_state_locate(afinfo, x, use_spi); if (x1) { xfrm_state_put(x1); x1 = NULL; @@ -511,7 +522,7 @@ int xfrm_state_add(struct xfrm_state *x) goto out; } - if (x->km.seq) { + if (use_spi && x->km.seq) { x1 = __xfrm_find_acq_byseq(x->km.seq); if (x1 && xfrm_addr_cmp(&x1->id.daddr, &x->id.daddr, family)) { xfrm_state_put(x1); @@ -519,7 +530,7 @@ int xfrm_state_add(struct xfrm_state *x) } } - if (!x1) + if (use_spi && !x1) x1 = afinfo->find_acq( x->props.mode, x->props.reqid, x->id.proto, &x->id.daddr, &x->props.saddr, 0); @@ -548,13 +559,14 @@ int xfrm_state_update(struct xfrm_state *x) struct xfrm_state_afinfo *afinfo; struct xfrm_state *x1; int err; + int use_spi = xfrm_id_proto_match(x->id.proto, IPSEC_PROTO_ANY); afinfo = xfrm_state_get_afinfo(x->props.family); if (unlikely(afinfo == NULL)) return -EAFNOSUPPORT; spin_lock_bh(&xfrm_state_lock); - x1 = afinfo->state_lookup(&x->id.daddr, x->id.spi, x->id.proto); + x1 = __xfrm_state_locate(afinfo, x, use_spi); err = -ESRCH; if (!x1) @@ -675,6 +687,23 @@ xfrm_state_lookup(xfrm_address_t *daddr, u32 spi, u8 proto, EXPORT_SYMBOL(xfrm_state_lookup); struct xfrm_state * +xfrm_state_lookup_byaddr(xfrm_address_t *daddr, xfrm_address_t *saddr, + u8 proto, unsigned short family) +{ + struct xfrm_state *x; + struct xfrm_state_afinfo *afinfo = xfrm_state_get_afinfo(family); + if (!afinfo) + return NULL; + + spin_lock_bh(&xfrm_state_lock); + x = afinfo->state_lookup_byaddr(daddr, saddr, proto); + spin_unlock_bh(&xfrm_state_lock); + xfrm_state_put_afinfo(afinfo); + return x; +} +EXPORT_SYMBOL(xfrm_state_lookup_byaddr); + +struct xfrm_state * xfrm_find_acq(u8 mode, u32 reqid, u8 proto, xfrm_address_t *daddr, xfrm_address_t *saddr, int create, unsigned short family) diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 41f3d51..b5f8ab7 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -87,6 +87,22 @@ static int verify_encap_tmpl(struct rtattr **xfrma) return 0; } +static int verify_one_addr(struct rtattr **xfrma, enum xfrm_attr_type_t type, + xfrm_address_t **addrp) +{ + struct rtattr *rt = xfrma[type - 1]; + + if (!rt) + return 0; + + if ((rt->rta_len - sizeof(*rt)) < sizeof(**addrp)) + return -EINVAL; + + if (addrp) + *addrp = RTA_DATA(rt); + + return 0; +} static inline int verify_sec_ctx_len(struct rtattr **xfrma) { @@ -418,16 +434,48 @@ out: return err; } +static struct xfrm_state *xfrm_user_state_lookup(struct xfrm_usersa_id *p, + struct rtattr **xfrma, + int *errp) +{ + struct xfrm_state *x = NULL; + int err; + + if (xfrm_id_proto_match(p->proto, IPSEC_PROTO_ANY)) { + err = -ESRCH; + x = xfrm_state_lookup(&p->daddr, p->spi, p->proto, p->family); + } else { + xfrm_address_t *saddr = NULL; + + err = verify_one_addr(xfrma, XFRMA_SRCADDR, &saddr); + if (err) + goto out; + + if (!saddr) { + err = -EINVAL; + goto out; + } + + x = xfrm_state_lookup_byaddr(&p->daddr, saddr, p->proto, + p->family); + } + + out: + if (!x && errp) + *errp = err; + return x; +} + static int xfrm_del_sa(struct sk_buff *skb, struct nlmsghdr *nlh, void **xfrma) { struct xfrm_state *x; - int err; + int err = -ESRCH; struct km_event c; struct xfrm_usersa_id *p = NLMSG_DATA(nlh); - x = xfrm_state_lookup(&p->daddr, p->spi, p->proto, p->family); + x = xfrm_user_state_lookup(p, (struct rtattr **)xfrma, &err); if (x == NULL) - return -ESRCH; + return err; if ((err = security_xfrm_state_delete(x)) != 0) goto out; @@ -578,10 +626,9 @@ static int xfrm_get_sa(struct sk_buff *skb, struct nlmsghdr *nlh, void **xfrma) struct xfrm_usersa_id *p = NLMSG_DATA(nlh); struct xfrm_state *x; struct sk_buff *resp_skb; - int err; + int err = -ESRCH; - x = xfrm_state_lookup(&p->daddr, p->spi, p->proto, p->family); - err = -ESRCH; + x = xfrm_user_state_lookup(p, (struct rtattr **)xfrma, &err); if (x == NULL) goto out_noput; -- cgit v0.10.2 From aee5adb4307c4c63a4dc5f3b49984d76f8a71b5b Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 17:57:28 -0700 Subject: [XFRM] STATE: Add a hook to find offset to be inserted header in outbound. On current kernel, ip6_find_1stfragopt() is used by IPv6 IPsec to find offset to be inserted header in outbound for transport mode. (BTW, no usage may be needed for IPv4 case.) Mobile IPv6 requires another logic for routing header and destination options header respectively. This patch is common platform for the offset and adopts it to IPsec. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index d9c40e7..eed48f8 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -265,6 +265,7 @@ struct xfrm_type void (*destructor)(struct xfrm_state *); int (*input)(struct xfrm_state *, struct sk_buff *skb); int (*output)(struct xfrm_state *, struct sk_buff *pskb); + int (*hdr_offset)(struct xfrm_state *, struct sk_buff *, u8 **); /* Estimate maximal size of result of transformation of a dgram */ u32 (*get_max_size)(struct xfrm_state *, int size); }; @@ -960,6 +961,8 @@ extern u32 xfrm6_tunnel_alloc_spi(xfrm_address_t *saddr); extern void xfrm6_tunnel_free_spi(xfrm_address_t *saddr); extern u32 xfrm6_tunnel_spi_lookup(xfrm_address_t *saddr); extern int xfrm6_output(struct sk_buff *skb); +extern int xfrm6_find_1stfragopt(struct xfrm_state *x, struct sk_buff *skb, + u8 **prevhdr); #ifdef CONFIG_XFRM extern int xfrm4_rcv_encap(struct sk_buff *skb, __u16 encap_type); diff --git a/net/ipv6/ah6.c b/net/ipv6/ah6.c index 60954fc..6c0aa51 100644 --- a/net/ipv6/ah6.c +++ b/net/ipv6/ah6.c @@ -435,7 +435,8 @@ static struct xfrm_type ah6_type = .init_state = ah6_init_state, .destructor = ah6_destroy, .input = ah6_input, - .output = ah6_output + .output = ah6_output, + .hdr_offset = xfrm6_find_1stfragopt, }; static struct inet6_protocol ah6_protocol = { diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c index 2b8e52e..ae50b95 100644 --- a/net/ipv6/esp6.c +++ b/net/ipv6/esp6.c @@ -379,7 +379,8 @@ static struct xfrm_type esp6_type = .destructor = esp6_destroy, .get_max_size = esp6_get_max_size, .input = esp6_input, - .output = esp6_output + .output = esp6_output, + .hdr_offset = xfrm6_find_1stfragopt, }; static struct inet6_protocol esp6_protocol = { diff --git a/net/ipv6/ipcomp6.c b/net/ipv6/ipcomp6.c index 19eba8d..ad9c6e8 100644 --- a/net/ipv6/ipcomp6.c +++ b/net/ipv6/ipcomp6.c @@ -461,6 +461,7 @@ static struct xfrm_type ipcomp6_type = .destructor = ipcomp6_destroy, .input = ipcomp6_input, .output = ipcomp6_output, + .hdr_offset = xfrm6_find_1stfragopt, }; static struct inet6_protocol ipcomp6_protocol = diff --git a/net/ipv6/ipv6_syms.c b/net/ipv6/ipv6_syms.c index dd4d1ce..e1a7416 100644 --- a/net/ipv6/ipv6_syms.c +++ b/net/ipv6/ipv6_syms.c @@ -31,6 +31,7 @@ EXPORT_SYMBOL(ipv6_chk_addr); EXPORT_SYMBOL(in6_dev_finish_destroy); #ifdef CONFIG_XFRM EXPORT_SYMBOL(xfrm6_rcv); +EXPORT_SYMBOL(xfrm6_find_1stfragopt); #endif EXPORT_SYMBOL(rt6_lookup); EXPORT_SYMBOL(ipv6_push_nfrag_opts); diff --git a/net/ipv6/xfrm6_mode_transport.c b/net/ipv6/xfrm6_mode_transport.c index 711d713..a5dce21 100644 --- a/net/ipv6/xfrm6_mode_transport.c +++ b/net/ipv6/xfrm6_mode_transport.c @@ -35,7 +35,7 @@ static int xfrm6_transport_output(struct sk_buff *skb) skb_push(skb, x->props.header_len); iph = skb->nh.ipv6h; - hdr_len = ip6_find_1stfragopt(skb, &prevhdr); + hdr_len = x->type->hdr_offset(x, skb, &prevhdr); skb->nh.raw = prevhdr - x->props.header_len; skb->h.raw = skb->data + hdr_len; memmove(skb->data, iph, hdr_len); diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c index 26f1886..b4628fb 100644 --- a/net/ipv6/xfrm6_output.c +++ b/net/ipv6/xfrm6_output.c @@ -17,6 +17,12 @@ #include #include +int xfrm6_find_1stfragopt(struct xfrm_state *x, struct sk_buff *skb, + u8 **prevhdr) +{ + return ip6_find_1stfragopt(skb, prevhdr); +} + static int xfrm6_tunnel_check_size(struct sk_buff *skb) { int mtu, ret = 0; -- cgit v0.10.2 From 1d71627d699eca831c1fbfb66ea67bb1fba41415 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 17:59:44 -0700 Subject: [XFRM] STATE: Introduce route optimization mode. Route optimization is used with routing header and destination options header for Mobile IPv6. At outbound it makes header space like IPsec transport. At inbound it does nothing because exhdrs.c functions have responsibility to update skbuff information for these headers. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig index 14f0b33..1188d95 100644 --- a/net/ipv6/Kconfig +++ b/net/ipv6/Kconfig @@ -127,6 +127,13 @@ config INET6_XFRM_MODE_TUNNEL If unsure, say Y. +config INET6_XFRM_MODE_ROUTEOPTIMIZATION + tristate "IPv6: MIPv6 route optimization mode (EXPERIMENTAL)" + depends on IPV6 && EXPERIMENTAL + select XFRM + ---help--- + Support for MIPv6 route optimization mode. + config IPV6_TUNNEL tristate "IPv6: IPv6-in-IPv6 tunnel" select INET6_TUNNEL diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile index 9eebf60..87e912e 100644 --- a/net/ipv6/Makefile +++ b/net/ipv6/Makefile @@ -23,6 +23,7 @@ obj-$(CONFIG_INET6_XFRM_TUNNEL) += xfrm6_tunnel.o obj-$(CONFIG_INET6_TUNNEL) += tunnel6.o obj-$(CONFIG_INET6_XFRM_MODE_TRANSPORT) += xfrm6_mode_transport.o obj-$(CONFIG_INET6_XFRM_MODE_TUNNEL) += xfrm6_mode_tunnel.o +obj-$(CONFIG_INET6_XFRM_MODE_ROUTEOPTIMIZATION) += xfrm6_mode_ro.o obj-$(CONFIG_NETFILTER) += netfilter/ obj-$(CONFIG_IPV6_TUNNEL) += ip6_tunnel.o diff --git a/net/ipv6/xfrm6_mode_ro.c b/net/ipv6/xfrm6_mode_ro.c new file mode 100644 index 0000000..c11c335 --- /dev/null +++ b/net/ipv6/xfrm6_mode_ro.c @@ -0,0 +1,94 @@ +/* + * xfrm6_mode_ro.c - Route optimization mode for IPv6. + * + * Copyright (C)2003-2006 Helsinki University of Technology + * Copyright (C)2003-2006 USAGI/WIDE Project + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ +/* + * Authors: + * Noriaki TAKAMIYA @USAGI + * Masahide NAKAMURA @USAGI + */ + +#include +#include +#include +#include +#include +#include +#include + +/* Add route optimization header space. + * + * The IP header and mutable extension headers will be moved forward to make + * space for the route optimization header. + * + * On exit, skb->h will be set to the start of the encapsulation header to be + * filled in by x->type->output and skb->nh will be set to the nextheader field + * of the extension header directly preceding the encapsulation header, or in + * its absence, that of the top IP header. The value of skb->data will always + * point to the top IP header. + */ +static int xfrm6_ro_output(struct sk_buff *skb) +{ + struct xfrm_state *x = skb->dst->xfrm; + struct ipv6hdr *iph; + u8 *prevhdr; + int hdr_len; + + skb_push(skb, x->props.header_len); + iph = skb->nh.ipv6h; + + hdr_len = x->type->hdr_offset(x, skb, &prevhdr); + skb->nh.raw = prevhdr - x->props.header_len; + skb->h.raw = skb->data + hdr_len; + memmove(skb->data, iph, hdr_len); + return 0; +} + +/* + * Do nothing about routing optimization header unlike IPsec. + */ +static int xfrm6_ro_input(struct xfrm_state *x, struct sk_buff *skb) +{ + return 0; +} + +static struct xfrm_mode xfrm6_ro_mode = { + .input = xfrm6_ro_input, + .output = xfrm6_ro_output, + .owner = THIS_MODULE, + .encap = XFRM_MODE_ROUTEOPTIMIZATION, +}; + +static int __init xfrm6_ro_init(void) +{ + return xfrm_register_mode(&xfrm6_ro_mode, AF_INET6); +} + +static void __exit xfrm6_ro_exit(void) +{ + int err; + + err = xfrm_unregister_mode(&xfrm6_ro_mode, AF_INET6); + BUG_ON(err); +} + +module_init(xfrm6_ro_init); +module_exit(xfrm6_ro_exit); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_XFRM_MODE(AF_INET6, XFRM_MODE_ROUTEOPTIMIZATION); -- cgit v0.10.2 From f3bd484021d9486b826b422a017d75dd0bd258ad Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 18:00:48 -0700 Subject: [XFRM]: Restrict authentication algorithm only when inbound transformation protocol is IPsec. For Mobile IPv6 usage, routing header or destination options header is used and it doesn't require this comparison. It is checked only for IPsec template. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index a0d5897..f1cdcfb 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -1004,7 +1004,8 @@ xfrm_state_ok(struct xfrm_tmpl *tmpl, struct xfrm_state *x, (x->id.spi == tmpl->id.spi || !tmpl->id.spi) && (x->props.reqid == tmpl->reqid || !tmpl->reqid) && x->props.mode == tmpl->mode && - (tmpl->aalgos & (1<props.aalgo)) && + ((tmpl->aalgos & (1<props.aalgo)) || + !(xfrm_id_proto_match(tmpl->id.proto, IPSEC_PROTO_ANY))) && !(x->props.mode != XFRM_MODE_TRANSPORT && xfrm_state_addr_cmp(tmpl, x, family)); } -- cgit v0.10.2 From fbd9a5b47ee9c319ff0cae584391241ce78ffd6b Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 18:08:21 -0700 Subject: [XFRM] STATE: Common receive function for route optimization extension headers. XFRM_STATE_WILDRECV flag is introduced; the last resort state is set it and receives packet which is not route optimized but uses such extension headers i.e. Mobile IPv6 signaling (binding update and acknowledgement). A node enabled Mobile IPv6 adds the state. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h index 66343d3..a7c9e4c 100644 --- a/include/linux/xfrm.h +++ b/include/linux/xfrm.h @@ -256,6 +256,7 @@ struct xfrm_usersa_info { #define XFRM_STATE_NOECN 1 #define XFRM_STATE_DECAP_DSCP 2 #define XFRM_STATE_NOPMTUDISC 4 +#define XFRM_STATE_WILDRECV 8 }; struct xfrm_usersa_id { diff --git a/include/net/xfrm.h b/include/net/xfrm.h index eed48f8..0d735a5 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -955,6 +955,8 @@ extern int xfrm4_tunnel_register(struct xfrm_tunnel *handler); extern int xfrm4_tunnel_deregister(struct xfrm_tunnel *handler); extern int xfrm6_rcv_spi(struct sk_buff *skb, u32 spi); extern int xfrm6_rcv(struct sk_buff **pskb); +extern int xfrm6_input_addr(struct sk_buff *skb, xfrm_address_t *daddr, + xfrm_address_t *saddr, u8 proto); extern int xfrm6_tunnel_register(struct xfrm6_tunnel *handler); extern int xfrm6_tunnel_deregister(struct xfrm6_tunnel *handler); extern u32 xfrm6_tunnel_alloc_spi(xfrm_address_t *saddr); diff --git a/net/ipv6/ipv6_syms.c b/net/ipv6/ipv6_syms.c index e1a7416..7b7b90d 100644 --- a/net/ipv6/ipv6_syms.c +++ b/net/ipv6/ipv6_syms.c @@ -31,6 +31,7 @@ EXPORT_SYMBOL(ipv6_chk_addr); EXPORT_SYMBOL(in6_dev_finish_destroy); #ifdef CONFIG_XFRM EXPORT_SYMBOL(xfrm6_rcv); +EXPORT_SYMBOL(xfrm6_input_addr); EXPORT_SYMBOL(xfrm6_find_1stfragopt); #endif EXPORT_SYMBOL(rt6_lookup); diff --git a/net/ipv6/xfrm6_input.c b/net/ipv6/xfrm6_input.c index ee2f6b3..a40a057 100644 --- a/net/ipv6/xfrm6_input.c +++ b/net/ipv6/xfrm6_input.c @@ -138,3 +138,111 @@ int xfrm6_rcv(struct sk_buff **pskb) { return xfrm6_rcv_spi(*pskb, 0); } + +int xfrm6_input_addr(struct sk_buff *skb, xfrm_address_t *daddr, + xfrm_address_t *saddr, u8 proto) +{ + struct xfrm_state *x = NULL; + int wildcard = 0; + struct in6_addr any; + xfrm_address_t *xany; + struct xfrm_state *xfrm_vec_one = NULL; + int nh = 0; + int i = 0; + + ipv6_addr_set(&any, 0, 0, 0, 0); + xany = (xfrm_address_t *)&any; + + for (i = 0; i < 3; i++) { + xfrm_address_t *dst, *src; + switch (i) { + case 0: + dst = daddr; + src = saddr; + break; + case 1: + /* lookup state with wild-card source address */ + wildcard = 1; + dst = daddr; + src = xany; + break; + case 2: + default: + /* lookup state with wild-card addresses */ + wildcard = 1; /* XXX */ + dst = xany; + src = xany; + break; + } + + x = xfrm_state_lookup_byaddr(dst, src, proto, AF_INET6); + if (!x) + continue; + + spin_lock(&x->lock); + + if (wildcard) { + if ((x->props.flags & XFRM_STATE_WILDRECV) == 0) { + spin_unlock(&x->lock); + xfrm_state_put(x); + x = NULL; + continue; + } + } + + if (unlikely(x->km.state != XFRM_STATE_VALID)) { + spin_unlock(&x->lock); + xfrm_state_put(x); + x = NULL; + continue; + } + if (xfrm_state_check_expire(x)) { + spin_unlock(&x->lock); + xfrm_state_put(x); + x = NULL; + continue; + } + + nh = x->type->input(x, skb); + if (nh <= 0) { + spin_unlock(&x->lock); + xfrm_state_put(x); + x = NULL; + continue; + } + + x->curlft.bytes += skb->len; + x->curlft.packets++; + + spin_unlock(&x->lock); + + xfrm_vec_one = x; + break; + } + + if (!xfrm_vec_one) + goto drop; + + /* Allocate new secpath or COW existing one. */ + if (!skb->sp || atomic_read(&skb->sp->refcnt) != 1) { + struct sec_path *sp; + sp = secpath_dup(skb->sp); + if (!sp) + goto drop; + if (skb->sp) + secpath_put(skb->sp); + skb->sp = sp; + } + + if (1 + skb->sp->len > XFRM_MAX_DEPTH) + goto drop; + + skb->sp->xvec[skb->sp->len] = xfrm_vec_one; + skb->sp->len ++; + + return 1; +drop: + if (xfrm_vec_one) + xfrm_state_put(xfrm_vec_one); + return -1; +} diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 11f480b..f053715 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -352,6 +352,7 @@ xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t *saddr, list_for_each_entry(x, xfrm_state_bydst+h, bydst) { if (x->props.family == family && x->props.reqid == tmpl->reqid && + !(x->props.flags & XFRM_STATE_WILDRECV) && xfrm_state_addr_check(x, daddr, saddr, family) && tmpl->mode == x->props.mode && tmpl->id.proto == x->id.proto && -- cgit v0.10.2 From 9e51fd371a022318c5b64b831c43026e89bc4f75 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 18:09:09 -0700 Subject: [XFRM]: Rename secpath_has_tunnel to secpath_has_nontransport. On current kernel inbound transformation state is allowed transport and disallowed tunnel mode when mismatch is occurred between tempates and states. As the result of adding two more modes by Mobile IPv6, this function name is misleading. Inbound transformation can allow only transport mode when mismatch is occurred between template and secpath. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index f1cdcfb..56abb5c 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -1046,7 +1046,7 @@ xfrm_decode_session(struct sk_buff *skb, struct flowi *fl, unsigned short family } EXPORT_SYMBOL(xfrm_decode_session); -static inline int secpath_has_tunnel(struct sec_path *sp, int k) +static inline int secpath_has_nontransport(struct sec_path *sp, int k) { for (; k < sp->len; k++) { if (sp->xvec[k]->props.mode != XFRM_MODE_TRANSPORT) @@ -1087,7 +1087,7 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, xfrm_policy_lookup); if (!pol) - return !skb->sp || !secpath_has_tunnel(skb->sp, 0); + return !skb->sp || !secpath_has_nontransport(skb->sp, 0); pol->curlft.use_time = (unsigned long)xtime.tv_sec; @@ -1111,7 +1111,7 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, goto reject; } - if (secpath_has_tunnel(sp, k)) + if (secpath_has_nontransport(sp, k)) goto reject; xfrm_pol_put(pol); -- cgit v0.10.2 From 99505a843673faeae962a8cde128c7c034ba6b5e Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 18:10:33 -0700 Subject: [XFRM] STATE: Add a hook to obtain local/remote outbound address. Outbound transformation replaces both source and destination address with state's end-point addresses at the same time when IPsec tunnel mode. It is also required to change them for Mobile IPv6 route optimization, but we should care about the following differences: - changing result is not end-point but care-of address - either source or destination is replaced for each state This hook is a common platform to change outbound address. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 0d735a5..aa3ac99 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -266,6 +266,8 @@ struct xfrm_type int (*input)(struct xfrm_state *, struct sk_buff *skb); int (*output)(struct xfrm_state *, struct sk_buff *pskb); int (*hdr_offset)(struct xfrm_state *, struct sk_buff *, u8 **); + xfrm_address_t *(*local_addr)(struct xfrm_state *, xfrm_address_t *); + xfrm_address_t *(*remote_addr)(struct xfrm_state *, xfrm_address_t *); /* Estimate maximal size of result of transformation of a dgram */ u32 (*get_max_size)(struct xfrm_state *, int size); }; diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c index 81355bb..9328fc8 100644 --- a/net/ipv6/xfrm6_policy.c +++ b/net/ipv6/xfrm6_policy.c @@ -59,6 +59,22 @@ __xfrm6_find_bundle(struct flowi *fl, struct xfrm_policy *policy) return dst; } +static inline struct in6_addr* +__xfrm6_bundle_addr_remote(struct xfrm_state *x, struct in6_addr *addr) +{ + return (x->type->remote_addr) ? + (struct in6_addr*)x->type->remote_addr(x, (xfrm_address_t *)addr) : + (struct in6_addr*)&x->id.daddr; +} + +static inline struct in6_addr* +__xfrm6_bundle_addr_local(struct xfrm_state *x, struct in6_addr *addr) +{ + return (x->type->local_addr) ? + (struct in6_addr*)x->type->local_addr(x, (xfrm_address_t *)addr) : + (struct in6_addr*)&x->props.saddr; +} + /* Allocate chain of dst_entry's, attach known xfrm's, calculate * all the metrics... Shortly, bundle a bundle. */ @@ -115,8 +131,8 @@ __xfrm6_bundle_create(struct xfrm_policy *policy, struct xfrm_state **xfrm, int dst1->next = dst_prev; dst_prev = dst1; if (xfrm[i]->props.mode != XFRM_MODE_TRANSPORT) { - remote = (struct in6_addr*)&xfrm[i]->id.daddr; - local = (struct in6_addr*)&xfrm[i]->props.saddr; + remote = __xfrm6_bundle_addr_remote(xfrm[i], remote); + local = __xfrm6_bundle_addr_local(xfrm[i], local); tunnel = 1; } header_len += xfrm[i]->props.header_len; -- cgit v0.10.2 From 1b5c229987dc4d0c92a38fac0cde2aeec08cd775 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 18:11:50 -0700 Subject: [XFRM] STATE: Support non-fragment outbound transformation headers. For originated outbound IPv6 packets which will fragment, ip6_append_data() should know length of extension headers before sending them and the length is carried by dst_entry. IPv6 IPsec headers fragment then transformation was designed to place all headers after fragment header. OTOH Mobile IPv6 extension headers do not fragment then it is a good idea to make dst_entry have non-fragment length to tell it to ip6_append_data(). Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/dst.h b/include/net/dst.h index 36d54fc..a8d825f 100644 --- a/include/net/dst.h +++ b/include/net/dst.h @@ -54,6 +54,7 @@ struct dst_entry unsigned long expires; unsigned short header_len; /* more space at head required */ + unsigned short nfheader_len; /* more non-fragment space at head required */ unsigned short trailer_len; /* space to reserve at tail */ u32 metrics[RTAX_MAX]; diff --git a/include/net/xfrm.h b/include/net/xfrm.h index aa3ac99..aa93cc1 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -260,6 +260,8 @@ struct xfrm_type char *description; struct module *owner; __u8 proto; + __u8 flags; +#define XFRM_TYPE_NON_FRAGMENT 1 int (*init_state)(struct xfrm_state *x); void (*destructor)(struct xfrm_state *); diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c index a5bed74..e517981 100644 --- a/net/ipv4/xfrm4_policy.c +++ b/net/ipv4/xfrm4_policy.c @@ -135,6 +135,7 @@ __xfrm4_bundle_create(struct xfrm_policy *policy, struct xfrm_state **xfrm, int dst_prev->flags |= DST_HOST; dst_prev->lastuse = jiffies; dst_prev->header_len = header_len; + dst_prev->nfheader_len = 0; dst_prev->trailer_len = trailer_len; memcpy(&dst_prev->metrics, &x->route->metrics, sizeof(dst_prev->metrics)); diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 2a376b7..258e3e4 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -971,7 +971,7 @@ int ip6_append_data(struct sock *sk, int getfrag(void *from, char *to, hh_len = LL_RESERVED_SPACE(rt->u.dst.dev); - fragheaderlen = sizeof(struct ipv6hdr) + (opt ? opt->opt_nflen : 0); + fragheaderlen = sizeof(struct ipv6hdr) + rt->u.dst.nfheader_len + (opt ? opt->opt_nflen : 0); maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen - sizeof(struct frag_hdr); if (mtu <= sizeof(struct ipv6hdr) + IPV6_MAXPLEN) { diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c index 9328fc8..a3f68c8 100644 --- a/net/ipv6/xfrm6_policy.c +++ b/net/ipv6/xfrm6_policy.c @@ -75,6 +75,24 @@ __xfrm6_bundle_addr_local(struct xfrm_state *x, struct in6_addr *addr) (struct in6_addr*)&x->props.saddr; } +static inline void +__xfrm6_bundle_len_inc(int *len, int *nflen, struct xfrm_state *x) +{ + if (x->type->flags & XFRM_TYPE_NON_FRAGMENT) + *nflen += x->props.header_len; + else + *len += x->props.header_len; +} + +static inline void +__xfrm6_bundle_len_dec(int *len, int *nflen, struct xfrm_state *x) +{ + if (x->type->flags & XFRM_TYPE_NON_FRAGMENT) + *nflen -= x->props.header_len; + else + *len -= x->props.header_len; +} + /* Allocate chain of dst_entry's, attach known xfrm's, calculate * all the metrics... Shortly, bundle a bundle. */ @@ -99,6 +117,7 @@ __xfrm6_bundle_create(struct xfrm_policy *policy, struct xfrm_state **xfrm, int int i; int err = 0; int header_len = 0; + int nfheader_len = 0; int trailer_len = 0; dst = dst_prev = NULL; @@ -135,7 +154,7 @@ __xfrm6_bundle_create(struct xfrm_policy *policy, struct xfrm_state **xfrm, int local = __xfrm6_bundle_addr_local(xfrm[i], local); tunnel = 1; } - header_len += xfrm[i]->props.header_len; + __xfrm6_bundle_len_inc(&header_len, &nfheader_len, xfrm[i]); trailer_len += xfrm[i]->props.trailer_len; if (tunnel) { @@ -170,6 +189,7 @@ __xfrm6_bundle_create(struct xfrm_policy *policy, struct xfrm_state **xfrm, int dst_prev->flags |= DST_HOST; dst_prev->lastuse = jiffies; dst_prev->header_len = header_len; + dst_prev->nfheader_len = nfheader_len; dst_prev->trailer_len = trailer_len; memcpy(&dst_prev->metrics, &x->route->metrics, sizeof(dst_prev->metrics)); @@ -188,7 +208,7 @@ __xfrm6_bundle_create(struct xfrm_policy *policy, struct xfrm_state **xfrm, int x->u.rt6.rt6i_src = rt0->rt6i_src; x->u.rt6.rt6i_idev = rt0->rt6i_idev; in6_dev_hold(rt0->rt6i_idev); - header_len -= x->u.dst.xfrm->props.header_len; + __xfrm6_bundle_len_dec(&header_len, &nfheader_len, x->u.dst.xfrm); trailer_len -= x->u.dst.xfrm->props.trailer_len; } -- cgit v0.10.2 From 060f02a3bdd4d9ba8aa3c48e9b470672b1f3a585 Mon Sep 17 00:00:00 2001 From: Noriaki TAKAMIYA Date: Wed, 23 Aug 2006 18:18:55 -0700 Subject: [XFRM] STATE: Introduce care-of address. Care-of address is carried by state as a transformation option like IPsec encryption/authentication algorithm. Based on MIPL2 kernel patch. Signed-off-by: Noriaki TAKAMIYA Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h index a7c9e4c..b53f799 100644 --- a/include/linux/xfrm.h +++ b/include/linux/xfrm.h @@ -235,6 +235,7 @@ enum xfrm_attr_type_t { XFRMA_REPLAY_THRESH, XFRMA_ETIMER_THRESH, XFRMA_SRCADDR, /* xfrm_address_t */ + XFRMA_COADDR, /* xfrm_address_t */ __XFRMA_MAX #define XFRMA_MAX (__XFRMA_MAX - 1) diff --git a/include/net/xfrm.h b/include/net/xfrm.h index aa93cc1..872a2a4 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -134,6 +134,9 @@ struct xfrm_state /* Data for encapsulator */ struct xfrm_encap_tmpl *encap; + /* Data for care-of address */ + xfrm_address_t *coaddr; + /* IPComp needs an IPIP tunnel for handling uncompressed packets */ struct xfrm_state *tunnel; diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index f053715..3da89c0 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -78,6 +78,7 @@ static void xfrm_state_gc_destroy(struct xfrm_state *x) kfree(x->ealg); kfree(x->calg); kfree(x->encap); + kfree(x->coaddr); if (x->mode) xfrm_put_mode(x->mode); if (x->type) { @@ -603,6 +604,11 @@ out: if (likely(x1->km.state == XFRM_STATE_VALID)) { if (x->encap && x1->encap) memcpy(x1->encap, x->encap, sizeof(*x1->encap)); + if (x->coaddr && x1->coaddr) { + memcpy(x1->coaddr, x->coaddr, sizeof(*x1->coaddr)); + } + if (!use_spi && memcmp(&x1->sel, &x->sel, sizeof(x1->sel))) + memcpy(&x1->sel, &x->sel, sizeof(x1->sel)); memcpy(&x1->lft, &x->lft, sizeof(x1->lft)); x1->km.dying = 0; diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index b5f8ab7..939808d 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -187,11 +187,14 @@ static int verify_newsa_info(struct xfrm_usersa_info *p, goto out; if ((err = verify_sec_ctx_len(xfrma))) goto out; + if ((err = verify_one_addr(xfrma, XFRMA_COADDR, NULL))) + goto out; err = -EINVAL; switch (p->mode) { case XFRM_MODE_TRANSPORT: case XFRM_MODE_TUNNEL: + case XFRM_MODE_ROUTEOPTIMIZATION: break; default: @@ -276,6 +279,24 @@ static int attach_sec_ctx(struct xfrm_state *x, struct rtattr *u_arg) return security_xfrm_state_alloc(x, uctx); } +static int attach_one_addr(xfrm_address_t **addrpp, struct rtattr *u_arg) +{ + struct rtattr *rta = u_arg; + xfrm_address_t *p, *uaddrp; + + if (!rta) + return 0; + + uaddrp = RTA_DATA(rta); + p = kmalloc(sizeof(*p), GFP_KERNEL); + if (!p) + return -ENOMEM; + + memcpy(p, uaddrp, sizeof(*p)); + *addrpp = p; + return 0; +} + static void copy_from_user_state(struct xfrm_state *x, struct xfrm_usersa_info *p) { memcpy(&x->id, &p->id, sizeof(x->id)); @@ -365,7 +386,8 @@ static struct xfrm_state *xfrm_state_construct(struct xfrm_usersa_info *p, goto error; if ((err = attach_encap_tmpl(&x->encap, xfrma[XFRMA_ENCAP-1]))) goto error; - + if ((err = attach_one_addr(&x->coaddr, xfrma[XFRMA_COADDR-1]))) + goto error; err = xfrm_init_state(x); if (err) goto error; @@ -569,6 +591,10 @@ static int dump_one_state(struct xfrm_state *x, int count, void *ptr) uctx->ctx_len = x->security->ctx_len; memcpy(uctx + 1, x->security->ctx_str, x->security->ctx_len); } + + if (x->coaddr) + RTA_PUT(skb, XFRMA_COADDR, sizeof(*x->coaddr), x->coaddr); + nlh->nlmsg_len = skb->tail - b; out: sp->this_idx++; -- cgit v0.10.2 From 9afaca057980c02771f4657c455cc7592fcd7373 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 18:20:16 -0700 Subject: [XFRM] IPV6: Update outbound state timestamp for each sending. With this patch transformation state is updated last used time for each sending. Xtime is used for it like other state lifetime expiration. Mobile IPv6 enabled nodes will want to know traffic status of each binding (e.g. judgement to request binding refresh by correspondent node, or to keep home/care-of nonce alive by mobile node). The last used timestamp is an important hint about it. Based on MIPL2 kernel patch. This patch was also written by: Henrik Petander Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h index b53f799..1d8c1f2 100644 --- a/include/linux/xfrm.h +++ b/include/linux/xfrm.h @@ -236,6 +236,7 @@ enum xfrm_attr_type_t { XFRMA_ETIMER_THRESH, XFRMA_SRCADDR, /* xfrm_address_t */ XFRMA_COADDR, /* xfrm_address_t */ + XFRMA_LASTUSED, __XFRMA_MAX #define XFRMA_MAX (__XFRMA_MAX - 1) diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 872a2a4..248874e 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -167,6 +167,9 @@ struct xfrm_state struct xfrm_lifetime_cur curlft; struct timer_list timer; + /* Last used time */ + u64 lastused; + /* Reference to data common to all the instances of this * transformer. */ struct xfrm_type *type; diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c index b4628fb..db58104 100644 --- a/net/ipv6/xfrm6_output.c +++ b/net/ipv6/xfrm6_output.c @@ -75,6 +75,8 @@ static int xfrm6_output_one(struct sk_buff *skb) x->curlft.bytes += skb->len; x->curlft.packets++; + if (x->props.mode == XFRM_MODE_ROUTEOPTIMIZATION) + x->lastused = (u64)xtime.tv_sec; spin_unlock_bh(&x->lock); diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 939808d..f643063 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -595,6 +595,9 @@ static int dump_one_state(struct xfrm_state *x, int count, void *ptr) if (x->coaddr) RTA_PUT(skb, XFRMA_COADDR, sizeof(*x->coaddr), x->coaddr); + if (x->lastused) + RTA_PUT(skb, XFRMA_LASTUSED, sizeof(x->lastused), &x->lastused); + nlh->nlmsg_len = skb->tail - b; out: sp->this_idx++; -- cgit v0.10.2 From e53820de0f81da1429048634cadc6ef5f50c2f8b Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 19:12:01 -0700 Subject: [XFRM] IPV6: Restrict bundle reusing For outbound transformation, bundle is checked whether it is suitable for current flow to be reused or not. In such IPv6 case as below, transformation may apply incorrect bundle for the flow instead of creating another bundle: - The policy selector has destination prefix length < 128 (Two or more addresses can be matched it) - Its bundle holds dst entry of default route whose prefix length < 128 (Previous traffic was used such route as next hop) - The policy and the bundle were used a transport mode state and this time flow address is not matched the bundled state. This issue is found by Mobile IPv6 usage to protect mobility signaling by IPsec, but it is not a Mobile IPv6 specific. This patch adds strict check to xfrm_bundle_ok() for each state mode and address when prefix length is less than 128. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 248874e..7f16306 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -869,6 +869,23 @@ xfrm_state_addr_check(struct xfrm_state *x, return 0; } +static __inline__ int +xfrm_state_addr_flow_check(struct xfrm_state *x, struct flowi *fl, + unsigned short family) +{ + switch (family) { + case AF_INET: + return __xfrm4_state_addr_check(x, + (xfrm_address_t *)&fl->fl4_dst, + (xfrm_address_t *)&fl->fl4_src); + case AF_INET6: + return __xfrm6_state_addr_check(x, + (xfrm_address_t *)&fl->fl6_dst, + (xfrm_address_t *)&fl->fl6_src); + } + return 0; +} + static inline int xfrm_state_kern(struct xfrm_state *x) { return atomic_read(&x->tunnel_users); @@ -1014,7 +1031,7 @@ extern void xfrm_policy_flush(void); extern int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol); extern int xfrm_flush_bundles(void); extern void xfrm_flush_all_bundles(void); -extern int xfrm_bundle_ok(struct xfrm_dst *xdst, struct flowi *fl, int family); +extern int xfrm_bundle_ok(struct xfrm_dst *xdst, struct flowi *fl, int family, int strict); extern void xfrm_init_pmtu(struct dst_entry *dst); extern wait_queue_head_t km_waitq; diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c index e517981..42d8ded 100644 --- a/net/ipv4/xfrm4_policy.c +++ b/net/ipv4/xfrm4_policy.c @@ -33,7 +33,7 @@ __xfrm4_find_bundle(struct flowi *fl, struct xfrm_policy *policy) xdst->u.rt.fl.fl4_dst == fl->fl4_dst && xdst->u.rt.fl.fl4_src == fl->fl4_src && xdst->u.rt.fl.fl4_tos == fl->fl4_tos && - xfrm_bundle_ok(xdst, fl, AF_INET)) { + xfrm_bundle_ok(xdst, fl, AF_INET, 0)) { dst_clone(dst); break; } diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c index a3f68c8..729b474 100644 --- a/net/ipv6/xfrm6_policy.c +++ b/net/ipv6/xfrm6_policy.c @@ -50,7 +50,9 @@ __xfrm6_find_bundle(struct flowi *fl, struct xfrm_policy *policy) xdst->u.rt6.rt6i_src.plen); if (ipv6_addr_equal(&xdst->u.rt6.rt6i_dst.addr, &fl_dst_prefix) && ipv6_addr_equal(&xdst->u.rt6.rt6i_src.addr, &fl_src_prefix) && - xfrm_bundle_ok(xdst, fl, AF_INET6)) { + xfrm_bundle_ok(xdst, fl, AF_INET6, + (xdst->u.rt6.rt6i_dst.plen != 128 || + xdst->u.rt6.rt6i_src.plen != 128))) { dst_clone(dst); break; } diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 56abb5c..ad2a5cb 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -1167,7 +1167,7 @@ static struct dst_entry *xfrm_dst_check(struct dst_entry *dst, u32 cookie) static int stale_bundle(struct dst_entry *dst) { - return !xfrm_bundle_ok((struct xfrm_dst *)dst, NULL, AF_UNSPEC); + return !xfrm_bundle_ok((struct xfrm_dst *)dst, NULL, AF_UNSPEC, 0); } void xfrm_dst_ifdown(struct dst_entry *dst, struct net_device *dev) @@ -1282,7 +1282,7 @@ EXPORT_SYMBOL(xfrm_init_pmtu); * still valid. */ -int xfrm_bundle_ok(struct xfrm_dst *first, struct flowi *fl, int family) +int xfrm_bundle_ok(struct xfrm_dst *first, struct flowi *fl, int family, int strict) { struct dst_entry *dst = &first->u.dst; struct xfrm_dst *last; @@ -1304,6 +1304,10 @@ int xfrm_bundle_ok(struct xfrm_dst *first, struct flowi *fl, int family) if (dst->xfrm->km.state != XFRM_STATE_VALID) return 0; + if (strict && fl && dst->xfrm->props.mode != XFRM_MODE_TUNNEL && + !xfrm_state_addr_flow_check(dst->xfrm, fl, family)) + return 0; + mtu = dst_mtu(dst->child); if (xdst->child_mtu_cached != mtu) { last = xdst; -- cgit v0.10.2 From 654b32c6aad19d2fd363813cd8a1a1e64daf611b Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 19:12:56 -0700 Subject: [XFRM]: Fix message about transformation user interface. Transformation user interface is not only for IPsec. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/xfrm/Kconfig b/net/xfrm/Kconfig index 0c1c043..43228f7 100644 --- a/net/xfrm/Kconfig +++ b/net/xfrm/Kconfig @@ -6,11 +6,11 @@ config XFRM depends on NET config XFRM_USER - tristate "IPsec user configuration interface" + tristate "Transformation user configuration interface" depends on INET && XFRM ---help--- - Support for IPsec user configuration interface used - by native Linux tools. + Support for Transformation(XFRM) user configuration interface + like IPsec used by native Linux tools. If unsure, say Y. diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index f643063..3a83c59 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -2054,7 +2054,7 @@ static int __init xfrm_user_init(void) { struct sock *nlsk; - printk(KERN_INFO "Initializing IPsec netlink socket\n"); + printk(KERN_INFO "Initializing XFRM netlink socket\n"); nlsk = netlink_kernel_create(NETLINK_XFRM, XFRMNLGRP_MAX, xfrm_netlink_rcv, THIS_MODULE); -- cgit v0.10.2 From ee53826801a8fa7a0e333895421ef6d0e5fbfbf0 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 19:13:46 -0700 Subject: [IPV6]: Add Kconfig to enable Mobile IPv6. Add Kconfig to enable Mobile IPv6. Based on MIPL2 kernel patch. Signed-off-by: Noriaki TAKAMIYA Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig index 1188d95..21e0cc8 100644 --- a/net/ipv6/Kconfig +++ b/net/ipv6/Kconfig @@ -98,6 +98,15 @@ config INET6_IPCOMP If unsure, say Y. +config IPV6_MIP6 + bool "IPv6: Mobility (EXPERIMENTAL)" + depends on IPV6 && EXPERIMENTAL + select XFRM + ---help--- + Support for IPv6 Mobility described in RFC 3775. + + If unsure, say N. + config INET6_XFRM_TUNNEL tristate select INET6_TUNNEL -- cgit v0.10.2 From 642ec62eee5bdc158e01029220c8a23c685778fb Mon Sep 17 00:00:00 2001 From: Noriaki TAKAMIYA Date: Wed, 23 Aug 2006 19:15:07 -0700 Subject: [IPV6] MIP6: Add routing header type 2 definition. Add routing header type 2 definition for Mobile IPv6. Based on MIPL2 kernel patch. Signed-off-by: Noriaki TAKAMIYA Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h index 02d14a3..d995662 100644 --- a/include/linux/ipv6.h +++ b/include/linux/ipv6.h @@ -29,6 +29,7 @@ struct in6_ifreq { #define IPV6_SRCRT_STRICT 0x01 /* this hop must be a neighbor */ #define IPV6_SRCRT_TYPE_0 0 /* IPv6 type 0 Routing Header */ +#define IPV6_SRCRT_TYPE_2 2 /* IPv6 type 2 Routing Header */ /* * routing header @@ -73,6 +74,18 @@ struct rt0_hdr { #define rt0_type rt_hdr.type }; +/* + * routing header type 2 + */ + +struct rt2_hdr { + struct ipv6_rt_hdr rt_hdr; + __u32 reserved; + struct in6_addr addr; + +#define rt2_type rt_hdr.type +}; + struct ipv6_auth_hdr { __u8 nexthdr; __u8 hdrlen; /* This one is measured in 32 bit units! */ -- cgit v0.10.2 From 65d4ed92219b28875efb52de5700da8c3dfa83e1 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 19:16:22 -0700 Subject: [IPV6] MIP6: Add inbound interface of routing header type 2. Add inbound interface of routing header type 2 for Mobile IPv6. Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/addrconf.h b/include/net/addrconf.h index 3d71251..5fc8627 100644 --- a/include/net/addrconf.h +++ b/include/net/addrconf.h @@ -61,6 +61,13 @@ extern int addrconf_set_dstaddr(void __user *arg); extern int ipv6_chk_addr(struct in6_addr *addr, struct net_device *dev, int strict); +/* XXX: this is a placeholder till addrconf supports */ +#ifdef CONFIG_IPV6_MIP6 +static inline int ipv6_chk_home_addr(struct in6_addr *addr) +{ + return 0; +} +#endif extern struct inet6_ifaddr * ipv6_get_ifaddr(struct in6_addr *addr, struct net_device *dev, int strict); diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c index 05afa6b..8d3a0e1 100644 --- a/net/ipv6/exthdrs.c +++ b/net/ipv6/exthdrs.c @@ -43,6 +43,9 @@ #include #include #include +#ifdef CONFIG_IPV6_MIP6 +#include +#endif #include @@ -219,7 +222,7 @@ static int ipv6_rthdr_rcv(struct sk_buff **skbp) { struct sk_buff *skb = *skbp; struct inet6_skb_parm *opt = IP6CB(skb); - struct in6_addr *addr; + struct in6_addr *addr = NULL; struct in6_addr daddr; int n, i; @@ -244,6 +247,23 @@ static int ipv6_rthdr_rcv(struct sk_buff **skbp) looped_back: if (hdr->segments_left == 0) { + switch (hdr->type) { +#ifdef CONFIG_IPV6_MIP6 + case IPV6_SRCRT_TYPE_2: + /* Silently discard type 2 header unless it was + * processed by own + */ + if (!addr) { + IP6_INC_STATS_BH(IPSTATS_MIB_INADDRERRORS); + kfree_skb(skb); + return -1; + } + break; +#endif + default: + break; + } + opt->lastopt = skb->h.raw - skb->nh.raw; opt->srcrt = skb->h.raw - skb->nh.raw; skb->h.raw += (hdr->hdrlen + 1) << 3; @@ -253,17 +273,29 @@ looped_back: return 1; } - if (hdr->type != IPV6_SRCRT_TYPE_0) { + switch (hdr->type) { + case IPV6_SRCRT_TYPE_0: + if (hdr->hdrlen & 0x01) { + IP6_INC_STATS_BH(IPSTATS_MIB_INHDRERRORS); + icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, (&hdr->hdrlen) - skb->nh.raw); + return -1; + } + break; +#ifdef CONFIG_IPV6_MIP6 + case IPV6_SRCRT_TYPE_2: + /* Silently discard invalid RTH type 2 */ + if (hdr->hdrlen != 2 || hdr->segments_left != 1) { + IP6_INC_STATS_BH(IPSTATS_MIB_INHDRERRORS); + kfree_skb(skb); + return -1; + } + break; +#endif + default: IP6_INC_STATS_BH(IPSTATS_MIB_INHDRERRORS); icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, (&hdr->type) - skb->nh.raw); return -1; } - - if (hdr->hdrlen & 0x01) { - IP6_INC_STATS_BH(IPSTATS_MIB_INHDRERRORS); - icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, (&hdr->hdrlen) - skb->nh.raw); - return -1; - } /* * This is the routing header forwarding algorithm from @@ -303,6 +335,27 @@ looped_back: addr = rthdr->addr; addr += i - 1; + switch (hdr->type) { +#ifdef CONFIG_IPV6_MIP6 + case IPV6_SRCRT_TYPE_2: + if (xfrm6_input_addr(skb, (xfrm_address_t *)addr, + (xfrm_address_t *)&skb->nh.ipv6h->saddr, + IPPROTO_ROUTING) < 0) { + IP6_INC_STATS_BH(IPSTATS_MIB_INADDRERRORS); + kfree_skb(skb); + return -1; + } + if (!ipv6_chk_home_addr(addr)) { + IP6_INC_STATS_BH(IPSTATS_MIB_INADDRERRORS); + kfree_skb(skb); + return -1; + } + break; +#endif + default: + break; + } + if (ipv6_addr_is_multicast(addr)) { IP6_INC_STATS_BH(IPSTATS_MIB_INADDRERRORS); kfree_skb(skb); -- cgit v0.10.2 From 280a9d340057ce1b3cca63084df22f4ef5b35fba Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 19:17:12 -0700 Subject: [IPV6] MIP6: Add socket option and ancillary data interface of routing header type 2. Add socket option and ancillary data interface of routing header type 2. Mobile IPv6 application will use this to send binding acknowledgement with the header without relation of confirmed route optimization (binding). Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c index 8561b9d..7206747 100644 --- a/net/ipv6/datagram.c +++ b/net/ipv6/datagram.c @@ -648,10 +648,13 @@ int datagram_send_ctl(struct msghdr *msg, struct flowi *fl, rthdr = (struct ipv6_rt_hdr *)CMSG_DATA(cmsg); - /* - * TYPE 0 - */ - if (rthdr->type) { + switch (rthdr->type) { + case IPV6_SRCRT_TYPE_0: +#ifdef CONFIG_IPV6_MIP6 + case IPV6_SRCRT_TYPE_2: +#endif + break; + default: err = -EINVAL; goto exit_f; } diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c index a5eaaf6..4f3bb7f 100644 --- a/net/ipv6/ipv6_sockglue.c +++ b/net/ipv6/ipv6_sockglue.c @@ -407,8 +407,16 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname, /* routing header option needs extra check */ if (optname == IPV6_RTHDR && opt->srcrt) { struct ipv6_rt_hdr *rthdr = opt->srcrt; - if (rthdr->type) + switch (rthdr->type) { + case IPV6_SRCRT_TYPE_0: +#ifdef CONFIG_IPV6_MIP6 + case IPV6_SRCRT_TYPE_2: +#endif + break; + default: goto sticky_done; + } + if ((rthdr->hdrlen & 1) || (rthdr->hdrlen >> 1) != rthdr->segments_left) goto sticky_done; -- cgit v0.10.2 From c61a404325093250b676f40ad8f4dd00f3bcab5f Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 19:18:35 -0700 Subject: [IPV6]: Find option offset by type. This is a helper to search option offset from extension header which can carry TLV option like destination options header. Mobile IPv6 home address option will use it. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/ipv6.h b/include/net/ipv6.h index ece7e8a..c4ea127 100644 --- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -506,6 +506,8 @@ extern int ipv6_skip_exthdr(const struct sk_buff *, int start, extern int ipv6_ext_hdr(u8 nexthdr); +extern int ipv6_find_tlv(struct sk_buff *skb, int offset, int type); + extern struct ipv6_txoptions * ipv6_invert_rthdr(struct sock *sk, struct ipv6_rt_hdr *hdr); diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c index 8d3a0e1..50ff49e 100644 --- a/net/ipv6/exthdrs.c +++ b/net/ipv6/exthdrs.c @@ -49,6 +49,49 @@ #include +int ipv6_find_tlv(struct sk_buff *skb, int offset, int type) +{ + int packet_len = skb->tail - skb->nh.raw; + struct ipv6_opt_hdr *hdr; + int len; + + if (offset + 2 > packet_len) + goto bad; + hdr = (struct ipv6_opt_hdr*)(skb->nh.raw + offset); + len = ((hdr->hdrlen + 1) << 3); + + if (offset + len > packet_len) + goto bad; + + offset += 2; + len -= 2; + + while (len > 0) { + int opttype = skb->nh.raw[offset]; + int optlen; + + if (opttype == type) + return offset; + + switch (opttype) { + case IPV6_TLV_PAD0: + optlen = 1; + break; + default: + optlen = skb->nh.raw[offset + 1] + 2; + if (optlen > len) + goto bad; + break; + } + offset += optlen; + len -= optlen; + } + /* not_found */ + return -1; + bad: + return -1; +} + /* * Parsing tlv encoded headers. * -- cgit v0.10.2 From a80ff03e05e4343d647780c116b02ec86078fd24 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 19:19:50 -0700 Subject: [IPV6]: Allow to replace skbuff by TLV parser. In receiving Mobile IPv6 home address option which is a TLV carried by destination options header, kernel will try to mangle source adderss of packet. Think of cloned skbuff it is required to replace it by the parser just like routing header case. This is a framework to achieve that to allow TLV parser to replace inbound skbuff pointer. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/ipv6.h b/include/net/ipv6.h index c4ea127..8e6ec60 100644 --- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -229,7 +229,7 @@ extern int ip6_ra_control(struct sock *sk, int sel, void (*destructor)(struct sock *)); -extern int ipv6_parse_hopopts(struct sk_buff *skb); +extern int ipv6_parse_hopopts(struct sk_buff **skbp); extern struct ipv6_txoptions * ipv6_dup_options(struct sock *sk, struct ipv6_txoptions *opt); extern struct ipv6_txoptions * ipv6_renew_options(struct sock *sk, struct ipv6_txoptions *opt, diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c index 50ff49e..1cdd0f0 100644 --- a/net/ipv6/exthdrs.c +++ b/net/ipv6/exthdrs.c @@ -102,7 +102,7 @@ int ipv6_find_tlv(struct sk_buff *skb, int offset, int type) struct tlvtype_proc { int type; - int (*func)(struct sk_buff *skb, int offset); + int (*func)(struct sk_buff **skbp, int offset); }; /********************* @@ -111,8 +111,10 @@ struct tlvtype_proc { /* An unknown option is detected, decide what to do */ -static int ip6_tlvopt_unknown(struct sk_buff *skb, int optoff) +static int ip6_tlvopt_unknown(struct sk_buff **skbp, int optoff) { + struct sk_buff *skb = *skbp; + switch ((skb->nh.raw[optoff] & 0xC0) >> 6) { case 0: /* ignore */ return 1; @@ -137,8 +139,9 @@ static int ip6_tlvopt_unknown(struct sk_buff *skb, int optoff) /* Parse tlv encoded option header (hop-by-hop or destination) */ -static int ip6_parse_tlv(struct tlvtype_proc *procs, struct sk_buff *skb) +static int ip6_parse_tlv(struct tlvtype_proc *procs, struct sk_buff **skbp) { + struct sk_buff *skb = *skbp; struct tlvtype_proc *curr; int off = skb->h.raw - skb->nh.raw; int len = ((skb->h.raw[1]+1)<<3); @@ -168,13 +171,13 @@ static int ip6_parse_tlv(struct tlvtype_proc *procs, struct sk_buff *skb) /* type specific length/alignment checks will be performed in the func(). */ - if (curr->func(skb, off) == 0) + if (curr->func(skbp, off) == 0) return 0; break; } } if (curr->type < 0) { - if (ip6_tlvopt_unknown(skb, off) == 0) + if (ip6_tlvopt_unknown(skbp, off) == 0) return 0; } break; @@ -213,7 +216,8 @@ static int ipv6_destopt_rcv(struct sk_buff **skbp) opt->lastopt = skb->h.raw - skb->nh.raw; opt->dst1 = skb->h.raw - skb->nh.raw; - if (ip6_parse_tlv(tlvprocdestopt_lst, skb)) { + if (ip6_parse_tlv(tlvprocdestopt_lst, skbp)) { + skb = *skbp; skb->h.raw += ((skb->h.raw[1]+1)<<3); opt->nhoff = opt->dst1; return 1; @@ -517,8 +521,10 @@ EXPORT_SYMBOL_GPL(ipv6_invert_rthdr); /* Router Alert as of RFC 2711 */ -static int ipv6_hop_ra(struct sk_buff *skb, int optoff) +static int ipv6_hop_ra(struct sk_buff **skbp, int optoff) { + struct sk_buff *skb = *skbp; + if (skb->nh.raw[optoff+1] == 2) { IP6CB(skb)->ra = optoff; return 1; @@ -531,8 +537,9 @@ static int ipv6_hop_ra(struct sk_buff *skb, int optoff) /* Jumbo payload */ -static int ipv6_hop_jumbo(struct sk_buff *skb, int optoff) +static int ipv6_hop_jumbo(struct sk_buff **skbp, int optoff) { + struct sk_buff *skb = *skbp; u32 pkt_len; if (skb->nh.raw[optoff+1] != 4 || (optoff&3) != 2) { @@ -581,8 +588,9 @@ static struct tlvtype_proc tlvprochopopt_lst[] = { { -1, } }; -int ipv6_parse_hopopts(struct sk_buff *skb) +int ipv6_parse_hopopts(struct sk_buff **skbp) { + struct sk_buff *skb = *skbp; struct inet6_skb_parm *opt = IP6CB(skb); /* @@ -598,7 +606,8 @@ int ipv6_parse_hopopts(struct sk_buff *skb) } opt->hop = sizeof(struct ipv6hdr); - if (ip6_parse_tlv(tlvprochopopt_lst, skb)) { + if (ip6_parse_tlv(tlvprochopopt_lst, skbp)) { + skb = *skbp; skb->h.raw += (skb->h.raw[1]+1)<<3; opt->nhoff = sizeof(struct ipv6hdr); return 1; diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c index 25c2a9e..6b8e6d7 100644 --- a/net/ipv6/ip6_input.c +++ b/net/ipv6/ip6_input.c @@ -111,7 +111,7 @@ int ipv6_rcv(struct sk_buff *skb, struct net_device *dev, struct packet_type *pt } if (hdr->nexthdr == NEXTHDR_HOP) { - if (ipv6_parse_hopopts(skb) < 0) { + if (ipv6_parse_hopopts(&skb) < 0) { IP6_INC_STATS_BH(IPSTATS_MIB_INHDRERRORS); return 0; } -- cgit v0.10.2 From 842426e719f86cd5709617208efae93ff1a1e2d8 Mon Sep 17 00:00:00 2001 From: Noriaki TAKAMIYA Date: Wed, 23 Aug 2006 19:21:34 -0700 Subject: [IPV6] MIP6: Add home address option definition. Add home address option definition for Mobile IPv6. Based on MIPL2 kernel patch. Signed-off-by: Noriaki TAKAMIYA Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/linux/in6.h b/include/linux/in6.h index 304aaed..086ec2a 100644 --- a/include/linux/in6.h +++ b/include/linux/in6.h @@ -142,6 +142,7 @@ struct in6_flowlabel_req #define IPV6_TLV_PADN 1 #define IPV6_TLV_ROUTERALERT 5 #define IPV6_TLV_JUMBO 194 +#define IPV6_TLV_HAO 201 /* home address option */ /* * IPV6 socket options diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h index d995662..5bf4406 100644 --- a/include/linux/ipv6.h +++ b/include/linux/ipv6.h @@ -86,6 +86,16 @@ struct rt2_hdr { #define rt2_type rt_hdr.type }; +/* + * home address option in destination options header + */ + +struct ipv6_destopt_hao { + __u8 type; + __u8 length; + struct in6_addr addr; +} __attribute__ ((__packed__)); + struct ipv6_auth_hdr { __u8 nexthdr; __u8 hdrlen; /* This one is measured in 32 bit units! */ -- cgit v0.10.2 From a831f5bbc89a9978795504be9e1ff412043f8f77 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 19:24:48 -0700 Subject: [IPV6] MIP6: Add inbound interface of home address option. Add inbound function of home address option by registering it to TLV table for destination options header. Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h index 5bf4406..db3b2ba 100644 --- a/include/linux/ipv6.h +++ b/include/linux/ipv6.h @@ -226,6 +226,9 @@ struct inet6_skb_parm { __u16 dst0; __u16 srcrt; __u16 dst1; +#ifdef CONFIG_IPV6_MIP6 + __u16 dsthao; +#endif __u16 lastopt; __u32 nhoff; __u16 flags; diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c index 1cdd0f0..6a6466b 100644 --- a/net/ipv6/exthdrs.c +++ b/net/ipv6/exthdrs.c @@ -196,8 +196,80 @@ bad: Destination options header. *****************************/ +#ifdef CONFIG_IPV6_MIP6 +static int ipv6_dest_hao(struct sk_buff **skbp, int optoff) +{ + struct sk_buff *skb = *skbp; + struct ipv6_destopt_hao *hao; + struct inet6_skb_parm *opt = IP6CB(skb); + struct ipv6hdr *ipv6h = (struct ipv6hdr *)skb->nh.raw; + struct in6_addr tmp_addr; + int ret; + + if (opt->dsthao) { + LIMIT_NETDEBUG(KERN_DEBUG "hao duplicated\n"); + goto discard; + } + opt->dsthao = opt->dst1; + opt->dst1 = 0; + + hao = (struct ipv6_destopt_hao *)(skb->nh.raw + optoff); + + if (hao->length != 16) { + LIMIT_NETDEBUG( + KERN_DEBUG "hao invalid option length = %d\n", hao->length); + goto discard; + } + + if (!(ipv6_addr_type(&hao->addr) & IPV6_ADDR_UNICAST)) { + LIMIT_NETDEBUG( + KERN_DEBUG "hao is not an unicast addr: " NIP6_FMT "\n", NIP6(hao->addr)); + goto discard; + } + + ret = xfrm6_input_addr(skb, (xfrm_address_t *)&ipv6h->daddr, + (xfrm_address_t *)&hao->addr, IPPROTO_DSTOPTS); + if (unlikely(ret < 0)) + goto discard; + + if (skb_cloned(skb)) { + struct sk_buff *skb2 = skb_copy(skb, GFP_ATOMIC); + if (skb2 == NULL) + goto discard; + + kfree_skb(skb); + + /* update all variable using below by copied skbuff */ + *skbp = skb = skb2; + hao = (struct ipv6_destopt_hao *)(skb2->nh.raw + optoff); + ipv6h = (struct ipv6hdr *)skb2->nh.raw; + } + + if (skb->ip_summed == CHECKSUM_COMPLETE) + skb->ip_summed = CHECKSUM_NONE; + + ipv6_addr_copy(&tmp_addr, &ipv6h->saddr); + ipv6_addr_copy(&ipv6h->saddr, &hao->addr); + ipv6_addr_copy(&hao->addr, &tmp_addr); + + if (skb->tstamp.off_sec == 0) + __net_timestamp(skb); + + return 1; + + discard: + kfree_skb(skb); + return 0; +} +#endif + static struct tlvtype_proc tlvprocdestopt_lst[] = { - /* No destination options are defined now */ +#ifdef CONFIG_IPV6_MIP6 + { + .type = IPV6_TLV_HAO, + .func = ipv6_dest_hao, + }, +#endif {-1, NULL} }; @@ -205,6 +277,9 @@ static int ipv6_destopt_rcv(struct sk_buff **skbp) { struct sk_buff *skb = *skbp; struct inet6_skb_parm *opt = IP6CB(skb); +#ifdef CONFIG_IPV6_MIP6 + __u16 dstbuf; +#endif if (!pskb_may_pull(skb, (skb->h.raw-skb->data)+8) || !pskb_may_pull(skb, (skb->h.raw-skb->data)+((skb->h.raw[1]+1)<<3))) { @@ -215,11 +290,18 @@ static int ipv6_destopt_rcv(struct sk_buff **skbp) opt->lastopt = skb->h.raw - skb->nh.raw; opt->dst1 = skb->h.raw - skb->nh.raw; +#ifdef CONFIG_IPV6_MIP6 + dstbuf = opt->dst1; +#endif if (ip6_parse_tlv(tlvprocdestopt_lst, skbp)) { skb = *skbp; skb->h.raw += ((skb->h.raw[1]+1)<<3); +#ifdef CONFIG_IPV6_MIP6 + opt->nhoff = dstbuf; +#else opt->nhoff = opt->dst1; +#endif return 1; } -- cgit v0.10.2 From 8dd7368dd97def967bbb3aec67b882e8dfd1a528 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Wed, 23 Aug 2006 19:25:55 -0700 Subject: [IPV6]: Put dsthao after flags in order to pack inet6_skb_parm better. Signed-off-by: David S. Miller diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h index db3b2ba..1d6d3cc 100644 --- a/include/linux/ipv6.h +++ b/include/linux/ipv6.h @@ -226,12 +226,12 @@ struct inet6_skb_parm { __u16 dst0; __u16 srcrt; __u16 dst1; -#ifdef CONFIG_IPV6_MIP6 - __u16 dsthao; -#endif __u16 lastopt; __u32 nhoff; __u16 flags; +#ifdef CONFIG_IPV6_MIP6 + __u16 dsthao; +#endif #define IP6SKB_XFRM_TRANSFORMED 1 }; -- cgit v0.10.2 From 793832361fe7e9c3fcae2edd1d293c583a0a095c Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 19:27:25 -0700 Subject: [IPV6] MIP6: Revert address to send ICMPv6 error. IPv6 source address is replaced in receiving packet with home address option carried by destination options header. To send ICMPv6 error back, original address which is received one on wire should be used. This function checks such header is included and reverts them. Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c index e3a8e27..4ec8760 100644 --- a/net/ipv6/icmp.c +++ b/net/ipv6/icmp.c @@ -273,6 +273,29 @@ static int icmpv6_getfrag(void *from, char *to, int offset, int len, int odd, st return 0; } +#ifdef CONFIG_IPV6_MIP6 +static void mip6_addr_swap(struct sk_buff *skb) +{ + struct ipv6hdr *iph = skb->nh.ipv6h; + struct inet6_skb_parm *opt = IP6CB(skb); + struct ipv6_destopt_hao *hao; + struct in6_addr tmp; + int off; + + if (opt->dsthao) { + off = ipv6_find_tlv(skb, opt->dsthao, IPV6_TLV_HAO); + if (likely(off >= 0)) { + hao = (struct ipv6_destopt_hao *)(skb->nh.raw + off); + ipv6_addr_copy(&tmp, &iph->saddr); + ipv6_addr_copy(&iph->saddr, &hao->addr); + ipv6_addr_copy(&hao->addr, &tmp); + } + } +} +#else +static inline void mip6_addr_swap(struct sk_buff *skb) {} +#endif + /* * Send an ICMP message in response to a packet in error */ @@ -350,6 +373,8 @@ void icmpv6_send(struct sk_buff *skb, int type, int code, __u32 info, return; } + mip6_addr_swap(skb); + memset(&fl, 0, sizeof(fl)); fl.proto = IPPROTO_ICMPV6; ipv6_addr_copy(&fl.fl6_dst, &hdr->saddr); -- cgit v0.10.2 From 27637df92e25dfb45dd71a93a2f4bf9c080fa627 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 19:29:47 -0700 Subject: [IPV6] IPSEC: Support sending with Mobile IPv6 extension headers. Mobile IPv6 defines home address option as an option of destination options header. It is placed before fragment header then ip6_find_1stfragopt() is fixed to know about it. Home address option also carries final source address of the flow, then outbound AH calculation should take care of it like routing header case. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/ah6.c b/net/ipv6/ah6.c index 6c0aa51..0f2b4e3 100644 --- a/net/ipv6/ah6.c +++ b/net/ipv6/ah6.c @@ -74,6 +74,68 @@ bad: return 0; } +#ifdef CONFIG_IPV6_MIP6 +/** + * ipv6_rearrange_destopt - rearrange IPv6 destination options header + * @iph: IPv6 header + * @destopt: destionation options header + */ +static void ipv6_rearrange_destopt(struct ipv6hdr *iph, struct ipv6_opt_hdr *destopt) +{ + u8 *opt = (u8 *)destopt; + int len = ipv6_optlen(destopt); + int off = 0; + int optlen = 0; + + off += 2; + len -= 2; + + while (len > 0) { + + switch (opt[off]) { + + case IPV6_TLV_PAD0: + optlen = 1; + break; + default: + if (len < 2) + goto bad; + optlen = opt[off+1]+2; + if (len < optlen) + goto bad; + + /* Rearrange the source address in @iph and the + * addresses in home address option for final source. + * See 11.3.2 of RFC 3775 for details. + */ + if (opt[off] == IPV6_TLV_HAO) { + struct in6_addr final_addr; + struct ipv6_destopt_hao *hao; + + hao = (struct ipv6_destopt_hao *)&opt[off]; + if (hao->length != sizeof(hao->addr)) { + if (net_ratelimit()) + printk(KERN_WARNING "destopt hao: invalid header length: %u\n", hao->length); + goto bad; + } + ipv6_addr_copy(&final_addr, &hao->addr); + ipv6_addr_copy(&hao->addr, &iph->saddr); + ipv6_addr_copy(&iph->saddr, &final_addr); + } + break; + } + + off += optlen; + len -= optlen; + } + if (len == 0) + return; + +bad: + return; +} +#endif + /** * ipv6_rearrange_rthdr - rearrange IPv6 routing header * @iph: IPv6 header @@ -113,7 +175,11 @@ static void ipv6_rearrange_rthdr(struct ipv6hdr *iph, struct ipv6_rt_hdr *rthdr) ipv6_addr_copy(&iph->daddr, &final_addr); } +#ifdef CONFIG_IPV6_MIP6 +static int ipv6_clear_mutable_options(struct ipv6hdr *iph, int len, int dir) +#else static int ipv6_clear_mutable_options(struct ipv6hdr *iph, int len) +#endif { union { struct ipv6hdr *iph; @@ -128,6 +194,28 @@ static int ipv6_clear_mutable_options(struct ipv6hdr *iph, int len) while (exthdr.raw < end) { switch (nexthdr) { +#ifdef CONFIG_IPV6_MIP6 + case NEXTHDR_HOP: + if (!zero_out_mutable_opts(exthdr.opth)) { + LIMIT_NETDEBUG( + KERN_WARNING "overrun %sopts\n", + nexthdr == NEXTHDR_HOP ? + "hop" : "dest"); + return -EINVAL; + } + break; + case NEXTHDR_DEST: + if (dir == XFRM_POLICY_OUT) + ipv6_rearrange_destopt(iph, exthdr.opth); + if (!zero_out_mutable_opts(exthdr.opth)) { + LIMIT_NETDEBUG( + KERN_WARNING "overrun %sopts\n", + nexthdr == NEXTHDR_HOP ? + "hop" : "dest"); + return -EINVAL; + } + break; +#else case NEXTHDR_HOP: case NEXTHDR_DEST: if (!zero_out_mutable_opts(exthdr.opth)) { @@ -138,6 +226,7 @@ static int ipv6_clear_mutable_options(struct ipv6hdr *iph, int len) return -EINVAL; } break; +#endif case NEXTHDR_ROUTING: ipv6_rearrange_rthdr(iph, exthdr.rth); @@ -164,6 +253,9 @@ static int ah6_output(struct xfrm_state *x, struct sk_buff *skb) u8 nexthdr; char tmp_base[8]; struct { +#ifdef CONFIG_IPV6_MIP6 + struct in6_addr saddr; +#endif struct in6_addr daddr; char hdrs[0]; } *tmp_ext; @@ -188,10 +280,18 @@ static int ah6_output(struct xfrm_state *x, struct sk_buff *skb) err = -ENOMEM; goto error; } +#ifdef CONFIG_IPV6_MIP6 + memcpy(tmp_ext, &top_iph->saddr, extlen); + err = ipv6_clear_mutable_options(top_iph, + extlen - sizeof(*tmp_ext) + + sizeof(*top_iph), + XFRM_POLICY_OUT); +#else memcpy(tmp_ext, &top_iph->daddr, extlen); err = ipv6_clear_mutable_options(top_iph, extlen - sizeof(*tmp_ext) + sizeof(*top_iph)); +#endif if (err) goto error_free_iph; } @@ -222,7 +322,11 @@ static int ah6_output(struct xfrm_state *x, struct sk_buff *skb) memcpy(top_iph, tmp_base, sizeof(tmp_base)); if (tmp_ext) { +#ifdef CONFIG_IPV6_MIP6 + memcpy(&top_iph->saddr, tmp_ext, extlen); +#else memcpy(&top_iph->daddr, tmp_ext, extlen); +#endif error_free_iph: kfree(tmp_ext); } @@ -282,8 +386,13 @@ static int ah6_input(struct xfrm_state *x, struct sk_buff *skb) if (!tmp_hdr) goto out; memcpy(tmp_hdr, skb->nh.raw, hdr_len); +#ifdef CONFIG_IPV6_MIP6 + if (ipv6_clear_mutable_options(skb->nh.ipv6h, hdr_len, XFRM_POLICY_IN)) + goto free_out; +#else if (ipv6_clear_mutable_options(skb->nh.ipv6h, hdr_len)) goto free_out; +#endif skb->nh.ipv6h->priority = 0; skb->nh.ipv6h->flow_lbl[0] = 0; skb->nh.ipv6h->flow_lbl[1] = 0; diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 258e3e4..c14ea1e 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -475,17 +475,25 @@ int ip6_find_1stfragopt(struct sk_buff *skb, u8 **nexthdr) switch (**nexthdr) { case NEXTHDR_HOP: + break; case NEXTHDR_ROUTING: + found_rhdr = 1; + break; case NEXTHDR_DEST: - if (**nexthdr == NEXTHDR_ROUTING) found_rhdr = 1; - if (**nexthdr == NEXTHDR_DEST && found_rhdr) return offset; - offset += ipv6_optlen(exthdr); - *nexthdr = &exthdr->nexthdr; - exthdr = (struct ipv6_opt_hdr*)(skb->nh.raw + offset); +#ifdef CONFIG_IPV6_MIP6 + if (ipv6_find_tlv(skb, offset, IPV6_TLV_HAO) >= 0) + break; +#endif + if (found_rhdr) + return offset; break; default : return offset; } + + offset += ipv6_optlen(exthdr); + *nexthdr = &exthdr->nexthdr; + exthdr = (struct ipv6_opt_hdr*)(skb->nh.raw + offset); } return offset; -- cgit v0.10.2 From 2c8d7ca0f76103855ad1f2a930e05683b64a00eb Mon Sep 17 00:00:00 2001 From: Noriaki TAKAMIYA Date: Wed, 23 Aug 2006 20:31:11 -0700 Subject: [IPV6] MIP6: Add routing header type 2 transformation. Add routing header type 2 transformation for Mobile IPv6. Based on MIPL2 kernel patch. Signed-off-by: Noriaki TAKAMIYA Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/mip6.h b/include/net/mip6.h new file mode 100644 index 0000000..644b8b6 --- /dev/null +++ b/include/net/mip6.h @@ -0,0 +1,31 @@ +/* + * Copyright (C)2003-2006 Helsinki University of Technology + * Copyright (C)2003-2006 USAGI/WIDE Project + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ +/* + * Authors: + * Noriaki TAKAMIYA @USAGI + * Masahide NAKAMURA @USAGI + * YOSHIFUJI Hideaki @USAGI + */ +#ifndef _NET_MIP6_H +#define _NET_MIP6_H + +extern int mip6_init(void); +extern void mip6_fini(void); + +#endif diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile index 87e912e..0213c66 100644 --- a/net/ipv6/Makefile +++ b/net/ipv6/Makefile @@ -14,6 +14,8 @@ ipv6-$(CONFIG_XFRM) += xfrm6_policy.o xfrm6_state.o xfrm6_input.o \ xfrm6_output.o ipv6-$(CONFIG_NETFILTER) += netfilter.o ipv6-$(CONFIG_IPV6_MULTIPLE_TABLES) += fib6_rules.o +ipv6-$(CONFIG_IPV6_MIP6) += mip6.o + ipv6-objs += $(ipv6-y) obj-$(CONFIG_INET6_AH) += ah6.o diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 57ee5dd..fc9c8a9 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -59,6 +59,9 @@ #ifdef CONFIG_IPV6_TUNNEL #include #endif +#ifdef CONFIG_IPV6_MIP6 +#include +#endif #include #include @@ -857,6 +860,9 @@ static int __init inet6_init(void) ipv6_frag_init(); ipv6_nodata_init(); ipv6_destopt_init(); +#ifdef CONFIG_IPV6_MIP6 + mip6_init(); +#endif /* Init v6 transport protocols. */ udpv6_init(); @@ -920,6 +926,9 @@ static void __exit inet6_exit(void) tcp6_proc_exit(); raw6_proc_exit(); #endif +#ifdef CONFIG_IPV6_MIP6 + mip6_fini(); +#endif /* Cleanup code parts. */ sit_cleanup(); ip6_flowlabel_cleanup(); diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c new file mode 100644 index 0000000..63e548b --- /dev/null +++ b/net/ipv6/mip6.c @@ -0,0 +1,181 @@ +/* + * Copyright (C)2003-2006 Helsinki University of Technology + * Copyright (C)2003-2006 USAGI/WIDE Project + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + */ +/* + * Authors: + * Noriaki TAKAMIYA @USAGI + * Masahide NAKAMURA @USAGI + */ + +#include +#include +#include +#include +#include +#include +#include + +static xfrm_address_t *mip6_xfrm_addr(struct xfrm_state *x, xfrm_address_t *addr) +{ + return x->coaddr; +} + +static int mip6_rthdr_input(struct xfrm_state *x, struct sk_buff *skb) +{ + struct rt2_hdr *rt2 = (struct rt2_hdr *)skb->data; + + if (!ipv6_addr_equal(&rt2->addr, (struct in6_addr *)x->coaddr) && + !ipv6_addr_any((struct in6_addr *)x->coaddr)) + return -ENOENT; + + return rt2->rt_hdr.nexthdr; +} + +/* Routing Header type 2 is inserted. + * IP Header's dst address is replaced with Routing Header's Home Address. + */ +static int mip6_rthdr_output(struct xfrm_state *x, struct sk_buff *skb) +{ + struct ipv6hdr *iph; + struct rt2_hdr *rt2; + u8 nexthdr; + + iph = (struct ipv6hdr *)skb->data; + iph->payload_len = htons(skb->len - sizeof(*iph)); + + nexthdr = *skb->nh.raw; + *skb->nh.raw = IPPROTO_ROUTING; + + rt2 = (struct rt2_hdr *)skb->h.raw; + rt2->rt_hdr.nexthdr = nexthdr; + rt2->rt_hdr.hdrlen = (x->props.header_len >> 3) - 1; + rt2->rt_hdr.type = IPV6_SRCRT_TYPE_2; + rt2->rt_hdr.segments_left = 1; + memset(&rt2->reserved, 0, sizeof(rt2->reserved)); + + BUG_TRAP(rt2->rt_hdr.hdrlen == 2); + + memcpy(&rt2->addr, &iph->daddr, sizeof(rt2->addr)); + memcpy(&iph->daddr, x->coaddr, sizeof(iph->daddr)); + + return 0; +} + +static int mip6_rthdr_offset(struct xfrm_state *x, struct sk_buff *skb, + u8 **nexthdr) +{ + u16 offset = sizeof(struct ipv6hdr); + struct ipv6_opt_hdr *exthdr = (struct ipv6_opt_hdr*)(skb->nh.ipv6h + 1); + unsigned int packet_len = skb->tail - skb->nh.raw; + int found_rhdr = 0; + + *nexthdr = &skb->nh.ipv6h->nexthdr; + + while (offset + 1 <= packet_len) { + + switch (**nexthdr) { + case NEXTHDR_HOP: + break; + case NEXTHDR_ROUTING: + if (offset + 3 <= packet_len) { + struct ipv6_rt_hdr *rt; + rt = (struct ipv6_rt_hdr *)(skb->nh.raw + offset); + if (rt->type != 0) + return offset; + } + found_rhdr = 1; + break; + case NEXTHDR_DEST: + if (ipv6_find_tlv(skb, offset, IPV6_TLV_HAO) >= 0) + return offset; + + if (found_rhdr) + return offset; + + break; + default: + return offset; + } + + offset += ipv6_optlen(exthdr); + *nexthdr = &exthdr->nexthdr; + exthdr = (struct ipv6_opt_hdr*)(skb->nh.raw + offset); + } + + return offset; +} + +static int mip6_rthdr_init_state(struct xfrm_state *x) +{ + if (x->id.spi) { + printk(KERN_INFO "%s: spi is not 0: %u\n", __FUNCTION__, + x->id.spi); + return -EINVAL; + } + if (x->props.mode != XFRM_MODE_ROUTEOPTIMIZATION) { + printk(KERN_INFO "%s: state's mode is not %u: %u\n", + __FUNCTION__, XFRM_MODE_ROUTEOPTIMIZATION, x->props.mode); + return -EINVAL; + } + + x->props.header_len = sizeof(struct rt2_hdr); + + return 0; +} + +/* + * Do nothing about destroying since it has no specific operation for routing + * header type 2 unlike IPsec protocols. + */ +static void mip6_rthdr_destroy(struct xfrm_state *x) +{ +} + +static struct xfrm_type mip6_rthdr_type = +{ + .description = "MIP6RT", + .owner = THIS_MODULE, + .proto = IPPROTO_ROUTING, + .flags = XFRM_TYPE_NON_FRAGMENT, + .init_state = mip6_rthdr_init_state, + .destructor = mip6_rthdr_destroy, + .input = mip6_rthdr_input, + .output = mip6_rthdr_output, + .hdr_offset = mip6_rthdr_offset, + .remote_addr = mip6_xfrm_addr, +}; + +int __init mip6_init(void) +{ + printk(KERN_INFO "Mobile IPv6\n"); + + if (xfrm_register_type(&mip6_rthdr_type, AF_INET6) < 0) { + printk(KERN_INFO "%s: can't add xfrm type(rthdr)\n", __FUNCTION__); + goto mip6_rthdr_xfrm_fail; + } + return 0; + + mip6_rthdr_xfrm_fail: + return -EAGAIN; +} + +void __exit mip6_fini(void) +{ + if (xfrm_unregister_type(&mip6_rthdr_type, AF_INET6) < 0) + printk(KERN_INFO "%s: can't remove xfrm type(rthdr)\n", __FUNCTION__); +} -- cgit v0.10.2 From 3d126890dd67beffec27c1b6f51c040fc8d0b526 Mon Sep 17 00:00:00 2001 From: Noriaki TAKAMIYA Date: Wed, 23 Aug 2006 20:32:34 -0700 Subject: [IPV6] MIP6: Add destination options header transformation. Add destination options header transformation for Mobile IPv6. Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala Signed-off-by: Noriaki TAKAMIYA Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/mip6.h b/include/net/mip6.h index 644b8b6..42b65ba 100644 --- a/include/net/mip6.h +++ b/include/net/mip6.h @@ -25,6 +25,9 @@ #ifndef _NET_MIP6_H #define _NET_MIP6_H +#define MIP6_OPT_PAD_1 0 +#define MIP6_OPT_PAD_N 1 + extern int mip6_init(void); extern void mip6_fini(void); diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c index 63e548b..a8adf89 100644 --- a/net/ipv6/mip6.c +++ b/net/ipv6/mip6.c @@ -35,6 +35,165 @@ static xfrm_address_t *mip6_xfrm_addr(struct xfrm_state *x, xfrm_address_t *addr return x->coaddr; } +static inline unsigned int calc_padlen(unsigned int len, unsigned int n) +{ + return (n - len + 16) & 0x7; +} + +static inline void *mip6_padn(__u8 *data, __u8 padlen) +{ + if (!data) + return NULL; + if (padlen == 1) { + data[0] = MIP6_OPT_PAD_1; + } else if (padlen > 1) { + data[0] = MIP6_OPT_PAD_N; + data[1] = padlen - 2; + if (padlen > 2) + memset(data+2, 0, data[1]); + } + return data + padlen; +} + +static int mip6_destopt_input(struct xfrm_state *x, struct sk_buff *skb) +{ + struct ipv6hdr *iph = skb->nh.ipv6h; + struct ipv6_destopt_hdr *destopt = (struct ipv6_destopt_hdr *)skb->data; + + if (!ipv6_addr_equal(&iph->saddr, (struct in6_addr *)x->coaddr) && + !ipv6_addr_any((struct in6_addr *)x->coaddr)) + return -ENOENT; + + return destopt->nexthdr; +} + +/* Destination Option Header is inserted. + * IP Header's src address is replaced with Home Address Option in + * Destination Option Header. + */ +static int mip6_destopt_output(struct xfrm_state *x, struct sk_buff *skb) +{ + struct ipv6hdr *iph; + struct ipv6_destopt_hdr *dstopt; + struct ipv6_destopt_hao *hao; + u8 nexthdr; + int len; + + iph = (struct ipv6hdr *)skb->data; + iph->payload_len = htons(skb->len - sizeof(*iph)); + + nexthdr = *skb->nh.raw; + *skb->nh.raw = IPPROTO_DSTOPTS; + + dstopt = (struct ipv6_destopt_hdr *)skb->h.raw; + dstopt->nexthdr = nexthdr; + + hao = mip6_padn((char *)(dstopt + 1), + calc_padlen(sizeof(*dstopt), 6)); + + hao->type = IPV6_TLV_HAO; + hao->length = sizeof(*hao) - 2; + BUG_TRAP(hao->length == 16); + + len = ((char *)hao - (char *)dstopt) + sizeof(*hao); + + memcpy(&hao->addr, &iph->saddr, sizeof(hao->addr)); + memcpy(&iph->saddr, x->coaddr, sizeof(iph->saddr)); + + BUG_TRAP(len == x->props.header_len); + dstopt->hdrlen = (x->props.header_len >> 3) - 1; + + return 0; +} + +static int mip6_destopt_offset(struct xfrm_state *x, struct sk_buff *skb, + u8 **nexthdr) +{ + u16 offset = sizeof(struct ipv6hdr); + struct ipv6_opt_hdr *exthdr = (struct ipv6_opt_hdr*)(skb->nh.ipv6h + 1); + unsigned int packet_len = skb->tail - skb->nh.raw; + int found_rhdr = 0; + + *nexthdr = &skb->nh.ipv6h->nexthdr; + + while (offset + 1 <= packet_len) { + + switch (**nexthdr) { + case NEXTHDR_HOP: + break; + case NEXTHDR_ROUTING: + found_rhdr = 1; + break; + case NEXTHDR_DEST: + /* + * HAO MUST NOT appear more than once. + * XXX: It is better to try to find by the end of + * XXX: packet if HAO exists. + */ + if (ipv6_find_tlv(skb, offset, IPV6_TLV_HAO) >= 0) { + LIMIT_NETDEBUG(KERN_WARNING "mip6: hao exists already, override\n"); + return offset; + } + + if (found_rhdr) + return offset; + + break; + default: + return offset; + } + + offset += ipv6_optlen(exthdr); + *nexthdr = &exthdr->nexthdr; + exthdr = (struct ipv6_opt_hdr*)(skb->nh.raw + offset); + } + + return offset; +} + +static int mip6_destopt_init_state(struct xfrm_state *x) +{ + if (x->id.spi) { + printk(KERN_INFO "%s: spi is not 0: %u\n", __FUNCTION__, + x->id.spi); + return -EINVAL; + } + if (x->props.mode != XFRM_MODE_ROUTEOPTIMIZATION) { + printk(KERN_INFO "%s: state's mode is not %u: %u\n", + __FUNCTION__, XFRM_MODE_ROUTEOPTIMIZATION, x->props.mode); + return -EINVAL; + } + + x->props.header_len = sizeof(struct ipv6_destopt_hdr) + + calc_padlen(sizeof(struct ipv6_destopt_hdr), 6) + + sizeof(struct ipv6_destopt_hao); + BUG_TRAP(x->props.header_len == 24); + + return 0; +} + +/* + * Do nothing about destroying since it has no specific operation for + * destination options header unlike IPsec protocols. + */ +static void mip6_destopt_destroy(struct xfrm_state *x) +{ +} + +static struct xfrm_type mip6_destopt_type = +{ + .description = "MIP6DESTOPT", + .owner = THIS_MODULE, + .proto = IPPROTO_DSTOPTS, + .flags = XFRM_TYPE_NON_FRAGMENT, + .init_state = mip6_destopt_init_state, + .destructor = mip6_destopt_destroy, + .input = mip6_destopt_input, + .output = mip6_destopt_output, + .hdr_offset = mip6_destopt_offset, + .local_addr = mip6_xfrm_addr, +}; + static int mip6_rthdr_input(struct xfrm_state *x, struct sk_buff *skb) { struct rt2_hdr *rt2 = (struct rt2_hdr *)skb->data; @@ -164,6 +323,10 @@ int __init mip6_init(void) { printk(KERN_INFO "Mobile IPv6\n"); + if (xfrm_register_type(&mip6_destopt_type, AF_INET6) < 0) { + printk(KERN_INFO "%s: can't add xfrm type(destopt)\n", __FUNCTION__); + goto mip6_destopt_xfrm_fail; + } if (xfrm_register_type(&mip6_rthdr_type, AF_INET6) < 0) { printk(KERN_INFO "%s: can't add xfrm type(rthdr)\n", __FUNCTION__); goto mip6_rthdr_xfrm_fail; @@ -171,6 +334,8 @@ int __init mip6_init(void) return 0; mip6_rthdr_xfrm_fail: + xfrm_unregister_type(&mip6_destopt_type, AF_INET6); + mip6_destopt_xfrm_fail: return -EAGAIN; } @@ -178,4 +343,6 @@ void __exit mip6_fini(void) { if (xfrm_unregister_type(&mip6_rthdr_type, AF_INET6) < 0) printk(KERN_INFO "%s: can't remove xfrm type(rthdr)\n", __FUNCTION__); + if (xfrm_unregister_type(&mip6_destopt_type, AF_INET6) < 0) + printk(KERN_INFO "%s: can't remove xfrm type(destopt)\n", __FUNCTION__); } -- cgit v0.10.2 From e23c7194a8a21e96b99106bdabde94614c4b84d6 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 20:33:28 -0700 Subject: [XFRM] STATE: Add Mobile IPv6 route optimization protocols to netlink interface. Add Mobile IPv6 route optimization protocols to netlink interface. Route optimization states carry care-of address. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 3a83c59..770bd24 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -28,6 +28,9 @@ #include #include #include +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) +#include +#endif static int verify_one_alg(struct rtattr **xfrma, enum xfrm_attr_type_t type) { @@ -173,6 +176,19 @@ static int verify_newsa_info(struct xfrm_usersa_info *p, goto out; break; +#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE) + case IPPROTO_DSTOPTS: + case IPPROTO_ROUTING: + if (xfrma[XFRMA_ALG_COMP-1] || + xfrma[XFRMA_ALG_AUTH-1] || + xfrma[XFRMA_ALG_CRYPT-1] || + xfrma[XFRMA_ENCAP-1] || + xfrma[XFRMA_SEC_CTX-1] || + !xfrma[XFRMA_COADDR-1]) + goto out; + break; +#endif + default: goto out; }; -- cgit v0.10.2 From 2b741653b6c824fe7520ee92b6795f11c5f24b24 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 20:34:26 -0700 Subject: [IPV6] MIP6: Add Mobility header definition. Add Mobility header definition for Mobile IPv6. Based on MIPL2 kernel patch. This patch was also written by: Antti Tuominen Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/linux/in6.h b/include/linux/in6.h index 086ec2a..d776829 100644 --- a/include/linux/in6.h +++ b/include/linux/in6.h @@ -134,6 +134,7 @@ struct in6_flowlabel_req #define IPPROTO_ICMPV6 58 /* ICMPv6 */ #define IPPROTO_NONE 59 /* IPv6 no next header */ #define IPPROTO_DSTOPTS 60 /* IPv6 destination options */ +#define IPPROTO_MH 135 /* IPv6 mobility header */ /* * IPv6 TLV options. diff --git a/include/net/flow.h b/include/net/flow.h index 21d988b..e052291 100644 --- a/include/net/flow.h +++ b/include/net/flow.h @@ -72,12 +72,21 @@ struct flowi { } dnports; __u32 spi; + +#ifdef CONFIG_IPV6_MIP6 + struct { + __u8 type; + } mht; +#endif } uli_u; #define fl_ip_sport uli_u.ports.sport #define fl_ip_dport uli_u.ports.dport #define fl_icmp_type uli_u.icmpt.type #define fl_icmp_code uli_u.icmpt.code #define fl_ipsec_spi uli_u.spi +#ifdef CONFIG_IPV6_MIP6 +#define fl_mh_type uli_u.mht.type +#endif __u32 secid; /* used by xfrm; see secid.txt */ } __attribute__((__aligned__(BITS_PER_LONG/8))); diff --git a/include/net/ipv6.h b/include/net/ipv6.h index 8e6ec60..72bf47b 100644 --- a/include/net/ipv6.h +++ b/include/net/ipv6.h @@ -40,6 +40,7 @@ #define NEXTHDR_ICMP 58 /* ICMP for IPv6. */ #define NEXTHDR_NONE 59 /* No next header */ #define NEXTHDR_DEST 60 /* Destination options header. */ +#define NEXTHDR_MOBILITY 135 /* Mobility header. */ #define NEXTHDR_MAX 255 diff --git a/include/net/mip6.h b/include/net/mip6.h index 42b65ba..fd43178 100644 --- a/include/net/mip6.h +++ b/include/net/mip6.h @@ -28,6 +28,29 @@ #define MIP6_OPT_PAD_1 0 #define MIP6_OPT_PAD_N 1 +/* + * Mobility Header + */ +struct ip6_mh { + __u8 ip6mh_proto; + __u8 ip6mh_hdrlen; + __u8 ip6mh_type; + __u8 ip6mh_reserved; + __u16 ip6mh_cksum; + /* Followed by type specific messages */ + __u8 data[0]; +} __attribute__ ((__packed__)); + +#define IP6_MH_TYPE_BRR 0 /* Binding Refresh Request */ +#define IP6_MH_TYPE_HOTI 1 /* HOTI Message */ +#define IP6_MH_TYPE_COTI 2 /* COTI Message */ +#define IP6_MH_TYPE_HOT 3 /* HOT Message */ +#define IP6_MH_TYPE_COT 4 /* COT Message */ +#define IP6_MH_TYPE_BU 5 /* Binding Update */ +#define IP6_MH_TYPE_BACK 6 /* Binding ACK */ +#define IP6_MH_TYPE_BERROR 7 /* Binding Error */ +#define IP6_MH_TYPE_MAX IP6_MH_TYPE_BERROR + extern int mip6_init(void); extern void mip6_fini(void); -- cgit v0.10.2 From 7be96f7628469e56f91d51f13b03e9bcff113c7f Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 20:35:31 -0700 Subject: [IPV6] MIP6: Add receiving mobility header functions through raw socket. Like ICMPv6, mobility header is handled through raw socket. In inbound case, check only whether ICMPv6 error should be sent as a reply or not by kernel. Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala This patch was also written by: Antti Tuominen Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/mip6.h b/include/net/mip6.h index fd43178..68263c6 100644 --- a/include/net/mip6.h +++ b/include/net/mip6.h @@ -25,6 +25,9 @@ #ifndef _NET_MIP6_H #define _NET_MIP6_H +#include +#include + #define MIP6_OPT_PAD_1 0 #define MIP6_OPT_PAD_N 1 @@ -53,5 +56,6 @@ struct ip6_mh { extern int mip6_init(void); extern void mip6_fini(void); +extern int mip6_mh_filter(struct sock *sk, struct sk_buff *skb); #endif diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c index a8adf89..7b5f893 100644 --- a/net/ipv6/mip6.c +++ b/net/ipv6/mip6.c @@ -26,7 +26,10 @@ #include #include #include +#include +#include #include +#include #include #include @@ -55,6 +58,86 @@ static inline void *mip6_padn(__u8 *data, __u8 padlen) return data + padlen; } +static inline void mip6_param_prob(struct sk_buff *skb, int code, int pos) +{ + icmpv6_send(skb, ICMPV6_PARAMPROB, code, pos, skb->dev); +} + +static int mip6_mh_len(int type) +{ + int len = 0; + + switch (type) { + case IP6_MH_TYPE_BRR: + len = 0; + break; + case IP6_MH_TYPE_HOTI: + case IP6_MH_TYPE_COTI: + case IP6_MH_TYPE_BU: + case IP6_MH_TYPE_BACK: + len = 1; + break; + case IP6_MH_TYPE_HOT: + case IP6_MH_TYPE_COT: + case IP6_MH_TYPE_BERROR: + len = 2; + break; + } + return len; +} + +int mip6_mh_filter(struct sock *sk, struct sk_buff *skb) +{ + struct ip6_mh *mh; + int mhlen; + + if (!pskb_may_pull(skb, (skb->h.raw - skb->data) + 8) || + !pskb_may_pull(skb, (skb->h.raw - skb->data) + ((skb->h.raw[1] + 1) << 3))) + return -1; + + mh = (struct ip6_mh *)skb->h.raw; + + if (mh->ip6mh_hdrlen < mip6_mh_len(mh->ip6mh_type)) { + LIMIT_NETDEBUG(KERN_DEBUG "mip6: MH message too short: %d vs >=%d\n", + mh->ip6mh_hdrlen, mip6_mh_len(mh->ip6mh_type)); + mip6_param_prob(skb, 0, (&mh->ip6mh_hdrlen) - skb->nh.raw); + return -1; + } + mhlen = (mh->ip6mh_hdrlen + 1) << 3; + + if (skb->ip_summed == CHECKSUM_COMPLETE) { + skb->ip_summed = CHECKSUM_UNNECESSARY; + if (csum_ipv6_magic(&skb->nh.ipv6h->saddr, + &skb->nh.ipv6h->daddr, + mhlen, IPPROTO_MH, + skb->csum)) { + LIMIT_NETDEBUG(KERN_DEBUG "mip6: MH hw checksum failed\n"); + skb->ip_summed = CHECKSUM_NONE; + } + } + if (skb->ip_summed == CHECKSUM_NONE) { + if (csum_ipv6_magic(&skb->nh.ipv6h->saddr, + &skb->nh.ipv6h->daddr, + mhlen, IPPROTO_MH, + skb_checksum(skb, 0, mhlen, 0))) { + LIMIT_NETDEBUG(KERN_DEBUG "mip6: MH checksum failed [%04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x > %04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x]\n", + NIP6(skb->nh.ipv6h->saddr), + NIP6(skb->nh.ipv6h->daddr)); + return -1; + } + skb->ip_summed = CHECKSUM_UNNECESSARY; + } + + if (mh->ip6mh_proto != IPPROTO_NONE) { + LIMIT_NETDEBUG(KERN_DEBUG "mip6: MH invalid payload proto = %d\n", + mh->ip6mh_proto); + mip6_param_prob(skb, 0, (&mh->ip6mh_proto) - skb->nh.raw); + return -1; + } + + return 0; +} + static int mip6_destopt_input(struct xfrm_state *x, struct sk_buff *skb) { struct ipv6hdr *iph = skb->nh.ipv6h; diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c index d4af1cb..ecca8aa 100644 --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -50,6 +50,9 @@ #include #include #include +#ifdef CONFIG_IPV6_MIP6 +#include +#endif #include #include @@ -169,8 +172,32 @@ int ipv6_raw_deliver(struct sk_buff *skb, int nexthdr) sk = __raw_v6_lookup(sk, nexthdr, daddr, saddr, IP6CB(skb)->iif); while (sk) { + int filtered; + delivered = 1; - if (nexthdr != IPPROTO_ICMPV6 || !icmpv6_filter(sk, skb)) { + switch (nexthdr) { + case IPPROTO_ICMPV6: + filtered = icmpv6_filter(sk, skb); + break; +#ifdef CONFIG_IPV6_MIP6 + case IPPROTO_MH: + /* XXX: To validate MH only once for each packet, + * this is placed here. It should be after checking + * xfrm policy, however it doesn't. The checking xfrm + * policy is placed in rawv6_rcv() because it is + * required for each socket. + */ + filtered = mip6_mh_filter(sk, skb); + break; +#endif + default: + filtered = 0; + break; + } + + if (filtered < 0) + break; + if (filtered == 0) { struct sk_buff *clone = skb_clone(skb, GFP_ATOMIC); /* Not releasing hash table! */ -- cgit v0.10.2 From 6e8f4d48b265225bdf437bbf3151b0d6700dda22 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 20:36:47 -0700 Subject: [IPV6] MIP6: Add sending mobility header functions through raw socket. Mobility header is built by user-space and sent through raw socket. Kernel just extracts its type to flow. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c index ecca8aa..d09329c 100644 --- a/net/ipv6/raw.c +++ b/net/ipv6/raw.c @@ -609,6 +609,9 @@ static void rawv6_probe_proto_opt(struct flowi *fl, struct msghdr *msg) struct iovec *iov; u8 __user *type = NULL; u8 __user *code = NULL; +#ifdef CONFIG_IPV6_MIP6 + u8 len = 0; +#endif int probed = 0; int i; @@ -640,6 +643,20 @@ static void rawv6_probe_proto_opt(struct flowi *fl, struct msghdr *msg) probed = 1; } break; +#ifdef CONFIG_IPV6_MIP6 + case IPPROTO_MH: + if (iov->iov_base && iov->iov_len < 1) + break; + /* check if type field is readable or not. */ + if (iov->iov_len > 2 - len) { + u8 __user *p = iov->iov_base; + get_user(fl->fl_mh_type, &p[2 - len]); + probed = 1; + } else + len += iov->iov_len; + + break; +#endif default: probed = 1; break; -- cgit v0.10.2 From 2ce4272a699c731b9736d76126dc742353e381db Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 20:39:03 -0700 Subject: [IPV6] MIP6: Transformation support mobility header. Transformation support mobility header. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 7f16306..13488e7 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -546,6 +546,11 @@ u16 xfrm_flowi_sport(struct flowi *fl) case IPPROTO_ICMPV6: port = htons(fl->fl_icmp_type); break; +#ifdef CONFIG_IPV6_MIP6 + case IPPROTO_MH: + port = htons(fl->fl_mh_type); + break; +#endif default: port = 0; /*XXX*/ } diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c index 729b474..98c2fe4 100644 --- a/net/ipv6/xfrm6_policy.c +++ b/net/ipv6/xfrm6_policy.c @@ -18,6 +18,9 @@ #include #include #include +#ifdef CONFIG_IPV6_MIP6 +#include +#endif static struct dst_ops xfrm6_dst_ops; static struct xfrm_policy_afinfo xfrm6_policy_afinfo; @@ -270,6 +273,18 @@ _decode_session6(struct sk_buff *skb, struct flowi *fl) fl->proto = nexthdr; return; +#ifdef CONFIG_IPV6_MIP6 + case IPPROTO_MH: + if (pskb_may_pull(skb, skb->nh.raw + offset + 3 - skb->data)) { + struct ip6_mh *mh; + mh = (struct ip6_mh *)exthdr; + + fl->fl_mh_type = mh->ip6mh_type; + } + fl->proto = nexthdr; + return; +#endif + /* XXX Why are there these headers? */ case IPPROTO_AH: case IPPROTO_ESP: -- cgit v0.10.2 From df0ba92a99ca757039dfa84a929281ea3f7a50e8 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 20:41:00 -0700 Subject: [XFRM]: Trace which secpath state is reject factor. For Mobile IPv6 usage, it is required to trace which secpath state is reject factor in order to notify it to user space (to know the address which cannot be used route optimized communication). Based on MIPL2 kernel patch. This patch was also written by: Henrik Petander Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 13488e7..9ebbdc1 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -273,6 +273,7 @@ struct xfrm_type void (*destructor)(struct xfrm_state *); int (*input)(struct xfrm_state *, struct sk_buff *skb); int (*output)(struct xfrm_state *, struct sk_buff *pskb); + int (*reject)(struct xfrm_state *, struct sk_buff *, struct flowi *); int (*hdr_offset)(struct xfrm_state *, struct sk_buff *, u8 **); xfrm_address_t *(*local_addr)(struct xfrm_state *, xfrm_address_t *); xfrm_address_t *(*remote_addr)(struct xfrm_state *, xfrm_address_t *); diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index ad2a5cb..d125a26 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -988,6 +988,23 @@ error: } EXPORT_SYMBOL(xfrm_lookup); +static inline int +xfrm_secpath_reject(int idx, struct sk_buff *skb, struct flowi *fl) +{ + struct xfrm_state *x; + int err; + + if (!skb->sp || idx < 0 || idx >= skb->sp->len) + return 0; + x = skb->sp->xvec[idx]; + if (!x->type->reject) + return 0; + xfrm_state_hold(x); + err = x->type->reject(x, skb, fl); + xfrm_state_put(x); + return err; +} + /* When skb is transformed back to its "native" form, we have to * check policy restrictions. At the moment we make this in maximally * stupid way. Shame on me. :-) Of course, connected sockets must @@ -1010,6 +1027,13 @@ xfrm_state_ok(struct xfrm_tmpl *tmpl, struct xfrm_state *x, xfrm_state_addr_cmp(tmpl, x, family)); } +/* + * 0 or more than 0 is returned when validation is succeeded (either bypass + * because of optional transport mode, or next index of the mathced secpath + * state with the template. + * -1 is returned when no matching template is found. + * Otherwise "-2 - errored_index" is returned. + */ static inline int xfrm_policy_ok(struct xfrm_tmpl *tmpl, struct sec_path *sp, int start, unsigned short family) @@ -1024,8 +1048,11 @@ xfrm_policy_ok(struct xfrm_tmpl *tmpl, struct sec_path *sp, int start, for (; idx < sp->len; idx++) { if (xfrm_state_ok(tmpl, sp->xvec[idx], family)) return ++idx; - if (sp->xvec[idx]->props.mode != XFRM_MODE_TRANSPORT) + if (sp->xvec[idx]->props.mode != XFRM_MODE_TRANSPORT) { + if (start == -1) + start = -2-idx; break; + } } return start; } @@ -1046,11 +1073,14 @@ xfrm_decode_session(struct sk_buff *skb, struct flowi *fl, unsigned short family } EXPORT_SYMBOL(xfrm_decode_session); -static inline int secpath_has_nontransport(struct sec_path *sp, int k) +static inline int secpath_has_nontransport(struct sec_path *sp, int k, int *idxp) { for (; k < sp->len; k++) { - if (sp->xvec[k]->props.mode != XFRM_MODE_TRANSPORT) + if (sp->xvec[k]->props.mode != XFRM_MODE_TRANSPORT) { + if (idxp) + *idxp = k; return 1; + } } return 0; @@ -1062,6 +1092,8 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, struct xfrm_policy *pol; struct flowi fl; u8 fl_dir = policy_to_flow_dir(dir); + int xerr_idx = -1; + int *xerr_idxp = &xerr_idx; if (xfrm_decode_session(skb, &fl, family) < 0) return 0; @@ -1086,8 +1118,13 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, pol = flow_cache_lookup(&fl, family, fl_dir, xfrm_policy_lookup); - if (!pol) - return !skb->sp || !secpath_has_nontransport(skb->sp, 0); + if (!pol) { + if (skb->sp && secpath_has_nontransport(skb->sp, 0, xerr_idxp)) { + xfrm_secpath_reject(xerr_idx, skb, &fl); + return 0; + } + return 1; + } pol->curlft.use_time = (unsigned long)xtime.tv_sec; @@ -1107,11 +1144,14 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, */ for (i = pol->xfrm_nr-1, k = 0; i >= 0; i--) { k = xfrm_policy_ok(pol->xfrm_vec+i, sp, k, family); - if (k < 0) + if (k < 0) { + if (k < -1 && xerr_idxp) + *xerr_idxp = -(2+k); goto reject; + } } - if (secpath_has_nontransport(sp, k)) + if (secpath_has_nontransport(sp, k, xerr_idxp)) goto reject; xfrm_pol_put(pol); @@ -1119,6 +1159,7 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, } reject: + xfrm_secpath_reject(xerr_idx, skb, &fl); xfrm_pol_put(pol); return 0; } -- cgit v0.10.2 From 97a64b4577ae2bc5599dbd008a3cd9e25de9b9f5 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 20:44:06 -0700 Subject: [XFRM]: Introduce XFRM_MSG_REPORT. XFRM_MSG_REPORT is a message as notification of state protocol and selector from kernel to user-space. Mobile IPv6 will use it when inbound reject is occurred at route optimization to make user-space know a binding error requirement. Based on MIPL2 kernel patch. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h index 1d8c1f2..4009f44 100644 --- a/include/linux/xfrm.h +++ b/include/linux/xfrm.h @@ -166,6 +166,10 @@ enum { #define XFRM_MSG_NEWAE XFRM_MSG_NEWAE XFRM_MSG_GETAE, #define XFRM_MSG_GETAE XFRM_MSG_GETAE + + XFRM_MSG_REPORT, +#define XFRM_MSG_REPORT XFRM_MSG_REPORT + __XFRM_MSG_MAX }; #define XFRM_MSG_MAX (__XFRM_MSG_MAX - 1) @@ -325,12 +329,18 @@ struct xfrm_usersa_flush { __u8 proto; }; +struct xfrm_user_report { + __u8 proto; + struct xfrm_selector sel; +}; + #ifndef __KERNEL__ /* backwards compatibility for userspace */ #define XFRMGRP_ACQUIRE 1 #define XFRMGRP_EXPIRE 2 #define XFRMGRP_SA 4 #define XFRMGRP_POLICY 8 +#define XFRMGRP_REPORT 0x10 #endif enum xfrm_nlgroups { @@ -346,6 +356,8 @@ enum xfrm_nlgroups { #define XFRMNLGRP_POLICY XFRMNLGRP_POLICY XFRMNLGRP_AEVENTS, #define XFRMNLGRP_AEVENTS XFRMNLGRP_AEVENTS + XFRMNLGRP_REPORT, +#define XFRMNLGRP_REPORT XFRMNLGRP_REPORT __XFRMNLGRP_MAX }; #define XFRMNLGRP_MAX (__XFRMNLGRP_MAX - 1) diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 9ebbdc1..0b223ee 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -381,6 +381,7 @@ struct xfrm_mgr struct xfrm_policy *(*compile_policy)(struct sock *sk, int opt, u8 *data, int len, int *dir); int (*new_mapping)(struct xfrm_state *x, xfrm_address_t *ipaddr, u16 sport); int (*notify_policy)(struct xfrm_policy *x, int dir, struct km_event *c); + int (*report)(u8 proto, struct xfrm_selector *sel, xfrm_address_t *addr); }; extern int xfrm_register_km(struct xfrm_mgr *km); @@ -1043,6 +1044,7 @@ extern void xfrm_init_pmtu(struct dst_entry *dst); extern wait_queue_head_t km_waitq; extern int km_new_mapping(struct xfrm_state *x, xfrm_address_t *ipaddr, u16 sport); extern void km_policy_expired(struct xfrm_policy *pol, int dir, int hard, u32 pid); +extern int km_report(u8 proto, struct xfrm_selector *sel, xfrm_address_t *addr); extern void xfrm_input_init(void); extern int xfrm_parse_spi(struct sk_buff *skb, u8 nexthdr, u32 *spi, u32 *seq); diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 3da89c0..a26ef69 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -1055,6 +1055,25 @@ void km_policy_expired(struct xfrm_policy *pol, int dir, int hard, u32 pid) } EXPORT_SYMBOL(km_policy_expired); +int km_report(u8 proto, struct xfrm_selector *sel, xfrm_address_t *addr) +{ + int err = -EINVAL; + int ret; + struct xfrm_mgr *km; + + read_lock(&xfrm_km_lock); + list_for_each_entry(km, &xfrm_km_list, list) { + if (km->report) { + ret = km->report(proto, sel, addr); + if (!ret) + err = ret; + } + } + read_unlock(&xfrm_km_lock); + return err; +} +EXPORT_SYMBOL(km_report); + int xfrm_user_policy(struct sock *sk, int optname, u8 __user *optval, int optlen) { int err; diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 770bd24..7303b82 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -1491,6 +1491,7 @@ static const int xfrm_msg_min[XFRM_NR_MSGTYPES] = { [XFRM_MSG_FLUSHPOLICY - XFRM_MSG_BASE] = NLMSG_LENGTH(0), [XFRM_MSG_NEWAE - XFRM_MSG_BASE] = XMSGSIZE(xfrm_aevent_id), [XFRM_MSG_GETAE - XFRM_MSG_BASE] = XMSGSIZE(xfrm_aevent_id), + [XFRM_MSG_REPORT - XFRM_MSG_BASE] = XMSGSIZE(xfrm_user_report), }; #undef XMSGSIZE @@ -2058,12 +2059,57 @@ static int xfrm_send_policy_notify(struct xfrm_policy *xp, int dir, struct km_ev } +static int build_report(struct sk_buff *skb, u8 proto, + struct xfrm_selector *sel, xfrm_address_t *addr) +{ + struct xfrm_user_report *ur; + struct nlmsghdr *nlh; + unsigned char *b = skb->tail; + + nlh = NLMSG_PUT(skb, 0, 0, XFRM_MSG_REPORT, sizeof(*ur)); + ur = NLMSG_DATA(nlh); + nlh->nlmsg_flags = 0; + + ur->proto = proto; + memcpy(&ur->sel, sel, sizeof(ur->sel)); + + if (addr) + RTA_PUT(skb, XFRMA_COADDR, sizeof(*addr), addr); + + nlh->nlmsg_len = skb->tail - b; + return skb->len; + +nlmsg_failure: +rtattr_failure: + skb_trim(skb, b - skb->data); + return -1; +} + +static int xfrm_send_report(u8 proto, struct xfrm_selector *sel, + xfrm_address_t *addr) +{ + struct sk_buff *skb; + size_t len; + + len = NLMSG_ALIGN(NLMSG_LENGTH(sizeof(struct xfrm_user_report))); + skb = alloc_skb(len, GFP_ATOMIC); + if (skb == NULL) + return -ENOMEM; + + if (build_report(skb, proto, sel, addr) < 0) + BUG(); + + NETLINK_CB(skb).dst_group = XFRMNLGRP_REPORT; + return netlink_broadcast(xfrm_nl, skb, 0, XFRMNLGRP_REPORT, GFP_ATOMIC); +} + static struct xfrm_mgr netlink_mgr = { .id = "netlink", .notify = xfrm_send_state_notify, .acquire = xfrm_send_acquire, .compile_policy = xfrm_compile_policy, .notify_policy = xfrm_send_policy_notify, + .report = xfrm_send_report, }; static int __init xfrm_user_init(void) -- cgit v0.10.2 From 70182ed23d2559345aadb3cfb6a68a7c1cc0aa39 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 20:45:55 -0700 Subject: [IPV6] MIP6: Report to user-space when home address option is rejected. Report to user-space when home address option is rejected. In receiving this message user-space application will send Mobile IPv6 binding error. It is rate-limited by kernel. Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c index 7b5f893..31445d0 100644 --- a/net/ipv6/mip6.c +++ b/net/ipv6/mip6.c @@ -25,6 +25,7 @@ #include #include #include +#include #include #include #include @@ -138,6 +139,18 @@ int mip6_mh_filter(struct sock *sk, struct sk_buff *skb) return 0; } +struct mip6_report_rate_limiter { + spinlock_t lock; + struct timeval stamp; + int iif; + struct in6_addr src; + struct in6_addr dst; +}; + +static struct mip6_report_rate_limiter mip6_report_rl = { + .lock = SPIN_LOCK_UNLOCKED +}; + static int mip6_destopt_input(struct xfrm_state *x, struct sk_buff *skb) { struct ipv6hdr *iph = skb->nh.ipv6h; @@ -189,6 +202,75 @@ static int mip6_destopt_output(struct xfrm_state *x, struct sk_buff *skb) return 0; } +static inline int mip6_report_rl_allow(struct timeval *stamp, + struct in6_addr *dst, + struct in6_addr *src, int iif) +{ + int allow = 0; + + spin_lock_bh(&mip6_report_rl.lock); + if (mip6_report_rl.stamp.tv_sec != stamp->tv_sec || + mip6_report_rl.stamp.tv_usec != stamp->tv_usec || + mip6_report_rl.iif != iif || + !ipv6_addr_equal(&mip6_report_rl.src, src) || + !ipv6_addr_equal(&mip6_report_rl.dst, dst)) { + mip6_report_rl.stamp.tv_sec = stamp->tv_sec; + mip6_report_rl.stamp.tv_usec = stamp->tv_usec; + mip6_report_rl.iif = iif; + ipv6_addr_copy(&mip6_report_rl.src, src); + ipv6_addr_copy(&mip6_report_rl.dst, dst); + allow = 1; + } + spin_unlock_bh(&mip6_report_rl.lock); + return allow; +} + +static int mip6_destopt_reject(struct xfrm_state *x, struct sk_buff *skb, struct flowi *fl) +{ + struct inet6_skb_parm *opt = (struct inet6_skb_parm *)skb->cb; + struct ipv6_destopt_hao *hao = NULL; + struct xfrm_selector sel; + int offset; + struct timeval stamp; + int err = 0; + + if (likely(opt->dsthao)) { + offset = ipv6_find_tlv(skb, opt->dsthao, IPV6_TLV_HAO); + if (likely(offset >= 0)) + hao = (struct ipv6_destopt_hao *)(skb->nh.raw + offset); + } + + skb_get_timestamp(skb, &stamp); + + if (!mip6_report_rl_allow(&stamp, &skb->nh.ipv6h->daddr, + hao ? &hao->addr : &skb->nh.ipv6h->saddr, + opt->iif)) + goto out; + + memset(&sel, 0, sizeof(sel)); + memcpy(&sel.daddr, (xfrm_address_t *)&skb->nh.ipv6h->daddr, + sizeof(sel.daddr)); + sel.prefixlen_d = 128; + memcpy(&sel.saddr, (xfrm_address_t *)&skb->nh.ipv6h->saddr, + sizeof(sel.saddr)); + sel.prefixlen_s = 128; + sel.family = AF_INET6; + sel.proto = fl->proto; + sel.dport = xfrm_flowi_dport(fl); + if (sel.dport) + sel.dport_mask = ~((__u16)0); + sel.sport = xfrm_flowi_sport(fl); + if (sel.sport) + sel.sport_mask = ~((__u16)0); + sel.ifindex = fl->oif; + + err = km_report(IPPROTO_DSTOPTS, &sel, + (hao ? (xfrm_address_t *)&hao->addr : NULL)); + + out: + return err; +} + static int mip6_destopt_offset(struct xfrm_state *x, struct sk_buff *skb, u8 **nexthdr) { @@ -273,6 +355,7 @@ static struct xfrm_type mip6_destopt_type = .destructor = mip6_destopt_destroy, .input = mip6_destopt_input, .output = mip6_destopt_output, + .reject = mip6_destopt_reject, .hdr_offset = mip6_destopt_offset, .local_addr = mip6_xfrm_addr, }; -- cgit v0.10.2 From 01be8e5d59d7e6da5c425a31b43709c2a4a69b5d Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 20:47:44 -0700 Subject: [IPV6] MIP6: Ignore to report if mobility headers is rejected. Ignore to report user-space for known mobility headers rejected by destination options header transformation. Mobile IPv6 specification (RFC3775) says that mobility header is used with destination options header carrying home address option only for binding update message. Other type message cannot be used and node must drop it silently (and must not send binding error) if receving such packet. To achieve it, (1) application should use transformation policy and wild-card states to catch binding update message prior other packets (2) kernel doesn't report the reject to user-space not to send binding error message by application. This patch is for (2). Based on MIPL2 kernel patch. This patch was also written by: Ville Nuorvala Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c index 31445d0..7085403 100644 --- a/net/ipv6/mip6.c +++ b/net/ipv6/mip6.c @@ -234,6 +234,9 @@ static int mip6_destopt_reject(struct xfrm_state *x, struct sk_buff *skb, struct struct timeval stamp; int err = 0; + if (unlikely(fl->proto == IPPROTO_MH && fl->fl_mh_type <= IP6_MH_TYPE_MAX)) + goto out; + if (likely(opt->dsthao)) { offset = ipv6_find_tlv(skb, opt->dsthao, IPV6_TLV_HAO); if (likely(offset >= 0)) -- cgit v0.10.2 From c11f1a15c522ddd3bbd2c32b5ce3e0b1831b22f2 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 22:38:14 -0700 Subject: [XFRM] POLICY: Add Kconfig to support sub policy. Add Kconfig to support sub policy. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/xfrm/Kconfig b/net/xfrm/Kconfig index 43228f7..0faab63 100644 --- a/net/xfrm/Kconfig +++ b/net/xfrm/Kconfig @@ -14,6 +14,16 @@ config XFRM_USER If unsure, say Y. +config XFRM_SUB_POLICY + bool "Transformation sub policy support (EXPERIMENTAL)" + depends on XFRM && EXPERIMENTAL + ---help--- + Support sub policy for developers. By using sub policy with main + one, two policies can be applied to the same packet at once. + Policy which lives shorter time in kernel should be a sub. + + If unsure, say N. + config NET_KEY tristate "PF_KEY sockets" select XFRM -- cgit v0.10.2 From 4e81bb8336a0ac50289d4d4c7a55e559b994ee8f Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 22:43:30 -0700 Subject: [XFRM] POLICY: sub policy support. Sub policy is introduced. Main and sub policy are applied the same flow. (Policy that current kernel uses is named as main.) It is required another transformation policy management to keep IPsec and Mobile IPv6 lives separate. Policy which lives shorter time in kernel should be a sub i.e. normally main is for IPsec and sub is for Mobile IPv6. (Such usage as two IPsec policies on different database can be used, too.) Limitation or TODOs: - Sub policy is not supported for per socket one (it is always inserted as main). - Current kernel makes cached outbound with flowi to skip searching database. However this patch makes it disabled only when "two policies are used and the first matched one is bypass case" because neither flowi nor bundle information knows about transformation template size. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h index 4009f44..492fb98 100644 --- a/include/linux/xfrm.h +++ b/include/linux/xfrm.h @@ -104,6 +104,13 @@ struct xfrm_stats { enum { + XFRM_POLICY_TYPE_MAIN = 0, + XFRM_POLICY_TYPE_SUB = 1, + XFRM_POLICY_TYPE_MAX = 2 +}; + +enum +{ XFRM_POLICY_IN = 0, XFRM_POLICY_OUT = 1, XFRM_POLICY_FWD = 2, diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 0b223ee..4655ca2 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -341,6 +341,7 @@ struct xfrm_policy atomic_t refcnt; struct timer_list timer; + u8 type; u32 priority; u32 index; struct xfrm_selector selector; @@ -389,6 +390,19 @@ extern int xfrm_unregister_km(struct xfrm_mgr *km); extern struct xfrm_policy *xfrm_policy_list[XFRM_POLICY_MAX*2]; +#ifdef CONFIG_XFRM_SUB_POLICY +extern struct xfrm_policy *xfrm_policy_list_sub[XFRM_POLICY_MAX*2]; + +static inline int xfrm_policy_lists_empty(int dir) +{ + return (!xfrm_policy_list[dir] && !xfrm_policy_list_sub[dir]); +} +#else +static inline int xfrm_policy_lists_empty(int dir) +{ + return (!xfrm_policy_list[dir]); +} +#endif static inline void xfrm_pol_hold(struct xfrm_policy *policy) { @@ -404,6 +418,20 @@ static inline void xfrm_pol_put(struct xfrm_policy *policy) __xfrm_policy_destroy(policy); } +#ifdef CONFIG_XFRM_SUB_POLICY +static inline void xfrm_pols_put(struct xfrm_policy **pols, int npols) +{ + int i; + for (i = npols - 1; i >= 0; --i) + xfrm_pol_put(pols[i]); +} +#else +static inline void xfrm_pols_put(struct xfrm_policy **pols, int npols) +{ + xfrm_pol_put(pols[0]); +} +#endif + #define XFRM_DST_HSIZE 1024 static __inline__ @@ -737,8 +765,8 @@ static inline int xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *sk { if (sk && sk->sk_policy[XFRM_POLICY_IN]) return __xfrm_policy_check(sk, dir, skb, family); - - return (!xfrm_policy_list[dir] && !skb->sp) || + + return (xfrm_policy_lists_empty(dir) && !skb->sp) || (skb->dst->flags & DST_NOPOLICY) || __xfrm_policy_check(sk, dir, skb, family); } @@ -758,7 +786,7 @@ extern int __xfrm_route_forward(struct sk_buff *skb, unsigned short family); static inline int xfrm_route_forward(struct sk_buff *skb, unsigned short family) { - return !xfrm_policy_list[XFRM_POLICY_OUT] || + return xfrm_policy_lists_empty(XFRM_POLICY_OUT) || (skb->dst->flags & DST_NOXFRM) || __xfrm_route_forward(skb, family); } @@ -1023,18 +1051,19 @@ static inline int xfrm_dst_lookup(struct xfrm_dst **dst, struct flowi *fl, unsig #endif struct xfrm_policy *xfrm_policy_alloc(gfp_t gfp); -extern int xfrm_policy_walk(int (*func)(struct xfrm_policy *, int, int, void*), void *); +extern int xfrm_policy_walk(u8 type, int (*func)(struct xfrm_policy *, int, int, void*), void *); int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl); -struct xfrm_policy *xfrm_policy_bysel_ctx(int dir, struct xfrm_selector *sel, +struct xfrm_policy *xfrm_policy_bysel_ctx(u8 type, int dir, + struct xfrm_selector *sel, struct xfrm_sec_ctx *ctx, int delete); -struct xfrm_policy *xfrm_policy_byid(int dir, u32 id, int delete); -void xfrm_policy_flush(void); +struct xfrm_policy *xfrm_policy_byid(u8, int dir, u32 id, int delete); +void xfrm_policy_flush(u8 type); u32 xfrm_get_acqseq(void); void xfrm_alloc_spi(struct xfrm_state *x, u32 minspi, u32 maxspi); struct xfrm_state * xfrm_find_acq(u8 mode, u32 reqid, u8 proto, xfrm_address_t *daddr, xfrm_address_t *saddr, int create, unsigned short family); -extern void xfrm_policy_flush(void); +extern void xfrm_policy_flush(u8 type); extern int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol); extern int xfrm_flush_bundles(void); extern void xfrm_flush_all_bundles(void); diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index d125a26..96de6c7 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -32,6 +32,24 @@ static DEFINE_RWLOCK(xfrm_policy_lock); struct xfrm_policy *xfrm_policy_list[XFRM_POLICY_MAX*2]; EXPORT_SYMBOL(xfrm_policy_list); +#ifdef CONFIG_XFRM_SUB_POLICY +struct xfrm_policy *xfrm_policy_list_sub[XFRM_POLICY_MAX*2]; +EXPORT_SYMBOL(xfrm_policy_list_sub); + +#define XFRM_POLICY_LISTS(type) \ + ((type == XFRM_POLICY_TYPE_SUB) ? xfrm_policy_list_sub : \ + xfrm_policy_list) +#define XFRM_POLICY_LISTHEAD(type, dir) \ + ((type == XFRM_POLICY_TYPE_SUB) ? xfrm_policy_list_sub[dir] : \ + xfrm_policy_list[dir]) +#define XFRM_POLICY_LISTHEADP(type, dir) \ + ((type == XFRM_POLICY_TYPE_SUB) ? &xfrm_policy_list_sub[dir] : \ + &xfrm_policy_list[dir]) +#else +#define XFRM_POLICY_LISTS(type) xfrm_policy_list +#define XFRM_POLICY_LISTHEAD(type, dif) xfrm_policy_list[dir] +#define XFRM_POLICY_LISTHEADP(type, dif) &xfrm_policy_list[dir] +#endif static DEFINE_RWLOCK(xfrm_policy_afinfo_lock); static struct xfrm_policy_afinfo *xfrm_policy_afinfo[NPROTO]; @@ -397,7 +415,7 @@ static void xfrm_policy_kill(struct xfrm_policy *policy) /* Generate new index... KAME seems to generate them ordered by cost * of an absolute inpredictability of ordering of rules. This will not pass. */ -static u32 xfrm_gen_index(int dir) +static u32 xfrm_gen_index(u8 type, int dir) { u32 idx; struct xfrm_policy *p; @@ -408,7 +426,7 @@ static u32 xfrm_gen_index(int dir) idx_generator += 8; if (idx == 0) idx = 8; - for (p = xfrm_policy_list[dir]; p; p = p->next) { + for (p = XFRM_POLICY_LISTHEAD(type, dir); p; p = p->next) { if (p->index == idx) break; } @@ -425,7 +443,7 @@ int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl) struct dst_entry *gc_list; write_lock_bh(&xfrm_policy_lock); - for (p = &xfrm_policy_list[dir]; (pol=*p)!=NULL;) { + for (p = XFRM_POLICY_LISTHEADP(policy->type, dir); (pol=*p)!=NULL;) { if (!delpol && memcmp(&policy->selector, &pol->selector, sizeof(pol->selector)) == 0 && xfrm_sec_ctx_match(pol->security, policy->security)) { if (excl) { @@ -452,7 +470,7 @@ int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl) policy->next = *p; *p = policy; atomic_inc(&flow_cache_genid); - policy->index = delpol ? delpol->index : xfrm_gen_index(dir); + policy->index = delpol ? delpol->index : xfrm_gen_index(policy->type, dir); policy->curlft.add_time = (unsigned long)xtime.tv_sec; policy->curlft.use_time = 0; if (!mod_timer(&policy->timer, jiffies + HZ)) @@ -493,13 +511,14 @@ int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl) } EXPORT_SYMBOL(xfrm_policy_insert); -struct xfrm_policy *xfrm_policy_bysel_ctx(int dir, struct xfrm_selector *sel, +struct xfrm_policy *xfrm_policy_bysel_ctx(u8 type, int dir, + struct xfrm_selector *sel, struct xfrm_sec_ctx *ctx, int delete) { struct xfrm_policy *pol, **p; write_lock_bh(&xfrm_policy_lock); - for (p = &xfrm_policy_list[dir]; (pol=*p)!=NULL; p = &pol->next) { + for (p = XFRM_POLICY_LISTHEADP(type, dir); (pol=*p)!=NULL; p = &pol->next) { if ((memcmp(sel, &pol->selector, sizeof(*sel)) == 0) && (xfrm_sec_ctx_match(ctx, pol->security))) { xfrm_pol_hold(pol); @@ -518,12 +537,12 @@ struct xfrm_policy *xfrm_policy_bysel_ctx(int dir, struct xfrm_selector *sel, } EXPORT_SYMBOL(xfrm_policy_bysel_ctx); -struct xfrm_policy *xfrm_policy_byid(int dir, u32 id, int delete) +struct xfrm_policy *xfrm_policy_byid(u8 type, int dir, u32 id, int delete) { struct xfrm_policy *pol, **p; write_lock_bh(&xfrm_policy_lock); - for (p = &xfrm_policy_list[dir]; (pol=*p)!=NULL; p = &pol->next) { + for (p = XFRM_POLICY_LISTHEADP(type, dir); (pol=*p)!=NULL; p = &pol->next) { if (pol->index == id) { xfrm_pol_hold(pol); if (delete) @@ -541,15 +560,16 @@ struct xfrm_policy *xfrm_policy_byid(int dir, u32 id, int delete) } EXPORT_SYMBOL(xfrm_policy_byid); -void xfrm_policy_flush(void) +void xfrm_policy_flush(u8 type) { struct xfrm_policy *xp; + struct xfrm_policy **p_list = XFRM_POLICY_LISTS(type); int dir; write_lock_bh(&xfrm_policy_lock); for (dir = 0; dir < XFRM_POLICY_MAX; dir++) { - while ((xp = xfrm_policy_list[dir]) != NULL) { - xfrm_policy_list[dir] = xp->next; + while ((xp = p_list[dir]) != NULL) { + p_list[dir] = xp->next; write_unlock_bh(&xfrm_policy_lock); xfrm_policy_kill(xp); @@ -562,7 +582,7 @@ void xfrm_policy_flush(void) } EXPORT_SYMBOL(xfrm_policy_flush); -int xfrm_policy_walk(int (*func)(struct xfrm_policy *, int, int, void*), +int xfrm_policy_walk(u8 type, int (*func)(struct xfrm_policy *, int, int, void*), void *data) { struct xfrm_policy *xp; @@ -572,7 +592,7 @@ int xfrm_policy_walk(int (*func)(struct xfrm_policy *, int, int, void*), read_lock_bh(&xfrm_policy_lock); for (dir = 0; dir < 2*XFRM_POLICY_MAX; dir++) { - for (xp = xfrm_policy_list[dir]; xp; xp = xp->next) + for (xp = XFRM_POLICY_LISTHEAD(type, dir); xp; xp = xp->next) count++; } @@ -582,7 +602,7 @@ int xfrm_policy_walk(int (*func)(struct xfrm_policy *, int, int, void*), } for (dir = 0; dir < 2*XFRM_POLICY_MAX; dir++) { - for (xp = xfrm_policy_list[dir]; xp; xp = xp->next) { + for (xp = XFRM_POLICY_LISTHEAD(type, dir); xp; xp = xp->next) { error = func(xp, dir%XFRM_POLICY_MAX, --count, data); if (error) goto out; @@ -597,13 +617,13 @@ EXPORT_SYMBOL(xfrm_policy_walk); /* Find policy to apply to this flow. */ -static void xfrm_policy_lookup(struct flowi *fl, u16 family, u8 dir, - void **objp, atomic_t **obj_refp) +static struct xfrm_policy *xfrm_policy_lookup_bytype(u8 type, struct flowi *fl, + u16 family, u8 dir) { struct xfrm_policy *pol; read_lock_bh(&xfrm_policy_lock); - for (pol = xfrm_policy_list[dir]; pol; pol = pol->next) { + for (pol = XFRM_POLICY_LISTHEAD(type, dir); pol; pol = pol->next) { struct xfrm_selector *sel = &pol->selector; int match; @@ -620,6 +640,25 @@ static void xfrm_policy_lookup(struct flowi *fl, u16 family, u8 dir, } } read_unlock_bh(&xfrm_policy_lock); + + return pol; +} + +static void xfrm_policy_lookup(struct flowi *fl, u16 family, u8 dir, + void **objp, atomic_t **obj_refp) +{ + struct xfrm_policy *pol; + +#ifdef CONFIG_XFRM_SUB_POLICY + pol = xfrm_policy_lookup_bytype(XFRM_POLICY_TYPE_SUB, fl, family, dir); + if (pol) + goto end; +#endif + pol = xfrm_policy_lookup_bytype(XFRM_POLICY_TYPE_MAIN, fl, family, dir); + +#ifdef CONFIG_XFRM_SUB_POLICY + end: +#endif if ((*objp = (void *) pol) != NULL) *obj_refp = &pol->refcnt; } @@ -665,8 +704,10 @@ static struct xfrm_policy *xfrm_sk_policy_lookup(struct sock *sk, int dir, struc static void __xfrm_policy_link(struct xfrm_policy *pol, int dir) { - pol->next = xfrm_policy_list[dir]; - xfrm_policy_list[dir] = pol; + struct xfrm_policy **p_list = XFRM_POLICY_LISTS(pol->type); + + pol->next = p_list[dir]; + p_list[dir] = pol; xfrm_pol_hold(pol); } @@ -675,7 +716,7 @@ static struct xfrm_policy *__xfrm_policy_unlink(struct xfrm_policy *pol, { struct xfrm_policy **polp; - for (polp = &xfrm_policy_list[dir]; + for (polp = XFRM_POLICY_LISTHEADP(pol->type, dir); *polp != NULL; polp = &(*polp)->next) { if (*polp == pol) { *polp = pol->next; @@ -704,12 +745,17 @@ int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol) { struct xfrm_policy *old_pol; +#ifdef CONFIG_XFRM_SUB_POLICY + if (pol && pol->type != XFRM_POLICY_TYPE_MAIN) + return -EINVAL; +#endif + write_lock_bh(&xfrm_policy_lock); old_pol = sk->sk_policy[dir]; sk->sk_policy[dir] = pol; if (pol) { pol->curlft.add_time = (unsigned long)xtime.tv_sec; - pol->index = xfrm_gen_index(XFRM_POLICY_MAX+dir); + pol->index = xfrm_gen_index(pol->type, XFRM_POLICY_MAX+dir); __xfrm_policy_link(pol, XFRM_POLICY_MAX+dir); } if (old_pol) @@ -738,6 +784,7 @@ static struct xfrm_policy *clone_policy(struct xfrm_policy *old, int dir) newp->flags = old->flags; newp->xfrm_nr = old->xfrm_nr; newp->index = old->index; + newp->type = old->type; memcpy(newp->xfrm_vec, old->xfrm_vec, newp->xfrm_nr*sizeof(struct xfrm_tmpl)); write_lock_bh(&xfrm_policy_lock); @@ -764,9 +811,9 @@ int __xfrm_sk_clone_policy(struct sock *sk) /* Resolve list of templates for the flow, given policy. */ static int -xfrm_tmpl_resolve(struct xfrm_policy *policy, struct flowi *fl, - struct xfrm_state **xfrm, - unsigned short family) +xfrm_tmpl_resolve_one(struct xfrm_policy *policy, struct flowi *fl, + struct xfrm_state **xfrm, + unsigned short family) { int nx; int i, error; @@ -809,6 +856,38 @@ fail: return error; } +static int +xfrm_tmpl_resolve(struct xfrm_policy **pols, int npols, struct flowi *fl, + struct xfrm_state **xfrm, + unsigned short family) +{ + int cnx = 0; + int error; + int ret; + int i; + + for (i = 0; i < npols; i++) { + if (cnx + pols[i]->xfrm_nr >= XFRM_MAX_DEPTH) { + error = -ENOBUFS; + goto fail; + } + ret = xfrm_tmpl_resolve_one(pols[i], fl, &xfrm[cnx], family); + if (ret < 0) { + error = ret; + goto fail; + } else + cnx += ret; + } + + return cnx; + + fail: + for (cnx--; cnx>=0; cnx--) + xfrm_state_put(xfrm[cnx]); + return error; + +} + /* Check that the bundle accepts the flow and its components are * still valid. */ @@ -855,6 +934,11 @@ int xfrm_lookup(struct dst_entry **dst_p, struct flowi *fl, struct sock *sk, int flags) { struct xfrm_policy *policy; + struct xfrm_policy *pols[XFRM_POLICY_TYPE_MAX]; + int npols; + int pol_dead; + int xfrm_nr; + int pi; struct xfrm_state *xfrm[XFRM_MAX_DEPTH]; struct dst_entry *dst, *dst_orig = *dst_p; int nx = 0; @@ -866,12 +950,18 @@ int xfrm_lookup(struct dst_entry **dst_p, struct flowi *fl, restart: genid = atomic_read(&flow_cache_genid); policy = NULL; + for (pi = 0; pi < ARRAY_SIZE(pols); pi++) + pols[pi] = NULL; + npols = 0; + pol_dead = 0; + xfrm_nr = 0; + if (sk && sk->sk_policy[1]) policy = xfrm_sk_policy_lookup(sk, XFRM_POLICY_OUT, fl); if (!policy) { /* To accelerate a bit... */ - if ((dst_orig->flags & DST_NOXFRM) || !xfrm_policy_list[XFRM_POLICY_OUT]) + if ((dst_orig->flags & DST_NOXFRM) || xfrm_policy_lists_empty(XFRM_POLICY_OUT)) return 0; policy = flow_cache_lookup(fl, dst_orig->ops->family, @@ -883,6 +973,9 @@ restart: family = dst_orig->ops->family; policy->curlft.use_time = (unsigned long)xtime.tv_sec; + pols[0] = policy; + npols ++; + xfrm_nr += pols[0]->xfrm_nr; switch (policy->action) { case XFRM_POLICY_BLOCK: @@ -891,11 +984,13 @@ restart: goto error; case XFRM_POLICY_ALLOW: +#ifndef CONFIG_XFRM_SUB_POLICY if (policy->xfrm_nr == 0) { /* Flow passes not transformed. */ xfrm_pol_put(policy); return 0; } +#endif /* Try to find matching bundle. * @@ -911,7 +1006,36 @@ restart: if (dst) break; - nx = xfrm_tmpl_resolve(policy, fl, xfrm, family); +#ifdef CONFIG_XFRM_SUB_POLICY + if (pols[0]->type != XFRM_POLICY_TYPE_MAIN) { + pols[1] = xfrm_policy_lookup_bytype(XFRM_POLICY_TYPE_MAIN, + fl, family, + XFRM_POLICY_OUT); + if (pols[1]) { + if (pols[1]->action == XFRM_POLICY_BLOCK) { + err = -EPERM; + goto error; + } + npols ++; + xfrm_nr += pols[1]->xfrm_nr; + } + } + + /* + * Because neither flowi nor bundle information knows about + * transformation template size. On more than one policy usage + * we can realize whether all of them is bypass or not after + * they are searched. See above not-transformed bypass + * is surrounded by non-sub policy configuration, too. + */ + if (xfrm_nr == 0) { + /* Flow passes not transformed. */ + xfrm_pols_put(pols, npols); + return 0; + } + +#endif + nx = xfrm_tmpl_resolve(pols, npols, fl, xfrm, family); if (unlikely(nx<0)) { err = nx; @@ -924,7 +1048,7 @@ restart: set_current_state(TASK_RUNNING); remove_wait_queue(&km_waitq, &wait); - nx = xfrm_tmpl_resolve(policy, fl, xfrm, family); + nx = xfrm_tmpl_resolve(pols, npols, fl, xfrm, family); if (nx == -EAGAIN && signal_pending(current)) { err = -ERESTART; @@ -932,7 +1056,7 @@ restart: } if (nx == -EAGAIN || genid != atomic_read(&flow_cache_genid)) { - xfrm_pol_put(policy); + xfrm_pols_put(pols, npols); goto restart; } err = nx; @@ -942,7 +1066,7 @@ restart: } if (nx == 0) { /* Flow passes not transformed. */ - xfrm_pol_put(policy); + xfrm_pols_put(pols, npols); return 0; } @@ -956,8 +1080,14 @@ restart: goto error; } + for (pi = 0; pi < npols; pi++) { + read_lock_bh(&pols[pi]->lock); + pol_dead |= pols[pi]->dead; + read_unlock_bh(&pols[pi]->lock); + } + write_lock_bh(&policy->lock); - if (unlikely(policy->dead || stale_bundle(dst))) { + if (unlikely(pol_dead || stale_bundle(dst))) { /* Wow! While we worked on resolving, this * policy has gone. Retry. It is not paranoia, * we just cannot enlist new bundle to dead object. @@ -977,12 +1107,12 @@ restart: } *dst_p = dst; dst_release(dst_orig); - xfrm_pol_put(policy); + xfrm_pols_put(pols, npols); return 0; error: dst_release(dst_orig); - xfrm_pol_put(policy); + xfrm_pols_put(pols, npols); *dst_p = NULL; return err; } @@ -1090,6 +1220,10 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, unsigned short family) { struct xfrm_policy *pol; + struct xfrm_policy *pols[XFRM_POLICY_TYPE_MAX]; + int npols = 0; + int xfrm_nr; + int pi; struct flowi fl; u8 fl_dir = policy_to_flow_dir(dir); int xerr_idx = -1; @@ -1128,22 +1262,50 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, pol->curlft.use_time = (unsigned long)xtime.tv_sec; + pols[0] = pol; + npols ++; +#ifdef CONFIG_XFRM_SUB_POLICY + if (pols[0]->type != XFRM_POLICY_TYPE_MAIN) { + pols[1] = xfrm_policy_lookup_bytype(XFRM_POLICY_TYPE_MAIN, + &fl, family, + XFRM_POLICY_IN); + if (pols[1]) { + pols[1]->curlft.use_time = (unsigned long)xtime.tv_sec; + npols ++; + } + } +#endif + if (pol->action == XFRM_POLICY_ALLOW) { struct sec_path *sp; static struct sec_path dummy; + struct xfrm_tmpl *tp[XFRM_MAX_DEPTH]; + struct xfrm_tmpl **tpp = tp; + int ti = 0; int i, k; if ((sp = skb->sp) == NULL) sp = &dummy; + for (pi = 0; pi < npols; pi++) { + if (pols[pi] != pol && + pols[pi]->action != XFRM_POLICY_ALLOW) + goto reject; + if (ti + pols[pi]->xfrm_nr >= XFRM_MAX_DEPTH) + goto reject_error; + for (i = 0; i < pols[pi]->xfrm_nr; i++) + tpp[ti++] = &pols[pi]->xfrm_vec[i]; + } + xfrm_nr = ti; + /* For each tunnel xfrm, find the first matching tmpl. * For each tmpl before that, find corresponding xfrm. * Order is _important_. Later we will implement * some barriers, but at the moment barriers * are implied between each two transformations. */ - for (i = pol->xfrm_nr-1, k = 0; i >= 0; i--) { - k = xfrm_policy_ok(pol->xfrm_vec+i, sp, k, family); + for (i = xfrm_nr-1, k = 0; i >= 0; i--) { + k = xfrm_policy_ok(tpp[i], sp, k, family); if (k < 0) { if (k < -1 && xerr_idxp) *xerr_idxp = -(2+k); @@ -1154,13 +1316,14 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, if (secpath_has_nontransport(sp, k, xerr_idxp)) goto reject; - xfrm_pol_put(pol); + xfrm_pols_put(pols, npols); return 1; } reject: xfrm_secpath_reject(xerr_idx, skb, &fl); - xfrm_pol_put(pol); +reject_error: + xfrm_pols_put(pols, npols); return 0; } EXPORT_SYMBOL(__xfrm_policy_check); @@ -1246,6 +1409,23 @@ static void xfrm_prune_bundles(int (*func)(struct dst_entry *)) read_lock_bh(&xfrm_policy_lock); for (i=0; i<2*XFRM_POLICY_MAX; i++) { +#ifdef CONFIG_XFRM_SUB_POLICY + for (pol = xfrm_policy_list_sub[i]; pol; pol = pol->next) { + write_lock(&pol->lock); + dstp = &pol->bundles; + while ((dst=*dstp) != NULL) { + if (func(dst)) { + *dstp = dst->next; + dst->next = gc_list; + gc_list = dst; + } else { + dstp = &dst->next; + } + } + write_unlock(&pol->lock); + } + +#endif for (pol = xfrm_policy_list[i]; pol; pol = pol->next) { write_lock(&pol->lock); dstp = &pol->bundles; -- cgit v0.10.2 From 41a49cc3c02ace59d4dddae91ea211c330970ee3 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 22:48:31 -0700 Subject: [XFRM]: Add sorting interface for state and template. Under two transformation policies it is required to merge them. This is a platform to sort state for outbound and templates for inbound respectively. It will be used when Mobile IPv6 and IPsec are used at the same time. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 4655ca2..d341603 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -254,6 +254,8 @@ struct xfrm_state_afinfo { struct xfrm_state *(*find_acq)(u8 mode, u32 reqid, u8 proto, xfrm_address_t *daddr, xfrm_address_t *saddr, int create); + int (*tmpl_sort)(struct xfrm_tmpl **dst, struct xfrm_tmpl **src, int n); + int (*state_sort)(struct xfrm_state **dst, struct xfrm_state **src, int n); }; extern int xfrm_state_register_afinfo(struct xfrm_state_afinfo *afinfo); @@ -1002,6 +1004,24 @@ extern int xfrm_state_add(struct xfrm_state *x); extern int xfrm_state_update(struct xfrm_state *x); extern struct xfrm_state *xfrm_state_lookup(xfrm_address_t *daddr, u32 spi, u8 proto, unsigned short family); extern struct xfrm_state *xfrm_state_lookup_byaddr(xfrm_address_t *daddr, xfrm_address_t *saddr, u8 proto, unsigned short family); +#ifdef CONFIG_XFRM_SUB_POLICY +extern int xfrm_tmpl_sort(struct xfrm_tmpl **dst, struct xfrm_tmpl **src, + int n, unsigned short family); +extern int xfrm_state_sort(struct xfrm_state **dst, struct xfrm_state **src, + int n, unsigned short family); +#else +static inline int xfrm_tmpl_sort(struct xfrm_tmpl **dst, struct xfrm_tmpl **src, + int n, unsigned short family) +{ + return -ENOSYS; +} + +static inline int xfrm_state_sort(struct xfrm_state **dst, struct xfrm_state **src, + int n, unsigned short family) +{ + return -ENOSYS; +} +#endif extern struct xfrm_state *xfrm_find_acq_byseq(u32 seq); extern int xfrm_state_delete(struct xfrm_state *x); extern void xfrm_state_flush(u8 proto); diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 96de6c7..1732159 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -861,6 +861,8 @@ xfrm_tmpl_resolve(struct xfrm_policy **pols, int npols, struct flowi *fl, struct xfrm_state **xfrm, unsigned short family) { + struct xfrm_state *tp[XFRM_MAX_DEPTH]; + struct xfrm_state **tpp = (npols > 1) ? tp : xfrm; int cnx = 0; int error; int ret; @@ -871,7 +873,8 @@ xfrm_tmpl_resolve(struct xfrm_policy **pols, int npols, struct flowi *fl, error = -ENOBUFS; goto fail; } - ret = xfrm_tmpl_resolve_one(pols[i], fl, &xfrm[cnx], family); + + ret = xfrm_tmpl_resolve_one(pols[i], fl, &tpp[cnx], family); if (ret < 0) { error = ret; goto fail; @@ -879,11 +882,15 @@ xfrm_tmpl_resolve(struct xfrm_policy **pols, int npols, struct flowi *fl, cnx += ret; } + /* found states are sorted for outbound processing */ + if (npols > 1) + xfrm_state_sort(xfrm, tpp, cnx, family); + return cnx; fail: for (cnx--; cnx>=0; cnx--) - xfrm_state_put(xfrm[cnx]); + xfrm_state_put(tpp[cnx]); return error; } @@ -1280,6 +1287,7 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, struct sec_path *sp; static struct sec_path dummy; struct xfrm_tmpl *tp[XFRM_MAX_DEPTH]; + struct xfrm_tmpl *stp[XFRM_MAX_DEPTH]; struct xfrm_tmpl **tpp = tp; int ti = 0; int i, k; @@ -1297,6 +1305,10 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, tpp[ti++] = &pols[pi]->xfrm_vec[i]; } xfrm_nr = ti; + if (npols > 1) { + xfrm_tmpl_sort(stp, tpp, xfrm_nr, family); + tpp = stp; + } /* For each tunnel xfrm, find the first matching tmpl. * For each tmpl before that, find corresponding xfrm. diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index a26ef69..622e92a 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -728,6 +728,44 @@ xfrm_find_acq(u8 mode, u32 reqid, u8 proto, } EXPORT_SYMBOL(xfrm_find_acq); +#ifdef CONFIG_XFRM_SUB_POLICY +int +xfrm_tmpl_sort(struct xfrm_tmpl **dst, struct xfrm_tmpl **src, int n, + unsigned short family) +{ + int err = 0; + struct xfrm_state_afinfo *afinfo = xfrm_state_get_afinfo(family); + if (!afinfo) + return -EAFNOSUPPORT; + + spin_lock_bh(&xfrm_state_lock); + if (afinfo->tmpl_sort) + err = afinfo->tmpl_sort(dst, src, n); + spin_unlock_bh(&xfrm_state_lock); + xfrm_state_put_afinfo(afinfo); + return err; +} +EXPORT_SYMBOL(xfrm_tmpl_sort); + +int +xfrm_state_sort(struct xfrm_state **dst, struct xfrm_state **src, int n, + unsigned short family) +{ + int err = 0; + struct xfrm_state_afinfo *afinfo = xfrm_state_get_afinfo(family); + if (!afinfo) + return -EAFNOSUPPORT; + + spin_lock_bh(&xfrm_state_lock); + if (afinfo->state_sort) + err = afinfo->state_sort(dst, src, n); + spin_unlock_bh(&xfrm_state_lock); + xfrm_state_put_afinfo(afinfo); + return err; +} +EXPORT_SYMBOL(xfrm_state_sort); +#endif + /* Silly enough, but I'm lazy to build resolution list */ static struct xfrm_state *__xfrm_find_acq_byseq(u32 seq) -- cgit v0.10.2 From f7b6983f0feeefcd2a594138adcffe640593d8de Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 22:49:28 -0700 Subject: [XFRM] POLICY: Support netlink socket interface for sub policy. Sub policy can be used through netlink socket. PF_KEY uses main only and it is TODO to support sub. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/linux/xfrm.h b/include/linux/xfrm.h index 492fb98..14ecd19 100644 --- a/include/linux/xfrm.h +++ b/include/linux/xfrm.h @@ -230,6 +230,12 @@ enum xfrm_ae_ftype_t { #define XFRM_AE_MAX (__XFRM_AE_MAX - 1) }; +struct xfrm_userpolicy_type { + __u8 type; + __u16 reserved1; + __u8 reserved2; +}; + /* Netlink message attributes. */ enum xfrm_attr_type_t { XFRMA_UNSPEC, @@ -248,6 +254,7 @@ enum xfrm_attr_type_t { XFRMA_SRCADDR, /* xfrm_address_t */ XFRMA_COADDR, /* xfrm_address_t */ XFRMA_LASTUSED, + XFRMA_POLICY_TYPE, /* struct xfrm_userpolicy_type */ __XFRMA_MAX #define XFRMA_MAX (__XFRMA_MAX - 1) diff --git a/include/net/xfrm.h b/include/net/xfrm.h index d341603..c75b328 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -203,6 +203,7 @@ struct km_event u32 proto; u32 byid; u32 aevent; + u32 type; } data; u32 seq; diff --git a/net/key/af_key.c b/net/key/af_key.c index 19e047b..83b443d 100644 --- a/net/key/af_key.c +++ b/net/key/af_key.c @@ -1731,7 +1731,8 @@ static u32 gen_reqid(void) ++reqid; if (reqid == 0) reqid = IPSEC_MANUAL_REQID_MAX+1; - if (xfrm_policy_walk(check_reqid, (void*)&reqid) != -EEXIST) + if (xfrm_policy_walk(XFRM_POLICY_TYPE_MAIN, check_reqid, + (void*)&reqid) != -EEXIST) return reqid; } while (reqid != start); return 0; @@ -2268,7 +2269,8 @@ static int pfkey_spddelete(struct sock *sk, struct sk_buff *skb, struct sadb_msg return err; } - xp = xfrm_policy_bysel_ctx(pol->sadb_x_policy_dir-1, &sel, tmp.security, 1); + xp = xfrm_policy_bysel_ctx(XFRM_POLICY_TYPE_MAIN, pol->sadb_x_policy_dir-1, + &sel, tmp.security, 1); security_xfrm_policy_free(&tmp); if (xp == NULL) return -ENOENT; @@ -2330,7 +2332,7 @@ static int pfkey_spdget(struct sock *sk, struct sk_buff *skb, struct sadb_msg *h if (dir >= XFRM_POLICY_MAX) return -EINVAL; - xp = xfrm_policy_byid(dir, pol->sadb_x_policy_id, + xp = xfrm_policy_byid(XFRM_POLICY_TYPE_MAIN, dir, pol->sadb_x_policy_id, hdr->sadb_msg_type == SADB_X_SPDDELETE2); if (xp == NULL) return -ENOENT; @@ -2378,7 +2380,7 @@ static int pfkey_spddump(struct sock *sk, struct sk_buff *skb, struct sadb_msg * { struct pfkey_dump_data data = { .skb = skb, .hdr = hdr, .sk = sk }; - return xfrm_policy_walk(dump_sp, &data); + return xfrm_policy_walk(XFRM_POLICY_TYPE_MAIN, dump_sp, &data); } static int key_notify_policy_flush(struct km_event *c) @@ -2405,7 +2407,8 @@ static int pfkey_spdflush(struct sock *sk, struct sk_buff *skb, struct sadb_msg { struct km_event c; - xfrm_policy_flush(); + xfrm_policy_flush(XFRM_POLICY_TYPE_MAIN); + c.data.type = XFRM_POLICY_TYPE_MAIN; c.event = XFRM_MSG_FLUSHPOLICY; c.pid = hdr->sadb_msg_pid; c.seq = hdr->sadb_msg_seq; @@ -2667,6 +2670,9 @@ static int pfkey_send_notify(struct xfrm_state *x, struct km_event *c) static int pfkey_send_policy_notify(struct xfrm_policy *xp, int dir, struct km_event *c) { + if (xp && xp->type != XFRM_POLICY_TYPE_MAIN) + return 0; + switch (c->event) { case XFRM_MSG_POLEXPIRE: return key_notify_policy_expire(xp, c); @@ -2675,6 +2681,8 @@ static int pfkey_send_policy_notify(struct xfrm_policy *xp, int dir, struct km_e case XFRM_MSG_UPDPOLICY: return key_notify_policy(xp, dir, c); case XFRM_MSG_FLUSHPOLICY: + if (c->data.type != XFRM_POLICY_TYPE_MAIN) + break; return key_notify_policy_flush(c); default: printk("pfkey: Unknown policy event %d\n", c->event); diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index 7303b82..c59a78d 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -786,6 +786,22 @@ static int verify_policy_dir(__u8 dir) return 0; } +static int verify_policy_type(__u8 type) +{ + switch (type) { + case XFRM_POLICY_TYPE_MAIN: +#ifdef CONFIG_XFRM_SUB_POLICY + case XFRM_POLICY_TYPE_SUB: +#endif + break; + + default: + return -EINVAL; + }; + + return 0; +} + static int verify_newpolicy_info(struct xfrm_userpolicy_info *p) { switch (p->share) { @@ -879,6 +895,29 @@ static int copy_from_user_tmpl(struct xfrm_policy *pol, struct rtattr **xfrma) return 0; } +static int copy_from_user_policy_type(u8 *tp, struct rtattr **xfrma) +{ + struct rtattr *rt = xfrma[XFRMA_POLICY_TYPE-1]; + struct xfrm_userpolicy_type *upt; + __u8 type = XFRM_POLICY_TYPE_MAIN; + int err; + + if (rt) { + if (rt->rta_len < sizeof(*upt)) + return -EINVAL; + + upt = RTA_DATA(rt); + type = upt->type; + } + + err = verify_policy_type(type); + if (err) + return err; + + *tp = type; + return 0; +} + static void copy_from_user_policy(struct xfrm_policy *xp, struct xfrm_userpolicy_info *p) { xp->priority = p->priority; @@ -917,16 +956,20 @@ static struct xfrm_policy *xfrm_policy_construct(struct xfrm_userpolicy_info *p, copy_from_user_policy(xp, p); + err = copy_from_user_policy_type(&xp->type, xfrma); + if (err) + goto error; + if (!(err = copy_from_user_tmpl(xp, xfrma))) err = copy_from_user_sec_ctx(xp, xfrma); - - if (err) { - *errp = err; - kfree(xp); - xp = NULL; - } + if (err) + goto error; return xp; + error: + *errp = err; + kfree(xp); + return NULL; } static int xfrm_add_policy(struct sk_buff *skb, struct nlmsghdr *nlh, void **xfrma) @@ -1037,6 +1080,29 @@ static inline int copy_to_user_sec_ctx(struct xfrm_policy *xp, struct sk_buff *s return 0; } +#ifdef CONFIG_XFRM_SUB_POLICY +static int copy_to_user_policy_type(struct xfrm_policy *xp, struct sk_buff *skb) +{ + struct xfrm_userpolicy_type upt; + + memset(&upt, 0, sizeof(upt)); + upt.type = xp->type; + + RTA_PUT(skb, XFRMA_POLICY_TYPE, sizeof(upt), &upt); + + return 0; + +rtattr_failure: + return -1; +} + +#else +static inline int copy_to_user_policy_type(struct xfrm_policy *xp, struct sk_buff *skb) +{ + return 0; +} +#endif + static int dump_one_policy(struct xfrm_policy *xp, int dir, int count, void *ptr) { struct xfrm_dump_info *sp = ptr; @@ -1060,6 +1126,8 @@ static int dump_one_policy(struct xfrm_policy *xp, int dir, int count, void *ptr goto nlmsg_failure; if (copy_to_user_sec_ctx(xp, skb)) goto nlmsg_failure; + if (copy_to_user_policy_type(xp, skb) < 0) + goto nlmsg_failure; nlh->nlmsg_len = skb->tail - b; out: @@ -1081,7 +1149,10 @@ static int xfrm_dump_policy(struct sk_buff *skb, struct netlink_callback *cb) info.nlmsg_flags = NLM_F_MULTI; info.this_idx = 0; info.start_idx = cb->args[0]; - (void) xfrm_policy_walk(dump_one_policy, &info); + (void) xfrm_policy_walk(XFRM_POLICY_TYPE_MAIN, dump_one_policy, &info); +#ifdef CONFIG_XFRM_SUB_POLICY + (void) xfrm_policy_walk(XFRM_POLICY_TYPE_SUB, dump_one_policy, &info); +#endif cb->args[0] = info.this_idx; return skb->len; @@ -1117,6 +1188,7 @@ static int xfrm_get_policy(struct sk_buff *skb, struct nlmsghdr *nlh, void **xfr { struct xfrm_policy *xp; struct xfrm_userpolicy_id *p; + __u8 type = XFRM_POLICY_TYPE_MAIN; int err; struct km_event c; int delete; @@ -1124,12 +1196,16 @@ static int xfrm_get_policy(struct sk_buff *skb, struct nlmsghdr *nlh, void **xfr p = NLMSG_DATA(nlh); delete = nlh->nlmsg_type == XFRM_MSG_DELPOLICY; + err = copy_from_user_policy_type(&type, (struct rtattr **)xfrma); + if (err) + return err; + err = verify_policy_dir(p->dir); if (err) return err; if (p->index) - xp = xfrm_policy_byid(p->dir, p->index, delete); + xp = xfrm_policy_byid(type, p->dir, p->index, delete); else { struct rtattr **rtattrs = (struct rtattr **)xfrma; struct rtattr *rt = rtattrs[XFRMA_SEC_CTX-1]; @@ -1146,7 +1222,7 @@ static int xfrm_get_policy(struct sk_buff *skb, struct nlmsghdr *nlh, void **xfr if ((err = security_xfrm_policy_alloc(&tmp, uctx))) return err; } - xp = xfrm_policy_bysel_ctx(p->dir, &p->sel, tmp.security, delete); + xp = xfrm_policy_bysel_ctx(type, p->dir, &p->sel, tmp.security, delete); security_xfrm_policy_free(&tmp); } if (xp == NULL) @@ -1329,9 +1405,16 @@ out: static int xfrm_flush_policy(struct sk_buff *skb, struct nlmsghdr *nlh, void **xfrma) { -struct km_event c; + struct km_event c; + __u8 type = XFRM_POLICY_TYPE_MAIN; + int err; + + err = copy_from_user_policy_type(&type, (struct rtattr **)xfrma); + if (err) + return err; - xfrm_policy_flush(); + xfrm_policy_flush(type); + c.data.type = type; c.event = nlh->nlmsg_type; c.seq = nlh->nlmsg_seq; c.pid = nlh->nlmsg_pid; @@ -1344,10 +1427,15 @@ static int xfrm_add_pol_expire(struct sk_buff *skb, struct nlmsghdr *nlh, void * struct xfrm_policy *xp; struct xfrm_user_polexpire *up = NLMSG_DATA(nlh); struct xfrm_userpolicy_info *p = &up->pol; + __u8 type = XFRM_POLICY_TYPE_MAIN; int err = -ENOENT; + err = copy_from_user_policy_type(&type, (struct rtattr **)xfrma); + if (err) + return err; + if (p->index) - xp = xfrm_policy_byid(p->dir, p->index, 0); + xp = xfrm_policy_byid(type, p->dir, p->index, 0); else { struct rtattr **rtattrs = (struct rtattr **)xfrma; struct rtattr *rt = rtattrs[XFRMA_SEC_CTX-1]; @@ -1364,7 +1452,7 @@ static int xfrm_add_pol_expire(struct sk_buff *skb, struct nlmsghdr *nlh, void * if ((err = security_xfrm_policy_alloc(&tmp, uctx))) return err; } - xp = xfrm_policy_bysel_ctx(p->dir, &p->sel, tmp.security, 0); + xp = xfrm_policy_bysel_ctx(type, p->dir, &p->sel, tmp.security, 0); security_xfrm_policy_free(&tmp); } @@ -1818,6 +1906,8 @@ static int build_acquire(struct sk_buff *skb, struct xfrm_state *x, goto nlmsg_failure; if (copy_to_user_state_sec_ctx(x, skb)) goto nlmsg_failure; + if (copy_to_user_policy_type(xp, skb) < 0) + goto nlmsg_failure; nlh->nlmsg_len = skb->tail - b; return skb->len; @@ -1898,6 +1988,7 @@ static struct xfrm_policy *xfrm_compile_policy(struct sock *sk, int opt, } copy_from_user_policy(xp, p); + xp->type = XFRM_POLICY_TYPE_MAIN; copy_templates(xp, ut, nr); if (!xp->security) { @@ -1931,6 +2022,8 @@ static int build_polexpire(struct sk_buff *skb, struct xfrm_policy *xp, goto nlmsg_failure; if (copy_to_user_sec_ctx(xp, skb)) goto nlmsg_failure; + if (copy_to_user_policy_type(xp, skb) < 0) + goto nlmsg_failure; upe->hard = !!hard; nlh->nlmsg_len = skb->tail - b; @@ -2002,6 +2095,8 @@ static int xfrm_notify_policy(struct xfrm_policy *xp, int dir, struct km_event * copy_to_user_policy(xp, p, dir); if (copy_to_user_tmpl(xp, skb) < 0) goto nlmsg_failure; + if (copy_to_user_policy_type(xp, skb) < 0) + goto nlmsg_failure; nlh->nlmsg_len = skb->tail - b; @@ -2019,6 +2114,9 @@ static int xfrm_notify_policy_flush(struct km_event *c) struct nlmsghdr *nlh; struct sk_buff *skb; unsigned char *b; +#ifdef CONFIG_XFRM_SUB_POLICY + struct xfrm_userpolicy_type upt; +#endif int len = NLMSG_LENGTH(0); skb = alloc_skb(len, GFP_ATOMIC); @@ -2028,6 +2126,13 @@ static int xfrm_notify_policy_flush(struct km_event *c) nlh = NLMSG_PUT(skb, c->pid, c->seq, XFRM_MSG_FLUSHPOLICY, 0); + nlh->nlmsg_flags = 0; + +#ifdef CONFIG_XFRM_SUB_POLICY + memset(&upt, 0, sizeof(upt)); + upt.type = c->data.type; + RTA_PUT(skb, XFRMA_POLICY_TYPE, sizeof(upt), &upt); +#endif nlh->nlmsg_len = skb->tail - b; @@ -2035,6 +2140,9 @@ static int xfrm_notify_policy_flush(struct km_event *c) return netlink_broadcast(xfrm_nl, skb, 0, XFRMNLGRP_POLICY, GFP_ATOMIC); nlmsg_failure: +#ifdef CONFIG_XFRM_SUB_POLICY +rtattr_failure: +#endif kfree_skb(skb); return -1; } -- cgit v0.10.2 From 58c949d1b9551f3e4ba9dde4aeda341ecf5e42b5 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 22:51:02 -0700 Subject: [XFRM] IPV6: Add sort functions to combine templates/states for IPsec. Add sort functions to combine templates/states for IPsec. Think of outbound transformation order we should be careful with transport AH which must be the last of all transport ones. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c index 9c95b9d..e0b8f3c 100644 --- a/net/ipv6/xfrm6_state.c +++ b/net/ipv6/xfrm6_state.c @@ -156,12 +156,109 @@ __xfrm6_find_acq(u8 mode, u32 reqid, u8 proto, return x0; } +static int +__xfrm6_state_sort(struct xfrm_state **dst, struct xfrm_state **src, int n) +{ + int i; + int j = 0; + + /* Rule 1: select IPsec transport except AH */ + for (i = 0; i < n; i++) { + if (src[i]->props.mode == XFRM_MODE_TRANSPORT && + src[i]->id.proto != IPPROTO_AH) { + dst[j++] = src[i]; + src[i] = NULL; + } + } + if (j == n) + goto end; + + /* XXX: Rule 2: select MIPv6 RO or inbound trigger */ + + /* Rule 3: select IPsec transport AH */ + for (i = 0; i < n; i++) { + if (src[i] && + src[i]->props.mode == XFRM_MODE_TRANSPORT && + src[i]->id.proto == IPPROTO_AH) { + dst[j++] = src[i]; + src[i] = NULL; + } + } + if (j == n) + goto end; + + /* Rule 4: select IPsec tunnel */ + for (i = 0; i < n; i++) { + if (src[i] && + src[i]->props.mode == XFRM_MODE_TUNNEL) { + dst[j++] = src[i]; + src[i] = NULL; + } + } + if (likely(j == n)) + goto end; + + /* Final rule */ + for (i = 0; i < n; i++) { + if (src[i]) { + dst[j++] = src[i]; + src[i] = NULL; + } + } + + end: + return 0; +} + +static int +__xfrm6_tmpl_sort(struct xfrm_tmpl **dst, struct xfrm_tmpl **src, int n) +{ + int i; + int j = 0; + + /* Rule 1: select IPsec transport */ + for (i = 0; i < n; i++) { + if (src[i]->mode == XFRM_MODE_TRANSPORT) { + dst[j++] = src[i]; + src[i] = NULL; + } + } + if (j == n) + goto end; + + /* XXX: Rule 2: select MIPv6 RO or inbound trigger */ + + /* Rule 3: select IPsec tunnel */ + for (i = 0; i < n; i++) { + if (src[i] && + src[i]->mode == XFRM_MODE_TUNNEL) { + dst[j++] = src[i]; + src[i] = NULL; + } + } + if (likely(j == n)) + goto end; + + /* Final rule */ + for (i = 0; i < n; i++) { + if (src[i]) { + dst[j++] = src[i]; + src[i] = NULL; + } + } + + end: + return 0; +} + static struct xfrm_state_afinfo xfrm6_state_afinfo = { .family = AF_INET6, .init_tempsel = __xfrm6_init_tempsel, .state_lookup = __xfrm6_state_lookup, .state_lookup_byaddr = __xfrm6_state_lookup_byaddr, .find_acq = __xfrm6_find_acq, + .tmpl_sort = __xfrm6_tmpl_sort, + .state_sort = __xfrm6_state_sort, }; void __init xfrm6_state_init(void) -- cgit v0.10.2 From 64d9fdda8e1bdf416b2d9203c3ad9c249ea301be Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Wed, 23 Aug 2006 22:54:07 -0700 Subject: [XFRM] IPV6: Support Mobile IPv6 extension headers sorting. Support Mobile IPv6 extension headers sorting for two transformation policies. Mobile IPv6 extension headers should be placed after IPsec transport mode, but before transport AH when outbound. Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c index e0b8f3c..6269584 100644 --- a/net/ipv6/xfrm6_state.c +++ b/net/ipv6/xfrm6_state.c @@ -173,7 +173,19 @@ __xfrm6_state_sort(struct xfrm_state **dst, struct xfrm_state **src, int n) if (j == n) goto end; - /* XXX: Rule 2: select MIPv6 RO or inbound trigger */ + /* Rule 2: select MIPv6 RO or inbound trigger */ +#ifdef CONFIG_IPV6_MIP6 + for (i = 0; i < n; i++) { + if (src[i] && + (src[i]->props.mode == XFRM_MODE_ROUTEOPTIMIZATION || + src[i]->props.mode == XFRM_MODE_IN_TRIGGER)) { + dst[j++] = src[i]; + src[i] = NULL; + } + } + if (j == n) + goto end; +#endif /* Rule 3: select IPsec transport AH */ for (i = 0; i < n; i++) { @@ -226,7 +238,19 @@ __xfrm6_tmpl_sort(struct xfrm_tmpl **dst, struct xfrm_tmpl **src, int n) if (j == n) goto end; - /* XXX: Rule 2: select MIPv6 RO or inbound trigger */ + /* Rule 2: select MIPv6 RO or inbound trigger */ +#ifdef CONFIG_IPV6_MIP6 + for (i = 0; i < n; i++) { + if (src[i] && + (src[i]->mode == XFRM_MODE_ROUTEOPTIMIZATION || + src[i]->mode == XFRM_MODE_IN_TRIGGER)) { + dst[j++] = src[i]; + src[i] = NULL; + } + } + if (j == n) + goto end; +#endif /* Rule 3: select IPsec tunnel */ for (i = 0; i < n; i++) { -- cgit v0.10.2 From 2770834c9f44afd1bfa13914c7285470775af657 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Thu, 24 Aug 2006 00:13:10 -0700 Subject: [XFRM]: Pull xfrm_state_bydst hash table knowledge out of afinfo. Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index c75b328..cc83443 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -243,7 +243,6 @@ extern int __xfrm_state_delete(struct xfrm_state *x); struct xfrm_state_afinfo { unsigned short family; - struct list_head *state_bydst; struct list_head *state_bysrc; struct list_head *state_byspi; int (*init_flags)(struct xfrm_state *x); @@ -252,9 +251,6 @@ struct xfrm_state_afinfo { xfrm_address_t *daddr, xfrm_address_t *saddr); struct xfrm_state *(*state_lookup)(xfrm_address_t *daddr, u32 spi, u8 proto); struct xfrm_state *(*state_lookup_byaddr)(xfrm_address_t *daddr, xfrm_address_t *saddr, u8 proto); - struct xfrm_state *(*find_acq)(u8 mode, u32 reqid, u8 proto, - xfrm_address_t *daddr, xfrm_address_t *saddr, - int create); int (*tmpl_sort)(struct xfrm_tmpl **dst, struct xfrm_tmpl **src, int n); int (*state_sort)(struct xfrm_state **dst, struct xfrm_state **src, int n); }; @@ -456,18 +452,6 @@ unsigned __xfrm6_dst_hash(xfrm_address_t *addr) } static __inline__ -unsigned xfrm_dst_hash(xfrm_address_t *addr, unsigned short family) -{ - switch (family) { - case AF_INET: - return __xfrm4_dst_hash(addr); - case AF_INET6: - return __xfrm6_dst_hash(addr); - } - return 0; -} - -static __inline__ unsigned __xfrm4_src_hash(xfrm_address_t *addr) { return __xfrm4_dst_hash(addr); diff --git a/net/ipv4/xfrm4_state.c b/net/ipv4/xfrm4_state.c index 616be13..9dc1afc 100644 --- a/net/ipv4/xfrm4_state.c +++ b/net/ipv4/xfrm4_state.c @@ -88,65 +88,12 @@ __xfrm4_state_lookup_byaddr(xfrm_address_t *daddr, xfrm_address_t *saddr, return NULL; } -static struct xfrm_state * -__xfrm4_find_acq(u8 mode, u32 reqid, u8 proto, - xfrm_address_t *daddr, xfrm_address_t *saddr, - int create) -{ - struct xfrm_state *x, *x0; - unsigned h = __xfrm4_dst_hash(daddr); - - x0 = NULL; - - list_for_each_entry(x, xfrm4_state_afinfo.state_bydst+h, bydst) { - if (x->props.family == AF_INET && - daddr->a4 == x->id.daddr.a4 && - mode == x->props.mode && - proto == x->id.proto && - saddr->a4 == x->props.saddr.a4 && - reqid == x->props.reqid && - x->km.state == XFRM_STATE_ACQ && - !x->id.spi) { - x0 = x; - break; - } - } - if (!x0 && create && (x0 = xfrm_state_alloc()) != NULL) { - x0->sel.daddr.a4 = daddr->a4; - x0->sel.saddr.a4 = saddr->a4; - x0->sel.prefixlen_d = 32; - x0->sel.prefixlen_s = 32; - x0->props.saddr.a4 = saddr->a4; - x0->km.state = XFRM_STATE_ACQ; - x0->id.daddr.a4 = daddr->a4; - x0->id.proto = proto; - x0->props.family = AF_INET; - x0->props.mode = mode; - x0->props.reqid = reqid; - x0->props.family = AF_INET; - x0->lft.hard_add_expires_seconds = XFRM_ACQ_EXPIRES; - xfrm_state_hold(x0); - x0->timer.expires = jiffies + XFRM_ACQ_EXPIRES*HZ; - add_timer(&x0->timer); - xfrm_state_hold(x0); - list_add_tail(&x0->bydst, xfrm4_state_afinfo.state_bydst+h); - h = __xfrm4_src_hash(saddr); - xfrm_state_hold(x0); - list_add_tail(&x0->bysrc, xfrm4_state_afinfo.state_bysrc+h); - wake_up(&km_waitq); - } - if (x0) - xfrm_state_hold(x0); - return x0; -} - static struct xfrm_state_afinfo xfrm4_state_afinfo = { .family = AF_INET, .init_flags = xfrm4_init_flags, .init_tempsel = __xfrm4_init_tempsel, .state_lookup = __xfrm4_state_lookup, .state_lookup_byaddr = __xfrm4_state_lookup_byaddr, - .find_acq = __xfrm4_find_acq, }; void __init xfrm4_state_init(void) diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c index 6269584..40fcaab 100644 --- a/net/ipv6/xfrm6_state.c +++ b/net/ipv6/xfrm6_state.c @@ -101,61 +101,6 @@ __xfrm6_state_lookup(xfrm_address_t *daddr, u32 spi, u8 proto) return NULL; } -static struct xfrm_state * -__xfrm6_find_acq(u8 mode, u32 reqid, u8 proto, - xfrm_address_t *daddr, xfrm_address_t *saddr, - int create) -{ - struct xfrm_state *x, *x0; - unsigned h = __xfrm6_dst_hash(daddr); - - x0 = NULL; - - list_for_each_entry(x, xfrm6_state_afinfo.state_bydst+h, bydst) { - if (x->props.family == AF_INET6 && - ipv6_addr_equal((struct in6_addr *)daddr, (struct in6_addr *)x->id.daddr.a6) && - mode == x->props.mode && - proto == x->id.proto && - ipv6_addr_equal((struct in6_addr *)saddr, (struct in6_addr *)x->props.saddr.a6) && - reqid == x->props.reqid && - x->km.state == XFRM_STATE_ACQ && - !x->id.spi) { - x0 = x; - break; - } - } - if (!x0 && create && (x0 = xfrm_state_alloc()) != NULL) { - ipv6_addr_copy((struct in6_addr *)x0->sel.daddr.a6, - (struct in6_addr *)daddr); - ipv6_addr_copy((struct in6_addr *)x0->sel.saddr.a6, - (struct in6_addr *)saddr); - x0->sel.prefixlen_d = 128; - x0->sel.prefixlen_s = 128; - ipv6_addr_copy((struct in6_addr *)x0->props.saddr.a6, - (struct in6_addr *)saddr); - x0->km.state = XFRM_STATE_ACQ; - ipv6_addr_copy((struct in6_addr *)x0->id.daddr.a6, - (struct in6_addr *)daddr); - x0->id.proto = proto; - x0->props.family = AF_INET6; - x0->props.mode = mode; - x0->props.reqid = reqid; - x0->lft.hard_add_expires_seconds = XFRM_ACQ_EXPIRES; - xfrm_state_hold(x0); - x0->timer.expires = jiffies + XFRM_ACQ_EXPIRES*HZ; - add_timer(&x0->timer); - xfrm_state_hold(x0); - list_add_tail(&x0->bydst, xfrm6_state_afinfo.state_bydst+h); - h = __xfrm6_src_hash(saddr); - xfrm_state_hold(x0); - list_add_tail(&x0->bysrc, xfrm6_state_afinfo.state_bysrc+h); - wake_up(&km_waitq); - } - if (x0) - xfrm_state_hold(x0); - return x0; -} - static int __xfrm6_state_sort(struct xfrm_state **dst, struct xfrm_state **src, int n) { @@ -280,7 +225,6 @@ static struct xfrm_state_afinfo xfrm6_state_afinfo = { .init_tempsel = __xfrm6_init_tempsel, .state_lookup = __xfrm6_state_lookup, .state_lookup_byaddr = __xfrm6_state_lookup_byaddr, - .find_acq = __xfrm6_find_acq, .tmpl_sort = __xfrm6_tmpl_sort, .state_sort = __xfrm6_state_sort, }; diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 622e92a..80f5f9d 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -48,6 +48,18 @@ static struct list_head xfrm_state_bydst[XFRM_DST_HSIZE]; static struct list_head xfrm_state_bysrc[XFRM_DST_HSIZE]; static struct list_head xfrm_state_byspi[XFRM_DST_HSIZE]; +static __inline__ +unsigned xfrm_dst_hash(xfrm_address_t *addr, unsigned short family) +{ + switch (family) { + case AF_INET: + return __xfrm4_dst_hash(addr); + case AF_INET6: + return __xfrm6_dst_hash(addr); + } + return 0; +} + DECLARE_WAIT_QUEUE_HEAD(km_waitq); EXPORT_SYMBOL(km_waitq); @@ -489,6 +501,89 @@ void xfrm_state_insert(struct xfrm_state *x) } EXPORT_SYMBOL(xfrm_state_insert); +/* xfrm_state_lock is held */ +static struct xfrm_state *__find_acq_core(unsigned short family, u8 mode, u32 reqid, u8 proto, xfrm_address_t *daddr, xfrm_address_t *saddr, int create) +{ + unsigned int h = xfrm_dst_hash(daddr, family); + struct xfrm_state *x; + + list_for_each_entry(x, xfrm_state_bydst+h, bydst) { + if (x->props.reqid != reqid || + x->props.mode != mode || + x->props.family != family || + x->km.state != XFRM_STATE_ACQ || + x->id.spi != 0) + continue; + + switch (family) { + case AF_INET: + if (x->id.daddr.a4 != daddr->a4 || + x->props.saddr.a4 != saddr->a4) + continue; + break; + case AF_INET6: + if (!ipv6_addr_equal((struct in6_addr *)x->id.daddr.a6, + (struct in6_addr *)daddr) || + !ipv6_addr_equal((struct in6_addr *) + x->props.saddr.a6, + (struct in6_addr *)saddr)) + continue; + break; + }; + + xfrm_state_hold(x); + return x; + } + + if (!create) + return NULL; + + x = xfrm_state_alloc(); + if (likely(x)) { + switch (family) { + case AF_INET: + x->sel.daddr.a4 = daddr->a4; + x->sel.saddr.a4 = saddr->a4; + x->sel.prefixlen_d = 32; + x->sel.prefixlen_s = 32; + x->props.saddr.a4 = saddr->a4; + x->id.daddr.a4 = daddr->a4; + break; + + case AF_INET6: + ipv6_addr_copy((struct in6_addr *)x->sel.daddr.a6, + (struct in6_addr *)daddr); + ipv6_addr_copy((struct in6_addr *)x->sel.saddr.a6, + (struct in6_addr *)saddr); + x->sel.prefixlen_d = 128; + x->sel.prefixlen_s = 128; + ipv6_addr_copy((struct in6_addr *)x->props.saddr.a6, + (struct in6_addr *)saddr); + ipv6_addr_copy((struct in6_addr *)x->id.daddr.a6, + (struct in6_addr *)daddr); + break; + }; + + x->km.state = XFRM_STATE_ACQ; + x->id.proto = proto; + x->props.family = family; + x->props.mode = mode; + x->props.reqid = reqid; + x->lft.hard_add_expires_seconds = XFRM_ACQ_EXPIRES; + xfrm_state_hold(x); + x->timer.expires = jiffies + XFRM_ACQ_EXPIRES*HZ; + add_timer(&x->timer); + xfrm_state_hold(x); + list_add_tail(&x->bydst, xfrm_state_bydst+h); + h = xfrm_src_hash(saddr, family); + xfrm_state_hold(x); + list_add_tail(&x->bysrc, xfrm_state_bysrc+h); + wake_up(&km_waitq); + } + + return x; +} + static inline struct xfrm_state * __xfrm_state_locate(struct xfrm_state_afinfo *afinfo, struct xfrm_state *x, int use_spi) @@ -533,9 +628,9 @@ int xfrm_state_add(struct xfrm_state *x) } if (use_spi && !x1) - x1 = afinfo->find_acq( - x->props.mode, x->props.reqid, x->id.proto, - &x->id.daddr, &x->props.saddr, 0); + x1 = __find_acq_core(family, x->props.mode, x->props.reqid, + x->id.proto, + &x->id.daddr, &x->props.saddr, 0); __xfrm_state_insert(x); err = 0; @@ -716,14 +811,11 @@ xfrm_find_acq(u8 mode, u32 reqid, u8 proto, int create, unsigned short family) { struct xfrm_state *x; - struct xfrm_state_afinfo *afinfo = xfrm_state_get_afinfo(family); - if (!afinfo) - return NULL; spin_lock_bh(&xfrm_state_lock); - x = afinfo->find_acq(mode, reqid, proto, daddr, saddr, create); + x = __find_acq_core(family, mode, reqid, proto, daddr, saddr, create); spin_unlock_bh(&xfrm_state_lock); - xfrm_state_put_afinfo(afinfo); + return x; } EXPORT_SYMBOL(xfrm_find_acq); @@ -1181,7 +1273,6 @@ int xfrm_state_register_afinfo(struct xfrm_state_afinfo *afinfo) if (unlikely(xfrm_state_afinfo[afinfo->family] != NULL)) err = -ENOBUFS; else { - afinfo->state_bydst = xfrm_state_bydst; afinfo->state_bysrc = xfrm_state_bysrc; afinfo->state_byspi = xfrm_state_byspi; xfrm_state_afinfo[afinfo->family] = afinfo; @@ -1206,7 +1297,6 @@ int xfrm_state_unregister_afinfo(struct xfrm_state_afinfo *afinfo) xfrm_state_afinfo[afinfo->family] = NULL; afinfo->state_byspi = NULL; afinfo->state_bysrc = NULL; - afinfo->state_bydst = NULL; } } write_unlock_bh(&xfrm_state_afinfo_lock); -- cgit v0.10.2 From edcd582152090bfb0ccb4ad444c151798a73eda8 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Thu, 24 Aug 2006 00:42:45 -0700 Subject: [XFRM]: Pull xfrm_state_by{spi,src} hash table knowledge out of afinfo. Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index cc83443..dd3b84b 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -243,14 +243,10 @@ extern int __xfrm_state_delete(struct xfrm_state *x); struct xfrm_state_afinfo { unsigned short family; - struct list_head *state_bysrc; - struct list_head *state_byspi; int (*init_flags)(struct xfrm_state *x); void (*init_tempsel)(struct xfrm_state *x, struct flowi *fl, struct xfrm_tmpl *tmpl, xfrm_address_t *daddr, xfrm_address_t *saddr); - struct xfrm_state *(*state_lookup)(xfrm_address_t *daddr, u32 spi, u8 proto); - struct xfrm_state *(*state_lookup_byaddr)(xfrm_address_t *daddr, xfrm_address_t *saddr, u8 proto); int (*tmpl_sort)(struct xfrm_tmpl **dst, struct xfrm_tmpl **src, int n); int (*state_sort)(struct xfrm_state **dst, struct xfrm_state **src, int n); }; @@ -431,80 +427,6 @@ static inline void xfrm_pols_put(struct xfrm_policy **pols, int npols) } #endif -#define XFRM_DST_HSIZE 1024 - -static __inline__ -unsigned __xfrm4_dst_hash(xfrm_address_t *addr) -{ - unsigned h; - h = ntohl(addr->a4); - h = (h ^ (h>>16)) % XFRM_DST_HSIZE; - return h; -} - -static __inline__ -unsigned __xfrm6_dst_hash(xfrm_address_t *addr) -{ - unsigned h; - h = ntohl(addr->a6[2]^addr->a6[3]); - h = (h ^ (h>>16)) % XFRM_DST_HSIZE; - return h; -} - -static __inline__ -unsigned __xfrm4_src_hash(xfrm_address_t *addr) -{ - return __xfrm4_dst_hash(addr); -} - -static __inline__ -unsigned __xfrm6_src_hash(xfrm_address_t *addr) -{ - return __xfrm6_dst_hash(addr); -} - -static __inline__ -unsigned xfrm_src_hash(xfrm_address_t *addr, unsigned short family) -{ - switch (family) { - case AF_INET: - return __xfrm4_src_hash(addr); - case AF_INET6: - return __xfrm6_src_hash(addr); - } - return 0; -} - -static __inline__ -unsigned __xfrm4_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto) -{ - unsigned h; - h = ntohl(addr->a4^spi^proto); - h = (h ^ (h>>10) ^ (h>>20)) % XFRM_DST_HSIZE; - return h; -} - -static __inline__ -unsigned __xfrm6_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto) -{ - unsigned h; - h = ntohl(addr->a6[2]^addr->a6[3]^spi^proto); - h = (h ^ (h>>10) ^ (h>>20)) % XFRM_DST_HSIZE; - return h; -} - -static __inline__ -unsigned xfrm_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto, unsigned short family) -{ - switch (family) { - case AF_INET: - return __xfrm4_spi_hash(addr, spi, proto); - case AF_INET6: - return __xfrm6_spi_hash(addr, spi, proto); - } - return 0; /*XXX*/ -} - extern void __xfrm_state_destroy(struct xfrm_state *); static inline void __xfrm_state_put(struct xfrm_state *x) diff --git a/net/ipv4/xfrm4_state.c b/net/ipv4/xfrm4_state.c index 9dc1afc..6a2a4ab 100644 --- a/net/ipv4/xfrm4_state.c +++ b/net/ipv4/xfrm4_state.c @@ -62,38 +62,10 @@ __xfrm4_init_tempsel(struct xfrm_state *x, struct flowi *fl, x->props.family = AF_INET; } -static struct xfrm_state * -__xfrm4_state_lookup(xfrm_address_t *daddr, u32 spi, u8 proto) -{ - unsigned h = __xfrm4_spi_hash(daddr, spi, proto); - struct xfrm_state *x; - - list_for_each_entry(x, xfrm4_state_afinfo.state_byspi+h, byspi) { - if (x->props.family == AF_INET && - spi == x->id.spi && - daddr->a4 == x->id.daddr.a4 && - proto == x->id.proto) { - xfrm_state_hold(x); - return x; - } - } - return NULL; -} - -/* placeholder until ipv4's code is written */ -static struct xfrm_state * -__xfrm4_state_lookup_byaddr(xfrm_address_t *daddr, xfrm_address_t *saddr, - u8 proto) -{ - return NULL; -} - static struct xfrm_state_afinfo xfrm4_state_afinfo = { .family = AF_INET, .init_flags = xfrm4_init_flags, .init_tempsel = __xfrm4_init_tempsel, - .state_lookup = __xfrm4_state_lookup, - .state_lookup_byaddr = __xfrm4_state_lookup_byaddr, }; void __init xfrm4_state_init(void) diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c index 40fcaab..d88cd92 100644 --- a/net/ipv6/xfrm6_state.c +++ b/net/ipv6/xfrm6_state.c @@ -63,44 +63,6 @@ __xfrm6_init_tempsel(struct xfrm_state *x, struct flowi *fl, x->props.family = AF_INET6; } -static struct xfrm_state * -__xfrm6_state_lookup_byaddr(xfrm_address_t *daddr, xfrm_address_t *saddr, - u8 proto) -{ - struct xfrm_state *x = NULL; - unsigned h; - - h = __xfrm6_src_hash(saddr); - list_for_each_entry(x, xfrm6_state_afinfo.state_bysrc+h, bysrc) { - if (x->props.family == AF_INET6 && - ipv6_addr_equal((struct in6_addr *)daddr, (struct in6_addr *)x->id.daddr.a6) && - ipv6_addr_equal((struct in6_addr *)saddr, (struct in6_addr *)x->props.saddr.a6) && - proto == x->id.proto) { - xfrm_state_hold(x); - return x; - } - } - return NULL; -} - -static struct xfrm_state * -__xfrm6_state_lookup(xfrm_address_t *daddr, u32 spi, u8 proto) -{ - unsigned h = __xfrm6_spi_hash(daddr, spi, proto); - struct xfrm_state *x; - - list_for_each_entry(x, xfrm6_state_afinfo.state_byspi+h, byspi) { - if (x->props.family == AF_INET6 && - spi == x->id.spi && - ipv6_addr_equal((struct in6_addr *)daddr, (struct in6_addr *)x->id.daddr.a6) && - proto == x->id.proto) { - xfrm_state_hold(x); - return x; - } - } - return NULL; -} - static int __xfrm6_state_sort(struct xfrm_state **dst, struct xfrm_state **src, int n) { @@ -223,8 +185,6 @@ __xfrm6_tmpl_sort(struct xfrm_tmpl **dst, struct xfrm_tmpl **src, int n) static struct xfrm_state_afinfo xfrm6_state_afinfo = { .family = AF_INET6, .init_tempsel = __xfrm6_init_tempsel, - .state_lookup = __xfrm6_state_lookup, - .state_lookup_byaddr = __xfrm6_state_lookup_byaddr, .tmpl_sort = __xfrm6_tmpl_sort, .state_sort = __xfrm6_state_sort, }; diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 80f5f9d..4a3832f 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -38,6 +38,8 @@ EXPORT_SYMBOL(sysctl_xfrm_aevent_rseqth); static DEFINE_SPINLOCK(xfrm_state_lock); +#define XFRM_DST_HSIZE 1024 + /* Hash table to find appropriate SA towards given target (endpoint * of tunnel or destination of transport mode) allowed by selector. * @@ -49,6 +51,48 @@ static struct list_head xfrm_state_bysrc[XFRM_DST_HSIZE]; static struct list_head xfrm_state_byspi[XFRM_DST_HSIZE]; static __inline__ +unsigned __xfrm4_dst_hash(xfrm_address_t *addr) +{ + unsigned h; + h = ntohl(addr->a4); + h = (h ^ (h>>16)) % XFRM_DST_HSIZE; + return h; +} + +static __inline__ +unsigned __xfrm6_dst_hash(xfrm_address_t *addr) +{ + unsigned h; + h = ntohl(addr->a6[2]^addr->a6[3]); + h = (h ^ (h>>16)) % XFRM_DST_HSIZE; + return h; +} + +static __inline__ +unsigned __xfrm4_src_hash(xfrm_address_t *addr) +{ + return __xfrm4_dst_hash(addr); +} + +static __inline__ +unsigned __xfrm6_src_hash(xfrm_address_t *addr) +{ + return __xfrm6_dst_hash(addr); +} + +static __inline__ +unsigned xfrm_src_hash(xfrm_address_t *addr, unsigned short family) +{ + switch (family) { + case AF_INET: + return __xfrm4_src_hash(addr); + case AF_INET6: + return __xfrm6_src_hash(addr); + } + return 0; +} + +static __inline__ unsigned xfrm_dst_hash(xfrm_address_t *addr, unsigned short family) { switch (family) { @@ -60,6 +104,36 @@ unsigned xfrm_dst_hash(xfrm_address_t *addr, unsigned short family) return 0; } +static __inline__ +unsigned __xfrm4_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto) +{ + unsigned h; + h = ntohl(addr->a4^spi^proto); + h = (h ^ (h>>10) ^ (h>>20)) % XFRM_DST_HSIZE; + return h; +} + +static __inline__ +unsigned __xfrm6_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto) +{ + unsigned h; + h = ntohl(addr->a6[2]^addr->a6[3]^spi^proto); + h = (h ^ (h>>10) ^ (h>>20)) % XFRM_DST_HSIZE; + return h; +} + +static __inline__ +unsigned xfrm_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto, unsigned short family) +{ + switch (family) { + case AF_INET: + return __xfrm4_spi_hash(addr, spi, proto); + case AF_INET6: + return __xfrm6_spi_hash(addr, spi, proto); + } + return 0; /*XXX*/ +} + DECLARE_WAIT_QUEUE_HEAD(km_waitq); EXPORT_SYMBOL(km_waitq); @@ -342,6 +416,83 @@ xfrm_init_tempsel(struct xfrm_state *x, struct flowi *fl, return 0; } +static struct xfrm_state *__xfrm_state_lookup(xfrm_address_t *daddr, u32 spi, u8 proto, unsigned short family) +{ + unsigned int h = xfrm_spi_hash(daddr, spi, proto, family); + struct xfrm_state *x; + + list_for_each_entry(x, xfrm_state_byspi+h, byspi) { + if (x->props.family != family || + x->id.spi != spi || + x->id.proto != proto) + continue; + + switch (family) { + case AF_INET: + if (x->id.daddr.a4 != daddr->a4) + continue; + break; + case AF_INET6: + if (!ipv6_addr_equal((struct in6_addr *)daddr, + (struct in6_addr *) + x->id.daddr.a6)) + continue; + break; + }; + + xfrm_state_hold(x); + return x; + } + + return NULL; +} + +static struct xfrm_state *__xfrm_state_lookup_byaddr(xfrm_address_t *daddr, xfrm_address_t *saddr, u8 proto, unsigned short family) +{ + unsigned int h = xfrm_src_hash(saddr, family); + struct xfrm_state *x; + + list_for_each_entry(x, xfrm_state_bysrc+h, bysrc) { + if (x->props.family != family || + x->id.proto != proto) + continue; + + switch (family) { + case AF_INET: + if (x->id.daddr.a4 != daddr->a4 || + x->props.saddr.a4 != saddr->a4) + continue; + break; + case AF_INET6: + if (!ipv6_addr_equal((struct in6_addr *)daddr, + (struct in6_addr *) + x->id.daddr.a6) || + !ipv6_addr_equal((struct in6_addr *)saddr, + (struct in6_addr *) + x->props.saddr.a6)) + continue; + break; + }; + + xfrm_state_hold(x); + return x; + } + + return NULL; +} + +static inline struct xfrm_state * +__xfrm_state_locate(struct xfrm_state *x, int use_spi, int family) +{ + if (use_spi) + return __xfrm_state_lookup(&x->id.daddr, x->id.spi, + x->id.proto, family); + else + return __xfrm_state_lookup_byaddr(&x->id.daddr, + &x->props.saddr, + x->id.proto, family); +} + struct xfrm_state * xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t *saddr, struct flowi *fl, struct xfrm_tmpl *tmpl, @@ -353,14 +504,7 @@ xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t *saddr, int acquire_in_progress = 0; int error = 0; struct xfrm_state *best = NULL; - struct xfrm_state_afinfo *afinfo; - afinfo = xfrm_state_get_afinfo(family); - if (afinfo == NULL) { - *err = -EAFNOSUPPORT; - return NULL; - } - spin_lock_bh(&xfrm_state_lock); list_for_each_entry(x, xfrm_state_bydst+h, bydst) { if (x->props.family == family && @@ -406,8 +550,8 @@ xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t *saddr, x = best; if (!x && !error && !acquire_in_progress) { if (tmpl->id.spi && - (x0 = afinfo->state_lookup(daddr, tmpl->id.spi, - tmpl->id.proto)) != NULL) { + (x0 = __xfrm_state_lookup(daddr, tmpl->id.spi, + tmpl->id.proto, family)) != NULL) { xfrm_state_put(x0); error = -EEXIST; goto out; @@ -457,7 +601,6 @@ out: else *err = acquire_in_progress ? -EAGAIN : error; spin_unlock_bh(&xfrm_state_lock); - xfrm_state_put_afinfo(afinfo); return x; } @@ -584,34 +727,20 @@ static struct xfrm_state *__find_acq_core(unsigned short family, u8 mode, u32 re return x; } -static inline struct xfrm_state * -__xfrm_state_locate(struct xfrm_state_afinfo *afinfo, struct xfrm_state *x, - int use_spi) -{ - if (use_spi) - return afinfo->state_lookup(&x->id.daddr, x->id.spi, x->id.proto); - else - return afinfo->state_lookup_byaddr(&x->id.daddr, &x->props.saddr, x->id.proto); -} - static struct xfrm_state *__xfrm_find_acq_byseq(u32 seq); int xfrm_state_add(struct xfrm_state *x) { - struct xfrm_state_afinfo *afinfo; struct xfrm_state *x1; int family; int err; int use_spi = xfrm_id_proto_match(x->id.proto, IPSEC_PROTO_ANY); family = x->props.family; - afinfo = xfrm_state_get_afinfo(family); - if (unlikely(afinfo == NULL)) - return -EAFNOSUPPORT; spin_lock_bh(&xfrm_state_lock); - x1 = __xfrm_state_locate(afinfo, x, use_spi); + x1 = __xfrm_state_locate(x, use_spi, family); if (x1) { xfrm_state_put(x1); x1 = NULL; @@ -637,7 +766,6 @@ int xfrm_state_add(struct xfrm_state *x) out: spin_unlock_bh(&xfrm_state_lock); - xfrm_state_put_afinfo(afinfo); if (!err) xfrm_flush_all_bundles(); @@ -653,17 +781,12 @@ EXPORT_SYMBOL(xfrm_state_add); int xfrm_state_update(struct xfrm_state *x) { - struct xfrm_state_afinfo *afinfo; struct xfrm_state *x1; int err; int use_spi = xfrm_id_proto_match(x->id.proto, IPSEC_PROTO_ANY); - afinfo = xfrm_state_get_afinfo(x->props.family); - if (unlikely(afinfo == NULL)) - return -EAFNOSUPPORT; - spin_lock_bh(&xfrm_state_lock); - x1 = __xfrm_state_locate(afinfo, x, use_spi); + x1 = __xfrm_state_locate(x, use_spi, x->props.family); err = -ESRCH; if (!x1) @@ -683,7 +806,6 @@ int xfrm_state_update(struct xfrm_state *x) out: spin_unlock_bh(&xfrm_state_lock); - xfrm_state_put_afinfo(afinfo); if (err) return err; @@ -776,14 +898,10 @@ xfrm_state_lookup(xfrm_address_t *daddr, u32 spi, u8 proto, unsigned short family) { struct xfrm_state *x; - struct xfrm_state_afinfo *afinfo = xfrm_state_get_afinfo(family); - if (!afinfo) - return NULL; spin_lock_bh(&xfrm_state_lock); - x = afinfo->state_lookup(daddr, spi, proto); + x = __xfrm_state_lookup(daddr, spi, proto, family); spin_unlock_bh(&xfrm_state_lock); - xfrm_state_put_afinfo(afinfo); return x; } EXPORT_SYMBOL(xfrm_state_lookup); @@ -793,14 +911,10 @@ xfrm_state_lookup_byaddr(xfrm_address_t *daddr, xfrm_address_t *saddr, u8 proto, unsigned short family) { struct xfrm_state *x; - struct xfrm_state_afinfo *afinfo = xfrm_state_get_afinfo(family); - if (!afinfo) - return NULL; spin_lock_bh(&xfrm_state_lock); - x = afinfo->state_lookup_byaddr(daddr, saddr, proto); + x = __xfrm_state_lookup_byaddr(daddr, saddr, proto, family); spin_unlock_bh(&xfrm_state_lock); - xfrm_state_put_afinfo(afinfo); return x; } EXPORT_SYMBOL(xfrm_state_lookup_byaddr); @@ -1272,11 +1386,8 @@ int xfrm_state_register_afinfo(struct xfrm_state_afinfo *afinfo) write_lock_bh(&xfrm_state_afinfo_lock); if (unlikely(xfrm_state_afinfo[afinfo->family] != NULL)) err = -ENOBUFS; - else { - afinfo->state_bysrc = xfrm_state_bysrc; - afinfo->state_byspi = xfrm_state_byspi; + else xfrm_state_afinfo[afinfo->family] = afinfo; - } write_unlock_bh(&xfrm_state_afinfo_lock); return err; } @@ -1293,11 +1404,8 @@ int xfrm_state_unregister_afinfo(struct xfrm_state_afinfo *afinfo) if (likely(xfrm_state_afinfo[afinfo->family] != NULL)) { if (unlikely(xfrm_state_afinfo[afinfo->family] != afinfo)) err = -EINVAL; - else { + else xfrm_state_afinfo[afinfo->family] = NULL; - afinfo->state_byspi = NULL; - afinfo->state_bysrc = NULL; - } } write_unlock_bh(&xfrm_state_afinfo_lock); return err; -- cgit v0.10.2 From 8f126e37c0b250310a48a609bedf92a19a5559ec Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Thu, 24 Aug 2006 02:45:07 -0700 Subject: [XFRM]: Convert xfrm_state hash linkage to hlists. Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index dd3b84b..3405e5d 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -94,9 +94,9 @@ extern struct mutex xfrm_cfg_mutex; struct xfrm_state { /* Note: bydst is re-used during gc */ - struct list_head bydst; - struct list_head bysrc; - struct list_head byspi; + struct hlist_node bydst; + struct hlist_node bysrc; + struct hlist_node byspi; atomic_t refcnt; spinlock_t lock; diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 4a3832f..fe3c8c3 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -46,9 +46,9 @@ static DEFINE_SPINLOCK(xfrm_state_lock); * Main use is finding SA after policy selected tunnel or transport mode. * Also, it can be used by ah/esp icmp error handler to find offending SA. */ -static struct list_head xfrm_state_bydst[XFRM_DST_HSIZE]; -static struct list_head xfrm_state_bysrc[XFRM_DST_HSIZE]; -static struct list_head xfrm_state_byspi[XFRM_DST_HSIZE]; +static struct hlist_head xfrm_state_bydst[XFRM_DST_HSIZE]; +static struct hlist_head xfrm_state_bysrc[XFRM_DST_HSIZE]; +static struct hlist_head xfrm_state_byspi[XFRM_DST_HSIZE]; static __inline__ unsigned __xfrm4_dst_hash(xfrm_address_t *addr) @@ -141,7 +141,7 @@ static DEFINE_RWLOCK(xfrm_state_afinfo_lock); static struct xfrm_state_afinfo *xfrm_state_afinfo[NPROTO]; static struct work_struct xfrm_state_gc_work; -static struct list_head xfrm_state_gc_list = LIST_HEAD_INIT(xfrm_state_gc_list); +static HLIST_HEAD(xfrm_state_gc_list); static DEFINE_SPINLOCK(xfrm_state_gc_lock); static int xfrm_state_gc_flush_bundles; @@ -178,8 +178,8 @@ static void xfrm_state_gc_destroy(struct xfrm_state *x) static void xfrm_state_gc_task(void *data) { struct xfrm_state *x; - struct list_head *entry, *tmp; - struct list_head gc_list = LIST_HEAD_INIT(gc_list); + struct hlist_node *entry, *tmp; + struct hlist_head gc_list; if (xfrm_state_gc_flush_bundles) { xfrm_state_gc_flush_bundles = 0; @@ -187,13 +187,13 @@ static void xfrm_state_gc_task(void *data) } spin_lock_bh(&xfrm_state_gc_lock); - list_splice_init(&xfrm_state_gc_list, &gc_list); + gc_list.first = xfrm_state_gc_list.first; + INIT_HLIST_HEAD(&xfrm_state_gc_list); spin_unlock_bh(&xfrm_state_gc_lock); - list_for_each_safe(entry, tmp, &gc_list) { - x = list_entry(entry, struct xfrm_state, bydst); + hlist_for_each_entry_safe(x, entry, tmp, &gc_list, bydst) xfrm_state_gc_destroy(x); - } + wake_up(&km_waitq); } @@ -287,9 +287,9 @@ struct xfrm_state *xfrm_state_alloc(void) if (x) { atomic_set(&x->refcnt, 1); atomic_set(&x->tunnel_users, 0); - INIT_LIST_HEAD(&x->bydst); - INIT_LIST_HEAD(&x->bysrc); - INIT_LIST_HEAD(&x->byspi); + INIT_HLIST_NODE(&x->bydst); + INIT_HLIST_NODE(&x->bysrc); + INIT_HLIST_NODE(&x->byspi); init_timer(&x->timer); x->timer.function = xfrm_timer_handler; x->timer.data = (unsigned long)x; @@ -314,7 +314,7 @@ void __xfrm_state_destroy(struct xfrm_state *x) BUG_TRAP(x->km.state == XFRM_STATE_DEAD); spin_lock_bh(&xfrm_state_gc_lock); - list_add(&x->bydst, &xfrm_state_gc_list); + hlist_add_head(&x->bydst, &xfrm_state_gc_list); spin_unlock_bh(&xfrm_state_gc_lock); schedule_work(&xfrm_state_gc_work); } @@ -327,12 +327,12 @@ int __xfrm_state_delete(struct xfrm_state *x) if (x->km.state != XFRM_STATE_DEAD) { x->km.state = XFRM_STATE_DEAD; spin_lock(&xfrm_state_lock); - list_del(&x->bydst); + hlist_del(&x->bydst); __xfrm_state_put(x); - list_del(&x->bysrc); + hlist_del(&x->bysrc); __xfrm_state_put(x); if (x->id.spi) { - list_del(&x->byspi); + hlist_del(&x->byspi); __xfrm_state_put(x); } spin_unlock(&xfrm_state_lock); @@ -378,12 +378,13 @@ EXPORT_SYMBOL(xfrm_state_delete); void xfrm_state_flush(u8 proto) { int i; - struct xfrm_state *x; spin_lock_bh(&xfrm_state_lock); for (i = 0; i < XFRM_DST_HSIZE; i++) { + struct hlist_node *entry; + struct xfrm_state *x; restart: - list_for_each_entry(x, xfrm_state_bydst+i, bydst) { + hlist_for_each_entry(x, entry, xfrm_state_bydst+i, bydst) { if (!xfrm_state_kern(x) && xfrm_id_proto_match(x->id.proto, proto)) { xfrm_state_hold(x); @@ -420,8 +421,9 @@ static struct xfrm_state *__xfrm_state_lookup(xfrm_address_t *daddr, u32 spi, u8 { unsigned int h = xfrm_spi_hash(daddr, spi, proto, family); struct xfrm_state *x; + struct hlist_node *entry; - list_for_each_entry(x, xfrm_state_byspi+h, byspi) { + hlist_for_each_entry(x, entry, xfrm_state_byspi+h, byspi) { if (x->props.family != family || x->id.spi != spi || x->id.proto != proto) @@ -451,8 +453,9 @@ static struct xfrm_state *__xfrm_state_lookup_byaddr(xfrm_address_t *daddr, xfrm { unsigned int h = xfrm_src_hash(saddr, family); struct xfrm_state *x; + struct hlist_node *entry; - list_for_each_entry(x, xfrm_state_bysrc+h, bysrc) { + hlist_for_each_entry(x, entry, xfrm_state_bysrc+h, bysrc) { if (x->props.family != family || x->id.proto != proto) continue; @@ -499,14 +502,15 @@ xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t *saddr, struct xfrm_policy *pol, int *err, unsigned short family) { - unsigned h = xfrm_dst_hash(daddr, family); + unsigned int h = xfrm_dst_hash(daddr, family); + struct hlist_node *entry; struct xfrm_state *x, *x0; int acquire_in_progress = 0; int error = 0; struct xfrm_state *best = NULL; spin_lock_bh(&xfrm_state_lock); - list_for_each_entry(x, xfrm_state_bydst+h, bydst) { + hlist_for_each_entry(x, entry, xfrm_state_bydst+h, bydst) { if (x->props.family == family && x->props.reqid == tmpl->reqid && !(x->props.flags & XFRM_STATE_WILDRECV) && @@ -575,13 +579,14 @@ xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t *saddr, if (km_query(x, tmpl, pol) == 0) { x->km.state = XFRM_STATE_ACQ; - list_add_tail(&x->bydst, xfrm_state_bydst+h); + hlist_add_head(&x->bydst, xfrm_state_bydst+h); xfrm_state_hold(x); - list_add_tail(&x->bysrc, xfrm_state_bysrc+h); + h = xfrm_src_hash(saddr, family); + hlist_add_head(&x->bysrc, xfrm_state_bysrc+h); xfrm_state_hold(x); if (x->id.spi) { h = xfrm_spi_hash(&x->id.daddr, x->id.spi, x->id.proto, family); - list_add(&x->byspi, xfrm_state_byspi+h); + hlist_add_head(&x->byspi, xfrm_state_byspi+h); xfrm_state_hold(x); } x->lft.hard_add_expires_seconds = XFRM_ACQ_EXPIRES; @@ -608,19 +613,19 @@ static void __xfrm_state_insert(struct xfrm_state *x) { unsigned h = xfrm_dst_hash(&x->id.daddr, x->props.family); - list_add(&x->bydst, xfrm_state_bydst+h); + hlist_add_head(&x->bydst, xfrm_state_bydst+h); xfrm_state_hold(x); h = xfrm_src_hash(&x->props.saddr, x->props.family); - list_add(&x->bysrc, xfrm_state_bysrc+h); + hlist_add_head(&x->bysrc, xfrm_state_bysrc+h); xfrm_state_hold(x); if (xfrm_id_proto_match(x->id.proto, IPSEC_PROTO_ANY)) { h = xfrm_spi_hash(&x->id.daddr, x->id.spi, x->id.proto, x->props.family); - list_add(&x->byspi, xfrm_state_byspi+h); + hlist_add_head(&x->byspi, xfrm_state_byspi+h); xfrm_state_hold(x); } @@ -648,9 +653,10 @@ EXPORT_SYMBOL(xfrm_state_insert); static struct xfrm_state *__find_acq_core(unsigned short family, u8 mode, u32 reqid, u8 proto, xfrm_address_t *daddr, xfrm_address_t *saddr, int create) { unsigned int h = xfrm_dst_hash(daddr, family); + struct hlist_node *entry; struct xfrm_state *x; - list_for_each_entry(x, xfrm_state_bydst+h, bydst) { + hlist_for_each_entry(x, entry, xfrm_state_bydst+h, bydst) { if (x->props.reqid != reqid || x->props.mode != mode || x->props.family != family || @@ -717,10 +723,10 @@ static struct xfrm_state *__find_acq_core(unsigned short family, u8 mode, u32 re x->timer.expires = jiffies + XFRM_ACQ_EXPIRES*HZ; add_timer(&x->timer); xfrm_state_hold(x); - list_add_tail(&x->bydst, xfrm_state_bydst+h); + hlist_add_head(&x->bydst, xfrm_state_bydst+h); h = xfrm_src_hash(saddr, family); xfrm_state_hold(x); - list_add_tail(&x->bysrc, xfrm_state_bysrc+h); + hlist_add_head(&x->bysrc, xfrm_state_bysrc+h); wake_up(&km_waitq); } @@ -977,11 +983,14 @@ EXPORT_SYMBOL(xfrm_state_sort); static struct xfrm_state *__xfrm_find_acq_byseq(u32 seq) { int i; - struct xfrm_state *x; for (i = 0; i < XFRM_DST_HSIZE; i++) { - list_for_each_entry(x, xfrm_state_bydst+i, bydst) { - if (x->km.seq == seq && x->km.state == XFRM_STATE_ACQ) { + struct hlist_node *entry; + struct xfrm_state *x; + + hlist_for_each_entry(x, entry, xfrm_state_bydst+i, bydst) { + if (x->km.seq == seq && + x->km.state == XFRM_STATE_ACQ) { xfrm_state_hold(x); return x; } @@ -1047,7 +1056,7 @@ xfrm_alloc_spi(struct xfrm_state *x, u32 minspi, u32 maxspi) if (x->id.spi) { spin_lock_bh(&xfrm_state_lock); h = xfrm_spi_hash(&x->id.daddr, x->id.spi, x->id.proto, x->props.family); - list_add(&x->byspi, xfrm_state_byspi+h); + hlist_add_head(&x->byspi, xfrm_state_byspi+h); xfrm_state_hold(x); spin_unlock_bh(&xfrm_state_lock); wake_up(&km_waitq); @@ -1060,12 +1069,13 @@ int xfrm_state_walk(u8 proto, int (*func)(struct xfrm_state *, int, void*), { int i; struct xfrm_state *x; + struct hlist_node *entry; int count = 0; int err = 0; spin_lock_bh(&xfrm_state_lock); for (i = 0; i < XFRM_DST_HSIZE; i++) { - list_for_each_entry(x, xfrm_state_bydst+i, bydst) { + hlist_for_each_entry(x, entry, xfrm_state_bydst+i, bydst) { if (xfrm_id_proto_match(x->id.proto, proto)) count++; } @@ -1076,7 +1086,7 @@ int xfrm_state_walk(u8 proto, int (*func)(struct xfrm_state *, int, void*), } for (i = 0; i < XFRM_DST_HSIZE; i++) { - list_for_each_entry(x, xfrm_state_bydst+i, bydst) { + hlist_for_each_entry(x, entry, xfrm_state_bydst+i, bydst) { if (!xfrm_id_proto_match(x->id.proto, proto)) continue; err = func(x, --count, data); @@ -1524,9 +1534,9 @@ void __init xfrm_state_init(void) int i; for (i=0; i Date: Thu, 24 Aug 2006 03:08:07 -0700 Subject: [XFRM]: Dynamic xfrm_state hash table sizing. The grow algorithm is simple, we grow if: 1) we see a hash chain collision at insert, and 2) we haven't hit the hash size limit (currently 1*1024*1024 slots), and 3) the number of xfrm_state objects is > the current hash mask All of this needs some tweaking. Remove __initdata from "hashdist" so we can use it safely at run time. Signed-off-by: David S. Miller diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h index 1021f50..e319c64 100644 --- a/include/linux/bootmem.h +++ b/include/linux/bootmem.h @@ -114,7 +114,7 @@ extern void *__init alloc_large_system_hash(const char *tablename, #else #define HASHDIST_DEFAULT 0 #endif -extern int __initdata hashdist; /* Distribute hashes across NUMA nodes? */ +extern int hashdist; /* Distribute hashes across NUMA nodes? */ #endif /* _LINUX_BOOTMEM_H */ diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 54a4f53..3b5358a 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -2363,7 +2363,7 @@ int percpu_pagelist_fraction_sysctl_handler(ctl_table *table, int write, return 0; } -__initdata int hashdist = HASHDIST_DEFAULT; +int hashdist = HASHDIST_DEFAULT; #ifdef CONFIG_NUMA static int __init set_hashdist(char *str) diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index fe3c8c3..445263c 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -18,6 +18,9 @@ #include #include #include +#include +#include +#include #include struct sock *xfrm_nl; @@ -38,102 +41,230 @@ EXPORT_SYMBOL(sysctl_xfrm_aevent_rseqth); static DEFINE_SPINLOCK(xfrm_state_lock); -#define XFRM_DST_HSIZE 1024 - /* Hash table to find appropriate SA towards given target (endpoint * of tunnel or destination of transport mode) allowed by selector. * * Main use is finding SA after policy selected tunnel or transport mode. * Also, it can be used by ah/esp icmp error handler to find offending SA. */ -static struct hlist_head xfrm_state_bydst[XFRM_DST_HSIZE]; -static struct hlist_head xfrm_state_bysrc[XFRM_DST_HSIZE]; -static struct hlist_head xfrm_state_byspi[XFRM_DST_HSIZE]; - -static __inline__ -unsigned __xfrm4_dst_hash(xfrm_address_t *addr) +static struct hlist_head *xfrm_state_bydst __read_mostly; +static struct hlist_head *xfrm_state_bysrc __read_mostly; +static struct hlist_head *xfrm_state_byspi __read_mostly; +static unsigned int xfrm_state_hmask __read_mostly; +static unsigned int xfrm_state_hashmax __read_mostly = 1 * 1024 * 1024; +static unsigned int xfrm_state_num; + +static inline unsigned int __xfrm4_dst_hash(xfrm_address_t *addr, unsigned int hmask) { - unsigned h; + unsigned int h; h = ntohl(addr->a4); - h = (h ^ (h>>16)) % XFRM_DST_HSIZE; + h = (h ^ (h>>16)) & hmask; return h; } -static __inline__ -unsigned __xfrm6_dst_hash(xfrm_address_t *addr) +static inline unsigned int __xfrm6_dst_hash(xfrm_address_t *addr, unsigned int hmask) { - unsigned h; + unsigned int h; h = ntohl(addr->a6[2]^addr->a6[3]); - h = (h ^ (h>>16)) % XFRM_DST_HSIZE; + h = (h ^ (h>>16)) & hmask; return h; } -static __inline__ -unsigned __xfrm4_src_hash(xfrm_address_t *addr) +static inline unsigned int __xfrm4_src_hash(xfrm_address_t *addr, unsigned int hmask) { - return __xfrm4_dst_hash(addr); + return __xfrm4_dst_hash(addr, hmask); } -static __inline__ -unsigned __xfrm6_src_hash(xfrm_address_t *addr) +static inline unsigned int __xfrm6_src_hash(xfrm_address_t *addr, unsigned int hmask) { - return __xfrm6_dst_hash(addr); + return __xfrm6_dst_hash(addr, hmask); } -static __inline__ -unsigned xfrm_src_hash(xfrm_address_t *addr, unsigned short family) +static inline unsigned __xfrm_src_hash(xfrm_address_t *addr, unsigned short family, unsigned int hmask) { switch (family) { case AF_INET: - return __xfrm4_src_hash(addr); + return __xfrm4_src_hash(addr, hmask); case AF_INET6: - return __xfrm6_src_hash(addr); + return __xfrm6_src_hash(addr, hmask); } return 0; } -static __inline__ -unsigned xfrm_dst_hash(xfrm_address_t *addr, unsigned short family) +static inline unsigned xfrm_src_hash(xfrm_address_t *addr, unsigned short family) +{ + return __xfrm_src_hash(addr, family, xfrm_state_hmask); +} + +static inline unsigned int __xfrm_dst_hash(xfrm_address_t *addr, unsigned short family, unsigned int hmask) { switch (family) { case AF_INET: - return __xfrm4_dst_hash(addr); + return __xfrm4_dst_hash(addr, hmask); case AF_INET6: - return __xfrm6_dst_hash(addr); + return __xfrm6_dst_hash(addr, hmask); } return 0; } -static __inline__ -unsigned __xfrm4_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto) +static inline unsigned int xfrm_dst_hash(xfrm_address_t *addr, unsigned short family) +{ + return __xfrm_dst_hash(addr, family, xfrm_state_hmask); +} + +static inline unsigned int __xfrm4_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto, + unsigned int hmask) { - unsigned h; + unsigned int h; h = ntohl(addr->a4^spi^proto); - h = (h ^ (h>>10) ^ (h>>20)) % XFRM_DST_HSIZE; + h = (h ^ (h>>10) ^ (h>>20)) & hmask; return h; } -static __inline__ -unsigned __xfrm6_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto) +static inline unsigned int __xfrm6_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto, + unsigned int hmask) { - unsigned h; + unsigned int h; h = ntohl(addr->a6[2]^addr->a6[3]^spi^proto); - h = (h ^ (h>>10) ^ (h>>20)) % XFRM_DST_HSIZE; + h = (h ^ (h>>10) ^ (h>>20)) & hmask; return h; } -static __inline__ -unsigned xfrm_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto, unsigned short family) +static inline +unsigned __xfrm_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto, unsigned short family, + unsigned int hmask) { switch (family) { case AF_INET: - return __xfrm4_spi_hash(addr, spi, proto); + return __xfrm4_spi_hash(addr, spi, proto, hmask); case AF_INET6: - return __xfrm6_spi_hash(addr, spi, proto); + return __xfrm6_spi_hash(addr, spi, proto, hmask); } return 0; /*XXX*/ } +static inline unsigned int +xfrm_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto, unsigned short family) +{ + return __xfrm_spi_hash(addr, spi, proto, family, xfrm_state_hmask); +} + +static struct hlist_head *xfrm_state_hash_alloc(unsigned int sz) +{ + struct hlist_head *n; + + if (sz <= PAGE_SIZE) + n = kmalloc(sz, GFP_KERNEL); + else if (hashdist) + n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL); + else + n = (struct hlist_head *) + __get_free_pages(GFP_KERNEL, get_order(sz)); + + if (n) + memset(n, 0, sz); + + return n; +} + +static void xfrm_state_hash_free(struct hlist_head *n, unsigned int sz) +{ + if (sz <= PAGE_SIZE) + kfree(n); + else if (hashdist) + vfree(n); + else + free_pages((unsigned long)n, get_order(sz)); +} + +static void xfrm_hash_transfer(struct hlist_head *list, + struct hlist_head *ndsttable, + struct hlist_head *nsrctable, + struct hlist_head *nspitable, + unsigned int nhashmask) +{ + struct hlist_node *entry, *tmp; + struct xfrm_state *x; + + hlist_for_each_entry_safe(x, entry, tmp, list, bydst) { + unsigned int h; + + h = __xfrm_dst_hash(&x->id.daddr, x->props.family, nhashmask); + hlist_add_head(&x->bydst, ndsttable+h); + + h = __xfrm_src_hash(&x->props.saddr, x->props.family, + nhashmask); + hlist_add_head(&x->bysrc, nsrctable+h); + + h = __xfrm_spi_hash(&x->id.daddr, x->id.spi, x->id.proto, + x->props.family, nhashmask); + hlist_add_head(&x->byspi, nspitable+h); + } +} + +static unsigned long xfrm_hash_new_size(void) +{ + return ((xfrm_state_hmask + 1) << 1) * + sizeof(struct hlist_head); +} + +static DEFINE_MUTEX(hash_resize_mutex); + +static void xfrm_hash_resize(void *__unused) +{ + struct hlist_head *ndst, *nsrc, *nspi, *odst, *osrc, *ospi; + unsigned long nsize, osize; + unsigned int nhashmask, ohashmask; + int i; + + mutex_lock(&hash_resize_mutex); + + nsize = xfrm_hash_new_size(); + ndst = xfrm_state_hash_alloc(nsize); + if (!ndst) + goto out_unlock; + nsrc = xfrm_state_hash_alloc(nsize); + if (!nsrc) { + xfrm_state_hash_free(ndst, nsize); + goto out_unlock; + } + nspi = xfrm_state_hash_alloc(nsize); + if (!nspi) { + xfrm_state_hash_free(ndst, nsize); + xfrm_state_hash_free(nsrc, nsize); + goto out_unlock; + } + + spin_lock_bh(&xfrm_state_lock); + + nhashmask = (nsize / sizeof(struct hlist_head)) - 1U; + for (i = xfrm_state_hmask; i >= 0; i--) + xfrm_hash_transfer(xfrm_state_bydst+i, ndst, nsrc, nspi, + nhashmask); + + odst = xfrm_state_bydst; + osrc = xfrm_state_bysrc; + ospi = xfrm_state_byspi; + ohashmask = xfrm_state_hmask; + + xfrm_state_bydst = ndst; + xfrm_state_bysrc = nsrc; + xfrm_state_byspi = nspi; + xfrm_state_hmask = nhashmask; + + spin_unlock_bh(&xfrm_state_lock); + + osize = (ohashmask + 1) * sizeof(struct hlist_head); + xfrm_state_hash_free(odst, osize); + xfrm_state_hash_free(osrc, osize); + xfrm_state_hash_free(ospi, osize); + +out_unlock: + mutex_unlock(&hash_resize_mutex); +} + +static DECLARE_WORK(xfrm_hash_work, xfrm_hash_resize, NULL); + DECLARE_WAIT_QUEUE_HEAD(km_waitq); EXPORT_SYMBOL(km_waitq); @@ -335,6 +466,7 @@ int __xfrm_state_delete(struct xfrm_state *x) hlist_del(&x->byspi); __xfrm_state_put(x); } + xfrm_state_num--; spin_unlock(&xfrm_state_lock); if (del_timer(&x->timer)) __xfrm_state_put(x); @@ -380,7 +512,7 @@ void xfrm_state_flush(u8 proto) int i; spin_lock_bh(&xfrm_state_lock); - for (i = 0; i < XFRM_DST_HSIZE; i++) { + for (i = 0; i < xfrm_state_hmask; i++) { struct hlist_node *entry; struct xfrm_state *x; restart: @@ -611,7 +743,7 @@ out: static void __xfrm_state_insert(struct xfrm_state *x) { - unsigned h = xfrm_dst_hash(&x->id.daddr, x->props.family); + unsigned int h = xfrm_dst_hash(&x->id.daddr, x->props.family); hlist_add_head(&x->bydst, xfrm_state_bydst+h); xfrm_state_hold(x); @@ -637,6 +769,13 @@ static void __xfrm_state_insert(struct xfrm_state *x) xfrm_state_hold(x); wake_up(&km_waitq); + + xfrm_state_num++; + + if (x->bydst.next != NULL && + (xfrm_state_hmask + 1) < xfrm_state_hashmax && + xfrm_state_num > xfrm_state_hmask) + schedule_work(&xfrm_hash_work); } void xfrm_state_insert(struct xfrm_state *x) @@ -984,7 +1123,7 @@ static struct xfrm_state *__xfrm_find_acq_byseq(u32 seq) { int i; - for (i = 0; i < XFRM_DST_HSIZE; i++) { + for (i = 0; i <= xfrm_state_hmask; i++) { struct hlist_node *entry; struct xfrm_state *x; @@ -1026,7 +1165,7 @@ EXPORT_SYMBOL(xfrm_get_acqseq); void xfrm_alloc_spi(struct xfrm_state *x, u32 minspi, u32 maxspi) { - u32 h; + unsigned int h; struct xfrm_state *x0; if (x->id.spi) @@ -1074,7 +1213,7 @@ int xfrm_state_walk(u8 proto, int (*func)(struct xfrm_state *, int, void*), int err = 0; spin_lock_bh(&xfrm_state_lock); - for (i = 0; i < XFRM_DST_HSIZE; i++) { + for (i = 0; i <= xfrm_state_hmask; i++) { hlist_for_each_entry(x, entry, xfrm_state_bydst+i, bydst) { if (xfrm_id_proto_match(x->id.proto, proto)) count++; @@ -1085,7 +1224,7 @@ int xfrm_state_walk(u8 proto, int (*func)(struct xfrm_state *, int, void*), goto out; } - for (i = 0; i < XFRM_DST_HSIZE; i++) { + for (i = 0; i <= xfrm_state_hmask; i++) { hlist_for_each_entry(x, entry, xfrm_state_bydst+i, bydst) { if (!xfrm_id_proto_match(x->id.proto, proto)) continue; @@ -1531,13 +1670,17 @@ EXPORT_SYMBOL(xfrm_init_state); void __init xfrm_state_init(void) { - int i; + unsigned int sz; + + sz = sizeof(struct hlist_head) * 8; + + xfrm_state_bydst = xfrm_state_hash_alloc(sz); + xfrm_state_bysrc = xfrm_state_hash_alloc(sz); + xfrm_state_byspi = xfrm_state_hash_alloc(sz); + if (!xfrm_state_bydst || !xfrm_state_bysrc || !xfrm_state_byspi) + panic("XFRM: Cannot allocate bydst/bysrc/byspi hashes."); + xfrm_state_hmask = ((sz / sizeof(struct hlist_head)) - 1); - for (i=0; i Date: Thu, 24 Aug 2006 03:18:09 -0700 Subject: [XFRM]: Add generation count to xfrm_state and xfrm_dst. Each xfrm_state inserted gets a new generation counter value. When a bundle is created, the xfrm_dst objects get the current generation counter of the xfrm_state they will attach to at dst->xfrm. xfrm_bundle_ok() will return false if it sees an xfrm_dst with a generation count different from the generation count of the xfrm_state that dst points to. This provides a facility by which to passively and cheaply invalidate cached IPSEC routes during SA database changes. Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 3405e5d..fd4a300 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -104,6 +104,8 @@ struct xfrm_state struct xfrm_id id; struct xfrm_selector sel; + u32 genid; + /* Key manger bits */ struct { u8 state; @@ -590,6 +592,7 @@ struct xfrm_dst struct rt6_info rt6; } u; struct dst_entry *route; + u32 genid; u32 route_mtu_cached; u32 child_mtu_cached; u32 route_cookie; diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c index 42d8ded..4795985 100644 --- a/net/ipv4/xfrm4_policy.c +++ b/net/ipv4/xfrm4_policy.c @@ -93,6 +93,7 @@ __xfrm4_bundle_create(struct xfrm_policy *policy, struct xfrm_state **xfrm, int xdst = (struct xfrm_dst *)dst1; xdst->route = &rt->u.dst; + xdst->genid = xfrm[i]->genid; dst1->next = dst_prev; dst_prev = dst1; diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c index 98c2fe4..9391c4c 100644 --- a/net/ipv6/xfrm6_policy.c +++ b/net/ipv6/xfrm6_policy.c @@ -149,6 +149,7 @@ __xfrm6_bundle_create(struct xfrm_policy *policy, struct xfrm_state **xfrm, int xdst = (struct xfrm_dst *)dst1; xdst->route = &rt->u.dst; + xdst->genid = xfrm[i]->genid; if (rt->rt6i_node) xdst->route_cookie = rt->rt6i_node->fn_sernum; diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 1732159..7fc6944 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -1536,6 +1536,8 @@ int xfrm_bundle_ok(struct xfrm_dst *first, struct flowi *fl, int family, int str return 0; if (dst->xfrm->km.state != XFRM_STATE_VALID) return 0; + if (xdst->genid != dst->xfrm->genid) + return 0; if (strict && fl && dst->xfrm->props.mode != XFRM_MODE_TUNNEL && !xfrm_state_addr_flow_check(dst->xfrm, fl, family)) diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 445263c..535d43c 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -53,6 +53,7 @@ static struct hlist_head *xfrm_state_byspi __read_mostly; static unsigned int xfrm_state_hmask __read_mostly; static unsigned int xfrm_state_hashmax __read_mostly = 1 * 1024 * 1024; static unsigned int xfrm_state_num; +static unsigned int xfrm_state_genid; static inline unsigned int __xfrm4_dst_hash(xfrm_address_t *addr, unsigned int hmask) { @@ -745,6 +746,8 @@ static void __xfrm_state_insert(struct xfrm_state *x) { unsigned int h = xfrm_dst_hash(&x->id.daddr, x->props.family); + x->genid = ++xfrm_state_genid; + hlist_add_head(&x->bydst, xfrm_state_bydst+h); xfrm_state_hold(x); -- cgit v0.10.2 From a624c108e5595b5827796c253481436929cd5344 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Thu, 24 Aug 2006 03:24:33 -0700 Subject: [XFRM]: Put more keys into destination hash function. Besides the daddr, key the hash on family and reqid too. Signed-off-by: David S. Miller diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 535d43c..7e5daaf 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -35,7 +35,7 @@ EXPORT_SYMBOL(sysctl_xfrm_aevent_rseqth); /* Each xfrm_state may be linked to two tables: 1. Hash table by (spi,daddr,ah/esp) to find SA by SPI. (input,ctl) - 2. Hash table by daddr to find what SAs exist for given + 2. Hash table by (daddr,family,reqid) to find what SAs exist for given destination/tunnel endpoint. (output) */ @@ -55,62 +55,56 @@ static unsigned int xfrm_state_hashmax __read_mostly = 1 * 1024 * 1024; static unsigned int xfrm_state_num; static unsigned int xfrm_state_genid; -static inline unsigned int __xfrm4_dst_hash(xfrm_address_t *addr, unsigned int hmask) +static inline unsigned int __xfrm4_addr_hash(xfrm_address_t *addr) { - unsigned int h; - h = ntohl(addr->a4); - h = (h ^ (h>>16)) & hmask; - return h; -} - -static inline unsigned int __xfrm6_dst_hash(xfrm_address_t *addr, unsigned int hmask) -{ - unsigned int h; - h = ntohl(addr->a6[2]^addr->a6[3]); - h = (h ^ (h>>16)) & hmask; - return h; + return ntohl(addr->a4); } -static inline unsigned int __xfrm4_src_hash(xfrm_address_t *addr, unsigned int hmask) +static inline unsigned int __xfrm6_addr_hash(xfrm_address_t *addr) { - return __xfrm4_dst_hash(addr, hmask); + return ntohl(addr->a6[2]^addr->a6[3]); } -static inline unsigned int __xfrm6_src_hash(xfrm_address_t *addr, unsigned int hmask) -{ - return __xfrm6_dst_hash(addr, hmask); -} - -static inline unsigned __xfrm_src_hash(xfrm_address_t *addr, unsigned short family, unsigned int hmask) +static inline unsigned int __xfrm_dst_hash(xfrm_address_t *addr, + u32 reqid, unsigned short family, + unsigned int hmask) { + unsigned int h = family ^ reqid; switch (family) { case AF_INET: - return __xfrm4_src_hash(addr, hmask); + h ^= __xfrm4_addr_hash(addr); + break; case AF_INET6: - return __xfrm6_src_hash(addr, hmask); - } - return 0; + h ^= __xfrm6_addr_hash(addr); + break; + }; + return (h ^ (h >> 16)) & hmask; } -static inline unsigned xfrm_src_hash(xfrm_address_t *addr, unsigned short family) +static inline unsigned int xfrm_dst_hash(xfrm_address_t *addr, u32 reqid, + unsigned short family) { - return __xfrm_src_hash(addr, family, xfrm_state_hmask); + return __xfrm_dst_hash(addr, reqid, family, xfrm_state_hmask); } -static inline unsigned int __xfrm_dst_hash(xfrm_address_t *addr, unsigned short family, unsigned int hmask) +static inline unsigned __xfrm_src_hash(xfrm_address_t *addr, unsigned short family, + unsigned int hmask) { + unsigned int h = family; switch (family) { case AF_INET: - return __xfrm4_dst_hash(addr, hmask); + h ^= __xfrm4_addr_hash(addr); + break; case AF_INET6: - return __xfrm6_dst_hash(addr, hmask); - } - return 0; + h ^= __xfrm6_addr_hash(addr); + break; + }; + return (h ^ (h >> 16)) & hmask; } -static inline unsigned int xfrm_dst_hash(xfrm_address_t *addr, unsigned short family) +static inline unsigned xfrm_src_hash(xfrm_address_t *addr, unsigned short family) { - return __xfrm_dst_hash(addr, family, xfrm_state_hmask); + return __xfrm_src_hash(addr, family, xfrm_state_hmask); } static inline unsigned int __xfrm4_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto, @@ -190,7 +184,8 @@ static void xfrm_hash_transfer(struct hlist_head *list, hlist_for_each_entry_safe(x, entry, tmp, list, bydst) { unsigned int h; - h = __xfrm_dst_hash(&x->id.daddr, x->props.family, nhashmask); + h = __xfrm_dst_hash(&x->id.daddr, x->props.reqid, + x->props.family, nhashmask); hlist_add_head(&x->bydst, ndsttable+h); h = __xfrm_src_hash(&x->props.saddr, x->props.family, @@ -635,7 +630,7 @@ xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t *saddr, struct xfrm_policy *pol, int *err, unsigned short family) { - unsigned int h = xfrm_dst_hash(daddr, family); + unsigned int h = xfrm_dst_hash(daddr, tmpl->reqid, family); struct hlist_node *entry; struct xfrm_state *x, *x0; int acquire_in_progress = 0; @@ -744,15 +739,15 @@ out: static void __xfrm_state_insert(struct xfrm_state *x) { - unsigned int h = xfrm_dst_hash(&x->id.daddr, x->props.family); + unsigned int h; x->genid = ++xfrm_state_genid; + h = xfrm_dst_hash(&x->id.daddr, x->props.reqid, x->props.family); hlist_add_head(&x->bydst, xfrm_state_bydst+h); xfrm_state_hold(x); h = xfrm_src_hash(&x->props.saddr, x->props.family); - hlist_add_head(&x->bysrc, xfrm_state_bysrc+h); xfrm_state_hold(x); @@ -794,7 +789,7 @@ EXPORT_SYMBOL(xfrm_state_insert); /* xfrm_state_lock is held */ static struct xfrm_state *__find_acq_core(unsigned short family, u8 mode, u32 reqid, u8 proto, xfrm_address_t *daddr, xfrm_address_t *saddr, int create) { - unsigned int h = xfrm_dst_hash(daddr, family); + unsigned int h = xfrm_dst_hash(daddr, reqid, family); struct hlist_node *entry; struct xfrm_state *x; -- cgit v0.10.2 From 2575b65434d56559bd03854450b9b6aaf19b9c90 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Thu, 24 Aug 2006 03:26:44 -0700 Subject: [XFRM]: Simplify xfrm_spi_hash It can use __xfrm{4,6}_addr_hash(). Signed-off-by: David S. Miller diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 7e5daaf..9820039 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -107,35 +107,20 @@ static inline unsigned xfrm_src_hash(xfrm_address_t *addr, unsigned short family return __xfrm_src_hash(addr, family, xfrm_state_hmask); } -static inline unsigned int __xfrm4_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto, - unsigned int hmask) -{ - unsigned int h; - h = ntohl(addr->a4^spi^proto); - h = (h ^ (h>>10) ^ (h>>20)) & hmask; - return h; -} - -static inline unsigned int __xfrm6_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto, - unsigned int hmask) -{ - unsigned int h; - h = ntohl(addr->a6[2]^addr->a6[3]^spi^proto); - h = (h ^ (h>>10) ^ (h>>20)) & hmask; - return h; -} - -static inline -unsigned __xfrm_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto, unsigned short family, - unsigned int hmask) +static inline unsigned int +__xfrm_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto, unsigned short family, + unsigned int hmask) { + unsigned int h = spi ^ proto; switch (family) { case AF_INET: - return __xfrm4_spi_hash(addr, spi, proto, hmask); + h ^= __xfrm4_addr_hash(addr); + break; case AF_INET6: - return __xfrm6_spi_hash(addr, spi, proto, hmask); + h ^= __xfrm6_addr_hash(addr); + break; } - return 0; /*XXX*/ + return (h ^ (h >> 10) ^ (h >> 20)) & hmask; } static inline unsigned int -- cgit v0.10.2 From c7f5ea3a4d1ae6b3b426e113358fdc57494bc754 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Thu, 24 Aug 2006 03:29:04 -0700 Subject: [XFRM]: Do not flush all bundles on SA insert. Instead, simply set all potentially aliasing existing xfrm_state objects to have the current generation counter value. This will make routes get relooked up the next time an existing route mentioning these aliased xfrm_state objects gets used, via xfrm_dst_check(). Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index fd4a300..a620a43 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -996,7 +996,6 @@ struct xfrm_state * xfrm_find_acq(u8 mode, u32 reqid, u8 proto, extern void xfrm_policy_flush(u8 type); extern int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol); extern int xfrm_flush_bundles(void); -extern void xfrm_flush_all_bundles(void); extern int xfrm_bundle_ok(struct xfrm_dst *xdst, struct flowi *fl, int family, int strict); extern void xfrm_init_pmtu(struct dst_entry *dst); diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 7fc6944..cfa5c69 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -1478,16 +1478,6 @@ int xfrm_flush_bundles(void) return 0; } -static int always_true(struct dst_entry *dst) -{ - return 1; -} - -void xfrm_flush_all_bundles(void) -{ - xfrm_prune_bundles(always_true); -} - void xfrm_init_pmtu(struct dst_entry *dst) { do { diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 9820039..77ef796 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -761,13 +761,30 @@ static void __xfrm_state_insert(struct xfrm_state *x) schedule_work(&xfrm_hash_work); } +/* xfrm_state_lock is held */ +static void __xfrm_state_bump_genids(struct xfrm_state *xnew) +{ + unsigned short family = xnew->props.family; + u32 reqid = xnew->props.reqid; + struct xfrm_state *x; + struct hlist_node *entry; + unsigned int h; + + h = xfrm_dst_hash(&xnew->id.daddr, reqid, family); + hlist_for_each_entry(x, entry, xfrm_state_bydst+h, bydst) { + if (x->props.family == family && + x->props.reqid == reqid && + !xfrm_addr_cmp(&x->id.daddr, &xnew->id.daddr, family)) + x->genid = xfrm_state_genid; + } +} + void xfrm_state_insert(struct xfrm_state *x) { spin_lock_bh(&xfrm_state_lock); + __xfrm_state_bump_genids(x); __xfrm_state_insert(x); spin_unlock_bh(&xfrm_state_lock); - - xfrm_flush_all_bundles(); } EXPORT_SYMBOL(xfrm_state_insert); @@ -889,15 +906,13 @@ int xfrm_state_add(struct xfrm_state *x) x->id.proto, &x->id.daddr, &x->props.saddr, 0); + __xfrm_state_bump_genids(x); __xfrm_state_insert(x); err = 0; out: spin_unlock_bh(&xfrm_state_lock); - if (!err) - xfrm_flush_all_bundles(); - if (x1) { xfrm_state_delete(x1); xfrm_state_put(x1); -- cgit v0.10.2 From 1c0953997567b22e32fdf85d3b4bc0f2461fd161 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Thu, 24 Aug 2006 03:30:28 -0700 Subject: [XFRM]: Purge dst references to deleted SAs passively. Just let GC and other normal mechanisms take care of getting rid of DST cache references to deleted xfrm_state objects instead of walking all the policy bundles. Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index a620a43..c7870b6 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -995,7 +995,6 @@ struct xfrm_state * xfrm_find_acq(u8 mode, u32 reqid, u8 proto, int create, unsigned short family); extern void xfrm_policy_flush(u8 type); extern int xfrm_sk_policy_insert(struct sock *sk, int dir, struct xfrm_policy *pol); -extern int xfrm_flush_bundles(void); extern int xfrm_bundle_ok(struct xfrm_dst *xdst, struct flowi *fl, int family, int strict); extern void xfrm_init_pmtu(struct dst_entry *dst); diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index cfa5c69..1bcaae4 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -1472,7 +1472,7 @@ static void __xfrm_garbage_collect(void) xfrm_prune_bundles(unused_bundle); } -int xfrm_flush_bundles(void) +static int xfrm_flush_bundles(void) { xfrm_prune_bundles(stale_bundle); return 0; diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 77ef796..9ff00b7 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -256,8 +256,6 @@ static struct work_struct xfrm_state_gc_work; static HLIST_HEAD(xfrm_state_gc_list); static DEFINE_SPINLOCK(xfrm_state_gc_lock); -static int xfrm_state_gc_flush_bundles; - int __xfrm_state_delete(struct xfrm_state *x); static struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned short family); @@ -293,11 +291,6 @@ static void xfrm_state_gc_task(void *data) struct hlist_node *entry, *tmp; struct hlist_head gc_list; - if (xfrm_state_gc_flush_bundles) { - xfrm_state_gc_flush_bundles = 0; - xfrm_flush_bundles(); - } - spin_lock_bh(&xfrm_state_gc_lock); gc_list.first = xfrm_state_gc_list.first; INIT_HLIST_HEAD(&xfrm_state_gc_list); @@ -454,16 +447,6 @@ int __xfrm_state_delete(struct xfrm_state *x) if (del_timer(&x->rtimer)) __xfrm_state_put(x); - /* The number two in this test is the reference - * mentioned in the comment below plus the reference - * our caller holds. A larger value means that - * there are DSTs attached to this xfrm_state. - */ - if (atomic_read(&x->refcnt) > 2) { - xfrm_state_gc_flush_bundles = 1; - schedule_work(&xfrm_state_gc_work); - } - /* All xfrm_state objects are created by xfrm_state_alloc. * The xfrm_state_alloc call gives a reference, and that * is what we are dropping here. -- cgit v0.10.2 From a47f0ce05ae12ce9acad62896ff703175764104e Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Thu, 24 Aug 2006 03:54:22 -0700 Subject: [XFRM]: Kill excessive refcounting of xfrm_state objects. The refcounting done for timers and hash table insertions are just wasted cycles. We can eliminate all of this refcounting because: 1) The implicit refcount when the xfrm_state object is active will always be held while the object is in the hash tables. We never kfree() the xfrm_state until long after we've made sure that it has been unhashed. 2) Timers are even easier. Once we mark that x->km.state as anything other than XFRM_STATE_VALID (__xfrm_state_delete sets it to XFRM_STATE_DEAD), any timer that fires will do nothing and return without rearming the timer. Therefore we can defer the del_timer calls until when the object is about to be freed up during GC. We have to use del_timer_sync() and defer it to GC because we can't do a del_timer_sync() while holding x->lock which all callers of __xfrm_state_delete hold. This makes SA changes even more light-weight. Signed-off-by: David S. Miller diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 9ff00b7..0bc6a4b 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -266,10 +266,8 @@ void km_state_expired(struct xfrm_state *x, int hard, u32 pid); static void xfrm_state_gc_destroy(struct xfrm_state *x) { - if (del_timer(&x->timer)) - BUG(); - if (del_timer(&x->rtimer)) - BUG(); + del_timer_sync(&x->timer); + del_timer_sync(&x->rtimer); kfree(x->aalg); kfree(x->ealg); kfree(x->calg); @@ -361,9 +359,9 @@ static void xfrm_timer_handler(unsigned long data) if (warn) km_state_expired(x, 0, 0); resched: - if (next != LONG_MAX && - !mod_timer(&x->timer, jiffies + make_jiffies(next))) - xfrm_state_hold(x); + if (next != LONG_MAX) + mod_timer(&x->timer, jiffies + make_jiffies(next)); + goto out; expired: @@ -378,7 +376,6 @@ expired: out: spin_unlock(&x->lock); - xfrm_state_put(x); } static void xfrm_replay_timer_handler(unsigned long data); @@ -433,19 +430,11 @@ int __xfrm_state_delete(struct xfrm_state *x) x->km.state = XFRM_STATE_DEAD; spin_lock(&xfrm_state_lock); hlist_del(&x->bydst); - __xfrm_state_put(x); hlist_del(&x->bysrc); - __xfrm_state_put(x); - if (x->id.spi) { + if (x->id.spi) hlist_del(&x->byspi); - __xfrm_state_put(x); - } xfrm_state_num--; spin_unlock(&xfrm_state_lock); - if (del_timer(&x->timer)) - __xfrm_state_put(x); - if (del_timer(&x->rtimer)) - __xfrm_state_put(x); /* All xfrm_state objects are created by xfrm_state_alloc. * The xfrm_state_alloc call gives a reference, and that @@ -676,17 +665,13 @@ xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t *saddr, if (km_query(x, tmpl, pol) == 0) { x->km.state = XFRM_STATE_ACQ; hlist_add_head(&x->bydst, xfrm_state_bydst+h); - xfrm_state_hold(x); h = xfrm_src_hash(saddr, family); hlist_add_head(&x->bysrc, xfrm_state_bysrc+h); - xfrm_state_hold(x); if (x->id.spi) { h = xfrm_spi_hash(&x->id.daddr, x->id.spi, x->id.proto, family); hlist_add_head(&x->byspi, xfrm_state_byspi+h); - xfrm_state_hold(x); } x->lft.hard_add_expires_seconds = XFRM_ACQ_EXPIRES; - xfrm_state_hold(x); x->timer.expires = jiffies + XFRM_ACQ_EXPIRES*HZ; add_timer(&x->timer); } else { @@ -713,26 +698,20 @@ static void __xfrm_state_insert(struct xfrm_state *x) h = xfrm_dst_hash(&x->id.daddr, x->props.reqid, x->props.family); hlist_add_head(&x->bydst, xfrm_state_bydst+h); - xfrm_state_hold(x); h = xfrm_src_hash(&x->props.saddr, x->props.family); hlist_add_head(&x->bysrc, xfrm_state_bysrc+h); - xfrm_state_hold(x); if (xfrm_id_proto_match(x->id.proto, IPSEC_PROTO_ANY)) { h = xfrm_spi_hash(&x->id.daddr, x->id.spi, x->id.proto, x->props.family); hlist_add_head(&x->byspi, xfrm_state_byspi+h); - xfrm_state_hold(x); } - if (!mod_timer(&x->timer, jiffies + HZ)) - xfrm_state_hold(x); - - if (x->replay_maxage && - !mod_timer(&x->rtimer, jiffies + x->replay_maxage)) - xfrm_state_hold(x); + mod_timer(&x->timer, jiffies + HZ); + if (x->replay_maxage) + mod_timer(&x->rtimer, jiffies + x->replay_maxage); wake_up(&km_waitq); @@ -844,10 +823,8 @@ static struct xfrm_state *__find_acq_core(unsigned short family, u8 mode, u32 re xfrm_state_hold(x); x->timer.expires = jiffies + XFRM_ACQ_EXPIRES*HZ; add_timer(&x->timer); - xfrm_state_hold(x); hlist_add_head(&x->bydst, xfrm_state_bydst+h); h = xfrm_src_hash(saddr, family); - xfrm_state_hold(x); hlist_add_head(&x->bysrc, xfrm_state_bysrc+h); wake_up(&km_waitq); } @@ -955,8 +932,7 @@ out: memcpy(&x1->lft, &x->lft, sizeof(x1->lft)); x1->km.dying = 0; - if (!mod_timer(&x1->timer, jiffies + HZ)) - xfrm_state_hold(x1); + mod_timer(&x1->timer, jiffies + HZ); if (x1->curlft.use_time) xfrm_state_check_expire(x1); @@ -981,8 +957,7 @@ int xfrm_state_check_expire(struct xfrm_state *x) if (x->curlft.bytes >= x->lft.hard_byte_limit || x->curlft.packets >= x->lft.hard_packet_limit) { x->km.state = XFRM_STATE_EXPIRED; - if (!mod_timer(&x->timer, jiffies)) - xfrm_state_hold(x); + mod_timer(&x->timer, jiffies); return -EINVAL; } @@ -1177,7 +1152,6 @@ xfrm_alloc_spi(struct xfrm_state *x, u32 minspi, u32 maxspi) spin_lock_bh(&xfrm_state_lock); h = xfrm_spi_hash(&x->id.daddr, x->id.spi, x->id.proto, x->props.family); hlist_add_head(&x->byspi, xfrm_state_byspi+h); - xfrm_state_hold(x); spin_unlock_bh(&xfrm_state_lock); wake_up(&km_waitq); } @@ -1264,10 +1238,8 @@ void xfrm_replay_notify(struct xfrm_state *x, int event) km_state_notify(x, &c); if (x->replay_maxage && - !mod_timer(&x->rtimer, jiffies + x->replay_maxage)) { - xfrm_state_hold(x); + !mod_timer(&x->rtimer, jiffies + x->replay_maxage)) x->xflags &= ~XFRM_TIME_DEFER; - } } EXPORT_SYMBOL(xfrm_replay_notify); @@ -1285,7 +1257,6 @@ static void xfrm_replay_timer_handler(unsigned long data) } spin_unlock(&x->lock); - xfrm_state_put(x); } int xfrm_replay_check(struct xfrm_state *x, u32 seq) -- cgit v0.10.2 From c1969f294e624d5b642fc8e6ab9468b7c7791fa8 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Thu, 24 Aug 2006 04:00:03 -0700 Subject: [XFRM]: Hash xfrm_state objects by source address too. The source address is always non-prefixed so we should use it to help give entropy to the bydst hash. Signed-off-by: David S. Miller diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 0bc6a4b..37213f9 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -65,26 +65,40 @@ static inline unsigned int __xfrm6_addr_hash(xfrm_address_t *addr) return ntohl(addr->a6[2]^addr->a6[3]); } -static inline unsigned int __xfrm_dst_hash(xfrm_address_t *addr, +static inline unsigned int __xfrm4_daddr_saddr_hash(xfrm_address_t *daddr, xfrm_address_t *saddr) +{ + return ntohl(daddr->a4 ^ saddr->a4); +} + +static inline unsigned int __xfrm6_daddr_saddr_hash(xfrm_address_t *daddr, xfrm_address_t *saddr) +{ + return ntohl(daddr->a6[2] ^ daddr->a6[3] ^ + saddr->a6[2] ^ saddr->a6[3]); +} + +static inline unsigned int __xfrm_dst_hash(xfrm_address_t *daddr, + xfrm_address_t *saddr, u32 reqid, unsigned short family, unsigned int hmask) { unsigned int h = family ^ reqid; switch (family) { case AF_INET: - h ^= __xfrm4_addr_hash(addr); + h ^= __xfrm4_daddr_saddr_hash(daddr, saddr); break; case AF_INET6: - h ^= __xfrm6_addr_hash(addr); + h ^= __xfrm6_daddr_saddr_hash(daddr, saddr); break; }; return (h ^ (h >> 16)) & hmask; } -static inline unsigned int xfrm_dst_hash(xfrm_address_t *addr, u32 reqid, +static inline unsigned int xfrm_dst_hash(xfrm_address_t *daddr, + xfrm_address_t *saddr, + u32 reqid, unsigned short family) { - return __xfrm_dst_hash(addr, reqid, family, xfrm_state_hmask); + return __xfrm_dst_hash(daddr, saddr, reqid, family, xfrm_state_hmask); } static inline unsigned __xfrm_src_hash(xfrm_address_t *addr, unsigned short family, @@ -108,25 +122,25 @@ static inline unsigned xfrm_src_hash(xfrm_address_t *addr, unsigned short family } static inline unsigned int -__xfrm_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto, unsigned short family, - unsigned int hmask) +__xfrm_spi_hash(xfrm_address_t *daddr, u32 spi, u8 proto, + unsigned short family, unsigned int hmask) { unsigned int h = spi ^ proto; switch (family) { case AF_INET: - h ^= __xfrm4_addr_hash(addr); + h ^= __xfrm4_addr_hash(daddr); break; case AF_INET6: - h ^= __xfrm6_addr_hash(addr); + h ^= __xfrm6_addr_hash(daddr); break; } return (h ^ (h >> 10) ^ (h >> 20)) & hmask; } static inline unsigned int -xfrm_spi_hash(xfrm_address_t *addr, u32 spi, u8 proto, unsigned short family) +xfrm_spi_hash(xfrm_address_t *daddr, u32 spi, u8 proto, unsigned short family) { - return __xfrm_spi_hash(addr, spi, proto, family, xfrm_state_hmask); + return __xfrm_spi_hash(daddr, spi, proto, family, xfrm_state_hmask); } static struct hlist_head *xfrm_state_hash_alloc(unsigned int sz) @@ -169,8 +183,9 @@ static void xfrm_hash_transfer(struct hlist_head *list, hlist_for_each_entry_safe(x, entry, tmp, list, bydst) { unsigned int h; - h = __xfrm_dst_hash(&x->id.daddr, x->props.reqid, - x->props.family, nhashmask); + h = __xfrm_dst_hash(&x->id.daddr, &x->props.saddr, + x->props.reqid, x->props.family, + nhashmask); hlist_add_head(&x->bydst, ndsttable+h); h = __xfrm_src_hash(&x->props.saddr, x->props.family, @@ -587,7 +602,7 @@ xfrm_state_find(xfrm_address_t *daddr, xfrm_address_t *saddr, struct xfrm_policy *pol, int *err, unsigned short family) { - unsigned int h = xfrm_dst_hash(daddr, tmpl->reqid, family); + unsigned int h = xfrm_dst_hash(daddr, saddr, tmpl->reqid, family); struct hlist_node *entry; struct xfrm_state *x, *x0; int acquire_in_progress = 0; @@ -696,7 +711,8 @@ static void __xfrm_state_insert(struct xfrm_state *x) x->genid = ++xfrm_state_genid; - h = xfrm_dst_hash(&x->id.daddr, x->props.reqid, x->props.family); + h = xfrm_dst_hash(&x->id.daddr, &x->props.saddr, + x->props.reqid, x->props.family); hlist_add_head(&x->bydst, xfrm_state_bydst+h); h = xfrm_src_hash(&x->props.saddr, x->props.family); @@ -732,11 +748,12 @@ static void __xfrm_state_bump_genids(struct xfrm_state *xnew) struct hlist_node *entry; unsigned int h; - h = xfrm_dst_hash(&xnew->id.daddr, reqid, family); + h = xfrm_dst_hash(&xnew->id.daddr, &xnew->props.saddr, reqid, family); hlist_for_each_entry(x, entry, xfrm_state_bydst+h, bydst) { if (x->props.family == family && x->props.reqid == reqid && - !xfrm_addr_cmp(&x->id.daddr, &xnew->id.daddr, family)) + !xfrm_addr_cmp(&x->id.daddr, &xnew->id.daddr, family) && + !xfrm_addr_cmp(&x->props.saddr, &xnew->props.saddr, family)) x->genid = xfrm_state_genid; } } @@ -753,7 +770,7 @@ EXPORT_SYMBOL(xfrm_state_insert); /* xfrm_state_lock is held */ static struct xfrm_state *__find_acq_core(unsigned short family, u8 mode, u32 reqid, u8 proto, xfrm_address_t *daddr, xfrm_address_t *saddr, int create) { - unsigned int h = xfrm_dst_hash(daddr, reqid, family); + unsigned int h = xfrm_dst_hash(daddr, saddr, reqid, family); struct hlist_node *entry; struct xfrm_state *x; -- cgit v0.10.2 From 2518c7c2b3d7f0a6b302b4efe17c911f8dd4049f Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Thu, 24 Aug 2006 04:45:07 -0700 Subject: [XFRM]: Hash policies when non-prefixed. This idea is from Alexey Kuznetsov. It is common for policies to be non-prefixed. And for that case we can optimize lookups, insert, etc. quite a bit. For each direction, we have a dynamically sized policy hash table for non-prefixed policies. We also have a hash table on policy->index. For prefixed policies, we have a list per-direction which we will consult on lookups when a non-prefix hashtable lookup fails. This still isn't as efficient as I would like it. There are four immediate problems: 1) Lots of excessive refcounting, which can be fixed just like xfrm_state was 2) We do 2 hash probes on insert, one to look for dups and one to allocate a unique policy->index. Althought I wonder how much this matters since xfrm_state inserts do up to 3 hash probes and that seems to perform fine. 3) xfrm_policy_insert() is very complex because of the priority ordering and entry replacement logic. 4) Lots of counter bumping, in addition to policy refcounts, in the form of xfrm_policy_count[]. This is merely used to let code path(s) know that some IPSEC rules exist. So this count is indexed per-direction, maybe that is overkill. Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index c7870b6..0acabf2 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -331,7 +331,8 @@ struct xfrm_tmpl struct xfrm_policy { struct xfrm_policy *next; - struct list_head list; + struct hlist_node bydst; + struct hlist_node byidx; /* This lock only affects elements except for entry. */ rwlock_t lock; @@ -385,21 +386,7 @@ struct xfrm_mgr extern int xfrm_register_km(struct xfrm_mgr *km); extern int xfrm_unregister_km(struct xfrm_mgr *km); - -extern struct xfrm_policy *xfrm_policy_list[XFRM_POLICY_MAX*2]; -#ifdef CONFIG_XFRM_SUB_POLICY -extern struct xfrm_policy *xfrm_policy_list_sub[XFRM_POLICY_MAX*2]; - -static inline int xfrm_policy_lists_empty(int dir) -{ - return (!xfrm_policy_list[dir] && !xfrm_policy_list_sub[dir]); -} -#else -static inline int xfrm_policy_lists_empty(int dir) -{ - return (!xfrm_policy_list[dir]); -} -#endif +extern unsigned int xfrm_policy_count[XFRM_POLICY_MAX*2]; static inline void xfrm_pol_hold(struct xfrm_policy *policy) { @@ -678,7 +665,7 @@ static inline int xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *sk if (sk && sk->sk_policy[XFRM_POLICY_IN]) return __xfrm_policy_check(sk, dir, skb, family); - return (xfrm_policy_lists_empty(dir) && !skb->sp) || + return (!xfrm_policy_count[dir] && !skb->sp) || (skb->dst->flags & DST_NOPOLICY) || __xfrm_policy_check(sk, dir, skb, family); } @@ -698,7 +685,7 @@ extern int __xfrm_route_forward(struct sk_buff *skb, unsigned short family); static inline int xfrm_route_forward(struct sk_buff *skb, unsigned short family) { - return xfrm_policy_lists_empty(XFRM_POLICY_OUT) || + return !xfrm_policy_count[XFRM_POLICY_OUT] || (skb->dst->flags & DST_NOXFRM) || __xfrm_route_forward(skb, family); } diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 1bcaae4..087a544 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -22,6 +22,9 @@ #include #include #include +#include +#include +#include #include #include @@ -30,26 +33,8 @@ EXPORT_SYMBOL(xfrm_cfg_mutex); static DEFINE_RWLOCK(xfrm_policy_lock); -struct xfrm_policy *xfrm_policy_list[XFRM_POLICY_MAX*2]; -EXPORT_SYMBOL(xfrm_policy_list); -#ifdef CONFIG_XFRM_SUB_POLICY -struct xfrm_policy *xfrm_policy_list_sub[XFRM_POLICY_MAX*2]; -EXPORT_SYMBOL(xfrm_policy_list_sub); - -#define XFRM_POLICY_LISTS(type) \ - ((type == XFRM_POLICY_TYPE_SUB) ? xfrm_policy_list_sub : \ - xfrm_policy_list) -#define XFRM_POLICY_LISTHEAD(type, dir) \ - ((type == XFRM_POLICY_TYPE_SUB) ? xfrm_policy_list_sub[dir] : \ - xfrm_policy_list[dir]) -#define XFRM_POLICY_LISTHEADP(type, dir) \ - ((type == XFRM_POLICY_TYPE_SUB) ? &xfrm_policy_list_sub[dir] : \ - &xfrm_policy_list[dir]) -#else -#define XFRM_POLICY_LISTS(type) xfrm_policy_list -#define XFRM_POLICY_LISTHEAD(type, dif) xfrm_policy_list[dir] -#define XFRM_POLICY_LISTHEADP(type, dif) &xfrm_policy_list[dir] -#endif +unsigned int xfrm_policy_count[XFRM_POLICY_MAX*2]; +EXPORT_SYMBOL(xfrm_policy_count); static DEFINE_RWLOCK(xfrm_policy_afinfo_lock); static struct xfrm_policy_afinfo *xfrm_policy_afinfo[NPROTO]; @@ -57,8 +42,7 @@ static struct xfrm_policy_afinfo *xfrm_policy_afinfo[NPROTO]; static kmem_cache_t *xfrm_dst_cache __read_mostly; static struct work_struct xfrm_policy_gc_work; -static struct list_head xfrm_policy_gc_list = - LIST_HEAD_INIT(xfrm_policy_gc_list); +static HLIST_HEAD(xfrm_policy_gc_list); static DEFINE_SPINLOCK(xfrm_policy_gc_lock); static struct xfrm_policy_afinfo *xfrm_policy_get_afinfo(unsigned short family); @@ -328,8 +312,10 @@ struct xfrm_policy *xfrm_policy_alloc(gfp_t gfp) policy = kzalloc(sizeof(struct xfrm_policy), gfp); if (policy) { - atomic_set(&policy->refcnt, 1); + INIT_HLIST_NODE(&policy->bydst); + INIT_HLIST_NODE(&policy->byidx); rwlock_init(&policy->lock); + atomic_set(&policy->refcnt, 1); init_timer(&policy->timer); policy->timer.data = (unsigned long)policy; policy->timer.function = xfrm_policy_timer; @@ -375,17 +361,16 @@ static void xfrm_policy_gc_kill(struct xfrm_policy *policy) static void xfrm_policy_gc_task(void *data) { struct xfrm_policy *policy; - struct list_head *entry, *tmp; - struct list_head gc_list = LIST_HEAD_INIT(gc_list); + struct hlist_node *entry, *tmp; + struct hlist_head gc_list; spin_lock_bh(&xfrm_policy_gc_lock); - list_splice_init(&xfrm_policy_gc_list, &gc_list); + gc_list.first = xfrm_policy_gc_list.first; + INIT_HLIST_HEAD(&xfrm_policy_gc_list); spin_unlock_bh(&xfrm_policy_gc_lock); - list_for_each_safe(entry, tmp, &gc_list) { - policy = list_entry(entry, struct xfrm_policy, list); + hlist_for_each_entry_safe(policy, entry, tmp, &gc_list, bydst) xfrm_policy_gc_kill(policy); - } } /* Rule must be locked. Release descentant resources, announce @@ -407,70 +392,354 @@ static void xfrm_policy_kill(struct xfrm_policy *policy) } spin_lock(&xfrm_policy_gc_lock); - list_add(&policy->list, &xfrm_policy_gc_list); + hlist_add_head(&policy->bydst, &xfrm_policy_gc_list); spin_unlock(&xfrm_policy_gc_lock); schedule_work(&xfrm_policy_gc_work); } +struct xfrm_policy_hash { + struct hlist_head *table; + unsigned int hmask; +}; + +static struct hlist_head xfrm_policy_inexact[XFRM_POLICY_MAX*2]; +static struct xfrm_policy_hash xfrm_policy_bydst[XFRM_POLICY_MAX*2] __read_mostly; +static struct hlist_head *xfrm_policy_byidx __read_mostly; +static unsigned int xfrm_idx_hmask __read_mostly; +static unsigned int xfrm_policy_hashmax __read_mostly = 1 * 1024 * 1024; + +static inline unsigned int __idx_hash(u32 index, unsigned int hmask) +{ + return (index ^ (index >> 8)) & hmask; +} + +static inline unsigned int idx_hash(u32 index) +{ + return __idx_hash(index, xfrm_idx_hmask); +} + +static inline unsigned int __sel_hash(struct xfrm_selector *sel, unsigned short family, unsigned int hmask) +{ + xfrm_address_t *daddr = &sel->daddr; + xfrm_address_t *saddr = &sel->saddr; + unsigned int h = 0; + + switch (family) { + case AF_INET: + if (sel->prefixlen_d != 32 || + sel->prefixlen_s != 32) + return hmask + 1; + + h = ntohl(daddr->a4 ^ saddr->a4); + break; + + case AF_INET6: + if (sel->prefixlen_d != 128 || + sel->prefixlen_s != 128) + return hmask + 1; + + h = ntohl(daddr->a6[2] ^ daddr->a6[3] ^ + saddr->a6[2] ^ saddr->a6[3]); + break; + }; + h ^= (h >> 16); + return h & hmask; +} + +static inline unsigned int __addr_hash(xfrm_address_t *daddr, xfrm_address_t *saddr, unsigned short family, unsigned int hmask) +{ + unsigned int h = 0; + + switch (family) { + case AF_INET: + h = ntohl(daddr->a4 ^ saddr->a4); + break; + + case AF_INET6: + h = ntohl(daddr->a6[2] ^ daddr->a6[3] ^ + saddr->a6[2] ^ saddr->a6[3]); + break; + }; + h ^= (h >> 16); + return h & hmask; +} + +static struct hlist_head *policy_hash_bysel(struct xfrm_selector *sel, unsigned short family, int dir) +{ + unsigned int hmask = xfrm_policy_bydst[dir].hmask; + unsigned int hash = __sel_hash(sel, family, hmask); + + return (hash == hmask + 1 ? + &xfrm_policy_inexact[dir] : + xfrm_policy_bydst[dir].table + hash); +} + +static struct hlist_head *policy_hash_direct(xfrm_address_t *daddr, xfrm_address_t *saddr, unsigned short family, int dir) +{ + unsigned int hmask = xfrm_policy_bydst[dir].hmask; + unsigned int hash = __addr_hash(daddr, saddr, family, hmask); + + return xfrm_policy_bydst[dir].table + hash; +} + +static struct hlist_head *xfrm_policy_hash_alloc(unsigned int sz) +{ + struct hlist_head *n; + + if (sz <= PAGE_SIZE) + n = kmalloc(sz, GFP_KERNEL); + else if (hashdist) + n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL); + else + n = (struct hlist_head *) + __get_free_pages(GFP_KERNEL, get_order(sz)); + + if (n) + memset(n, 0, sz); + + return n; +} + +static void xfrm_policy_hash_free(struct hlist_head *n, unsigned int sz) +{ + if (sz <= PAGE_SIZE) + kfree(n); + else if (hashdist) + vfree(n); + else + free_pages((unsigned long)n, get_order(sz)); +} + +static void xfrm_dst_hash_transfer(struct hlist_head *list, + struct hlist_head *ndsttable, + unsigned int nhashmask) +{ + struct hlist_node *entry, *tmp; + struct xfrm_policy *pol; + + hlist_for_each_entry_safe(pol, entry, tmp, list, bydst) { + unsigned int h; + + h = __addr_hash(&pol->selector.daddr, &pol->selector.saddr, + pol->family, nhashmask); + hlist_add_head(&pol->bydst, ndsttable+h); + } +} + +static void xfrm_idx_hash_transfer(struct hlist_head *list, + struct hlist_head *nidxtable, + unsigned int nhashmask) +{ + struct hlist_node *entry, *tmp; + struct xfrm_policy *pol; + + hlist_for_each_entry_safe(pol, entry, tmp, list, byidx) { + unsigned int h; + + h = __idx_hash(pol->index, nhashmask); + hlist_add_head(&pol->byidx, nidxtable+h); + } +} + +static unsigned long xfrm_new_hash_mask(unsigned int old_hmask) +{ + return ((old_hmask + 1) << 1) - 1; +} + +static void xfrm_bydst_resize(int dir) +{ + unsigned int hmask = xfrm_policy_bydst[dir].hmask; + unsigned int nhashmask = xfrm_new_hash_mask(hmask); + unsigned int nsize = (nhashmask + 1) * sizeof(struct hlist_head); + struct hlist_head *odst = xfrm_policy_bydst[dir].table; + struct hlist_head *ndst = xfrm_policy_hash_alloc(nsize); + int i; + + if (!ndst) + return; + + write_lock_bh(&xfrm_policy_lock); + + for (i = hmask; i >= 0; i--) + xfrm_dst_hash_transfer(odst + i, ndst, nhashmask); + + xfrm_policy_bydst[dir].table = ndst; + xfrm_policy_bydst[dir].hmask = nhashmask; + + write_unlock_bh(&xfrm_policy_lock); + + xfrm_policy_hash_free(odst, (hmask + 1) * sizeof(struct hlist_head)); +} + +static void xfrm_byidx_resize(int total) +{ + unsigned int hmask = xfrm_idx_hmask; + unsigned int nhashmask = xfrm_new_hash_mask(hmask); + unsigned int nsize = (nhashmask + 1) * sizeof(struct hlist_head); + struct hlist_head *oidx = xfrm_policy_byidx; + struct hlist_head *nidx = xfrm_policy_hash_alloc(nsize); + int i; + + if (!nidx) + return; + + write_lock_bh(&xfrm_policy_lock); + + for (i = hmask; i >= 0; i--) + xfrm_idx_hash_transfer(oidx + i, nidx, nhashmask); + + xfrm_policy_byidx = nidx; + xfrm_idx_hmask = nhashmask; + + write_unlock_bh(&xfrm_policy_lock); + + xfrm_policy_hash_free(oidx, (hmask + 1) * sizeof(struct hlist_head)); +} + +static inline int xfrm_bydst_should_resize(int dir, int *total) +{ + unsigned int cnt = xfrm_policy_count[dir]; + unsigned int hmask = xfrm_policy_bydst[dir].hmask; + + if (total) + *total += cnt; + + if ((hmask + 1) < xfrm_policy_hashmax && + cnt > hmask) + return 1; + + return 0; +} + +static inline int xfrm_byidx_should_resize(int total) +{ + unsigned int hmask = xfrm_idx_hmask; + + if ((hmask + 1) < xfrm_policy_hashmax && + total > hmask) + return 1; + + return 0; +} + +static DEFINE_MUTEX(hash_resize_mutex); + +static void xfrm_hash_resize(void *__unused) +{ + int dir, total; + + mutex_lock(&hash_resize_mutex); + + total = 0; + for (dir = 0; dir < XFRM_POLICY_MAX * 2; dir++) { + if (xfrm_bydst_should_resize(dir, &total)) + xfrm_bydst_resize(dir); + } + if (xfrm_byidx_should_resize(total)) + xfrm_byidx_resize(total); + + mutex_unlock(&hash_resize_mutex); +} + +static DECLARE_WORK(xfrm_hash_work, xfrm_hash_resize, NULL); + /* Generate new index... KAME seems to generate them ordered by cost * of an absolute inpredictability of ordering of rules. This will not pass. */ static u32 xfrm_gen_index(u8 type, int dir) { - u32 idx; - struct xfrm_policy *p; static u32 idx_generator; for (;;) { + struct hlist_node *entry; + struct hlist_head *list; + struct xfrm_policy *p; + u32 idx; + int found; + idx = (idx_generator | dir); idx_generator += 8; if (idx == 0) idx = 8; - for (p = XFRM_POLICY_LISTHEAD(type, dir); p; p = p->next) { - if (p->index == idx) + list = xfrm_policy_byidx + idx_hash(idx); + found = 0; + hlist_for_each_entry(p, entry, list, byidx) { + if (p->index == idx) { + found = 1; break; + } } - if (!p) + if (!found) return idx; } } +static inline int selector_cmp(struct xfrm_selector *s1, struct xfrm_selector *s2) +{ + u32 *p1 = (u32 *) s1; + u32 *p2 = (u32 *) s2; + int len = sizeof(struct xfrm_selector) / sizeof(u32); + int i; + + for (i = 0; i < len; i++) { + if (p1[i] != p2[i]) + return 1; + } + + return 0; +} + int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl) { - struct xfrm_policy *pol, **p; - struct xfrm_policy *delpol = NULL; - struct xfrm_policy **newpos = NULL; + struct xfrm_policy *pol; + struct xfrm_policy *delpol; + struct hlist_head *chain; + struct hlist_node *entry, *newpos, *last; struct dst_entry *gc_list; write_lock_bh(&xfrm_policy_lock); - for (p = XFRM_POLICY_LISTHEADP(policy->type, dir); (pol=*p)!=NULL;) { - if (!delpol && memcmp(&policy->selector, &pol->selector, sizeof(pol->selector)) == 0 && + chain = policy_hash_bysel(&policy->selector, policy->family, dir); + delpol = NULL; + newpos = NULL; + last = NULL; + hlist_for_each_entry(pol, entry, chain, bydst) { + if (!delpol && + pol->type == policy->type && + !selector_cmp(&pol->selector, &policy->selector) && xfrm_sec_ctx_match(pol->security, policy->security)) { if (excl) { write_unlock_bh(&xfrm_policy_lock); return -EEXIST; } - *p = pol->next; delpol = pol; if (policy->priority > pol->priority) continue; } else if (policy->priority >= pol->priority) { - p = &pol->next; + last = &pol->bydst; continue; } if (!newpos) - newpos = p; + newpos = &pol->bydst; if (delpol) break; - p = &pol->next; + last = &pol->bydst; } + if (!newpos) + newpos = last; if (newpos) - p = newpos; + hlist_add_after(newpos, &policy->bydst); + else + hlist_add_head(&policy->bydst, chain); xfrm_pol_hold(policy); - policy->next = *p; - *p = policy; + xfrm_policy_count[dir]++; atomic_inc(&flow_cache_genid); + if (delpol) { + hlist_del(&delpol->bydst); + hlist_del(&delpol->byidx); + xfrm_policy_count[dir]--; + } policy->index = delpol ? delpol->index : xfrm_gen_index(policy->type, dir); + hlist_add_head(&policy->byidx, xfrm_policy_byidx+idx_hash(policy->index)); policy->curlft.add_time = (unsigned long)xtime.tv_sec; policy->curlft.use_time = 0; if (!mod_timer(&policy->timer, jiffies + HZ)) @@ -479,10 +748,13 @@ int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl) if (delpol) xfrm_policy_kill(delpol); + else if (xfrm_bydst_should_resize(dir, NULL)) + schedule_work(&xfrm_hash_work); read_lock_bh(&xfrm_policy_lock); gc_list = NULL; - for (policy = policy->next; policy; policy = policy->next) { + entry = &policy->bydst; + hlist_for_each_entry_continue(policy, entry, bydst) { struct dst_entry *dst; write_lock(&policy->lock); @@ -515,67 +787,112 @@ struct xfrm_policy *xfrm_policy_bysel_ctx(u8 type, int dir, struct xfrm_selector *sel, struct xfrm_sec_ctx *ctx, int delete) { - struct xfrm_policy *pol, **p; + struct xfrm_policy *pol, *ret; + struct hlist_head *chain; + struct hlist_node *entry; write_lock_bh(&xfrm_policy_lock); - for (p = XFRM_POLICY_LISTHEADP(type, dir); (pol=*p)!=NULL; p = &pol->next) { - if ((memcmp(sel, &pol->selector, sizeof(*sel)) == 0) && - (xfrm_sec_ctx_match(ctx, pol->security))) { + chain = policy_hash_bysel(sel, sel->family, dir); + ret = NULL; + hlist_for_each_entry(pol, entry, chain, bydst) { + if (pol->type == type && + !selector_cmp(sel, &pol->selector) && + xfrm_sec_ctx_match(ctx, pol->security)) { xfrm_pol_hold(pol); - if (delete) - *p = pol->next; + if (delete) { + hlist_del(&pol->bydst); + hlist_del(&pol->byidx); + xfrm_policy_count[dir]--; + } + ret = pol; break; } } write_unlock_bh(&xfrm_policy_lock); - if (pol && delete) { + if (ret && delete) { atomic_inc(&flow_cache_genid); - xfrm_policy_kill(pol); + xfrm_policy_kill(ret); } - return pol; + return ret; } EXPORT_SYMBOL(xfrm_policy_bysel_ctx); struct xfrm_policy *xfrm_policy_byid(u8 type, int dir, u32 id, int delete) { - struct xfrm_policy *pol, **p; + struct xfrm_policy *pol, *ret; + struct hlist_head *chain; + struct hlist_node *entry; write_lock_bh(&xfrm_policy_lock); - for (p = XFRM_POLICY_LISTHEADP(type, dir); (pol=*p)!=NULL; p = &pol->next) { - if (pol->index == id) { + chain = xfrm_policy_byidx + idx_hash(id); + ret = NULL; + hlist_for_each_entry(pol, entry, chain, byidx) { + if (pol->type == type && pol->index == id) { xfrm_pol_hold(pol); - if (delete) - *p = pol->next; + if (delete) { + hlist_del(&pol->bydst); + hlist_del(&pol->byidx); + xfrm_policy_count[dir]--; + } + ret = pol; break; } } write_unlock_bh(&xfrm_policy_lock); - if (pol && delete) { + if (ret && delete) { atomic_inc(&flow_cache_genid); - xfrm_policy_kill(pol); + xfrm_policy_kill(ret); } - return pol; + return ret; } EXPORT_SYMBOL(xfrm_policy_byid); void xfrm_policy_flush(u8 type) { - struct xfrm_policy *xp; - struct xfrm_policy **p_list = XFRM_POLICY_LISTS(type); int dir; write_lock_bh(&xfrm_policy_lock); for (dir = 0; dir < XFRM_POLICY_MAX; dir++) { - while ((xp = p_list[dir]) != NULL) { - p_list[dir] = xp->next; + struct xfrm_policy *pol; + struct hlist_node *entry; + int i; + + again1: + hlist_for_each_entry(pol, entry, + &xfrm_policy_inexact[dir], bydst) { + if (pol->type != type) + continue; + hlist_del(&pol->bydst); + hlist_del(&pol->byidx); write_unlock_bh(&xfrm_policy_lock); - xfrm_policy_kill(xp); + xfrm_policy_kill(pol); write_lock_bh(&xfrm_policy_lock); + goto again1; + } + + for (i = xfrm_policy_bydst[dir].hmask; i >= 0; i--) { + again2: + hlist_for_each_entry(pol, entry, + xfrm_policy_bydst[dir].table + i, + bydst) { + if (pol->type != type) + continue; + hlist_del(&pol->bydst); + hlist_del(&pol->byidx); + write_unlock_bh(&xfrm_policy_lock); + + xfrm_policy_kill(pol); + + write_lock_bh(&xfrm_policy_lock); + goto again2; + } } + + xfrm_policy_count[dir] = 0; } atomic_inc(&flow_cache_genid); write_unlock_bh(&xfrm_policy_lock); @@ -585,15 +902,27 @@ EXPORT_SYMBOL(xfrm_policy_flush); int xfrm_policy_walk(u8 type, int (*func)(struct xfrm_policy *, int, int, void*), void *data) { - struct xfrm_policy *xp; - int dir; - int count = 0; - int error = 0; + struct xfrm_policy *pol; + struct hlist_node *entry; + int dir, count, error; read_lock_bh(&xfrm_policy_lock); + count = 0; for (dir = 0; dir < 2*XFRM_POLICY_MAX; dir++) { - for (xp = XFRM_POLICY_LISTHEAD(type, dir); xp; xp = xp->next) - count++; + struct hlist_head *table = xfrm_policy_bydst[dir].table; + int i; + + hlist_for_each_entry(pol, entry, + &xfrm_policy_inexact[dir], bydst) { + if (pol->type == type) + count++; + } + for (i = xfrm_policy_bydst[dir].hmask; i >= 0; i--) { + hlist_for_each_entry(pol, entry, table + i, bydst) { + if (pol->type == type) + count++; + } + } } if (count == 0) { @@ -602,13 +931,28 @@ int xfrm_policy_walk(u8 type, int (*func)(struct xfrm_policy *, int, int, void*) } for (dir = 0; dir < 2*XFRM_POLICY_MAX; dir++) { - for (xp = XFRM_POLICY_LISTHEAD(type, dir); xp; xp = xp->next) { - error = func(xp, dir%XFRM_POLICY_MAX, --count, data); + struct hlist_head *table = xfrm_policy_bydst[dir].table; + int i; + + hlist_for_each_entry(pol, entry, + &xfrm_policy_inexact[dir], bydst) { + if (pol->type != type) + continue; + error = func(pol, dir % XFRM_POLICY_MAX, --count, data); if (error) goto out; } + for (i = xfrm_policy_bydst[dir].hmask; i >= 0; i--) { + hlist_for_each_entry(pol, entry, table + i, bydst) { + if (pol->type != type) + continue; + error = func(pol, dir % XFRM_POLICY_MAX, --count, data); + if (error) + goto out; + } + } } - + error = 0; out: read_unlock_bh(&xfrm_policy_lock); return error; @@ -617,31 +961,61 @@ EXPORT_SYMBOL(xfrm_policy_walk); /* Find policy to apply to this flow. */ -static struct xfrm_policy *xfrm_policy_lookup_bytype(u8 type, struct flowi *fl, - u16 family, u8 dir) +static int xfrm_policy_match(struct xfrm_policy *pol, struct flowi *fl, + u8 type, u16 family, int dir) { - struct xfrm_policy *pol; + struct xfrm_selector *sel = &pol->selector; + int match; - read_lock_bh(&xfrm_policy_lock); - for (pol = XFRM_POLICY_LISTHEAD(type, dir); pol; pol = pol->next) { - struct xfrm_selector *sel = &pol->selector; - int match; + if (pol->family != family || + pol->type != type) + return 0; - if (pol->family != family) - continue; + match = xfrm_selector_match(sel, fl, family); + if (match) { + if (!security_xfrm_policy_lookup(pol, fl->secid, dir)) + return 1; + } + + return 0; +} - match = xfrm_selector_match(sel, fl, family); +static struct xfrm_policy *xfrm_policy_lookup_bytype(u8 type, struct flowi *fl, + u16 family, u8 dir) +{ + struct xfrm_policy *pol, *ret; + xfrm_address_t *daddr, *saddr; + struct hlist_node *entry; + struct hlist_head *chain; - if (match) { - if (!security_xfrm_policy_lookup(pol, fl->secid, dir)) { + daddr = xfrm_flowi_daddr(fl, family); + saddr = xfrm_flowi_saddr(fl, family); + if (unlikely(!daddr || !saddr)) + return NULL; + + read_lock_bh(&xfrm_policy_lock); + chain = policy_hash_direct(daddr, saddr, family, dir); + ret = NULL; + hlist_for_each_entry(pol, entry, chain, bydst) { + if (xfrm_policy_match(pol, fl, type, family, dir)) { + xfrm_pol_hold(pol); + ret = pol; + break; + } + } + if (!ret) { + chain = &xfrm_policy_inexact[dir]; + hlist_for_each_entry(pol, entry, chain, bydst) { + if (xfrm_policy_match(pol, fl, type, family, dir)) { xfrm_pol_hold(pol); + ret = pol; break; } } } read_unlock_bh(&xfrm_policy_lock); - return pol; + return ret; } static void xfrm_policy_lookup(struct flowi *fl, u16 family, u8 dir, @@ -657,7 +1031,7 @@ static void xfrm_policy_lookup(struct flowi *fl, u16 family, u8 dir, pol = xfrm_policy_lookup_bytype(XFRM_POLICY_TYPE_MAIN, fl, family, dir); #ifdef CONFIG_XFRM_SUB_POLICY - end: +end: #endif if ((*objp = (void *) pol) != NULL) *obj_refp = &pol->refcnt; @@ -704,26 +1078,29 @@ static struct xfrm_policy *xfrm_sk_policy_lookup(struct sock *sk, int dir, struc static void __xfrm_policy_link(struct xfrm_policy *pol, int dir) { - struct xfrm_policy **p_list = XFRM_POLICY_LISTS(pol->type); + struct hlist_head *chain = policy_hash_bysel(&pol->selector, + pol->family, dir); - pol->next = p_list[dir]; - p_list[dir] = pol; + hlist_add_head(&pol->bydst, chain); + hlist_add_head(&pol->byidx, xfrm_policy_byidx+idx_hash(pol->index)); + xfrm_policy_count[dir]++; xfrm_pol_hold(pol); + + if (xfrm_bydst_should_resize(dir, NULL)) + schedule_work(&xfrm_hash_work); } static struct xfrm_policy *__xfrm_policy_unlink(struct xfrm_policy *pol, int dir) { - struct xfrm_policy **polp; + if (hlist_unhashed(&pol->bydst)) + return NULL; - for (polp = XFRM_POLICY_LISTHEADP(pol->type, dir); - *polp != NULL; polp = &(*polp)->next) { - if (*polp == pol) { - *polp = pol->next; - return pol; - } - } - return NULL; + hlist_del(&pol->bydst); + hlist_del(&pol->byidx); + xfrm_policy_count[dir]--; + + return pol; } int xfrm_policy_delete(struct xfrm_policy *pol, int dir) @@ -968,7 +1345,8 @@ restart: if (!policy) { /* To accelerate a bit... */ - if ((dst_orig->flags & DST_NOXFRM) || xfrm_policy_lists_empty(XFRM_POLICY_OUT)) + if ((dst_orig->flags & DST_NOXFRM) || + !xfrm_policy_count[XFRM_POLICY_OUT]) return 0; policy = flow_cache_lookup(fl, dst_orig->ops->family, @@ -1413,50 +1791,50 @@ static struct dst_entry *xfrm_negative_advice(struct dst_entry *dst) return dst; } +static void prune_one_bundle(struct xfrm_policy *pol, int (*func)(struct dst_entry *), struct dst_entry **gc_list_p) +{ + struct dst_entry *dst, **dstp; + + write_lock(&pol->lock); + dstp = &pol->bundles; + while ((dst=*dstp) != NULL) { + if (func(dst)) { + *dstp = dst->next; + dst->next = *gc_list_p; + *gc_list_p = dst; + } else { + dstp = &dst->next; + } + } + write_unlock(&pol->lock); +} + static void xfrm_prune_bundles(int (*func)(struct dst_entry *)) { - int i; - struct xfrm_policy *pol; - struct dst_entry *dst, **dstp, *gc_list = NULL; + struct dst_entry *gc_list = NULL; + int dir; read_lock_bh(&xfrm_policy_lock); - for (i=0; i<2*XFRM_POLICY_MAX; i++) { -#ifdef CONFIG_XFRM_SUB_POLICY - for (pol = xfrm_policy_list_sub[i]; pol; pol = pol->next) { - write_lock(&pol->lock); - dstp = &pol->bundles; - while ((dst=*dstp) != NULL) { - if (func(dst)) { - *dstp = dst->next; - dst->next = gc_list; - gc_list = dst; - } else { - dstp = &dst->next; - } - } - write_unlock(&pol->lock); - } + for (dir = 0; dir < XFRM_POLICY_MAX * 2; dir++) { + struct xfrm_policy *pol; + struct hlist_node *entry; + struct hlist_head *table; + int i; -#endif - for (pol = xfrm_policy_list[i]; pol; pol = pol->next) { - write_lock(&pol->lock); - dstp = &pol->bundles; - while ((dst=*dstp) != NULL) { - if (func(dst)) { - *dstp = dst->next; - dst->next = gc_list; - gc_list = dst; - } else { - dstp = &dst->next; - } - } - write_unlock(&pol->lock); + hlist_for_each_entry(pol, entry, + &xfrm_policy_inexact[dir], bydst) + prune_one_bundle(pol, func, &gc_list); + + table = xfrm_policy_bydst[dir].table; + for (i = xfrm_policy_bydst[dir].hmask; i >= 0; i--) { + hlist_for_each_entry(pol, entry, table + i, bydst) + prune_one_bundle(pol, func, &gc_list); } } read_unlock_bh(&xfrm_policy_lock); while (gc_list) { - dst = gc_list; + struct dst_entry *dst = gc_list; gc_list = dst->next; dst_free(dst); } @@ -1680,6 +2058,9 @@ static struct notifier_block xfrm_dev_notifier = { static void __init xfrm_policy_init(void) { + unsigned int hmask, sz; + int dir; + xfrm_dst_cache = kmem_cache_create("xfrm_dst_cache", sizeof(struct xfrm_dst), 0, SLAB_HWCACHE_ALIGN, @@ -1687,6 +2068,26 @@ static void __init xfrm_policy_init(void) if (!xfrm_dst_cache) panic("XFRM: failed to allocate xfrm_dst_cache\n"); + hmask = 8 - 1; + sz = (hmask+1) * sizeof(struct hlist_head); + + xfrm_policy_byidx = xfrm_policy_hash_alloc(sz); + xfrm_idx_hmask = hmask; + if (!xfrm_policy_byidx) + panic("XFRM: failed to allocate byidx hash\n"); + + for (dir = 0; dir < XFRM_POLICY_MAX * 2; dir++) { + struct xfrm_policy_hash *htab; + + INIT_HLIST_HEAD(&xfrm_policy_inexact[dir]); + + htab = &xfrm_policy_bydst[dir]; + htab->table = xfrm_policy_hash_alloc(sz); + htab->hmask = hmask; + if (!htab->table) + panic("XFRM: failed to allocate bydst hash\n"); + } + INIT_WORK(&xfrm_policy_gc_work, xfrm_policy_gc_task, NULL); register_netdevice_notifier(&xfrm_dev_notifier); } -- cgit v0.10.2 From 44e36b42a8378be1dcf7e6f8a1cb2710a8903387 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Thu, 24 Aug 2006 04:50:50 -0700 Subject: [XFRM]: Extract common hashing code into xfrm_hash.[ch] Signed-off-by: David S. Miller diff --git a/net/xfrm/Makefile b/net/xfrm/Makefile index 693aac1..de3c1a6 100644 --- a/net/xfrm/Makefile +++ b/net/xfrm/Makefile @@ -2,6 +2,7 @@ # Makefile for the XFRM subsystem. # -obj-$(CONFIG_XFRM) := xfrm_policy.o xfrm_state.o xfrm_input.o xfrm_algo.o +obj-$(CONFIG_XFRM) := xfrm_policy.o xfrm_state.o xfrm_hash.o \ + xfrm_input.o xfrm_algo.o obj-$(CONFIG_XFRM_USER) += xfrm_user.o diff --git a/net/xfrm/xfrm_hash.c b/net/xfrm/xfrm_hash.c new file mode 100644 index 0000000..37643bb --- /dev/null +++ b/net/xfrm/xfrm_hash.c @@ -0,0 +1,41 @@ +/* xfrm_hash.c: Common hash table code. + * + * Copyright (C) 2006 David S. Miller (davem@davemloft.net) + */ + +#include +#include +#include +#include +#include +#include + +#include "xfrm_hash.h" + +struct hlist_head *xfrm_hash_alloc(unsigned int sz) +{ + struct hlist_head *n; + + if (sz <= PAGE_SIZE) + n = kmalloc(sz, GFP_KERNEL); + else if (hashdist) + n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL); + else + n = (struct hlist_head *) + __get_free_pages(GFP_KERNEL, get_order(sz)); + + if (n) + memset(n, 0, sz); + + return n; +} + +void xfrm_hash_free(struct hlist_head *n, unsigned int sz) +{ + if (sz <= PAGE_SIZE) + kfree(n); + else if (hashdist) + vfree(n); + else + free_pages((unsigned long)n, get_order(sz)); +} diff --git a/net/xfrm/xfrm_hash.h b/net/xfrm/xfrm_hash.h new file mode 100644 index 0000000..d3abb0b --- /dev/null +++ b/net/xfrm/xfrm_hash.h @@ -0,0 +1,128 @@ +#ifndef _XFRM_HASH_H +#define _XFRM_HASH_H + +#include +#include + +static inline unsigned int __xfrm4_addr_hash(xfrm_address_t *addr) +{ + return ntohl(addr->a4); +} + +static inline unsigned int __xfrm6_addr_hash(xfrm_address_t *addr) +{ + return ntohl(addr->a6[2] ^ addr->a6[3]); +} + +static inline unsigned int __xfrm4_daddr_saddr_hash(xfrm_address_t *daddr, xfrm_address_t *saddr) +{ + return ntohl(daddr->a4 ^ saddr->a4); +} + +static inline unsigned int __xfrm6_daddr_saddr_hash(xfrm_address_t *daddr, xfrm_address_t *saddr) +{ + return ntohl(daddr->a6[2] ^ daddr->a6[3] ^ + saddr->a6[2] ^ saddr->a6[3]); +} + +static inline unsigned int __xfrm_dst_hash(xfrm_address_t *daddr, xfrm_address_t *saddr, + u32 reqid, unsigned short family, + unsigned int hmask) +{ + unsigned int h = family ^ reqid; + switch (family) { + case AF_INET: + h ^= __xfrm4_daddr_saddr_hash(daddr, saddr); + break; + case AF_INET6: + h ^= __xfrm6_daddr_saddr_hash(daddr, saddr); + break; + } + return (h ^ (h >> 16)) & hmask; +} + +static inline unsigned __xfrm_src_hash(xfrm_address_t *saddr, + unsigned short family, + unsigned int hmask) +{ + unsigned int h = family; + switch (family) { + case AF_INET: + h ^= __xfrm4_addr_hash(saddr); + break; + case AF_INET6: + h ^= __xfrm6_addr_hash(saddr); + break; + }; + return (h ^ (h >> 16)) & hmask; +} + +static inline unsigned int +__xfrm_spi_hash(xfrm_address_t *daddr, u32 spi, u8 proto, unsigned short family, + unsigned int hmask) +{ + unsigned int h = spi ^ proto; + switch (family) { + case AF_INET: + h ^= __xfrm4_addr_hash(daddr); + break; + case AF_INET6: + h ^= __xfrm6_addr_hash(daddr); + break; + } + return (h ^ (h >> 10) ^ (h >> 20)) & hmask; +} + +static inline unsigned int __idx_hash(u32 index, unsigned int hmask) +{ + return (index ^ (index >> 8)) & hmask; +} + +static inline unsigned int __sel_hash(struct xfrm_selector *sel, unsigned short family, unsigned int hmask) +{ + xfrm_address_t *daddr = &sel->daddr; + xfrm_address_t *saddr = &sel->saddr; + unsigned int h = 0; + + switch (family) { + case AF_INET: + if (sel->prefixlen_d != 32 || + sel->prefixlen_s != 32) + return hmask + 1; + + h = __xfrm4_daddr_saddr_hash(daddr, saddr); + break; + + case AF_INET6: + if (sel->prefixlen_d != 128 || + sel->prefixlen_s != 128) + return hmask + 1; + + h = __xfrm6_daddr_saddr_hash(daddr, saddr); + break; + }; + h ^= (h >> 16); + return h & hmask; +} + +static inline unsigned int __addr_hash(xfrm_address_t *daddr, xfrm_address_t *saddr, unsigned short family, unsigned int hmask) +{ + unsigned int h = 0; + + switch (family) { + case AF_INET: + h = __xfrm4_daddr_saddr_hash(daddr, saddr); + break; + + case AF_INET6: + h = __xfrm6_daddr_saddr_hash(daddr, saddr); + break; + }; + h ^= (h >> 16); + return h & hmask; +} + +extern struct hlist_head *xfrm_hash_alloc(unsigned int sz); +extern void xfrm_hash_free(struct hlist_head *n, unsigned int sz); + +#endif /* _XFRM_HASH_H */ diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 087a544..b446ca3 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -22,12 +22,12 @@ #include #include #include -#include -#include #include #include #include +#include "xfrm_hash.h" + DEFINE_MUTEX(xfrm_cfg_mutex); EXPORT_SYMBOL(xfrm_cfg_mutex); @@ -409,62 +409,11 @@ static struct hlist_head *xfrm_policy_byidx __read_mostly; static unsigned int xfrm_idx_hmask __read_mostly; static unsigned int xfrm_policy_hashmax __read_mostly = 1 * 1024 * 1024; -static inline unsigned int __idx_hash(u32 index, unsigned int hmask) -{ - return (index ^ (index >> 8)) & hmask; -} - static inline unsigned int idx_hash(u32 index) { return __idx_hash(index, xfrm_idx_hmask); } -static inline unsigned int __sel_hash(struct xfrm_selector *sel, unsigned short family, unsigned int hmask) -{ - xfrm_address_t *daddr = &sel->daddr; - xfrm_address_t *saddr = &sel->saddr; - unsigned int h = 0; - - switch (family) { - case AF_INET: - if (sel->prefixlen_d != 32 || - sel->prefixlen_s != 32) - return hmask + 1; - - h = ntohl(daddr->a4 ^ saddr->a4); - break; - - case AF_INET6: - if (sel->prefixlen_d != 128 || - sel->prefixlen_s != 128) - return hmask + 1; - - h = ntohl(daddr->a6[2] ^ daddr->a6[3] ^ - saddr->a6[2] ^ saddr->a6[3]); - break; - }; - h ^= (h >> 16); - return h & hmask; -} - -static inline unsigned int __addr_hash(xfrm_address_t *daddr, xfrm_address_t *saddr, unsigned short family, unsigned int hmask) -{ - unsigned int h = 0; - - switch (family) { - case AF_INET: - h = ntohl(daddr->a4 ^ saddr->a4); - break; - - case AF_INET6: - h = ntohl(daddr->a6[2] ^ daddr->a6[3] ^ - saddr->a6[2] ^ saddr->a6[3]); - break; - }; - h ^= (h >> 16); - return h & hmask; -} - static struct hlist_head *policy_hash_bysel(struct xfrm_selector *sel, unsigned short family, int dir) { unsigned int hmask = xfrm_policy_bydst[dir].hmask; @@ -483,34 +432,6 @@ static struct hlist_head *policy_hash_direct(xfrm_address_t *daddr, xfrm_address return xfrm_policy_bydst[dir].table + hash; } -static struct hlist_head *xfrm_policy_hash_alloc(unsigned int sz) -{ - struct hlist_head *n; - - if (sz <= PAGE_SIZE) - n = kmalloc(sz, GFP_KERNEL); - else if (hashdist) - n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL); - else - n = (struct hlist_head *) - __get_free_pages(GFP_KERNEL, get_order(sz)); - - if (n) - memset(n, 0, sz); - - return n; -} - -static void xfrm_policy_hash_free(struct hlist_head *n, unsigned int sz) -{ - if (sz <= PAGE_SIZE) - kfree(n); - else if (hashdist) - vfree(n); - else - free_pages((unsigned long)n, get_order(sz)); -} - static void xfrm_dst_hash_transfer(struct hlist_head *list, struct hlist_head *ndsttable, unsigned int nhashmask) @@ -553,7 +474,7 @@ static void xfrm_bydst_resize(int dir) unsigned int nhashmask = xfrm_new_hash_mask(hmask); unsigned int nsize = (nhashmask + 1) * sizeof(struct hlist_head); struct hlist_head *odst = xfrm_policy_bydst[dir].table; - struct hlist_head *ndst = xfrm_policy_hash_alloc(nsize); + struct hlist_head *ndst = xfrm_hash_alloc(nsize); int i; if (!ndst) @@ -569,7 +490,7 @@ static void xfrm_bydst_resize(int dir) write_unlock_bh(&xfrm_policy_lock); - xfrm_policy_hash_free(odst, (hmask + 1) * sizeof(struct hlist_head)); + xfrm_hash_free(odst, (hmask + 1) * sizeof(struct hlist_head)); } static void xfrm_byidx_resize(int total) @@ -578,7 +499,7 @@ static void xfrm_byidx_resize(int total) unsigned int nhashmask = xfrm_new_hash_mask(hmask); unsigned int nsize = (nhashmask + 1) * sizeof(struct hlist_head); struct hlist_head *oidx = xfrm_policy_byidx; - struct hlist_head *nidx = xfrm_policy_hash_alloc(nsize); + struct hlist_head *nidx = xfrm_hash_alloc(nsize); int i; if (!nidx) @@ -594,7 +515,7 @@ static void xfrm_byidx_resize(int total) write_unlock_bh(&xfrm_policy_lock); - xfrm_policy_hash_free(oidx, (hmask + 1) * sizeof(struct hlist_head)); + xfrm_hash_free(oidx, (hmask + 1) * sizeof(struct hlist_head)); } static inline int xfrm_bydst_should_resize(int dir, int *total) @@ -2071,7 +1992,7 @@ static void __init xfrm_policy_init(void) hmask = 8 - 1; sz = (hmask+1) * sizeof(struct hlist_head); - xfrm_policy_byidx = xfrm_policy_hash_alloc(sz); + xfrm_policy_byidx = xfrm_hash_alloc(sz); xfrm_idx_hmask = hmask; if (!xfrm_policy_byidx) panic("XFRM: failed to allocate byidx hash\n"); @@ -2082,7 +2003,7 @@ static void __init xfrm_policy_init(void) INIT_HLIST_HEAD(&xfrm_policy_inexact[dir]); htab = &xfrm_policy_bydst[dir]; - htab->table = xfrm_policy_hash_alloc(sz); + htab->table = xfrm_hash_alloc(sz); htab->hmask = hmask; if (!htab->table) panic("XFRM: failed to allocate bydst hash\n"); diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 37213f9..4341795 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -18,11 +18,11 @@ #include #include #include -#include -#include #include #include +#include "xfrm_hash.h" + struct sock *xfrm_nl; EXPORT_SYMBOL(xfrm_nl); @@ -55,44 +55,6 @@ static unsigned int xfrm_state_hashmax __read_mostly = 1 * 1024 * 1024; static unsigned int xfrm_state_num; static unsigned int xfrm_state_genid; -static inline unsigned int __xfrm4_addr_hash(xfrm_address_t *addr) -{ - return ntohl(addr->a4); -} - -static inline unsigned int __xfrm6_addr_hash(xfrm_address_t *addr) -{ - return ntohl(addr->a6[2]^addr->a6[3]); -} - -static inline unsigned int __xfrm4_daddr_saddr_hash(xfrm_address_t *daddr, xfrm_address_t *saddr) -{ - return ntohl(daddr->a4 ^ saddr->a4); -} - -static inline unsigned int __xfrm6_daddr_saddr_hash(xfrm_address_t *daddr, xfrm_address_t *saddr) -{ - return ntohl(daddr->a6[2] ^ daddr->a6[3] ^ - saddr->a6[2] ^ saddr->a6[3]); -} - -static inline unsigned int __xfrm_dst_hash(xfrm_address_t *daddr, - xfrm_address_t *saddr, - u32 reqid, unsigned short family, - unsigned int hmask) -{ - unsigned int h = family ^ reqid; - switch (family) { - case AF_INET: - h ^= __xfrm4_daddr_saddr_hash(daddr, saddr); - break; - case AF_INET6: - h ^= __xfrm6_daddr_saddr_hash(daddr, saddr); - break; - }; - return (h ^ (h >> 16)) & hmask; -} - static inline unsigned int xfrm_dst_hash(xfrm_address_t *daddr, xfrm_address_t *saddr, u32 reqid, @@ -101,76 +63,18 @@ static inline unsigned int xfrm_dst_hash(xfrm_address_t *daddr, return __xfrm_dst_hash(daddr, saddr, reqid, family, xfrm_state_hmask); } -static inline unsigned __xfrm_src_hash(xfrm_address_t *addr, unsigned short family, - unsigned int hmask) -{ - unsigned int h = family; - switch (family) { - case AF_INET: - h ^= __xfrm4_addr_hash(addr); - break; - case AF_INET6: - h ^= __xfrm6_addr_hash(addr); - break; - }; - return (h ^ (h >> 16)) & hmask; -} - -static inline unsigned xfrm_src_hash(xfrm_address_t *addr, unsigned short family) +static inline unsigned int xfrm_src_hash(xfrm_address_t *addr, + unsigned short family) { return __xfrm_src_hash(addr, family, xfrm_state_hmask); } static inline unsigned int -__xfrm_spi_hash(xfrm_address_t *daddr, u32 spi, u8 proto, - unsigned short family, unsigned int hmask) -{ - unsigned int h = spi ^ proto; - switch (family) { - case AF_INET: - h ^= __xfrm4_addr_hash(daddr); - break; - case AF_INET6: - h ^= __xfrm6_addr_hash(daddr); - break; - } - return (h ^ (h >> 10) ^ (h >> 20)) & hmask; -} - -static inline unsigned int xfrm_spi_hash(xfrm_address_t *daddr, u32 spi, u8 proto, unsigned short family) { return __xfrm_spi_hash(daddr, spi, proto, family, xfrm_state_hmask); } -static struct hlist_head *xfrm_state_hash_alloc(unsigned int sz) -{ - struct hlist_head *n; - - if (sz <= PAGE_SIZE) - n = kmalloc(sz, GFP_KERNEL); - else if (hashdist) - n = __vmalloc(sz, GFP_KERNEL, PAGE_KERNEL); - else - n = (struct hlist_head *) - __get_free_pages(GFP_KERNEL, get_order(sz)); - - if (n) - memset(n, 0, sz); - - return n; -} - -static void xfrm_state_hash_free(struct hlist_head *n, unsigned int sz) -{ - if (sz <= PAGE_SIZE) - kfree(n); - else if (hashdist) - vfree(n); - else - free_pages((unsigned long)n, get_order(sz)); -} - static void xfrm_hash_transfer(struct hlist_head *list, struct hlist_head *ndsttable, struct hlist_head *nsrctable, @@ -216,18 +120,18 @@ static void xfrm_hash_resize(void *__unused) mutex_lock(&hash_resize_mutex); nsize = xfrm_hash_new_size(); - ndst = xfrm_state_hash_alloc(nsize); + ndst = xfrm_hash_alloc(nsize); if (!ndst) goto out_unlock; - nsrc = xfrm_state_hash_alloc(nsize); + nsrc = xfrm_hash_alloc(nsize); if (!nsrc) { - xfrm_state_hash_free(ndst, nsize); + xfrm_hash_free(ndst, nsize); goto out_unlock; } - nspi = xfrm_state_hash_alloc(nsize); + nspi = xfrm_hash_alloc(nsize); if (!nspi) { - xfrm_state_hash_free(ndst, nsize); - xfrm_state_hash_free(nsrc, nsize); + xfrm_hash_free(ndst, nsize); + xfrm_hash_free(nsrc, nsize); goto out_unlock; } @@ -251,9 +155,9 @@ static void xfrm_hash_resize(void *__unused) spin_unlock_bh(&xfrm_state_lock); osize = (ohashmask + 1) * sizeof(struct hlist_head); - xfrm_state_hash_free(odst, osize); - xfrm_state_hash_free(osrc, osize); - xfrm_state_hash_free(ospi, osize); + xfrm_hash_free(odst, osize); + xfrm_hash_free(osrc, osize); + xfrm_hash_free(ospi, osize); out_unlock: mutex_unlock(&hash_resize_mutex); @@ -1643,9 +1547,9 @@ void __init xfrm_state_init(void) sz = sizeof(struct hlist_head) * 8; - xfrm_state_bydst = xfrm_state_hash_alloc(sz); - xfrm_state_bysrc = xfrm_state_hash_alloc(sz); - xfrm_state_byspi = xfrm_state_hash_alloc(sz); + xfrm_state_bydst = xfrm_hash_alloc(sz); + xfrm_state_bysrc = xfrm_hash_alloc(sz); + xfrm_state_byspi = xfrm_hash_alloc(sz); if (!xfrm_state_bydst || !xfrm_state_bysrc || !xfrm_state_byspi) panic("XFRM: Cannot allocate bydst/bysrc/byspi hashes."); xfrm_state_hmask = ((sz / sizeof(struct hlist_head)) - 1); -- cgit v0.10.2 From e4bec827feda76d5e7417a2696a75424834d564f Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Fri, 22 Sep 2006 15:17:35 -0700 Subject: [IPSEC] esp: Defer output IV initialization to first use. First of all, if the xfrm_state only gets used for input packets this entropy is a complete waste. Secondly, it is often the case that a configuration loads many rules (perhaps even dynamically) and they don't all necessarily ever get used. This get_random_bytes() call was showing up in the profiles for xfrm_state inserts which is how I noticed this. Signed-off-by: David S. Miller diff --git a/include/net/esp.h b/include/net/esp.h index 064366d..713d039f 100644 --- a/include/net/esp.h +++ b/include/net/esp.h @@ -15,13 +15,14 @@ struct esp_data struct { u8 *key; /* Key */ int key_len; /* Key length */ - u8 *ivec; /* ivec buffer */ + int padlen; /* 0..255 */ /* ivlen is offset from enc_data, where encrypted data start. * It is logically different of crypto_tfm_alg_ivsize(tfm). * We assume that it is either zero (no ivec), or * >= crypto_tfm_alg_ivsize(tfm). */ int ivlen; - int padlen; /* 0..255 */ + int ivinitted; + u8 *ivec; /* ivec buffer */ struct crypto_blkcipher *tfm; /* crypto handle */ } conf; diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c index e87377e..13b2936 100644 --- a/net/ipv4/esp4.c +++ b/net/ipv4/esp4.c @@ -95,8 +95,13 @@ static int esp_output(struct xfrm_state *x, struct sk_buff *skb) esph->seq_no = htonl(++x->replay.oseq); xfrm_aevent_doreplay(x); - if (esp->conf.ivlen) + if (esp->conf.ivlen) { + if (unlikely(!esp->conf.ivinitted)) { + get_random_bytes(esp->conf.ivec, esp->conf.ivlen); + esp->conf.ivinitted = 1; + } crypto_blkcipher_set_iv(tfm, esp->conf.ivec, esp->conf.ivlen); + } do { struct scatterlist *sg = &esp->sgbuf[0]; @@ -378,7 +383,7 @@ static int esp_init_state(struct xfrm_state *x) esp->conf.ivec = kmalloc(esp->conf.ivlen, GFP_KERNEL); if (unlikely(esp->conf.ivec == NULL)) goto error; - get_random_bytes(esp->conf.ivec, esp->conf.ivlen); + esp->conf.ivinitted = 0; } if (crypto_blkcipher_setkey(tfm, esp->conf.key, esp->conf.key_len)) goto error; diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c index ae50b95..e78680a 100644 --- a/net/ipv6/esp6.c +++ b/net/ipv6/esp6.c @@ -99,8 +99,13 @@ static int esp6_output(struct xfrm_state *x, struct sk_buff *skb) esph->seq_no = htonl(++x->replay.oseq); xfrm_aevent_doreplay(x); - if (esp->conf.ivlen) + if (esp->conf.ivlen) { + if (unlikely(!esp->conf.ivinitted)) { + get_random_bytes(esp->conf.ivec, esp->conf.ivlen); + esp->conf.ivinitted = 1; + } crypto_blkcipher_set_iv(tfm, esp->conf.ivec, esp->conf.ivlen); + } do { struct scatterlist *sg = &esp->sgbuf[0]; @@ -353,7 +358,7 @@ static int esp6_init_state(struct xfrm_state *x) esp->conf.ivec = kmalloc(esp->conf.ivlen, GFP_KERNEL); if (unlikely(esp->conf.ivec == NULL)) goto error; - get_random_bytes(esp->conf.ivec, esp->conf.ivlen); + esp->conf.ivinitted = 0; } if (crypto_blkcipher_setkey(tfm, esp->conf.key, esp->conf.key_len)) goto error; -- cgit v0.10.2 From e731c248ba9e8c7025ae8b4a3fa48e4236b82e52 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Thu, 24 Aug 2006 23:18:12 +0900 Subject: [IPV6] MIP6: Several obvious clean-ups. - Remove redundant code. Pointed out by Brian Haley . - Unify code paths with/without CONFIG_IPV6_MIP. - Use NIP6_FMT for IPv6 address textual presentation. - Fold long line. Pointed out by David Miller . Signed-off-by: YOSHIFUJI Hideaki diff --git a/net/ipv6/ah6.c b/net/ipv6/ah6.c index 0f2b4e3..b0d83e8 100644 --- a/net/ipv6/ah6.c +++ b/net/ipv6/ah6.c @@ -128,9 +128,7 @@ static void ipv6_rearrange_destopt(struct ipv6hdr *iph, struct ipv6_opt_hdr *des off += optlen; len -= optlen; } - if (len == 0) - return; - + /* Note: ok if len == 0 */ bad: return; } @@ -175,11 +173,7 @@ static void ipv6_rearrange_rthdr(struct ipv6hdr *iph, struct ipv6_rt_hdr *rthdr) ipv6_addr_copy(&iph->daddr, &final_addr); } -#ifdef CONFIG_IPV6_MIP6 static int ipv6_clear_mutable_options(struct ipv6hdr *iph, int len, int dir) -#else -static int ipv6_clear_mutable_options(struct ipv6hdr *iph, int len) -#endif { union { struct ipv6hdr *iph; @@ -194,30 +188,12 @@ static int ipv6_clear_mutable_options(struct ipv6hdr *iph, int len) while (exthdr.raw < end) { switch (nexthdr) { -#ifdef CONFIG_IPV6_MIP6 - case NEXTHDR_HOP: - if (!zero_out_mutable_opts(exthdr.opth)) { - LIMIT_NETDEBUG( - KERN_WARNING "overrun %sopts\n", - nexthdr == NEXTHDR_HOP ? - "hop" : "dest"); - return -EINVAL; - } - break; case NEXTHDR_DEST: +#ifdef CONFIG_IPV6_MIP6 if (dir == XFRM_POLICY_OUT) ipv6_rearrange_destopt(iph, exthdr.opth); - if (!zero_out_mutable_opts(exthdr.opth)) { - LIMIT_NETDEBUG( - KERN_WARNING "overrun %sopts\n", - nexthdr == NEXTHDR_HOP ? - "hop" : "dest"); - return -EINVAL; - } - break; -#else +#endif case NEXTHDR_HOP: - case NEXTHDR_DEST: if (!zero_out_mutable_opts(exthdr.opth)) { LIMIT_NETDEBUG( KERN_WARNING "overrun %sopts\n", @@ -226,7 +202,6 @@ static int ipv6_clear_mutable_options(struct ipv6hdr *iph, int len) return -EINVAL; } break; -#endif case NEXTHDR_ROUTING: ipv6_rearrange_rthdr(iph, exthdr.rth); @@ -282,16 +257,13 @@ static int ah6_output(struct xfrm_state *x, struct sk_buff *skb) } #ifdef CONFIG_IPV6_MIP6 memcpy(tmp_ext, &top_iph->saddr, extlen); - err = ipv6_clear_mutable_options(top_iph, - extlen - sizeof(*tmp_ext) + - sizeof(*top_iph), - XFRM_POLICY_OUT); #else memcpy(tmp_ext, &top_iph->daddr, extlen); +#endif err = ipv6_clear_mutable_options(top_iph, extlen - sizeof(*tmp_ext) + - sizeof(*top_iph)); -#endif + sizeof(*top_iph), + XFRM_POLICY_OUT); if (err) goto error_free_iph; } @@ -386,13 +358,8 @@ static int ah6_input(struct xfrm_state *x, struct sk_buff *skb) if (!tmp_hdr) goto out; memcpy(tmp_hdr, skb->nh.raw, hdr_len); -#ifdef CONFIG_IPV6_MIP6 if (ipv6_clear_mutable_options(skb->nh.ipv6h, hdr_len, XFRM_POLICY_IN)) goto free_out; -#else - if (ipv6_clear_mutable_options(skb->nh.ipv6h, hdr_len)) - goto free_out; -#endif skb->nh.ipv6h->priority = 0; skb->nh.ipv6h->flow_lbl[0] = 0; skb->nh.ipv6h->flow_lbl[1] = 0; diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c index 6a6466b..084f78c 100644 --- a/net/ipv6/exthdrs.c +++ b/net/ipv6/exthdrs.c @@ -87,7 +87,6 @@ int ipv6_find_tlv(struct sk_buff *skb, int offset, int type) len -= optlen; } /* not_found */ - return -1; bad: return -1; } diff --git a/net/ipv6/mip6.c b/net/ipv6/mip6.c index 7085403..99d116c 100644 --- a/net/ipv6/mip6.c +++ b/net/ipv6/mip6.c @@ -121,7 +121,8 @@ int mip6_mh_filter(struct sock *sk, struct sk_buff *skb) &skb->nh.ipv6h->daddr, mhlen, IPPROTO_MH, skb_checksum(skb, 0, mhlen, 0))) { - LIMIT_NETDEBUG(KERN_DEBUG "mip6: MH checksum failed [%04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x > %04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x]\n", + LIMIT_NETDEBUG(KERN_DEBUG "mip6: MH checksum failed " + "[" NIP6_FMT " > " NIP6_FMT "]\n", NIP6(skb->nh.ipv6h->saddr), NIP6(skb->nh.ipv6h->daddr)); return -1; @@ -234,7 +235,8 @@ static int mip6_destopt_reject(struct xfrm_state *x, struct sk_buff *skb, struct struct timeval stamp; int err = 0; - if (unlikely(fl->proto == IPPROTO_MH && fl->fl_mh_type <= IP6_MH_TYPE_MAX)) + if (unlikely(fl->proto == IPPROTO_MH && + fl->fl_mh_type <= IP6_MH_TYPE_MAX)) goto out; if (likely(opt->dsthao)) { -- cgit v0.10.2 From 2cc67cc731d9b693a08e781e98fec0e3a6d6ba44 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Mon, 21 Aug 2006 19:18:57 +0900 Subject: [IPV6] ROUTE: Routing by Traffic Class. Signed-off-by: YOSHIFUJI Hideaki diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c index 7b4908c..91f6233 100644 --- a/net/ipv6/fib6_rules.c +++ b/net/ipv6/fib6_rules.c @@ -121,6 +121,9 @@ static int fib6_rule_match(struct fib_rule *rule, struct flowi *fl, int flags) !ipv6_prefix_equal(&fl->fl6_src, &r->src.addr, r->src.plen)) return 0; + if (r->tclass && r->tclass != ((ntohl(fl->fl6_flowlabel) >> 20) & 0xff)) + return 0; + return 1; } -- cgit v0.10.2 From 75bff8f023e02b045a8f68f36fa7da98dca124b8 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Mon, 21 Aug 2006 19:22:01 +0900 Subject: [IPV6] ROUTE: Routing by FWMARK. Based on patch by Jean Lorchat . Signed-off-by: YOSHIFUJI Hideaki diff --git a/include/linux/fib_rules.h b/include/linux/fib_rules.h index 19a82b6..2987549 100644 --- a/include/linux/fib_rules.h +++ b/include/linux/fib_rules.h @@ -34,7 +34,7 @@ enum FRA_UNUSED3, FRA_UNUSED4, FRA_UNUSED5, - FRA_FWMARK, /* netfilter mark (IPv4) */ + FRA_FWMARK, /* netfilter mark (IPv4/IPv6) */ FRA_FLOW, /* flow/class id */ FRA_UNUSED6, FRA_UNUSED7, diff --git a/include/net/flow.h b/include/net/flow.h index e052291..3ca210e 100644 --- a/include/net/flow.h +++ b/include/net/flow.h @@ -26,6 +26,7 @@ struct flowi { struct { struct in6_addr daddr; struct in6_addr saddr; + __u32 fwmark; __u32 flowlabel; } ip6_u; @@ -42,6 +43,7 @@ struct flowi { #define fld_scope nl_u.dn_u.scope #define fl6_dst nl_u.ip6_u.daddr #define fl6_src nl_u.ip6_u.saddr +#define fl6_fwmark nl_u.ip6_u.fwmark #define fl6_flowlabel nl_u.ip6_u.flowlabel #define fl4_dst nl_u.ip4_u.daddr #define fl4_src nl_u.ip4_u.saddr diff --git a/net/ipv6/Kconfig b/net/ipv6/Kconfig index 21e0cc8..a2d211d 100644 --- a/net/ipv6/Kconfig +++ b/net/ipv6/Kconfig @@ -173,3 +173,10 @@ config IPV6_MULTIPLE_TABLES ---help--- Support multiple routing tables. +config IPV6_ROUTE_FWMARK + bool "IPv6: use netfilter MARK value as routing key" + depends on IPV6_MULTIPLE_TABLES && NETFILTER + ---help--- + If you say Y here, you will be able to specify different routes for + packets with different mark values (see iptables(8), MARK target). + diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c index 91f6233..aebd9e2 100644 --- a/net/ipv6/fib6_rules.c +++ b/net/ipv6/fib6_rules.c @@ -26,6 +26,9 @@ struct fib6_rule struct fib_rule common; struct rt6key src; struct rt6key dst; +#ifdef CONFIG_IPV6_ROUTE_FWMARK + u8 fwmark; +#endif u8 tclass; }; @@ -124,6 +127,11 @@ static int fib6_rule_match(struct fib_rule *rule, struct flowi *fl, int flags) if (r->tclass && r->tclass != ((ntohl(fl->fl6_flowlabel) >> 20) & 0xff)) return 0; +#ifdef CONFIG_IPV6_ROUTE_FWMARK + if (r->fwmark && (r->fwmark != fl->fl6_fwmark)) + return 0; +#endif + return 1; } @@ -164,6 +172,11 @@ static int fib6_rule_configure(struct fib_rule *rule, struct sk_buff *skb, nla_memcpy(&rule6->dst.addr, tb[FRA_DST], sizeof(struct in6_addr)); +#ifdef CONFIG_IPV6_ROUTE_FWMARK + if (tb[FRA_FWMARK]) + rule6->fwmark = nla_get_u32(tb[FRA_FWMARK]); +#endif + rule6->src.plen = frh->src_len; rule6->dst.plen = frh->dst_len; rule6->tclass = frh->tos; @@ -195,6 +208,11 @@ static int fib6_rule_compare(struct fib_rule *rule, struct fib_rule_hdr *frh, nla_memcmp(tb[FRA_DST], &rule6->dst.addr, sizeof(struct in6_addr))) return 0; +#ifdef CONFIG_IPV6_ROUTE_FWMARK + if (tb[FRA_FWMARK] && (rule6->fwmark != nla_get_u32(tb[FRA_FWMARK]))) + return 0; +#endif + return 1; } @@ -216,6 +234,11 @@ static int fib6_rule_fill(struct fib_rule *rule, struct sk_buff *skb, NLA_PUT(skb, FRA_SRC, sizeof(struct in6_addr), &rule6->src.addr); +#ifdef CONFIG_IPV6_ROUTE_FWMARK + if (rule6->fwmark) + NLA_PUT_U32(skb, FRA_FWMARK, rule6->fwmark); +#endif + return 0; nla_put_failure: diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 2069128..649350b 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -703,6 +703,7 @@ void ip6_route_input(struct sk_buff *skb) .ip6_u = { .daddr = iph->daddr, .saddr = iph->saddr, + .fwmark = skb->nfmark, .flowlabel = (* (u32 *) iph)&IPV6_FLOWINFO_MASK, }, }, -- cgit v0.10.2 From 1aaec67f9335a17856dfacdd3e5cc6f4c18faeec Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Sun, 25 Jun 2006 23:54:55 +0900 Subject: [NET]: Add common helper functions to convert IPv6/IPv4 address string to network address structure. These helpers can be used in netfilter, cifs etc. Signed-off-by: YOSHIFUJI Hideaki diff --git a/include/linux/inet.h b/include/linux/inet.h index 6c5587a..b7c6da7 100644 --- a/include/linux/inet.h +++ b/include/linux/inet.h @@ -46,5 +46,7 @@ #include extern __be32 in_aton(const char *str); +extern int in4_pton(const char *src, int srclen, u8 *dst, char delim, const char **end); +extern int in6_pton(const char *src, int srclen, u8 *dst, char delim, const char **end); #endif #endif /* _LINUX_INET_H */ diff --git a/net/core/utils.c b/net/core/utils.c index e31c90e..5a06e8a 100644 --- a/net/core/utils.c +++ b/net/core/utils.c @@ -4,6 +4,7 @@ * Authors: * net_random Alan Cox * net_ratelimit Andy Kleen + * in{4,6}_pton YOSHIFUJI Hideaki, Copyright (C)2006 USAGI/WIDE Project * * Created by Alexey Kuznetsov * @@ -191,3 +192,217 @@ __be32 in_aton(const char *str) } EXPORT_SYMBOL(in_aton); + +#define IN6PTON_XDIGIT 0x00010000 +#define IN6PTON_DIGIT 0x00020000 +#define IN6PTON_COLON_MASK 0x00700000 +#define IN6PTON_COLON_1 0x00100000 /* single : requested */ +#define IN6PTON_COLON_2 0x00200000 /* second : requested */ +#define IN6PTON_COLON_1_2 0x00400000 /* :: requested */ +#define IN6PTON_DOT 0x00800000 /* . */ +#define IN6PTON_DELIM 0x10000000 +#define IN6PTON_NULL 0x20000000 /* first/tail */ +#define IN6PTON_UNKNOWN 0x40000000 + +static inline int digit2bin(char c, char delim) +{ + if (c == delim || c == '\0') + return IN6PTON_DELIM; + if (c == '.') + return IN6PTON_DOT; + if (c >= '0' && c <= '9') + return (IN6PTON_DIGIT | (c - '0')); + return IN6PTON_UNKNOWN; +} + +static inline int xdigit2bin(char c, char delim) +{ + if (c == delim || c == '\0') + return IN6PTON_DELIM; + if (c == ':') + return IN6PTON_COLON_MASK; + if (c == '.') + return IN6PTON_DOT; + if (c >= '0' && c <= '9') + return (IN6PTON_XDIGIT | IN6PTON_DIGIT| (c - '0')); + if (c >= 'a' && c <= 'f') + return (IN6PTON_XDIGIT | (c - 'a' + 10)); + if (c >= 'A' && c <= 'F') + return (IN6PTON_XDIGIT | (c - 'A' + 10)); + return IN6PTON_UNKNOWN; +} + +int in4_pton(const char *src, int srclen, + u8 *dst, + char delim, const char **end) +{ + const char *s; + u8 *d; + u8 dbuf[4]; + int ret = 0; + int i; + int w = 0; + + if (srclen < 0) + srclen = strlen(src); + s = src; + d = dbuf; + i = 0; + while(1) { + int c; + c = xdigit2bin(srclen > 0 ? *s : '\0', delim); + if (!(c & (IN6PTON_DIGIT | IN6PTON_DOT | IN6PTON_DELIM))) { + goto out; + } + if (c & (IN6PTON_DOT | IN6PTON_DELIM)) { + if (w == 0) + goto out; + *d++ = w & 0xff; + w = 0; + i++; + if (c & IN6PTON_DELIM) { + if (i != 4) + goto out; + break; + } + goto cont; + } + w = (w * 10) + c; + if ((w & 0xffff) > 255) { + goto out; + } +cont: + if (i >= 4) + goto out; + s++; + srclen--; + } + ret = 1; + memcpy(dst, dbuf, sizeof(dbuf)); +out: + if (end) + *end = s; + return ret; +} + +EXPORT_SYMBOL(in4_pton); + +int in6_pton(const char *src, int srclen, + u8 *dst, + char delim, const char **end) +{ + const char *s, *tok = NULL; + u8 *d, *dc = NULL; + u8 dbuf[16]; + int ret = 0; + int i; + int state = IN6PTON_COLON_1_2 | IN6PTON_XDIGIT | IN6PTON_NULL; + int w = 0; + + memset(dbuf, 0, sizeof(dbuf)); + + s = src; + d = dbuf; + if (srclen < 0) + srclen = strlen(src); + + printf("srclen=%d\n", srclen); + + while (1) { + int c; + + c = xdigit2bin(srclen > 0 ? *s : '\0', delim); + if (!(c & state)) + goto out; + if (c & (IN6PTON_DELIM | IN6PTON_COLON_MASK)) { + /* process one 16-bit word */ + if (!(state & IN6PTON_NULL)) { + *d++ = (w >> 8) & 0xff; + *d++ = w & 0xff; + } + w = 0; + if (c & IN6PTON_DELIM) { + /* We've processed last word */ + break; + } + /* + * COLON_1 => XDIGIT + * COLON_2 => XDIGIT|DELIM + * COLON_1_2 => COLON_2 + */ + switch (state & IN6PTON_COLON_MASK) { + case IN6PTON_COLON_2: + dc = d; + state = IN6PTON_XDIGIT | IN6PTON_DELIM; + if (dc - dbuf >= sizeof(dbuf)) + state |= IN6PTON_NULL; + break; + case IN6PTON_COLON_1|IN6PTON_COLON_1_2: + state = IN6PTON_XDIGIT | IN6PTON_COLON_2; + break; + case IN6PTON_COLON_1: + state = IN6PTON_XDIGIT; + break; + case IN6PTON_COLON_1_2: + state = IN6PTON_COLON_2; + break; + default: + state = 0; + } + tok = s + 1; + goto cont; + } + + if (c & IN6PTON_DOT) { + ret = in4_pton(tok ? tok : s, srclen + (int)(s - tok), d, delim, &s); + if (ret > 0) { + d += 4; + break; + } + goto out; + } + + w = (w << 4) | (0xff & c); + state = IN6PTON_COLON_1 | IN6PTON_DELIM; + if (!(w & 0xf000)) { + state |= IN6PTON_XDIGIT; + } + if (!dc && d + 2 < dbuf + sizeof(dbuf)) { + state |= IN6PTON_COLON_1_2; + state &= ~IN6PTON_DELIM; + } + if (d + 2 >= dbuf + sizeof(dbuf)) { + state &= ~(IN6PTON_COLON_1|IN6PTON_COLON_1_2); + } +cont: + if ((dc && d + 4 < dbuf + sizeof(dbuf)) || + d + 4 == dbuf + sizeof(dbuf)) { + state |= IN6PTON_DOT; + } + if (d >= dbuf + sizeof(dbuf)) { + state &= ~(IN6PTON_XDIGIT|IN6PTON_COLON_MASK); + } + s++; + srclen--; + } + + i = 15; d--; + + if (dc) { + while(d >= dc) + dst[i--] = *d--; + while(i >= dc - dbuf) + dst[i--] = 0; + while(i >= 0) + dst[i--] = *d--; + } else + memcpy(dst, dbuf, sizeof(dbuf)); + + ret = 1; +out: + if (end) + *end = s; + return ret; +} + +EXPORT_SYMBOL(in6_pton); -- cgit v0.10.2 From 1884f78c7a8b456c654338e3eb2874a99688ea10 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Mon, 19 Jun 2006 03:20:32 +0900 Subject: [NETFILTER] NF_CONNTRACK_FTP: Use in6_pton() to convert address string. Signed-off-by: YOSHIFUJI Hideaki diff --git a/net/netfilter/nf_conntrack_ftp.c b/net/netfilter/nf_conntrack_ftp.c index 960972d..9dccb40 100644 --- a/net/netfilter/nf_conntrack_ftp.c +++ b/net/netfilter/nf_conntrack_ftp.c @@ -111,101 +111,13 @@ static struct ftp_search { }, }; -/* This code is based on inet_pton() in glibc-2.2.4 */ static int get_ipv6_addr(const char *src, size_t dlen, struct in6_addr *dst, u_int8_t term) { - static const char xdigits[] = "0123456789abcdef"; - u_int8_t tmp[16], *tp, *endp, *colonp; - int ch, saw_xdigit; - u_int32_t val; - size_t clen = 0; - - tp = memset(tmp, '\0', sizeof(tmp)); - endp = tp + sizeof(tmp); - colonp = NULL; - - /* Leading :: requires some special handling. */ - if (*src == ':'){ - if (*++src != ':') { - DEBUGP("invalid \":\" at the head of addr\n"); - return 0; - } - clen++; - } - - saw_xdigit = 0; - val = 0; - while ((clen < dlen) && (*src != term)) { - const char *pch; - - ch = tolower(*src++); - clen++; - - pch = strchr(xdigits, ch); - if (pch != NULL) { - val <<= 4; - val |= (pch - xdigits); - if (val > 0xffff) - return 0; - - saw_xdigit = 1; - continue; - } - if (ch != ':') { - DEBUGP("get_ipv6_addr: invalid char. \'%c\'\n", ch); - return 0; - } - - if (!saw_xdigit) { - if (colonp) { - DEBUGP("invalid location of \"::\".\n"); - return 0; - } - colonp = tp; - continue; - } else if (*src == term) { - DEBUGP("trancated IPv6 addr\n"); - return 0; - } - - if (tp + 2 > endp) - return 0; - *tp++ = (u_int8_t) (val >> 8) & 0xff; - *tp++ = (u_int8_t) val & 0xff; - - saw_xdigit = 0; - val = 0; - continue; - } - if (saw_xdigit) { - if (tp + 2 > endp) - return 0; - *tp++ = (u_int8_t) (val >> 8) & 0xff; - *tp++ = (u_int8_t) val & 0xff; - } - if (colonp != NULL) { - /* - * Since some memmove()'s erroneously fail to handle - * overlapping regions, we'll do the shift by hand. - */ - const int n = tp - colonp; - int i; - - if (tp == endp) - return 0; - - for (i = 1; i <= n; i++) { - endp[- i] = colonp[n - i]; - colonp[n - i] = 0; - } - tp = endp; - } - if (tp != endp || (*src != term)) - return 0; - - memcpy(dst->s6_addr, tmp, sizeof(dst->s6_addr)); - return clen; + int ret = in6_pton(src, min_t(size_t, dlen, 0xffff), dst, term, &end); + if (ret > 0) + return (int)(end - src); + return 0; } static int try_number(const char *data, size_t dlen, u_int32_t array[], -- cgit v0.10.2 From d4f3e9b735c72823aab597bfa4860d184658a609 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Fri, 25 Aug 2006 00:27:09 -0700 Subject: [NET] in6_pton: Kill errant printf statement. Signed-off-by: David S. Miller diff --git a/net/core/utils.c b/net/core/utils.c index 5a06e8a..2682490 100644 --- a/net/core/utils.c +++ b/net/core/utils.c @@ -306,8 +306,6 @@ int in6_pton(const char *src, int srclen, if (srclen < 0) srclen = strlen(src); - printf("srclen=%d\n", srclen); - while (1) { int c; -- cgit v0.10.2 From 298969727e7b855d53f3becfa92c055914082ec4 Mon Sep 17 00:00:00 2001 From: Alexey Dobriyan Date: Fri, 25 Aug 2006 00:37:24 -0700 Subject: [TCP] tcp_lp: use BUILD_BUG_ON Signed-off-by: Alexey Dobriyan Signed-off-by: David S. Miller diff --git a/net/ipv4/tcp_lp.c b/net/ipv4/tcp_lp.c index 649ebae..308fb7e 100644 --- a/net/ipv4/tcp_lp.c +++ b/net/ipv4/tcp_lp.c @@ -327,7 +327,7 @@ static struct tcp_congestion_ops tcp_lp = { static int __init tcp_lp_register(void) { - BUG_ON(sizeof(struct lp) > ICSK_CA_PRIV_SIZE); + BUILD_BUG_ON(sizeof(struct lp) > ICSK_CA_PRIV_SIZE); return tcp_register_congestion_control(&tcp_lp); } -- cgit v0.10.2 From 65e3d72654d9a33cdccd5c19777a5515ae9dd37d Mon Sep 17 00:00:00 2001 From: Alexey Dobriyan Date: Fri, 25 Aug 2006 00:38:03 -0700 Subject: [TCP] tcp_bic: use BUILD_BUG_ON Signed-off-by: Alexey Dobriyan Signed-off-by: David S. Miller diff --git a/net/ipv4/tcp_bic.c b/net/ipv4/tcp_bic.c index b0134ab..5730333 100644 --- a/net/ipv4/tcp_bic.c +++ b/net/ipv4/tcp_bic.c @@ -231,7 +231,7 @@ static struct tcp_congestion_ops bictcp = { static int __init bictcp_register(void) { - BUG_ON(sizeof(struct bictcp) > ICSK_CA_PRIV_SIZE); + BUILD_BUG_ON(sizeof(struct bictcp) > ICSK_CA_PRIV_SIZE); return tcp_register_congestion_control(&bictcp); } -- cgit v0.10.2 From acba48e1a3c95082af1e12c5efaaca3506103a92 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Fri, 25 Aug 2006 15:46:46 -0700 Subject: [XFRM]: Respect priority in policy lookups. Even if we find an exact match in the hash table, we must inspect the inexact list to look for a match with a better priority. Noticed by Masahide NAKAMURA . Signed-off-by: David S. Miller diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index b446ca3..1cf3209 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -908,6 +908,7 @@ static struct xfrm_policy *xfrm_policy_lookup_bytype(u8 type, struct flowi *fl, xfrm_address_t *daddr, *saddr; struct hlist_node *entry; struct hlist_head *chain; + u32 priority = ~0U; daddr = xfrm_flowi_daddr(fl, family); saddr = xfrm_flowi_saddr(fl, family); @@ -919,21 +920,21 @@ static struct xfrm_policy *xfrm_policy_lookup_bytype(u8 type, struct flowi *fl, ret = NULL; hlist_for_each_entry(pol, entry, chain, bydst) { if (xfrm_policy_match(pol, fl, type, family, dir)) { - xfrm_pol_hold(pol); ret = pol; + priority = ret->priority; break; } } - if (!ret) { - chain = &xfrm_policy_inexact[dir]; - hlist_for_each_entry(pol, entry, chain, bydst) { - if (xfrm_policy_match(pol, fl, type, family, dir)) { - xfrm_pol_hold(pol); - ret = pol; - break; - } + chain = &xfrm_policy_inexact[dir]; + hlist_for_each_entry(pol, entry, chain, bydst) { + if (xfrm_policy_match(pol, fl, type, family, dir) && + pol->priority < priority) { + ret = pol; + break; } } + if (ret) + xfrm_pol_hold(ret); read_unlock_bh(&xfrm_policy_lock); return ret; -- cgit v0.10.2 From 6c5eb6a50741b882fd99fbb8178942ca2f74b724 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Fri, 25 Aug 2006 16:04:29 -0700 Subject: [IPV6] ROUTE: Fix FWMARK support. - Add missing nla_policy entry. - type of fwmark is u32, not u8. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c index aebd9e2..b4cd5c0 100644 --- a/net/ipv6/fib6_rules.c +++ b/net/ipv6/fib6_rules.c @@ -27,7 +27,7 @@ struct fib6_rule struct rt6key src; struct rt6key dst; #ifdef CONFIG_IPV6_ROUTE_FWMARK - u8 fwmark; + u32 fwmark; #endif u8 tclass; }; @@ -140,6 +140,7 @@ static struct nla_policy fib6_rule_policy[RTA_MAX+1] __read_mostly = { [FRA_PRIORITY] = { .type = NLA_U32 }, [FRA_SRC] = { .minlen = sizeof(struct in6_addr) }, [FRA_DST] = { .minlen = sizeof(struct in6_addr) }, + [FRA_FWMARK] = { .type = NLA_U32 }, [FRA_TABLE] = { .type = NLA_U32 }, }; -- cgit v0.10.2 From 2613aad5ab28579687519918cdc353af0eed5a3f Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Fri, 25 Aug 2006 16:05:00 -0700 Subject: [IPV6] ROUTE: Fix size of fib6_rule_policy. It should not be RTA_MAX+1 but FRA_MAX+1. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c index b4cd5c0..3d64c71 100644 --- a/net/ipv6/fib6_rules.c +++ b/net/ipv6/fib6_rules.c @@ -135,7 +135,7 @@ static int fib6_rule_match(struct fib_rule *rule, struct flowi *fl, int flags) return 1; } -static struct nla_policy fib6_rule_policy[RTA_MAX+1] __read_mostly = { +static struct nla_policy fib6_rule_policy[FRA_MAX+1] __read_mostly = { [FRA_IFNAME] = { .type = NLA_STRING }, [FRA_PRIORITY] = { .type = NLA_U32 }, [FRA_SRC] = { .minlen = sizeof(struct in6_addr) }, -- cgit v0.10.2 From cd9d742622fbc2190221e0b2aca80596bfd17733 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Fri, 25 Aug 2006 16:05:43 -0700 Subject: [IPV6] ROUTE: Add support for fwmask in routing rules. Add support for fwmark masks. A mask of 0xFFFFFFFF is used when a mark value != 0 is sent without a mask. Based on patch for net/ipv4/fib_rules.c by Patrick McHardy . Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c index 3d64c71..ee4aa43 100644 --- a/net/ipv6/fib6_rules.c +++ b/net/ipv6/fib6_rules.c @@ -28,6 +28,7 @@ struct fib6_rule struct rt6key dst; #ifdef CONFIG_IPV6_ROUTE_FWMARK u32 fwmark; + u32 fwmask; #endif u8 tclass; }; @@ -128,7 +129,7 @@ static int fib6_rule_match(struct fib_rule *rule, struct flowi *fl, int flags) return 0; #ifdef CONFIG_IPV6_ROUTE_FWMARK - if (r->fwmark && (r->fwmark != fl->fl6_fwmark)) + if ((r->fwmark ^ fl->fl6_fwmark) / r->fwmask) return 0; #endif @@ -141,6 +142,7 @@ static struct nla_policy fib6_rule_policy[FRA_MAX+1] __read_mostly = { [FRA_SRC] = { .minlen = sizeof(struct in6_addr) }, [FRA_DST] = { .minlen = sizeof(struct in6_addr) }, [FRA_FWMARK] = { .type = NLA_U32 }, + [FRA_FWMASK] = { .type = NLA_U32 }, [FRA_TABLE] = { .type = NLA_U32 }, }; @@ -174,8 +176,20 @@ static int fib6_rule_configure(struct fib_rule *rule, struct sk_buff *skb, sizeof(struct in6_addr)); #ifdef CONFIG_IPV6_ROUTE_FWMARK - if (tb[FRA_FWMARK]) + if (tb[FRA_FWMARK]) { rule6->fwmark = nla_get_u32(tb[FRA_FWMARK]); + if (rule6->fwmark) { + /* + * if the mark value is non-zero, + * all bits are compared by default + * unless a mask is explicitly specified. + */ + rule6->fwmask = 0xFFFFFFFF; + } + } + + if (tb[FRA_FWMASK]) + rule6->fwmask = nla_get_u32(tb[FRA_FWMASK]); #endif rule6->src.plen = frh->src_len; @@ -212,6 +226,9 @@ static int fib6_rule_compare(struct fib_rule *rule, struct fib_rule_hdr *frh, #ifdef CONFIG_IPV6_ROUTE_FWMARK if (tb[FRA_FWMARK] && (rule6->fwmark != nla_get_u32(tb[FRA_FWMARK]))) return 0; + + if (tb[FRA_FWMASK] && (rule6->fwmask != nla_get_u32(tb[FRA_FWMASK]))) + return 0; #endif return 1; @@ -238,6 +255,9 @@ static int fib6_rule_fill(struct fib_rule *rule, struct sk_buff *skb, #ifdef CONFIG_IPV6_ROUTE_FWMARK if (rule6->fwmark) NLA_PUT_U32(skb, FRA_FWMARK, rule6->fwmark); + + if (rule6->fwmask) + NLA_PUT_U32(skb, FRA_FWMASK, rule6->fwmask); #endif return 0; -- cgit v0.10.2 From 267935b197d2a6e6924f9de2841f0470bfe63acd Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Fri, 25 Aug 2006 16:07:48 -0700 Subject: [IPV6]: Fix build with fwmark disabled. Based upon a patch by Brian Haley. Signed-off-by: David S. Miller diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 649350b..d83844d 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -703,7 +703,9 @@ void ip6_route_input(struct sk_buff *skb) .ip6_u = { .daddr = iph->daddr, .saddr = iph->saddr, +#ifdef CONFIG_IPV6_ROUTE_FWMARK .fwmark = skb->nfmark, +#endif .flowlabel = (* (u32 *) iph)&IPV6_FLOWINFO_MASK, }, }, -- cgit v0.10.2 From bbfb39cbf63829d1db607aa90cbdca557a3a131d Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Fri, 25 Aug 2006 16:10:14 -0700 Subject: [IPV4]: Add support for fwmark masks in routing rules Add a FRA_FWMASK attributes for fwmark masks. For compatibility a mask of 0xFFFFFFFF is used when a mark value != 0 is sent without a mask. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/fib_rules.h b/include/linux/fib_rules.h index 2987549..4418c8d 100644 --- a/include/linux/fib_rules.h +++ b/include/linux/fib_rules.h @@ -34,12 +34,13 @@ enum FRA_UNUSED3, FRA_UNUSED4, FRA_UNUSED5, - FRA_FWMARK, /* netfilter mark (IPv4/IPv6) */ + FRA_FWMARK, /* netfilter mark */ FRA_FLOW, /* flow/class id */ FRA_UNUSED6, FRA_UNUSED7, FRA_UNUSED8, FRA_TABLE, /* Extended table id */ + FRA_FWMASK, /* mask for netfilter mark */ __FRA_MAX }; diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c index ce185ac..280f424 100644 --- a/net/ipv4/fib_rules.c +++ b/net/ipv4/fib_rules.c @@ -46,6 +46,7 @@ struct fib4_rule u32 dstmask; #ifdef CONFIG_IP_ROUTE_FWMARK u32 fwmark; + u32 fwmask; #endif #ifdef CONFIG_NET_CLS_ROUTE u32 tclassid; @@ -160,7 +161,7 @@ static int fib4_rule_match(struct fib_rule *rule, struct flowi *fl, int flags) return 0; #ifdef CONFIG_IP_ROUTE_FWMARK - if (r->fwmark && (r->fwmark != fl->fl4_fwmark)) + if ((r->fwmark ^ fl->fl4_fwmark) & r->fwmask) return 0; #endif @@ -183,6 +184,7 @@ static struct nla_policy fib4_rule_policy[FRA_MAX+1] __read_mostly = { [FRA_SRC] = { .type = NLA_U32 }, [FRA_DST] = { .type = NLA_U32 }, [FRA_FWMARK] = { .type = NLA_U32 }, + [FRA_FWMASK] = { .type = NLA_U32 }, [FRA_FLOW] = { .type = NLA_U32 }, [FRA_TABLE] = { .type = NLA_U32 }, }; @@ -219,8 +221,17 @@ static int fib4_rule_configure(struct fib_rule *rule, struct sk_buff *skb, rule4->dst = nla_get_u32(tb[FRA_DST]); #ifdef CONFIG_IP_ROUTE_FWMARK - if (tb[FRA_FWMARK]) + if (tb[FRA_FWMARK]) { rule4->fwmark = nla_get_u32(tb[FRA_FWMARK]); + if (rule4->fwmark) + /* compatibility: if the mark value is non-zero all bits + * are compared unless a mask is explicitly specified. + */ + rule4->fwmask = 0xFFFFFFFF; + } + + if (tb[FRA_FWMASK]) + rule4->fwmask = nla_get_u32(tb[FRA_FWMASK]); #endif #ifdef CONFIG_NET_CLS_ROUTE @@ -256,6 +267,9 @@ static int fib4_rule_compare(struct fib_rule *rule, struct fib_rule_hdr *frh, #ifdef CONFIG_IP_ROUTE_FWMARK if (tb[FRA_FWMARK] && (rule4->fwmark != nla_get_u32(tb[FRA_FWMARK]))) return 0; + + if (tb[FRA_FWMASK] && (rule4->fwmask != nla_get_u32(tb[FRA_FWMASK]))) + return 0; #endif #ifdef CONFIG_NET_CLS_ROUTE @@ -285,6 +299,9 @@ static int fib4_rule_fill(struct fib_rule *rule, struct sk_buff *skb, #ifdef CONFIG_IP_ROUTE_FWMARK if (rule4->fwmark) NLA_PUT_U32(skb, FRA_FWMARK, rule4->fwmark); + + if (rule4->fwmask || rule4->fwmark) + NLA_PUT_U32(skb, FRA_FWMASK, rule4->fwmask); #endif if (rule4->dst_len) -- cgit v0.10.2 From 88e91f290307d22ae88302e3a24f0c36905e8a6c Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Fri, 25 Aug 2006 16:11:08 -0700 Subject: [DECNET]: Add support for fwmark masks in routing rules Add support for fwmark masks. For compatibility a mask of 0xFFFFFFFF is used when a mark value != 0 is sent without a mask. Signed-off-by: Patrick McHardy Acked-by: Steven Whitehouse Signed-off-by: David S. Miller diff --git a/net/decnet/dn_rules.c b/net/decnet/dn_rules.c index 50e819e..63ad63d 100644 --- a/net/decnet/dn_rules.c +++ b/net/decnet/dn_rules.c @@ -47,6 +47,7 @@ struct dn_fib_rule u8 flags; #ifdef CONFIG_DECNET_ROUTE_FWMARK u32 fwmark; + u32 fwmask; #endif }; @@ -116,6 +117,7 @@ static struct nla_policy dn_fib_rule_policy[FRA_MAX+1] __read_mostly = { [FRA_SRC] = { .type = NLA_U16 }, [FRA_DST] = { .type = NLA_U16 }, [FRA_FWMARK] = { .type = NLA_U32 }, + [FRA_FWMASK] = { .type = NLA_U32 }, [FRA_TABLE] = { .type = NLA_U32 }, }; @@ -130,7 +132,7 @@ static int dn_fib_rule_match(struct fib_rule *rule, struct flowi *fl, int flags) return 0; #ifdef CONFIG_DECNET_ROUTE_FWMARK - if (r->fwmark && (r->fwmark != fl->fld_fwmark)) + if ((r->fwmark ^ fl->fld_fwmark) & r->fwmask) return 0; #endif @@ -168,8 +170,17 @@ static int dn_fib_rule_configure(struct fib_rule *rule, struct sk_buff *skb, r->dst = nla_get_u16(tb[FRA_DST]); #ifdef CONFIG_DECNET_ROUTE_FWMARK - if (tb[FRA_FWMARK]) + if (tb[FRA_FWMARK]) { r->fwmark = nla_get_u32(tb[FRA_FWMARK]); + if (r->fwmark) + /* compatibility: if the mark value is non-zero all bits + * are compared unless a mask is explicitly specified. + */ + r->fwmask = 0xFFFFFFFF; + } + + if (tb[FRA_FWMASK]) + r->fwmask = nla_get_u32(tb[FRA_FWMASK]); #endif r->src_len = frh->src_len; @@ -195,6 +206,9 @@ static int dn_fib_rule_compare(struct fib_rule *rule, struct fib_rule_hdr *frh, #ifdef CONFIG_DECNET_ROUTE_FWMARK if (tb[FRA_FWMARK] && (r->fwmark != nla_get_u32(tb[FRA_FWMARK]))) return 0; + + if (tb[FRA_FWMASK] && (r->fwmask != nla_get_u32(tb[FRA_FWMASK]))) + return 0; #endif if (tb[FRA_SRC] && (r->src != nla_get_u16(tb[FRA_SRC]))) @@ -237,6 +251,8 @@ static int dn_fib_rule_fill(struct fib_rule *rule, struct sk_buff *skb, #ifdef CONFIG_DECNET_ROUTE_FWMARK if (r->fwmark) NLA_PUT_U32(skb, FRA_FWMARK, r->fwmark); + if (r->fwmask || r->fwmark) + NLA_PUT_U32(skb, FRA_FWMASK, r->fwmask); #endif if (r->dst_len) NLA_PUT_U16(skb, FRA_DST, r->dst); -- cgit v0.10.2 From b4e9b520ca5d07a37ea59648e7f50f478e7487a3 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Fri, 25 Aug 2006 16:11:42 -0700 Subject: [NET_SCHED]: Add mask support to fwmark classifier Support masking the nfmark value before the search. The mask value is global for all filters contained in one instance. It can only be set when a new instance is created, all filters must specify the same mask. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/pkt_cls.h b/include/linux/pkt_cls.h index bd2c5a2..c3f01b3 100644 --- a/include/linux/pkt_cls.h +++ b/include/linux/pkt_cls.h @@ -305,6 +305,7 @@ enum TCA_FW_POLICE, TCA_FW_INDEV, /* used by CONFIG_NET_CLS_IND */ TCA_FW_ACT, /* used by CONFIG_NET_CLS_ACT */ + TCA_FW_MASK, __TCA_FW_MAX }; diff --git a/net/sched/cls_fw.c b/net/sched/cls_fw.c index e6973d9..e54acc6 100644 --- a/net/sched/cls_fw.c +++ b/net/sched/cls_fw.c @@ -50,6 +50,7 @@ struct fw_head { struct fw_filter *ht[HTSIZE]; + u32 mask; }; struct fw_filter @@ -101,7 +102,7 @@ static int fw_classify(struct sk_buff *skb, struct tcf_proto *tp, struct fw_filter *f; int r; #ifdef CONFIG_NETFILTER - u32 id = skb->nfmark; + u32 id = skb->nfmark & head->mask; #else u32 id = 0; #endif @@ -209,7 +210,9 @@ static int fw_change_attrs(struct tcf_proto *tp, struct fw_filter *f, struct rtattr **tb, struct rtattr **tca, unsigned long base) { + struct fw_head *head = (struct fw_head *)tp->root; struct tcf_exts e; + u32 mask; int err; err = tcf_exts_validate(tp, tb, tca[TCA_RATE-1], &e, &fw_ext_map); @@ -232,6 +235,15 @@ fw_change_attrs(struct tcf_proto *tp, struct fw_filter *f, } #endif /* CONFIG_NET_CLS_IND */ + if (tb[TCA_FW_MASK-1]) { + if (RTA_PAYLOAD(tb[TCA_FW_MASK-1]) != sizeof(u32)) + goto errout; + mask = *(u32*)RTA_DATA(tb[TCA_FW_MASK-1]); + if (mask != head->mask) + goto errout; + } else if (head->mask != 0xFFFFFFFF) + goto errout; + tcf_exts_change(tp, &f->exts, &e); return 0; @@ -267,9 +279,17 @@ static int fw_change(struct tcf_proto *tp, unsigned long base, return -EINVAL; if (head == NULL) { + u32 mask = 0xFFFFFFFF; + if (tb[TCA_FW_MASK-1]) { + if (RTA_PAYLOAD(tb[TCA_FW_MASK-1]) != sizeof(u32)) + return -EINVAL; + mask = *(u32*)RTA_DATA(tb[TCA_FW_MASK-1]); + } + head = kzalloc(sizeof(struct fw_head), GFP_KERNEL); if (head == NULL) return -ENOBUFS; + head->mask = mask; tcf_tree_lock(tp); tp->root = head; @@ -330,6 +350,7 @@ static void fw_walk(struct tcf_proto *tp, struct tcf_walker *arg) static int fw_dump(struct tcf_proto *tp, unsigned long fh, struct sk_buff *skb, struct tcmsg *t) { + struct fw_head *head = (struct fw_head *)tp->root; struct fw_filter *f = (struct fw_filter*)fh; unsigned char *b = skb->tail; struct rtattr *rta; @@ -351,6 +372,8 @@ static int fw_dump(struct tcf_proto *tp, unsigned long fh, if (strlen(f->indev)) RTA_PUT(skb, TCA_FW_INDEV, IFNAMSIZ, f->indev); #endif /* CONFIG_NET_CLS_IND */ + if (head->mask != 0xFFFFFFFF) + RTA_PUT(skb, TCA_FW_MASK, 4, &head->mask); if (tcf_exts_dump(skb, &f->exts, &fw_ext_map) < 0) goto rtattr_failure; -- cgit v0.10.2 From 74975d40b16fd4bad24a2e2630dc7957d8cba013 Mon Sep 17 00:00:00 2001 From: Alexey Dobriyan Date: Fri, 25 Aug 2006 17:10:33 -0700 Subject: [TCP] Congestion control (modulo lp, bic): use BUILD_BUG_ON Signed-off-by: Alexey Dobriyan Signed-off-by: David S. Miller diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c index 2be2798..a60ef38 100644 --- a/net/ipv4/tcp_cubic.c +++ b/net/ipv4/tcp_cubic.c @@ -358,7 +358,7 @@ static struct tcp_congestion_ops cubictcp = { static int __init cubictcp_register(void) { - BUG_ON(sizeof(struct bictcp) > ICSK_CA_PRIV_SIZE); + BUILD_BUG_ON(sizeof(struct bictcp) > ICSK_CA_PRIV_SIZE); /* Precompute a bunch of the scaling factors that are used per-packet * based on SRTT of 100ms diff --git a/net/ipv4/tcp_highspeed.c b/net/ipv4/tcp_highspeed.c index fa3e1aa..c4fc811b 100644 --- a/net/ipv4/tcp_highspeed.c +++ b/net/ipv4/tcp_highspeed.c @@ -189,7 +189,7 @@ static struct tcp_congestion_ops tcp_highspeed = { static int __init hstcp_register(void) { - BUG_ON(sizeof(struct hstcp) > ICSK_CA_PRIV_SIZE); + BUILD_BUG_ON(sizeof(struct hstcp) > ICSK_CA_PRIV_SIZE); return tcp_register_congestion_control(&tcp_highspeed); } diff --git a/net/ipv4/tcp_htcp.c b/net/ipv4/tcp_htcp.c index 6edfe5e..682e7d5 100644 --- a/net/ipv4/tcp_htcp.c +++ b/net/ipv4/tcp_htcp.c @@ -286,7 +286,7 @@ static struct tcp_congestion_ops htcp = { static int __init htcp_register(void) { - BUG_ON(sizeof(struct htcp) > ICSK_CA_PRIV_SIZE); + BUILD_BUG_ON(sizeof(struct htcp) > ICSK_CA_PRIV_SIZE); BUILD_BUG_ON(BETA_MIN >= BETA_MAX); return tcp_register_congestion_control(&htcp); } diff --git a/net/ipv4/tcp_hybla.c b/net/ipv4/tcp_hybla.c index 7406e0c..59e691d 100644 --- a/net/ipv4/tcp_hybla.c +++ b/net/ipv4/tcp_hybla.c @@ -170,7 +170,7 @@ static struct tcp_congestion_ops tcp_hybla = { static int __init hybla_register(void) { - BUG_ON(sizeof(struct hybla) > ICSK_CA_PRIV_SIZE); + BUILD_BUG_ON(sizeof(struct hybla) > ICSK_CA_PRIV_SIZE); return tcp_register_congestion_control(&tcp_hybla); } diff --git a/net/ipv4/tcp_vegas.c b/net/ipv4/tcp_vegas.c index 490360b..a3b7aa0 100644 --- a/net/ipv4/tcp_vegas.c +++ b/net/ipv4/tcp_vegas.c @@ -370,7 +370,7 @@ static struct tcp_congestion_ops tcp_vegas = { static int __init tcp_vegas_register(void) { - BUG_ON(sizeof(struct vegas) > ICSK_CA_PRIV_SIZE); + BUILD_BUG_ON(sizeof(struct vegas) > ICSK_CA_PRIV_SIZE); tcp_register_congestion_control(&tcp_vegas); return 0; } diff --git a/net/ipv4/tcp_veno.c b/net/ipv4/tcp_veno.c index 5b2fe6d..ce57bf3 100644 --- a/net/ipv4/tcp_veno.c +++ b/net/ipv4/tcp_veno.c @@ -212,7 +212,7 @@ static struct tcp_congestion_ops tcp_veno = { static int __init tcp_veno_register(void) { - BUG_ON(sizeof(struct veno) > ICSK_CA_PRIV_SIZE); + BUILD_BUG_ON(sizeof(struct veno) > ICSK_CA_PRIV_SIZE); tcp_register_congestion_control(&tcp_veno); return 0; } diff --git a/net/ipv4/tcp_westwood.c b/net/ipv4/tcp_westwood.c index 5446312..4f42a86 100644 --- a/net/ipv4/tcp_westwood.c +++ b/net/ipv4/tcp_westwood.c @@ -289,7 +289,7 @@ static struct tcp_congestion_ops tcp_westwood = { static int __init tcp_westwood_register(void) { - BUG_ON(sizeof(struct westwood) > ICSK_CA_PRIV_SIZE); + BUILD_BUG_ON(sizeof(struct westwood) > ICSK_CA_PRIV_SIZE); return tcp_register_congestion_control(&tcp_westwood); } -- cgit v0.10.2 From 366e4adc0f9ef33f56c62f980a7d83775e64abd0 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Sat, 26 Aug 2006 16:50:20 -0700 Subject: [IPV6]: Fix routing by fwmark Fix mark comparison, also dump the mask to userspace when the mask is zero, but the mark is not (in which case the mark is dumped, so the mask is needed to make sense of it). Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c index ee4aa43..2fbc71d 100644 --- a/net/ipv6/fib6_rules.c +++ b/net/ipv6/fib6_rules.c @@ -129,7 +129,7 @@ static int fib6_rule_match(struct fib_rule *rule, struct flowi *fl, int flags) return 0; #ifdef CONFIG_IPV6_ROUTE_FWMARK - if ((r->fwmark ^ fl->fl6_fwmark) / r->fwmask) + if ((r->fwmark ^ fl->fl6_fwmark) & r->fwmask) return 0; #endif @@ -256,7 +256,7 @@ static int fib6_rule_fill(struct fib_rule *rule, struct sk_buff *skb, if (rule6->fwmark) NLA_PUT_U32(skb, FRA_FWMARK, rule6->fwmark); - if (rule6->fwmask) + if (rule6->fwmask || rule6->fwmark) NLA_PUT_U32(skb, FRA_FWMASK, rule6->fwmask); #endif -- cgit v0.10.2 From ef047f5e1085d6393748d1ee27d6327905f098dc Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Fri, 1 Sep 2006 00:29:06 -0700 Subject: [NET]: Use BUILD_BUG_ON() for checking size of skb->cb. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index f2e8927..fdd89e3 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -1254,10 +1254,7 @@ static int __init inet_init(void) struct list_head *r; int rc = -EINVAL; - if (sizeof(struct inet_skb_parm) > sizeof(dummy_skb->cb)) { - printk(KERN_CRIT "%s: panic\n", __FUNCTION__); - goto out; - } + BUILD_BUG_ON(sizeof(struct inet_skb_parm) > sizeof(dummy_skb->cb)); rc = proto_register(&tcp_prot, 1); if (rc) diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index fc9c8a9..bf6e8af 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -761,6 +761,8 @@ static int __init inet6_init(void) struct list_head *r; int err; + BUILD_BUG_ON(sizeof(struct inet6_skb_parm) > sizeof(dummy_skb->cb)); + #ifdef MODULE #if 0 /* FIXME --RR */ if (!mod_member_present(&__this_module, can_unload)) @@ -770,11 +772,6 @@ static int __init inet6_init(void) #endif #endif - if (sizeof(struct inet6_skb_parm) > sizeof(dummy_skb->cb)) { - printk(KERN_CRIT "inet6_proto_init: size fault\n"); - return -EINVAL; - } - err = proto_register(&tcpv6_prot, 1); if (err) goto out; diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c index a80e445..d56e0d2 100644 --- a/net/netlink/af_netlink.c +++ b/net/netlink/af_netlink.c @@ -1762,8 +1762,6 @@ static struct net_proto_family netlink_family_ops = { .owner = THIS_MODULE, /* for consistency 8) */ }; -extern void netlink_skb_parms_too_large(void); - static int __init netlink_proto_init(void) { struct sk_buff *dummy_skb; @@ -1775,8 +1773,7 @@ static int __init netlink_proto_init(void) if (err != 0) goto out; - if (sizeof(struct netlink_skb_parms) > sizeof(dummy_skb->cb)) - netlink_skb_parms_too_large(); + BUILD_BUG_ON(sizeof(struct netlink_skb_parms) > sizeof(dummy_skb->cb)); nl_table = kcalloc(MAX_LINKS, sizeof(*nl_table), GFP_KERNEL); if (!nl_table) diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index de6ec51..7c91c20 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2060,10 +2060,7 @@ static int __init af_unix_init(void) int rc = -1; struct sk_buff *dummy_skb; - if (sizeof(struct unix_skb_parms) > sizeof(dummy_skb->cb)) { - printk(KERN_CRIT "%s: panic\n", __FUNCTION__); - goto out; - } + BUILD_BUG_ON(sizeof(struct unix_skb_parms) > sizeof(dummy_skb->cb)); rc = proto_register(&unix_proto, 1); if (rc != 0) { -- cgit v0.10.2 From 2a0109a707d2b0ae48f124d3be0fdf1715c0107a Mon Sep 17 00:00:00 2001 From: Ian McDonald Date: Sat, 26 Aug 2006 19:15:35 -0700 Subject: [DCCP]: Shift sysctls into feat.h This shifts further sysctls into feat.h. No change in functionality - shifting code only. Signed off by: Ian McDonald Signed-off-by: David S. Miller diff --git a/net/dccp/feat.h b/net/dccp/feat.h index b44c4550..cee553d 100644 --- a/net/dccp/feat.h +++ b/net/dccp/feat.h @@ -27,5 +27,10 @@ extern int dccp_feat_clone(struct sock *oldsk, struct sock *newsk); extern int dccp_feat_init(struct dccp_minisock *dmsk); extern int dccp_feat_default_sequence_window; +extern int dccp_feat_default_rx_ccid; +extern int dccp_feat_default_tx_ccid; +extern int dccp_feat_default_ack_ratio; +extern int dccp_feat_default_send_ack_vector; +extern int dccp_feat_default_send_ndp_count; #endif /* _DCCP_FEAT_H */ diff --git a/net/dccp/sysctl.c b/net/dccp/sysctl.c index c1ba945..38bc157 100644 --- a/net/dccp/sysctl.c +++ b/net/dccp/sysctl.c @@ -11,18 +11,12 @@ #include #include +#include "feat.h" #ifndef CONFIG_SYSCTL #error This file should not be compiled without CONFIG_SYSCTL defined #endif -extern int dccp_feat_default_sequence_window; -extern int dccp_feat_default_rx_ccid; -extern int dccp_feat_default_tx_ccid; -extern int dccp_feat_default_ack_ratio; -extern int dccp_feat_default_send_ack_vector; -extern int dccp_feat_default_send_ndp_count; - static struct ctl_table dccp_default_table[] = { { .ctl_name = NET_DCCP_DEFAULT_SEQ_WINDOW, -- cgit v0.10.2 From 97e5848dd39e7e76bd6077735ebb5473763ab9c5 Mon Sep 17 00:00:00 2001 From: Ian McDonald Date: Sat, 26 Aug 2006 19:16:45 -0700 Subject: [DCCP]: Introduce tx buffering This adds transmit buffering to DCCP. I have tested with CCID2/3 and with loss and rate limiting. Signed off by: Ian McDonald Signed-off-by: David S. Miller diff --git a/include/linux/dccp.h b/include/linux/dccp.h index 676333b..2d7671c 100644 --- a/include/linux/dccp.h +++ b/include/linux/dccp.h @@ -438,6 +438,7 @@ struct dccp_ackvec; * @dccps_role - Role of this sock, one of %dccp_role * @dccps_ndp_count - number of Non Data Packets since last data packet * @dccps_hc_rx_ackvec - rx half connection ack vector + * @dccps_xmit_timer - timer for when CCID is not ready to send */ struct dccp_sock { /* inet_connection_sock has to be the first member of dccp_sock */ @@ -470,6 +471,7 @@ struct dccp_sock { enum dccp_role dccps_role:2; __u8 dccps_hc_rx_insert_options:1; __u8 dccps_hc_tx_insert_options:1; + struct timer_list dccps_xmit_timer; }; static inline struct dccp_sock *dccp_sk(const struct sock *sk) diff --git a/net/dccp/dccp.h b/net/dccp/dccp.h index a5c5475..0a21be4 100644 --- a/net/dccp/dccp.h +++ b/net/dccp/dccp.h @@ -130,7 +130,7 @@ extern void dccp_send_delayed_ack(struct sock *sk); extern void dccp_send_sync(struct sock *sk, const u64 seq, const enum dccp_pkt_type pkt_type); -extern int dccp_write_xmit(struct sock *sk, struct sk_buff *skb, long *timeo); +extern void dccp_write_xmit(struct sock *sk, int block); extern void dccp_write_space(struct sock *sk); extern void dccp_init_xmit_timers(struct sock *sk); diff --git a/net/dccp/output.c b/net/dccp/output.c index 58669be..7102e3a 100644 --- a/net/dccp/output.c +++ b/net/dccp/output.c @@ -198,7 +198,7 @@ static int dccp_wait_for_ccid(struct sock *sk, struct sk_buff *skb, while (1) { prepare_to_wait(sk->sk_sleep, &wait, TASK_INTERRUPTIBLE); - if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN)) + if (sk->sk_err) goto do_error; if (!*timeo) goto do_nonblock; @@ -234,37 +234,72 @@ do_interrupted: goto out; } -int dccp_write_xmit(struct sock *sk, struct sk_buff *skb, long *timeo) +static void dccp_write_xmit_timer(unsigned long data) { + struct sock *sk = (struct sock *)data; + struct dccp_sock *dp = dccp_sk(sk); + + bh_lock_sock(sk); + if (sock_owned_by_user(sk)) + sk_reset_timer(sk, &dp->dccps_xmit_timer, jiffies+1); + else + dccp_write_xmit(sk, 0); + bh_unlock_sock(sk); + sock_put(sk); +} + +void dccp_write_xmit(struct sock *sk, int block) { - const struct dccp_sock *dp = dccp_sk(sk); - int err = ccid_hc_tx_send_packet(dp->dccps_hc_tx_ccid, sk, skb, + struct dccp_sock *dp = dccp_sk(sk); + struct sk_buff *skb; + long timeo = 30000; /* If a packet is taking longer than 2 secs + we have other issues */ + + while ((skb = skb_peek(&sk->sk_write_queue))) { + int err = ccid_hc_tx_send_packet(dp->dccps_hc_tx_ccid, sk, skb, skb->len); - if (err > 0) - err = dccp_wait_for_ccid(sk, skb, timeo); + if (err > 0) { + if (!block) { + sk_reset_timer(sk, &dp->dccps_xmit_timer, + msecs_to_jiffies(err)+jiffies); + break; + } else + err = dccp_wait_for_ccid(sk, skb, &timeo); + if (err) { + printk(KERN_CRIT "%s:err at dccp_wait_for_ccid" + " %d\n", __FUNCTION__, err); + dump_stack(); + } + } - if (err == 0) { - struct dccp_skb_cb *dcb = DCCP_SKB_CB(skb); - const int len = skb->len; + skb_dequeue(&sk->sk_write_queue); + if (err == 0) { + struct dccp_skb_cb *dcb = DCCP_SKB_CB(skb); + const int len = skb->len; - if (sk->sk_state == DCCP_PARTOPEN) { - /* See 8.1.5. Handshake Completion */ - inet_csk_schedule_ack(sk); - inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK, + if (sk->sk_state == DCCP_PARTOPEN) { + /* See 8.1.5. Handshake Completion */ + inet_csk_schedule_ack(sk); + inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK, inet_csk(sk)->icsk_rto, DCCP_RTO_MAX); - dcb->dccpd_type = DCCP_PKT_DATAACK; - } else if (dccp_ack_pending(sk)) - dcb->dccpd_type = DCCP_PKT_DATAACK; - else - dcb->dccpd_type = DCCP_PKT_DATA; - - err = dccp_transmit_skb(sk, skb); - ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, 0, len); - } else - kfree_skb(skb); - - return err; + dcb->dccpd_type = DCCP_PKT_DATAACK; + } else if (dccp_ack_pending(sk)) + dcb->dccpd_type = DCCP_PKT_DATAACK; + else + dcb->dccpd_type = DCCP_PKT_DATA; + + err = dccp_transmit_skb(sk, skb); + ccid_hc_tx_packet_sent(dp->dccps_hc_tx_ccid, sk, 0, len); + if (err) { + printk(KERN_CRIT "%s:err from " + "ccid_hc_tx_packet_sent %d\n", + __FUNCTION__, err); + dump_stack(); + } + } else + kfree(skb); + } } int dccp_retransmit_skb(struct sock *sk, struct sk_buff *skb) @@ -426,6 +461,9 @@ static inline void dccp_connect_init(struct sock *sk) dccp_set_seqno(&dp->dccps_awl, max48(dp->dccps_awl, dp->dccps_iss)); icsk->icsk_retransmits = 0; + init_timer(&dp->dccps_xmit_timer); + dp->dccps_xmit_timer.data = (unsigned long)sk; + dp->dccps_xmit_timer.function = dccp_write_xmit_timer; } int dccp_connect(struct sock *sk) @@ -560,8 +598,10 @@ void dccp_send_close(struct sock *sk, const int active) DCCP_PKT_CLOSE : DCCP_PKT_CLOSEREQ; if (active) { + dccp_write_xmit(sk, 1); dccp_skb_entail(sk, skb); dccp_transmit_skb(sk, skb_clone(skb, prio)); + /* FIXME do we need a retransmit timer here? */ } else dccp_transmit_skb(sk, skb); } diff --git a/net/dccp/proto.c b/net/dccp/proto.c index 6f14bb5..962df0e 100644 --- a/net/dccp/proto.c +++ b/net/dccp/proto.c @@ -662,17 +662,8 @@ int dccp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg, if (rc != 0) goto out_discard; - rc = dccp_write_xmit(sk, skb, &timeo); - /* - * XXX we don't use sk_write_queue, so just discard the packet. - * Current plan however is to _use_ sk_write_queue with - * an algorith similar to tcp_sendmsg, where the main difference - * is that in DCCP we have to respect packet boundaries, so - * no coalescing of skbs. - * - * This bug was _quickly_ found & fixed by just looking at an OSTRA - * generated callgraph 8) -acme - */ + skb_queue_tail(&sk->sk_write_queue, skb); + dccp_write_xmit(sk,0); out_release: release_sock(sk); return rc ? : len; @@ -846,6 +837,7 @@ static int dccp_close_state(struct sock *sk) void dccp_close(struct sock *sk, long timeout) { + struct dccp_sock *dp = dccp_sk(sk); struct sk_buff *skb; int state; @@ -862,6 +854,8 @@ void dccp_close(struct sock *sk, long timeout) goto adjudge_to_death; } + sk_stop_timer(sk, &dp->dccps_xmit_timer); + /* * We need to flush the recv. buffs. We do this only on the * descriptor close, not protocol-sourced closes, because the -- cgit v0.10.2 From ff5dfe736dd9f6c74b206aa77c0465dfd503bdb9 Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Sat, 26 Aug 2006 19:17:53 -0700 Subject: [NETLINK]: remove third bogus argument from NLA_PUT_FLAG This patch removes the 'value' argument from NLA_PUT_FLAG which is unused anyway. The documentation comment was already correct so it doesn't need an update :) Signed-off-by: Johannes Berg Signed-off-by: David S. Miller diff --git a/include/net/netlink.h b/include/net/netlink.h index 47044da..bcb27e3 100644 --- a/include/net/netlink.h +++ b/include/net/netlink.h @@ -828,7 +828,7 @@ static inline int nla_put_msecs(struct sk_buff *skb, int attrtype, #define NLA_PUT_STRING(skb, attrtype, value) \ NLA_PUT(skb, attrtype, strlen(value) + 1, value) -#define NLA_PUT_FLAG(skb, attrtype, value) \ +#define NLA_PUT_FLAG(skb, attrtype) \ NLA_PUT(skb, attrtype, 0, NULL) #define NLA_PUT_MSECS(skb, attrtype, jiffies) \ -- cgit v0.10.2 From e5d679f33900c71d1a76ba07c5b04055abd34480 Mon Sep 17 00:00:00 2001 From: Alexey Dobriyan Date: Sat, 26 Aug 2006 19:25:52 -0700 Subject: [NET]: Use SLAB_PANIC Signed-off-by: Alexey Dobriyan Signed-off-by: David S. Miller diff --git a/net/core/flow.c b/net/core/flow.c index 6452411..f23e7e3 100644 --- a/net/core/flow.c +++ b/net/core/flow.c @@ -343,12 +343,8 @@ static int __init flow_cache_init(void) flow_cachep = kmem_cache_create("flow_cache", sizeof(struct flow_cache_entry), - 0, SLAB_HWCACHE_ALIGN, + 0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL, NULL); - - if (!flow_cachep) - panic("NET: failed to allocate flow cache slab\n"); - flow_hash_shift = 10; flow_lwm = 2 * flow_hash_size; flow_hwm = 4 * flow_hash_size; diff --git a/net/core/neighbour.c b/net/core/neighbour.c index c0a2740..a45bd21 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -1339,14 +1339,10 @@ void neigh_table_init_no_netlink(struct neigh_table *tbl) neigh_rand_reach_time(tbl->parms.base_reachable_time); if (!tbl->kmem_cachep) - tbl->kmem_cachep = kmem_cache_create(tbl->id, - tbl->entry_size, - 0, SLAB_HWCACHE_ALIGN, - NULL, NULL); - - if (!tbl->kmem_cachep) - panic("cannot create neighbour cache"); - + tbl->kmem_cachep = + kmem_cache_create(tbl->id, tbl->entry_size, 0, + SLAB_HWCACHE_ALIGN|SLAB_PANIC, + NULL, NULL); tbl->stats = alloc_percpu(struct neigh_statistics); if (!tbl->stats) panic("cannot create neighbour cache statistics"); diff --git a/net/core/skbuff.c b/net/core/skbuff.c index 8a476f1..c448c7f 100644 --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -2046,19 +2046,14 @@ void __init skb_init(void) skbuff_head_cache = kmem_cache_create("skbuff_head_cache", sizeof(struct sk_buff), 0, - SLAB_HWCACHE_ALIGN, + SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL, NULL); - if (!skbuff_head_cache) - panic("cannot create skbuff cache"); - skbuff_fclone_cache = kmem_cache_create("skbuff_fclone_cache", (2*sizeof(struct sk_buff)) + sizeof(atomic_t), 0, - SLAB_HWCACHE_ALIGN, + SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL, NULL); - if (!skbuff_fclone_cache) - panic("cannot create skbuff cache"); } EXPORT_SYMBOL(___pskb_trim); diff --git a/net/decnet/dn_route.c b/net/decnet/dn_route.c index c5daf35..dd0761e 100644 --- a/net/decnet/dn_route.c +++ b/net/decnet/dn_route.c @@ -1781,14 +1781,9 @@ void __init dn_route_init(void) { int i, goal, order; - dn_dst_ops.kmem_cachep = kmem_cache_create("dn_dst_cache", - sizeof(struct dn_route), - 0, SLAB_HWCACHE_ALIGN, - NULL, NULL); - - if (!dn_dst_ops.kmem_cachep) - panic("DECnet: Failed to allocate dn_dst_cache\n"); - + dn_dst_ops.kmem_cachep = + kmem_cache_create("dn_dst_cache", sizeof(struct dn_route), 0, + SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL, NULL); init_timer(&dn_route_timer); dn_route_timer.function = dn_dst_check_expire; dn_route_timer.expires = jiffies + decnet_dst_gc_interval * HZ; diff --git a/net/ipv4/inetpeer.c b/net/ipv4/inetpeer.c index 03ff62e..a675602 100644 --- a/net/ipv4/inetpeer.c +++ b/net/ipv4/inetpeer.c @@ -126,12 +126,9 @@ void __init inet_initpeers(void) peer_cachep = kmem_cache_create("inet_peer_cache", sizeof(struct inet_peer), - 0, SLAB_HWCACHE_ALIGN, + 0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL, NULL); - if (!peer_cachep) - panic("cannot create inet_peer_cache"); - /* All the timers, started at system startup tend to synchronize. Perturb it a bit. */ diff --git a/net/ipv4/ipmr.c b/net/ipv4/ipmr.c index 98f0aa0..ba49588 100644 --- a/net/ipv4/ipmr.c +++ b/net/ipv4/ipmr.c @@ -1900,11 +1900,8 @@ void __init ip_mr_init(void) { mrt_cachep = kmem_cache_create("ip_mrt_cache", sizeof(struct mfc_cache), - 0, SLAB_HWCACHE_ALIGN, + 0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL, NULL); - if (!mrt_cachep) - panic("cannot allocate ip_mrt_cache"); - init_timer(&ipmr_expire_timer); ipmr_expire_timer.function=ipmr_expire_process; register_netdevice_notifier(&ip_mr_notifier); diff --git a/net/ipv4/route.c b/net/ipv4/route.c index a4d4cb8..20ffe8e 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -3147,13 +3147,9 @@ int __init ip_rt_init(void) } #endif - ipv4_dst_ops.kmem_cachep = kmem_cache_create("ip_dst_cache", - sizeof(struct rtable), - 0, SLAB_HWCACHE_ALIGN, - NULL, NULL); - - if (!ipv4_dst_ops.kmem_cachep) - panic("IP: failed to allocate ip_dst_cache\n"); + ipv4_dst_ops.kmem_cachep = + kmem_cache_create("ip_dst_cache", sizeof(struct rtable), 0, + SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL, NULL); rt_hash_table = (struct rt_hash_bucket *) alloc_large_system_hash("IP route cache", diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index e570db4..29e3d60 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2254,9 +2254,7 @@ void __init tcp_init(void) tcp_hashinfo.bind_bucket_cachep = kmem_cache_create("tcp_bind_bucket", sizeof(struct inet_bind_bucket), 0, - SLAB_HWCACHE_ALIGN, NULL, NULL); - if (!tcp_hashinfo.bind_bucket_cachep) - panic("tcp_init: Cannot alloc tcp_bind_bucket cache."); + SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL, NULL); /* Size and allocate the main established and bind bucket * hash tables. diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c index fbca609..8fcae7a 100644 --- a/net/ipv6/ip6_fib.c +++ b/net/ipv6/ip6_fib.c @@ -1472,10 +1472,8 @@ void __init fib6_init(void) { fib6_node_kmem = kmem_cache_create("fib6_nodes", sizeof(struct fib6_node), - 0, SLAB_HWCACHE_ALIGN, + 0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL, NULL); - if (!fib6_node_kmem) - panic("cannot create fib6_nodes cache"); fib6_tables_init(); } diff --git a/net/ipv6/route.c b/net/ipv6/route.c index d83844d..ba1b3d1 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -2419,13 +2419,9 @@ void __init ip6_route_init(void) { struct proc_dir_entry *p; - ip6_dst_ops.kmem_cachep = kmem_cache_create("ip6_dst_cache", - sizeof(struct rt6_info), - 0, SLAB_HWCACHE_ALIGN, - NULL, NULL); - if (!ip6_dst_ops.kmem_cachep) - panic("cannot create ip6_dst_cache"); - + ip6_dst_ops.kmem_cachep = + kmem_cache_create("ip6_dst_cache", sizeof(struct rt6_info), 0, + SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL, NULL); fib6_init(); #ifdef CONFIG_PROC_FS p = proc_net_create("ipv6_route", 0, rt6_proc_info); diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c index 891a609..dfc90bb 100644 --- a/net/xfrm/xfrm_input.c +++ b/net/xfrm/xfrm_input.c @@ -82,8 +82,6 @@ void __init xfrm_input_init(void) { secpath_cachep = kmem_cache_create("secpath_cache", sizeof(struct sec_path), - 0, SLAB_HWCACHE_ALIGN, + 0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL, NULL); - if (!secpath_cachep) - panic("XFRM: failed to allocate secpath_cache\n"); } diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 1cf3209..7db1c48 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -1985,10 +1985,8 @@ static void __init xfrm_policy_init(void) xfrm_dst_cache = kmem_cache_create("xfrm_dst_cache", sizeof(struct xfrm_dst), - 0, SLAB_HWCACHE_ALIGN, + 0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL, NULL); - if (!xfrm_dst_cache) - panic("XFRM: failed to allocate xfrm_dst_cache\n"); hmask = 8 - 1; sz = (hmask+1) * sizeof(struct hlist_head); -- cgit v0.10.2 From 6a28ec8cd0c6993a4ac0d52f4347f7ed077b5cac Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Sat, 26 Aug 2006 19:48:49 -0700 Subject: [NETFILTER]: Fix nf_conntrack_ftp.c build. Noticed by Adrian Bunk. Signed-off-by: David S. Miller diff --git a/net/netfilter/nf_conntrack_ftp.c b/net/netfilter/nf_conntrack_ftp.c index 9dccb40..0c17a5b 100644 --- a/net/netfilter/nf_conntrack_ftp.c +++ b/net/netfilter/nf_conntrack_ftp.c @@ -21,6 +21,7 @@ #include #include #include +#include #include #include @@ -114,7 +115,8 @@ static struct ftp_search { static int get_ipv6_addr(const char *src, size_t dlen, struct in6_addr *dst, u_int8_t term) { - int ret = in6_pton(src, min_t(size_t, dlen, 0xffff), dst, term, &end); + const char *end; + int ret = in6_pton(src, min_t(size_t, dlen, 0xffff), (u8 *)dst, term, &end); if (ret > 0) return (int)(end - src); return 0; -- cgit v0.10.2 From 25030a7f9eeab2dcefff036469e0e2b4f956198f Mon Sep 17 00:00:00 2001 From: Gerrit Renker Date: Sat, 26 Aug 2006 20:06:05 -0700 Subject: [UDP]: Unify UDPv4 and UDPv6 ->get_port() This patch creates one common function which is called by udp_v4_get_port() and udp_v6_get_port(). As a result, * duplicated code is removed * udp_port_rover and local port lookup can now be removed from udp.h * further savings follow since the same function will be used by UDP-Litev4 and UDP-Litev6 In contrast to the patch sent in response to Yoshifujis comments (fixed by this variant), the code below also removes the EXPORT_SYMBOL(udp_port_rover), since udp_port_rover can now remain local to net/ipv4/udp.c. Signed-off-by: Gerrit Renker Signed-off-by: David S. Miller diff --git a/include/net/udp.h b/include/net/udp.h index 766fba1..c490a0f 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -30,25 +30,9 @@ #define UDP_HTABLE_SIZE 128 -/* udp.c: This needs to be shared by v4 and v6 because the lookup - * and hashing code needs to work with different AF's yet - * the port space is shared. - */ extern struct hlist_head udp_hash[UDP_HTABLE_SIZE]; extern rwlock_t udp_hash_lock; -extern int udp_port_rover; - -static inline int udp_lport_inuse(u16 num) -{ - struct sock *sk; - struct hlist_node *node; - - sk_for_each(sk, node, &udp_hash[num & (UDP_HTABLE_SIZE - 1)]) - if (inet_sk(sk)->num == num) - return 1; - return 0; -} /* Note: this must match 'valbool' in sock_setsockopt */ #define UDP_CSUM_NOXMIT 1 @@ -63,6 +47,8 @@ extern struct proto udp_prot; struct sk_buff; +extern int udp_get_port(struct sock *sk, unsigned short snum, + int (*saddr_cmp)(struct sock *, struct sock *)); extern void udp_err(struct sk_buff *, u32); extern int udp_sendmsg(struct kiocb *iocb, struct sock *sk, diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 514c1e9..7552b50 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -118,14 +118,34 @@ DEFINE_SNMP_STAT(struct udp_mib, udp_statistics) __read_mostly; struct hlist_head udp_hash[UDP_HTABLE_SIZE]; DEFINE_RWLOCK(udp_hash_lock); -/* Shared by v4/v6 udp. */ +/* Shared by v4/v6 udp_get_port */ int udp_port_rover; -static int udp_v4_get_port(struct sock *sk, unsigned short snum) +static inline int udp_lport_inuse(u16 num) { + struct sock *sk; struct hlist_node *node; + + sk_for_each(sk, node, &udp_hash[num & (UDP_HTABLE_SIZE - 1)]) + if (inet_sk(sk)->num == num) + return 1; + return 0; +} + +/** + * udp_get_port - common port lookup for IPv4 and IPv6 + * + * @sk: socket struct in question + * @snum: port number to look up + * @saddr_comp: AF-dependent comparison of bound local IP addresses + */ +int udp_get_port(struct sock *sk, unsigned short snum, + int (*saddr_cmp)(struct sock *sk1, struct sock *sk2)) +{ + struct hlist_node *node; + struct hlist_head *head; struct sock *sk2; - struct inet_sock *inet = inet_sk(sk); + int error = 1; write_lock_bh(&udp_hash_lock); if (snum == 0) { @@ -137,11 +157,10 @@ static int udp_v4_get_port(struct sock *sk, unsigned short snum) best_size_so_far = 32767; best = result = udp_port_rover; for (i = 0; i < UDP_HTABLE_SIZE; i++, result++) { - struct hlist_head *list; int size; - list = &udp_hash[result & (UDP_HTABLE_SIZE - 1)]; - if (hlist_empty(list)) { + head = &udp_hash[result & (UDP_HTABLE_SIZE - 1)]; + if (hlist_empty(head)) { if (result > sysctl_local_port_range[1]) result = sysctl_local_port_range[0] + ((result - sysctl_local_port_range[0]) & @@ -149,12 +168,11 @@ static int udp_v4_get_port(struct sock *sk, unsigned short snum) goto gotit; } size = 0; - sk_for_each(sk2, node, list) - if (++size >= best_size_so_far) - goto next; - best_size_so_far = size; - best = result; - next:; + sk_for_each(sk2, node, head) + if (++size < best_size_so_far) { + best_size_so_far = size; + best = result; + } } result = best; for(i = 0; i < (1 << 16) / UDP_HTABLE_SIZE; i++, result += UDP_HTABLE_SIZE) { @@ -170,38 +188,44 @@ static int udp_v4_get_port(struct sock *sk, unsigned short snum) gotit: udp_port_rover = snum = result; } else { - sk_for_each(sk2, node, - &udp_hash[snum & (UDP_HTABLE_SIZE - 1)]) { - struct inet_sock *inet2 = inet_sk(sk2); - - if (inet2->num == snum && - sk2 != sk && - !ipv6_only_sock(sk2) && - (!sk2->sk_bound_dev_if || - !sk->sk_bound_dev_if || - sk2->sk_bound_dev_if == sk->sk_bound_dev_if) && - (!inet2->rcv_saddr || - !inet->rcv_saddr || - inet2->rcv_saddr == inet->rcv_saddr) && - (!sk2->sk_reuse || !sk->sk_reuse)) + head = &udp_hash[snum & (UDP_HTABLE_SIZE - 1)]; + + sk_for_each(sk2, node, head) + if (inet_sk(sk2)->num == snum && + sk2 != sk && + (!sk2->sk_reuse || !sk->sk_reuse) && + (!sk2->sk_bound_dev_if || !sk->sk_bound_dev_if + || sk2->sk_bound_dev_if == sk->sk_bound_dev_if) && + (*saddr_cmp)(sk, sk2) ) goto fail; - } } - inet->num = snum; + inet_sk(sk)->num = snum; if (sk_unhashed(sk)) { - struct hlist_head *h = &udp_hash[snum & (UDP_HTABLE_SIZE - 1)]; - - sk_add_node(sk, h); + head = &udp_hash[snum & (UDP_HTABLE_SIZE - 1)]; + sk_add_node(sk, head); sock_prot_inc_use(sk->sk_prot); } - write_unlock_bh(&udp_hash_lock); - return 0; - + error = 0; fail: write_unlock_bh(&udp_hash_lock); - return 1; + return error; +} + +static inline int ipv4_rcv_saddr_equal(struct sock *sk1, struct sock *sk2) +{ + struct inet_sock *inet1 = inet_sk(sk1), *inet2 = inet_sk(sk2); + + return ( !ipv6_only_sock(sk2) && + (!inet1->rcv_saddr || !inet2->rcv_saddr || + inet1->rcv_saddr == inet2->rcv_saddr )); +} + +static inline int udp_v4_get_port(struct sock *sk, unsigned short snum) +{ + return udp_get_port(sk, snum, ipv4_rcv_saddr_equal); } + static void udp_v4_hash(struct sock *sk) { BUG(); @@ -1596,7 +1620,7 @@ EXPORT_SYMBOL(udp_disconnect); EXPORT_SYMBOL(udp_hash); EXPORT_SYMBOL(udp_hash_lock); EXPORT_SYMBOL(udp_ioctl); -EXPORT_SYMBOL(udp_port_rover); +EXPORT_SYMBOL(udp_get_port); EXPORT_SYMBOL(udp_prot); EXPORT_SYMBOL(udp_sendmsg); EXPORT_SYMBOL(udp_poll); diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index b9cc55c..9662561 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -61,81 +61,9 @@ DEFINE_SNMP_STAT(struct udp_mib, udp_stats_in6) __read_mostly; -/* Grrr, addr_type already calculated by caller, but I don't want - * to add some silly "cookie" argument to this method just for that. - */ -static int udp_v6_get_port(struct sock *sk, unsigned short snum) +static inline int udp_v6_get_port(struct sock *sk, unsigned short snum) { - struct sock *sk2; - struct hlist_node *node; - - write_lock_bh(&udp_hash_lock); - if (snum == 0) { - int best_size_so_far, best, result, i; - - if (udp_port_rover > sysctl_local_port_range[1] || - udp_port_rover < sysctl_local_port_range[0]) - udp_port_rover = sysctl_local_port_range[0]; - best_size_so_far = 32767; - best = result = udp_port_rover; - for (i = 0; i < UDP_HTABLE_SIZE; i++, result++) { - int size; - struct hlist_head *list; - - list = &udp_hash[result & (UDP_HTABLE_SIZE - 1)]; - if (hlist_empty(list)) { - if (result > sysctl_local_port_range[1]) - result = sysctl_local_port_range[0] + - ((result - sysctl_local_port_range[0]) & - (UDP_HTABLE_SIZE - 1)); - goto gotit; - } - size = 0; - sk_for_each(sk2, node, list) - if (++size >= best_size_so_far) - goto next; - best_size_so_far = size; - best = result; - next:; - } - result = best; - for(i = 0; i < (1 << 16) / UDP_HTABLE_SIZE; i++, result += UDP_HTABLE_SIZE) { - if (result > sysctl_local_port_range[1]) - result = sysctl_local_port_range[0] - + ((result - sysctl_local_port_range[0]) & - (UDP_HTABLE_SIZE - 1)); - if (!udp_lport_inuse(result)) - break; - } - if (i >= (1 << 16) / UDP_HTABLE_SIZE) - goto fail; -gotit: - udp_port_rover = snum = result; - } else { - sk_for_each(sk2, node, - &udp_hash[snum & (UDP_HTABLE_SIZE - 1)]) { - if (inet_sk(sk2)->num == snum && - sk2 != sk && - (!sk2->sk_bound_dev_if || - !sk->sk_bound_dev_if || - sk2->sk_bound_dev_if == sk->sk_bound_dev_if) && - (!sk2->sk_reuse || !sk->sk_reuse) && - ipv6_rcv_saddr_equal(sk, sk2)) - goto fail; - } - } - - inet_sk(sk)->num = snum; - if (sk_unhashed(sk)) { - sk_add_node(sk, &udp_hash[snum & (UDP_HTABLE_SIZE - 1)]); - sock_prot_inc_use(sk->sk_prot); - } - write_unlock_bh(&udp_hash_lock); - return 0; - -fail: - write_unlock_bh(&udp_hash_lock); - return 1; + return udp_get_port(sk, snum, ipv6_rcv_saddr_equal); } static void udp_v6_hash(struct sock *sk) -- cgit v0.10.2 From bed53ea7fef37820b7c92ad74feff1b817c6aae3 Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Sat, 26 Aug 2006 20:06:49 -0700 Subject: [UDP]: Mark udp_port_rover static. It is not referenced outside of net/ipv4/udp.c any longer. Signed-off-by: David S. Miller diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 7552b50..aa18230 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -118,8 +118,7 @@ DEFINE_SNMP_STAT(struct udp_mib, udp_statistics) __read_mostly; struct hlist_head udp_hash[UDP_HTABLE_SIZE]; DEFINE_RWLOCK(udp_hash_lock); -/* Shared by v4/v6 udp_get_port */ -int udp_port_rover; +static int udp_port_rover; static inline int udp_lport_inuse(u16 num) { -- cgit v0.10.2 From e3b4eadbea77ecb3c3a74d1bc81b392f454c7f2e Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Sat, 26 Aug 2006 20:10:15 -0700 Subject: [UDP]: saddr_cmp function should take const socket pointers MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This also kills a warning while building ipv6: net/ipv6/udp.c: In function ‘udp_v6_get_port’: net/ipv6/udp.c:66: warning: passing argument 3 of ‘udp_get_port’ from incompatible pointer type Signed-off-by: David S. Miller diff --git a/include/net/udp.h b/include/net/udp.h index c490a0f..db0c05f 100644 --- a/include/net/udp.h +++ b/include/net/udp.h @@ -48,7 +48,7 @@ extern struct proto udp_prot; struct sk_buff; extern int udp_get_port(struct sock *sk, unsigned short snum, - int (*saddr_cmp)(struct sock *, struct sock *)); + int (*saddr_cmp)(const struct sock *, const struct sock *)); extern void udp_err(struct sk_buff *, u32); extern int udp_sendmsg(struct kiocb *iocb, struct sock *sk, diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index aa18230..77e265d 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -139,7 +139,7 @@ static inline int udp_lport_inuse(u16 num) * @saddr_comp: AF-dependent comparison of bound local IP addresses */ int udp_get_port(struct sock *sk, unsigned short snum, - int (*saddr_cmp)(struct sock *sk1, struct sock *sk2)) + int (*saddr_cmp)(const struct sock *sk1, const struct sock *sk2)) { struct hlist_node *node; struct hlist_head *head; @@ -210,7 +210,7 @@ fail: return error; } -static inline int ipv4_rcv_saddr_equal(struct sock *sk1, struct sock *sk2) +static inline int ipv4_rcv_saddr_equal(const struct sock *sk1, const struct sock *sk2) { struct inet_sock *inet1 = inet_sk(sk1), *inet2 = inet_sk(sk2); -- cgit v0.10.2 From a5531a5d852008be40811496029012f4ad3093d1 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Sat, 26 Aug 2006 20:11:47 -0700 Subject: [NETLINK]: Improve string attribute validation Introduces a new attribute type NLA_NUL_STRING to support NUL terminated strings. Attributes of this kind require to carry a terminating NUL within the maximum specified in the policy. The `old' NLA_STRING which is not required to be NUL terminated is extended to provide means to specify a maximum length of the string. Aims at easing the pain with using nla_strlcpy() on temporary buffers. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/net/netlink.h b/include/net/netlink.h index bcb27e3..11dc2e7 100644 --- a/include/net/netlink.h +++ b/include/net/netlink.h @@ -167,6 +167,7 @@ enum { NLA_FLAG, NLA_MSECS, NLA_NESTED, + NLA_NUL_STRING, __NLA_TYPE_MAX, }; @@ -175,21 +176,27 @@ enum { /** * struct nla_policy - attribute validation policy * @type: Type of attribute or NLA_UNSPEC - * @minlen: Minimal length of payload required to be available + * @len: Type specific length of payload * * Policies are defined as arrays of this struct, the array must be * accessible by attribute type up to the highest identifier to be expected. * + * Meaning of `len' field: + * NLA_STRING Maximum length of string + * NLA_NUL_STRING Maximum length of string (excluding NUL) + * NLA_FLAG Unused + * All other Exact length of attribute payload + * * Example: * static struct nla_policy my_policy[ATTR_MAX+1] __read_mostly = { * [ATTR_FOO] = { .type = NLA_U16 }, - * [ATTR_BAR] = { .type = NLA_STRING }, - * [ATTR_BAZ] = { .minlen = sizeof(struct mystruct) }, + * [ATTR_BAR] = { .type = NLA_STRING, len = BARSIZ }, + * [ATTR_BAZ] = { .len = sizeof(struct mystruct) }, * }; */ struct nla_policy { u16 type; - u16 minlen; + u16 len; }; /** diff --git a/net/netlink/attr.c b/net/netlink/attr.c index 136e529..0041395 100644 --- a/net/netlink/attr.c +++ b/net/netlink/attr.c @@ -20,7 +20,6 @@ static u16 nla_attr_minlen[NLA_TYPE_MAX+1] __read_mostly = { [NLA_U16] = sizeof(u16), [NLA_U32] = sizeof(u32), [NLA_U64] = sizeof(u64), - [NLA_STRING] = 1, [NLA_NESTED] = NLA_HDRLEN, }; @@ -28,7 +27,7 @@ static int validate_nla(struct nlattr *nla, int maxtype, struct nla_policy *policy) { struct nla_policy *pt; - int minlen = 0; + int minlen = 0, attrlen = nla_len(nla); if (nla->nla_type <= 0 || nla->nla_type > maxtype) return 0; @@ -37,16 +36,46 @@ static int validate_nla(struct nlattr *nla, int maxtype, BUG_ON(pt->type > NLA_TYPE_MAX); - if (pt->minlen) - minlen = pt->minlen; - else if (pt->type != NLA_UNSPEC) - minlen = nla_attr_minlen[pt->type]; + switch (pt->type) { + case NLA_FLAG: + if (attrlen > 0) + return -ERANGE; + break; - if (pt->type == NLA_FLAG && nla_len(nla) > 0) - return -ERANGE; + case NLA_NUL_STRING: + if (pt->len) + minlen = min_t(int, attrlen, pt->len + 1); + else + minlen = attrlen; - if (nla_len(nla) < minlen) - return -ERANGE; + if (!minlen || memchr(nla_data(nla), '\0', minlen) == NULL) + return -EINVAL; + /* fall through */ + + case NLA_STRING: + if (attrlen < 1) + return -ERANGE; + + if (pt->len) { + char *buf = nla_data(nla); + + if (buf[attrlen - 1] == '\0') + attrlen--; + + if (attrlen > pt->len) + return -ERANGE; + } + break; + + default: + if (pt->len) + minlen = pt->len; + else if (pt->type != NLA_UNSPEC) + minlen = nla_attr_minlen[pt->type]; + + if (attrlen < minlen) + return -ERANGE; + } return 0; } -- cgit v0.10.2 From 5176f91ea83f1a59eba4dba88634a4729d51d1ac Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Sat, 26 Aug 2006 20:13:18 -0700 Subject: [NETLINK]: Make use of NLA_STRING/NLA_NUL_STRING attribute validation Converts existing NLA_STRING attributes to use the new validation features, saving a couple of temporary buffers. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c index 7b2e9bb..a99d87d 100644 --- a/net/core/fib_rules.c +++ b/net/core/fib_rules.c @@ -161,9 +161,6 @@ int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) if (err < 0) goto errout; - if (tb[FRA_IFNAME] && nla_len(tb[FRA_IFNAME]) > IFNAMSIZ) - goto errout; - rule = kzalloc(ops->rule_size, GFP_KERNEL); if (rule == NULL) { err = -ENOMEM; @@ -177,10 +174,7 @@ int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) struct net_device *dev; rule->ifindex = -1; - if (nla_strlcpy(rule->ifname, tb[FRA_IFNAME], - IFNAMSIZ) >= IFNAMSIZ) - goto errout_free; - + nla_strlcpy(rule->ifname, tb[FRA_IFNAME], IFNAMSIZ); dev = __dev_get_by_name(rule->ifname); if (dev) rule->ifindex = dev->ifindex; diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 8f22549..0ebcf84 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -371,8 +371,8 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb) } static struct nla_policy ifla_policy[IFLA_MAX+1] __read_mostly = { - [IFLA_IFNAME] = { .type = NLA_STRING }, - [IFLA_MAP] = { .minlen = sizeof(struct rtnl_link_ifmap) }, + [IFLA_IFNAME] = { .type = NLA_STRING, .len = IFNAMSIZ-1 }, + [IFLA_MAP] = { .len = sizeof(struct rtnl_link_ifmap) }, [IFLA_MTU] = { .type = NLA_U32 }, [IFLA_TXQLEN] = { .type = NLA_U32 }, [IFLA_WEIGHT] = { .type = NLA_U32 }, @@ -392,9 +392,8 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) if (err < 0) goto errout; - if (tb[IFLA_IFNAME] && - nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ) >= IFNAMSIZ) - return -EINVAL; + if (tb[IFLA_IFNAME]) + nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ); err = -EINVAL; ifm = nlmsg_data(nlh); diff --git a/net/decnet/dn_rules.c b/net/decnet/dn_rules.c index 63ad63d..3e0c882 100644 --- a/net/decnet/dn_rules.c +++ b/net/decnet/dn_rules.c @@ -112,7 +112,7 @@ errout: } static struct nla_policy dn_fib_rule_policy[FRA_MAX+1] __read_mostly = { - [FRA_IFNAME] = { .type = NLA_STRING }, + [FRA_IFNAME] = { .type = NLA_STRING, .len = IFNAMSIZ - 1 }, [FRA_PRIORITY] = { .type = NLA_U32 }, [FRA_SRC] = { .type = NLA_U16 }, [FRA_DST] = { .type = NLA_U16 }, diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index 0487677..8e8d1f1 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -85,7 +85,7 @@ static struct nla_policy ifa_ipv4_policy[IFA_MAX+1] __read_mostly = { [IFA_ADDRESS] = { .type = NLA_U32 }, [IFA_BROADCAST] = { .type = NLA_U32 }, [IFA_ANYCAST] = { .type = NLA_U32 }, - [IFA_LABEL] = { .type = NLA_STRING }, + [IFA_LABEL] = { .type = NLA_STRING, .len = IFNAMSIZ - 1 }, }; static void rtmsg_ifa(int event, struct in_ifaddr *, struct nlmsghdr *, u32); diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index d0abeab..cfb527c 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -462,7 +462,7 @@ struct nla_policy rtm_ipv4_policy[RTA_MAX+1] __read_mostly = { [RTA_PRIORITY] = { .type = NLA_U32 }, [RTA_PREFSRC] = { .type = NLA_U32 }, [RTA_METRICS] = { .type = NLA_NESTED }, - [RTA_MULTIPATH] = { .minlen = sizeof(struct rtnexthop) }, + [RTA_MULTIPATH] = { .len = sizeof(struct rtnexthop) }, [RTA_PROTOINFO] = { .type = NLA_U32 }, [RTA_FLOW] = { .type = NLA_U32 }, [RTA_MP_ALGO] = { .type = NLA_U32 }, diff --git a/net/ipv4/fib_rules.c b/net/ipv4/fib_rules.c index 280f424..52b2ada 100644 --- a/net/ipv4/fib_rules.c +++ b/net/ipv4/fib_rules.c @@ -179,7 +179,7 @@ static struct fib_table *fib_empty_table(void) } static struct nla_policy fib4_rule_policy[FRA_MAX+1] __read_mostly = { - [FRA_IFNAME] = { .type = NLA_STRING }, + [FRA_IFNAME] = { .type = NLA_STRING, .len = IFNAMSIZ - 1 }, [FRA_PRIORITY] = { .type = NLA_U32 }, [FRA_SRC] = { .type = NLA_U32 }, [FRA_DST] = { .type = NLA_U32 }, diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c index 2fbc71d..34f5bfa 100644 --- a/net/ipv6/fib6_rules.c +++ b/net/ipv6/fib6_rules.c @@ -137,10 +137,10 @@ static int fib6_rule_match(struct fib_rule *rule, struct flowi *fl, int flags) } static struct nla_policy fib6_rule_policy[FRA_MAX+1] __read_mostly = { - [FRA_IFNAME] = { .type = NLA_STRING }, + [FRA_IFNAME] = { .type = NLA_STRING, .len = IFNAMSIZ - 1 }, [FRA_PRIORITY] = { .type = NLA_U32 }, - [FRA_SRC] = { .minlen = sizeof(struct in6_addr) }, - [FRA_DST] = { .minlen = sizeof(struct in6_addr) }, + [FRA_SRC] = { .len = sizeof(struct in6_addr) }, + [FRA_DST] = { .len = sizeof(struct in6_addr) }, [FRA_FWMARK] = { .type = NLA_U32 }, [FRA_FWMASK] = { .type = NLA_U32 }, [FRA_TABLE] = { .type = NLA_U32 }, diff --git a/net/ipv6/route.c b/net/ipv6/route.c index ba1b3d1..75f4bb9 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -1865,7 +1865,7 @@ void rt6_mtu_change(struct net_device *dev, unsigned mtu) } static struct nla_policy rtm_ipv6_policy[RTA_MAX+1] __read_mostly = { - [RTA_GATEWAY] = { .minlen = sizeof(struct in6_addr) }, + [RTA_GATEWAY] = { .len = sizeof(struct in6_addr) }, [RTA_OIF] = { .type = NLA_U32 }, [RTA_IIF] = { .type = NLA_U32 }, [RTA_PRIORITY] = { .type = NLA_U32 }, diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c index d325991..3ac942c 100644 --- a/net/netlink/genetlink.c +++ b/net/netlink/genetlink.c @@ -455,7 +455,8 @@ static struct sk_buff *ctrl_build_msg(struct genl_family *family, u32 pid, static struct nla_policy ctrl_policy[CTRL_ATTR_MAX+1] __read_mostly = { [CTRL_ATTR_FAMILY_ID] = { .type = NLA_U16 }, - [CTRL_ATTR_FAMILY_NAME] = { .type = NLA_STRING }, + [CTRL_ATTR_FAMILY_NAME] = { .type = NLA_NUL_STRING, + .len = GENL_NAMSIZ - 1 }, }; static int ctrl_getfamily(struct sk_buff *skb, struct genl_info *info) @@ -470,12 +471,9 @@ static int ctrl_getfamily(struct sk_buff *skb, struct genl_info *info) } if (info->attrs[CTRL_ATTR_FAMILY_NAME]) { - char name[GENL_NAMSIZ]; - - if (nla_strlcpy(name, info->attrs[CTRL_ATTR_FAMILY_NAME], - GENL_NAMSIZ) >= GENL_NAMSIZ) - goto errout; + char *name; + name = nla_data(info->attrs[CTRL_ATTR_FAMILY_NAME]); res = genl_family_find_byname(name); } -- cgit v0.10.2 From 33cc48966827165e49de1cb8ff4fb57c127d4be0 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Mon, 28 Aug 2006 13:19:30 -0700 Subject: [IPV6] ROUTE: Fix dst reference counting in ip6_pol_route_lookup(). In ip6_pol_route_lookup(), when we finish backtracking at the top-level root entry, we need to hold it. Bug noticed by Mitsuru Chinen . Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/route.c b/net/ipv6/route.c index 75f4bb9..d6b4b4f 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -510,8 +510,8 @@ restart: rt = fn->leaf; rt = rt6_device_match(rt, fl->oif, flags); BACKTRACK(&fl->fl6_src); - dst_hold(&rt->u.dst); out: + dst_hold(&rt->u.dst); read_unlock_bh(&table->tb6_lock); rt->u.dst.lastuse = jiffies; -- cgit v0.10.2 From 0719bdf1b5e7eb0d9c3c73ebbd9c9d5d382bb9e1 Mon Sep 17 00:00:00 2001 From: Benoit Boissinot Date: Mon, 28 Aug 2006 17:50:37 -0700 Subject: [NETFILTER]: xt_CONNMARK.c build fix net/netfilter/xt_CONNMARK.c: In function 'target': net/netfilter/xt_CONNMARK.c:59: warning: implicit declaration of function 'nf_conntrack_event_cache' The warning is due to the following .config: CONFIG_IP_NF_CONNTRACK=m CONFIG_IP_NF_CONNTRACK_MARK=y # CONFIG_IP_NF_CONNTRACK_EVENTS is not set CONFIG_IP_NF_CONNTRACK_NETLINK=m This change was introduced by: http://www.kernel.org/git/?p=linux/kernel/git/davem/net-2.6.19.git;a=commit;h=76e4b41009b8a2e9dd246135cf43c7fe39553aa5 Proposed solution (based on the define in include/net/netfilter/nf_conntrack_compat.h: Signed-off-by: Benoit Boissinot Acked-by: Pablo Neira Ayuso Signed-off-by: Andrew Morton Signed-off-by: David S. Miller diff --git a/net/netfilter/xt_CONNMARK.c b/net/netfilter/xt_CONNMARK.c index 0e4249d..6ccb45e 100644 --- a/net/netfilter/xt_CONNMARK.c +++ b/net/netfilter/xt_CONNMARK.c @@ -53,7 +53,7 @@ target(struct sk_buff **pskb, newmark = (*ctmark & ~markinfo->mask) | markinfo->mark; if (newmark != *ctmark) { *ctmark = newmark; -#ifdef CONFIG_IP_NF_CONNTRACK_EVENTS +#if defined(CONFIG_IP_NF_CONNTRACK) || defined(CONFIG_IP_NF_CONNTRACK_MODULE) ip_conntrack_event_cache(IPCT_MARK, *pskb); #else nf_conntrack_event_cache(IPCT_MARK, *pskb); @@ -65,7 +65,7 @@ target(struct sk_buff **pskb, ((*pskb)->nfmark & markinfo->mask); if (*ctmark != newmark) { *ctmark = newmark; -#ifdef CONFIG_IP_NF_CONNTRACK_EVENTS +#if defined(CONFIG_IP_NF_CONNTRACK) || defined(CONFIG_IP_NF_CONNTRACK_MODULE) ip_conntrack_event_cache(IPCT_MARK, *pskb); #else nf_conntrack_event_cache(IPCT_MARK, *pskb); -- cgit v0.10.2 From def42ff4dd6f54ebcf78192579a8ff1f81d8e2e8 Mon Sep 17 00:00:00 2001 From: Alexey Dobriyan Date: Mon, 28 Aug 2006 23:57:56 -0700 Subject: [IPV4]: Make struct in_addr::s_addr __be32 There will be relatively small increase in sparse endian warnings, but this (and sin_port) patch is a first step to make networking code endian clean. Signed-off-by: Alexey Dobriyan Signed-off-by: David S. Miller diff --git a/include/linux/in.h b/include/linux/in.h index 94f557f..9a9d5dd 100644 --- a/include/linux/in.h +++ b/include/linux/in.h @@ -52,7 +52,7 @@ enum { /* Internet address. */ struct in_addr { - __u32 s_addr; + __be32 s_addr; }; #define IP_TOS 1 -- cgit v0.10.2 From cd360007a0eb8cbf17c006cca42aa884d33f96be Mon Sep 17 00:00:00 2001 From: Alexey Dobriyan Date: Mon, 28 Aug 2006 23:58:32 -0700 Subject: [IPV4]: Make struct sockaddr_in::sin_port __be16 Signed-off-by: Alexey Dobriyan Signed-off-by: David S. Miller diff --git a/include/linux/in.h b/include/linux/in.h index 9a9d5dd..bcaca83 100644 --- a/include/linux/in.h +++ b/include/linux/in.h @@ -177,7 +177,7 @@ struct in_pktinfo #define __SOCK_SIZE__ 16 /* sizeof(struct sockaddr) */ struct sockaddr_in { sa_family_t sin_family; /* Address family */ - unsigned short int sin_port; /* Port number */ + __be16 sin_port; /* Port number */ struct in_addr sin_addr; /* Internet address */ /* Pad to size of `struct sockaddr'. */ -- cgit v0.10.2 From 07317621d004e8e6967f2dac8562825267e56135 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Tue, 29 Aug 2006 17:48:17 -0700 Subject: [NETFILTER] bridge: code rearrangement for clarity Cleanup and rearrangement for better style and clarity: Split the function nf_bridge_maybe_copy_header into two pieces Move copy portion out of line. Use Ethernet header size macros. Use header file to handle CONFIG_NETFILTER_BRIDGE differences Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/include/linux/netfilter_bridge.h b/include/linux/netfilter_bridge.h index 427c67f..274fe4b 100644 --- a/include/linux/netfilter_bridge.h +++ b/include/linux/netfilter_bridge.h @@ -47,26 +47,12 @@ enum nf_br_hook_priorities { /* Only used in br_forward.c */ -static inline -int nf_bridge_maybe_copy_header(struct sk_buff *skb) +extern int nf_bridge_copy_header(struct sk_buff *skb); +static inline int nf_bridge_maybe_copy_header(struct sk_buff *skb) { - int err; - - if (skb->nf_bridge) { - if (skb->protocol == __constant_htons(ETH_P_8021Q)) { - err = skb_cow(skb, 18); - if (err) - return err; - memcpy(skb->data - 18, skb->nf_bridge->data, 18); - skb_push(skb, 4); - } else { - err = skb_cow(skb, 16); - if (err) - return err; - memcpy(skb->data - 16, skb->nf_bridge->data, 16); - } - } - return 0; + if (skb->nf_bridge) + return nf_bridge_copy_header(skb); + return 0; } /* This is called by the IP fragmenting code and it ensures there is @@ -90,6 +76,8 @@ struct bridge_skb_cb { }; extern int brnf_deferred_hooks; +#else +#define nf_bridge_maybe_copy_header(skb) (0) #endif /* CONFIG_BRIDGE_NETFILTER */ #endif /* __KERNEL__ */ diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c index 864fbbc..191b861 100644 --- a/net/bridge/br_forward.c +++ b/net/bridge/br_forward.c @@ -38,13 +38,10 @@ int br_dev_queue_push_xmit(struct sk_buff *skb) if (packet_length(skb) > skb->dev->mtu && !skb_is_gso(skb)) kfree_skb(skb); else { -#ifdef CONFIG_BRIDGE_NETFILTER /* ip_refrag calls ip_fragment, doesn't copy the MAC header. */ if (nf_bridge_maybe_copy_header(skb)) kfree_skb(skb); - else -#endif - { + else { skb_push(skb, ETH_HLEN); dev_queue_xmit(skb); diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c index 05b3de8..b498efc 100644 --- a/net/bridge/br_netfilter.c +++ b/net/bridge/br_netfilter.c @@ -127,14 +127,37 @@ static inline struct nf_bridge_info *nf_bridge_alloc(struct sk_buff *skb) static inline void nf_bridge_save_header(struct sk_buff *skb) { - int header_size = 16; + int header_size = ETH_HLEN; if (skb->protocol == htons(ETH_P_8021Q)) - header_size = 18; + header_size += VLAN_HLEN; memcpy(skb->nf_bridge->data, skb->data - header_size, header_size); } +/* + * When forwarding bridge frames, we save a copy of the original + * header before processing. + */ +int nf_bridge_copy_header(struct sk_buff *skb) +{ + int err; + int header_size = ETH_HLEN; + + if (skb->protocol == htons(ETH_P_8021Q)) + header_size += VLAN_HLEN; + + err = skb_cow(skb, header_size); + if (err) + return err; + + memcpy(skb->data - header_size, skb->nf_bridge->data, header_size); + + if (skb->protocol == htons(ETH_P_8021Q)) + __skb_push(skb, VLAN_HLEN); + return 0; +} + /* PF_BRIDGE/PRE_ROUTING *********************************************/ /* Undo the changes made for ip6tables PREROUTING and continue the * bridge PRE_ROUTING hook. */ -- cgit v0.10.2 From 9bcfcaf5e9cc887eb39236e43bdbe4b4b2572229 Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Tue, 29 Aug 2006 17:48:57 -0700 Subject: [NETFILTER] bridge: simplify nf_bridge_pad Do some simple optimization on the nf_bridge_pad() function and don't use magic constants. Eliminate a double call and the #ifdef'd code for CONFIG_BRIDGE_NETFILTER. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/include/linux/netfilter_bridge.h b/include/linux/netfilter_bridge.h index 274fe4b..9a4dd11 100644 --- a/include/linux/netfilter_bridge.h +++ b/include/linux/netfilter_bridge.h @@ -5,9 +5,8 @@ */ #include -#if defined(__KERNEL__) && defined(CONFIG_BRIDGE_NETFILTER) #include -#endif +#include /* Bridge Hooks */ /* After promisc drops, checksum checks. */ @@ -57,16 +56,10 @@ static inline int nf_bridge_maybe_copy_header(struct sk_buff *skb) /* This is called by the IP fragmenting code and it ensures there is * enough room for the encapsulating header (if there is one). */ -static inline -int nf_bridge_pad(struct sk_buff *skb) +static inline int nf_bridge_pad(const struct sk_buff *skb) { - if (skb->protocol == __constant_htons(ETH_P_IP)) - return 0; - if (skb->nf_bridge) { - if (skb->protocol == __constant_htons(ETH_P_8021Q)) - return 4; - } - return 0; + return (skb->nf_bridge && skb->protocol == htons(ETH_P_8021Q)) + ? VLAN_HLEN : 0; } struct bridge_skb_cb { @@ -78,6 +71,7 @@ struct bridge_skb_cb { extern int brnf_deferred_hooks; #else #define nf_bridge_maybe_copy_header(skb) (0) +#define nf_bridge_pad(skb) (0) #endif /* CONFIG_BRIDGE_NETFILTER */ #endif /* __KERNEL__ */ diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index 81b2795..97aee76 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -426,7 +426,7 @@ int ip_fragment(struct sk_buff *skb, int (*output)(struct sk_buff*)) int ptr; struct net_device *dev; struct sk_buff *skb2; - unsigned int mtu, hlen, left, len, ll_rs; + unsigned int mtu, hlen, left, len, ll_rs, pad; int offset; __be16 not_last_frag; struct rtable *rt = (struct rtable*)skb->dst; @@ -556,14 +556,13 @@ slow_path: left = skb->len - hlen; /* Space per frame */ ptr = raw + hlen; /* Where to start from */ -#ifdef CONFIG_BRIDGE_NETFILTER /* for bridged IP traffic encapsulated inside f.e. a vlan header, - * we need to make room for the encapsulating header */ - ll_rs = LL_RESERVED_SPACE_EXTRA(rt->u.dst.dev, nf_bridge_pad(skb)); - mtu -= nf_bridge_pad(skb); -#else - ll_rs = LL_RESERVED_SPACE(rt->u.dst.dev); -#endif + * we need to make room for the encapsulating header + */ + pad = nf_bridge_pad(skb); + ll_rs = LL_RESERVED_SPACE_EXTRA(rt->u.dst.dev, pad); + mtu -= pad; + /* * Fragment the datagram. */ -- cgit v0.10.2 From 8394e9b2faf539f82470b36c86f0485cab5278bd Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Tue, 29 Aug 2006 17:49:31 -0700 Subject: [NETFILTER] bridge: debug message fixes If CONFIG_NETFILTER_DEBUG is enabled, it shouldn't change the actions of the filtering. The message about skb->dst being NULL is commonly triggered by dhclient, so it is useless. Make sure all messages end in newline. Signed-off-by: Stephen Hemminger Signed-off-by: David S. Miller diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c index b498efc..cf80dd0 100644 --- a/net/bridge/br_netfilter.c +++ b/net/bridge/br_netfilter.c @@ -718,16 +718,6 @@ static unsigned int br_nf_local_out(unsigned int hook, struct sk_buff **pskb, else pf = PF_INET6; -#ifdef CONFIG_NETFILTER_DEBUG - /* Sometimes we get packets with NULL ->dst here (for example, - * running a dhcp client daemon triggers this). This should now - * be fixed, but let's keep the check around. */ - if (skb->dst == NULL) { - printk(KERN_CRIT "br_netfilter: skb->dst == NULL."); - return NF_ACCEPT; - } -#endif - nf_bridge = skb->nf_bridge; nf_bridge->physoutdev = skb->dev; realindev = nf_bridge->physindev; @@ -809,7 +799,7 @@ static unsigned int br_nf_post_routing(unsigned int hook, struct sk_buff **pskb, * keep the check just to be sure... */ if (skb->mac.raw < skb->head || skb->mac.raw + ETH_HLEN > skb->data) { printk(KERN_CRIT "br_netfilter: Argh!! br_nf_post_routing: " - "bad mac.raw pointer."); + "bad mac.raw pointer.\n"); goto print_error; } #endif @@ -827,7 +817,7 @@ static unsigned int br_nf_post_routing(unsigned int hook, struct sk_buff **pskb, #ifdef CONFIG_NETFILTER_DEBUG if (skb->dst == NULL) { - printk(KERN_CRIT "br_netfilter: skb->dst == NULL."); + printk(KERN_INFO "br_netfilter post_routing: skb->dst == NULL\n"); goto print_error; } #endif @@ -864,6 +854,7 @@ print_error: } printk(" head:%p, raw:%p, data:%p\n", skb->head, skb->mac.raw, skb->data); + dump_stack(); return NF_ACCEPT; #endif } -- cgit v0.10.2 From fc747e82b40ea50a62eb2aef55bedd4465607cb0 Mon Sep 17 00:00:00 2001 From: Ian McDonald Date: Tue, 29 Aug 2006 17:50:19 -0700 Subject: [DCCP]: Tidyup CCID3 list handling As Arnaldo Carvalho de Melo points out I should be using list_entry in case the structure changes in future. Current code functions but is reliant on position and requires type cast. Noticed when doing this that I have one more variable than I needed so removing that also. Signed off by: Ian McDonald Signed-off-by: David S. Miller diff --git a/net/dccp/ccids/ccid3.c b/net/dccp/ccids/ccid3.c index 090bc39..195aa95 100644 --- a/net/dccp/ccids/ccid3.c +++ b/net/dccp/ccids/ccid3.c @@ -900,7 +900,7 @@ found: static void ccid3_hc_rx_update_li(struct sock *sk, u64 seq_loss, u8 win_loss) { struct ccid3_hc_rx_sock *hcrx = ccid3_hc_rx_sk(sk); - struct dccp_li_hist_entry *next, *head; + struct dccp_li_hist_entry *head; u64 seq_temp; if (list_empty(&hcrx->ccid3hcrx_li_hist)) { @@ -908,15 +908,15 @@ static void ccid3_hc_rx_update_li(struct sock *sk, u64 seq_loss, u8 win_loss) &hcrx->ccid3hcrx_li_hist, seq_loss, win_loss)) return; - next = (struct dccp_li_hist_entry *) - hcrx->ccid3hcrx_li_hist.next; - next->dccplih_interval = ccid3_hc_rx_calc_first_li(sk); + head = list_entry(hcrx->ccid3hcrx_li_hist.next, + struct dccp_li_hist_entry, dccplih_node); + head->dccplih_interval = ccid3_hc_rx_calc_first_li(sk); } else { struct dccp_li_hist_entry *entry; struct list_head *tail; - head = (struct dccp_li_hist_entry *) - hcrx->ccid3hcrx_li_hist.next; + head = list_entry(hcrx->ccid3hcrx_li_hist.next, + struct dccp_li_hist_entry, dccplih_node); /* FIXME win count check removed as was wrong */ /* should make this check with receive history */ /* and compare there as per section 10.2 of RFC4342 */ -- cgit v0.10.2 From 99f59ed073d3c1b890690064ab285a201dea2e35 Mon Sep 17 00:00:00 2001 From: Paul Moore Date: Tue, 29 Aug 2006 17:53:48 -0700 Subject: [NetLabel]: Correctly initialize the NetLabel fields. Fix a problem where the NetLabel specific fields of the sk_security_struct structure were not being initialized early enough in some cases. Signed-off-by: Paul Moore Signed-off-by: David S. Miller diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 180b26b..5a66c4c 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -281,6 +281,8 @@ static int sk_alloc_security(struct sock *sk, int family, gfp_t priority) ssec->sid = SECINITSID_UNLABELED; sk->sk_security = ssec; + selinux_netlbl_sk_security_init(ssec, family); + return 0; } @@ -3585,6 +3587,8 @@ static void selinux_sk_clone_security(const struct sock *sk, struct sock *newsk) newssec->sid = ssec->sid; newssec->peer_sid = ssec->peer_sid; + + selinux_netlbl_sk_clone_security(ssec, newssec); } static void selinux_sk_getsecid(struct sock *sk, u32 *secid) @@ -3648,6 +3652,8 @@ static void selinux_inet_csk_clone(struct sock *newsk, new socket in sync, but we don't have the isec available yet. So we will wait until sock_graft to do it, by which time it will have been created and available. */ + + selinux_netlbl_sk_security_init(newsksec, req->rsk_ops->family); } static void selinux_req_classify_flow(const struct request_sock *req, diff --git a/security/selinux/include/selinux_netlabel.h b/security/selinux/include/selinux_netlabel.h index 88c463e..d885d88 100644 --- a/security/selinux/include/selinux_netlabel.h +++ b/security/selinux/include/selinux_netlabel.h @@ -39,6 +39,10 @@ int selinux_netlbl_sock_rcv_skb(struct sk_security_struct *sksec, struct avc_audit_data *ad); u32 selinux_netlbl_socket_getpeersec_stream(struct socket *sock); u32 selinux_netlbl_socket_getpeersec_dgram(struct sk_buff *skb); +void selinux_netlbl_sk_security_init(struct sk_security_struct *ssec, + int family); +void selinux_netlbl_sk_clone_security(struct sk_security_struct *ssec, + struct sk_security_struct *newssec); int __selinux_netlbl_inode_permission(struct inode *inode, int mask); /** @@ -115,6 +119,20 @@ static inline u32 selinux_netlbl_socket_getpeersec_dgram(struct sk_buff *skb) return SECSID_NULL; } +static inline void selinux_netlbl_sk_security_init( + struct sk_security_struct *ssec, + int family) +{ + return; +} + +static inline void selinux_netlbl_sk_clone_security( + struct sk_security_struct *ssec, + struct sk_security_struct *newssec) +{ + return; +} + static inline int selinux_netlbl_inode_permission(struct inode *inode, int mask) { diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c index 910afa1..835b485 100644 --- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -2423,6 +2423,45 @@ netlbl_socket_setsid_return: } /** + * selinux_netlbl_sk_security_init - Setup the NetLabel fields + * @ssec: the sk_security_struct + * @family: the socket family + * + * Description: + * Called when a new sk_security_struct is allocated to initialize the NetLabel + * fields. + * + */ +void selinux_netlbl_sk_security_init(struct sk_security_struct *ssec, + int family) +{ + if (family == PF_INET) + ssec->nlbl_state = NLBL_REQUIRE; + else + ssec->nlbl_state = NLBL_UNSET; +} + +/** + * selinux_netlbl_sk_clone_security - Copy the NetLabel fields + * @ssec: the original sk_security_struct + * @newssec: the cloned sk_security_struct + * + * Description: + * Clone the NetLabel specific sk_security_struct fields from @ssec to + * @newssec. + * + */ +void selinux_netlbl_sk_clone_security(struct sk_security_struct *ssec, + struct sk_security_struct *newssec) +{ + newssec->sclass = ssec->sclass; + if (ssec->nlbl_state != NLBL_UNSET) + newssec->nlbl_state = NLBL_REQUIRE; + else + newssec->nlbl_state = NLBL_UNSET; +} + +/** * selinux_netlbl_socket_post_create - Label a socket using NetLabel * @sock: the socket to label * @sock_family: the socket family @@ -2440,10 +2479,11 @@ int selinux_netlbl_socket_post_create(struct socket *sock, struct inode_security_struct *isec = SOCK_INODE(sock)->i_security; struct sk_security_struct *sksec = sock->sk->sk_security; + sksec->sclass = isec->sclass; + if (sock_family != PF_INET) return 0; - sksec->sclass = isec->sclass; sksec->nlbl_state = NLBL_REQUIRE; return selinux_netlbl_socket_setsid(sock, sid); } @@ -2463,12 +2503,13 @@ void selinux_netlbl_sock_graft(struct sock *sk, struct socket *sock) struct inode_security_struct *isec = SOCK_INODE(sock)->i_security; struct sk_security_struct *sksec = sk->sk_security; + sksec->sclass = isec->sclass; + if (sk->sk_family != PF_INET) return; sksec->nlbl_state = NLBL_REQUIRE; sksec->peer_sid = sksec->sid; - sksec->sclass = isec->sclass; /* Try to set the NetLabel on the socket to save time later, if we fail * here we will pick up the pieces in later calls to -- cgit v0.10.2 From 1b7f775209bbee6b993587bae69acb9fc12ceb17 Mon Sep 17 00:00:00 2001 From: Paul Moore Date: Tue, 29 Aug 2006 17:54:17 -0700 Subject: [NetLabel]: remove unused function prototypes Removed some older function prototypes for functions that no longer exist. Signed-off-by: Paul Moore Signed-off-by: David S. Miller diff --git a/include/net/cipso_ipv4.h b/include/net/cipso_ipv4.h index c7175e7..5aed72a 100644 --- a/include/net/cipso_ipv4.h +++ b/include/net/cipso_ipv4.h @@ -200,15 +200,9 @@ static inline int cipso_v4_cache_add(const struct sk_buff *skb, #ifdef CONFIG_NETLABEL void cipso_v4_error(struct sk_buff *skb, int error, u32 gateway); -int cipso_v4_socket_setopt(struct socket *sock, - unsigned char *opt, - u32 opt_len); int cipso_v4_socket_setattr(const struct socket *sock, const struct cipso_v4_doi *doi_def, const struct netlbl_lsm_secattr *secattr); -int cipso_v4_socket_getopt(const struct socket *sock, - unsigned char **opt, - u32 *opt_len); int cipso_v4_socket_getattr(const struct socket *sock, struct netlbl_lsm_secattr *secattr); int cipso_v4_skbuff_getattr(const struct sk_buff *skb, -- cgit v0.10.2 From c1b14c0a46232246f61d3157bac1201e1e102227 Mon Sep 17 00:00:00 2001 From: Paul Moore Date: Tue, 29 Aug 2006 17:54:41 -0700 Subject: [NetLabel]: Comment corrections. Fix some incorrect comments. Signed-off-by: Paul Moore Signed-off-by: David S. Miller diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c index 835b485..4f7642c 100644 --- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -2617,7 +2617,7 @@ int selinux_netlbl_sock_rcv_skb(struct sk_security_struct *sksec, } /** - * selinux_netlbl_socket_peersid - Return the peer SID of a connected socket + * selinux_netlbl_socket_getpeersec_stream - Return the connected peer's SID * @sock: the socket * * Description: -- cgit v0.10.2 From 7b3bbb926f4b3dd3a007dcf8dfa00203f52cb58d Mon Sep 17 00:00:00 2001 From: Paul Moore Date: Tue, 29 Aug 2006 17:55:11 -0700 Subject: [NetLabel]: Cleanup ebitmap_import() Rewrite ebitmap_import() so it is a bit cleaner and easier to read. Signed-off-by: Paul Moore Signed-off-by: David S. Miller diff --git a/security/selinux/ss/ebitmap.c b/security/selinux/ss/ebitmap.c index 4b915eb..cfed1d3 100644 --- a/security/selinux/ss/ebitmap.c +++ b/security/selinux/ss/ebitmap.c @@ -145,29 +145,28 @@ int ebitmap_import(const unsigned char *src, struct ebitmap *dst) { size_t src_off = 0; + size_t node_limit; struct ebitmap_node *node_new; struct ebitmap_node *node_last = NULL; - size_t iter; - size_t iter_bit; - size_t iter_limit; + u32 i_byte; + u32 i_bit; unsigned char src_byte; - do { - iter_limit = src_len - src_off; - if (iter_limit >= sizeof(MAPTYPE)) { + while (src_off < src_len) { + if (src_len - src_off >= sizeof(MAPTYPE)) { if (*(MAPTYPE *)&src[src_off] == 0) { src_off += sizeof(MAPTYPE); continue; } - iter_limit = sizeof(MAPTYPE); + node_limit = sizeof(MAPTYPE); } else { - iter = src_off; - src_byte = 0; - do { - src_byte |= src[iter++]; - } while (iter < src_len && src_byte == 0); + for (src_byte = 0, i_byte = src_off; + i_byte < src_len && src_byte == 0; + i_byte++) + src_byte |= src[i_byte]; if (src_byte == 0) break; + node_limit = src_len - src_off; } node_new = kzalloc(sizeof(*node_new), GFP_ATOMIC); @@ -176,24 +175,21 @@ int ebitmap_import(const unsigned char *src, return -ENOMEM; } node_new->startbit = src_off * 8; - iter = 0; - do { + for (i_byte = 0; i_byte < node_limit; i_byte++) { src_byte = src[src_off++]; - iter_bit = iter++ * 8; - while (src_byte != 0) { + for (i_bit = i_byte * 8; src_byte != 0; i_bit++) { if (src_byte & 0x80) - node_new->map |= MAPBIT << iter_bit; - iter_bit++; + node_new->map |= MAPBIT << i_bit; src_byte <<= 1; } - } while (iter < iter_limit); + } if (node_last != NULL) node_last->next = node_new; else dst->node = node_new; node_last = node_new; - } while (src_off < src_len); + } if (likely(node_last != NULL)) dst->highbit = node_last->startbit + MAPSIZE; -- cgit v0.10.2 From e448e931309e703f51d71a557973c620ff12fbda Mon Sep 17 00:00:00 2001 From: Paul Moore Date: Tue, 29 Aug 2006 17:55:38 -0700 Subject: [NetLabel]: uninline selinux_netlbl_inode_permission() Uninline the selinux_netlbl_inode_permission() at the request of Andrew Morton. Signed-off-by: Paul Moore Signed-off-by: David S. Miller diff --git a/security/selinux/include/selinux_netlabel.h b/security/selinux/include/selinux_netlabel.h index d885d88..d69ec65 100644 --- a/security/selinux/include/selinux_netlabel.h +++ b/security/selinux/include/selinux_netlabel.h @@ -43,40 +43,7 @@ void selinux_netlbl_sk_security_init(struct sk_security_struct *ssec, int family); void selinux_netlbl_sk_clone_security(struct sk_security_struct *ssec, struct sk_security_struct *newssec); - -int __selinux_netlbl_inode_permission(struct inode *inode, int mask); -/** - * selinux_netlbl_inode_permission - Verify the socket is NetLabel labeled - * @inode: the file descriptor's inode - * @mask: the permission mask - * - * Description: - * Looks at a file's inode and if it is marked as a socket protected by - * NetLabel then verify that the socket has been labeled, if not try to label - * the socket now with the inode's SID. Returns zero on success, negative - * values on failure. - * - */ -static inline int selinux_netlbl_inode_permission(struct inode *inode, - int mask) -{ - int rc = 0; - struct inode_security_struct *isec; - struct sk_security_struct *sksec; - - if (!S_ISSOCK(inode->i_mode)) - return 0; - - isec = inode->i_security; - sksec = SOCKET_I(inode)->sk->sk_security; - down(&isec->sem); - if (unlikely(sksec->nlbl_state == NLBL_REQUIRE && - (mask & (MAY_WRITE | MAY_APPEND)))) - rc = __selinux_netlbl_inode_permission(inode, mask); - up(&isec->sem); - - return rc; -} +int selinux_netlbl_inode_permission(struct inode *inode, int mask); #else static inline void selinux_netlbl_cache_invalidate(void) { diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c index 4f7642c..27ee28c 100644 --- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -2544,24 +2544,39 @@ u32 selinux_netlbl_inet_conn_request(struct sk_buff *skb, u32 sock_sid) } /** - * __selinux_netlbl_inode_permission - Label a socket using NetLabel + * selinux_netlbl_inode_permission - Verify the socket is NetLabel labeled * @inode: the file descriptor's inode * @mask: the permission mask * * Description: - * Try to label a socket with the inode's SID using NetLabel. Returns zero on - * success, negative values on failure. + * Looks at a file's inode and if it is marked as a socket protected by + * NetLabel then verify that the socket has been labeled, if not try to label + * the socket now with the inode's SID. Returns zero on success, negative + * values on failure. * */ -int __selinux_netlbl_inode_permission(struct inode *inode, int mask) +int selinux_netlbl_inode_permission(struct inode *inode, int mask) { int rc; - struct socket *sock = SOCKET_I(inode); - struct sk_security_struct *sksec = sock->sk->sk_security; + struct inode_security_struct *isec; + struct sk_security_struct *sksec; + struct socket *sock; - lock_sock(sock->sk); - rc = selinux_netlbl_socket_setsid(sock, sksec->sid); - release_sock(sock->sk); + if (!S_ISSOCK(inode->i_mode)) + return 0; + + sock = SOCKET_I(inode); + isec = inode->i_security; + sksec = sock->sk->sk_security; + down(&isec->sem); + if (unlikely(sksec->nlbl_state == NLBL_REQUIRE && + (mask & (MAY_WRITE | MAY_APPEND)))) { + lock_sock(sock->sk); + rc = selinux_netlbl_socket_setsid(sock, sksec->sid); + release_sock(sock->sk); + } else + rc = 0; + up(&isec->sem); return rc; } -- cgit v0.10.2 From 7a0e1d602288370801c353221c6a938eab925053 Mon Sep 17 00:00:00 2001 From: Paul Moore Date: Tue, 29 Aug 2006 17:56:04 -0700 Subject: [NetLabel]: add some missing #includes to various header files Add some missing include files to the NetLabel related header files. Signed-off-by: Paul Moore Signed-off-by: David S. Miller diff --git a/include/net/cipso_ipv4.h b/include/net/cipso_ipv4.h index 5aed72a..59406e0 100644 --- a/include/net/cipso_ipv4.h +++ b/include/net/cipso_ipv4.h @@ -37,6 +37,8 @@ #include #include #include +#include +#include #include /* known doi values */ diff --git a/include/net/netlabel.h b/include/net/netlabel.h index 7cae730..fc2b72f 100644 --- a/include/net/netlabel.h +++ b/include/net/netlabel.h @@ -31,6 +31,7 @@ #define _NETLABEL_H #include +#include #include #include diff --git a/net/netlabel/netlabel_domainhash.h b/net/netlabel/netlabel_domainhash.h index 9217863..99a2287 100644 --- a/net/netlabel/netlabel_domainhash.h +++ b/net/netlabel/netlabel_domainhash.h @@ -32,6 +32,10 @@ #ifndef _NETLABEL_DOMAINHASH_H #define _NETLABEL_DOMAINHASH_H +#include +#include +#include + /* Domain hash table size */ /* XXX - currently this number is an uneducated guess */ #define NETLBL_DOMHSH_BITSIZE 7 diff --git a/net/netlabel/netlabel_user.h b/net/netlabel/netlabel_user.h index ccf237b..385a6c7 100644 --- a/net/netlabel/netlabel_user.h +++ b/net/netlabel/netlabel_user.h @@ -31,11 +31,12 @@ #ifndef _NETLABEL_USER_H #define _NETLABEL_USER_H +#include #include #include -#include -#include +#include #include +#include /* NetLabel NETLINK helper functions */ diff --git a/security/selinux/include/selinux_netlabel.h b/security/selinux/include/selinux_netlabel.h index d69ec65..ecab4bd 100644 --- a/security/selinux/include/selinux_netlabel.h +++ b/security/selinux/include/selinux_netlabel.h @@ -27,6 +27,15 @@ #ifndef _SELINUX_NETLABEL_H_ #define _SELINUX_NETLABEL_H_ +#include +#include +#include +#include +#include + +#include "avc.h" +#include "objsec.h" + #ifdef CONFIG_NETLABEL void selinux_netlbl_cache_invalidate(void); int selinux_netlbl_socket_post_create(struct socket *sock, -- cgit v0.10.2 From 28a7b327b8cc8ea35662d360d3d11d60195debc9 Mon Sep 17 00:00:00 2001 From: Adrian Bunk Date: Wed, 30 Aug 2006 15:03:07 -0700 Subject: [PKT_SCHED] act_simple.c: make struct simp_hash_info static This patch makes the needlessly global struct simp_hash_info static. Signed-off-by: Adrian Bunk Signed-off-by: David S. Miller diff --git a/net/sched/act_simple.c b/net/sched/act_simple.c index 8c1ab8a..901571a 100644 --- a/net/sched/act_simple.c +++ b/net/sched/act_simple.c @@ -28,7 +28,7 @@ static struct tcf_common *tcf_simp_ht[SIMP_TAB_MASK + 1]; static u32 simp_idx_gen; static DEFINE_RWLOCK(simp_lock); -struct tcf_hashinfo simp_hash_info = { +static struct tcf_hashinfo simp_hash_info = { .htab = tcf_simp_ht, .hmask = SIMP_TAB_MASK, .lock = &simp_lock, -- cgit v0.10.2 From 7a42c2175703f54a3640f25dc078c8190a4f904e Mon Sep 17 00:00:00 2001 From: Brian Haley Date: Thu, 31 Aug 2006 15:03:02 -0700 Subject: [NET]: Change somaxconn sysctl to __read_mostly Signed-off-by: Brian Haley Signed-off-by: David S. Miller diff --git a/net/socket.c b/net/socket.c index d6f27ed..1bc4167 100644 --- a/net/socket.c +++ b/net/socket.c @@ -1337,7 +1337,7 @@ asmlinkage long sys_bind(int fd, struct sockaddr __user *umyaddr, int addrlen) * ready for listening. */ -int sysctl_somaxconn = SOMAXCONN; +int sysctl_somaxconn __read_mostly = SOMAXCONN; asmlinkage long sys_listen(int fd, int backlog) { -- cgit v0.10.2 From 18adaf067cf013fc2690d3830eba99ff800795b4 Mon Sep 17 00:00:00 2001 From: Brian Haley Date: Thu, 31 Aug 2006 15:03:36 -0700 Subject: [AF_UNIX]: Change max_dgram_qlen sysctl to __read_mostly Signed-off-by: Brian Haley Signed-off-by: David S. Miller diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 7c91c20..b43a278 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -117,7 +117,7 @@ #include #include -int sysctl_unix_max_dgram_qlen = 10; +int sysctl_unix_max_dgram_qlen __read_mostly = 10; struct hlist_head unix_socket_table[UNIX_HASH_SIZE + 1]; DEFINE_SPINLOCK(unix_table_lock); -- cgit v0.10.2 From 3015d5d4e5b15eddea272a697e83391100581932 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Thu, 31 Aug 2006 15:04:30 -0700 Subject: [RTNETLINK]: Fix typo causing wrong skb to be freed A typo introduced by myself which leads to freeing the skb containing the netlink message when it should free the newly allocated skb for the reply. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 0ebcf84..63b882a 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -596,7 +596,7 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr* nlh, void *arg) err = rtnl_fill_ifinfo(nskb, dev, iw, iw_buf_len, RTM_NEWLINK, NETLINK_CB(skb).pid, nlh->nlmsg_seq, 0, 0); if (err <= 0) { - kfree_skb(skb); + kfree_skb(nskb); goto errout; } -- cgit v0.10.2 From ff9b5e0f08cb650d113eef0c654f931c0a7ae730 Mon Sep 17 00:00:00 2001 From: Herbert Xu Date: Thu, 31 Aug 2006 15:11:02 -0700 Subject: [TCP]: Fix rcv mss estimate for LRO By passing a Linux-generated TSO packet straight back into Linux, Xen becomes our first LRO user :) Unfortunately, there is at least one spot in our stack that needs to be changed to cope with this. The receive MSS estimate is computed from the raw packet size. This is broken if the packet is GSO/LRO. Fortunately the real MSS can be found in gso_size so we simply need to use that if it is non-zero. Real LRO NICs should of course set the gso_size field in future. Signed-off-by: Herbert Xu Signed-off-by: David S. Miller diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index caf3c41..511b738 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -127,7 +127,7 @@ static void tcp_measure_rcv_mss(struct sock *sk, /* skb->len may jitter because of SACKs, even if peer * sends good full-sized frames. */ - len = skb->len; + len = skb_shinfo(skb)->gso_size ?: skb->len; if (len >= icsk->icsk_ack.rcv_mss) { icsk->icsk_ack.rcv_mss = len; } else { -- cgit v0.10.2 From a9917c06652165fe4eeb9ab7a5d1e0674e90e508 Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Thu, 31 Aug 2006 15:14:32 -0700 Subject: [XFRM] STATE: Fix flusing with hash mask. This is a minor fix about transformation state flushing for net-2.6.19. Please apply it. Signed-off-by: David S. Miller diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c index 4341795..9f63edd 100644 --- a/net/xfrm/xfrm_state.c +++ b/net/xfrm/xfrm_state.c @@ -384,7 +384,7 @@ void xfrm_state_flush(u8 proto) int i; spin_lock_bh(&xfrm_state_lock); - for (i = 0; i < xfrm_state_hmask; i++) { + for (i = 0; i <= xfrm_state_hmask; i++) { struct hlist_node *entry; struct xfrm_state *x; restart: -- cgit v0.10.2 From dc435e6dac1439340eaeceef84022c4e4749796d Mon Sep 17 00:00:00 2001 From: Masahide NAKAMURA Date: Thu, 31 Aug 2006 15:18:49 -0700 Subject: [IPV6] MIP6: Fix to update IP6CB when cloned skbuff is received at HAO. Signed-off-by: Masahide NAKAMURA Signed-off-by: David S. Miller diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c index 084f78c..88c96b1 100644 --- a/net/ipv6/exthdrs.c +++ b/net/ipv6/exthdrs.c @@ -233,9 +233,14 @@ static int ipv6_dest_hao(struct sk_buff **skbp, int optoff) if (skb_cloned(skb)) { struct sk_buff *skb2 = skb_copy(skb, GFP_ATOMIC); + struct inet6_skb_parm *opt2; + if (skb2 == NULL) goto discard; + opt2 = IP6CB(skb2); + memcpy(opt2, opt, sizeof(*opt2)); + kfree_skb(skb); /* update all variable using below by copied skbuff */ @@ -296,6 +301,7 @@ static int ipv6_destopt_rcv(struct sk_buff **skbp) if (ip6_parse_tlv(tlvprocdestopt_lst, skbp)) { skb = *skbp; skb->h.raw += ((skb->h.raw[1]+1)<<3); + opt = IP6CB(skb); #ifdef CONFIG_IPV6_MIP6 opt->nhoff = dstbuf; #else @@ -690,6 +696,7 @@ int ipv6_parse_hopopts(struct sk_buff **skbp) if (ip6_parse_tlv(tlvprochopopt_lst, skbp)) { skb = *skbp; skb->h.raw += (skb->h.raw[1]+1)<<3; + opt = IP6CB(skb); opt->nhoff = sizeof(struct ipv6hdr); return 1; } -- cgit v0.10.2 From fda9ef5d679b07c9d9097aaf6ef7f069d794a8f9 Mon Sep 17 00:00:00 2001 From: Dmitry Mishin Date: Thu, 31 Aug 2006 15:28:39 -0700 Subject: [NET]: Fix sk->sk_filter field access Function sk_filter() is called from tcp_v{4,6}_rcv() functions with arg needlock = 0, while socket is not locked at that moment. In order to avoid this and similar issues in the future, use rcu for sk->sk_filter field read protection. Signed-off-by: Dmitry Mishin Signed-off-by: Alexey Kuznetsov Signed-off-by: Kirill Korotaev diff --git a/include/linux/filter.h b/include/linux/filter.h index c6cb8f0..91b2e3b 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -25,10 +25,10 @@ struct sock_filter /* Filter block */ { - __u16 code; /* Actual filter code */ - __u8 jt; /* Jump true */ - __u8 jf; /* Jump false */ - __u32 k; /* Generic multiuse field */ + __u16 code; /* Actual filter code */ + __u8 jt; /* Jump true */ + __u8 jf; /* Jump false */ + __u32 k; /* Generic multiuse field */ }; struct sock_fprog /* Required for SO_ATTACH_FILTER. */ @@ -41,8 +41,9 @@ struct sock_fprog /* Required for SO_ATTACH_FILTER. */ struct sk_filter { atomic_t refcnt; - unsigned int len; /* Number of filter blocks */ - struct sock_filter insns[0]; + unsigned int len; /* Number of filter blocks */ + struct rcu_head rcu; + struct sock_filter insns[0]; }; static inline unsigned int sk_filter_len(struct sk_filter *fp) diff --git a/include/net/sock.h b/include/net/sock.h index 337ebec..edd4d73 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -862,30 +862,24 @@ extern void sock_init_data(struct socket *sock, struct sock *sk); * */ -static inline int sk_filter(struct sock *sk, struct sk_buff *skb, int needlock) +static inline int sk_filter(struct sock *sk, struct sk_buff *skb) { int err; + struct sk_filter *filter; err = security_sock_rcv_skb(sk, skb); if (err) return err; - if (sk->sk_filter) { - struct sk_filter *filter; - - if (needlock) - bh_lock_sock(sk); - - filter = sk->sk_filter; - if (filter) { - unsigned int pkt_len = sk_run_filter(skb, filter->insns, - filter->len); - err = pkt_len ? pskb_trim(skb, pkt_len) : -EPERM; - } - - if (needlock) - bh_unlock_sock(sk); + rcu_read_lock_bh(); + filter = sk->sk_filter; + if (filter) { + unsigned int pkt_len = sk_run_filter(skb, filter->insns, + filter->len); + err = pkt_len ? pskb_trim(skb, pkt_len) : -EPERM; } + rcu_read_unlock_bh(); + return err; } @@ -897,6 +891,12 @@ static inline int sk_filter(struct sock *sk, struct sk_buff *skb, int needlock) * Remove a filter from a socket and release its resources. */ +static inline void sk_filter_rcu_free(struct rcu_head *rcu) +{ + struct sk_filter *fp = container_of(rcu, struct sk_filter, rcu); + kfree(fp); +} + static inline void sk_filter_release(struct sock *sk, struct sk_filter *fp) { unsigned int size = sk_filter_len(fp); @@ -904,7 +904,7 @@ static inline void sk_filter_release(struct sock *sk, struct sk_filter *fp) atomic_sub(size, &sk->sk_omem_alloc); if (atomic_dec_and_test(&fp->refcnt)) - kfree(fp); + call_rcu_bh(&fp->rcu, sk_filter_rcu_free); } static inline void sk_filter_charge(struct sock *sk, struct sk_filter *fp) diff --git a/net/core/filter.c b/net/core/filter.c index 5b4486a..6732782 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -422,10 +422,10 @@ int sk_attach_filter(struct sock_fprog *fprog, struct sock *sk) if (!err) { struct sk_filter *old_fp; - spin_lock_bh(&sk->sk_lock.slock); - old_fp = sk->sk_filter; - sk->sk_filter = fp; - spin_unlock_bh(&sk->sk_lock.slock); + rcu_read_lock_bh(); + old_fp = rcu_dereference(sk->sk_filter); + rcu_assign_pointer(sk->sk_filter, fp); + rcu_read_unlock_bh(); fp = old_fp; } diff --git a/net/core/sock.c b/net/core/sock.c index cfaf090..b77e155 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -247,11 +247,7 @@ int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb) goto out; } - /* It would be deadlock, if sock_queue_rcv_skb is used - with socket lock! We assume that users of this - function are lock free. - */ - err = sk_filter(sk, skb, 1); + err = sk_filter(sk, skb); if (err) goto out; @@ -278,7 +274,7 @@ int sk_receive_skb(struct sock *sk, struct sk_buff *skb) { int rc = NET_RX_SUCCESS; - if (sk_filter(sk, skb, 0)) + if (sk_filter(sk, skb)) goto discard_and_relse; skb->dev = NULL; @@ -606,15 +602,15 @@ set_rcvbuf: break; case SO_DETACH_FILTER: - spin_lock_bh(&sk->sk_lock.slock); - filter = sk->sk_filter; + rcu_read_lock_bh(); + filter = rcu_dereference(sk->sk_filter); if (filter) { - sk->sk_filter = NULL; - spin_unlock_bh(&sk->sk_lock.slock); + rcu_assign_pointer(sk->sk_filter, NULL); sk_filter_release(sk, filter); + rcu_read_unlock_bh(); break; } - spin_unlock_bh(&sk->sk_lock.slock); + rcu_read_unlock_bh(); ret = -ENONET; break; @@ -884,10 +880,10 @@ void sk_free(struct sock *sk) if (sk->sk_destruct) sk->sk_destruct(sk); - filter = sk->sk_filter; + filter = rcu_dereference(sk->sk_filter); if (filter) { sk_filter_release(sk, filter); - sk->sk_filter = NULL; + rcu_assign_pointer(sk->sk_filter, NULL); } sock_disable_timestamp(sk); diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c index f9c5e12..7a47399 100644 --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -970,7 +970,7 @@ static int dccp_v6_do_rcv(struct sock *sk, struct sk_buff *skb) if (skb->protocol == htons(ETH_P_IP)) return dccp_v4_do_rcv(sk, skb); - if (sk_filter(sk, skb, 0)) + if (sk_filter(sk, skb)) goto discard; /* diff --git a/net/decnet/dn_nsp_in.c b/net/decnet/dn_nsp_in.c index 86f7f3b..72ecc6e 100644 --- a/net/decnet/dn_nsp_in.c +++ b/net/decnet/dn_nsp_in.c @@ -586,7 +586,7 @@ static __inline__ int dn_queue_skb(struct sock *sk, struct sk_buff *skb, int sig goto out; } - err = sk_filter(sk, skb, 0); + err = sk_filter(sk, skb); if (err) goto out; diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 23b46e3..39b1798 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1104,7 +1104,7 @@ process: goto discard_and_relse; nf_reset(skb); - if (sk_filter(sk, skb, 0)) + if (sk_filter(sk, skb)) goto discard_and_relse; skb->dev = NULL; diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 2b18918..2546fc9 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -1075,7 +1075,7 @@ static int tcp_v6_do_rcv(struct sock *sk, struct sk_buff *skb) if (skb->protocol == htons(ETH_P_IP)) return tcp_v4_do_rcv(sk, skb); - if (sk_filter(sk, skb, 0)) + if (sk_filter(sk, skb)) goto discard; /* @@ -1232,7 +1232,7 @@ process: if (!xfrm6_policy_check(sk, XFRM_POLICY_IN, skb)) goto discard_and_relse; - if (sk_filter(sk, skb, 0)) + if (sk_filter(sk, skb)) goto discard_and_relse; skb->dev = NULL; diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 300215b..f4ccb90 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -427,21 +427,24 @@ out_unlock: } #endif -static inline unsigned run_filter(struct sk_buff *skb, struct sock *sk, unsigned res) +static inline int run_filter(struct sk_buff *skb, struct sock *sk, + unsigned *snaplen) { struct sk_filter *filter; + int err = 0; - bh_lock_sock(sk); - filter = sk->sk_filter; - /* - * Our caller already checked that filter != NULL but we need to - * verify that under bh_lock_sock() to be safe - */ - if (likely(filter != NULL)) - res = sk_run_filter(skb, filter->insns, filter->len); - bh_unlock_sock(sk); + rcu_read_lock_bh(); + filter = rcu_dereference(sk->sk_filter); + if (filter != NULL) { + err = sk_run_filter(skb, filter->insns, filter->len); + if (!err) + err = -EPERM; + else if (*snaplen > err) + *snaplen = err; + } + rcu_read_unlock_bh(); - return res; + return err; } /* @@ -491,13 +494,8 @@ static int packet_rcv(struct sk_buff *skb, struct net_device *dev, struct packet snaplen = skb->len; - if (sk->sk_filter) { - unsigned res = run_filter(skb, sk, snaplen); - if (res == 0) - goto drop_n_restore; - if (snaplen > res) - snaplen = res; - } + if (run_filter(skb, sk, &snaplen) < 0) + goto drop_n_restore; if (atomic_read(&sk->sk_rmem_alloc) + skb->truesize >= (unsigned)sk->sk_rcvbuf) @@ -593,13 +591,8 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev, struct packe snaplen = skb->len; - if (sk->sk_filter) { - unsigned res = run_filter(skb, sk, snaplen); - if (res == 0) - goto drop_n_restore; - if (snaplen > res) - snaplen = res; - } + if (run_filter(skb, sk, &snaplen) < 0) + goto drop_n_restore; if (sk->sk_type == SOCK_DGRAM) { macoff = netoff = TPACKET_ALIGN(TPACKET_HDRLEN) + 16; diff --git a/net/sctp/input.c b/net/sctp/input.c index 8a34d95..03f65de 100644 --- a/net/sctp/input.c +++ b/net/sctp/input.c @@ -228,7 +228,7 @@ int sctp_rcv(struct sk_buff *skb) goto discard_release; nf_reset(skb); - if (sk_filter(sk, skb, 1)) + if (sk_filter(sk, skb)) goto discard_release; /* Create an SCTP packet structure. */ -- cgit v0.10.2 From eb878e84575fbce21d2edb079eada78bfa27023d Mon Sep 17 00:00:00 2001 From: Jamal Hadi Salim Date: Thu, 31 Aug 2006 17:42:59 -0700 Subject: [IPSEC]: output mode to take an xfrm state as input param Expose IPSEC modes output path to take an xfrm state as input param. This makes it consistent with the input mode processing (which already takes the xfrm state as a param). Signed-off-by: Jamal Hadi Salim Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 0acabf2..4d6dc62 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -285,7 +285,7 @@ extern void xfrm_put_type(struct xfrm_type *type); struct xfrm_mode { int (*input)(struct xfrm_state *x, struct sk_buff *skb); - int (*output)(struct sk_buff *skb); + int (*output)(struct xfrm_state *x,struct sk_buff *skb); struct module *owner; unsigned int encap; diff --git a/net/ipv4/xfrm4_mode_transport.c b/net/ipv4/xfrm4_mode_transport.c index a9e6b3d..92676b7 100644 --- a/net/ipv4/xfrm4_mode_transport.c +++ b/net/ipv4/xfrm4_mode_transport.c @@ -21,9 +21,8 @@ * On exit, skb->h will be set to the start of the payload to be processed * by x->type->output and skb->nh will be set to the top IP header. */ -static int xfrm4_transport_output(struct sk_buff *skb) +static int xfrm4_transport_output(struct xfrm_state *x, struct sk_buff *skb) { - struct xfrm_state *x; struct iphdr *iph; int ihl; @@ -33,7 +32,6 @@ static int xfrm4_transport_output(struct sk_buff *skb) ihl = iph->ihl * 4; skb->h.raw += ihl; - x = skb->dst->xfrm; skb->nh.raw = memmove(skb_push(skb, x->props.header_len), iph, ihl); return 0; } diff --git a/net/ipv4/xfrm4_mode_tunnel.c b/net/ipv4/xfrm4_mode_tunnel.c index 13cafbe..e23c21d 100644 --- a/net/ipv4/xfrm4_mode_tunnel.c +++ b/net/ipv4/xfrm4_mode_tunnel.c @@ -33,10 +33,9 @@ static inline void ipip_ecn_decapsulate(struct sk_buff *skb) * On exit, skb->h will be set to the start of the payload to be processed * by x->type->output and skb->nh will be set to the top IP header. */ -static int xfrm4_tunnel_output(struct sk_buff *skb) +static int xfrm4_tunnel_output(struct xfrm_state *x, struct sk_buff *skb) { struct dst_entry *dst = skb->dst; - struct xfrm_state *x = dst->xfrm; struct iphdr *iph, *top_iph; int flags; diff --git a/net/ipv4/xfrm4_output.c b/net/ipv4/xfrm4_output.c index 5fd115f..04403fb0 100644 --- a/net/ipv4/xfrm4_output.c +++ b/net/ipv4/xfrm4_output.c @@ -66,7 +66,7 @@ static int xfrm4_output_one(struct sk_buff *skb) if (err) goto error; - err = x->mode->output(skb); + err = x->mode->output(x, skb); if (err) goto error; diff --git a/net/ipv6/xfrm6_mode_ro.c b/net/ipv6/xfrm6_mode_ro.c index c11c335..6031c16 100644 --- a/net/ipv6/xfrm6_mode_ro.c +++ b/net/ipv6/xfrm6_mode_ro.c @@ -43,9 +43,8 @@ * its absence, that of the top IP header. The value of skb->data will always * point to the top IP header. */ -static int xfrm6_ro_output(struct sk_buff *skb) +static int xfrm6_ro_output(struct xfrm_state *x, struct sk_buff *skb) { - struct xfrm_state *x = skb->dst->xfrm; struct ipv6hdr *iph; u8 *prevhdr; int hdr_len; diff --git a/net/ipv6/xfrm6_mode_transport.c b/net/ipv6/xfrm6_mode_transport.c index a5dce21..3a4b39b 100644 --- a/net/ipv6/xfrm6_mode_transport.c +++ b/net/ipv6/xfrm6_mode_transport.c @@ -25,9 +25,8 @@ * its absence, that of the top IP header. The value of skb->data will always * point to the top IP header. */ -static int xfrm6_transport_output(struct sk_buff *skb) +static int xfrm6_transport_output(struct xfrm_state *x, struct sk_buff *skb) { - struct xfrm_state *x = skb->dst->xfrm; struct ipv6hdr *iph; u8 *prevhdr; int hdr_len; diff --git a/net/ipv6/xfrm6_mode_tunnel.c b/net/ipv6/xfrm6_mode_tunnel.c index 8af79be2..5e7d8a7 100644 --- a/net/ipv6/xfrm6_mode_tunnel.c +++ b/net/ipv6/xfrm6_mode_tunnel.c @@ -37,10 +37,9 @@ static inline void ipip6_ecn_decapsulate(struct sk_buff *skb) * its absence, that of the top IP header. The value of skb->data will always * point to the top IP header. */ -static int xfrm6_tunnel_output(struct sk_buff *skb) +static int xfrm6_tunnel_output(struct xfrm_state *x, struct sk_buff *skb) { struct dst_entry *dst = skb->dst; - struct xfrm_state *x = dst->xfrm; struct ipv6hdr *iph, *top_iph; int dsfield; diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c index db58104..c260ea1 100644 --- a/net/ipv6/xfrm6_output.c +++ b/net/ipv6/xfrm6_output.c @@ -65,7 +65,7 @@ static int xfrm6_output_one(struct sk_buff *skb) if (err) goto error; - err = x->mode->output(skb); + err = x->mode->output(x, skb); if (err) goto error; -- cgit v0.10.2 From d1d9facfd1b326e0df587c96f0ee55de2ae9f946 Mon Sep 17 00:00:00 2001 From: James Morris Date: Fri, 1 Sep 2006 00:32:12 -0700 Subject: [XFRM]: remove xerr_idxp from __xfrm_policy_check() It seems that during the MIPv6 respin, some code which was originally conditionally compiled around CONFIG_XFRM_ADVANCED was accidently left in after the config option was removed. This patch removes an extraneous pointer (xerr_idxp) which is no longer needed. Signed-off-by: James Morris Acked-by: Masahide NAKAMURA Signed-off-by: David S. Miller diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 7db1c48..537854f 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -1514,8 +1514,7 @@ static inline int secpath_has_nontransport(struct sec_path *sp, int k, int *idxp { for (; k < sp->len; k++) { if (sp->xvec[k]->props.mode != XFRM_MODE_TRANSPORT) { - if (idxp) - *idxp = k; + *idxp = k; return 1; } } @@ -1534,7 +1533,6 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, struct flowi fl; u8 fl_dir = policy_to_flow_dir(dir); int xerr_idx = -1; - int *xerr_idxp = &xerr_idx; if (xfrm_decode_session(skb, &fl, family) < 0) return 0; @@ -1560,7 +1558,7 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, xfrm_policy_lookup); if (!pol) { - if (skb->sp && secpath_has_nontransport(skb->sp, 0, xerr_idxp)) { + if (skb->sp && secpath_has_nontransport(skb->sp, 0, &xerr_idx)) { xfrm_secpath_reject(xerr_idx, skb, &fl); return 0; } @@ -1619,13 +1617,14 @@ int __xfrm_policy_check(struct sock *sk, int dir, struct sk_buff *skb, for (i = xfrm_nr-1, k = 0; i >= 0; i--) { k = xfrm_policy_ok(tpp[i], sp, k, family); if (k < 0) { - if (k < -1 && xerr_idxp) - *xerr_idxp = -(2+k); + if (k < -1) + /* "-2 - errored_index" returned */ + xerr_idx = -(2+k); goto reject; } } - if (secpath_has_nontransport(sp, k, xerr_idxp)) + if (secpath_has_nontransport(sp, k, &xerr_idx)) goto reject; xfrm_pols_put(pols, npols); -- cgit v0.10.2 From 78e5b8916e7db119850f57ce8548fbb9767078fc Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 13 Sep 2006 20:35:36 -0700 Subject: [RTNETLINK]: Fix netdevice name corruption When changing a device by ifindex without including a IFLA_IFNAME attribute, the ifname variable contains random garbage and is used to change the device name. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 63b882a..d8e25e0 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -394,6 +394,8 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) if (tb[IFLA_IFNAME]) nla_strlcpy(ifname, tb[IFLA_IFNAME], IFNAMSIZ); + else + ifname[0] = '\0'; err = -EINVAL; ifm = nlmsg_data(nlh); -- cgit v0.10.2 From eb328111efde7bca782f340fe805756039ec6a0c Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 18 Sep 2006 00:01:59 -0700 Subject: [GENL]: Provide more information to userspace about registered genl families Additionaly exports the following information when providing the list of registered generic netlink families: - protocol version - header size - maximum number of attributes - list of available operations including - id - flags - avaiability of policy and doit/dumpit function libnl HEAD provides a utility to read this new information: 0x0010 nlctrl version 1 hdrsize 0 maxattr 6 op GETFAMILY (0x03) [POLICY,DOIT,DUMPIT] 0x0011 NLBL_MGMT version 1 hdrsize 0 maxattr 0 op unknown (0x02) [DOIT] op unknown (0x03) [DOIT] .... Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/include/linux/genetlink.h b/include/linux/genetlink.h index 84f12a4..9049dc6 100644 --- a/include/linux/genetlink.h +++ b/include/linux/genetlink.h @@ -16,6 +16,8 @@ struct genlmsghdr { #define GENL_HDRLEN NLMSG_ALIGN(sizeof(struct genlmsghdr)) +#define GENL_ADMIN_PERM 0x01 + /* * List of reserved static generic netlink identifiers: */ @@ -43,9 +45,25 @@ enum { CTRL_ATTR_UNSPEC, CTRL_ATTR_FAMILY_ID, CTRL_ATTR_FAMILY_NAME, + CTRL_ATTR_VERSION, + CTRL_ATTR_HDRSIZE, + CTRL_ATTR_MAXATTR, + CTRL_ATTR_OPS, __CTRL_ATTR_MAX, }; #define CTRL_ATTR_MAX (__CTRL_ATTR_MAX - 1) +enum { + CTRL_ATTR_OP_UNSPEC, + CTRL_ATTR_OP_ID, + CTRL_ATTR_OP_FLAGS, + CTRL_ATTR_OP_POLICY, + CTRL_ATTR_OP_DOIT, + CTRL_ATTR_OP_DUMPIT, + __CTRL_ATTR_OP_MAX, +}; + +#define CTRL_ATTR_OP_MAX (__CTRL_ATTR_OP_MAX - 1) + #endif /* __LINUX_GENERIC_NETLINK_H */ diff --git a/include/net/genetlink.h b/include/net/genetlink.h index 97d6d3a..4a38d85 100644 --- a/include/net/genetlink.h +++ b/include/net/genetlink.h @@ -27,8 +27,6 @@ struct genl_family struct list_head family_list; /* private */ }; -#define GENL_ADMIN_PERM 0x01 - /** * struct genl_info - receiving information * @snd_seq: sending sequence number diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c index 3ac942c..49bc2db 100644 --- a/net/netlink/genetlink.c +++ b/net/netlink/genetlink.c @@ -387,7 +387,10 @@ static void genl_rcv(struct sock *sk, int len) static int ctrl_fill_info(struct genl_family *family, u32 pid, u32 seq, u32 flags, struct sk_buff *skb, u8 cmd) { + struct nlattr *nla_ops; + struct genl_ops *ops; void *hdr; + int idx = 1; hdr = genlmsg_put(skb, pid, seq, GENL_ID_CTRL, 0, flags, cmd, family->version); @@ -396,6 +399,37 @@ static int ctrl_fill_info(struct genl_family *family, u32 pid, u32 seq, NLA_PUT_STRING(skb, CTRL_ATTR_FAMILY_NAME, family->name); NLA_PUT_U16(skb, CTRL_ATTR_FAMILY_ID, family->id); + NLA_PUT_U32(skb, CTRL_ATTR_VERSION, family->version); + NLA_PUT_U32(skb, CTRL_ATTR_HDRSIZE, family->hdrsize); + NLA_PUT_U32(skb, CTRL_ATTR_MAXATTR, family->maxattr); + + nla_ops = nla_nest_start(skb, CTRL_ATTR_OPS); + if (nla_ops == NULL) + goto nla_put_failure; + + list_for_each_entry(ops, &family->ops_list, ops_list) { + struct nlattr *nest; + + nest = nla_nest_start(skb, idx++); + if (nest == NULL) + goto nla_put_failure; + + NLA_PUT_U32(skb, CTRL_ATTR_OP_ID, ops->cmd); + NLA_PUT_U32(skb, CTRL_ATTR_OP_FLAGS, ops->flags); + + if (ops->policy) + NLA_PUT_FLAG(skb, CTRL_ATTR_OP_POLICY); + + if (ops->doit) + NLA_PUT_FLAG(skb, CTRL_ATTR_OP_DOIT); + + if (ops->dumpit) + NLA_PUT_FLAG(skb, CTRL_ATTR_OP_DUMPIT); + + nla_nest_end(skb, nest); + } + + nla_nest_end(skb, nla_ops); return genlmsg_end(skb, hdr); @@ -411,6 +445,9 @@ static int ctrl_dumpfamily(struct sk_buff *skb, struct netlink_callback *cb) int chains_to_skip = cb->args[0]; int fams_to_skip = cb->args[1]; + if (chains_to_skip != 0) + genl_lock(); + for (i = 0; i < GENL_FAM_TAB_SIZE; i++) { if (i < chains_to_skip) continue; @@ -428,6 +465,9 @@ static int ctrl_dumpfamily(struct sk_buff *skb, struct netlink_callback *cb) } errout: + if (chains_to_skip != 0) + genl_unlock(); + cb->args[0] = i; cb->args[1] = n; -- cgit v0.10.2 From 9c1ea148ad8bb06538b43908891afedebeaf361b Mon Sep 17 00:00:00 2001 From: Brian Haley Date: Mon, 18 Sep 2006 00:03:41 -0700 Subject: [BRIDGE]: Change sysctl tunables to __read_mostly Change some bridge sysctl tunables to __read_mostly. Signed-off-by: Brian Haley Signed-off-by: David S. Miller diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c index cf80dd0..ac181be 100644 --- a/net/bridge/br_netfilter.c +++ b/net/bridge/br_netfilter.c @@ -53,10 +53,10 @@ #ifdef CONFIG_SYSCTL static struct ctl_table_header *brnf_sysctl_header; -static int brnf_call_iptables = 1; -static int brnf_call_ip6tables = 1; -static int brnf_call_arptables = 1; -static int brnf_filter_vlan_tagged = 1; +static int brnf_call_iptables __read_mostly = 1; +static int brnf_call_ip6tables __read_mostly = 1; +static int brnf_call_arptables __read_mostly = 1; +static int brnf_filter_vlan_tagged __read_mostly = 1; #else #define brnf_filter_vlan_tagged 1 #endif -- cgit v0.10.2 From 4cbf1cae9f08c76ed92700090a69a5b1f1f6a982 Mon Sep 17 00:00:00 2001 From: Brian Haley Date: Mon, 18 Sep 2006 00:04:22 -0700 Subject: [SCTP]: Change globals to __read_mostly Change sctp globals to __read_mostly. Signed-off-by: Brian Haley Signed-off-by: David S. Miller diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c index 5692ef5..d9dd4c4 100644 --- a/net/sctp/protocol.c +++ b/net/sctp/protocol.c @@ -61,7 +61,7 @@ #include /* Global data structures. */ -struct sctp_globals sctp_globals; +struct sctp_globals sctp_globals __read_mostly; struct proc_dir_entry *proc_net_sctp; DEFINE_SNMP_STAT(struct sctp_mib, sctp_statistics) __read_mostly; -- cgit v0.10.2 From 94aec08ea426903a3fb3cafd4d8b900cd50df702 Mon Sep 17 00:00:00 2001 From: Brian Haley Date: Mon, 18 Sep 2006 00:05:22 -0700 Subject: [NETFILTER]: Change tunables to __read_mostly Change some netfilter tunables to __read_mostly. Also fixed some incorrect file reference comments while I was in there. (this will be my last __read_mostly patch unless someone points out something else that needs it) Signed-off-by: Brian Haley Acked-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_core.c b/net/ipv4/netfilter/ip_conntrack_core.c index aa45917..5da25ad 100644 --- a/net/ipv4/netfilter/ip_conntrack_core.c +++ b/net/ipv4/netfilter/ip_conntrack_core.c @@ -66,13 +66,13 @@ void (*ip_conntrack_destroyed)(struct ip_conntrack *conntrack) = NULL; LIST_HEAD(ip_conntrack_expect_list); struct ip_conntrack_protocol *ip_ct_protos[MAX_IP_CT_PROTO]; static LIST_HEAD(helpers); -unsigned int ip_conntrack_htable_size = 0; -int ip_conntrack_max; +unsigned int ip_conntrack_htable_size __read_mostly = 0; +int ip_conntrack_max __read_mostly; struct list_head *ip_conntrack_hash; static kmem_cache_t *ip_conntrack_cachep __read_mostly; static kmem_cache_t *ip_conntrack_expect_cachep __read_mostly; struct ip_conntrack ip_conntrack_untracked; -unsigned int ip_ct_log_invalid; +unsigned int ip_ct_log_invalid __read_mostly; static LIST_HEAD(unconfirmed); static int ip_conntrack_vmalloc; diff --git a/net/ipv4/netfilter/ip_conntrack_proto_generic.c b/net/ipv4/netfilter/ip_conntrack_proto_generic.c index f891308..36f2b5e 100644 --- a/net/ipv4/netfilter/ip_conntrack_proto_generic.c +++ b/net/ipv4/netfilter/ip_conntrack_proto_generic.c @@ -12,7 +12,7 @@ #include #include -unsigned int ip_ct_generic_timeout = 600*HZ; +unsigned int ip_ct_generic_timeout __read_mostly = 600*HZ; static int generic_pkt_to_tuple(const struct sk_buff *skb, unsigned int dataoff, diff --git a/net/ipv4/netfilter/ip_conntrack_proto_icmp.c b/net/ipv4/netfilter/ip_conntrack_proto_icmp.c index 23f1c50..09c40eb 100644 --- a/net/ipv4/netfilter/ip_conntrack_proto_icmp.c +++ b/net/ipv4/netfilter/ip_conntrack_proto_icmp.c @@ -21,7 +21,7 @@ #include #include -unsigned int ip_ct_icmp_timeout = 30*HZ; +unsigned int ip_ct_icmp_timeout __read_mostly = 30*HZ; #if 0 #define DEBUGP printk diff --git a/net/ipv4/netfilter/ip_conntrack_proto_sctp.c b/net/ipv4/netfilter/ip_conntrack_proto_sctp.c index 2d3612c..b908a48 100644 --- a/net/ipv4/netfilter/ip_conntrack_proto_sctp.c +++ b/net/ipv4/netfilter/ip_conntrack_proto_sctp.c @@ -58,13 +58,13 @@ static const char *sctp_conntrack_names[] = { #define HOURS * 60 MINS #define DAYS * 24 HOURS -static unsigned int ip_ct_sctp_timeout_closed = 10 SECS; -static unsigned int ip_ct_sctp_timeout_cookie_wait = 3 SECS; -static unsigned int ip_ct_sctp_timeout_cookie_echoed = 3 SECS; -static unsigned int ip_ct_sctp_timeout_established = 5 DAYS; -static unsigned int ip_ct_sctp_timeout_shutdown_sent = 300 SECS / 1000; -static unsigned int ip_ct_sctp_timeout_shutdown_recd = 300 SECS / 1000; -static unsigned int ip_ct_sctp_timeout_shutdown_ack_sent = 3 SECS; +static unsigned int ip_ct_sctp_timeout_closed __read_mostly = 10 SECS; +static unsigned int ip_ct_sctp_timeout_cookie_wait __read_mostly = 3 SECS; +static unsigned int ip_ct_sctp_timeout_cookie_echoed __read_mostly = 3 SECS; +static unsigned int ip_ct_sctp_timeout_established __read_mostly = 5 DAYS; +static unsigned int ip_ct_sctp_timeout_shutdown_sent __read_mostly = 300 SECS / 1000; +static unsigned int ip_ct_sctp_timeout_shutdown_recd __read_mostly = 300 SECS / 1000; +static unsigned int ip_ct_sctp_timeout_shutdown_ack_sent __read_mostly = 3 SECS; static const unsigned int * sctp_timeouts[] = { NULL, /* SCTP_CONNTRACK_NONE */ diff --git a/net/ipv4/netfilter/ip_conntrack_proto_tcp.c b/net/ipv4/netfilter/ip_conntrack_proto_tcp.c index 9de81ff..75a7237 100644 --- a/net/ipv4/netfilter/ip_conntrack_proto_tcp.c +++ b/net/ipv4/netfilter/ip_conntrack_proto_tcp.c @@ -48,19 +48,19 @@ static DEFINE_RWLOCK(tcp_lock); /* "Be conservative in what you do, be liberal in what you accept from others." If it's non-zero, we mark only out of window RST segments as INVALID. */ -int ip_ct_tcp_be_liberal = 0; +int ip_ct_tcp_be_liberal __read_mostly = 0; /* When connection is picked up from the middle, how many packets are required to pass in each direction when we assume we are in sync - if any side uses window scaling, we lost the game. If it is set to zero, we disable picking up already established connections. */ -int ip_ct_tcp_loose = 3; +int ip_ct_tcp_loose __read_mostly = 3; /* Max number of the retransmitted packets without receiving an (acceptable) ACK from the destination. If this number is reached, a shorter timer will be started. */ -int ip_ct_tcp_max_retrans = 3; +int ip_ct_tcp_max_retrans __read_mostly = 3; /* FIXME: Examine ipfilter's timeouts and conntrack transitions more closely. They're more complex. --RR */ @@ -83,19 +83,19 @@ static const char *tcp_conntrack_names[] = { #define HOURS * 60 MINS #define DAYS * 24 HOURS -unsigned int ip_ct_tcp_timeout_syn_sent = 2 MINS; -unsigned int ip_ct_tcp_timeout_syn_recv = 60 SECS; -unsigned int ip_ct_tcp_timeout_established = 5 DAYS; -unsigned int ip_ct_tcp_timeout_fin_wait = 2 MINS; -unsigned int ip_ct_tcp_timeout_close_wait = 60 SECS; -unsigned int ip_ct_tcp_timeout_last_ack = 30 SECS; -unsigned int ip_ct_tcp_timeout_time_wait = 2 MINS; -unsigned int ip_ct_tcp_timeout_close = 10 SECS; +unsigned int ip_ct_tcp_timeout_syn_sent __read_mostly = 2 MINS; +unsigned int ip_ct_tcp_timeout_syn_recv __read_mostly = 60 SECS; +unsigned int ip_ct_tcp_timeout_established __read_mostly = 5 DAYS; +unsigned int ip_ct_tcp_timeout_fin_wait __read_mostly = 2 MINS; +unsigned int ip_ct_tcp_timeout_close_wait __read_mostly = 60 SECS; +unsigned int ip_ct_tcp_timeout_last_ack __read_mostly = 30 SECS; +unsigned int ip_ct_tcp_timeout_time_wait __read_mostly = 2 MINS; +unsigned int ip_ct_tcp_timeout_close __read_mostly = 10 SECS; /* RFC1122 says the R2 limit should be at least 100 seconds. Linux uses 15 packets as limit, which corresponds to ~13-30min depending on RTO. */ -unsigned int ip_ct_tcp_timeout_max_retrans = 5 MINS; +unsigned int ip_ct_tcp_timeout_max_retrans __read_mostly = 5 MINS; static const unsigned int * tcp_timeouts[] = { NULL, /* TCP_CONNTRACK_NONE */ diff --git a/net/ipv4/netfilter/ip_conntrack_proto_udp.c b/net/ipv4/netfilter/ip_conntrack_proto_udp.c index e58e52f..d0e8a16 100644 --- a/net/ipv4/netfilter/ip_conntrack_proto_udp.c +++ b/net/ipv4/netfilter/ip_conntrack_proto_udp.c @@ -18,8 +18,8 @@ #include #include -unsigned int ip_ct_udp_timeout = 30*HZ; -unsigned int ip_ct_udp_timeout_stream = 180*HZ; +unsigned int ip_ct_udp_timeout __read_mostly = 30*HZ; +unsigned int ip_ct_udp_timeout_stream __read_mostly = 180*HZ; static int udp_pkt_to_tuple(const struct sk_buff *skb, unsigned int dataoff, diff --git a/net/ipv4/netfilter/ip_conntrack_standalone.c b/net/ipv4/netfilter/ip_conntrack_standalone.c index 7a9fa04..3f5d495 100644 --- a/net/ipv4/netfilter/ip_conntrack_standalone.c +++ b/net/ipv4/netfilter/ip_conntrack_standalone.c @@ -534,7 +534,7 @@ static struct nf_hook_ops ip_conntrack_ops[] = { /* Sysctl support */ -int ip_conntrack_checksum = 1; +int ip_conntrack_checksum __read_mostly = 1; #ifdef CONFIG_SYSCTL @@ -563,7 +563,7 @@ extern unsigned int ip_ct_udp_timeout_stream; /* From ip_conntrack_proto_icmp.c */ extern unsigned int ip_ct_icmp_timeout; -/* From ip_conntrack_proto_icmp.c */ +/* From ip_conntrack_proto_generic.c */ extern unsigned int ip_ct_generic_timeout; /* Log invalid packets of a given protocol */ diff --git a/net/ipv4/netfilter/ip_queue.c b/net/ipv4/netfilter/ip_queue.c index 276a964..80060cb 100644 --- a/net/ipv4/netfilter/ip_queue.c +++ b/net/ipv4/netfilter/ip_queue.c @@ -53,7 +53,7 @@ struct ipq_queue_entry { typedef int (*ipq_cmpfn)(struct ipq_queue_entry *, unsigned long); static unsigned char copy_mode = IPQ_COPY_NONE; -static unsigned int queue_maxlen = IPQ_QMAX_DEFAULT; +static unsigned int queue_maxlen __read_mostly = IPQ_QMAX_DEFAULT; static DEFINE_RWLOCK(queue_lock); static int peer_pid; static unsigned int copy_range; diff --git a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c index 663a73e..790f00d 100644 --- a/net/ipv4/netfilter/nf_conntrack_proto_icmp.c +++ b/net/ipv4/netfilter/nf_conntrack_proto_icmp.c @@ -25,7 +25,7 @@ #include #include -unsigned long nf_ct_icmp_timeout = 30*HZ; +unsigned long nf_ct_icmp_timeout __read_mostly = 30*HZ; #if 0 #define DEBUGP printk diff --git a/net/ipv6/netfilter/ip6_queue.c b/net/ipv6/netfilter/ip6_queue.c index c01c126..d322e83 100644 --- a/net/ipv6/netfilter/ip6_queue.c +++ b/net/ipv6/netfilter/ip6_queue.c @@ -57,7 +57,7 @@ struct ipq_queue_entry { typedef int (*ipq_cmpfn)(struct ipq_queue_entry *, unsigned long); static unsigned char copy_mode = IPQ_COPY_NONE; -static unsigned int queue_maxlen = IPQ_QMAX_DEFAULT; +static unsigned int queue_maxlen __read_mostly = IPQ_QMAX_DEFAULT; static DEFINE_RWLOCK(queue_lock); static int peer_pid; static unsigned int copy_range; diff --git a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c index c2ab38f..e5e53ff 100644 --- a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c +++ b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c @@ -335,7 +335,7 @@ static struct nf_hook_ops ipv6_conntrack_ops[] = { /* From nf_conntrack_proto_icmpv6.c */ extern unsigned int nf_ct_icmpv6_timeout; -/* From nf_conntrack_frag6.c */ +/* From nf_conntrack_reasm.c */ extern unsigned int nf_ct_frag6_timeout; extern unsigned int nf_ct_frag6_low_thresh; extern unsigned int nf_ct_frag6_high_thresh; diff --git a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c index ef18a7b..34d4472 100644 --- a/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c +++ b/net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c @@ -33,7 +33,7 @@ #include #include -unsigned long nf_ct_icmpv6_timeout = 30*HZ; +unsigned long nf_ct_icmpv6_timeout __read_mostly = 30*HZ; #if 0 #define DEBUGP printk diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c index 7a4e4c2..bf93c1e 100644 --- a/net/ipv6/netfilter/nf_conntrack_reasm.c +++ b/net/ipv6/netfilter/nf_conntrack_reasm.c @@ -54,9 +54,9 @@ #define NF_CT_FRAG6_LOW_THRESH 196608 /* == 192*1024 */ #define NF_CT_FRAG6_TIMEOUT IPV6_FRAG_TIMEOUT -unsigned int nf_ct_frag6_high_thresh = 256*1024; -unsigned int nf_ct_frag6_low_thresh = 192*1024; -unsigned long nf_ct_frag6_timeout = IPV6_FRAG_TIMEOUT; +unsigned int nf_ct_frag6_high_thresh __read_mostly = 256*1024; +unsigned int nf_ct_frag6_low_thresh __read_mostly = 192*1024; +unsigned long nf_ct_frag6_timeout __read_mostly = IPV6_FRAG_TIMEOUT; struct nf_ct_frag6_skb_cb { diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index 8f22619..3b64dbe 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -77,12 +77,12 @@ LIST_HEAD(nf_conntrack_expect_list); struct nf_conntrack_protocol **nf_ct_protos[PF_MAX]; struct nf_conntrack_l3proto *nf_ct_l3protos[PF_MAX]; static LIST_HEAD(helpers); -unsigned int nf_conntrack_htable_size = 0; -int nf_conntrack_max; +unsigned int nf_conntrack_htable_size __read_mostly = 0; +int nf_conntrack_max __read_mostly; struct list_head *nf_conntrack_hash; static kmem_cache_t *nf_conntrack_expect_cachep; struct nf_conn nf_conntrack_untracked; -unsigned int nf_ct_log_invalid; +unsigned int nf_ct_log_invalid __read_mostly; static LIST_HEAD(unconfirmed); static int nf_conntrack_vmalloc; diff --git a/net/netfilter/nf_conntrack_proto_generic.c b/net/netfilter/nf_conntrack_proto_generic.c index 46bc27e..26408bb 100644 --- a/net/netfilter/nf_conntrack_proto_generic.c +++ b/net/netfilter/nf_conntrack_proto_generic.c @@ -17,7 +17,7 @@ #include #include -unsigned int nf_ct_generic_timeout = 600*HZ; +unsigned int nf_ct_generic_timeout __read_mostly = 600*HZ; static int generic_pkt_to_tuple(const struct sk_buff *skb, unsigned int dataoff, diff --git a/net/netfilter/nf_conntrack_proto_sctp.c b/net/netfilter/nf_conntrack_proto_sctp.c index 9bd8a78..af56877 100644 --- a/net/netfilter/nf_conntrack_proto_sctp.c +++ b/net/netfilter/nf_conntrack_proto_sctp.c @@ -64,13 +64,13 @@ static const char *sctp_conntrack_names[] = { #define HOURS * 60 MINS #define DAYS * 24 HOURS -static unsigned int nf_ct_sctp_timeout_closed = 10 SECS; -static unsigned int nf_ct_sctp_timeout_cookie_wait = 3 SECS; -static unsigned int nf_ct_sctp_timeout_cookie_echoed = 3 SECS; -static unsigned int nf_ct_sctp_timeout_established = 5 DAYS; -static unsigned int nf_ct_sctp_timeout_shutdown_sent = 300 SECS / 1000; -static unsigned int nf_ct_sctp_timeout_shutdown_recd = 300 SECS / 1000; -static unsigned int nf_ct_sctp_timeout_shutdown_ack_sent = 3 SECS; +static unsigned int nf_ct_sctp_timeout_closed __read_mostly = 10 SECS; +static unsigned int nf_ct_sctp_timeout_cookie_wait __read_mostly = 3 SECS; +static unsigned int nf_ct_sctp_timeout_cookie_echoed __read_mostly = 3 SECS; +static unsigned int nf_ct_sctp_timeout_established __read_mostly = 5 DAYS; +static unsigned int nf_ct_sctp_timeout_shutdown_sent __read_mostly = 300 SECS / 1000; +static unsigned int nf_ct_sctp_timeout_shutdown_recd __read_mostly = 300 SECS / 1000; +static unsigned int nf_ct_sctp_timeout_shutdown_ack_sent __read_mostly = 3 SECS; static unsigned int * sctp_timeouts[] = { NULL, /* SCTP_CONNTRACK_NONE */ diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c index 308d2ab..9fc0ee6 100644 --- a/net/netfilter/nf_conntrack_proto_tcp.c +++ b/net/netfilter/nf_conntrack_proto_tcp.c @@ -57,19 +57,19 @@ static DEFINE_RWLOCK(tcp_lock); /* "Be conservative in what you do, be liberal in what you accept from others." If it's non-zero, we mark only out of window RST segments as INVALID. */ -int nf_ct_tcp_be_liberal = 0; +int nf_ct_tcp_be_liberal __read_mostly = 0; /* When connection is picked up from the middle, how many packets are required to pass in each direction when we assume we are in sync - if any side uses window scaling, we lost the game. If it is set to zero, we disable picking up already established connections. */ -int nf_ct_tcp_loose = 3; +int nf_ct_tcp_loose __read_mostly = 3; /* Max number of the retransmitted packets without receiving an (acceptable) ACK from the destination. If this number is reached, a shorter timer will be started. */ -int nf_ct_tcp_max_retrans = 3; +int nf_ct_tcp_max_retrans __read_mostly = 3; /* FIXME: Examine ipfilter's timeouts and conntrack transitions more closely. They're more complex. --RR */ @@ -92,19 +92,19 @@ static const char *tcp_conntrack_names[] = { #define HOURS * 60 MINS #define DAYS * 24 HOURS -unsigned int nf_ct_tcp_timeout_syn_sent = 2 MINS; -unsigned int nf_ct_tcp_timeout_syn_recv = 60 SECS; -unsigned int nf_ct_tcp_timeout_established = 5 DAYS; -unsigned int nf_ct_tcp_timeout_fin_wait = 2 MINS; -unsigned int nf_ct_tcp_timeout_close_wait = 60 SECS; -unsigned int nf_ct_tcp_timeout_last_ack = 30 SECS; -unsigned int nf_ct_tcp_timeout_time_wait = 2 MINS; -unsigned int nf_ct_tcp_timeout_close = 10 SECS; +unsigned int nf_ct_tcp_timeout_syn_sent __read_mostly = 2 MINS; +unsigned int nf_ct_tcp_timeout_syn_recv __read_mostly = 60 SECS; +unsigned int nf_ct_tcp_timeout_established __read_mostly = 5 DAYS; +unsigned int nf_ct_tcp_timeout_fin_wait __read_mostly = 2 MINS; +unsigned int nf_ct_tcp_timeout_close_wait __read_mostly = 60 SECS; +unsigned int nf_ct_tcp_timeout_last_ack __read_mostly = 30 SECS; +unsigned int nf_ct_tcp_timeout_time_wait __read_mostly = 2 MINS; +unsigned int nf_ct_tcp_timeout_close __read_mostly = 10 SECS; /* RFC1122 says the R2 limit should be at least 100 seconds. Linux uses 15 packets as limit, which corresponds to ~13-30min depending on RTO. */ -unsigned int nf_ct_tcp_timeout_max_retrans = 5 MINS; +unsigned int nf_ct_tcp_timeout_max_retrans __read_mostly = 5 MINS; static unsigned int * tcp_timeouts[] = { NULL, /* TCP_CONNTRACK_NONE */ diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c index d36e031..d28981c 100644 --- a/net/netfilter/nf_conntrack_proto_udp.c +++ b/net/netfilter/nf_conntrack_proto_udp.c @@ -27,8 +27,8 @@ #include #include -unsigned int nf_ct_udp_timeout = 30*HZ; -unsigned int nf_ct_udp_timeout_stream = 180*HZ; +unsigned int nf_ct_udp_timeout __read_mostly = 30*HZ; +unsigned int nf_ct_udp_timeout_stream __read_mostly = 180*HZ; static int udp_pkt_to_tuple(const struct sk_buff *skb, unsigned int dataoff, diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c index 4ef8366..9a1de0c 100644 --- a/net/netfilter/nf_conntrack_standalone.c +++ b/net/netfilter/nf_conntrack_standalone.c @@ -428,7 +428,7 @@ static struct file_operations ct_cpu_seq_fops = { /* Sysctl support */ -int nf_conntrack_checksum = 1; +int nf_conntrack_checksum __read_mostly = 1; #ifdef CONFIG_SYSCTL -- cgit v0.10.2 From 461d8837faac141f4676bf451b3339d0e48656d1 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 18 Sep 2006 00:09:49 -0700 Subject: [IPV6] address: Convert address addition to new netlink api Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index fc9cff3..52ba96a 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -2868,6 +2868,29 @@ restart: spin_unlock_bh(&addrconf_verify_lock); } +static struct in6_addr *extract_addr(struct nlattr *addr, struct nlattr *local) +{ + struct in6_addr *pfx = NULL; + + if (addr) + pfx = nla_data(addr); + + if (local) { + if (pfx && nla_memcmp(local, pfx, sizeof(*pfx))) + pfx = NULL; + else + pfx = nla_data(local); + } + + return pfx; +} + +static struct nla_policy ifa_ipv6_policy[IFA_MAX+1] __read_mostly = { + [IFA_ADDRESS] = { .len = sizeof(struct in6_addr) }, + [IFA_LOCAL] = { .len = sizeof(struct in6_addr) }, + [IFA_CACHEINFO] = { .len = sizeof(struct ifa_cacheinfo) }, +}; + static int inet6_rtm_deladdr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) { @@ -2945,46 +2968,41 @@ inet6_addr_modify(int ifindex, struct in6_addr *pfx, static int inet6_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) { - struct rtattr **rta = arg; - struct ifaddrmsg *ifm = NLMSG_DATA(nlh); + struct ifaddrmsg *ifm; + struct nlattr *tb[IFA_MAX+1]; struct in6_addr *pfx; - __u32 valid_lft = INFINITY_LIFE_TIME, prefered_lft = INFINITY_LIFE_TIME; + u32 valid_lft, preferred_lft; + int err; - pfx = NULL; - if (rta[IFA_ADDRESS-1]) { - if (RTA_PAYLOAD(rta[IFA_ADDRESS-1]) < sizeof(*pfx)) - return -EINVAL; - pfx = RTA_DATA(rta[IFA_ADDRESS-1]); - } - if (rta[IFA_LOCAL-1]) { - if (RTA_PAYLOAD(rta[IFA_LOCAL-1]) < sizeof(*pfx) || - (pfx && memcmp(pfx, RTA_DATA(rta[IFA_LOCAL-1]), sizeof(*pfx)))) - return -EINVAL; - pfx = RTA_DATA(rta[IFA_LOCAL-1]); - } + err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFA_MAX, ifa_ipv6_policy); + if (err < 0) + return err; + + ifm = nlmsg_data(nlh); + pfx = extract_addr(tb[IFA_ADDRESS], tb[IFA_LOCAL]); if (pfx == NULL) return -EINVAL; - if (rta[IFA_CACHEINFO-1]) { + if (tb[IFA_CACHEINFO]) { struct ifa_cacheinfo *ci; - if (RTA_PAYLOAD(rta[IFA_CACHEINFO-1]) < sizeof(*ci)) - return -EINVAL; - ci = RTA_DATA(rta[IFA_CACHEINFO-1]); + + ci = nla_data(tb[IFA_CACHEINFO]); valid_lft = ci->ifa_valid; - prefered_lft = ci->ifa_prefered; + preferred_lft = ci->ifa_prefered; + } else { + preferred_lft = INFINITY_LIFE_TIME; + valid_lft = INFINITY_LIFE_TIME; } if (nlh->nlmsg_flags & NLM_F_REPLACE) { - int ret; - ret = inet6_addr_modify(ifm->ifa_index, pfx, - prefered_lft, valid_lft); - if (ret == 0 || !(nlh->nlmsg_flags & NLM_F_CREATE)) - return ret; + err = inet6_addr_modify(ifm->ifa_index, pfx, + preferred_lft, valid_lft); + if (err == 0 || !(nlh->nlmsg_flags & NLM_F_CREATE)) + return err; } return inet6_addr_add(ifm->ifa_index, pfx, ifm->ifa_prefixlen, - prefered_lft, valid_lft); - + preferred_lft, valid_lft); } /* Maximum length of ifa_cacheinfo attributes */ -- cgit v0.10.2 From b933f7166ba376967f88a598ff04256f6d1b0b21 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 18 Sep 2006 00:10:19 -0700 Subject: [IPV6] address: Convert address deletion to new netlink api Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 52ba96a..6162703 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -2894,22 +2894,17 @@ static struct nla_policy ifa_ipv6_policy[IFA_MAX+1] __read_mostly = { static int inet6_rtm_deladdr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) { - struct rtattr **rta = arg; - struct ifaddrmsg *ifm = NLMSG_DATA(nlh); + struct ifaddrmsg *ifm; + struct nlattr *tb[IFA_MAX+1]; struct in6_addr *pfx; + int err; - pfx = NULL; - if (rta[IFA_ADDRESS-1]) { - if (RTA_PAYLOAD(rta[IFA_ADDRESS-1]) < sizeof(*pfx)) - return -EINVAL; - pfx = RTA_DATA(rta[IFA_ADDRESS-1]); - } - if (rta[IFA_LOCAL-1]) { - if (RTA_PAYLOAD(rta[IFA_LOCAL-1]) < sizeof(*pfx) || - (pfx && memcmp(pfx, RTA_DATA(rta[IFA_LOCAL-1]), sizeof(*pfx)))) - return -EINVAL; - pfx = RTA_DATA(rta[IFA_LOCAL-1]); - } + err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFA_MAX, ifa_ipv6_policy); + if (err < 0) + return err; + + ifm = nlmsg_data(nlh); + pfx = extract_addr(tb[IFA_ADDRESS], tb[IFA_LOCAL]); if (pfx == NULL) return -EINVAL; -- cgit v0.10.2 From 1b29fc2c8bf42d8fc5310f3770d7fd7ddf4386c0 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 18 Sep 2006 00:10:50 -0700 Subject: [IPV6] address: Convert address lookup to new netlink api Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 6162703..b2c38b3 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -3234,58 +3234,54 @@ static int inet6_dump_ifacaddr(struct sk_buff *skb, struct netlink_callback *cb) return inet6_dump_addr(skb, cb, type); } -static int inet6_rtm_getaddr(struct sk_buff *in_skb, - struct nlmsghdr* nlh, void *arg) +static int inet6_rtm_getaddr(struct sk_buff *in_skb, struct nlmsghdr* nlh, + void *arg) { - struct rtattr **rta = arg; - struct ifaddrmsg *ifm = NLMSG_DATA(nlh); + struct ifaddrmsg *ifm; + struct nlattr *tb[IFA_MAX+1]; struct in6_addr *addr = NULL; struct net_device *dev = NULL; struct inet6_ifaddr *ifa; struct sk_buff *skb; - int size = NLMSG_SPACE(sizeof(struct ifaddrmsg) + INET6_IFADDR_RTA_SPACE); + int payload = sizeof(struct ifaddrmsg) + INET6_IFADDR_RTA_SPACE; int err; - if (rta[IFA_ADDRESS-1]) { - if (RTA_PAYLOAD(rta[IFA_ADDRESS-1]) < sizeof(*addr)) - return -EINVAL; - addr = RTA_DATA(rta[IFA_ADDRESS-1]); - } - if (rta[IFA_LOCAL-1]) { - if (RTA_PAYLOAD(rta[IFA_LOCAL-1]) < sizeof(*addr) || - (addr && memcmp(addr, RTA_DATA(rta[IFA_LOCAL-1]), sizeof(*addr)))) - return -EINVAL; - addr = RTA_DATA(rta[IFA_LOCAL-1]); + err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFA_MAX, ifa_ipv6_policy); + if (err < 0) + goto errout; + + addr = extract_addr(tb[IFA_ADDRESS], tb[IFA_LOCAL]); + if (addr == NULL) { + err = -EINVAL; + goto errout; } - if (addr == NULL) - return -EINVAL; + ifm = nlmsg_data(nlh); if (ifm->ifa_index) dev = __dev_get_by_index(ifm->ifa_index); - if ((ifa = ipv6_get_ifaddr(addr, dev, 1)) == NULL) - return -EADDRNOTAVAIL; + if ((ifa = ipv6_get_ifaddr(addr, dev, 1)) == NULL) { + err = -EADDRNOTAVAIL; + goto errout; + } - if ((skb = alloc_skb(size, GFP_KERNEL)) == NULL) { + if ((skb = nlmsg_new(nlmsg_total_size(payload), GFP_KERNEL)) == NULL) { err = -ENOBUFS; - goto out; + goto errout_ifa; } - NETLINK_CB(skb).dst_pid = NETLINK_CB(in_skb).pid; err = inet6_fill_ifaddr(skb, ifa, NETLINK_CB(in_skb).pid, nlh->nlmsg_seq, RTM_NEWADDR, 0); if (err < 0) { - err = -EMSGSIZE; - goto out_free; + kfree_skb(skb); + goto errout_ifa; } err = rtnl_unicast(skb, NETLINK_CB(in_skb).pid); -out: +errout_ifa: in6_ifa_put(ifa); +errout: return err; -out_free: - kfree_skb(skb); - goto out; } static void inet6_ifa_notify(int event, struct inet6_ifaddr *ifa) -- cgit v0.10.2 From 85486af00b620ebe26fe0fa72172c115667a2fba Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 18 Sep 2006 00:11:24 -0700 Subject: [IPV6] address: Add put_cacheinfo() to dump struct cacheinfo Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index b2c38b3..d546f0e 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -3000,6 +3000,21 @@ inet6_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) preferred_lft, valid_lft); } +static int put_cacheinfo(struct sk_buff *skb, unsigned long cstamp, + unsigned long tstamp, u32 preferred, u32 valid) +{ + struct ifa_cacheinfo ci; + + ci.cstamp = (u32)(TIME_DELTA(cstamp, INITIAL_JIFFIES) / HZ * 100 + + TIME_DELTA(cstamp, INITIAL_JIFFIES) % HZ * 100 / HZ); + ci.tstamp = (u32)(TIME_DELTA(tstamp, INITIAL_JIFFIES) / HZ * 100 + + TIME_DELTA(tstamp, INITIAL_JIFFIES) % HZ * 100 / HZ); + ci.ifa_prefered = preferred; + ci.ifa_valid = valid; + + return nla_put(skb, IFA_CACHEINFO, sizeof(ci), &ci); +} + /* Maximum length of ifa_cacheinfo attributes */ #define INET6_IFADDR_RTA_SPACE \ RTA_SPACE(16) /* IFA_ADDRESS */ + \ @@ -3010,8 +3025,8 @@ static int inet6_fill_ifaddr(struct sk_buff *skb, struct inet6_ifaddr *ifa, { struct ifaddrmsg *ifm; struct nlmsghdr *nlh; - struct ifa_cacheinfo ci; unsigned char *b = skb->tail; + u32 preferred, valid; nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*ifm), flags); ifm = NLMSG_DATA(nlh); @@ -3028,23 +3043,22 @@ static int inet6_fill_ifaddr(struct sk_buff *skb, struct inet6_ifaddr *ifa, ifm->ifa_index = ifa->idev->dev->ifindex; RTA_PUT(skb, IFA_ADDRESS, 16, &ifa->addr); if (!(ifa->flags&IFA_F_PERMANENT)) { - ci.ifa_prefered = ifa->prefered_lft; - ci.ifa_valid = ifa->valid_lft; - if (ci.ifa_prefered != INFINITY_LIFE_TIME) { + preferred = ifa->prefered_lft; + valid = ifa->valid_lft; + if (preferred != INFINITY_LIFE_TIME) { long tval = (jiffies - ifa->tstamp)/HZ; - ci.ifa_prefered -= tval; - if (ci.ifa_valid != INFINITY_LIFE_TIME) - ci.ifa_valid -= tval; + preferred -= tval; + if (valid != INFINITY_LIFE_TIME) + valid -= tval; } } else { - ci.ifa_prefered = INFINITY_LIFE_TIME; - ci.ifa_valid = INFINITY_LIFE_TIME; + preferred = INFINITY_LIFE_TIME; + valid = INFINITY_LIFE_TIME; } - ci.cstamp = (__u32)(TIME_DELTA(ifa->cstamp, INITIAL_JIFFIES) / HZ * 100 - + TIME_DELTA(ifa->cstamp, INITIAL_JIFFIES) % HZ * 100 / HZ); - ci.tstamp = (__u32)(TIME_DELTA(ifa->tstamp, INITIAL_JIFFIES) / HZ * 100 - + TIME_DELTA(ifa->tstamp, INITIAL_JIFFIES) % HZ * 100 / HZ); - RTA_PUT(skb, IFA_CACHEINFO, sizeof(ci), &ci); + + if (put_cacheinfo(skb, ifa->cstamp, ifa->tstamp, preferred, valid) < 0) + goto rtattr_failure; + nlh->nlmsg_len = skb->tail - b; return skb->len; @@ -3059,7 +3073,6 @@ static int inet6_fill_ifmcaddr(struct sk_buff *skb, struct ifmcaddr6 *ifmca, { struct ifaddrmsg *ifm; struct nlmsghdr *nlh; - struct ifa_cacheinfo ci; unsigned char *b = skb->tail; nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*ifm), flags); @@ -3072,15 +3085,11 @@ static int inet6_fill_ifmcaddr(struct sk_buff *skb, struct ifmcaddr6 *ifmca, ifm->ifa_scope = RT_SCOPE_SITE; ifm->ifa_index = ifmca->idev->dev->ifindex; RTA_PUT(skb, IFA_MULTICAST, 16, &ifmca->mca_addr); - ci.cstamp = (__u32)(TIME_DELTA(ifmca->mca_cstamp, INITIAL_JIFFIES) / HZ - * 100 + TIME_DELTA(ifmca->mca_cstamp, INITIAL_JIFFIES) % HZ - * 100 / HZ); - ci.tstamp = (__u32)(TIME_DELTA(ifmca->mca_tstamp, INITIAL_JIFFIES) / HZ - * 100 + TIME_DELTA(ifmca->mca_tstamp, INITIAL_JIFFIES) % HZ - * 100 / HZ); - ci.ifa_prefered = INFINITY_LIFE_TIME; - ci.ifa_valid = INFINITY_LIFE_TIME; - RTA_PUT(skb, IFA_CACHEINFO, sizeof(ci), &ci); + + if (put_cacheinfo(skb, ifmca->mca_cstamp, ifmca->mca_tstamp, + INFINITY_LIFE_TIME, INFINITY_LIFE_TIME) < 0) + goto rtattr_failure; + nlh->nlmsg_len = skb->tail - b; return skb->len; @@ -3095,7 +3104,6 @@ static int inet6_fill_ifacaddr(struct sk_buff *skb, struct ifacaddr6 *ifaca, { struct ifaddrmsg *ifm; struct nlmsghdr *nlh; - struct ifa_cacheinfo ci; unsigned char *b = skb->tail; nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*ifm), flags); @@ -3108,15 +3116,11 @@ static int inet6_fill_ifacaddr(struct sk_buff *skb, struct ifacaddr6 *ifaca, ifm->ifa_scope = RT_SCOPE_SITE; ifm->ifa_index = ifaca->aca_idev->dev->ifindex; RTA_PUT(skb, IFA_ANYCAST, 16, &ifaca->aca_addr); - ci.cstamp = (__u32)(TIME_DELTA(ifaca->aca_cstamp, INITIAL_JIFFIES) / HZ - * 100 + TIME_DELTA(ifaca->aca_cstamp, INITIAL_JIFFIES) % HZ - * 100 / HZ); - ci.tstamp = (__u32)(TIME_DELTA(ifaca->aca_tstamp, INITIAL_JIFFIES) / HZ - * 100 + TIME_DELTA(ifaca->aca_tstamp, INITIAL_JIFFIES) % HZ - * 100 / HZ); - ci.ifa_prefered = INFINITY_LIFE_TIME; - ci.ifa_valid = INFINITY_LIFE_TIME; - RTA_PUT(skb, IFA_CACHEINFO, sizeof(ci), &ci); + + if (put_cacheinfo(skb, ifaca->aca_cstamp, ifaca->aca_tstamp, + INFINITY_LIFE_TIME, INFINITY_LIFE_TIME) < 0) + goto rtattr_failure; + nlh->nlmsg_len = skb->tail - b; return skb->len; -- cgit v0.10.2 From 101bb229691c438bce4d7f13006494dd4de6910a Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 18 Sep 2006 00:11:52 -0700 Subject: [IPV6] address: Add put_ifaddrmsg() and rt_scope() Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index d546f0e..ca7ecf2 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -3000,6 +3000,19 @@ inet6_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) preferred_lft, valid_lft); } +static void put_ifaddrmsg(struct nlmsghdr *nlh, u8 prefixlen, u8 flags, + u8 scope, int ifindex) +{ + struct ifaddrmsg *ifm; + + ifm = nlmsg_data(nlh); + ifm->ifa_family = AF_INET6; + ifm->ifa_prefixlen = prefixlen; + ifm->ifa_flags = flags; + ifm->ifa_scope = scope; + ifm->ifa_index = ifindex; +} + static int put_cacheinfo(struct sk_buff *skb, unsigned long cstamp, unsigned long tstamp, u32 preferred, u32 valid) { @@ -3015,6 +3028,18 @@ static int put_cacheinfo(struct sk_buff *skb, unsigned long cstamp, return nla_put(skb, IFA_CACHEINFO, sizeof(ci), &ci); } +static inline int rt_scope(int ifa_scope) +{ + if (ifa_scope & IFA_HOST) + return RT_SCOPE_HOST; + else if (ifa_scope & IFA_LINK) + return RT_SCOPE_LINK; + else if (ifa_scope & IFA_SITE) + return RT_SCOPE_SITE; + else + return RT_SCOPE_UNIVERSE; +} + /* Maximum length of ifa_cacheinfo attributes */ #define INET6_IFADDR_RTA_SPACE \ RTA_SPACE(16) /* IFA_ADDRESS */ + \ @@ -3023,24 +3048,14 @@ static int put_cacheinfo(struct sk_buff *skb, unsigned long cstamp, static int inet6_fill_ifaddr(struct sk_buff *skb, struct inet6_ifaddr *ifa, u32 pid, u32 seq, int event, unsigned int flags) { - struct ifaddrmsg *ifm; struct nlmsghdr *nlh; unsigned char *b = skb->tail; u32 preferred, valid; - nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*ifm), flags); - ifm = NLMSG_DATA(nlh); - ifm->ifa_family = AF_INET6; - ifm->ifa_prefixlen = ifa->prefix_len; - ifm->ifa_flags = ifa->flags; - ifm->ifa_scope = RT_SCOPE_UNIVERSE; - if (ifa->scope&IFA_HOST) - ifm->ifa_scope = RT_SCOPE_HOST; - else if (ifa->scope&IFA_LINK) - ifm->ifa_scope = RT_SCOPE_LINK; - else if (ifa->scope&IFA_SITE) - ifm->ifa_scope = RT_SCOPE_SITE; - ifm->ifa_index = ifa->idev->dev->ifindex; + nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(struct ifaddrmsg), flags); + put_ifaddrmsg(nlh, ifa->prefix_len, ifa->flags, rt_scope(ifa->scope), + ifa->idev->dev->ifindex); + RTA_PUT(skb, IFA_ADDRESS, 16, &ifa->addr); if (!(ifa->flags&IFA_F_PERMANENT)) { preferred = ifa->prefered_lft; @@ -3071,19 +3086,16 @@ rtattr_failure: static int inet6_fill_ifmcaddr(struct sk_buff *skb, struct ifmcaddr6 *ifmca, u32 pid, u32 seq, int event, u16 flags) { - struct ifaddrmsg *ifm; struct nlmsghdr *nlh; unsigned char *b = skb->tail; + u8 scope = RT_SCOPE_UNIVERSE; + int ifindex = ifmca->idev->dev->ifindex; + + if (ipv6_addr_scope(&ifmca->mca_addr) & IFA_SITE) + scope = RT_SCOPE_SITE; - nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*ifm), flags); - ifm = NLMSG_DATA(nlh); - ifm->ifa_family = AF_INET6; - ifm->ifa_prefixlen = 128; - ifm->ifa_flags = IFA_F_PERMANENT; - ifm->ifa_scope = RT_SCOPE_UNIVERSE; - if (ipv6_addr_scope(&ifmca->mca_addr)&IFA_SITE) - ifm->ifa_scope = RT_SCOPE_SITE; - ifm->ifa_index = ifmca->idev->dev->ifindex; + nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(struct ifaddrmsg), flags); + put_ifaddrmsg(nlh, 128, IFA_F_PERMANENT, scope, ifindex); RTA_PUT(skb, IFA_MULTICAST, 16, &ifmca->mca_addr); if (put_cacheinfo(skb, ifmca->mca_cstamp, ifmca->mca_tstamp, @@ -3102,19 +3114,16 @@ rtattr_failure: static int inet6_fill_ifacaddr(struct sk_buff *skb, struct ifacaddr6 *ifaca, u32 pid, u32 seq, int event, unsigned int flags) { - struct ifaddrmsg *ifm; struct nlmsghdr *nlh; unsigned char *b = skb->tail; + u8 scope = RT_SCOPE_UNIVERSE; + int ifindex = ifaca->aca_idev->dev->ifindex; + + if (ipv6_addr_scope(&ifaca->aca_addr) & IFA_SITE) + scope = RT_SCOPE_SITE; - nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(*ifm), flags); - ifm = NLMSG_DATA(nlh); - ifm->ifa_family = AF_INET6; - ifm->ifa_prefixlen = 128; - ifm->ifa_flags = IFA_F_PERMANENT; - ifm->ifa_scope = RT_SCOPE_UNIVERSE; - if (ipv6_addr_scope(&ifaca->aca_addr)&IFA_SITE) - ifm->ifa_scope = RT_SCOPE_SITE; - ifm->ifa_index = ifaca->aca_idev->dev->ifindex; + nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(struct ifaddrmsg), flags); + put_ifaddrmsg(nlh, 128, IFA_F_PERMANENT, scope, ifindex); RTA_PUT(skb, IFA_ANYCAST, 16, &ifaca->aca_addr); if (put_cacheinfo(skb, ifaca->aca_cstamp, ifaca->aca_tstamp, -- cgit v0.10.2 From 0ab6803bc90a8ee5c860ef09334b30007d1746be Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 18 Sep 2006 00:12:35 -0700 Subject: [IPV6] address: Convert address dumping to new netlink api Replaces INET6_IFADDR_RTA_SPACE with a new function calculating the total required message size for all address messages. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index ca7ecf2..75a69ba 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -3040,23 +3040,27 @@ static inline int rt_scope(int ifa_scope) return RT_SCOPE_UNIVERSE; } -/* Maximum length of ifa_cacheinfo attributes */ -#define INET6_IFADDR_RTA_SPACE \ - RTA_SPACE(16) /* IFA_ADDRESS */ + \ - RTA_SPACE(sizeof(struct ifa_cacheinfo)) /* CACHEINFO */ +static inline int inet6_ifaddr_msgsize(void) +{ + return nlmsg_total_size(sizeof(struct ifaddrmsg) + + nla_total_size(16) + + nla_total_size(sizeof(struct ifa_cacheinfo)) + + 128); +} static int inet6_fill_ifaddr(struct sk_buff *skb, struct inet6_ifaddr *ifa, u32 pid, u32 seq, int event, unsigned int flags) { struct nlmsghdr *nlh; - unsigned char *b = skb->tail; u32 preferred, valid; - nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(struct ifaddrmsg), flags); + nlh = nlmsg_put(skb, pid, seq, event, sizeof(struct ifaddrmsg), flags); + if (nlh == NULL) + return -ENOBUFS; + put_ifaddrmsg(nlh, ifa->prefix_len, ifa->flags, rt_scope(ifa->scope), ifa->idev->dev->ifindex); - RTA_PUT(skb, IFA_ADDRESS, 16, &ifa->addr); if (!(ifa->flags&IFA_F_PERMANENT)) { preferred = ifa->prefered_lft; valid = ifa->valid_lft; @@ -3071,72 +3075,57 @@ static int inet6_fill_ifaddr(struct sk_buff *skb, struct inet6_ifaddr *ifa, valid = INFINITY_LIFE_TIME; } - if (put_cacheinfo(skb, ifa->cstamp, ifa->tstamp, preferred, valid) < 0) - goto rtattr_failure; + if (nla_put(skb, IFA_ADDRESS, 16, &ifa->addr) < 0 || + put_cacheinfo(skb, ifa->cstamp, ifa->tstamp, preferred, valid) < 0) + return nlmsg_cancel(skb, nlh); - nlh->nlmsg_len = skb->tail - b; - return skb->len; - -nlmsg_failure: -rtattr_failure: - skb_trim(skb, b - skb->data); - return -1; + return nlmsg_end(skb, nlh); } static int inet6_fill_ifmcaddr(struct sk_buff *skb, struct ifmcaddr6 *ifmca, u32 pid, u32 seq, int event, u16 flags) { struct nlmsghdr *nlh; - unsigned char *b = skb->tail; u8 scope = RT_SCOPE_UNIVERSE; int ifindex = ifmca->idev->dev->ifindex; if (ipv6_addr_scope(&ifmca->mca_addr) & IFA_SITE) scope = RT_SCOPE_SITE; - nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(struct ifaddrmsg), flags); - put_ifaddrmsg(nlh, 128, IFA_F_PERMANENT, scope, ifindex); - RTA_PUT(skb, IFA_MULTICAST, 16, &ifmca->mca_addr); + nlh = nlmsg_put(skb, pid, seq, event, sizeof(struct ifaddrmsg), flags); + if (nlh == NULL) + return -ENOBUFS; - if (put_cacheinfo(skb, ifmca->mca_cstamp, ifmca->mca_tstamp, + put_ifaddrmsg(nlh, 128, IFA_F_PERMANENT, scope, ifindex); + if (nla_put(skb, IFA_MULTICAST, 16, &ifmca->mca_addr) < 0 || + put_cacheinfo(skb, ifmca->mca_cstamp, ifmca->mca_tstamp, INFINITY_LIFE_TIME, INFINITY_LIFE_TIME) < 0) - goto rtattr_failure; + return nlmsg_cancel(skb, nlh); - nlh->nlmsg_len = skb->tail - b; - return skb->len; - -nlmsg_failure: -rtattr_failure: - skb_trim(skb, b - skb->data); - return -1; + return nlmsg_end(skb, nlh); } static int inet6_fill_ifacaddr(struct sk_buff *skb, struct ifacaddr6 *ifaca, u32 pid, u32 seq, int event, unsigned int flags) { struct nlmsghdr *nlh; - unsigned char *b = skb->tail; u8 scope = RT_SCOPE_UNIVERSE; int ifindex = ifaca->aca_idev->dev->ifindex; if (ipv6_addr_scope(&ifaca->aca_addr) & IFA_SITE) scope = RT_SCOPE_SITE; - nlh = NLMSG_NEW(skb, pid, seq, event, sizeof(struct ifaddrmsg), flags); - put_ifaddrmsg(nlh, 128, IFA_F_PERMANENT, scope, ifindex); - RTA_PUT(skb, IFA_ANYCAST, 16, &ifaca->aca_addr); + nlh = nlmsg_put(skb, pid, seq, event, sizeof(struct ifaddrmsg), flags); + if (nlh == NULL) + return -ENOBUFS; - if (put_cacheinfo(skb, ifaca->aca_cstamp, ifaca->aca_tstamp, + put_ifaddrmsg(nlh, 128, IFA_F_PERMANENT, scope, ifindex); + if (nla_put(skb, IFA_ANYCAST, 16, &ifaca->aca_addr) < 0 || + put_cacheinfo(skb, ifaca->aca_cstamp, ifaca->aca_tstamp, INFINITY_LIFE_TIME, INFINITY_LIFE_TIME) < 0) - goto rtattr_failure; + return nlmsg_cancel(skb, nlh); - nlh->nlmsg_len = skb->tail - b; - return skb->len; - -nlmsg_failure: -rtattr_failure: - skb_trim(skb, b - skb->data); - return -1; + return nlmsg_end(skb, nlh); } enum addr_type_t @@ -3256,7 +3245,6 @@ static int inet6_rtm_getaddr(struct sk_buff *in_skb, struct nlmsghdr* nlh, struct net_device *dev = NULL; struct inet6_ifaddr *ifa; struct sk_buff *skb; - int payload = sizeof(struct ifaddrmsg) + INET6_IFADDR_RTA_SPACE; int err; err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFA_MAX, ifa_ipv6_policy); @@ -3278,7 +3266,7 @@ static int inet6_rtm_getaddr(struct sk_buff *in_skb, struct nlmsghdr* nlh, goto errout; } - if ((skb = nlmsg_new(nlmsg_total_size(payload), GFP_KERNEL)) == NULL) { + if ((skb = nlmsg_new(inet6_ifaddr_msgsize(), GFP_KERNEL)) == NULL) { err = -ENOBUFS; goto errout_ifa; } @@ -3300,10 +3288,9 @@ errout: static void inet6_ifa_notify(int event, struct inet6_ifaddr *ifa) { struct sk_buff *skb; - int payload = sizeof(struct ifaddrmsg) + INET6_IFADDR_RTA_SPACE; int err = -ENOBUFS; - skb = nlmsg_new(nlmsg_total_size(payload), GFP_ATOMIC); + skb = nlmsg_new(inet6_ifaddr_msgsize(), GFP_ATOMIC); if (skb == NULL) goto errout; -- cgit v0.10.2 From 680a27a23af45307095ae432dd0bc859e1fbe219 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 18 Sep 2006 00:13:07 -0700 Subject: [IPV6] address: Allow address changes while device is administrative down Same behaviour as IPv4, using IFF_UP is a no-no anyway. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 75a69ba..bb18b9c 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -1886,9 +1886,6 @@ static int inet6_addr_add(int ifindex, struct in6_addr *pfx, int plen, if ((dev = __dev_get_by_index(ifindex)) == NULL) return -ENODEV; - if (!(dev->flags&IFF_UP)) - return -ENETDOWN; - if ((idev = addrconf_add_dev(dev)) == NULL) return -ENOBUFS; @@ -2922,9 +2919,6 @@ inet6_addr_modify(int ifindex, struct in6_addr *pfx, if ((dev = __dev_get_by_index(ifindex)) == NULL) return -ENODEV; - if (!(dev->flags&IFF_UP)) - return -ENETDOWN; - if (!valid_lft || (prefered_lft > valid_lft)) return -EINVAL; -- cgit v0.10.2 From 7198f8cec12ccec6d6f2e18c08ecc8c66c8aaf93 Mon Sep 17 00:00:00 2001 From: Thomas Graf Date: Mon, 18 Sep 2006 00:13:46 -0700 Subject: [IPV6] address: Support NLM_F_EXCL when adding addresses iproute2 doesn't provide the NLM_F_CREATE flag when adding addresses, it is assumed to be implied. The existing code issues a check on said flag when the modify operation fails (likely due to ENOENT) before continueing to create it, this leads to a hard to predict result, therefore the NLM_F_CREATE check is removed. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index bb18b9c..1e5a296 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -2908,24 +2908,14 @@ inet6_rtm_deladdr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) return inet6_addr_del(ifm->ifa_index, pfx, ifm->ifa_prefixlen); } -static int -inet6_addr_modify(int ifindex, struct in6_addr *pfx, - __u32 prefered_lft, __u32 valid_lft) +static int inet6_addr_modify(struct inet6_ifaddr *ifp, u32 prefered_lft, + u32 valid_lft) { - struct inet6_ifaddr *ifp = NULL; - struct net_device *dev; int ifa_flags = 0; - if ((dev = __dev_get_by_index(ifindex)) == NULL) - return -ENODEV; - if (!valid_lft || (prefered_lft > valid_lft)) return -EINVAL; - ifp = ipv6_get_ifaddr(pfx, dev, 1); - if (ifp == NULL) - return -ENOENT; - if (valid_lft == INFINITY_LIFE_TIME) ifa_flags = IFA_F_PERMANENT; else if (valid_lft >= 0x7FFFFFFF/HZ) @@ -2947,7 +2937,6 @@ inet6_addr_modify(int ifindex, struct in6_addr *pfx, spin_unlock_bh(&ifp->lock); if (!(ifp->flags&IFA_F_TENTATIVE)) ipv6_ifa_notify(0, ifp); - in6_ifa_put(ifp); addrconf_verify(0); @@ -2960,6 +2949,8 @@ inet6_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) struct ifaddrmsg *ifm; struct nlattr *tb[IFA_MAX+1]; struct in6_addr *pfx; + struct inet6_ifaddr *ifa; + struct net_device *dev; u32 valid_lft, preferred_lft; int err; @@ -2983,15 +2974,29 @@ inet6_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) valid_lft = INFINITY_LIFE_TIME; } - if (nlh->nlmsg_flags & NLM_F_REPLACE) { - err = inet6_addr_modify(ifm->ifa_index, pfx, - preferred_lft, valid_lft); - if (err == 0 || !(nlh->nlmsg_flags & NLM_F_CREATE)) - return err; + dev = __dev_get_by_index(ifm->ifa_index); + if (dev == NULL) + return -ENODEV; + + ifa = ipv6_get_ifaddr(pfx, dev, 1); + if (ifa == NULL) { + /* + * It would be best to check for !NLM_F_CREATE here but + * userspace alreay relies on not having to provide this. + */ + return inet6_addr_add(ifm->ifa_index, pfx, ifm->ifa_prefixlen, + preferred_lft, valid_lft); } - return inet6_addr_add(ifm->ifa_index, pfx, ifm->ifa_prefixlen, - preferred_lft, valid_lft); + if (nlh->nlmsg_flags & NLM_F_EXCL || + !(nlh->nlmsg_flags & NLM_F_REPLACE)) + err = -EEXIST; + else + err = inet6_addr_modify(ifa, preferred_lft, valid_lft); + + in6_ifa_put(ifa); + + return err; } static void put_ifaddrmsg(struct nlmsghdr *nlh, u8 prefixlen, u8 flags, -- cgit v0.10.2 From 161643660129dd7d98f0b12418c0a2710ffa7db6 Mon Sep 17 00:00:00 2001 From: Adrian Bunk Date: Mon, 18 Sep 2006 00:40:38 -0700 Subject: [SCTP]: Cleanups This patch contains the following cleanups: - make the following needlessly global function static: - socket.c: sctp_apply_peer_addr_params() - add proper prototypes for the several global functions in include/net/sctp/sctp.h Note that this fixes wrong prototypes for the following functions: - sctp_snmp_proc_exit() - sctp_eps_proc_exit() - sctp_assocs_proc_exit() The latter was spotted by the GNU C compiler and reported by David Woodhouse. Signed-off-by: Adrian Bunk Acked-by: Sridhar Samudrala Signed-off-by: David S. Miller diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h index e274fd4..ee68a31 100644 --- a/include/net/sctp/sctp.h +++ b/include/net/sctp/sctp.h @@ -128,6 +128,8 @@ extern int sctp_copy_local_addr_list(struct sctp_bind_addr *, int flags); extern struct sctp_pf *sctp_get_pf_specific(sa_family_t family); extern int sctp_register_pf(struct sctp_pf *, sa_family_t); +int sctp_inetaddr_event(struct notifier_block *this, unsigned long ev, + void *ptr); /* * sctp/socket.c @@ -178,6 +180,17 @@ void sctp_backlog_migrate(struct sctp_association *assoc, struct sock *oldsk, struct sock *newsk); /* + * sctp/proc.c + */ +int sctp_snmp_proc_init(void); +void sctp_snmp_proc_exit(void); +int sctp_eps_proc_init(void); +void sctp_eps_proc_exit(void); +int sctp_assocs_proc_init(void); +void sctp_assocs_proc_exit(void); + + +/* * Section: Macros, externs, and inlines */ diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c index 99c0cef..fd87e3c 100644 --- a/net/sctp/ipv6.c +++ b/net/sctp/ipv6.c @@ -78,7 +78,6 @@ #include -extern int sctp_inetaddr_event(struct notifier_block *, unsigned long, void *); static struct notifier_block sctp_inet6addr_notifier = { .notifier_call = sctp_inetaddr_event, }; diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c index d9dd4c4..fac7674 100644 --- a/net/sctp/protocol.c +++ b/net/sctp/protocol.c @@ -82,13 +82,6 @@ static struct sctp_af *sctp_af_v6_specific; kmem_cache_t *sctp_chunk_cachep __read_mostly; kmem_cache_t *sctp_bucket_cachep __read_mostly; -extern int sctp_snmp_proc_init(void); -extern int sctp_snmp_proc_exit(void); -extern int sctp_eps_proc_init(void); -extern int sctp_eps_proc_exit(void); -extern int sctp_assocs_proc_init(void); -extern int sctp_assocs_proc_exit(void); - /* Return the address of the control sock. */ struct sock *sctp_get_ctl_sock(void) { diff --git a/net/sctp/socket.c b/net/sctp/socket.c index 7c1dbb1..79c3e07 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -2081,13 +2081,13 @@ static int sctp_setsockopt_autoclose(struct sock *sk, char __user *optval, * SPP_SACKDELAY_ENABLE, setting both will have undefined * results. */ -int sctp_apply_peer_addr_params(struct sctp_paddrparams *params, - struct sctp_transport *trans, - struct sctp_association *asoc, - struct sctp_sock *sp, - int hb_change, - int pmtud_change, - int sackdelay_change) +static int sctp_apply_peer_addr_params(struct sctp_paddrparams *params, + struct sctp_transport *trans, + struct sctp_association *asoc, + struct sctp_sock *sp, + int hb_change, + int pmtud_change, + int sackdelay_change) { int error; -- cgit v0.10.2 From 4eb327b517cf85f6cb7dcd5691e7b748cbe8c343 Mon Sep 17 00:00:00 2001 From: Venkat Yekkirala Date: Tue, 19 Sep 2006 10:24:19 -0700 Subject: [SELINUX]: Fix bug in security_sid_mls_copy The following fixes a bug where random mem is being tampered with in the non-mls case; encountered by Jashua Brindle on a gentoo box. Signed-off-by: Venkat Yekkirala Acked-by: Stephen Smalley Signed-off-by: James Morris diff --git a/security/selinux/ss/services.c b/security/selinux/ss/services.c index 27ee28c..7eb69a6 100644 --- a/security/selinux/ss/services.c +++ b/security/selinux/ss/services.c @@ -1841,7 +1841,7 @@ int security_sid_mls_copy(u32 sid, u32 mls_sid, u32 *new_sid) u32 len; int rc = 0; - if (!ss_initialized) { + if (!ss_initialized || !selinux_mls_enabled) { *new_sid = sid; goto out; } -- cgit v0.10.2 From 1ef9696c909060ccdae3ade245ca88692b49285b Mon Sep 17 00:00:00 2001 From: Alexey Kuznetsov Date: Tue, 19 Sep 2006 12:52:50 -0700 Subject: [TCP]: Send ACKs each 2nd received segment. It does not affect either mss-sized connections (obviously) or connections controlled by Nagle (because there is only one small segment in flight). The idea is to record the fact that a small segment arrives on a connection, where one small segment has already been received and still not-ACKed. In this case ACK is forced after tcp_recvmsg() drains receive buffer. In other words, it is a "soft" each-2nd-segment ACK, which is enough to preserve ACK clock even when ABC is enabled. Signed-off-by: Alexey Kuznetsov Signed-off-by: David S. Miller diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h index 9bf73fe..de4e83b6 100644 --- a/include/net/inet_connection_sock.h +++ b/include/net/inet_connection_sock.h @@ -147,7 +147,8 @@ extern struct sock *inet_csk_clone(struct sock *sk, enum inet_csk_ack_state_t { ICSK_ACK_SCHED = 1, ICSK_ACK_TIMER = 2, - ICSK_ACK_PUSHED = 4 + ICSK_ACK_PUSHED = 4, + ICSK_ACK_PUSHED2 = 8 }; extern void inet_csk_init_xmit_timers(struct sock *sk, diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 29e3d60..66e9a72 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -955,8 +955,11 @@ void tcp_cleanup_rbuf(struct sock *sk, int copied) * receive buffer and there was a small segment * in queue. */ - (copied > 0 && (icsk->icsk_ack.pending & ICSK_ACK_PUSHED) && - !icsk->icsk_ack.pingpong && !atomic_read(&sk->sk_rmem_alloc))) + (copied > 0 && + ((icsk->icsk_ack.pending & ICSK_ACK_PUSHED2) || + ((icsk->icsk_ack.pending & ICSK_ACK_PUSHED) && + !icsk->icsk_ack.pingpong)) && + !atomic_read(&sk->sk_rmem_alloc))) time_to_ack = 1; } diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 511b738..b3def0d 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -156,6 +156,8 @@ static void tcp_measure_rcv_mss(struct sock *sk, return; } } + if (icsk->icsk_ack.pending & ICSK_ACK_PUSHED) + icsk->icsk_ack.pending |= ICSK_ACK_PUSHED2; icsk->icsk_ack.pending |= ICSK_ACK_PUSHED; } } -- cgit v0.10.2 From a1e59abf824969554b90facd44a4ab16e265afa4 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Tue, 19 Sep 2006 12:57:34 -0700 Subject: [XFRM]: Fix wildcard as tunnel source Hashing SAs by source address breaks templates with wildcards as tunnel source since the source address used for hashing/lookup is still 0/0. Move source address lookup to xfrm_tmpl_resolve_one() so we can use the real address in the lookup. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/net/xfrm.h b/include/net/xfrm.h index 4d6dc62..11e0b1d 100644 --- a/include/net/xfrm.h +++ b/include/net/xfrm.h @@ -222,6 +222,7 @@ struct xfrm_policy_afinfo { struct dst_ops *dst_ops; void (*garbage_collect)(void); int (*dst_lookup)(struct xfrm_dst **dst, struct flowi *fl); + int (*get_saddr)(xfrm_address_t *saddr, xfrm_address_t *daddr); struct dst_entry *(*find_bundle)(struct flowi *fl, struct xfrm_policy *policy); int (*bundle_create)(struct xfrm_policy *policy, struct xfrm_state **xfrm, @@ -631,6 +632,18 @@ secpath_reset(struct sk_buff *skb) } static inline int +xfrm_addr_any(xfrm_address_t *addr, unsigned short family) +{ + switch (family) { + case AF_INET: + return addr->a4 == 0; + case AF_INET6: + return ipv6_addr_any((struct in6_addr *)&addr->a6); + } + return 0; +} + +static inline int __xfrm4_state_addr_cmp(struct xfrm_tmpl *tmpl, struct xfrm_state *x) { return (tmpl->saddr.a4 && diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c index 4795985..eabcd27 100644 --- a/net/ipv4/xfrm4_policy.c +++ b/net/ipv4/xfrm4_policy.c @@ -21,6 +21,25 @@ static int xfrm4_dst_lookup(struct xfrm_dst **dst, struct flowi *fl) return __ip_route_output_key((struct rtable**)dst, fl); } +static int xfrm4_get_saddr(xfrm_address_t *saddr, xfrm_address_t *daddr) +{ + struct rtable *rt; + struct flowi fl_tunnel = { + .nl_u = { + .ip4_u = { + .daddr = daddr->a4, + }, + }, + }; + + if (!xfrm4_dst_lookup((struct xfrm_dst **)&rt, &fl_tunnel)) { + saddr->a4 = rt->rt_src; + dst_release(&rt->u.dst); + return 0; + } + return -EHOSTUNREACH; +} + static struct dst_entry * __xfrm4_find_bundle(struct flowi *fl, struct xfrm_policy *policy) { @@ -298,6 +317,7 @@ static struct xfrm_policy_afinfo xfrm4_policy_afinfo = { .family = AF_INET, .dst_ops = &xfrm4_dst_ops, .dst_lookup = xfrm4_dst_lookup, + .get_saddr = xfrm4_get_saddr, .find_bundle = __xfrm4_find_bundle, .bundle_create = __xfrm4_bundle_create, .decode_session = _decode_session4, diff --git a/net/ipv4/xfrm4_state.c b/net/ipv4/xfrm4_state.c index 6a2a4ab..fe20344 100644 --- a/net/ipv4/xfrm4_state.c +++ b/net/ipv4/xfrm4_state.c @@ -42,21 +42,6 @@ __xfrm4_init_tempsel(struct xfrm_state *x, struct flowi *fl, x->props.saddr = tmpl->saddr; if (x->props.saddr.a4 == 0) x->props.saddr.a4 = saddr->a4; - if (tmpl->mode == XFRM_MODE_TUNNEL && x->props.saddr.a4 == 0) { - struct rtable *rt; - struct flowi fl_tunnel = { - .nl_u = { - .ip4_u = { - .daddr = x->id.daddr.a4, - } - } - }; - if (!xfrm_dst_lookup((struct xfrm_dst **)&rt, - &fl_tunnel, AF_INET)) { - x->props.saddr.a4 = rt->rt_src; - dst_release(&rt->u.dst); - } - } x->props.mode = tmpl->mode; x->props.reqid = tmpl->reqid; x->props.family = AF_INET; diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c index 9391c4c..6a252e2 100644 --- a/net/ipv6/xfrm6_policy.c +++ b/net/ipv6/xfrm6_policy.c @@ -34,6 +34,26 @@ static int xfrm6_dst_lookup(struct xfrm_dst **dst, struct flowi *fl) return err; } +static int xfrm6_get_saddr(xfrm_address_t *saddr, xfrm_address_t *daddr) +{ + struct rt6_info *rt; + struct flowi fl_tunnel = { + .nl_u = { + .ip6_u = { + .daddr = *(struct in6_addr *)&daddr->a6, + }, + }, + }; + + if (!xfrm6_dst_lookup((struct xfrm_dst **)&rt, &fl_tunnel)) { + ipv6_get_saddr(&rt->u.dst, (struct in6_addr *)&daddr->a6, + (struct in6_addr *)&saddr->a6); + dst_release(&rt->u.dst); + return 0; + } + return -EHOSTUNREACH; +} + static struct dst_entry * __xfrm6_find_bundle(struct flowi *fl, struct xfrm_policy *policy) { @@ -362,6 +382,7 @@ static struct xfrm_policy_afinfo xfrm6_policy_afinfo = { .family = AF_INET6, .dst_ops = &xfrm6_dst_ops, .dst_lookup = xfrm6_dst_lookup, + .get_saddr = xfrm6_get_saddr, .find_bundle = __xfrm6_find_bundle, .bundle_create = __xfrm6_bundle_create, .decode_session = _decode_session6, diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c index d88cd92..711bfaf 100644 --- a/net/ipv6/xfrm6_state.c +++ b/net/ipv6/xfrm6_state.c @@ -42,22 +42,6 @@ __xfrm6_init_tempsel(struct xfrm_state *x, struct flowi *fl, memcpy(&x->props.saddr, &tmpl->saddr, sizeof(x->props.saddr)); if (ipv6_addr_any((struct in6_addr*)&x->props.saddr)) memcpy(&x->props.saddr, saddr, sizeof(x->props.saddr)); - if (tmpl->mode == XFRM_MODE_TUNNEL && ipv6_addr_any((struct in6_addr*)&x->props.saddr)) { - struct rt6_info *rt; - struct flowi fl_tunnel = { - .nl_u = { - .ip6_u = { - .daddr = *(struct in6_addr *)daddr, - } - } - }; - if (!xfrm_dst_lookup((struct xfrm_dst **)&rt, - &fl_tunnel, AF_INET6)) { - ipv6_get_saddr(&rt->u.dst, (struct in6_addr *)daddr, - (struct in6_addr *)&x->props.saddr); - dst_release(&rt->u.dst); - } - } x->props.mode = tmpl->mode; x->props.reqid = tmpl->reqid; x->props.family = AF_INET6; diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c index 537854f..b6e2e79 100644 --- a/net/xfrm/xfrm_policy.c +++ b/net/xfrm/xfrm_policy.c @@ -1107,6 +1107,20 @@ int __xfrm_sk_clone_policy(struct sock *sk) return 0; } +static int +xfrm_get_saddr(xfrm_address_t *local, xfrm_address_t *remote, + unsigned short family) +{ + int err; + struct xfrm_policy_afinfo *afinfo = xfrm_policy_get_afinfo(family); + + if (unlikely(afinfo == NULL)) + return -EINVAL; + err = afinfo->get_saddr(local, remote); + xfrm_policy_put_afinfo(afinfo); + return err; +} + /* Resolve list of templates for the flow, given policy. */ static int @@ -1118,6 +1132,7 @@ xfrm_tmpl_resolve_one(struct xfrm_policy *policy, struct flowi *fl, int i, error; xfrm_address_t *daddr = xfrm_flowi_daddr(fl, family); xfrm_address_t *saddr = xfrm_flowi_saddr(fl, family); + xfrm_address_t tmp; for (nx=0, i = 0; i < policy->xfrm_nr; i++) { struct xfrm_state *x; @@ -1128,6 +1143,12 @@ xfrm_tmpl_resolve_one(struct xfrm_policy *policy, struct flowi *fl, if (tmpl->mode == XFRM_MODE_TUNNEL) { remote = &tmpl->id.daddr; local = &tmpl->saddr; + if (xfrm_addr_any(local, family)) { + error = xfrm_get_saddr(&tmp, remote, family); + if (error) + goto fail; + local = &tmp; + } } x = xfrm_state_find(remote, local, fl, tmpl, policy, &error, family); -- cgit v0.10.2 From 23d06e3b986677ec57007a24891fa9deb09ac973 Mon Sep 17 00:00:00 2001 From: Andrea Bittau Date: Tue, 19 Sep 2006 13:04:54 -0700 Subject: [DCCP] ACKVEC: fix ackvector length calculation Fix ackvector length calculation upon receiving an "ack-of-ack". This patch avoids the ackvector from growing too large which causes it to not be inserted into packets. Signed-off-by: Andrea Bittau Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller diff --git a/net/dccp/ackvec.c b/net/dccp/ackvec.c index 8c211c5..8dab723 100644 --- a/net/dccp/ackvec.c +++ b/net/dccp/ackvec.c @@ -353,11 +353,13 @@ static void dccp_ackvec_throw_record(struct dccp_ackvec *av, { struct dccp_ackvec_record *next; - av->dccpav_buf_tail = avr->dccpavr_ack_ptr - 1; - if (av->dccpav_buf_tail == 0) - av->dccpav_buf_tail = DCCP_MAX_ACKVEC_LEN - 1; - - av->dccpav_vec_len -= avr->dccpavr_sent_len; + /* sort out vector length */ + if (av->dccpav_buf_head <= avr->dccpavr_ack_ptr) + av->dccpav_vec_len = avr->dccpavr_ack_ptr - av->dccpav_buf_head; + else + av->dccpav_vec_len = DCCP_MAX_ACKVEC_LEN - 1 + - av->dccpav_buf_head + + avr->dccpavr_ack_ptr; /* free records */ list_for_each_entry_safe_from(avr, next, &av->dccpav_records, -- cgit v0.10.2 From 8e27e4650cb7e73aa4dd97d860539e7605ac7e39 Mon Sep 17 00:00:00 2001 From: Andrea Bittau Date: Tue, 19 Sep 2006 13:05:35 -0700 Subject: [DCCP] ackvec: Fix how DCCP_ACKVEC_STATE_NOT_RECEIVED is used Fix the way state is masked out. DCCP_ACKVEC_STATE_NOT_RECEIVED is defined as appears in the packet, therefore bit shifting is not required. This fix allows CCID2 to correctly detect losses. Signed-off-by: Andrea Bittau Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller diff --git a/net/dccp/ackvec.c b/net/dccp/ackvec.c index 8dab723..bc5ff12 100644 --- a/net/dccp/ackvec.c +++ b/net/dccp/ackvec.c @@ -436,8 +436,7 @@ static void dccp_ackvec_check_rcv_ackvector(struct dccp_ackvec *av, break; found: if (between48(avr->dccpavr_ack_seqno, ackno_end_rl, ackno)) { - const u8 state = (*vector & - DCCP_ACKVEC_STATE_MASK) >> 6; + const u8 state = *vector & DCCP_ACKVEC_STATE_MASK; if (state != DCCP_ACKVEC_STATE_NOT_RECEIVED) { #ifdef CONFIG_IP_DCCP_DEBUG struct dccp_sock *dp = dccp_sk(sk); diff --git a/net/dccp/ccids/ccid2.c b/net/dccp/ccids/ccid2.c index e961562..b1d90c0 100644 --- a/net/dccp/ccids/ccid2.c +++ b/net/dccp/ccids/ccid2.c @@ -582,8 +582,8 @@ static void ccid2_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb) * run length */ while (between48(seqp->ccid2s_seq,ackno_end_rl,ackno)) { - const u8 state = (*vector & - DCCP_ACKVEC_STATE_MASK) >> 6; + const u8 state = *vector & + DCCP_ACKVEC_STATE_MASK; /* new packet received or marked */ if (state != DCCP_ACKVEC_STATE_NOT_RECEIVED && -- cgit v0.10.2 From 4a0a50fb43912b4a593d2416c507a198fe607a6d Mon Sep 17 00:00:00 2001 From: Andrea Bittau Date: Tue, 19 Sep 2006 13:06:16 -0700 Subject: [DCCP] ackvec: Remove unused variables Get rid of unused variables in ackvector state. Signed-off-by: Andrea Bittau Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller diff --git a/net/dccp/ackvec.c b/net/dccp/ackvec.c index bc5ff12..4d176d3 100644 --- a/net/dccp/ackvec.c +++ b/net/dccp/ackvec.c @@ -142,14 +142,13 @@ struct dccp_ackvec *dccp_ackvec_alloc(const gfp_t priority) struct dccp_ackvec *av = kmem_cache_alloc(dccp_ackvec_slab, priority); if (av != NULL) { - av->dccpav_buf_head = - av->dccpav_buf_tail = DCCP_MAX_ACKVEC_LEN - 1; + av->dccpav_buf_head = DCCP_MAX_ACKVEC_LEN - 1; av->dccpav_buf_ackno = DCCP_MAX_SEQNO + 1; av->dccpav_buf_nonce = av->dccpav_buf_nonce = 0; av->dccpav_ack_ptr = 0; av->dccpav_time.tv_sec = 0; av->dccpav_time.tv_usec = 0; - av->dccpav_sent_len = av->dccpav_vec_len = 0; + av->dccpav_vec_len = 0; INIT_LIST_HEAD(&av->dccpav_records); } diff --git a/net/dccp/ackvec.h b/net/dccp/ackvec.h index 0adf4b5..2424eff 100644 --- a/net/dccp/ackvec.h +++ b/net/dccp/ackvec.h @@ -54,9 +54,7 @@ struct dccp_ackvec { struct list_head dccpav_records; struct timeval dccpav_time; u8 dccpav_buf_head; - u8 dccpav_buf_tail; u8 dccpav_ack_ptr; - u8 dccpav_sent_len; u8 dccpav_vec_len; u8 dccpav_buf_nonce; u8 dccpav_ack_nonce; @@ -107,7 +105,7 @@ extern int dccp_insert_option_ackvec(struct sock *sk, struct sk_buff *skb); static inline int dccp_ackvec_pending(const struct dccp_ackvec *av) { - return av->dccpav_sent_len != av->dccpav_vec_len; + return av->dccpav_vec_len; } #else /* CONFIG_IP_DCCP_ACKVEC */ static inline int dccp_ackvec_init(void) -- cgit v0.10.2 From 29651cda97b0a9e4ac0fbeb5ea731a9909f0f128 Mon Sep 17 00:00:00 2001 From: Andrea Bittau Date: Tue, 19 Sep 2006 13:06:46 -0700 Subject: [DCCP] CCID2: Fix jiffie wrap issues Jiffies are now handled correctly (I hope) in CCID2. If they wrap, no problem. Signed-off-by: Andrea Bittau Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller diff --git a/net/dccp/ccids/ccid2.c b/net/dccp/ccids/ccid2.c index b1d90c0..54a6b7e 100644 --- a/net/dccp/ccids/ccid2.c +++ b/net/dccp/ccids/ccid2.c @@ -27,7 +27,6 @@ * * BUGS: * - sequence number wrapping - * - jiffies wrapping */ #include "../ccid.h" @@ -71,7 +70,8 @@ static void ccid2_hc_tx_check_sanity(const struct ccid2_hc_tx_sock *hctx) /* packets are sent sequentially */ BUG_ON(seqp->ccid2s_seq <= prev->ccid2s_seq); - BUG_ON(seqp->ccid2s_sent < prev->ccid2s_sent); + BUG_ON(time_before(seqp->ccid2s_sent, + prev->ccid2s_sent)); BUG_ON(len > ccid2_seq_len); seqp = prev; @@ -418,8 +418,8 @@ static inline void ccid2_new_ack(struct sock *sk, /* update RTO */ if (hctx->ccid2hctx_srtt == -1 || - (jiffies - hctx->ccid2hctx_lastrtt) >= hctx->ccid2hctx_srtt) { - unsigned long r = jiffies - seqp->ccid2s_sent; + time_after(jiffies, hctx->ccid2hctx_lastrtt + hctx->ccid2hctx_srtt)) { + unsigned long r = (long)jiffies - (long)seqp->ccid2s_sent; int s; /* first measurement */ -- cgit v0.10.2 From d458c25ce24ce00ea547e9976e293e7835416253 Mon Sep 17 00:00:00 2001 From: Andrea Bittau Date: Tue, 19 Sep 2006 13:07:20 -0700 Subject: [DCCP] CCID2: Initialize ssthresh to infinity Initialize the slow-start threshold to infinity. This way, upon connection initiation, slow-start will be exited only upon a packet loss. This patch will allow connections to quickly gain speed. Signed-off-by: Andrea Bittau Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller diff --git a/net/dccp/ccids/ccid2.c b/net/dccp/ccids/ccid2.c index 54a6b7e..699a566 100644 --- a/net/dccp/ccids/ccid2.c +++ b/net/dccp/ccids/ccid2.c @@ -678,9 +678,12 @@ static int ccid2_hc_tx_init(struct ccid *ccid, struct sock *sk) int seqcount = ccid2_seq_len; int i; - /* XXX init variables with proper values */ hctx->ccid2hctx_cwnd = 1; - hctx->ccid2hctx_ssthresh = 10; + /* Initialize ssthresh to infinity. This means that we will exit the + * initial slow-start after the first packet loss. This is what we + * want. + */ + hctx->ccid2hctx_ssthresh = ~0; hctx->ccid2hctx_numdupack = 3; /* XXX init ~ to window size... */ diff --git a/net/dccp/ccids/ccid2.h b/net/dccp/ccids/ccid2.h index 451a874..b4cc6c0 100644 --- a/net/dccp/ccids/ccid2.h +++ b/net/dccp/ccids/ccid2.h @@ -50,7 +50,7 @@ struct ccid2_hc_tx_sock { int ccid2hctx_cwnd; int ccid2hctx_ssacks; int ccid2hctx_acks; - int ccid2hctx_ssthresh; + unsigned int ccid2hctx_ssthresh; int ccid2hctx_pipe; int ccid2hctx_numdupack; struct ccid2_seq *ccid2hctx_seqbuf; -- cgit v0.10.2 From 69263bcfb5016bc3bdd099607a4232cba06f8491 Mon Sep 17 00:00:00 2001 From: Adrian Bunk Date: Fri, 22 Sep 2006 14:28:11 -0700 Subject: [ATM]: proper prototypes in net/atm/mpc.h (and reduce ifdef clutter) Signed-off-by: Adrian Bunk Signed-off-by: Chas Williams Signed-off-by: David S. Miller diff --git a/net/atm/mpc.c b/net/atm/mpc.c index 0070466..b87c2a8 100644 --- a/net/atm/mpc.c +++ b/net/atm/mpc.c @@ -98,11 +98,6 @@ static struct notifier_block mpoa_notifier = { 0 }; -#ifdef CONFIG_PROC_FS -extern int mpc_proc_init(void); -extern void mpc_proc_clean(void); -#endif - struct mpoa_client *mpcs = NULL; /* FIXME */ static struct atm_mpoa_qos *qos_head = NULL; static DEFINE_TIMER(mpc_timer, NULL, 0, 0); @@ -1439,12 +1434,8 @@ static __init int atm_mpoa_init(void) { register_atm_ioctl(&atm_ioctl_ops); -#ifdef CONFIG_PROC_FS if (mpc_proc_init() != 0) printk(KERN_INFO "mpoa: failed to initialize /proc/mpoa\n"); - else - printk(KERN_INFO "mpoa: /proc/mpoa initialized\n"); -#endif printk("mpc.c: " __DATE__ " " __TIME__ " initialized\n"); @@ -1457,9 +1448,7 @@ static void __exit atm_mpoa_cleanup(void) struct atm_mpoa_qos *qos, *nextqos; struct lec_priv *priv; -#ifdef CONFIG_PROC_FS mpc_proc_clean(); -#endif del_timer(&mpc_timer); unregister_netdevice_notifier(&mpoa_notifier); diff --git a/net/atm/mpc.h b/net/atm/mpc.h index 863ddf6..3c7981a 100644 --- a/net/atm/mpc.h +++ b/net/atm/mpc.h @@ -50,4 +50,12 @@ int atm_mpoa_delete_qos(struct atm_mpoa_qos *qos); struct seq_file; void atm_mpoa_disp_qos(struct seq_file *m); +#ifdef CONFIG_PROC_FS +int mpc_proc_init(void); +void mpc_proc_clean(void); +#else +#define mpc_proc_init() (0) +#define mpc_proc_clean() do { } while(0) +#endif + #endif /* _MPC_H_ */ -- cgit v0.10.2 From 446dec30c7f305ed1bb0092b0a8d9367d842a33f Mon Sep 17 00:00:00 2001 From: Andrea Bittau Date: Tue, 19 Sep 2006 13:10:11 -0700 Subject: [DCCP] CCID2: Tell DCCP to quickly check whether cwnd is available If not enough cwnd is available, tell the sender to check again as soon as possible. This will increase CPU utilization (polling frequently for cwnd) but will improve network performance. That is, the sender will need to wait less before detecting the increase of cwnd. A better architecture would be for the CCID to call-back (or dequeue) from DCCP when it is able to transmit traffic -- not the other way around as it currently occurs. Signed-off-by: Andrea Bittau Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller diff --git a/net/dccp/ccids/ccid2.c b/net/dccp/ccids/ccid2.c index 699a566..e0acd1b 100644 --- a/net/dccp/ccids/ccid2.c +++ b/net/dccp/ccids/ccid2.c @@ -122,7 +122,7 @@ static int ccid2_hc_tx_send_packet(struct sock *sk, } } - return 100; /* XXX */ + return 1; /* XXX CCID should dequeue when ready instead of polling */ } static void ccid2_change_l_ack_ratio(struct sock *sk, int val) -- cgit v0.10.2 From 8d424f6ca2d02026dadff409770639d720375afb Mon Sep 17 00:00:00 2001 From: Andrea Bittau Date: Tue, 19 Sep 2006 13:12:44 -0700 Subject: [DCCP] CCID2: Add Kconfig option for CCID2 debug Allow the user to choose whether or not to enable CCID2 debugging via Kconfig. Signed-off-by: Andrea Bittau Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller diff --git a/net/dccp/ccids/Kconfig b/net/dccp/ccids/Kconfig index ca00191..32752f7 100644 --- a/net/dccp/ccids/Kconfig +++ b/net/dccp/ccids/Kconfig @@ -30,6 +30,14 @@ config IP_DCCP_CCID2 If in doubt, say M. +config IP_DCCP_CCID2_DEBUG + bool "CCID2 debug" + depends on IP_DCCP_CCID2 + ---help--- + Enable CCID2 debug messages. + + If in doubt, say N. + config IP_DCCP_CCID3 tristate "CCID3 (TCP-Friendly) (EXPERIMENTAL)" depends on IP_DCCP diff --git a/net/dccp/ccids/ccid2.c b/net/dccp/ccids/ccid2.c index e0acd1b..dbcda7e 100644 --- a/net/dccp/ccids/ccid2.c +++ b/net/dccp/ccids/ccid2.c @@ -35,8 +35,7 @@ static int ccid2_debug; -#undef CCID2_DEBUG -#ifdef CCID2_DEBUG +#ifdef CONFIG_IP_DCCP_CCID2_DEBUG #define ccid2_pr_debug(format, a...) \ do { if (ccid2_debug) \ printk(KERN_DEBUG "%s: " format, __FUNCTION__, ##a); \ @@ -47,7 +46,7 @@ static int ccid2_debug; static const int ccid2_seq_len = 128; -#ifdef CCID2_DEBUG +#ifdef CONFIG_IP_DCCP_CCID2_DEBUG static void ccid2_hc_tx_check_sanity(const struct ccid2_hc_tx_sock *hctx) { int len = 0; @@ -295,7 +294,7 @@ static void ccid2_hc_tx_packet_sent(struct sock *sk, int more, int len) if (!timer_pending(&hctx->ccid2hctx_rtotimer)) ccid2_start_rto_timer(sk); -#ifdef CCID2_DEBUG +#ifdef CONFIG_IP_DCCP_CCID2_DEBUG ccid2_pr_debug("pipe=%d\n", hctx->ccid2hctx_pipe); ccid2_pr_debug("Sent: seq=%llu\n", seq); do { -- cgit v0.10.2 From 07978aabd52ce67f59971872c80f76d6e3ca18ae Mon Sep 17 00:00:00 2001 From: Andrea Bittau Date: Tue, 19 Sep 2006 13:13:37 -0700 Subject: [DCCP] CCID2: Allocate seq records on demand Allocate more sequence state on demand. Each time a packet is sent out by CCID2, a record of it needs to be kept. This list of records grows proportionally to cwnd. Previously, the length of this list was hardcored and therefore the cwnd could only grow to this value (of 128). Now, records are allocated on demand as necessary---cwnd may grow as it wishes. The exceptional case of when memory is not available is not handled gracefully. Perhaps, cwnd should be capped at that point. Signed-off-by: Andrea Bittau Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller diff --git a/net/dccp/ccids/ccid2.c b/net/dccp/ccids/ccid2.c index dbcda7e..93a30ae 100644 --- a/net/dccp/ccids/ccid2.c +++ b/net/dccp/ccids/ccid2.c @@ -44,8 +44,6 @@ static int ccid2_debug; #define ccid2_pr_debug(format, a...) #endif -static const int ccid2_seq_len = 128; - #ifdef CONFIG_IP_DCCP_CCID2_DEBUG static void ccid2_hc_tx_check_sanity(const struct ccid2_hc_tx_sock *hctx) { @@ -71,7 +69,6 @@ static void ccid2_hc_tx_check_sanity(const struct ccid2_hc_tx_sock *hctx) BUG_ON(seqp->ccid2s_seq <= prev->ccid2s_seq); BUG_ON(time_before(seqp->ccid2s_sent, prev->ccid2s_sent)); - BUG_ON(len > ccid2_seq_len); seqp = prev; } @@ -83,16 +80,57 @@ static void ccid2_hc_tx_check_sanity(const struct ccid2_hc_tx_sock *hctx) do { seqp = seqp->ccid2s_prev; len++; - BUG_ON(len > ccid2_seq_len); } while (seqp != hctx->ccid2hctx_seqh); - BUG_ON(len != ccid2_seq_len); ccid2_pr_debug("total len=%d\n", len); + BUG_ON(len != hctx->ccid2hctx_seqbufc * CCID2_SEQBUF_LEN); } #else #define ccid2_hc_tx_check_sanity(hctx) do {} while (0) #endif +static int ccid2_hc_tx_alloc_seq(struct ccid2_hc_tx_sock *hctx, int num, + gfp_t gfp) +{ + struct ccid2_seq *seqp; + int i; + + /* check if we have space to preserve the pointer to the buffer */ + if (hctx->ccid2hctx_seqbufc >= (sizeof(hctx->ccid2hctx_seqbuf) / + sizeof(struct ccid2_seq*))) + return -ENOMEM; + + /* allocate buffer and initialize linked list */ + seqp = kmalloc(sizeof(*seqp) * num, gfp); + if (seqp == NULL) + return -ENOMEM; + + for (i = 0; i < (num - 1); i++) { + seqp[i].ccid2s_next = &seqp[i + 1]; + seqp[i + 1].ccid2s_prev = &seqp[i]; + } + seqp[num - 1].ccid2s_next = seqp; + seqp->ccid2s_prev = &seqp[num - 1]; + + /* This is the first allocation. Initiate the head and tail. */ + if (hctx->ccid2hctx_seqbufc == 0) + hctx->ccid2hctx_seqh = hctx->ccid2hctx_seqt = seqp; + else { + /* link the existing list with the one we just created */ + hctx->ccid2hctx_seqh->ccid2s_next = seqp; + seqp->ccid2s_prev = hctx->ccid2hctx_seqh; + + hctx->ccid2hctx_seqt->ccid2s_prev = &seqp[num - 1]; + seqp[num - 1].ccid2s_next = hctx->ccid2hctx_seqt; + } + + /* store the original pointer to the buffer so we can free it */ + hctx->ccid2hctx_seqbuf[hctx->ccid2hctx_seqbufc] = seqp; + hctx->ccid2hctx_seqbufc++; + + return 0; +} + static int ccid2_hc_tx_send_packet(struct sock *sk, struct sk_buff *skb, int len) { @@ -231,6 +269,7 @@ static void ccid2_hc_tx_packet_sent(struct sock *sk, int more, int len) { struct dccp_sock *dp = dccp_sk(sk); struct ccid2_hc_tx_sock *hctx = ccid2_hc_tx_sk(sk); + struct ccid2_seq *next; u64 seq; ccid2_hc_tx_check_sanity(hctx); @@ -250,15 +289,23 @@ static void ccid2_hc_tx_packet_sent(struct sock *sk, int more, int len) hctx->ccid2hctx_seqh->ccid2s_seq = seq; hctx->ccid2hctx_seqh->ccid2s_acked = 0; hctx->ccid2hctx_seqh->ccid2s_sent = jiffies; - hctx->ccid2hctx_seqh = hctx->ccid2hctx_seqh->ccid2s_next; - ccid2_pr_debug("cwnd=%d pipe=%d\n", hctx->ccid2hctx_cwnd, - hctx->ccid2hctx_pipe); + next = hctx->ccid2hctx_seqh->ccid2s_next; + /* check if we need to alloc more space */ + if (next == hctx->ccid2hctx_seqt) { + int rc; - if (hctx->ccid2hctx_seqh == hctx->ccid2hctx_seqt) { - /* XXX allocate more space */ - WARN_ON(1); + ccid2_pr_debug("allocating more space in history\n"); + rc = ccid2_hc_tx_alloc_seq(hctx, CCID2_SEQBUF_LEN, GFP_KERNEL); + BUG_ON(rc); /* XXX what do we do? */ + + next = hctx->ccid2hctx_seqh->ccid2s_next; + BUG_ON(next == hctx->ccid2hctx_seqt); } + hctx->ccid2hctx_seqh = next; + + ccid2_pr_debug("cwnd=%d pipe=%d\n", hctx->ccid2hctx_cwnd, + hctx->ccid2hctx_pipe); hctx->ccid2hctx_sent++; @@ -674,8 +721,6 @@ static void ccid2_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb) static int ccid2_hc_tx_init(struct ccid *ccid, struct sock *sk) { struct ccid2_hc_tx_sock *hctx = ccid_priv(ccid); - int seqcount = ccid2_seq_len; - int i; hctx->ccid2hctx_cwnd = 1; /* Initialize ssthresh to infinity. This means that we will exit the @@ -684,26 +729,12 @@ static int ccid2_hc_tx_init(struct ccid *ccid, struct sock *sk) */ hctx->ccid2hctx_ssthresh = ~0; hctx->ccid2hctx_numdupack = 3; + hctx->ccid2hctx_seqbufc = 0; /* XXX init ~ to window size... */ - hctx->ccid2hctx_seqbuf = kmalloc(sizeof(*hctx->ccid2hctx_seqbuf) * - seqcount, gfp_any()); - if (hctx->ccid2hctx_seqbuf == NULL) + if (ccid2_hc_tx_alloc_seq(hctx, CCID2_SEQBUF_LEN, GFP_ATOMIC) != 0) return -ENOMEM; - for (i = 0; i < (seqcount - 1); i++) { - hctx->ccid2hctx_seqbuf[i].ccid2s_next = - &hctx->ccid2hctx_seqbuf[i + 1]; - hctx->ccid2hctx_seqbuf[i + 1].ccid2s_prev = - &hctx->ccid2hctx_seqbuf[i]; - } - hctx->ccid2hctx_seqbuf[seqcount - 1].ccid2s_next = - hctx->ccid2hctx_seqbuf; - hctx->ccid2hctx_seqbuf->ccid2s_prev = - &hctx->ccid2hctx_seqbuf[seqcount - 1]; - - hctx->ccid2hctx_seqh = hctx->ccid2hctx_seqbuf; - hctx->ccid2hctx_seqt = hctx->ccid2hctx_seqh; hctx->ccid2hctx_sent = 0; hctx->ccid2hctx_rto = 3 * HZ; hctx->ccid2hctx_srtt = -1; @@ -722,10 +753,13 @@ static int ccid2_hc_tx_init(struct ccid *ccid, struct sock *sk) static void ccid2_hc_tx_exit(struct sock *sk) { struct ccid2_hc_tx_sock *hctx = ccid2_hc_tx_sk(sk); + int i; ccid2_hc_tx_kill_rto_timer(sk); - kfree(hctx->ccid2hctx_seqbuf); - hctx->ccid2hctx_seqbuf = NULL; + + for (i = 0; i < hctx->ccid2hctx_seqbufc; i++) + kfree(hctx->ccid2hctx_seqbuf[i]); + hctx->ccid2hctx_seqbufc = 0; } static void ccid2_hc_rx_packet_recv(struct sock *sk, struct sk_buff *skb) diff --git a/net/dccp/ccids/ccid2.h b/net/dccp/ccids/ccid2.h index b4cc6c0..2a02ce0 100644 --- a/net/dccp/ccids/ccid2.h +++ b/net/dccp/ccids/ccid2.h @@ -35,6 +35,9 @@ struct ccid2_seq { struct ccid2_seq *ccid2s_next; }; +#define CCID2_SEQBUF_LEN 256 +#define CCID2_SEQBUF_MAX 128 + /** struct ccid2_hc_tx_sock - CCID2 TX half connection * * @ccid2hctx_ssacks - ACKs recv in slow start @@ -53,7 +56,8 @@ struct ccid2_hc_tx_sock { unsigned int ccid2hctx_ssthresh; int ccid2hctx_pipe; int ccid2hctx_numdupack; - struct ccid2_seq *ccid2hctx_seqbuf; + struct ccid2_seq *ccid2hctx_seqbuf[CCID2_SEQBUF_MAX]; + int ccid2hctx_seqbufc; struct ccid2_seq *ccid2hctx_seqh; struct ccid2_seq *ccid2hctx_seqt; long ccid2hctx_rto; -- cgit v0.10.2 From 374bcf32c86e1b56eab832bbb6b21e636707eab6 Mon Sep 17 00:00:00 2001 From: Andrea Bittau Date: Tue, 19 Sep 2006 13:14:43 -0700 Subject: [DCCP] CCID2: Halve cwnd once upon multiple losses in a single RTT When multiple losses occur in one RTT, the window should be halved only once [a single "congestion event"]. This is now implemented, although not perfectly. Slightly changed the interface for changing the cwnd: pass hctx instead of dp. This is required in order to allow for change_cwnd to be called from _init(). Signed-off-by: Andrea Bittau Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller diff --git a/net/dccp/ccids/ccid2.c b/net/dccp/ccids/ccid2.c index 93a30ae..b88da03 100644 --- a/net/dccp/ccids/ccid2.c +++ b/net/dccp/ccids/ccid2.c @@ -187,10 +187,8 @@ static void ccid2_change_l_ack_ratio(struct sock *sk, int val) dp->dccps_l_ack_ratio = val; } -static void ccid2_change_cwnd(struct sock *sk, int val) +static void ccid2_change_cwnd(struct ccid2_hc_tx_sock *hctx, int val) { - struct ccid2_hc_tx_sock *hctx = ccid2_hc_tx_sk(sk); - if (val == 0) val = 1; @@ -234,7 +232,7 @@ static void ccid2_hc_tx_rto_expire(unsigned long data) hctx->ccid2hctx_ssthresh = hctx->ccid2hctx_cwnd >> 1; if (hctx->ccid2hctx_ssthresh < 2) hctx->ccid2hctx_ssthresh = 2; - ccid2_change_cwnd(sk, 1); + ccid2_change_cwnd(hctx, 1); /* clear state about stuff we sent */ hctx->ccid2hctx_seqt = hctx->ccid2hctx_seqh; @@ -444,7 +442,7 @@ static inline void ccid2_new_ack(struct sock *sk, /* increase every 2 acks */ hctx->ccid2hctx_ssacks++; if (hctx->ccid2hctx_ssacks == 2) { - ccid2_change_cwnd(sk, hctx->ccid2hctx_cwnd + 1); + ccid2_change_cwnd(hctx, hctx->ccid2hctx_cwnd+1); hctx->ccid2hctx_ssacks = 0; *maxincr = *maxincr - 1; } @@ -457,7 +455,7 @@ static inline void ccid2_new_ack(struct sock *sk, hctx->ccid2hctx_acks++; if (hctx->ccid2hctx_acks >= hctx->ccid2hctx_cwnd) { - ccid2_change_cwnd(sk, hctx->ccid2hctx_cwnd + 1); + ccid2_change_cwnd(hctx, hctx->ccid2hctx_cwnd + 1); hctx->ccid2hctx_acks = 0; } } @@ -532,6 +530,22 @@ static void ccid2_hc_tx_dec_pipe(struct sock *sk) ccid2_hc_tx_kill_rto_timer(sk); } +static void ccid2_congestion_event(struct ccid2_hc_tx_sock *hctx, + struct ccid2_seq *seqp) +{ + if (time_before(seqp->ccid2s_sent, hctx->ccid2hctx_last_cong)) { + ccid2_pr_debug("Multiple losses in an RTT---treating as one\n"); + return; + } + + hctx->ccid2hctx_last_cong = jiffies; + + ccid2_change_cwnd(hctx, hctx->ccid2hctx_cwnd >> 1); + hctx->ccid2hctx_ssthresh = hctx->ccid2hctx_cwnd; + if (hctx->ccid2hctx_ssthresh < 2) + hctx->ccid2hctx_ssthresh = 2; +} + static void ccid2_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb) { struct dccp_sock *dp = dccp_sk(sk); @@ -542,7 +556,6 @@ static void ccid2_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb) unsigned char veclen; int offset = 0; int done = 0; - int loss = 0; unsigned int maxincr = 0; ccid2_hc_tx_check_sanity(hctx); @@ -636,7 +649,8 @@ static void ccid2_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb) !seqp->ccid2s_acked) { if (state == DCCP_ACKVEC_STATE_ECN_MARKED) { - loss = 1; + ccid2_congestion_event(hctx, + seqp); } else ccid2_new_ack(sk, seqp, &maxincr); @@ -688,7 +702,13 @@ static void ccid2_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb) /* check for lost packets */ while (1) { if (!seqp->ccid2s_acked) { - loss = 1; + ccid2_pr_debug("Packet lost: %llu\n", + seqp->ccid2s_seq); + /* XXX need to traverse from tail -> head in + * order to detect multiple congestion events in + * one ack vector. + */ + ccid2_congestion_event(hctx, seqp); ccid2_hc_tx_dec_pipe(sk); } if (seqp == hctx->ccid2hctx_seqt) @@ -707,14 +727,6 @@ static void ccid2_hc_tx_packet_recv(struct sock *sk, struct sk_buff *skb) hctx->ccid2hctx_seqt = hctx->ccid2hctx_seqt->ccid2s_next; } - if (loss) { - /* XXX do bit shifts guarantee a 0 as the new bit? */ - ccid2_change_cwnd(sk, hctx->ccid2hctx_cwnd >> 1); - hctx->ccid2hctx_ssthresh = hctx->ccid2hctx_cwnd; - if (hctx->ccid2hctx_ssthresh < 2) - hctx->ccid2hctx_ssthresh = 2; - } - ccid2_hc_tx_check_sanity(hctx); } @@ -722,7 +734,7 @@ static int ccid2_hc_tx_init(struct ccid *ccid, struct sock *sk) { struct ccid2_hc_tx_sock *hctx = ccid_priv(ccid); - hctx->ccid2hctx_cwnd = 1; + ccid2_change_cwnd(hctx, 1); /* Initialize ssthresh to infinity. This means that we will exit the * initial slow-start after the first packet loss. This is what we * want. @@ -741,6 +753,7 @@ static int ccid2_hc_tx_init(struct ccid *ccid, struct sock *sk) hctx->ccid2hctx_rttvar = -1; hctx->ccid2hctx_lastrtt = 0; hctx->ccid2hctx_rpdupack = -1; + hctx->ccid2hctx_last_cong = jiffies; hctx->ccid2hctx_rtotimer.function = &ccid2_hc_tx_rto_expire; hctx->ccid2hctx_rtotimer.data = (unsigned long)sk; diff --git a/net/dccp/ccids/ccid2.h b/net/dccp/ccids/ccid2.h index 2a02ce0..5b2ef4a 100644 --- a/net/dccp/ccids/ccid2.h +++ b/net/dccp/ccids/ccid2.h @@ -71,6 +71,7 @@ struct ccid2_hc_tx_sock { u64 ccid2hctx_rpseq; int ccid2hctx_rpdupack; int ccid2hctx_sendwait; + unsigned long ccid2hctx_last_cong; }; struct ccid2_hc_rx_sock { -- cgit v0.10.2 From 593f16aa627d61da447c76ee5a159450174627f6 Mon Sep 17 00:00:00 2001 From: Andrea Bittau Date: Tue, 19 Sep 2006 13:15:33 -0700 Subject: [DCCP] CCID2: Add helper functions for changing important CCID2 state Introduce methods which manipulate interesting congestion control state such as pipe and rtt estimate. This is useful for people wishing to monitor the variables of CCID and instrument the code [perhaps using Kprobes]. Personally, I am a fan of encapsulation---that justifies this change =D. Signed-off-by: Andrea Bittau Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: David S. Miller diff --git a/net/dccp/ccids/ccid2.c b/net/dccp/ccids/ccid2.c index b88da03..457dd3d 100644 --- a/net/dccp/ccids/ccid2.c +++ b/net/dccp/ccids/ccid2.c @@ -199,6 +199,17 @@ static void ccid2_change_cwnd(struct ccid2_hc_tx_sock *hctx, int val) hctx->ccid2hctx_cwnd = val; } +static void ccid2_change_srtt(struct ccid2_hc_tx_sock *hctx, long val) +{ + ccid2_pr_debug("change SRTT to %ld\n", val); + hctx->ccid2hctx_srtt = val; +} + +static void ccid2_change_pipe(struct ccid2_hc_tx_sock *hctx, long val) +{ + hctx->ccid2hctx_pipe = val; +} + static void ccid2_start_rto_timer(struct sock *sk); static void ccid2_hc_tx_rto_expire(unsigned long data) @@ -228,7 +239,7 @@ static void ccid2_hc_tx_rto_expire(unsigned long data) ccid2_start_rto_timer(sk); /* adjust pipe, cwnd etc */ - hctx->ccid2hctx_pipe = 0; + ccid2_change_pipe(hctx, 0); hctx->ccid2hctx_ssthresh = hctx->ccid2hctx_cwnd >> 1; if (hctx->ccid2hctx_ssthresh < 2) hctx->ccid2hctx_ssthresh = 2; @@ -274,7 +285,7 @@ static void ccid2_hc_tx_packet_sent(struct sock *sk, int more, int len) BUG_ON(!hctx->ccid2hctx_sendwait); hctx->ccid2hctx_sendwait = 0; - hctx->ccid2hctx_pipe++; + ccid2_change_pipe(hctx, hctx->ccid2hctx_pipe + 1); BUG_ON(hctx->ccid2hctx_pipe < 0); /* There is an issue. What if another packet is sent between @@ -470,11 +481,13 @@ static inline void ccid2_new_ack(struct sock *sk, if (hctx->ccid2hctx_srtt == -1) { ccid2_pr_debug("R: %lu Time=%lu seq=%llu\n", r, jiffies, seqp->ccid2s_seq); - hctx->ccid2hctx_srtt = r; + ccid2_change_srtt(hctx, r); hctx->ccid2hctx_rttvar = r >> 1; } else { /* RTTVAR */ long tmp = hctx->ccid2hctx_srtt - r; + long srtt; + if (tmp < 0) tmp *= -1; @@ -484,10 +497,12 @@ static inline void ccid2_new_ack(struct sock *sk, hctx->ccid2hctx_rttvar += tmp; /* SRTT */ - hctx->ccid2hctx_srtt *= 7; - hctx->ccid2hctx_srtt >>= 3; + srtt = hctx->ccid2hctx_srtt; + srtt *= 7; + srtt >>= 3; tmp = r >> 3; - hctx->ccid2hctx_srtt += tmp; + srtt += tmp; + ccid2_change_srtt(hctx, srtt); } s = hctx->ccid2hctx_rttvar << 2; /* clock granularity is 1 when based on jiffies */ @@ -523,7 +538,7 @@ static void ccid2_hc_tx_dec_pipe(struct sock *sk) { struct ccid2_hc_tx_sock *hctx = ccid2_hc_tx_sk(sk); - hctx->ccid2hctx_pipe--; + ccid2_change_pipe(hctx, hctx->ccid2hctx_pipe-1); BUG_ON(hctx->ccid2hctx_pipe < 0); if (hctx->ccid2hctx_pipe == 0) @@ -749,7 +764,7 @@ static int ccid2_hc_tx_init(struct ccid *ccid, struct sock *sk) hctx->ccid2hctx_sent = 0; hctx->ccid2hctx_rto = 3 * HZ; - hctx->ccid2hctx_srtt = -1; + ccid2_change_srtt(hctx, -1); hctx->ccid2hctx_rttvar = -1; hctx->ccid2hctx_lastrtt = 0; hctx->ccid2hctx_rpdupack = -1; -- cgit v0.10.2 From c55e2f4997a104d66b59bdf1aa8ab125d09ae00a Mon Sep 17 00:00:00 2001 From: Al Viro Date: Tue, 19 Sep 2006 13:23:19 -0700 Subject: [IPV4]: ipip and ip_gre encapsulation bugs Handling of ipip and ip_gre ICMP error relaying is b0rken; it accesses 8bit field + 3 reserved octets as host-endian 32bit, does comparison, subtraction and stuffs the result back. That breaks on big-endian. Fixed, made endian-clean. [ Note that this effected code is permanently commented out with and ifdef, so this error couldn't actually cause problems for anyone. -DaveM ] Signed-off-by: Al Viro Signed-off-by: David S. Miller diff --git a/net/ipv4/ip_gre.c b/net/ipv4/ip_gre.c index e66f6ff..f5fba05 100644 --- a/net/ipv4/ip_gre.c +++ b/net/ipv4/ip_gre.c @@ -393,7 +393,8 @@ out: int code = skb->h.icmph->code; int rel_type = 0; int rel_code = 0; - int rel_info = 0; + __be32 rel_info = 0; + __u32 n = 0; u16 flags; int grehlen = (iph->ihl<<2) + 4; struct sk_buff *skb2; @@ -422,14 +423,16 @@ out: default: return; case ICMP_PARAMETERPROB: - if (skb->h.icmph->un.gateway < (iph->ihl<<2)) + n = ntohl(skb->h.icmph->un.gateway) >> 24; + if (n < (iph->ihl<<2)) return; /* So... This guy found something strange INSIDE encapsulated packet. Well, he is fool, but what can we do ? */ rel_type = ICMP_PARAMETERPROB; - rel_info = skb->h.icmph->un.gateway - grehlen; + n -= grehlen; + rel_info = htonl(n << 24); break; case ICMP_DEST_UNREACH: @@ -440,13 +443,14 @@ out: return; case ICMP_FRAG_NEEDED: /* And it is the only really necessary thing :-) */ - rel_info = ntohs(skb->h.icmph->un.frag.mtu); - if (rel_info < grehlen+68) + n = ntohs(skb->h.icmph->un.frag.mtu); + if (n < grehlen+68) return; - rel_info -= grehlen; + n -= grehlen; /* BSD 4.2 MORE DOES NOT EXIST IN NATURE. */ - if (rel_info > ntohs(eiph->tot_len)) + if (n > ntohs(eiph->tot_len)) return; + rel_info = htonl(n); break; default: /* All others are translated to HOST_UNREACH. @@ -508,12 +512,11 @@ out: /* change mtu on this route */ if (type == ICMP_DEST_UNREACH && code == ICMP_FRAG_NEEDED) { - if (rel_info > dst_mtu(skb2->dst)) { + if (n > dst_mtu(skb2->dst)) { kfree_skb(skb2); return; } - skb2->dst->ops->update_pmtu(skb2->dst, rel_info); - rel_info = htonl(rel_info); + skb2->dst->ops->update_pmtu(skb2->dst, n); } else if (type == ICMP_TIME_EXCEEDED) { struct ip_tunnel *t = netdev_priv(skb2->dev); if (t->parms.iph.ttl) { diff --git a/net/ipv4/ipip.c b/net/ipv4/ipip.c index 76ab50b..0c45565 100644 --- a/net/ipv4/ipip.c +++ b/net/ipv4/ipip.c @@ -341,7 +341,8 @@ out: int code = skb->h.icmph->code; int rel_type = 0; int rel_code = 0; - int rel_info = 0; + __be32 rel_info = 0; + __u32 n = 0; struct sk_buff *skb2; struct flowi fl; struct rtable *rt; @@ -354,14 +355,15 @@ out: default: return 0; case ICMP_PARAMETERPROB: - if (skb->h.icmph->un.gateway < hlen) + n = ntohl(skb->h.icmph->un.gateway) >> 24; + if (n < hlen) return 0; /* So... This guy found something strange INSIDE encapsulated packet. Well, he is fool, but what can we do ? */ rel_type = ICMP_PARAMETERPROB; - rel_info = skb->h.icmph->un.gateway - hlen; + rel_info = htonl((n - hlen) << 24); break; case ICMP_DEST_UNREACH: @@ -372,13 +374,14 @@ out: return 0; case ICMP_FRAG_NEEDED: /* And it is the only really necessary thing :-) */ - rel_info = ntohs(skb->h.icmph->un.frag.mtu); - if (rel_info < hlen+68) + n = ntohs(skb->h.icmph->un.frag.mtu); + if (n < hlen+68) return 0; - rel_info -= hlen; + n -= hlen; /* BSD 4.2 MORE DOES NOT EXIST IN NATURE. */ - if (rel_info > ntohs(eiph->tot_len)) + if (n > ntohs(eiph->tot_len)) return 0; + rel_info = htonl(n); break; default: /* All others are translated to HOST_UNREACH. @@ -440,12 +443,11 @@ out: /* change mtu on this route */ if (type == ICMP_DEST_UNREACH && code == ICMP_FRAG_NEEDED) { - if (rel_info > dst_mtu(skb2->dst)) { + if (n > dst_mtu(skb2->dst)) { kfree_skb(skb2); return 0; } - skb2->dst->ops->update_pmtu(skb2->dst, rel_info); - rel_info = htonl(rel_info); + skb2->dst->ops->update_pmtu(skb2->dst, n); } else if (type == ICMP_TIME_EXCEEDED) { struct ip_tunnel *t = netdev_priv(skb2->dev); if (t->parms.iph.ttl) { -- cgit v0.10.2 From 1bf38a36b6a0e810dafae048fdbb999e587f0f2f Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 11:57:09 -0700 Subject: [NETFILTER]: remove unused include file Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter_logging.h b/include/linux/netfilter_logging.h deleted file mode 100644 index 562bb6a..0000000 --- a/include/linux/netfilter_logging.h +++ /dev/null @@ -1,33 +0,0 @@ -/* Internal logging interface, which relies on the real - LOG target modules */ -#ifndef __LINUX_NETFILTER_LOGGING_H -#define __LINUX_NETFILTER_LOGGING_H - -#ifdef __KERNEL__ -#include - -struct nf_logging_t { - void (*nf_log_packet)(struct sk_buff **pskb, - unsigned int hooknum, - const struct net_device *in, - const struct net_device *out, - const char *prefix); - void (*nf_log)(char *pfh, size_t len, - const char *prefix); -}; - -extern void nf_log_register(int pf, const struct nf_logging_t *logging); -extern void nf_log_unregister(int pf, const struct nf_logging_t *logging); - -extern void nf_log_packet(int pf, - struct sk_buff **pskb, - unsigned int hooknum, - const struct net_device *in, - const struct net_device *out, - const char *fmt, ...); -extern void nf_log(int pf, - char *pfh, size_t len, - const char *fmt, ...); -#endif /*__KERNEL__*/ - -#endif /*__LINUX_NETFILTER_LOGGING_H*/ -- cgit v0.10.2 From df0933dcb027e156cb5253570ad694b81bd52b69 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 11:57:53 -0700 Subject: [NETFILTER]: kill listhelp.h Kill listhelp.h and use the list.h functions instead. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index 03d1027..c832295 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -138,10 +138,6 @@ struct xt_counters_info #include -#define ASSERT_READ_LOCK(x) -#define ASSERT_WRITE_LOCK(x) -#include - #ifdef CONFIG_COMPAT #define COMPAT_TO_USER 1 #define COMPAT_FROM_USER -1 diff --git a/include/linux/netfilter_ipv4/listhelp.h b/include/linux/netfilter_ipv4/listhelp.h deleted file mode 100644 index 5d92cf0..0000000 --- a/include/linux/netfilter_ipv4/listhelp.h +++ /dev/null @@ -1,123 +0,0 @@ -#ifndef _LISTHELP_H -#define _LISTHELP_H -#include - -/* Header to do more comprehensive job than linux/list.h; assume list - is first entry in structure. */ - -/* Return pointer to first true entry, if any, or NULL. A macro - required to allow inlining of cmpfn. */ -#define LIST_FIND(head, cmpfn, type, args...) \ -({ \ - const struct list_head *__i, *__j = NULL; \ - \ - ASSERT_READ_LOCK(head); \ - list_for_each(__i, (head)) \ - if (cmpfn((const type)__i , ## args)) { \ - __j = __i; \ - break; \ - } \ - (type)__j; \ -}) - -#define LIST_FIND_W(head, cmpfn, type, args...) \ -({ \ - const struct list_head *__i, *__j = NULL; \ - \ - ASSERT_WRITE_LOCK(head); \ - list_for_each(__i, (head)) \ - if (cmpfn((type)__i , ## args)) { \ - __j = __i; \ - break; \ - } \ - (type)__j; \ -}) - -/* Just like LIST_FIND but we search backwards */ -#define LIST_FIND_B(head, cmpfn, type, args...) \ -({ \ - const struct list_head *__i, *__j = NULL; \ - \ - ASSERT_READ_LOCK(head); \ - list_for_each_prev(__i, (head)) \ - if (cmpfn((const type)__i , ## args)) { \ - __j = __i; \ - break; \ - } \ - (type)__j; \ -}) - -static inline int -__list_cmp_same(const void *p1, const void *p2) { return p1 == p2; } - -/* Is this entry in the list? */ -static inline int -list_inlist(struct list_head *head, const void *entry) -{ - return LIST_FIND(head, __list_cmp_same, void *, entry) != NULL; -} - -/* Delete from list. */ -#ifdef CONFIG_NETFILTER_DEBUG -#define LIST_DELETE(head, oldentry) \ -do { \ - ASSERT_WRITE_LOCK(head); \ - if (!list_inlist(head, oldentry)) \ - printk("LIST_DELETE: %s:%u `%s'(%p) not in %s.\n", \ - __FILE__, __LINE__, #oldentry, oldentry, #head); \ - else list_del((struct list_head *)oldentry); \ -} while(0) -#else -#define LIST_DELETE(head, oldentry) list_del((struct list_head *)oldentry) -#endif - -/* Append. */ -static inline void -list_append(struct list_head *head, void *new) -{ - ASSERT_WRITE_LOCK(head); - list_add((new), (head)->prev); -} - -/* Prepend. */ -static inline void -list_prepend(struct list_head *head, void *new) -{ - ASSERT_WRITE_LOCK(head); - list_add(new, head); -} - -/* Insert according to ordering function; insert before first true. */ -#define LIST_INSERT(head, new, cmpfn) \ -do { \ - struct list_head *__i; \ - ASSERT_WRITE_LOCK(head); \ - list_for_each(__i, (head)) \ - if ((new), (typeof (new))__i) \ - break; \ - list_add((struct list_head *)(new), __i->prev); \ -} while(0) - -/* If the field after the list_head is a nul-terminated string, you - can use these functions. */ -static inline int __list_cmp_name(const void *i, const char *name) -{ - return strcmp(name, i+sizeof(struct list_head)) == 0; -} - -/* Returns false if same name already in list, otherwise does insert. */ -static inline int -list_named_insert(struct list_head *head, void *new) -{ - if (LIST_FIND(head, __list_cmp_name, void *, - new + sizeof(struct list_head))) - return 0; - list_prepend(head, new); - return 1; -} - -/* Find this named element in the list. */ -#define list_named_find(head, name) \ -LIST_FIND(head, __list_cmp_name, void *, name) - -#endif /*_LISTHELP_H*/ diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c index d06a507..3df55b2 100644 --- a/net/bridge/netfilter/ebtables.c +++ b/net/bridge/netfilter/ebtables.c @@ -24,6 +24,7 @@ #include #include #include +#include #include #include #include @@ -31,12 +32,6 @@ /* needed for logical [in,out]-dev filtering */ #include "../br_private.h" -/* list_named_find */ -#define ASSERT_READ_LOCK(x) -#define ASSERT_WRITE_LOCK(x) -#include -#include - #define BUGPRINT(format, args...) printk("kernel msg: ebtables bug: please "\ "report to author: "format, ## args) /* #define BUGPRINT(format, args...) */ @@ -278,18 +273,22 @@ static inline void * find_inlist_lock_noload(struct list_head *head, const char *name, int *error, struct mutex *mutex) { - void *ret; + struct { + struct list_head list; + char name[EBT_FUNCTION_MAXNAMELEN]; + } *e; *error = mutex_lock_interruptible(mutex); if (*error != 0) return NULL; - ret = list_named_find(head, name); - if (!ret) { - *error = -ENOENT; - mutex_unlock(mutex); + list_for_each_entry(e, head, list) { + if (strcmp(e->name, name) == 0) + return e; } - return ret; + *error = -ENOENT; + mutex_unlock(mutex); + return NULL; } #ifndef CONFIG_KMOD @@ -1043,15 +1042,19 @@ free_newinfo: int ebt_register_target(struct ebt_target *target) { + struct ebt_target *t; int ret; ret = mutex_lock_interruptible(&ebt_mutex); if (ret != 0) return ret; - if (!list_named_insert(&ebt_targets, target)) { - mutex_unlock(&ebt_mutex); - return -EEXIST; + list_for_each_entry(t, &ebt_targets, list) { + if (strcmp(t->name, target->name) == 0) { + mutex_unlock(&ebt_mutex); + return -EEXIST; + } } + list_add(&target->list, &ebt_targets); mutex_unlock(&ebt_mutex); return 0; @@ -1060,21 +1063,25 @@ int ebt_register_target(struct ebt_target *target) void ebt_unregister_target(struct ebt_target *target) { mutex_lock(&ebt_mutex); - LIST_DELETE(&ebt_targets, target); + list_del(&target->list); mutex_unlock(&ebt_mutex); } int ebt_register_match(struct ebt_match *match) { + struct ebt_match *m; int ret; ret = mutex_lock_interruptible(&ebt_mutex); if (ret != 0) return ret; - if (!list_named_insert(&ebt_matches, match)) { - mutex_unlock(&ebt_mutex); - return -EEXIST; + list_for_each_entry(m, &ebt_matches, list) { + if (strcmp(m->name, match->name) == 0) { + mutex_unlock(&ebt_mutex); + return -EEXIST; + } } + list_add(&match->list, &ebt_matches); mutex_unlock(&ebt_mutex); return 0; @@ -1083,21 +1090,25 @@ int ebt_register_match(struct ebt_match *match) void ebt_unregister_match(struct ebt_match *match) { mutex_lock(&ebt_mutex); - LIST_DELETE(&ebt_matches, match); + list_del(&match->list); mutex_unlock(&ebt_mutex); } int ebt_register_watcher(struct ebt_watcher *watcher) { + struct ebt_watcher *w; int ret; ret = mutex_lock_interruptible(&ebt_mutex); if (ret != 0) return ret; - if (!list_named_insert(&ebt_watchers, watcher)) { - mutex_unlock(&ebt_mutex); - return -EEXIST; + list_for_each_entry(w, &ebt_watchers, list) { + if (strcmp(w->name, watcher->name) == 0) { + mutex_unlock(&ebt_mutex); + return -EEXIST; + } } + list_add(&watcher->list, &ebt_watchers); mutex_unlock(&ebt_mutex); return 0; @@ -1106,13 +1117,14 @@ int ebt_register_watcher(struct ebt_watcher *watcher) void ebt_unregister_watcher(struct ebt_watcher *watcher) { mutex_lock(&ebt_mutex); - LIST_DELETE(&ebt_watchers, watcher); + list_del(&watcher->list); mutex_unlock(&ebt_mutex); } int ebt_register_table(struct ebt_table *table) { struct ebt_table_info *newinfo; + struct ebt_table *t; int ret, i, countersize; if (!table || !table->table ||!table->table->entries || @@ -1158,10 +1170,12 @@ int ebt_register_table(struct ebt_table *table) if (ret != 0) goto free_chainstack; - if (list_named_find(&ebt_tables, table->name)) { - ret = -EEXIST; - BUGPRINT("Table name already exists\n"); - goto free_unlock; + list_for_each_entry(t, &ebt_tables, list) { + if (strcmp(t->name, table->name) == 0) { + ret = -EEXIST; + BUGPRINT("Table name already exists\n"); + goto free_unlock; + } } /* Hold a reference count if the chains aren't empty */ @@ -1169,7 +1183,7 @@ int ebt_register_table(struct ebt_table *table) ret = -ENOENT; goto free_unlock; } - list_prepend(&ebt_tables, table); + list_add(&table->list, &ebt_tables); mutex_unlock(&ebt_mutex); return 0; free_unlock: @@ -1195,7 +1209,7 @@ void ebt_unregister_table(struct ebt_table *table) return; } mutex_lock(&ebt_mutex); - LIST_DELETE(&ebt_tables, table); + list_del(&table->list); mutex_unlock(&ebt_mutex); vfree(table->private->entries); if (table->private->chainstack) { @@ -1465,7 +1479,7 @@ static int __init ebtables_init(void) int ret; mutex_lock(&ebt_mutex); - list_named_insert(&ebt_targets, &ebt_standard_target); + list_add(&ebt_standard_target.list, &ebt_targets); mutex_unlock(&ebt_mutex); if ((ret = nf_register_sockopt(&ebt_sockopts)) < 0) return ret; diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index 4f10b06..aaeaa9c 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -56,8 +56,6 @@ do { \ #define ARP_NF_ASSERT(x) #endif -#include - static inline int arp_devaddr_compare(const struct arpt_devaddr_info *ap, char *hdr_addr, int len) { diff --git a/net/ipv4/netfilter/ip_conntrack_core.c b/net/ipv4/netfilter/ip_conntrack_core.c index 5da25ad..2568d48 100644 --- a/net/ipv4/netfilter/ip_conntrack_core.c +++ b/net/ipv4/netfilter/ip_conntrack_core.c @@ -47,7 +47,6 @@ #include #include #include -#include #define IP_CONNTRACK_VERSION "2.4" @@ -294,15 +293,10 @@ void ip_ct_remove_expectations(struct ip_conntrack *ct) static void clean_from_lists(struct ip_conntrack *ct) { - unsigned int ho, hr; - DEBUGP("clean_from_lists(%p)\n", ct); ASSERT_WRITE_LOCK(&ip_conntrack_lock); - - ho = hash_conntrack(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple); - hr = hash_conntrack(&ct->tuplehash[IP_CT_DIR_REPLY].tuple); - LIST_DELETE(&ip_conntrack_hash[ho], &ct->tuplehash[IP_CT_DIR_ORIGINAL]); - LIST_DELETE(&ip_conntrack_hash[hr], &ct->tuplehash[IP_CT_DIR_REPLY]); + list_del(&ct->tuplehash[IP_CT_DIR_ORIGINAL].list); + list_del(&ct->tuplehash[IP_CT_DIR_REPLY].list); /* Destroy all pending expectations */ ip_ct_remove_expectations(ct); @@ -367,16 +361,6 @@ static void death_by_timeout(unsigned long ul_conntrack) ip_conntrack_put(ct); } -static inline int -conntrack_tuple_cmp(const struct ip_conntrack_tuple_hash *i, - const struct ip_conntrack_tuple *tuple, - const struct ip_conntrack *ignored_conntrack) -{ - ASSERT_READ_LOCK(&ip_conntrack_lock); - return tuplehash_to_ctrack(i) != ignored_conntrack - && ip_ct_tuple_equal(tuple, &i->tuple); -} - struct ip_conntrack_tuple_hash * __ip_conntrack_find(const struct ip_conntrack_tuple *tuple, const struct ip_conntrack *ignored_conntrack) @@ -386,7 +370,8 @@ __ip_conntrack_find(const struct ip_conntrack_tuple *tuple, ASSERT_READ_LOCK(&ip_conntrack_lock); list_for_each_entry(h, &ip_conntrack_hash[hash], list) { - if (conntrack_tuple_cmp(h, tuple, ignored_conntrack)) { + if (tuplehash_to_ctrack(h) != ignored_conntrack && + ip_ct_tuple_equal(tuple, &h->tuple)) { CONNTRACK_STAT_INC(found); return h; } @@ -417,10 +402,10 @@ static void __ip_conntrack_hash_insert(struct ip_conntrack *ct, unsigned int repl_hash) { ct->id = ++ip_conntrack_next_id; - list_prepend(&ip_conntrack_hash[hash], - &ct->tuplehash[IP_CT_DIR_ORIGINAL].list); - list_prepend(&ip_conntrack_hash[repl_hash], - &ct->tuplehash[IP_CT_DIR_REPLY].list); + list_add(&ct->tuplehash[IP_CT_DIR_ORIGINAL].list, + &ip_conntrack_hash[hash]); + list_add(&ct->tuplehash[IP_CT_DIR_REPLY].list, + &ip_conntrack_hash[repl_hash]); } void ip_conntrack_hash_insert(struct ip_conntrack *ct) @@ -440,6 +425,7 @@ int __ip_conntrack_confirm(struct sk_buff **pskb) { unsigned int hash, repl_hash; + struct ip_conntrack_tuple_hash *h; struct ip_conntrack *ct; enum ip_conntrack_info ctinfo; @@ -470,43 +456,43 @@ __ip_conntrack_confirm(struct sk_buff **pskb) /* See if there's one in the list already, including reverse: NAT could have grabbed it without realizing, since we're not in the hash. If there is, we lost race. */ - if (!LIST_FIND(&ip_conntrack_hash[hash], - conntrack_tuple_cmp, - struct ip_conntrack_tuple_hash *, - &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple, NULL) - && !LIST_FIND(&ip_conntrack_hash[repl_hash], - conntrack_tuple_cmp, - struct ip_conntrack_tuple_hash *, - &ct->tuplehash[IP_CT_DIR_REPLY].tuple, NULL)) { - /* Remove from unconfirmed list */ - list_del(&ct->tuplehash[IP_CT_DIR_ORIGINAL].list); + list_for_each_entry(h, &ip_conntrack_hash[hash], list) + if (ip_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple, + &h->tuple)) + goto out; + list_for_each_entry(h, &ip_conntrack_hash[repl_hash], list) + if (ip_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_REPLY].tuple, + &h->tuple)) + goto out; - __ip_conntrack_hash_insert(ct, hash, repl_hash); - /* Timer relative to confirmation time, not original - setting time, otherwise we'd get timer wrap in - weird delay cases. */ - ct->timeout.expires += jiffies; - add_timer(&ct->timeout); - atomic_inc(&ct->ct_general.use); - set_bit(IPS_CONFIRMED_BIT, &ct->status); - CONNTRACK_STAT_INC(insert); - write_unlock_bh(&ip_conntrack_lock); - if (ct->helper) - ip_conntrack_event_cache(IPCT_HELPER, *pskb); + /* Remove from unconfirmed list */ + list_del(&ct->tuplehash[IP_CT_DIR_ORIGINAL].list); + + __ip_conntrack_hash_insert(ct, hash, repl_hash); + /* Timer relative to confirmation time, not original + setting time, otherwise we'd get timer wrap in + weird delay cases. */ + ct->timeout.expires += jiffies; + add_timer(&ct->timeout); + atomic_inc(&ct->ct_general.use); + set_bit(IPS_CONFIRMED_BIT, &ct->status); + CONNTRACK_STAT_INC(insert); + write_unlock_bh(&ip_conntrack_lock); + if (ct->helper) + ip_conntrack_event_cache(IPCT_HELPER, *pskb); #ifdef CONFIG_IP_NF_NAT_NEEDED - if (test_bit(IPS_SRC_NAT_DONE_BIT, &ct->status) || - test_bit(IPS_DST_NAT_DONE_BIT, &ct->status)) - ip_conntrack_event_cache(IPCT_NATINFO, *pskb); + if (test_bit(IPS_SRC_NAT_DONE_BIT, &ct->status) || + test_bit(IPS_DST_NAT_DONE_BIT, &ct->status)) + ip_conntrack_event_cache(IPCT_NATINFO, *pskb); #endif - ip_conntrack_event_cache(master_ct(ct) ? - IPCT_RELATED : IPCT_NEW, *pskb); + ip_conntrack_event_cache(master_ct(ct) ? + IPCT_RELATED : IPCT_NEW, *pskb); - return NF_ACCEPT; - } + return NF_ACCEPT; +out: CONNTRACK_STAT_INC(insert_failed); write_unlock_bh(&ip_conntrack_lock); - return NF_DROP; } @@ -527,23 +513,21 @@ ip_conntrack_tuple_taken(const struct ip_conntrack_tuple *tuple, /* There's a small race here where we may free a just-assured connection. Too bad: we're in trouble anyway. */ -static inline int unreplied(const struct ip_conntrack_tuple_hash *i) -{ - return !(test_bit(IPS_ASSURED_BIT, &tuplehash_to_ctrack(i)->status)); -} - static int early_drop(struct list_head *chain) { /* Traverse backwards: gives us oldest, which is roughly LRU */ struct ip_conntrack_tuple_hash *h; - struct ip_conntrack *ct = NULL; + struct ip_conntrack *ct = NULL, *tmp; int dropped = 0; read_lock_bh(&ip_conntrack_lock); - h = LIST_FIND_B(chain, unreplied, struct ip_conntrack_tuple_hash *); - if (h) { - ct = tuplehash_to_ctrack(h); - atomic_inc(&ct->ct_general.use); + list_for_each_entry_reverse(h, chain, list) { + tmp = tuplehash_to_ctrack(h); + if (!test_bit(IPS_ASSURED_BIT, &tmp->status)) { + ct = tmp; + atomic_inc(&ct->ct_general.use); + break; + } } read_unlock_bh(&ip_conntrack_lock); @@ -559,18 +543,16 @@ static int early_drop(struct list_head *chain) return dropped; } -static inline int helper_cmp(const struct ip_conntrack_helper *i, - const struct ip_conntrack_tuple *rtuple) -{ - return ip_ct_tuple_mask_cmp(rtuple, &i->tuple, &i->mask); -} - static struct ip_conntrack_helper * __ip_conntrack_helper_find( const struct ip_conntrack_tuple *tuple) { - return LIST_FIND(&helpers, helper_cmp, - struct ip_conntrack_helper *, - tuple); + struct ip_conntrack_helper *h; + + list_for_each_entry(h, &helpers, list) { + if (ip_ct_tuple_mask_cmp(tuple, &h->tuple, &h->mask)) + return h; + } + return NULL; } struct ip_conntrack_helper * @@ -1062,7 +1044,7 @@ int ip_conntrack_helper_register(struct ip_conntrack_helper *me) { BUG_ON(me->timeout == 0); write_lock_bh(&ip_conntrack_lock); - list_prepend(&helpers, me); + list_add(&me->list, &helpers); write_unlock_bh(&ip_conntrack_lock); return 0; @@ -1081,24 +1063,24 @@ __ip_conntrack_helper_find_byname(const char *name) return NULL; } -static inline int unhelp(struct ip_conntrack_tuple_hash *i, - const struct ip_conntrack_helper *me) +static inline void unhelp(struct ip_conntrack_tuple_hash *i, + const struct ip_conntrack_helper *me) { if (tuplehash_to_ctrack(i)->helper == me) { ip_conntrack_event(IPCT_HELPER, tuplehash_to_ctrack(i)); tuplehash_to_ctrack(i)->helper = NULL; } - return 0; } void ip_conntrack_helper_unregister(struct ip_conntrack_helper *me) { unsigned int i; + struct ip_conntrack_tuple_hash *h; struct ip_conntrack_expect *exp, *tmp; /* Need write lock here, to delete helper. */ write_lock_bh(&ip_conntrack_lock); - LIST_DELETE(&helpers, me); + list_del(&me->list); /* Get rid of expectations */ list_for_each_entry_safe(exp, tmp, &ip_conntrack_expect_list, list) { @@ -1108,10 +1090,12 @@ void ip_conntrack_helper_unregister(struct ip_conntrack_helper *me) } } /* Get rid of expecteds, set helpers to NULL. */ - LIST_FIND_W(&unconfirmed, unhelp, struct ip_conntrack_tuple_hash*, me); - for (i = 0; i < ip_conntrack_htable_size; i++) - LIST_FIND_W(&ip_conntrack_hash[i], unhelp, - struct ip_conntrack_tuple_hash *, me); + list_for_each_entry(h, &unconfirmed, list) + unhelp(h, me); + for (i = 0; i < ip_conntrack_htable_size; i++) { + list_for_each_entry(h, &ip_conntrack_hash[i], list) + unhelp(h, me); + } write_unlock_bh(&ip_conntrack_lock); /* Someone could be still looking at the helper in a bh. */ @@ -1237,46 +1221,43 @@ static void ip_conntrack_attach(struct sk_buff *nskb, struct sk_buff *skb) nf_conntrack_get(nskb->nfct); } -static inline int -do_iter(const struct ip_conntrack_tuple_hash *i, - int (*iter)(struct ip_conntrack *i, void *data), - void *data) -{ - return iter(tuplehash_to_ctrack(i), data); -} - /* Bring out ya dead! */ -static struct ip_conntrack_tuple_hash * +static struct ip_conntrack * get_next_corpse(int (*iter)(struct ip_conntrack *i, void *data), void *data, unsigned int *bucket) { - struct ip_conntrack_tuple_hash *h = NULL; + struct ip_conntrack_tuple_hash *h; + struct ip_conntrack *ct; write_lock_bh(&ip_conntrack_lock); for (; *bucket < ip_conntrack_htable_size; (*bucket)++) { - h = LIST_FIND_W(&ip_conntrack_hash[*bucket], do_iter, - struct ip_conntrack_tuple_hash *, iter, data); - if (h) - break; + list_for_each_entry(h, &ip_conntrack_hash[*bucket], list) { + ct = tuplehash_to_ctrack(h); + if (iter(ct, data)) + goto found; + } + } + list_for_each_entry(h, &unconfirmed, list) { + ct = tuplehash_to_ctrack(h); + if (iter(ct, data)) + goto found; } - if (!h) - h = LIST_FIND_W(&unconfirmed, do_iter, - struct ip_conntrack_tuple_hash *, iter, data); - if (h) - atomic_inc(&tuplehash_to_ctrack(h)->ct_general.use); write_unlock_bh(&ip_conntrack_lock); + return NULL; - return h; +found: + atomic_inc(&ct->ct_general.use); + write_unlock_bh(&ip_conntrack_lock); + return ct; } void ip_ct_iterate_cleanup(int (*iter)(struct ip_conntrack *i, void *), void *data) { - struct ip_conntrack_tuple_hash *h; + struct ip_conntrack *ct; unsigned int bucket = 0; - while ((h = get_next_corpse(iter, data, &bucket)) != NULL) { - struct ip_conntrack *ct = tuplehash_to_ctrack(h); + while ((ct = get_next_corpse(iter, data, &bucket)) != NULL) { /* Time to push up daises... */ if (del_timer(&ct->timeout)) death_by_timeout((unsigned long)ct); diff --git a/net/ipv4/netfilter/ip_conntrack_proto_gre.c b/net/ipv4/netfilter/ip_conntrack_proto_gre.c index 4ee016c..92c6d8b 100644 --- a/net/ipv4/netfilter/ip_conntrack_proto_gre.c +++ b/net/ipv4/netfilter/ip_conntrack_proto_gre.c @@ -37,7 +37,6 @@ static DEFINE_RWLOCK(ip_ct_gre_lock); #define ASSERT_READ_LOCK(x) #define ASSERT_WRITE_LOCK(x) -#include #include #include #include @@ -82,10 +81,12 @@ static __be16 gre_keymap_lookup(struct ip_conntrack_tuple *t) __be16 key = 0; read_lock_bh(&ip_ct_gre_lock); - km = LIST_FIND(&gre_keymap_list, gre_key_cmpfn, - struct ip_ct_gre_keymap *, t); - if (km) - key = km->tuple.src.u.gre.key; + list_for_each_entry(km, &gre_keymap_list, list) { + if (gre_key_cmpfn(km, t)) { + key = km->tuple.src.u.gre.key; + break; + } + } read_unlock_bh(&ip_ct_gre_lock); DEBUGP("lookup src key 0x%x up key for ", key); @@ -99,7 +100,7 @@ int ip_ct_gre_keymap_add(struct ip_conntrack *ct, struct ip_conntrack_tuple *t, int reply) { - struct ip_ct_gre_keymap **exist_km, *km, *old; + struct ip_ct_gre_keymap **exist_km, *km; if (!ct->helper || strcmp(ct->helper->name, "pptp")) { DEBUGP("refusing to add GRE keymap to non-pptp session\n"); @@ -113,13 +114,10 @@ ip_ct_gre_keymap_add(struct ip_conntrack *ct, if (*exist_km) { /* check whether it's a retransmission */ - old = LIST_FIND(&gre_keymap_list, gre_key_cmpfn, - struct ip_ct_gre_keymap *, t); - if (old == *exist_km) { - DEBUGP("retransmission\n"); - return 0; + list_for_each_entry(km, &gre_keymap_list, list) { + if (gre_key_cmpfn(km, t) && km == *exist_km) + return 0; } - DEBUGP("trying to override keymap_%s for ct %p\n", reply? "reply":"orig", ct); return -EEXIST; @@ -136,7 +134,7 @@ ip_ct_gre_keymap_add(struct ip_conntrack *ct, DUMP_TUPLE_GRE(&km->tuple); write_lock_bh(&ip_ct_gre_lock); - list_append(&gre_keymap_list, km); + list_add_tail(&km->list, &gre_keymap_list); write_unlock_bh(&ip_ct_gre_lock); return 0; diff --git a/net/ipv4/netfilter/ip_conntrack_standalone.c b/net/ipv4/netfilter/ip_conntrack_standalone.c index 3f5d495..0213575 100644 --- a/net/ipv4/netfilter/ip_conntrack_standalone.c +++ b/net/ipv4/netfilter/ip_conntrack_standalone.c @@ -35,7 +35,6 @@ #include #include #include -#include #if 0 #define DEBUGP printk diff --git a/net/ipv4/netfilter/ip_nat_core.c b/net/ipv4/netfilter/ip_nat_core.c index 4c540d0..71f3e09 100644 --- a/net/ipv4/netfilter/ip_nat_core.c +++ b/net/ipv4/netfilter/ip_nat_core.c @@ -22,9 +22,6 @@ #include #include -#define ASSERT_READ_LOCK(x) -#define ASSERT_WRITE_LOCK(x) - #include #include #include @@ -33,7 +30,6 @@ #include #include #include -#include #if 0 #define DEBUGP printk diff --git a/net/ipv4/netfilter/ip_nat_helper.c b/net/ipv4/netfilter/ip_nat_helper.c index 021c3da..7f6a759 100644 --- a/net/ipv4/netfilter/ip_nat_helper.c +++ b/net/ipv4/netfilter/ip_nat_helper.c @@ -27,16 +27,12 @@ #include #include -#define ASSERT_READ_LOCK(x) -#define ASSERT_WRITE_LOCK(x) - #include #include #include #include #include #include -#include #if 0 #define DEBUGP printk diff --git a/net/ipv4/netfilter/ip_nat_rule.c b/net/ipv4/netfilter/ip_nat_rule.c index e59f5a8..7b70383 100644 --- a/net/ipv4/netfilter/ip_nat_rule.c +++ b/net/ipv4/netfilter/ip_nat_rule.c @@ -19,14 +19,10 @@ #include #include -#define ASSERT_READ_LOCK(x) -#define ASSERT_WRITE_LOCK(x) - #include #include #include #include -#include #if 0 #define DEBUGP printk diff --git a/net/ipv4/netfilter/ip_nat_standalone.c b/net/ipv4/netfilter/ip_nat_standalone.c index f3b7783..9c577db 100644 --- a/net/ipv4/netfilter/ip_nat_standalone.c +++ b/net/ipv4/netfilter/ip_nat_standalone.c @@ -30,9 +30,6 @@ #include #include -#define ASSERT_READ_LOCK(x) -#define ASSERT_WRITE_LOCK(x) - #include #include #include @@ -40,7 +37,6 @@ #include #include #include -#include #if 0 #define DEBUGP printk diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index d1c3153..73d477c 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -70,9 +70,6 @@ do { \ #define IP_NF_ASSERT(x) #endif - -#include - #if 0 /* All the better to debug you with... */ #define static diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index 3b64dbe..927137b 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -57,7 +57,6 @@ #include #include #include -#include #define NF_CONNTRACK_VERSION "0.5.0" @@ -539,15 +538,10 @@ void nf_ct_remove_expectations(struct nf_conn *ct) static void clean_from_lists(struct nf_conn *ct) { - unsigned int ho, hr; - DEBUGP("clean_from_lists(%p)\n", ct); ASSERT_WRITE_LOCK(&nf_conntrack_lock); - - ho = hash_conntrack(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple); - hr = hash_conntrack(&ct->tuplehash[IP_CT_DIR_REPLY].tuple); - LIST_DELETE(&nf_conntrack_hash[ho], &ct->tuplehash[IP_CT_DIR_ORIGINAL]); - LIST_DELETE(&nf_conntrack_hash[hr], &ct->tuplehash[IP_CT_DIR_REPLY]); + list_del(&ct->tuplehash[IP_CT_DIR_ORIGINAL].list); + list_del(&ct->tuplehash[IP_CT_DIR_REPLY].list); /* Destroy all pending expectations */ nf_ct_remove_expectations(ct); @@ -617,16 +611,6 @@ static void death_by_timeout(unsigned long ul_conntrack) nf_ct_put(ct); } -static inline int -conntrack_tuple_cmp(const struct nf_conntrack_tuple_hash *i, - const struct nf_conntrack_tuple *tuple, - const struct nf_conn *ignored_conntrack) -{ - ASSERT_READ_LOCK(&nf_conntrack_lock); - return nf_ct_tuplehash_to_ctrack(i) != ignored_conntrack - && nf_ct_tuple_equal(tuple, &i->tuple); -} - struct nf_conntrack_tuple_hash * __nf_conntrack_find(const struct nf_conntrack_tuple *tuple, const struct nf_conn *ignored_conntrack) @@ -636,7 +620,8 @@ __nf_conntrack_find(const struct nf_conntrack_tuple *tuple, ASSERT_READ_LOCK(&nf_conntrack_lock); list_for_each_entry(h, &nf_conntrack_hash[hash], list) { - if (conntrack_tuple_cmp(h, tuple, ignored_conntrack)) { + if (nf_ct_tuplehash_to_ctrack(h) != ignored_conntrack && + nf_ct_tuple_equal(tuple, &h->tuple)) { NF_CT_STAT_INC(found); return h; } @@ -667,10 +652,10 @@ static void __nf_conntrack_hash_insert(struct nf_conn *ct, unsigned int repl_hash) { ct->id = ++nf_conntrack_next_id; - list_prepend(&nf_conntrack_hash[hash], - &ct->tuplehash[IP_CT_DIR_ORIGINAL].list); - list_prepend(&nf_conntrack_hash[repl_hash], - &ct->tuplehash[IP_CT_DIR_REPLY].list); + list_add(&ct->tuplehash[IP_CT_DIR_ORIGINAL].list, + &nf_conntrack_hash[hash]); + list_add(&ct->tuplehash[IP_CT_DIR_REPLY].list, + &nf_conntrack_hash[repl_hash]); } void nf_conntrack_hash_insert(struct nf_conn *ct) @@ -690,7 +675,9 @@ int __nf_conntrack_confirm(struct sk_buff **pskb) { unsigned int hash, repl_hash; + struct nf_conntrack_tuple_hash *h; struct nf_conn *ct; + struct nf_conn_help *help; enum ip_conntrack_info ctinfo; ct = nf_ct_get(*pskb, &ctinfo); @@ -720,41 +707,41 @@ __nf_conntrack_confirm(struct sk_buff **pskb) /* See if there's one in the list already, including reverse: NAT could have grabbed it without realizing, since we're not in the hash. If there is, we lost race. */ - if (!LIST_FIND(&nf_conntrack_hash[hash], - conntrack_tuple_cmp, - struct nf_conntrack_tuple_hash *, - &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple, NULL) - && !LIST_FIND(&nf_conntrack_hash[repl_hash], - conntrack_tuple_cmp, - struct nf_conntrack_tuple_hash *, - &ct->tuplehash[IP_CT_DIR_REPLY].tuple, NULL)) { - struct nf_conn_help *help; - /* Remove from unconfirmed list */ - list_del(&ct->tuplehash[IP_CT_DIR_ORIGINAL].list); + list_for_each_entry(h, &nf_conntrack_hash[hash], list) + if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple, + &h->tuple)) + goto out; + list_for_each_entry(h, &nf_conntrack_hash[repl_hash], list) + if (nf_ct_tuple_equal(&ct->tuplehash[IP_CT_DIR_REPLY].tuple, + &h->tuple)) + goto out; - __nf_conntrack_hash_insert(ct, hash, repl_hash); - /* Timer relative to confirmation time, not original - setting time, otherwise we'd get timer wrap in - weird delay cases. */ - ct->timeout.expires += jiffies; - add_timer(&ct->timeout); - atomic_inc(&ct->ct_general.use); - set_bit(IPS_CONFIRMED_BIT, &ct->status); - NF_CT_STAT_INC(insert); - write_unlock_bh(&nf_conntrack_lock); - help = nfct_help(ct); - if (help && help->helper) - nf_conntrack_event_cache(IPCT_HELPER, *pskb); + /* Remove from unconfirmed list */ + list_del(&ct->tuplehash[IP_CT_DIR_ORIGINAL].list); + + __nf_conntrack_hash_insert(ct, hash, repl_hash); + /* Timer relative to confirmation time, not original + setting time, otherwise we'd get timer wrap in + weird delay cases. */ + ct->timeout.expires += jiffies; + add_timer(&ct->timeout); + atomic_inc(&ct->ct_general.use); + set_bit(IPS_CONFIRMED_BIT, &ct->status); + NF_CT_STAT_INC(insert); + write_unlock_bh(&nf_conntrack_lock); + help = nfct_help(ct); + if (help && help->helper) + nf_conntrack_event_cache(IPCT_HELPER, *pskb); #ifdef CONFIG_NF_NAT_NEEDED - if (test_bit(IPS_SRC_NAT_DONE_BIT, &ct->status) || - test_bit(IPS_DST_NAT_DONE_BIT, &ct->status)) - nf_conntrack_event_cache(IPCT_NATINFO, *pskb); + if (test_bit(IPS_SRC_NAT_DONE_BIT, &ct->status) || + test_bit(IPS_DST_NAT_DONE_BIT, &ct->status)) + nf_conntrack_event_cache(IPCT_NATINFO, *pskb); #endif - nf_conntrack_event_cache(master_ct(ct) ? - IPCT_RELATED : IPCT_NEW, *pskb); - return NF_ACCEPT; - } + nf_conntrack_event_cache(master_ct(ct) ? + IPCT_RELATED : IPCT_NEW, *pskb); + return NF_ACCEPT; +out: NF_CT_STAT_INC(insert_failed); write_unlock_bh(&nf_conntrack_lock); return NF_DROP; @@ -777,24 +764,21 @@ nf_conntrack_tuple_taken(const struct nf_conntrack_tuple *tuple, /* There's a small race here where we may free a just-assured connection. Too bad: we're in trouble anyway. */ -static inline int unreplied(const struct nf_conntrack_tuple_hash *i) -{ - return !(test_bit(IPS_ASSURED_BIT, - &nf_ct_tuplehash_to_ctrack(i)->status)); -} - static int early_drop(struct list_head *chain) { /* Traverse backwards: gives us oldest, which is roughly LRU */ struct nf_conntrack_tuple_hash *h; - struct nf_conn *ct = NULL; + struct nf_conn *ct = NULL, *tmp; int dropped = 0; read_lock_bh(&nf_conntrack_lock); - h = LIST_FIND_B(chain, unreplied, struct nf_conntrack_tuple_hash *); - if (h) { - ct = nf_ct_tuplehash_to_ctrack(h); - atomic_inc(&ct->ct_general.use); + list_for_each_entry_reverse(h, chain, list) { + tmp = nf_ct_tuplehash_to_ctrack(h); + if (!test_bit(IPS_ASSURED_BIT, &tmp->status)) { + ct = tmp; + atomic_inc(&ct->ct_general.use); + break; + } } read_unlock_bh(&nf_conntrack_lock); @@ -810,18 +794,16 @@ static int early_drop(struct list_head *chain) return dropped; } -static inline int helper_cmp(const struct nf_conntrack_helper *i, - const struct nf_conntrack_tuple *rtuple) -{ - return nf_ct_tuple_mask_cmp(rtuple, &i->tuple, &i->mask); -} - static struct nf_conntrack_helper * __nf_ct_helper_find(const struct nf_conntrack_tuple *tuple) { - return LIST_FIND(&helpers, helper_cmp, - struct nf_conntrack_helper *, - tuple); + struct nf_conntrack_helper *h; + + list_for_each_entry(h, &helpers, list) { + if (nf_ct_tuple_mask_cmp(tuple, &h->tuple, &h->mask)) + return h; + } + return NULL; } struct nf_conntrack_helper * @@ -1323,7 +1305,7 @@ int nf_conntrack_helper_register(struct nf_conntrack_helper *me) return ret; } write_lock_bh(&nf_conntrack_lock); - list_prepend(&helpers, me); + list_add(&me->list, &helpers); write_unlock_bh(&nf_conntrack_lock); return 0; @@ -1342,8 +1324,8 @@ __nf_conntrack_helper_find_byname(const char *name) return NULL; } -static inline int unhelp(struct nf_conntrack_tuple_hash *i, - const struct nf_conntrack_helper *me) +static inline void unhelp(struct nf_conntrack_tuple_hash *i, + const struct nf_conntrack_helper *me) { struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(i); struct nf_conn_help *help = nfct_help(ct); @@ -1352,17 +1334,17 @@ static inline int unhelp(struct nf_conntrack_tuple_hash *i, nf_conntrack_event(IPCT_HELPER, ct); help->helper = NULL; } - return 0; } void nf_conntrack_helper_unregister(struct nf_conntrack_helper *me) { unsigned int i; + struct nf_conntrack_tuple_hash *h; struct nf_conntrack_expect *exp, *tmp; /* Need write lock here, to delete helper. */ write_lock_bh(&nf_conntrack_lock); - LIST_DELETE(&helpers, me); + list_del(&me->list); /* Get rid of expectations */ list_for_each_entry_safe(exp, tmp, &nf_conntrack_expect_list, list) { @@ -1374,10 +1356,12 @@ void nf_conntrack_helper_unregister(struct nf_conntrack_helper *me) } /* Get rid of expecteds, set helpers to NULL. */ - LIST_FIND_W(&unconfirmed, unhelp, struct nf_conntrack_tuple_hash*, me); - for (i = 0; i < nf_conntrack_htable_size; i++) - LIST_FIND_W(&nf_conntrack_hash[i], unhelp, - struct nf_conntrack_tuple_hash *, me); + list_for_each_entry(h, &unconfirmed, list) + unhelp(h, me); + for (i = 0; i < nf_conntrack_htable_size; i++) { + list_for_each_entry(h, &nf_conntrack_hash[i], list) + unhelp(h, me); + } write_unlock_bh(&nf_conntrack_lock); /* Someone could be still looking at the helper in a bh. */ @@ -1510,37 +1494,40 @@ do_iter(const struct nf_conntrack_tuple_hash *i, } /* Bring out ya dead! */ -static struct nf_conntrack_tuple_hash * +static struct nf_conn * get_next_corpse(int (*iter)(struct nf_conn *i, void *data), void *data, unsigned int *bucket) { - struct nf_conntrack_tuple_hash *h = NULL; + struct nf_conntrack_tuple_hash *h; + struct nf_conn *ct; write_lock_bh(&nf_conntrack_lock); for (; *bucket < nf_conntrack_htable_size; (*bucket)++) { - h = LIST_FIND_W(&nf_conntrack_hash[*bucket], do_iter, - struct nf_conntrack_tuple_hash *, iter, data); - if (h) - break; + list_for_each_entry(h, &nf_conntrack_hash[*bucket], list) { + ct = nf_ct_tuplehash_to_ctrack(h); + if (iter(ct, data)) + goto found; + } } - if (!h) - h = LIST_FIND_W(&unconfirmed, do_iter, - struct nf_conntrack_tuple_hash *, iter, data); - if (h) - atomic_inc(&nf_ct_tuplehash_to_ctrack(h)->ct_general.use); + list_for_each_entry(h, &unconfirmed, list) { + ct = nf_ct_tuplehash_to_ctrack(h); + if (iter(ct, data)) + goto found; + } + return NULL; +found: + atomic_inc(&nf_ct_tuplehash_to_ctrack(h)->ct_general.use); write_unlock_bh(&nf_conntrack_lock); - - return h; + return ct; } void nf_ct_iterate_cleanup(int (*iter)(struct nf_conn *i, void *data), void *data) { - struct nf_conntrack_tuple_hash *h; + struct nf_conn *ct; unsigned int bucket = 0; - while ((h = get_next_corpse(iter, data, &bucket)) != NULL) { - struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h); + while ((ct = get_next_corpse(iter, data, &bucket)) != NULL) { /* Time to push up daises... */ if (del_timer(&ct->timeout)) death_by_timeout((unsigned long)ct); diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c index 9a1de0c..5954f67 100644 --- a/net/netfilter/nf_conntrack_standalone.c +++ b/net/netfilter/nf_conntrack_standalone.c @@ -37,7 +37,6 @@ #include #include #include -#include #if 0 #define DEBUGP printk diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index 8037ba6..be7baf4 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -81,7 +81,7 @@ xt_unregister_target(struct xt_target *target) int af = target->family; mutex_lock(&xt[af].mutex); - LIST_DELETE(&xt[af].target, target); + list_del(&target->list); mutex_unlock(&xt[af].mutex); } EXPORT_SYMBOL(xt_unregister_target); @@ -138,7 +138,7 @@ xt_unregister_match(struct xt_match *match) int af = match->family; mutex_lock(&xt[af].mutex); - LIST_DELETE(&xt[af].match, match); + list_del(&match->list); mutex_unlock(&xt[af].mutex); } EXPORT_SYMBOL(xt_unregister_match); @@ -575,15 +575,18 @@ int xt_register_table(struct xt_table *table, { int ret; struct xt_table_info *private; + struct xt_table *t; ret = mutex_lock_interruptible(&xt[table->af].mutex); if (ret != 0) return ret; /* Don't autoload: we'd eat our tail... */ - if (list_named_find(&xt[table->af].tables, table->name)) { - ret = -EEXIST; - goto unlock; + list_for_each_entry(t, &xt[table->af].tables, list) { + if (strcmp(t->name, table->name) == 0) { + ret = -EEXIST; + goto unlock; + } } /* Simplifies replace_table code. */ @@ -598,7 +601,7 @@ int xt_register_table(struct xt_table *table, /* save number of initial entries */ private->initial_entries = private->number; - list_prepend(&xt[table->af].tables, table); + list_add(&table->list, &xt[table->af].tables); ret = 0; unlock: @@ -613,7 +616,7 @@ void *xt_unregister_table(struct xt_table *table) mutex_lock(&xt[table->af].mutex); private = table->private; - LIST_DELETE(&xt[table->af].tables, table); + list_del(&table->list); mutex_unlock(&xt[table->af].mutex); return private; -- cgit v0.10.2 From 50b9f1d509eb998db73cd769c9511186474f566e Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 11:58:17 -0700 Subject: [NETFILTER]: xt_conntrack: clean up overly long lines Also fix some whitespace errors and use the NAT bits instead of deriving the state manually. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/netfilter/xt_conntrack.c b/net/netfilter/xt_conntrack.c index 39c57e9..0ea501a 100644 --- a/net/netfilter/xt_conntrack.c +++ b/net/netfilter/xt_conntrack.c @@ -45,7 +45,7 @@ match(const struct sk_buff *skb, ct = ip_conntrack_get((struct sk_buff *)skb, &ctinfo); -#define FWINV(bool,invflg) ((bool) ^ !!(sinfo->invflags & invflg)) +#define FWINV(bool, invflg) ((bool) ^ !!(sinfo->invflags & invflg)) if (ct == &ip_conntrack_untracked) statebit = XT_CONNTRACK_STATE_UNTRACKED; @@ -54,63 +54,72 @@ match(const struct sk_buff *skb, else statebit = XT_CONNTRACK_STATE_INVALID; - if(sinfo->flags & XT_CONNTRACK_STATE) { + if (sinfo->flags & XT_CONNTRACK_STATE) { if (ct) { - if(ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.ip != - ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.ip) + if (test_bit(IPS_SRC_NAT_BIT, &ct->status)) statebit |= XT_CONNTRACK_STATE_SNAT; - - if(ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.ip != - ct->tuplehash[IP_CT_DIR_REPLY].tuple.src.ip) + if (test_bit(IPS_DST_NAT_BIT, &ct->status)) statebit |= XT_CONNTRACK_STATE_DNAT; } - - if (FWINV((statebit & sinfo->statemask) == 0, XT_CONNTRACK_STATE)) + if (FWINV((statebit & sinfo->statemask) == 0, + XT_CONNTRACK_STATE)) return 0; } - if(sinfo->flags & XT_CONNTRACK_PROTO) { - if (!ct || FWINV(ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.protonum != sinfo->tuple[IP_CT_DIR_ORIGINAL].dst.protonum, XT_CONNTRACK_PROTO)) - return 0; - } - - if(sinfo->flags & XT_CONNTRACK_ORIGSRC) { - if (!ct || FWINV((ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.ip&sinfo->sipmsk[IP_CT_DIR_ORIGINAL].s_addr) != sinfo->tuple[IP_CT_DIR_ORIGINAL].src.ip, XT_CONNTRACK_ORIGSRC)) + if (ct == NULL) { + if (sinfo->flags & ~XT_CONNTRACK_STATE) return 0; + return 1; } - if(sinfo->flags & XT_CONNTRACK_ORIGDST) { - if (!ct || FWINV((ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.ip&sinfo->dipmsk[IP_CT_DIR_ORIGINAL].s_addr) != sinfo->tuple[IP_CT_DIR_ORIGINAL].dst.ip, XT_CONNTRACK_ORIGDST)) - return 0; - } - - if(sinfo->flags & XT_CONNTRACK_REPLSRC) { - if (!ct || FWINV((ct->tuplehash[IP_CT_DIR_REPLY].tuple.src.ip&sinfo->sipmsk[IP_CT_DIR_REPLY].s_addr) != sinfo->tuple[IP_CT_DIR_REPLY].src.ip, XT_CONNTRACK_REPLSRC)) - return 0; - } + if (sinfo->flags & XT_CONNTRACK_PROTO && + FWINV(ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.protonum != + sinfo->tuple[IP_CT_DIR_ORIGINAL].dst.protonum, + XT_CONNTRACK_PROTO)) + return 0; + + if (sinfo->flags & XT_CONNTRACK_ORIGSRC && + FWINV((ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.ip & + sinfo->sipmsk[IP_CT_DIR_ORIGINAL].s_addr) != + sinfo->tuple[IP_CT_DIR_ORIGINAL].src.ip, + XT_CONNTRACK_ORIGSRC)) + return 0; - if(sinfo->flags & XT_CONNTRACK_REPLDST) { - if (!ct || FWINV((ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.ip&sinfo->dipmsk[IP_CT_DIR_REPLY].s_addr) != sinfo->tuple[IP_CT_DIR_REPLY].dst.ip, XT_CONNTRACK_REPLDST)) - return 0; - } + if (sinfo->flags & XT_CONNTRACK_ORIGDST && + FWINV((ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.ip & + sinfo->dipmsk[IP_CT_DIR_ORIGINAL].s_addr) != + sinfo->tuple[IP_CT_DIR_ORIGINAL].dst.ip, + XT_CONNTRACK_ORIGDST)) + return 0; - if(sinfo->flags & XT_CONNTRACK_STATUS) { - if (!ct || FWINV((ct->status & sinfo->statusmask) == 0, XT_CONNTRACK_STATUS)) - return 0; - } + if (sinfo->flags & XT_CONNTRACK_REPLSRC && + FWINV((ct->tuplehash[IP_CT_DIR_REPLY].tuple.src.ip & + sinfo->sipmsk[IP_CT_DIR_REPLY].s_addr) != + sinfo->tuple[IP_CT_DIR_REPLY].src.ip, + XT_CONNTRACK_REPLSRC)) + return 0; - if(sinfo->flags & XT_CONNTRACK_EXPIRES) { - unsigned long expires; + if (sinfo->flags & XT_CONNTRACK_REPLDST && + FWINV((ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.ip & + sinfo->dipmsk[IP_CT_DIR_REPLY].s_addr) != + sinfo->tuple[IP_CT_DIR_REPLY].dst.ip, + XT_CONNTRACK_REPLDST)) + return 0; - if(!ct) - return 0; + if (sinfo->flags & XT_CONNTRACK_STATUS && + FWINV((ct->status & sinfo->statusmask) == 0, + XT_CONNTRACK_STATUS)) + return 0; - expires = timer_pending(&ct->timeout) ? (ct->timeout.expires - jiffies)/HZ : 0; + if (sinfo->flags & XT_CONNTRACK_EXPIRES) { + unsigned long expires = timer_pending(&ct->timeout) ? + (ct->timeout.expires - jiffies)/HZ : 0; - if (FWINV(!(expires >= sinfo->expires_min && expires <= sinfo->expires_max), XT_CONNTRACK_EXPIRES)) + if (FWINV(!(expires >= sinfo->expires_min && + expires <= sinfo->expires_max), + XT_CONNTRACK_EXPIRES)) return 0; } - return 1; } @@ -141,63 +150,72 @@ match(const struct sk_buff *skb, else statebit = XT_CONNTRACK_STATE_INVALID; - if(sinfo->flags & XT_CONNTRACK_STATE) { + if (sinfo->flags & XT_CONNTRACK_STATE) { if (ct) { - if(ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u3.ip != - ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u3.ip) + if (test_bit(IPS_SRC_NAT_BIT, &ct->status)) statebit |= XT_CONNTRACK_STATE_SNAT; - - if(ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u3.ip != - ct->tuplehash[IP_CT_DIR_REPLY].tuple.src.u3.ip) + if (test_bit(IPS_DST_NAT_BIT, &ct->status)) statebit |= XT_CONNTRACK_STATE_DNAT; } - - if (FWINV((statebit & sinfo->statemask) == 0, XT_CONNTRACK_STATE)) + if (FWINV((statebit & sinfo->statemask) == 0, + XT_CONNTRACK_STATE)) return 0; } - if(sinfo->flags & XT_CONNTRACK_PROTO) { - if (!ct || FWINV(ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.protonum != sinfo->tuple[IP_CT_DIR_ORIGINAL].dst.protonum, XT_CONNTRACK_PROTO)) - return 0; - } - - if(sinfo->flags & XT_CONNTRACK_ORIGSRC) { - if (!ct || FWINV((ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u3.ip&sinfo->sipmsk[IP_CT_DIR_ORIGINAL].s_addr) != sinfo->tuple[IP_CT_DIR_ORIGINAL].src.ip, XT_CONNTRACK_ORIGSRC)) + if (ct == NULL) { + if (sinfo->flags & ~XT_CONNTRACK_STATE) return 0; + return 1; } - if(sinfo->flags & XT_CONNTRACK_ORIGDST) { - if (!ct || FWINV((ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u3.ip&sinfo->dipmsk[IP_CT_DIR_ORIGINAL].s_addr) != sinfo->tuple[IP_CT_DIR_ORIGINAL].dst.ip, XT_CONNTRACK_ORIGDST)) - return 0; - } - - if(sinfo->flags & XT_CONNTRACK_REPLSRC) { - if (!ct || FWINV((ct->tuplehash[IP_CT_DIR_REPLY].tuple.src.u3.ip&sinfo->sipmsk[IP_CT_DIR_REPLY].s_addr) != sinfo->tuple[IP_CT_DIR_REPLY].src.ip, XT_CONNTRACK_REPLSRC)) - return 0; - } + if (sinfo->flags & XT_CONNTRACK_PROTO && + FWINV(ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.protonum != + sinfo->tuple[IP_CT_DIR_ORIGINAL].dst.protonum, + XT_CONNTRACK_PROTO)) + return 0; + + if (sinfo->flags & XT_CONNTRACK_ORIGSRC && + FWINV((ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.u3.ip & + sinfo->sipmsk[IP_CT_DIR_ORIGINAL].s_addr) != + sinfo->tuple[IP_CT_DIR_ORIGINAL].src.ip, + XT_CONNTRACK_ORIGSRC)) + return 0; - if(sinfo->flags & XT_CONNTRACK_REPLDST) { - if (!ct || FWINV((ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u3.ip&sinfo->dipmsk[IP_CT_DIR_REPLY].s_addr) != sinfo->tuple[IP_CT_DIR_REPLY].dst.ip, XT_CONNTRACK_REPLDST)) - return 0; - } + if (sinfo->flags & XT_CONNTRACK_ORIGDST && + FWINV((ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.u3.ip & + sinfo->dipmsk[IP_CT_DIR_ORIGINAL].s_addr) != + sinfo->tuple[IP_CT_DIR_ORIGINAL].dst.ip, + XT_CONNTRACK_ORIGDST)) + return 0; - if(sinfo->flags & XT_CONNTRACK_STATUS) { - if (!ct || FWINV((ct->status & sinfo->statusmask) == 0, XT_CONNTRACK_STATUS)) - return 0; - } + if (sinfo->flags & XT_CONNTRACK_REPLSRC && + FWINV((ct->tuplehash[IP_CT_DIR_REPLY].tuple.src.u3.ip & + sinfo->sipmsk[IP_CT_DIR_REPLY].s_addr) != + sinfo->tuple[IP_CT_DIR_REPLY].src.ip, + XT_CONNTRACK_REPLSRC)) + return 0; - if(sinfo->flags & XT_CONNTRACK_EXPIRES) { - unsigned long expires; + if (sinfo->flags & XT_CONNTRACK_REPLDST && + FWINV((ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u3.ip & + sinfo->dipmsk[IP_CT_DIR_REPLY].s_addr) != + sinfo->tuple[IP_CT_DIR_REPLY].dst.ip, + XT_CONNTRACK_REPLDST)) + return 0; - if(!ct) - return 0; + if (sinfo->flags & XT_CONNTRACK_STATUS && + FWINV((ct->status & sinfo->statusmask) == 0, + XT_CONNTRACK_STATUS)) + return 0; - expires = timer_pending(&ct->timeout) ? (ct->timeout.expires - jiffies)/HZ : 0; + if(sinfo->flags & XT_CONNTRACK_EXPIRES) { + unsigned long expires = timer_pending(&ct->timeout) ? + (ct->timeout.expires - jiffies)/HZ : 0; - if (FWINV(!(expires >= sinfo->expires_min && expires <= sinfo->expires_max), XT_CONNTRACK_EXPIRES)) + if (FWINV(!(expires >= sinfo->expires_min && + expires <= sinfo->expires_max), + XT_CONNTRACK_EXPIRES)) return 0; } - return 1; } @@ -220,8 +238,7 @@ checkentry(const char *tablename, return 1; } -static void -destroy(const struct xt_match *match, void *matchinfo) +static void destroy(const struct xt_match *match, void *matchinfo) { #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE) nf_ct_l3proto_module_put(match->family); -- cgit v0.10.2 From 68e1f188de535865d4543bae92d168c007857e7b Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 11:58:35 -0700 Subject: [NETFILTER]: ipt_TCPMSS: reformat - fix whitespace error - break lines at 80 characters - reformat some expressions to be more readable Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ipt_TCPMSS.c b/net/ipv4/netfilter/ipt_TCPMSS.c index ac8a35e..bfc8d9c 100644 --- a/net/ipv4/netfilter/ipt_TCPMSS.c +++ b/net/ipv4/netfilter/ipt_TCPMSS.c @@ -31,8 +31,10 @@ static inline unsigned int optlen(const u_int8_t *opt, unsigned int offset) { /* Beware zero-length options: make finite progress */ - if (opt[offset] <= TCPOPT_NOP || opt[offset+1] == 0) return 1; - else return opt[offset+1]; + if (opt[offset] <= TCPOPT_NOP || opt[offset+1] == 0) + return 1; + else + return opt[offset+1]; } static unsigned int @@ -55,7 +57,6 @@ ipt_tcpmss_target(struct sk_buff **pskb, iph = (*pskb)->nh.iph; tcplen = (*pskb)->len - iph->ihl*4; - tcph = (void *)iph + iph->ihl*4; /* Since it passed flags test in tcp match, we know it is is @@ -71,37 +72,39 @@ ipt_tcpmss_target(struct sk_buff **pskb, return NF_DROP; } - if(tcpmssinfo->mss == IPT_TCPMSS_CLAMP_PMTU) { - if(!(*pskb)->dst) { + if (tcpmssinfo->mss == IPT_TCPMSS_CLAMP_PMTU) { + if (!(*pskb)->dst) { if (net_ratelimit()) - printk(KERN_ERR - "ipt_tcpmss_target: no dst?! can't determine path-MTU\n"); + printk(KERN_ERR "ipt_tcpmss_target: " + "no dst?! can't determine path-MTU\n"); return NF_DROP; /* or IPT_CONTINUE ?? */ } - if(dst_mtu((*pskb)->dst) <= (sizeof(struct iphdr) + sizeof(struct tcphdr))) { + if (dst_mtu((*pskb)->dst) <= sizeof(struct iphdr) + + sizeof(struct tcphdr)) { if (net_ratelimit()) - printk(KERN_ERR - "ipt_tcpmss_target: unknown or invalid path-MTU (%d)\n", dst_mtu((*pskb)->dst)); + printk(KERN_ERR "ipt_tcpmss_target: " + "unknown or invalid path-MTU (%d)\n", + dst_mtu((*pskb)->dst)); return NF_DROP; /* or IPT_CONTINUE ?? */ } - newmss = dst_mtu((*pskb)->dst) - sizeof(struct iphdr) - sizeof(struct tcphdr); + newmss = dst_mtu((*pskb)->dst) - sizeof(struct iphdr) - + sizeof(struct tcphdr); } else newmss = tcpmssinfo->mss; opt = (u_int8_t *)tcph; - for (i = sizeof(struct tcphdr); i < tcph->doff*4; i += optlen(opt, i)){ - if ((opt[i] == TCPOPT_MSS) && - ((tcph->doff*4 - i) >= TCPOLEN_MSS) && - (opt[i+1] == TCPOLEN_MSS)) { + for (i = sizeof(struct tcphdr); i < tcph->doff*4; i += optlen(opt, i)) { + if (opt[i] == TCPOPT_MSS && tcph->doff*4 - i >= TCPOLEN_MSS && + opt[i+1] == TCPOLEN_MSS) { u_int16_t oldmss; oldmss = (opt[i+2] << 8) | opt[i+3]; - if((tcpmssinfo->mss == IPT_TCPMSS_CLAMP_PMTU) && - (oldmss <= newmss)) - return IPT_CONTINUE; + if (tcpmssinfo->mss == IPT_TCPMSS_CLAMP_PMTU && + oldmss <= newmss) + return IPT_CONTINUE; opt[i+2] = (newmss & 0xff00) >> 8; opt[i+3] = (newmss & 0x00ff); @@ -113,7 +116,7 @@ ipt_tcpmss_target(struct sk_buff **pskb, DEBUGP(KERN_INFO "ipt_tcpmss_target: %u.%u.%u.%u:%hu" "->%u.%u.%u.%u:%hu changed TCP MSS option" - " (from %u to %u)\n", + " (from %u to %u)\n", NIPQUAD((*pskb)->nh.iph->saddr), ntohs(tcph->source), NIPQUAD((*pskb)->nh.iph->daddr), @@ -193,9 +196,9 @@ static inline int find_syn_match(const struct ipt_entry_match *m) { const struct ipt_tcp *tcpinfo = (const struct ipt_tcp *)m->data; - if (strcmp(m->u.kernel.match->name, "tcp") == 0 - && (tcpinfo->flg_cmp & TH_SYN) - && !(tcpinfo->invflags & IPT_TCP_INV_FLAGS)) + if (strcmp(m->u.kernel.match->name, "tcp") == 0 && + tcpinfo->flg_cmp & TH_SYN && + !(tcpinfo->invflags & IPT_TCP_INV_FLAGS)) return 1; return 0; @@ -212,11 +215,12 @@ ipt_tcpmss_checkentry(const char *tablename, const struct ipt_tcpmss_info *tcpmssinfo = targinfo; const struct ipt_entry *e = e_void; - if((tcpmssinfo->mss == IPT_TCPMSS_CLAMP_PMTU) && - ((hook_mask & ~((1 << NF_IP_FORWARD) - | (1 << NF_IP_LOCAL_OUT) - | (1 << NF_IP_POST_ROUTING))) != 0)) { - printk("TCPMSS: path-MTU clamping only supported in FORWARD, OUTPUT and POSTROUTING hooks\n"); + if (tcpmssinfo->mss == IPT_TCPMSS_CLAMP_PMTU && + (hook_mask & ~((1 << NF_IP_FORWARD) | + (1 << NF_IP_LOCAL_OUT) | + (1 << NF_IP_POST_ROUTING))) != 0) { + printk("TCPMSS: path-MTU clamping only supported in " + "FORWARD, OUTPUT and POSTROUTING hooks\n"); return 0; } -- cgit v0.10.2 From 2be344c4461d29b99113f62fa91c5ceab9997329 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 11:58:50 -0700 Subject: [NETFILTER]: ipt_TCPMSS: remove impossible condition Every skb must have a dst_entry at this point. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ipt_TCPMSS.c b/net/ipv4/netfilter/ipt_TCPMSS.c index bfc8d9c..b2d3c4f 100644 --- a/net/ipv4/netfilter/ipt_TCPMSS.c +++ b/net/ipv4/netfilter/ipt_TCPMSS.c @@ -73,13 +73,6 @@ ipt_tcpmss_target(struct sk_buff **pskb, } if (tcpmssinfo->mss == IPT_TCPMSS_CLAMP_PMTU) { - if (!(*pskb)->dst) { - if (net_ratelimit()) - printk(KERN_ERR "ipt_tcpmss_target: " - "no dst?! can't determine path-MTU\n"); - return NF_DROP; /* or IPT_CONTINUE ?? */ - } - if (dst_mtu((*pskb)->dst) <= sizeof(struct iphdr) + sizeof(struct tcphdr)) { if (net_ratelimit()) -- cgit v0.10.2 From ecb70c95c45ece0935b076295388267f6d8db65c Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 11:59:06 -0700 Subject: [NETFILTER]: ipt_TCPMSS: misc cleanup - remove debugging cruft - remove printk for reallocation failures - remove unused addition Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ipt_TCPMSS.c b/net/ipv4/netfilter/ipt_TCPMSS.c index b2d3c4f..4246c43 100644 --- a/net/ipv4/netfilter/ipt_TCPMSS.c +++ b/net/ipv4/netfilter/ipt_TCPMSS.c @@ -21,12 +21,6 @@ MODULE_LICENSE("GPL"); MODULE_AUTHOR("Marc Boucher "); MODULE_DESCRIPTION("iptables TCP MSS modification module"); -#if 0 -#define DEBUGP printk -#else -#define DEBUGP(format, args...) -#endif - static inline unsigned int optlen(const u_int8_t *opt, unsigned int offset) { @@ -106,16 +100,7 @@ ipt_tcpmss_target(struct sk_buff **pskb, htons(oldmss)^0xFFFF, htons(newmss), tcph->check, 0); - - DEBUGP(KERN_INFO "ipt_tcpmss_target: %u.%u.%u.%u:%hu" - "->%u.%u.%u.%u:%hu changed TCP MSS option" - " (from %u to %u)\n", - NIPQUAD((*pskb)->nh.iph->saddr), - ntohs(tcph->source), - NIPQUAD((*pskb)->nh.iph->daddr), - ntohs(tcph->dest), - oldmss, newmss); - goto retmodified; + return IPT_CONTINUE; } } @@ -127,13 +112,8 @@ ipt_tcpmss_target(struct sk_buff **pskb, newskb = skb_copy_expand(*pskb, skb_headroom(*pskb), TCPOLEN_MSS, GFP_ATOMIC); - if (!newskb) { - if (net_ratelimit()) - printk(KERN_ERR "ipt_tcpmss_target:" - " unable to allocate larger skb\n"); + if (!newskb) return NF_DROP; - } - kfree_skb(*pskb); *pskb = newskb; iph = (*pskb)->nh.iph; @@ -149,8 +129,6 @@ ipt_tcpmss_target(struct sk_buff **pskb, htons(tcplen) ^ 0xFFFF, htons(tcplen + TCPOLEN_MSS), tcph->check, 1); - tcplen += TCPOLEN_MSS; - opt[0] = TCPOPT_MSS; opt[1] = TCPOLEN_MSS; opt[2] = (newmss & 0xff00) >> 8; @@ -170,16 +148,6 @@ ipt_tcpmss_target(struct sk_buff **pskb, iph->check = nf_csum_update(iph->tot_len ^ 0xFFFF, newtotlen, iph->check); iph->tot_len = newtotlen; - - DEBUGP(KERN_INFO "ipt_tcpmss_target: %u.%u.%u.%u:%hu" - "->%u.%u.%u.%u:%hu added TCP MSS option (%u)\n", - NIPQUAD((*pskb)->nh.iph->saddr), - ntohs(tcph->source), - NIPQUAD((*pskb)->nh.iph->daddr), - ntohs(tcph->dest), - newmss); - - retmodified: return IPT_CONTINUE; } -- cgit v0.10.2 From 57dab5d0bfee21663ed20222b4cedeb0655ba1f3 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 11:59:25 -0700 Subject: [NETFILTER]: xt_limit: don't reset state on unrelated rule updates The limit match reinitializes its state whenever the ruleset changes, which means it will forget about previously used credits. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/netfilter/xt_limit.c b/net/netfilter/xt_limit.c index b9c9ff3..8bfcbdf 100644 --- a/net/netfilter/xt_limit.c +++ b/net/netfilter/xt_limit.c @@ -122,16 +122,16 @@ ipt_limit_checkentry(const char *tablename, return 0; } - /* User avg in seconds * XT_LIMIT_SCALE: convert to jiffies * - 128. */ - r->prev = jiffies; - r->credit = user2credits(r->avg * r->burst); /* Credits full. */ - r->credit_cap = user2credits(r->avg * r->burst); /* Credits full. */ - r->cost = user2credits(r->avg); - /* For SMP, we only want to use one set of counters. */ r->master = r; - + if (r->cost == 0) { + /* User avg in seconds * XT_LIMIT_SCALE: convert to jiffies * + 128. */ + r->prev = jiffies; + r->credit = user2credits(r->avg * r->burst); /* Credits full. */ + r->credit_cap = user2credits(r->avg * r->burst); /* Credits full. */ + r->cost = user2credits(r->avg); + } return 1; } -- cgit v0.10.2 From 9123de2c043996050bacf77031cad845f5976f5d Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 11:59:42 -0700 Subject: [NETFILTER]: ip6table_mangle: reroute when nfmark changes in NF_IP6_LOCAL_OUT Now that IPv6 supports policy routing we need to reroute in NF_IP6_LOCAL_OUT when the mark value changes. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter_ipv6.h b/include/linux/netfilter_ipv6.h index 52a7b9e..d97e268 100644 --- a/include/linux/netfilter_ipv6.h +++ b/include/linux/netfilter_ipv6.h @@ -73,6 +73,7 @@ enum nf_ip6_hook_priorities { }; #ifdef CONFIG_NETFILTER +extern int ip6_route_me_harder(struct sk_buff *skb); extern unsigned int nf_ip6_checksum(struct sk_buff *skb, unsigned int hook, unsigned int dataoff, u_int8_t protocol); diff --git a/include/net/ip6_route.h b/include/net/ip6_route.h index 2979095..6ca6b71 100644 --- a/include/net/ip6_route.h +++ b/include/net/ip6_route.h @@ -57,8 +57,6 @@ extern void ip6_route_input(struct sk_buff *skb); extern struct dst_entry * ip6_route_output(struct sock *sk, struct flowi *fl); -extern int ip6_route_me_harder(struct sk_buff *skb); - extern void ip6_route_init(void); extern void ip6_route_cleanup(void); diff --git a/net/ipv6/netfilter/ip6table_mangle.c b/net/ipv6/netfilter/ip6table_mangle.c index 32db04f..386ea26 100644 --- a/net/ipv6/netfilter/ip6table_mangle.c +++ b/net/ipv6/netfilter/ip6table_mangle.c @@ -180,12 +180,8 @@ ip6t_local_hook(unsigned int hook, && (memcmp(&(*pskb)->nh.ipv6h->saddr, &saddr, sizeof(saddr)) || memcmp(&(*pskb)->nh.ipv6h->daddr, &daddr, sizeof(daddr)) || (*pskb)->nfmark != nfmark - || (*pskb)->nh.ipv6h->hop_limit != hop_limit)) { - - /* something which could affect routing has changed */ - - DEBUGP("ip6table_mangle: we'd need to re-route a packet\n"); - } + || (*pskb)->nh.ipv6h->hop_limit != hop_limit)) + return ip6_route_me_harder(*pskb) == 0 ? ret : NF_DROP; return ret; } -- cgit v0.10.2 From 90d47db4a06f93f7339618b2a4f0cb032ef8d6d5 Mon Sep 17 00:00:00 2001 From: Dmitry Mishin Date: Wed, 20 Sep 2006 12:00:21 -0700 Subject: [NETFILTER]: x_tables: small check_entry & module_refcount cleanup While standard_target has target->me == NULL, module_put() should be called for it as for others, because there were try_module_get() before. Signed-off-by: Dmitry Mishin Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c index aaeaa9c..85f0d73 100644 --- a/net/ipv4/netfilter/arp_tables.c +++ b/net/ipv4/netfilter/arp_tables.c @@ -485,7 +485,7 @@ static inline int check_entry(struct arpt_entry *e, const char *name, unsigned i if (t->u.kernel.target == &arpt_standard_target) { if (!standard_check(t, size)) { ret = -EINVAL; - goto out; + goto err; } } else if (t->u.kernel.target->checkentry && !t->u.kernel.target->checkentry(name, e, target, t->data, diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index a0f3680..38e1e4f 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -573,7 +573,7 @@ check_entry(struct ipt_entry *e, const char *name, unsigned int size, if (t->u.kernel.target == &ipt_standard_target) { if (!standard_check(t, size)) { ret = -EINVAL; - goto cleanup_matches; + goto err; } } else if (t->u.kernel.target->checkentry && !t->u.kernel.target->checkentry(name, e, target, t->data, diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c index 73d477c..4ab368f 100644 --- a/net/ipv6/netfilter/ip6_tables.c +++ b/net/ipv6/netfilter/ip6_tables.c @@ -610,7 +610,7 @@ check_entry(struct ip6t_entry *e, const char *name, unsigned int size, if (t->u.kernel.target == &ip6t_standard_target) { if (!standard_check(t, size)) { ret = -EINVAL; - goto cleanup_matches; + goto err; } } else if (t->u.kernel.target->checkentry && !t->u.kernel.target->checkentry(name, e, target, t->data, -- cgit v0.10.2 From 01f348484dd8509254d045e3ad49029716eca6a1 Mon Sep 17 00:00:00 2001 From: Pablo Neira Ayuso Date: Wed, 20 Sep 2006 12:00:45 -0700 Subject: [NETFILTER]: ctnetlink: simplify the code to dump the conntrack table Merge the bits to dump the conntrack table and the ones to dump and zero counters in a single piece of code. This patch does not change the default behaviour if accounting is not enabled. Signed-off-by: Pablo Neira Ayuso Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_netlink.c b/net/ipv4/netfilter/ip_conntrack_netlink.c index a20b0e3..52eddea 100644 --- a/net/ipv4/netfilter/ip_conntrack_netlink.c +++ b/net/ipv4/netfilter/ip_conntrack_netlink.c @@ -436,6 +436,11 @@ restart: cb->args[1] = (unsigned long)ct; goto out; } +#ifdef CONFIG_NF_CT_ACCT + if (NFNL_MSG_TYPE(cb->nlh->nlmsg_type) == + IPCTNL_MSG_CT_GET_CTRZERO) + memset(&ct->counters, 0, sizeof(ct->counters)); +#endif } if (cb->args[1]) { cb->args[1] = 0; @@ -451,46 +456,6 @@ out: return skb->len; } -#ifdef CONFIG_IP_NF_CT_ACCT -static int -ctnetlink_dump_table_w(struct sk_buff *skb, struct netlink_callback *cb) -{ - struct ip_conntrack *ct = NULL; - struct ip_conntrack_tuple_hash *h; - struct list_head *i; - u_int32_t *id = (u_int32_t *) &cb->args[1]; - - DEBUGP("entered %s, last bucket=%u id=%u\n", __FUNCTION__, - cb->args[0], *id); - - write_lock_bh(&ip_conntrack_lock); - for (; cb->args[0] < ip_conntrack_htable_size; cb->args[0]++, *id = 0) { - list_for_each_prev(i, &ip_conntrack_hash[cb->args[0]]) { - h = (struct ip_conntrack_tuple_hash *) i; - if (DIRECTION(h) != IP_CT_DIR_ORIGINAL) - continue; - ct = tuplehash_to_ctrack(h); - if (ct->id <= *id) - continue; - if (ctnetlink_fill_info(skb, NETLINK_CB(cb->skb).pid, - cb->nlh->nlmsg_seq, - IPCTNL_MSG_CT_NEW, - 1, ct) < 0) - goto out; - *id = ct->id; - - memset(&ct->counters, 0, sizeof(ct->counters)); - } - } -out: - write_unlock_bh(&ip_conntrack_lock); - - DEBUGP("leaving, last bucket=%lu id=%u\n", cb->args[0], *id); - - return skb->len; -} -#endif - static const size_t cta_min_ip[CTA_IP_MAX] = { [CTA_IP_V4_SRC-1] = sizeof(u_int32_t), [CTA_IP_V4_DST-1] = sizeof(u_int32_t), @@ -775,22 +740,14 @@ ctnetlink_get_conntrack(struct sock *ctnl, struct sk_buff *skb, if (msg->nfgen_family != AF_INET) return -EAFNOSUPPORT; - if (NFNL_MSG_TYPE(nlh->nlmsg_type) == - IPCTNL_MSG_CT_GET_CTRZERO) { -#ifdef CONFIG_IP_NF_CT_ACCT - if ((*errp = netlink_dump_start(ctnl, skb, nlh, - ctnetlink_dump_table_w, - ctnetlink_done)) != 0) - return -EINVAL; -#else +#ifndef CONFIG_IP_NF_CT_ACCT + if (NFNL_MSG_TYPE(nlh->nlmsg_type) == IPCTNL_MSG_CT_GET_CTRZERO) return -ENOTSUPP; #endif - } else { - if ((*errp = netlink_dump_start(ctnl, skb, nlh, - ctnetlink_dump_table, - ctnetlink_done)) != 0) + if ((*errp = netlink_dump_start(ctnl, skb, nlh, + ctnetlink_dump_table, + ctnetlink_done)) != 0) return -EINVAL; - } rlen = NLMSG_ALIGN(nlh->nlmsg_len); if (rlen > skb->len) diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index 8cd85cf..1721f7c 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -455,6 +455,11 @@ restart: cb->args[1] = (unsigned long)ct; goto out; } +#ifdef CONFIG_NF_CT_ACCT + if (NFNL_MSG_TYPE(cb->nlh->nlmsg_type) == + IPCTNL_MSG_CT_GET_CTRZERO) + memset(&ct->counters, 0, sizeof(ct->counters)); +#endif } if (cb->args[1]) { cb->args[1] = 0; @@ -470,50 +475,6 @@ out: return skb->len; } -#ifdef CONFIG_NF_CT_ACCT -static int -ctnetlink_dump_table_w(struct sk_buff *skb, struct netlink_callback *cb) -{ - struct nf_conn *ct = NULL; - struct nf_conntrack_tuple_hash *h; - struct list_head *i; - u_int32_t *id = (u_int32_t *) &cb->args[1]; - struct nfgenmsg *nfmsg = NLMSG_DATA(cb->nlh); - u_int8_t l3proto = nfmsg->nfgen_family; - - DEBUGP("entered %s, last bucket=%u id=%u\n", __FUNCTION__, - cb->args[0], *id); - - write_lock_bh(&nf_conntrack_lock); - for (; cb->args[0] < nf_conntrack_htable_size; cb->args[0]++, *id = 0) { - list_for_each_prev(i, &nf_conntrack_hash[cb->args[0]]) { - h = (struct nf_conntrack_tuple_hash *) i; - if (DIRECTION(h) != IP_CT_DIR_ORIGINAL) - continue; - ct = nf_ct_tuplehash_to_ctrack(h); - if (l3proto && L3PROTO(ct) != l3proto) - continue; - if (ct->id <= *id) - continue; - if (ctnetlink_fill_info(skb, NETLINK_CB(cb->skb).pid, - cb->nlh->nlmsg_seq, - IPCTNL_MSG_CT_NEW, - 1, ct) < 0) - goto out; - *id = ct->id; - - memset(&ct->counters, 0, sizeof(ct->counters)); - } - } -out: - write_unlock_bh(&nf_conntrack_lock); - - DEBUGP("leaving, last bucket=%lu id=%u\n", cb->args[0], *id); - - return skb->len; -} -#endif - static inline int ctnetlink_parse_tuple_ip(struct nfattr *attr, struct nf_conntrack_tuple *tuple) { @@ -788,22 +749,14 @@ ctnetlink_get_conntrack(struct sock *ctnl, struct sk_buff *skb, if (nlh->nlmsg_flags & NLM_F_DUMP) { u32 rlen; - if (NFNL_MSG_TYPE(nlh->nlmsg_type) == - IPCTNL_MSG_CT_GET_CTRZERO) { -#ifdef CONFIG_NF_CT_ACCT - if ((*errp = netlink_dump_start(ctnl, skb, nlh, - ctnetlink_dump_table_w, - ctnetlink_done)) != 0) - return -EINVAL; -#else +#ifndef CONFIG_NF_CT_ACCT + if (NFNL_MSG_TYPE(nlh->nlmsg_type) == IPCTNL_MSG_CT_GET_CTRZERO) return -ENOTSUPP; #endif - } else { - if ((*errp = netlink_dump_start(ctnl, skb, nlh, - ctnetlink_dump_table, - ctnetlink_done)) != 0) + if ((*errp = netlink_dump_start(ctnl, skb, nlh, + ctnetlink_dump_table, + ctnetlink_done)) != 0) return -EINVAL; - } rlen = NLMSG_ALIGN(nlh->nlmsg_len); if (rlen > skb->len) -- cgit v0.10.2 From 5251e2d2125407bbff0c39394a4011be9ed8b5d0 Mon Sep 17 00:00:00 2001 From: Pablo Neira Ayuso Date: Wed, 20 Sep 2006 12:01:06 -0700 Subject: [NETFILTER]: conntrack: fix race condition in early_drop On SMP environments the maximum number of conntracks can be overpassed under heavy stress situations due to an existing race condition. CPU A CPU B atomic_read() ... early_drop() ... ... atomic_read() allocate conntrack allocate conntrack atomic_inc() atomic_inc() This patch moves the counter incrementation before the early drop stage. Signed-off-by: Pablo Neira Ayuso Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_core.c b/net/ipv4/netfilter/ip_conntrack_core.c index 2568d48..422a662 100644 --- a/net/ipv4/netfilter/ip_conntrack_core.c +++ b/net/ipv4/netfilter/ip_conntrack_core.c @@ -622,11 +622,15 @@ struct ip_conntrack *ip_conntrack_alloc(struct ip_conntrack_tuple *orig, ip_conntrack_hash_rnd_initted = 1; } + /* We don't want any race condition at early drop stage */ + atomic_inc(&ip_conntrack_count); + if (ip_conntrack_max - && atomic_read(&ip_conntrack_count) >= ip_conntrack_max) { + && atomic_read(&ip_conntrack_count) > ip_conntrack_max) { unsigned int hash = hash_conntrack(orig); /* Try dropping from this hash chain. */ if (!early_drop(&ip_conntrack_hash[hash])) { + atomic_dec(&ip_conntrack_count); if (net_ratelimit()) printk(KERN_WARNING "ip_conntrack: table full, dropping" @@ -638,6 +642,7 @@ struct ip_conntrack *ip_conntrack_alloc(struct ip_conntrack_tuple *orig, conntrack = kmem_cache_alloc(ip_conntrack_cachep, GFP_ATOMIC); if (!conntrack) { DEBUGP("Can't allocate conntrack.\n"); + atomic_dec(&ip_conntrack_count); return ERR_PTR(-ENOMEM); } @@ -651,8 +656,6 @@ struct ip_conntrack *ip_conntrack_alloc(struct ip_conntrack_tuple *orig, conntrack->timeout.data = (unsigned long)conntrack; conntrack->timeout.function = death_by_timeout; - atomic_inc(&ip_conntrack_count); - return conntrack; } diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index 927137b..adeafa2 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -848,11 +848,15 @@ __nf_conntrack_alloc(const struct nf_conntrack_tuple *orig, nf_conntrack_hash_rnd_initted = 1; } + /* We don't want any race condition at early drop stage */ + atomic_inc(&nf_conntrack_count); + if (nf_conntrack_max - && atomic_read(&nf_conntrack_count) >= nf_conntrack_max) { + && atomic_read(&nf_conntrack_count) > nf_conntrack_max) { unsigned int hash = hash_conntrack(orig); /* Try dropping from this hash chain. */ if (!early_drop(&nf_conntrack_hash[hash])) { + atomic_dec(&nf_conntrack_count); if (net_ratelimit()) printk(KERN_WARNING "nf_conntrack: table full, dropping" @@ -903,10 +907,12 @@ __nf_conntrack_alloc(const struct nf_conntrack_tuple *orig, init_timer(&conntrack->timeout); conntrack->timeout.data = (unsigned long)conntrack; conntrack->timeout.function = death_by_timeout; + read_unlock_bh(&nf_ct_cache_lock); - atomic_inc(&nf_conntrack_count); + return conntrack; out: read_unlock_bh(&nf_ct_cache_lock); + atomic_dec(&nf_conntrack_count); return conntrack; } -- cgit v0.10.2 From ca39df6cdfbe2ea210e31117f5d469576cfe9008 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:01:34 -0700 Subject: [NETFILTER]: ipt_TTL: fix checksum update bug Fix regression introduced by the incremental checksum patches. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ipt_TTL.c b/net/ipv4/netfilter/ipt_TTL.c index 214d9d9..96e79cc 100644 --- a/net/ipv4/netfilter/ipt_TTL.c +++ b/net/ipv4/netfilter/ipt_TTL.c @@ -54,8 +54,8 @@ ipt_ttl_target(struct sk_buff **pskb, } if (new_ttl != iph->ttl) { - iph->check = nf_csum_update((iph->ttl << 8) ^ 0xFFFF, - new_ttl << 8, + iph->check = nf_csum_update(ntohs((iph->ttl << 8)) ^ 0xFFFF, + ntohs(new_ttl << 8), iph->check); iph->ttl = new_ttl; } -- cgit v0.10.2 From 7cf73936fe6bb9b027b75fd8fa3c634fe74843d3 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:02:21 -0700 Subject: [NETFILTER]: ip6t_HL: remove write-only variable Noticed by Alexey Dobriyan Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv6/netfilter/ip6t_HL.c b/net/ipv6/netfilter/ip6t_HL.c index e54ea92..435750f 100644 --- a/net/ipv6/netfilter/ip6t_HL.c +++ b/net/ipv6/netfilter/ip6t_HL.c @@ -26,7 +26,6 @@ static unsigned int ip6t_hl_target(struct sk_buff **pskb, { struct ipv6hdr *ip6h; const struct ip6t_HL_info *info = targinfo; - u_int16_t diffs[2]; int new_hl; if (!skb_make_writable(pskb, (*pskb)->len)) @@ -53,11 +52,8 @@ static unsigned int ip6t_hl_target(struct sk_buff **pskb, break; } - if (new_hl != ip6h->hop_limit) { - diffs[0] = htons(((unsigned)ip6h->hop_limit) << 8) ^ 0xFFFF; + if (new_hl != ip6h->hop_limit) ip6h->hop_limit = new_hl; - diffs[1] = htons(((unsigned)ip6h->hop_limit) << 8); - } return IP6T_CONTINUE; } -- cgit v0.10.2 From 71cd83a8bde61612b277fd5bf91503ac1ad61e23 Mon Sep 17 00:00:00 2001 From: Alexey Dobriyan Date: Wed, 20 Sep 2006 12:02:44 -0700 Subject: [NETFILTER]: xt_policy: remove dups in .family sparse "defined twice" warning Signed-off-by: Alexey Dobriyan Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/netfilter/xt_policy.c b/net/netfilter/xt_policy.c index e9d8137..46bde2b 100644 --- a/net/netfilter/xt_policy.c +++ b/net/netfilter/xt_policy.c @@ -171,7 +171,6 @@ static struct xt_match xt_policy_match[] = { .checkentry = checkentry, .match = match, .matchsize = sizeof(struct xt_policy_info), - .family = AF_INET, .me = THIS_MODULE, }, { @@ -180,7 +179,6 @@ static struct xt_match xt_policy_match[] = { .checkentry = checkentry, .match = match, .matchsize = sizeof(struct xt_policy_info), - .family = AF_INET6, .me = THIS_MODULE, }, }; -- cgit v0.10.2 From c1fe3ca5106d9568791433fa6c7f27e71ac69e1b Mon Sep 17 00:00:00 2001 From: George Hansper Date: Wed, 20 Sep 2006 12:03:23 -0700 Subject: [NETFILTER]: TCP conntrack: improve dead connection detection Don't count window updates as retransmissions. Signed-off-by: George Hansper Signed-off-by: Patrick McHardy diff --git a/include/linux/netfilter/nf_conntrack_tcp.h b/include/linux/netfilter/nf_conntrack_tcp.h index b2feeff..6b01ba2 100644 --- a/include/linux/netfilter/nf_conntrack_tcp.h +++ b/include/linux/netfilter/nf_conntrack_tcp.h @@ -49,6 +49,7 @@ struct ip_ct_tcp u_int32_t last_seq; /* Last sequence number seen in dir */ u_int32_t last_ack; /* Last sequence number seen in opposite dir */ u_int32_t last_end; /* Last seq + len */ + u_int16_t last_win; /* Last window advertisement seen in dir */ }; #endif /* __KERNEL__ */ diff --git a/net/ipv4/netfilter/ip_conntrack_proto_tcp.c b/net/ipv4/netfilter/ip_conntrack_proto_tcp.c index 75a7237..03ae9a0 100644 --- a/net/ipv4/netfilter/ip_conntrack_proto_tcp.c +++ b/net/ipv4/netfilter/ip_conntrack_proto_tcp.c @@ -731,13 +731,15 @@ static int tcp_in_window(struct ip_ct_tcp *state, if (state->last_dir == dir && state->last_seq == seq && state->last_ack == ack - && state->last_end == end) + && state->last_end == end + && state->last_win == win) state->retrans++; else { state->last_dir = dir; state->last_seq = seq; state->last_ack = ack; state->last_end = end; + state->last_win = win; state->retrans = 0; } } diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c index 9fc0ee6..238bbb5 100644 --- a/net/netfilter/nf_conntrack_proto_tcp.c +++ b/net/netfilter/nf_conntrack_proto_tcp.c @@ -688,13 +688,15 @@ static int tcp_in_window(struct ip_ct_tcp *state, if (state->last_dir == dir && state->last_seq == seq && state->last_ack == ack - && state->last_end == end) + && state->last_end == end + && state->last_win == win) state->retrans++; else { state->last_dir = dir; state->last_seq = seq; state->last_ack = ack; state->last_end = end; + state->last_win = win; state->retrans = 0; } } -- cgit v0.10.2 From 1192e403e9ea2dc23bbbe2b4fe9bdbc47e8c6056 Mon Sep 17 00:00:00 2001 From: Brian Haley Date: Wed, 20 Sep 2006 12:03:46 -0700 Subject: [NETFILTER]: make some netfilter globals __read_mostly Signed-off-by: Brian Haley Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_core.c b/net/ipv4/netfilter/ip_conntrack_core.c index 422a662..2b6f24f 100644 --- a/net/ipv4/netfilter/ip_conntrack_core.c +++ b/net/ipv4/netfilter/ip_conntrack_core.c @@ -63,17 +63,17 @@ atomic_t ip_conntrack_count = ATOMIC_INIT(0); void (*ip_conntrack_destroyed)(struct ip_conntrack *conntrack) = NULL; LIST_HEAD(ip_conntrack_expect_list); -struct ip_conntrack_protocol *ip_ct_protos[MAX_IP_CT_PROTO]; +struct ip_conntrack_protocol *ip_ct_protos[MAX_IP_CT_PROTO] __read_mostly; static LIST_HEAD(helpers); unsigned int ip_conntrack_htable_size __read_mostly = 0; int ip_conntrack_max __read_mostly; -struct list_head *ip_conntrack_hash; +struct list_head *ip_conntrack_hash __read_mostly; static kmem_cache_t *ip_conntrack_cachep __read_mostly; static kmem_cache_t *ip_conntrack_expect_cachep __read_mostly; struct ip_conntrack ip_conntrack_untracked; unsigned int ip_ct_log_invalid __read_mostly; static LIST_HEAD(unconfirmed); -static int ip_conntrack_vmalloc; +static int ip_conntrack_vmalloc __read_mostly; static unsigned int ip_conntrack_next_id; static unsigned int ip_conntrack_expect_next_id; diff --git a/net/ipv4/netfilter/ip_queue.c b/net/ipv4/netfilter/ip_queue.c index 80060cb..7edad79 100644 --- a/net/ipv4/netfilter/ip_queue.c +++ b/net/ipv4/netfilter/ip_queue.c @@ -52,15 +52,15 @@ struct ipq_queue_entry { typedef int (*ipq_cmpfn)(struct ipq_queue_entry *, unsigned long); -static unsigned char copy_mode = IPQ_COPY_NONE; +static unsigned char copy_mode __read_mostly = IPQ_COPY_NONE; static unsigned int queue_maxlen __read_mostly = IPQ_QMAX_DEFAULT; static DEFINE_RWLOCK(queue_lock); -static int peer_pid; -static unsigned int copy_range; +static int peer_pid __read_mostly; +static unsigned int copy_range __read_mostly; static unsigned int queue_total; static unsigned int queue_dropped = 0; static unsigned int queue_user_dropped = 0; -static struct sock *ipqnl; +static struct sock *ipqnl __read_mostly; static LIST_HEAD(queue_list); static DEFINE_MUTEX(ipqnl_mutex); diff --git a/net/ipv6/netfilter/ip6_queue.c b/net/ipv6/netfilter/ip6_queue.c index d322e83..9510c24 100644 --- a/net/ipv6/netfilter/ip6_queue.c +++ b/net/ipv6/netfilter/ip6_queue.c @@ -56,15 +56,15 @@ struct ipq_queue_entry { typedef int (*ipq_cmpfn)(struct ipq_queue_entry *, unsigned long); -static unsigned char copy_mode = IPQ_COPY_NONE; +static unsigned char copy_mode __read_mostly = IPQ_COPY_NONE; static unsigned int queue_maxlen __read_mostly = IPQ_QMAX_DEFAULT; static DEFINE_RWLOCK(queue_lock); -static int peer_pid; -static unsigned int copy_range; +static int peer_pid __read_mostly; +static unsigned int copy_range __read_mostly; static unsigned int queue_total; static unsigned int queue_dropped = 0; static unsigned int queue_user_dropped = 0; -static struct sock *ipqnl; +static struct sock *ipqnl __read_mostly; static LIST_HEAD(queue_list); static DEFINE_MUTEX(ipqnl_mutex); diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index adeafa2..093b3dd 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -73,17 +73,17 @@ atomic_t nf_conntrack_count = ATOMIC_INIT(0); void (*nf_conntrack_destroyed)(struct nf_conn *conntrack) = NULL; LIST_HEAD(nf_conntrack_expect_list); -struct nf_conntrack_protocol **nf_ct_protos[PF_MAX]; -struct nf_conntrack_l3proto *nf_ct_l3protos[PF_MAX]; +struct nf_conntrack_protocol **nf_ct_protos[PF_MAX] __read_mostly; +struct nf_conntrack_l3proto *nf_ct_l3protos[PF_MAX] __read_mostly; static LIST_HEAD(helpers); unsigned int nf_conntrack_htable_size __read_mostly = 0; int nf_conntrack_max __read_mostly; -struct list_head *nf_conntrack_hash; -static kmem_cache_t *nf_conntrack_expect_cachep; +struct list_head *nf_conntrack_hash __read_mostly; +static kmem_cache_t *nf_conntrack_expect_cachep __read_mostly; struct nf_conn nf_conntrack_untracked; unsigned int nf_ct_log_invalid __read_mostly; static LIST_HEAD(unconfirmed); -static int nf_conntrack_vmalloc; +static int nf_conntrack_vmalloc __read_mostly; static unsigned int nf_conntrack_next_id; static unsigned int nf_conntrack_expect_next_id; -- cgit v0.10.2 From bec71b162747708d4b45b0cd399b484f52f2901a Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:04:08 -0700 Subject: [NETFILTER]: ip_tables: fix module refcount leaks in compat error paths Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index 38e1e4f..3d5d4a4 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -1529,7 +1529,7 @@ check_compat_entry_size_and_hooks(struct ipt_entry *e, ret = IPT_MATCH_ITERATE(e, compat_check_calc_match, name, &e->ip, e->comefrom, &off, &j); if (ret != 0) - goto out; + goto cleanup_matches; t = ipt_get_target(e); target = try_then_request_module(xt_find_target(AF_INET, @@ -1539,7 +1539,7 @@ check_compat_entry_size_and_hooks(struct ipt_entry *e, if (IS_ERR(target) || !target) { duprintf("check_entry: `%s' not found\n", t->u.user.name); ret = target ? PTR_ERR(target) : -ENOENT; - goto out; + goto cleanup_matches; } t->u.kernel.target = target; @@ -1566,14 +1566,17 @@ check_compat_entry_size_and_hooks(struct ipt_entry *e, (*i)++; return 0; + out: + module_put(t->u.kernel.target->me); +cleanup_matches: IPT_MATCH_ITERATE(e, cleanup_match, &j); return ret; } static inline int compat_copy_match_from_user(struct ipt_entry_match *m, void **dstptr, compat_uint_t *size, const char *name, - const struct ipt_ip *ip, unsigned int hookmask) + const struct ipt_ip *ip, unsigned int hookmask, int *i) { struct ipt_entry_match *dm; struct ipt_match *match; @@ -1590,16 +1593,22 @@ static inline int compat_copy_match_from_user(struct ipt_entry_match *m, name, hookmask, ip->proto, ip->invflags & IPT_INV_PROTO); if (ret) - return ret; + goto err; if (m->u.kernel.match->checkentry && !m->u.kernel.match->checkentry(name, ip, match, dm->data, hookmask)) { duprintf("ip_tables: check failed for `%s'.\n", m->u.kernel.match->name); - return -EINVAL; + ret = -EINVAL; + goto err; } + (*i)++; return 0; + +err: + module_put(m->u.kernel.match->me); + return ret; } static int compat_copy_entry_from_user(struct ipt_entry *e, void **dstptr, @@ -1610,18 +1619,19 @@ static int compat_copy_entry_from_user(struct ipt_entry *e, void **dstptr, struct ipt_target *target; struct ipt_entry *de; unsigned int origsize; - int ret, h; + int ret, h, j; ret = 0; origsize = *size; de = (struct ipt_entry *)*dstptr; memcpy(de, e, sizeof(struct ipt_entry)); + j = 0; *dstptr += sizeof(struct compat_ipt_entry); ret = IPT_MATCH_ITERATE(e, compat_copy_match_from_user, dstptr, size, - name, &de->ip, de->comefrom); + name, &de->ip, de->comefrom, &j); if (ret) - goto out; + goto cleanup_matches; de->target_offset = e->target_offset - (origsize - *size); t = ipt_get_target(e); target = t->u.kernel.target; @@ -1644,21 +1654,26 @@ static int compat_copy_entry_from_user(struct ipt_entry *e, void **dstptr, name, e->comefrom, e->ip.proto, e->ip.invflags & IPT_INV_PROTO); if (ret) - goto out; + goto err; ret = -EINVAL; if (t->u.kernel.target == &ipt_standard_target) { if (!standard_check(t, *size)) - goto out; + goto err; } else if (t->u.kernel.target->checkentry && !t->u.kernel.target->checkentry(name, de, target, t->data, de->comefrom)) { duprintf("ip_tables: compat: check failed for `%s'.\n", t->u.kernel.target->name); - goto out; + goto err; } ret = 0; -out: + return ret; + +err: + module_put(t->u.kernel.target->me); +cleanup_matches: + IPT_MATCH_ITERATE(e, cleanup_match, &j); return ret; } -- cgit v0.10.2 From 79030ed07de673e8451a03aecb9ada9f4d75d491 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:05:08 -0700 Subject: [NETFILTER]: ip_tables: revision support for compat code Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index 3d5d4a4..673581d 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -1994,6 +1994,8 @@ compat_get_entries(struct compat_ipt_get_entries __user *uptr, int *len) return ret; } +static int do_ipt_get_ctl(struct sock *, int, void __user *, int *); + static int compat_do_ipt_get_ctl(struct sock *sk, int cmd, void __user *user, int *len) { @@ -2007,8 +2009,7 @@ compat_do_ipt_get_ctl(struct sock *sk, int cmd, void __user *user, int *len) ret = compat_get_entries(user, len); break; default: - duprintf("compat_do_ipt_get_ctl: unknown request %i\n", cmd); - ret = -EINVAL; + ret = do_ipt_get_ctl(sk, cmd, user, len); } return ret; } -- cgit v0.10.2 From 9fa492cdc160cd27ce1046cb36f47d3b2b1efa21 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:05:37 -0700 Subject: [NETFILTER]: x_tables: simplify compat API Split the xt_compat_match/xt_compat_target into smaller type-safe functions performing just one operation. Handle all alignment and size-related conversions centrally in these function instead of requiring each module to implement a full-blown conversion function. Replace ->compat callback by ->compat_from_user and ->compat_to_user callbacks, responsible for converting just a single private structure. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter/x_tables.h b/include/linux/netfilter/x_tables.h index c832295..739a98e 100644 --- a/include/linux/netfilter/x_tables.h +++ b/include/linux/netfilter/x_tables.h @@ -138,12 +138,6 @@ struct xt_counters_info #include -#ifdef CONFIG_COMPAT -#define COMPAT_TO_USER 1 -#define COMPAT_FROM_USER -1 -#define COMPAT_CALC_SIZE 0 -#endif - struct xt_match { struct list_head list; @@ -176,7 +170,8 @@ struct xt_match void (*destroy)(const struct xt_match *match, void *matchinfo); /* Called when userspace align differs from kernel space one */ - int (*compat)(void *match, void **dstptr, int *size, int convert); + void (*compat_from_user)(void *dst, void *src); + int (*compat_to_user)(void __user *dst, void *src); /* Set this to THIS_MODULE if you are a module, otherwise NULL */ struct module *me; @@ -186,6 +181,7 @@ struct xt_match char *table; unsigned int matchsize; + unsigned int compatsize; unsigned int hooks; unsigned short proto; @@ -224,13 +220,15 @@ struct xt_target void (*destroy)(const struct xt_target *target, void *targinfo); /* Called when userspace align differs from kernel space one */ - int (*compat)(void *target, void **dstptr, int *size, int convert); + void (*compat_from_user)(void *dst, void *src); + int (*compat_to_user)(void __user *dst, void *src); /* Set this to THIS_MODULE if you are a module, otherwise NULL */ struct module *me; char *table; unsigned int targetsize; + unsigned int compatsize; unsigned int hooks; unsigned short proto; @@ -387,9 +385,18 @@ struct compat_xt_counters_info extern void xt_compat_lock(int af); extern void xt_compat_unlock(int af); -extern int xt_compat_match(void *match, void **dstptr, int *size, int convert); -extern int xt_compat_target(void *target, void **dstptr, int *size, - int convert); + +extern int xt_compat_match_offset(struct xt_match *match); +extern void xt_compat_match_from_user(struct xt_entry_match *m, + void **dstptr, int *size); +extern int xt_compat_match_to_user(struct xt_entry_match *m, + void * __user *dstptr, int *size); + +extern int xt_compat_target_offset(struct xt_target *target); +extern void xt_compat_target_from_user(struct xt_entry_target *t, + void **dstptr, int *size); +extern int xt_compat_target_to_user(struct xt_entry_target *t, + void * __user *dstptr, int *size); #endif /* CONFIG_COMPAT */ #endif /* __KERNEL__ */ diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c index 673581d..800067d 100644 --- a/net/ipv4/netfilter/ip_tables.c +++ b/net/ipv4/netfilter/ip_tables.c @@ -942,73 +942,28 @@ static short compat_calc_jump(u_int16_t offset) return delta; } -struct compat_ipt_standard_target +static void compat_standard_from_user(void *dst, void *src) { - struct compat_xt_entry_target target; - compat_int_t verdict; -}; - -struct compat_ipt_standard -{ - struct compat_ipt_entry entry; - struct compat_ipt_standard_target target; -}; + int v = *(compat_int_t *)src; -#define IPT_ST_LEN XT_ALIGN(sizeof(struct ipt_standard_target)) -#define IPT_ST_COMPAT_LEN COMPAT_XT_ALIGN(sizeof(struct compat_ipt_standard_target)) -#define IPT_ST_OFFSET (IPT_ST_LEN - IPT_ST_COMPAT_LEN) + if (v > 0) + v += compat_calc_jump(v); + memcpy(dst, &v, sizeof(v)); +} -static int compat_ipt_standard_fn(void *target, - void **dstptr, int *size, int convert) +static int compat_standard_to_user(void __user *dst, void *src) { - struct compat_ipt_standard_target compat_st, *pcompat_st; - struct ipt_standard_target st, *pst; - int ret; + compat_int_t cv = *(int *)src; - ret = 0; - switch (convert) { - case COMPAT_TO_USER: - pst = target; - memcpy(&compat_st.target, &pst->target, - sizeof(compat_st.target)); - compat_st.verdict = pst->verdict; - if (compat_st.verdict > 0) - compat_st.verdict -= - compat_calc_jump(compat_st.verdict); - compat_st.target.u.user.target_size = IPT_ST_COMPAT_LEN; - if (copy_to_user(*dstptr, &compat_st, IPT_ST_COMPAT_LEN)) - ret = -EFAULT; - *size -= IPT_ST_OFFSET; - *dstptr += IPT_ST_COMPAT_LEN; - break; - case COMPAT_FROM_USER: - pcompat_st = target; - memcpy(&st.target, &pcompat_st->target, IPT_ST_COMPAT_LEN); - st.verdict = pcompat_st->verdict; - if (st.verdict > 0) - st.verdict += compat_calc_jump(st.verdict); - st.target.u.user.target_size = IPT_ST_LEN; - memcpy(*dstptr, &st, IPT_ST_LEN); - *size += IPT_ST_OFFSET; - *dstptr += IPT_ST_LEN; - break; - case COMPAT_CALC_SIZE: - *size += IPT_ST_OFFSET; - break; - default: - ret = -ENOPROTOOPT; - break; - } - return ret; + if (cv > 0) + cv -= compat_calc_jump(cv); + return copy_to_user(dst, &cv, sizeof(cv)) ? -EFAULT : 0; } static inline int compat_calc_match(struct ipt_entry_match *m, int * size) { - if (m->u.kernel.match->compat) - m->u.kernel.match->compat(m, NULL, size, COMPAT_CALC_SIZE); - else - xt_compat_match(m, NULL, size, COMPAT_CALC_SIZE); + *size += xt_compat_match_offset(m->u.kernel.match); return 0; } @@ -1023,10 +978,7 @@ static int compat_calc_entry(struct ipt_entry *e, struct xt_table_info *info, entry_offset = (void *)e - base; IPT_MATCH_ITERATE(e, compat_calc_match, &off); t = ipt_get_target(e); - if (t->u.kernel.target->compat) - t->u.kernel.target->compat(t, NULL, &off, COMPAT_CALC_SIZE); - else - xt_compat_target(t, NULL, &off, COMPAT_CALC_SIZE); + off += xt_compat_target_offset(t->u.kernel.target); newinfo->size -= off; ret = compat_add_offset(entry_offset, off); if (ret) @@ -1412,17 +1364,13 @@ struct compat_ipt_replace { }; static inline int compat_copy_match_to_user(struct ipt_entry_match *m, - void __user **dstptr, compat_uint_t *size) + void * __user *dstptr, compat_uint_t *size) { - if (m->u.kernel.match->compat) - return m->u.kernel.match->compat(m, dstptr, size, - COMPAT_TO_USER); - else - return xt_compat_match(m, dstptr, size, COMPAT_TO_USER); + return xt_compat_match_to_user(m, dstptr, size); } static int compat_copy_entry_to_user(struct ipt_entry *e, - void __user **dstptr, compat_uint_t *size) + void * __user *dstptr, compat_uint_t *size) { struct ipt_entry_target __user *t; struct compat_ipt_entry __user *ce; @@ -1442,11 +1390,7 @@ static int compat_copy_entry_to_user(struct ipt_entry *e, if (ret) goto out; t = ipt_get_target(e); - if (t->u.kernel.target->compat) - ret = t->u.kernel.target->compat(t, dstptr, size, - COMPAT_TO_USER); - else - ret = xt_compat_target(t, dstptr, size, COMPAT_TO_USER); + ret = xt_compat_target_to_user(t, dstptr, size); if (ret) goto out; ret = -EFAULT; @@ -1478,11 +1422,7 @@ compat_check_calc_match(struct ipt_entry_match *m, return match ? PTR_ERR(match) : -ENOENT; } m->u.kernel.match = match; - - if (m->u.kernel.match->compat) - m->u.kernel.match->compat(m, NULL, size, COMPAT_CALC_SIZE); - else - xt_compat_match(m, NULL, size, COMPAT_CALC_SIZE); + *size += xt_compat_match_offset(match); (*i)++; return 0; @@ -1543,10 +1483,7 @@ check_compat_entry_size_and_hooks(struct ipt_entry *e, } t->u.kernel.target = target; - if (t->u.kernel.target->compat) - t->u.kernel.target->compat(t, NULL, &off, COMPAT_CALC_SIZE); - else - xt_compat_target(t, NULL, &off, COMPAT_CALC_SIZE); + off += xt_compat_target_offset(target); *size += off; ret = compat_add_offset(entry_offset, off); if (ret) @@ -1584,10 +1521,7 @@ static inline int compat_copy_match_from_user(struct ipt_entry_match *m, dm = (struct ipt_entry_match *)*dstptr; match = m->u.kernel.match; - if (match->compat) - match->compat(m, dstptr, size, COMPAT_FROM_USER); - else - xt_compat_match(m, dstptr, size, COMPAT_FROM_USER); + xt_compat_match_from_user(m, dstptr, size); ret = xt_check_match(match, AF_INET, dm->u.match_size - sizeof(*dm), name, hookmask, ip->proto, @@ -1635,10 +1569,7 @@ static int compat_copy_entry_from_user(struct ipt_entry *e, void **dstptr, de->target_offset = e->target_offset - (origsize - *size); t = ipt_get_target(e); target = t->u.kernel.target; - if (target->compat) - target->compat(t, dstptr, size, COMPAT_FROM_USER); - else - xt_compat_target(t, dstptr, size, COMPAT_FROM_USER); + xt_compat_target_from_user(t, dstptr, size); de->next_offset = e->next_offset - (origsize - *size); for (h = 0; h < NF_IP_NUMHOOKS; h++) { @@ -2205,7 +2136,9 @@ static struct ipt_target ipt_standard_target = { .targetsize = sizeof(int), .family = AF_INET, #ifdef CONFIG_COMPAT - .compat = &compat_ipt_standard_fn, + .compatsize = sizeof(compat_int_t), + .compat_from_user = compat_standard_from_user, + .compat_to_user = compat_standard_to_user, #endif }; diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c index be7baf4..58522fc 100644 --- a/net/netfilter/x_tables.c +++ b/net/netfilter/x_tables.c @@ -333,52 +333,65 @@ int xt_check_match(const struct xt_match *match, unsigned short family, EXPORT_SYMBOL_GPL(xt_check_match); #ifdef CONFIG_COMPAT -int xt_compat_match(void *match, void **dstptr, int *size, int convert) +int xt_compat_match_offset(struct xt_match *match) { - struct xt_match *m; - struct compat_xt_entry_match *pcompat_m; - struct xt_entry_match *pm; - u_int16_t msize; - int off, ret; + u_int16_t csize = match->compatsize ? : match->matchsize; + return XT_ALIGN(match->matchsize) - COMPAT_XT_ALIGN(csize); +} +EXPORT_SYMBOL_GPL(xt_compat_match_offset); - ret = 0; - m = ((struct xt_entry_match *)match)->u.kernel.match; - off = XT_ALIGN(m->matchsize) - COMPAT_XT_ALIGN(m->matchsize); - switch (convert) { - case COMPAT_TO_USER: - pm = (struct xt_entry_match *)match; - msize = pm->u.user.match_size; - if (copy_to_user(*dstptr, pm, msize)) { - ret = -EFAULT; - break; - } - msize -= off; - if (put_user(msize, (u_int16_t *)*dstptr)) - ret = -EFAULT; - *size -= off; - *dstptr += msize; - break; - case COMPAT_FROM_USER: - pcompat_m = (struct compat_xt_entry_match *)match; - pm = (struct xt_entry_match *)*dstptr; - msize = pcompat_m->u.user.match_size; - memcpy(pm, pcompat_m, msize); - msize += off; - pm->u.user.match_size = msize; - *size += off; - *dstptr += msize; - break; - case COMPAT_CALC_SIZE: - *size += off; - break; - default: - ret = -ENOPROTOOPT; - break; +void xt_compat_match_from_user(struct xt_entry_match *m, void **dstptr, + int *size) +{ + struct xt_match *match = m->u.kernel.match; + struct compat_xt_entry_match *cm = (struct compat_xt_entry_match *)m; + int pad, off = xt_compat_match_offset(match); + u_int16_t msize = cm->u.user.match_size; + + m = *dstptr; + memcpy(m, cm, sizeof(*cm)); + if (match->compat_from_user) + match->compat_from_user(m->data, cm->data); + else + memcpy(m->data, cm->data, msize - sizeof(*cm)); + pad = XT_ALIGN(match->matchsize) - match->matchsize; + if (pad > 0) + memset(m->data + match->matchsize, 0, pad); + + msize += off; + m->u.user.match_size = msize; + + *size += off; + *dstptr += msize; +} +EXPORT_SYMBOL_GPL(xt_compat_match_from_user); + +int xt_compat_match_to_user(struct xt_entry_match *m, void __user **dstptr, + int *size) +{ + struct xt_match *match = m->u.kernel.match; + struct compat_xt_entry_match __user *cm = *dstptr; + int off = xt_compat_match_offset(match); + u_int16_t msize = m->u.user.match_size - off; + + if (copy_to_user(cm, m, sizeof(*cm)) || + put_user(msize, &cm->u.user.match_size)) + return -EFAULT; + + if (match->compat_to_user) { + if (match->compat_to_user((void __user *)cm->data, m->data)) + return -EFAULT; + } else { + if (copy_to_user(cm->data, m->data, msize - sizeof(*cm))) + return -EFAULT; } - return ret; + + *size -= off; + *dstptr += msize; + return 0; } -EXPORT_SYMBOL_GPL(xt_compat_match); -#endif +EXPORT_SYMBOL_GPL(xt_compat_match_to_user); +#endif /* CONFIG_COMPAT */ int xt_check_target(const struct xt_target *target, unsigned short family, unsigned int size, const char *table, unsigned int hook_mask, @@ -410,51 +423,64 @@ int xt_check_target(const struct xt_target *target, unsigned short family, EXPORT_SYMBOL_GPL(xt_check_target); #ifdef CONFIG_COMPAT -int xt_compat_target(void *target, void **dstptr, int *size, int convert) +int xt_compat_target_offset(struct xt_target *target) { - struct xt_target *t; - struct compat_xt_entry_target *pcompat; - struct xt_entry_target *pt; - u_int16_t tsize; - int off, ret; + u_int16_t csize = target->compatsize ? : target->targetsize; + return XT_ALIGN(target->targetsize) - COMPAT_XT_ALIGN(csize); +} +EXPORT_SYMBOL_GPL(xt_compat_target_offset); - ret = 0; - t = ((struct xt_entry_target *)target)->u.kernel.target; - off = XT_ALIGN(t->targetsize) - COMPAT_XT_ALIGN(t->targetsize); - switch (convert) { - case COMPAT_TO_USER: - pt = (struct xt_entry_target *)target; - tsize = pt->u.user.target_size; - if (copy_to_user(*dstptr, pt, tsize)) { - ret = -EFAULT; - break; - } - tsize -= off; - if (put_user(tsize, (u_int16_t *)*dstptr)) - ret = -EFAULT; - *size -= off; - *dstptr += tsize; - break; - case COMPAT_FROM_USER: - pcompat = (struct compat_xt_entry_target *)target; - pt = (struct xt_entry_target *)*dstptr; - tsize = pcompat->u.user.target_size; - memcpy(pt, pcompat, tsize); - tsize += off; - pt->u.user.target_size = tsize; - *size += off; - *dstptr += tsize; - break; - case COMPAT_CALC_SIZE: - *size += off; - break; - default: - ret = -ENOPROTOOPT; - break; +void xt_compat_target_from_user(struct xt_entry_target *t, void **dstptr, + int *size) +{ + struct xt_target *target = t->u.kernel.target; + struct compat_xt_entry_target *ct = (struct compat_xt_entry_target *)t; + int pad, off = xt_compat_target_offset(target); + u_int16_t tsize = ct->u.user.target_size; + + t = *dstptr; + memcpy(t, ct, sizeof(*ct)); + if (target->compat_from_user) + target->compat_from_user(t->data, ct->data); + else + memcpy(t->data, ct->data, tsize - sizeof(*ct)); + pad = XT_ALIGN(target->targetsize) - target->targetsize; + if (pad > 0) + memset(t->data + target->targetsize, 0, pad); + + tsize += off; + t->u.user.target_size = tsize; + + *size += off; + *dstptr += tsize; +} +EXPORT_SYMBOL_GPL(xt_compat_target_from_user); + +int xt_compat_target_to_user(struct xt_entry_target *t, void __user **dstptr, + int *size) +{ + struct xt_target *target = t->u.kernel.target; + struct compat_xt_entry_target __user *ct = *dstptr; + int off = xt_compat_target_offset(target); + u_int16_t tsize = t->u.user.target_size - off; + + if (copy_to_user(ct, t, sizeof(*ct)) || + put_user(tsize, &ct->u.user.target_size)) + return -EFAULT; + + if (target->compat_to_user) { + if (target->compat_to_user((void __user *)ct->data, t->data)) + return -EFAULT; + } else { + if (copy_to_user(ct->data, t->data, tsize - sizeof(*ct))) + return -EFAULT; } - return ret; + + *size -= off; + *dstptr += tsize; + return 0; } -EXPORT_SYMBOL_GPL(xt_compat_target); +EXPORT_SYMBOL_GPL(xt_compat_target_to_user); #endif struct xt_table_info *xt_alloc_table_info(unsigned int size) -- cgit v0.10.2 From bc80b656657251fc936d2d93fc70d5566c1c7d29 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:05:54 -0700 Subject: [NETFILTER]: xt_mark: add compat conversion functions Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/netfilter/xt_mark.c b/net/netfilter/xt_mark.c index e8059cd..934dddf 100644 --- a/net/netfilter/xt_mark.c +++ b/net/netfilter/xt_mark.c @@ -50,6 +50,37 @@ checkentry(const char *tablename, return 1; } +#ifdef CONFIG_COMPAT +struct compat_xt_mark_info { + compat_ulong_t mark, mask; + u_int8_t invert; + u_int8_t __pad1; + u_int16_t __pad2; +}; + +static void compat_from_user(void *dst, void *src) +{ + struct compat_xt_mark_info *cm = src; + struct xt_mark_info m = { + .mark = cm->mark, + .mask = cm->mask, + .invert = cm->invert, + }; + memcpy(dst, &m, sizeof(m)); +} + +static int compat_to_user(void __user *dst, void *src) +{ + struct xt_mark_info *m = src; + struct compat_xt_mark_info cm = { + .mark = m->mark, + .mask = m->mask, + .invert = m->invert, + }; + return copy_to_user(dst, &cm, sizeof(cm)) ? -EFAULT : 0; +} +#endif /* CONFIG_COMPAT */ + static struct xt_match xt_mark_match[] = { { .name = "mark", @@ -57,6 +88,11 @@ static struct xt_match xt_mark_match[] = { .checkentry = checkentry, .match = match, .matchsize = sizeof(struct xt_mark_info), +#ifdef CONFIG_COMPAT + .compatsize = sizeof(struct compat_xt_mark_info), + .compat_from_user = compat_from_user, + .compat_to_user = compat_to_user, +#endif .me = THIS_MODULE, }, { -- cgit v0.10.2 From be7263b7b72ed9d5d25958f2b71e77e889e4845a Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:06:10 -0700 Subject: [NETFILTER]: xt_MARK: add compat conversion functions Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/netfilter/xt_MARK.c b/net/netfilter/xt_MARK.c index 782f8d8..c6e860a 100644 --- a/net/netfilter/xt_MARK.c +++ b/net/netfilter/xt_MARK.c @@ -108,6 +108,35 @@ checkentry_v1(const char *tablename, return 1; } +#ifdef CONFIG_COMPAT +struct compat_xt_mark_target_info_v1 { + compat_ulong_t mark; + u_int8_t mode; + u_int8_t __pad1; + u_int16_t __pad2; +}; + +static void compat_from_user_v1(void *dst, void *src) +{ + struct compat_xt_mark_target_info_v1 *cm = src; + struct xt_mark_target_info_v1 m = { + .mark = cm->mark, + .mode = cm->mode, + }; + memcpy(dst, &m, sizeof(m)); +} + +static int compat_to_user_v1(void __user *dst, void *src) +{ + struct xt_mark_target_info_v1 *m = src; + struct compat_xt_mark_target_info_v1 cm = { + .mark = m->mark, + .mode = m->mode, + }; + return copy_to_user(dst, &cm, sizeof(cm)) ? -EFAULT : 0; +} +#endif /* CONFIG_COMPAT */ + static struct xt_target xt_mark_target[] = { { .name = "MARK", @@ -126,6 +155,11 @@ static struct xt_target xt_mark_target[] = { .checkentry = checkentry_v1, .target = target_v1, .targetsize = sizeof(struct xt_mark_target_info_v1), +#ifdef CONFIG_COMPAT + .compatsize = sizeof(struct compat_xt_mark_target_info_v1), + .compat_from_user = compat_from_user_v1, + .compat_to_user = compat_to_user_v1, +#endif .table = "mangle", .me = THIS_MODULE, }, -- cgit v0.10.2 From f1eda05386ade8dad4e8e9b48ecbd9432b6739d9 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:06:25 -0700 Subject: [NETFILTER]: xt_connmark: add compat conversion functions Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/netfilter/xt_connmark.c b/net/netfilter/xt_connmark.c index c9104d0..92a5726 100644 --- a/net/netfilter/xt_connmark.c +++ b/net/netfilter/xt_connmark.c @@ -81,6 +81,37 @@ destroy(const struct xt_match *match, void *matchinfo) #endif } +#ifdef CONFIG_COMPAT +struct compat_xt_connmark_info { + compat_ulong_t mark, mask; + u_int8_t invert; + u_int8_t __pad1; + u_int16_t __pad2; +}; + +static void compat_from_user(void *dst, void *src) +{ + struct compat_xt_connmark_info *cm = src; + struct xt_connmark_info m = { + .mark = cm->mark, + .mask = cm->mask, + .invert = cm->invert, + }; + memcpy(dst, &m, sizeof(m)); +} + +static int compat_to_user(void __user *dst, void *src) +{ + struct xt_connmark_info *m = src; + struct compat_xt_connmark_info cm = { + .mark = m->mark, + .mask = m->mask, + .invert = m->invert, + }; + return copy_to_user(dst, &cm, sizeof(cm)) ? -EFAULT : 0; +} +#endif /* CONFIG_COMPAT */ + static struct xt_match xt_connmark_match[] = { { .name = "connmark", @@ -89,6 +120,11 @@ static struct xt_match xt_connmark_match[] = { .match = match, .destroy = destroy, .matchsize = sizeof(struct xt_connmark_info), +#ifdef CONFIG_COMPAT + .compatsize = sizeof(struct compat_xt_connmark_info), + .compat_from_user = compat_from_user, + .compat_to_user = compat_to_user, +#endif .me = THIS_MODULE }, { -- cgit v0.10.2 From 7ce975b9da93b46dbf6ba70a1b4751bec211d079 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:06:40 -0700 Subject: [NETFILTER]: xt_CONNMARK: add compat conversion functions Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/netfilter/xt_CONNMARK.c b/net/netfilter/xt_CONNMARK.c index 6ccb45e..c01524f 100644 --- a/net/netfilter/xt_CONNMARK.c +++ b/net/netfilter/xt_CONNMARK.c @@ -108,6 +108,37 @@ checkentry(const char *tablename, return 1; } +#ifdef CONFIG_COMPAT +struct compat_xt_connmark_target_info { + compat_ulong_t mark, mask; + u_int8_t mode; + u_int8_t __pad1; + u_int16_t __pad2; +}; + +static void compat_from_user(void *dst, void *src) +{ + struct compat_xt_connmark_target_info *cm = src; + struct xt_connmark_target_info m = { + .mark = cm->mark, + .mask = cm->mask, + .mode = cm->mode, + }; + memcpy(dst, &m, sizeof(m)); +} + +static int compat_to_user(void __user *dst, void *src) +{ + struct xt_connmark_target_info *m = src; + struct compat_xt_connmark_target_info cm = { + .mark = m->mark, + .mask = m->mask, + .mode = m->mode, + }; + return copy_to_user(dst, &cm, sizeof(cm)) ? -EFAULT : 0; +} +#endif /* CONFIG_COMPAT */ + static struct xt_target xt_connmark_target[] = { { .name = "CONNMARK", @@ -115,6 +146,11 @@ static struct xt_target xt_connmark_target[] = { .checkentry = checkentry, .target = target, .targetsize = sizeof(struct xt_connmark_target_info), +#ifdef CONFIG_COMPAT + .compatsize = sizeof(struct compat_xt_connmark_target_info), + .compat_from_user = compat_from_user, + .compat_to_user = compat_to_user, +#endif .me = THIS_MODULE }, { -- cgit v0.10.2 From 02c63cf777c331121bfb6e9c1440a9835ad2f2a8 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:07:06 -0700 Subject: [NETFILTER]: xt_limit: add compat conversion functions Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/netfilter/xt_limit.c b/net/netfilter/xt_limit.c index 8bfcbdf..fda7b7d 100644 --- a/net/netfilter/xt_limit.c +++ b/net/netfilter/xt_limit.c @@ -135,6 +135,50 @@ ipt_limit_checkentry(const char *tablename, return 1; } +#ifdef CONFIG_COMPAT +struct compat_xt_rateinfo { + u_int32_t avg; + u_int32_t burst; + + compat_ulong_t prev; + u_int32_t credit; + u_int32_t credit_cap, cost; + + u_int32_t master; +}; + +/* To keep the full "prev" timestamp, the upper 32 bits are stored in the + * master pointer, which does not need to be preserved. */ +static void compat_from_user(void *dst, void *src) +{ + struct compat_xt_rateinfo *cm = src; + struct xt_rateinfo m = { + .avg = cm->avg, + .burst = cm->burst, + .prev = cm->prev | (unsigned long)cm->master << 32, + .credit = cm->credit, + .credit_cap = cm->credit_cap, + .cost = cm->cost, + }; + memcpy(dst, &m, sizeof(m)); +} + +static int compat_to_user(void __user *dst, void *src) +{ + struct xt_rateinfo *m = src; + struct compat_xt_rateinfo cm = { + .avg = m->avg, + .burst = m->burst, + .prev = m->prev, + .credit = m->credit, + .credit_cap = m->credit_cap, + .cost = m->cost, + .master = m->prev >> 32, + }; + return copy_to_user(dst, &cm, sizeof(cm)) ? -EFAULT : 0; +} +#endif /* CONFIG_COMPAT */ + static struct xt_match xt_limit_match[] = { { .name = "limit", @@ -142,6 +186,11 @@ static struct xt_match xt_limit_match[] = { .checkentry = ipt_limit_checkentry, .match = ipt_limit_match, .matchsize = sizeof(struct xt_rateinfo), +#ifdef CONFIG_COMPAT + .compatsize = sizeof(struct compat_xt_rateinfo), + .compat_from_user = compat_from_user, + .compat_to_user = compat_to_user, +#endif .me = THIS_MODULE, }, { -- cgit v0.10.2 From 127f15dd659b20e722561ff8c86dc058e1a72323 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:07:23 -0700 Subject: [NETFILTER]: ipt_hashlimit: add compat conversion functions Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ipt_hashlimit.c b/net/ipv4/netfilter/ipt_hashlimit.c index b5b74b0..4f73a61 100644 --- a/net/ipv4/netfilter/ipt_hashlimit.c +++ b/net/ipv4/netfilter/ipt_hashlimit.c @@ -535,10 +535,39 @@ hashlimit_destroy(const struct xt_match *match, void *matchinfo) htable_put(r->hinfo); } +#ifdef CONFIG_COMPAT +struct compat_ipt_hashlimit_info { + char name[IFNAMSIZ]; + struct hashlimit_cfg cfg; + compat_uptr_t hinfo; + compat_uptr_t master; +}; + +static void compat_from_user(void *dst, void *src) +{ + int off = offsetof(struct compat_ipt_hashlimit_info, hinfo); + + memcpy(dst, src, off); + memset(dst + off, 0, sizeof(struct compat_ipt_hashlimit_info) - off); +} + +static int compat_to_user(void __user *dst, void *src) +{ + int off = offsetof(struct compat_ipt_hashlimit_info, hinfo); + + return copy_to_user(dst, src, off) ? -EFAULT : 0; +} +#endif + static struct ipt_match ipt_hashlimit = { .name = "hashlimit", .match = hashlimit_match, .matchsize = sizeof(struct ipt_hashlimit_info), +#ifdef CONFIG_COMPAT + .compatsize = sizeof(struct compat_ipt_hashlimit_info), + .compat_from_user = compat_from_user, + .compat_to_user = compat_to_user, +#endif .checkentry = hashlimit_checkentry, .destroy = hashlimit_destroy, .me = THIS_MODULE -- cgit v0.10.2 From edd5a329cf69c112882e03c8ab55e985062a5d2a Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:07:39 -0700 Subject: [NETFILTER]: PPTP conntrack: fix whitespace errors Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter_ipv4/ip_conntrack_pptp.h b/include/linux/netfilter_ipv4/ip_conntrack_pptp.h index 816144c..88f66d3 100644 --- a/include/linux/netfilter_ipv4/ip_conntrack_pptp.h +++ b/include/linux/netfilter_ipv4/ip_conntrack_pptp.h @@ -285,19 +285,19 @@ struct PptpSetLinkInfo { }; union pptp_ctrl_union { - struct PptpStartSessionRequest sreq; - struct PptpStartSessionReply srep; - struct PptpStopSessionRequest streq; - struct PptpStopSessionReply strep; - struct PptpOutCallRequest ocreq; - struct PptpOutCallReply ocack; - struct PptpInCallRequest icreq; - struct PptpInCallReply icack; - struct PptpInCallConnected iccon; - struct PptpClearCallRequest clrreq; - struct PptpCallDisconnectNotify disc; - struct PptpWanErrorNotify wanerr; - struct PptpSetLinkInfo setlink; + struct PptpStartSessionRequest sreq; + struct PptpStartSessionReply srep; + struct PptpStopSessionRequest streq; + struct PptpStopSessionReply strep; + struct PptpOutCallRequest ocreq; + struct PptpOutCallReply ocack; + struct PptpInCallRequest icreq; + struct PptpInCallReply icack; + struct PptpInCallConnected iccon; + struct PptpClearCallRequest clrreq; + struct PptpCallDisconnectNotify disc; + struct PptpWanErrorNotify wanerr; + struct PptpSetLinkInfo setlink; }; extern int diff --git a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c index b020a33..6c94dd5 100644 --- a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c +++ b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c @@ -20,11 +20,11 @@ * - We can only support one single call within each session * * TODO: - * - testing of incoming PPTP calls + * - testing of incoming PPTP calls * - * Changes: + * Changes: * 2002-02-05 - Version 1.3 - * - Call ip_conntrack_unexpect_related() from + * - Call ip_conntrack_unexpect_related() from * pptp_destroy_siblings() to destroy expectations in case * CALL_DISCONNECT_NOTIFY or tcp fin packet was seen * (Philip Craig ) @@ -141,7 +141,7 @@ static void pptp_expectfn(struct ip_conntrack *ct, invert_tuplepr(&inv_t, &exp->tuple); DEBUGP("trying to unexpect other dir: "); DUMP_TUPLE(&inv_t); - + exp_other = ip_conntrack_expect_find(&inv_t); if (exp_other) { /* delete other expectation. */ @@ -194,7 +194,7 @@ static void pptp_destroy_siblings(struct ip_conntrack *ct) { struct ip_conntrack_tuple t; - /* Since ct->sibling_list has literally rusted away in 2.6.11, + /* Since ct->sibling_list has literally rusted away in 2.6.11, * we now need another way to find out about our sibling * contrack and expects... -HW */ @@ -264,7 +264,7 @@ exp_gre(struct ip_conntrack *master, exp_orig->mask.dst.u.gre.key = htons(0xffff); exp_orig->mask.dst.ip = 0xffffffff; exp_orig->mask.dst.protonum = 0xff; - + exp_orig->master = master; exp_orig->expectfn = pptp_expectfn; exp_orig->flags = 0; @@ -322,7 +322,7 @@ out_unexpect_orig: goto out_put_both; } -static inline int +static inline int pptp_inbound_pkt(struct sk_buff **pskb, struct tcphdr *tcph, unsigned int nexthdr_off, @@ -336,7 +336,7 @@ pptp_inbound_pkt(struct sk_buff **pskb, struct ip_ct_pptp_master *info = &ct->help.ct_pptp_info; u_int16_t msg; __be16 *cid, *pcid; - u_int32_t seq; + u_int32_t seq; ctlh = skb_header_pointer(*pskb, nexthdr_off, sizeof(_ctlh), &_ctlh); if (!ctlh) { @@ -373,7 +373,7 @@ pptp_inbound_pkt(struct sk_buff **pskb, } if (pptpReq->srep.resultCode == PPTP_START_OK) info->sstate = PPTP_SESSION_CONFIRMED; - else + else info->sstate = PPTP_SESSION_ERROR; break; @@ -420,22 +420,22 @@ pptp_inbound_pkt(struct sk_buff **pskb, pcid = &pptpReq->ocack.peersCallID; info->pac_call_id = ntohs(*cid); - + if (htons(info->pns_call_id) != *pcid) { DEBUGP("%s for unknown callid %u\n", pptp_msg_name[msg], ntohs(*pcid)); break; } - DEBUGP("%s, CID=%X, PCID=%X\n", pptp_msg_name[msg], + DEBUGP("%s, CID=%X, PCID=%X\n", pptp_msg_name[msg], ntohs(*cid), ntohs(*pcid)); - + info->cstate = PPTP_CALL_OUT_CONF; seq = ntohl(tcph->seq) + sizeof(struct pptp_pkt_hdr) + sizeof(struct PptpControlHeader) + ((void *)pcid - (void *)pptpReq); - + if (exp_gre(ct, seq, *cid, *pcid) != 0) printk("ip_conntrack_pptp: error during exp_gre\n"); break; @@ -479,7 +479,7 @@ pptp_inbound_pkt(struct sk_buff **pskb, cid = &info->pac_call_id; if (info->pns_call_id != ntohs(*pcid)) { - DEBUGP("%s for unknown CallID %u\n", + DEBUGP("%s for unknown CallID %u\n", pptp_msg_name[msg], ntohs(*pcid)); break; } @@ -491,7 +491,7 @@ pptp_inbound_pkt(struct sk_buff **pskb, seq = ntohl(tcph->seq) + sizeof(struct pptp_pkt_hdr) + sizeof(struct PptpControlHeader) + ((void *)pcid - (void *)pptpReq); - + if (exp_gre(ct, seq, *cid, *pcid) != 0) printk("ip_conntrack_pptp: error during exp_gre\n"); @@ -554,7 +554,7 @@ pptp_outbound_pkt(struct sk_buff **pskb, return NF_ACCEPT; nexthdr_off += sizeof(_ctlh); datalen -= sizeof(_ctlh); - + reqlen = datalen; if (reqlen > sizeof(*pptpReq)) reqlen = sizeof(*pptpReq); @@ -606,7 +606,7 @@ pptp_outbound_pkt(struct sk_buff **pskb, /* client answers incoming call */ if (info->cstate != PPTP_CALL_IN_REQ && info->cstate != PPTP_CALL_IN_REP) { - DEBUGP("%s without incall_req\n", + DEBUGP("%s without incall_req\n", pptp_msg_name[msg]); break; } @@ -616,7 +616,7 @@ pptp_outbound_pkt(struct sk_buff **pskb, } pcid = &pptpReq->icack.peersCallID; if (info->pac_call_id != ntohs(*pcid)) { - DEBUGP("%s for unknown call %u\n", + DEBUGP("%s for unknown call %u\n", pptp_msg_name[msg], ntohs(*pcid)); break; } @@ -644,12 +644,12 @@ pptp_outbound_pkt(struct sk_buff **pskb, /* I don't have to explain these ;) */ break; default: - DEBUGP("invalid %s (TY=%d)\n", (msg <= PPTP_MSG_MAX)? + DEBUGP("invalid %s (TY=%d)\n", (msg <= PPTP_MSG_MAX)? pptp_msg_name[msg]:pptp_msg_name[0], msg); /* unknown: no need to create GRE masq table entry */ break; } - + if (ip_nat_pptp_hook_outbound) return ip_nat_pptp_hook_outbound(pskb, ct, ctinfo, ctlh, pptpReq); @@ -659,7 +659,7 @@ pptp_outbound_pkt(struct sk_buff **pskb, /* track caller id inside control connection, call expect_related */ -static int +static int conntrack_pptp_help(struct sk_buff **pskb, struct ip_conntrack *ct, enum ip_conntrack_info ctinfo) @@ -676,12 +676,12 @@ conntrack_pptp_help(struct sk_buff **pskb, int ret; /* don't do any tracking before tcp handshake complete */ - if (ctinfo != IP_CT_ESTABLISHED + if (ctinfo != IP_CT_ESTABLISHED && ctinfo != IP_CT_ESTABLISHED+IP_CT_IS_REPLY) { DEBUGP("ctinfo = %u, skipping\n", ctinfo); return NF_ACCEPT; } - + nexthdr_off = (*pskb)->nh.iph->ihl*4; tcph = skb_header_pointer(*pskb, nexthdr_off, sizeof(_tcph), &_tcph); BUG_ON(!tcph); @@ -735,28 +735,28 @@ conntrack_pptp_help(struct sk_buff **pskb, } /* control protocol helper */ -static struct ip_conntrack_helper pptp = { +static struct ip_conntrack_helper pptp = { .list = { NULL, NULL }, - .name = "pptp", + .name = "pptp", .me = THIS_MODULE, .max_expected = 2, .timeout = 5 * 60, - .tuple = { .src = { .ip = 0, - .u = { .tcp = { .port = - __constant_htons(PPTP_CONTROL_PORT) } } - }, - .dst = { .ip = 0, + .tuple = { .src = { .ip = 0, + .u = { .tcp = { .port = + __constant_htons(PPTP_CONTROL_PORT) } } + }, + .dst = { .ip = 0, .u = { .all = 0 }, .protonum = IPPROTO_TCP - } + } }, - .mask = { .src = { .ip = 0, - .u = { .tcp = { .port = __constant_htons(0xffff) } } - }, - .dst = { .ip = 0, + .mask = { .src = { .ip = 0, + .u = { .tcp = { .port = __constant_htons(0xffff) } } + }, + .dst = { .ip = 0, .u = { .all = 0 }, - .protonum = 0xff - } + .protonum = 0xff + } }, .help = conntrack_pptp_help }; @@ -768,7 +768,7 @@ extern int __init ip_ct_proto_gre_init(void); static int __init ip_conntrack_helper_pptp_init(void) { int retcode; - + retcode = ip_ct_proto_gre_init(); if (retcode < 0) return retcode; diff --git a/net/ipv4/netfilter/ip_conntrack_proto_gre.c b/net/ipv4/netfilter/ip_conntrack_proto_gre.c index 92c6d8b..5fe026f 100644 --- a/net/ipv4/netfilter/ip_conntrack_proto_gre.c +++ b/net/ipv4/netfilter/ip_conntrack_proto_gre.c @@ -1,15 +1,15 @@ /* - * ip_conntrack_proto_gre.c - Version 3.0 + * ip_conntrack_proto_gre.c - Version 3.0 * * Connection tracking protocol helper module for GRE. * * GRE is a generic encapsulation protocol, which is generally not very * suited for NAT, as it has no protocol-specific part as port numbers. * - * It has an optional key field, which may help us distinguishing two + * It has an optional key field, which may help us distinguishing two * connections between the same two hosts. * - * GRE is defined in RFC 1701 and RFC 1702, as well as RFC 2784 + * GRE is defined in RFC 1701 and RFC 1702, as well as RFC 2784 * * PPTP is built on top of a modified version of GRE, and has a mandatory * field called "CallID", which serves us for the same purpose as the key @@ -61,7 +61,7 @@ MODULE_DESCRIPTION("netfilter connection tracking protocol helper for GRE"); #define DEBUGP(x, args...) #define DUMP_TUPLE_GRE(x) #endif - + /* GRE KEYMAP HANDLING FUNCTIONS */ static LIST_HEAD(gre_keymap_list); @@ -88,7 +88,7 @@ static __be16 gre_keymap_lookup(struct ip_conntrack_tuple *t) } } read_unlock_bh(&ip_ct_gre_lock); - + DEBUGP("lookup src key 0x%x up key for ", key); DUMP_TUPLE_GRE(t); @@ -107,7 +107,7 @@ ip_ct_gre_keymap_add(struct ip_conntrack *ct, return -1; } - if (!reply) + if (!reply) exist_km = &ct->help.ct_pptp_info.keymap_orig; else exist_km = &ct->help.ct_pptp_info.keymap_reply; @@ -118,7 +118,7 @@ ip_ct_gre_keymap_add(struct ip_conntrack *ct, if (gre_key_cmpfn(km, t) && km == *exist_km) return 0; } - DEBUGP("trying to override keymap_%s for ct %p\n", + DEBUGP("trying to override keymap_%s for ct %p\n", reply? "reply":"orig", ct); return -EEXIST; } @@ -152,7 +152,7 @@ void ip_ct_gre_keymap_destroy(struct ip_conntrack *ct) write_lock_bh(&ip_ct_gre_lock); if (ct->help.ct_pptp_info.keymap_orig) { - DEBUGP("removing %p from list\n", + DEBUGP("removing %p from list\n", ct->help.ct_pptp_info.keymap_orig); list_del(&ct->help.ct_pptp_info.keymap_orig->list); kfree(ct->help.ct_pptp_info.keymap_orig); @@ -220,7 +220,7 @@ static int gre_pkt_to_tuple(const struct sk_buff *skb, static int gre_print_tuple(struct seq_file *s, const struct ip_conntrack_tuple *tuple) { - return seq_printf(s, "srckey=0x%x dstkey=0x%x ", + return seq_printf(s, "srckey=0x%x dstkey=0x%x ", ntohs(tuple->src.u.gre.key), ntohs(tuple->dst.u.gre.key)); } @@ -250,14 +250,14 @@ static int gre_packet(struct ip_conntrack *ct, } else ip_ct_refresh_acct(ct, conntrackinfo, skb, ct->proto.gre.timeout); - + return NF_ACCEPT; } /* Called when a new connection for this protocol found. */ static int gre_new(struct ip_conntrack *ct, const struct sk_buff *skb) -{ +{ DEBUGP(": "); DUMP_TUPLE_GRE(&ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple); @@ -283,9 +283,9 @@ static void gre_destroy(struct ip_conntrack *ct) } /* protocol helper struct */ -static struct ip_conntrack_protocol gre = { +static struct ip_conntrack_protocol gre = { .proto = IPPROTO_GRE, - .name = "gre", + .name = "gre", .pkt_to_tuple = gre_pkt_to_tuple, .invert_tuple = gre_invert_tuple, .print_tuple = gre_print_tuple, @@ -323,7 +323,7 @@ void ip_ct_proto_gre_fini(void) } write_unlock_bh(&ip_ct_gre_lock); - ip_conntrack_protocol_unregister(&gre); + ip_conntrack_protocol_unregister(&gre); } EXPORT_SYMBOL(ip_ct_gre_keymap_add); diff --git a/net/ipv4/netfilter/ip_nat_helper_pptp.c b/net/ipv4/netfilter/ip_nat_helper_pptp.c index 1d14996..5dde1da 100644 --- a/net/ipv4/netfilter/ip_nat_helper_pptp.c +++ b/net/ipv4/netfilter/ip_nat_helper_pptp.c @@ -32,7 +32,7 @@ * 2005-06-10 - Version 3.0 * - kernel >= 2.6.11 version, * funded by Oxcoda NetBox Blue (http://www.netboxblue.com/) - * + * */ #include @@ -93,10 +93,10 @@ static void pptp_nat_expected(struct ip_conntrack *ct, DEBUGP("we are PAC->PNS\n"); /* build tuple for PNS->PAC */ t.src.ip = master->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.ip; - t.src.u.gre.key = + t.src.u.gre.key = htons(master->nat.help.nat_pptp_info.pns_call_id); t.dst.ip = master->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.ip; - t.dst.u.gre.key = + t.dst.u.gre.key = htons(master->nat.help.nat_pptp_info.pac_call_id); t.dst.protonum = IPPROTO_GRE; } @@ -153,47 +153,47 @@ pptp_outbound_pkt(struct sk_buff **pskb, unsigned int cid_off; new_callid = htons(ct_pptp_info->pns_call_id); - + switch (msg = ntohs(ctlh->messageType)) { - case PPTP_OUT_CALL_REQUEST: - cid_off = offsetof(union pptp_ctrl_union, ocreq.callID); - /* FIXME: ideally we would want to reserve a call ID - * here. current netfilter NAT core is not able to do - * this :( For now we use TCP source port. This breaks - * multiple calls within one control session */ - - /* save original call ID in nat_info */ - nat_pptp_info->pns_call_id = ct_pptp_info->pns_call_id; - - /* don't use tcph->source since we are at a DSTmanip - * hook (e.g. PREROUTING) and pkt is not mangled yet */ - new_callid = ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u.tcp.port; - - /* save new call ID in ct info */ - ct_pptp_info->pns_call_id = ntohs(new_callid); - break; - case PPTP_IN_CALL_REPLY: - cid_off = offsetof(union pptp_ctrl_union, icreq.callID); - break; - case PPTP_CALL_CLEAR_REQUEST: - cid_off = offsetof(union pptp_ctrl_union, clrreq.callID); - break; - default: - DEBUGP("unknown outbound packet 0x%04x:%s\n", msg, - (msg <= PPTP_MSG_MAX)? - pptp_msg_name[msg]:pptp_msg_name[0]); - /* fall through */ - - case PPTP_SET_LINK_INFO: - /* only need to NAT in case PAC is behind NAT box */ - case PPTP_START_SESSION_REQUEST: - case PPTP_START_SESSION_REPLY: - case PPTP_STOP_SESSION_REQUEST: - case PPTP_STOP_SESSION_REPLY: - case PPTP_ECHO_REQUEST: - case PPTP_ECHO_REPLY: - /* no need to alter packet */ - return NF_ACCEPT; + case PPTP_OUT_CALL_REQUEST: + cid_off = offsetof(union pptp_ctrl_union, ocreq.callID); + /* FIXME: ideally we would want to reserve a call ID + * here. current netfilter NAT core is not able to do + * this :( For now we use TCP source port. This breaks + * multiple calls within one control session */ + + /* save original call ID in nat_info */ + nat_pptp_info->pns_call_id = ct_pptp_info->pns_call_id; + + /* don't use tcph->source since we are at a DSTmanip + * hook (e.g. PREROUTING) and pkt is not mangled yet */ + new_callid = ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u.tcp.port; + + /* save new call ID in ct info */ + ct_pptp_info->pns_call_id = ntohs(new_callid); + break; + case PPTP_IN_CALL_REPLY: + cid_off = offsetof(union pptp_ctrl_union, icreq.callID); + break; + case PPTP_CALL_CLEAR_REQUEST: + cid_off = offsetof(union pptp_ctrl_union, clrreq.callID); + break; + default: + DEBUGP("unknown outbound packet 0x%04x:%s\n", msg, + (msg <= PPTP_MSG_MAX)? + pptp_msg_name[msg]:pptp_msg_name[0]); + /* fall through */ + + case PPTP_SET_LINK_INFO: + /* only need to NAT in case PAC is behind NAT box */ + case PPTP_START_SESSION_REQUEST: + case PPTP_START_SESSION_REPLY: + case PPTP_STOP_SESSION_REQUEST: + case PPTP_STOP_SESSION_REPLY: + case PPTP_ECHO_REQUEST: + case PPTP_ECHO_REPLY: + /* no need to alter packet */ + return NF_ACCEPT; } /* only OUT_CALL_REQUEST, IN_CALL_REPLY, CALL_CLEAR_REQUEST pass @@ -216,9 +216,9 @@ static int pptp_exp_gre(struct ip_conntrack_expect *expect_orig, struct ip_conntrack_expect *expect_reply) { - struct ip_ct_pptp_master *ct_pptp_info = + struct ip_ct_pptp_master *ct_pptp_info = &expect_orig->master->help.ct_pptp_info; - struct ip_nat_pptp *nat_pptp_info = + struct ip_nat_pptp *nat_pptp_info = &expect_orig->master->nat.help.nat_pptp_info; struct ip_conntrack *ct = expect_orig->master; @@ -324,7 +324,7 @@ pptp_inbound_pkt(struct sk_buff **pskb, break; default: - DEBUGP("unknown inbound packet %s\n", (msg <= PPTP_MSG_MAX)? + DEBUGP("unknown inbound packet %s\n", (msg <= PPTP_MSG_MAX)? pptp_msg_name[msg]:pptp_msg_name[0]); /* fall through */ diff --git a/net/ipv4/netfilter/ip_nat_proto_gre.c b/net/ipv4/netfilter/ip_nat_proto_gre.c index 70a6537..a522669 100644 --- a/net/ipv4/netfilter/ip_nat_proto_gre.c +++ b/net/ipv4/netfilter/ip_nat_proto_gre.c @@ -6,10 +6,10 @@ * GRE is a generic encapsulation protocol, which is generally not very * suited for NAT, as it has no protocol-specific part as port numbers. * - * It has an optional key field, which may help us distinguishing two + * It has an optional key field, which may help us distinguishing two * connections between the same two hosts. * - * GRE is defined in RFC 1701 and RFC 1702, as well as RFC 2784 + * GRE is defined in RFC 1701 and RFC 1702, as well as RFC 2784 * * PPTP is built on top of a modified version of GRE, and has a mandatory * field called "CallID", which serves us for the same purpose as the key @@ -60,7 +60,7 @@ gre_in_range(const struct ip_conntrack_tuple *tuple, } /* generate unique tuple ... */ -static int +static int gre_unique_tuple(struct ip_conntrack_tuple *tuple, const struct ip_nat_range *range, enum ip_nat_manip_type maniptype, @@ -84,7 +84,7 @@ gre_unique_tuple(struct ip_conntrack_tuple *tuple, range_size = ntohs(range->max.gre.key) - min + 1; } - DEBUGP("min = %u, range_size = %u\n", min, range_size); + DEBUGP("min = %u, range_size = %u\n", min, range_size); for (i = 0; i < range_size; i++, key++) { *keyptr = htons(min + key % range_size); @@ -117,7 +117,7 @@ gre_manip_pkt(struct sk_buff **pskb, greh = (void *)(*pskb)->data + hdroff; pgreh = (struct gre_hdr_pptp *) greh; - /* we only have destination manip of a packet, since 'source key' + /* we only have destination manip of a packet, since 'source key' * is not present in the packet itself */ if (maniptype == IP_NAT_MANIP_DST) { /* key manipulation is always dest */ @@ -129,7 +129,7 @@ gre_manip_pkt(struct sk_buff **pskb, } if (greh->csum) { /* FIXME: Never tested this code... */ - *(gre_csum(greh)) = + *(gre_csum(greh)) = nf_proto_csum_update(*pskb, ~*(gre_key(greh)), tuple->dst.u.gre.key, @@ -138,7 +138,7 @@ gre_manip_pkt(struct sk_buff **pskb, *(gre_key(greh)) = tuple->dst.u.gre.key; break; case GRE_VERSION_PPTP: - DEBUGP("call_id -> 0x%04x\n", + DEBUGP("call_id -> 0x%04x\n", ntohs(tuple->dst.u.gre.key)); pgreh->call_id = tuple->dst.u.gre.key; break; @@ -152,8 +152,8 @@ gre_manip_pkt(struct sk_buff **pskb, } /* nat helper struct */ -static struct ip_nat_protocol gre = { - .name = "GRE", +static struct ip_nat_protocol gre = { + .name = "GRE", .protonum = IPPROTO_GRE, .manip_pkt = gre_manip_pkt, .in_range = gre_in_range, @@ -164,7 +164,7 @@ static struct ip_nat_protocol gre = { .nfattr_to_range = ip_nat_port_nfattr_to_range, #endif }; - + int __init ip_nat_proto_gre_init(void) { return ip_nat_protocol_register(&gre); -- cgit v0.10.2 From 955b944293dd4c931ec866ebe19a6b2463b8f9a0 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:08:03 -0700 Subject: [NETFILTER]: PPTP conntrack: get rid of unnecessary byte order conversions The conntrack structure contains the call ID in host byte order for no reason, get rid of back and forth conversions. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter_ipv4/ip_conntrack_pptp.h b/include/linux/netfilter_ipv4/ip_conntrack_pptp.h index 88f66d3..0d35623 100644 --- a/include/linux/netfilter_ipv4/ip_conntrack_pptp.h +++ b/include/linux/netfilter_ipv4/ip_conntrack_pptp.h @@ -31,8 +31,8 @@ struct ip_ct_pptp_master { /* everything below is going to be per-expectation in newnat, * since there could be more than one call within one session */ enum pptp_ctrlcall_state cstate; /* call state */ - u_int16_t pac_call_id; /* call id of PAC, host byte order */ - u_int16_t pns_call_id; /* call id of PNS, host byte order */ + __be16 pac_call_id; /* call id of PAC, host byte order */ + __be16 pns_call_id; /* call id of PNS, host byte order */ /* in pre-2.6.11 this used to be per-expect. Now it is per-conntrack * and therefore imposes a fixed limit on the number of maps */ @@ -42,8 +42,8 @@ struct ip_ct_pptp_master { /* conntrack_expect private member */ struct ip_ct_pptp_expect { enum pptp_ctrlcall_state cstate; /* call state */ - u_int16_t pac_call_id; /* call id of PAC */ - u_int16_t pns_call_id; /* call id of PNS */ + __be16 pac_call_id; /* call id of PAC */ + __be16 pns_call_id; /* call id of PNS */ }; diff --git a/include/linux/netfilter_ipv4/ip_conntrack_proto_gre.h b/include/linux/netfilter_ipv4/ip_conntrack_proto_gre.h index 8d090ef..1d853aa 100644 --- a/include/linux/netfilter_ipv4/ip_conntrack_proto_gre.h +++ b/include/linux/netfilter_ipv4/ip_conntrack_proto_gre.h @@ -49,18 +49,18 @@ struct gre_hdr { #else #error "Adjust your defines" #endif - __u16 protocol; + __be16 protocol; }; /* modified GRE header for PPTP */ struct gre_hdr_pptp { - __u8 flags; /* bitfield */ - __u8 version; /* should be GRE_VERSION_PPTP */ - __u16 protocol; /* should be GRE_PROTOCOL_PPTP */ - __u16 payload_len; /* size of ppp payload, not inc. gre header */ - __u16 call_id; /* peer's call_id for this session */ - __u32 seq; /* sequence number. Present if S==1 */ - __u32 ack; /* seq number of highest packet recieved by */ + __u8 flags; /* bitfield */ + __u8 version; /* should be GRE_VERSION_PPTP */ + __be16 protocol; /* should be GRE_PROTOCOL_PPTP */ + __be16 payload_len; /* size of ppp payload, not inc. gre header */ + __be16 call_id; /* peer's call_id for this session */ + __be32 seq; /* sequence number. Present if S==1 */ + __be32 ack; /* seq number of highest packet recieved by */ /* sender in this session */ }; @@ -92,13 +92,13 @@ void ip_ct_gre_keymap_destroy(struct ip_conntrack *ct); /* get pointer to gre key, if present */ -static inline u_int32_t *gre_key(struct gre_hdr *greh) +static inline __be32 *gre_key(struct gre_hdr *greh) { if (!greh->key) return NULL; if (greh->csum || greh->routing) - return (u_int32_t *) (greh+sizeof(*greh)+4); - return (u_int32_t *) (greh+sizeof(*greh)); + return (__be32 *) (greh+sizeof(*greh)+4); + return (__be32 *) (greh+sizeof(*greh)); } /* get pointer ot gre csum, if present */ diff --git a/include/linux/netfilter_ipv4/ip_nat_pptp.h b/include/linux/netfilter_ipv4/ip_nat_pptp.h index eaf66c2..36668bf 100644 --- a/include/linux/netfilter_ipv4/ip_nat_pptp.h +++ b/include/linux/netfilter_ipv4/ip_nat_pptp.h @@ -4,8 +4,8 @@ /* conntrack private data */ struct ip_nat_pptp { - u_int16_t pns_call_id; /* NAT'ed PNS call id */ - u_int16_t pac_call_id; /* NAT'ed PAC call id */ + __be16 pns_call_id; /* NAT'ed PNS call id */ + __be16 pac_call_id; /* NAT'ed PAC call id */ }; #endif /* _NAT_PPTP_H */ diff --git a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c index 6c94dd5..57637ca 100644 --- a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c +++ b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c @@ -201,8 +201,8 @@ static void pptp_destroy_siblings(struct ip_conntrack *ct) /* try original (pns->pac) tuple */ memcpy(&t, &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple, sizeof(t)); t.dst.protonum = IPPROTO_GRE; - t.src.u.gre.key = htons(ct->help.ct_pptp_info.pns_call_id); - t.dst.u.gre.key = htons(ct->help.ct_pptp_info.pac_call_id); + t.src.u.gre.key = ct->help.ct_pptp_info.pns_call_id; + t.dst.u.gre.key = ct->help.ct_pptp_info.pac_call_id; if (!destroy_sibling_or_exp(&t)) DEBUGP("failed to timeout original pns->pac ct/exp\n"); @@ -210,8 +210,8 @@ static void pptp_destroy_siblings(struct ip_conntrack *ct) /* try reply (pac->pns) tuple */ memcpy(&t, &ct->tuplehash[IP_CT_DIR_REPLY].tuple, sizeof(t)); t.dst.protonum = IPPROTO_GRE; - t.src.u.gre.key = htons(ct->help.ct_pptp_info.pac_call_id); - t.dst.u.gre.key = htons(ct->help.ct_pptp_info.pns_call_id); + t.src.u.gre.key = ct->help.ct_pptp_info.pac_call_id; + t.dst.u.gre.key = ct->help.ct_pptp_info.pns_call_id; if (!destroy_sibling_or_exp(&t)) DEBUGP("failed to timeout reply pac->pns ct/exp\n"); @@ -419,9 +419,9 @@ pptp_inbound_pkt(struct sk_buff **pskb, cid = &pptpReq->ocack.callID; pcid = &pptpReq->ocack.peersCallID; - info->pac_call_id = ntohs(*cid); + info->pac_call_id = *cid; - if (htons(info->pns_call_id) != *pcid) { + if (info->pns_call_id != *pcid) { DEBUGP("%s for unknown callid %u\n", pptp_msg_name[msg], ntohs(*pcid)); break; @@ -454,7 +454,7 @@ pptp_inbound_pkt(struct sk_buff **pskb, pcid = &pptpReq->icack.peersCallID; DEBUGP("%s, PCID=%X\n", pptp_msg_name[msg], ntohs(*pcid)); info->cstate = PPTP_CALL_IN_REQ; - info->pac_call_id = ntohs(*pcid); + info->pac_call_id = *pcid; break; case PPTP_IN_CALL_CONNECT: @@ -478,7 +478,7 @@ pptp_inbound_pkt(struct sk_buff **pskb, pcid = &pptpReq->iccon.peersCallID; cid = &info->pac_call_id; - if (info->pns_call_id != ntohs(*pcid)) { + if (info->pns_call_id != *pcid) { DEBUGP("%s for unknown CallID %u\n", pptp_msg_name[msg], ntohs(*pcid)); break; @@ -595,7 +595,7 @@ pptp_outbound_pkt(struct sk_buff **pskb, /* track PNS call id */ cid = &pptpReq->ocreq.callID; DEBUGP("%s, CID=%X\n", pptp_msg_name[msg], ntohs(*cid)); - info->pns_call_id = ntohs(*cid); + info->pns_call_id = *cid; break; case PPTP_IN_CALL_REPLY: if (reqlen < sizeof(_pptpReq.icack)) { @@ -615,7 +615,7 @@ pptp_outbound_pkt(struct sk_buff **pskb, break; } pcid = &pptpReq->icack.peersCallID; - if (info->pac_call_id != ntohs(*pcid)) { + if (info->pac_call_id != *pcid) { DEBUGP("%s for unknown call %u\n", pptp_msg_name[msg], ntohs(*pcid)); break; @@ -623,7 +623,7 @@ pptp_outbound_pkt(struct sk_buff **pskb, DEBUGP("%s, CID=%X\n", pptp_msg_name[msg], ntohs(*pcid)); /* part two of the three-way handshake */ info->cstate = PPTP_CALL_IN_REP; - info->pns_call_id = ntohs(pptpReq->icack.callID); + info->pns_call_id = pptpReq->icack.callID; break; case PPTP_CALL_CLEAR_REQUEST: diff --git a/net/ipv4/netfilter/ip_nat_helper_pptp.c b/net/ipv4/netfilter/ip_nat_helper_pptp.c index 5dde1da..6e8bd6b 100644 --- a/net/ipv4/netfilter/ip_nat_helper_pptp.c +++ b/net/ipv4/netfilter/ip_nat_helper_pptp.c @@ -85,19 +85,17 @@ static void pptp_nat_expected(struct ip_conntrack *ct, DEBUGP("we are PNS->PAC\n"); /* therefore, build tuple for PAC->PNS */ t.src.ip = master->tuplehash[IP_CT_DIR_REPLY].tuple.src.ip; - t.src.u.gre.key = htons(master->help.ct_pptp_info.pac_call_id); + t.src.u.gre.key = master->help.ct_pptp_info.pac_call_id; t.dst.ip = master->tuplehash[IP_CT_DIR_REPLY].tuple.dst.ip; - t.dst.u.gre.key = htons(master->help.ct_pptp_info.pns_call_id); + t.dst.u.gre.key = master->help.ct_pptp_info.pns_call_id; t.dst.protonum = IPPROTO_GRE; } else { DEBUGP("we are PAC->PNS\n"); /* build tuple for PNS->PAC */ t.src.ip = master->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.ip; - t.src.u.gre.key = - htons(master->nat.help.nat_pptp_info.pns_call_id); + t.src.u.gre.key = master->nat.help.nat_pptp_info.pns_call_id; t.dst.ip = master->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.ip; - t.dst.u.gre.key = - htons(master->nat.help.nat_pptp_info.pac_call_id); + t.dst.u.gre.key = master->nat.help.nat_pptp_info.pac_call_id; t.dst.protonum = IPPROTO_GRE; } @@ -149,10 +147,11 @@ pptp_outbound_pkt(struct sk_buff **pskb, { struct ip_ct_pptp_master *ct_pptp_info = &ct->help.ct_pptp_info; struct ip_nat_pptp *nat_pptp_info = &ct->nat.help.nat_pptp_info; - u_int16_t msg, new_callid; + u_int16_t msg; + __be16 new_callid; unsigned int cid_off; - new_callid = htons(ct_pptp_info->pns_call_id); + new_callid = ct_pptp_info->pns_call_id; switch (msg = ntohs(ctlh->messageType)) { case PPTP_OUT_CALL_REQUEST: @@ -170,7 +169,7 @@ pptp_outbound_pkt(struct sk_buff **pskb, new_callid = ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.u.tcp.port; /* save new call ID in ct info */ - ct_pptp_info->pns_call_id = ntohs(new_callid); + ct_pptp_info->pns_call_id = new_callid; break; case PPTP_IN_CALL_REPLY: cid_off = offsetof(union pptp_ctrl_union, icreq.callID); @@ -235,14 +234,14 @@ pptp_exp_gre(struct ip_conntrack_expect *expect_orig, /* alter expectation for PNS->PAC direction */ invert_tuplepr(&inv_t, &expect_orig->tuple); - expect_orig->saved_proto.gre.key = htons(ct_pptp_info->pns_call_id); - expect_orig->tuple.src.u.gre.key = htons(nat_pptp_info->pns_call_id); - expect_orig->tuple.dst.u.gre.key = htons(ct_pptp_info->pac_call_id); + expect_orig->saved_proto.gre.key = ct_pptp_info->pns_call_id; + expect_orig->tuple.src.u.gre.key = nat_pptp_info->pns_call_id; + expect_orig->tuple.dst.u.gre.key = ct_pptp_info->pac_call_id; expect_orig->dir = IP_CT_DIR_ORIGINAL; inv_t.src.ip = reply_t->src.ip; inv_t.dst.ip = reply_t->dst.ip; - inv_t.src.u.gre.key = htons(nat_pptp_info->pac_call_id); - inv_t.dst.u.gre.key = htons(ct_pptp_info->pns_call_id); + inv_t.src.u.gre.key = nat_pptp_info->pac_call_id; + inv_t.dst.u.gre.key = ct_pptp_info->pns_call_id; if (!ip_conntrack_expect_related(expect_orig)) { DEBUGP("successfully registered expect\n"); @@ -253,14 +252,14 @@ pptp_exp_gre(struct ip_conntrack_expect *expect_orig, /* alter expectation for PAC->PNS direction */ invert_tuplepr(&inv_t, &expect_reply->tuple); - expect_reply->saved_proto.gre.key = htons(nat_pptp_info->pns_call_id); - expect_reply->tuple.src.u.gre.key = htons(nat_pptp_info->pac_call_id); - expect_reply->tuple.dst.u.gre.key = htons(ct_pptp_info->pns_call_id); + expect_reply->saved_proto.gre.key = nat_pptp_info->pns_call_id; + expect_reply->tuple.src.u.gre.key = nat_pptp_info->pac_call_id; + expect_reply->tuple.dst.u.gre.key = ct_pptp_info->pns_call_id; expect_reply->dir = IP_CT_DIR_REPLY; inv_t.src.ip = orig_t->src.ip; inv_t.dst.ip = orig_t->dst.ip; - inv_t.src.u.gre.key = htons(nat_pptp_info->pns_call_id); - inv_t.dst.u.gre.key = htons(ct_pptp_info->pac_call_id); + inv_t.src.u.gre.key = nat_pptp_info->pns_call_id; + inv_t.dst.u.gre.key = ct_pptp_info->pac_call_id; if (!ip_conntrack_expect_related(expect_reply)) { DEBUGP("successfully registered expect\n"); @@ -297,10 +296,11 @@ pptp_inbound_pkt(struct sk_buff **pskb, union pptp_ctrl_union *pptpReq) { struct ip_nat_pptp *nat_pptp_info = &ct->nat.help.nat_pptp_info; - u_int16_t msg, new_cid = 0, new_pcid; + u_int16_t msg, new_cid = 0; + __be16 new_pcid; unsigned int pcid_off, cid_off = 0; - new_pcid = htons(nat_pptp_info->pns_call_id); + new_pcid = nat_pptp_info->pns_call_id; switch (msg = ntohs(ctlh->messageType)) { case PPTP_OUT_CALL_REPLY: diff --git a/net/ipv4/netfilter/ip_nat_proto_gre.c b/net/ipv4/netfilter/ip_nat_proto_gre.c index a522669..bf91f93 100644 --- a/net/ipv4/netfilter/ip_nat_proto_gre.c +++ b/net/ipv4/netfilter/ip_nat_proto_gre.c @@ -67,7 +67,7 @@ gre_unique_tuple(struct ip_conntrack_tuple *tuple, const struct ip_conntrack *conntrack) { static u_int16_t key; - u_int16_t *keyptr; + __be16 *keyptr; unsigned int min, i, range_size; if (maniptype == IP_NAT_MANIP_SRC) -- cgit v0.10.2 From a1ad1deed5bf6fa06f2213b7f1a794de4cf791a6 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:08:23 -0700 Subject: [NETFILTER]: PPTP conntrack: remove dead code The call ID in reply packets is never changed, remove the code. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_nat_helper_pptp.c b/net/ipv4/netfilter/ip_nat_helper_pptp.c index 6e8bd6b..0f5e753 100644 --- a/net/ipv4/netfilter/ip_nat_helper_pptp.c +++ b/net/ipv4/netfilter/ip_nat_helper_pptp.c @@ -296,16 +296,15 @@ pptp_inbound_pkt(struct sk_buff **pskb, union pptp_ctrl_union *pptpReq) { struct ip_nat_pptp *nat_pptp_info = &ct->nat.help.nat_pptp_info; - u_int16_t msg, new_cid = 0; + u_int16_t msg; __be16 new_pcid; - unsigned int pcid_off, cid_off = 0; + unsigned int pcid_off; new_pcid = nat_pptp_info->pns_call_id; switch (msg = ntohs(ctlh->messageType)) { case PPTP_OUT_CALL_REPLY: pcid_off = offsetof(union pptp_ctrl_union, ocack.peersCallID); - cid_off = offsetof(union pptp_ctrl_union, ocack.callID); break; case PPTP_IN_CALL_CONNECT: pcid_off = offsetof(union pptp_ctrl_union, iccon.peersCallID); @@ -351,17 +350,6 @@ pptp_inbound_pkt(struct sk_buff **pskb, sizeof(new_pcid), (char *)&new_pcid, sizeof(new_pcid)) == 0) return NF_DROP; - - if (new_cid) { - DEBUGP("altering call id from 0x%04x to 0x%04x\n", - ntohs(REQ_CID(pptpReq, cid_off)), ntohs(new_cid)); - if (ip_nat_mangle_tcp_packet(pskb, ct, ctinfo, - cid_off + sizeof(struct pptp_pkt_hdr) + - sizeof(struct PptpControlHeader), - sizeof(new_cid), (char *)&new_cid, - sizeof(new_cid)) == 0) - return NF_DROP; - } return NF_ACCEPT; } -- cgit v0.10.2 From 5256f663a0228af9bf69ba74ad9f0928f35713f7 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:08:41 -0700 Subject: [NETFILTER]: PPTP conntrack: remove more dead code The calculated sequence numbers are not used for anything. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c index 57637ca..0510ee5 100644 --- a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c +++ b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c @@ -220,7 +220,6 @@ static void pptp_destroy_siblings(struct ip_conntrack *ct) /* expect GRE connections (PNS->PAC and PAC->PNS direction) */ static inline int exp_gre(struct ip_conntrack *master, - u_int32_t seq, __be16 callid, __be16 peer_callid) { @@ -336,7 +335,6 @@ pptp_inbound_pkt(struct sk_buff **pskb, struct ip_ct_pptp_master *info = &ct->help.ct_pptp_info; u_int16_t msg; __be16 *cid, *pcid; - u_int32_t seq; ctlh = skb_header_pointer(*pskb, nexthdr_off, sizeof(_ctlh), &_ctlh); if (!ctlh) { @@ -432,12 +430,7 @@ pptp_inbound_pkt(struct sk_buff **pskb, info->cstate = PPTP_CALL_OUT_CONF; - seq = ntohl(tcph->seq) + sizeof(struct pptp_pkt_hdr) - + sizeof(struct PptpControlHeader) - + ((void *)pcid - (void *)pptpReq); - - if (exp_gre(ct, seq, *cid, *pcid) != 0) - printk("ip_conntrack_pptp: error during exp_gre\n"); + exp_gre(ct, *cid, *pcid); break; case PPTP_IN_CALL_REQUEST: @@ -488,13 +481,7 @@ pptp_inbound_pkt(struct sk_buff **pskb, info->cstate = PPTP_CALL_IN_CONF; /* we expect a GRE connection from PAC to PNS */ - seq = ntohl(tcph->seq) + sizeof(struct pptp_pkt_hdr) - + sizeof(struct PptpControlHeader) - + ((void *)pcid - (void *)pptpReq); - - if (exp_gre(ct, seq, *cid, *pcid) != 0) - printk("ip_conntrack_pptp: error during exp_gre\n"); - + exp_gre(ct, *cid, *pcid); break; case PPTP_CALL_DISCONNECT_NOTIFY: -- cgit v0.10.2 From 6013c0a13e335674a783215e182c367406294392 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:08:56 -0700 Subject: [NETFILTER]: PPTP conntrack: fix header definitions Fix a few header definitions to match RFC2637. Most importantly the PptpOutCallRequest header included an invalid padding field and a size check was disabled because of this. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter_ipv4/ip_conntrack_pptp.h b/include/linux/netfilter_ipv4/ip_conntrack_pptp.h index 0d35623..620bf06 100644 --- a/include/linux/netfilter_ipv4/ip_conntrack_pptp.h +++ b/include/linux/netfilter_ipv4/ip_conntrack_pptp.h @@ -107,8 +107,7 @@ struct PptpControlHeader { struct PptpStartSessionRequest { __be16 protocolVersion; - __u8 reserved1; - __u8 reserved2; + __u16 reserved1; __be32 framingCapability; __be32 bearerCapability; __be16 maxChannels; @@ -143,6 +142,8 @@ struct PptpStartSessionReply { struct PptpStopSessionRequest { __u8 reason; + __u8 reserved1; + __u16 reserved2; }; /* PptpStopSessionResultCode */ @@ -152,6 +153,7 @@ struct PptpStopSessionRequest { struct PptpStopSessionReply { __u8 resultCode; __u8 generalErrorCode; + __u16 reserved1; }; struct PptpEchoRequest { @@ -188,9 +190,8 @@ struct PptpOutCallRequest { __be32 framingType; __be16 packetWindow; __be16 packetProcDelay; - __u16 reserved1; __be16 phoneNumberLength; - __u16 reserved2; + __u16 reserved1; __u8 phoneNumber[64]; __u8 subAddress[64]; }; diff --git a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c index 0510ee5..1a8da90 100644 --- a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c +++ b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c @@ -569,7 +569,7 @@ pptp_outbound_pkt(struct sk_buff **pskb, case PPTP_OUT_CALL_REQUEST: if (reqlen < sizeof(_pptpReq.ocreq)) { DEBUGP("%s: short packet\n", pptp_msg_name[msg]); - /* FIXME: break; */ + break; } /* client initiating connection to server */ -- cgit v0.10.2 From 857c06da2ba2e00b81677c2f6740048d87da0207 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:09:19 -0700 Subject: [NETFILTER]: PPTP conntrack: remove unnecessary cid/pcid header pointers Just the values are needed, not the memory locations. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c index 1a8da90..5f7af6e 100644 --- a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c +++ b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c @@ -334,7 +334,7 @@ pptp_inbound_pkt(struct sk_buff **pskb, union pptp_ctrl_union _pptpReq, *pptpReq; struct ip_ct_pptp_master *info = &ct->help.ct_pptp_info; u_int16_t msg; - __be16 *cid, *pcid; + __be16 cid, pcid; ctlh = skb_header_pointer(*pskb, nexthdr_off, sizeof(_ctlh), &_ctlh); if (!ctlh) { @@ -414,23 +414,23 @@ pptp_inbound_pkt(struct sk_buff **pskb, break; } - cid = &pptpReq->ocack.callID; - pcid = &pptpReq->ocack.peersCallID; + cid = pptpReq->ocack.callID; + pcid = pptpReq->ocack.peersCallID; - info->pac_call_id = *cid; + info->pac_call_id = cid; - if (info->pns_call_id != *pcid) { + if (info->pns_call_id != pcid) { DEBUGP("%s for unknown callid %u\n", - pptp_msg_name[msg], ntohs(*pcid)); + pptp_msg_name[msg], ntohs(pcid)); break; } DEBUGP("%s, CID=%X, PCID=%X\n", pptp_msg_name[msg], - ntohs(*cid), ntohs(*pcid)); + ntohs(cid), ntohs(pcid)); info->cstate = PPTP_CALL_OUT_CONF; - exp_gre(ct, *cid, *pcid); + exp_gre(ct, cid, pcid); break; case PPTP_IN_CALL_REQUEST: @@ -444,10 +444,10 @@ pptp_inbound_pkt(struct sk_buff **pskb, DEBUGP("%s but no session\n", pptp_msg_name[msg]); break; } - pcid = &pptpReq->icack.peersCallID; - DEBUGP("%s, PCID=%X\n", pptp_msg_name[msg], ntohs(*pcid)); + pcid = pptpReq->icack.peersCallID; + DEBUGP("%s, PCID=%X\n", pptp_msg_name[msg], ntohs(pcid)); info->cstate = PPTP_CALL_IN_REQ; - info->pac_call_id = *pcid; + info->pac_call_id = pcid; break; case PPTP_IN_CALL_CONNECT: @@ -468,20 +468,20 @@ pptp_inbound_pkt(struct sk_buff **pskb, break; } - pcid = &pptpReq->iccon.peersCallID; - cid = &info->pac_call_id; + pcid = pptpReq->iccon.peersCallID; + cid = info->pac_call_id; - if (info->pns_call_id != *pcid) { + if (info->pns_call_id != pcid) { DEBUGP("%s for unknown CallID %u\n", - pptp_msg_name[msg], ntohs(*pcid)); + pptp_msg_name[msg], ntohs(pcid)); break; } - DEBUGP("%s, PCID=%X\n", pptp_msg_name[msg], ntohs(*pcid)); + DEBUGP("%s, PCID=%X\n", pptp_msg_name[msg], ntohs(pcid)); info->cstate = PPTP_CALL_IN_CONF; /* we expect a GRE connection from PAC to PNS */ - exp_gre(ct, *cid, *pcid); + exp_gre(ct, cid, pcid); break; case PPTP_CALL_DISCONNECT_NOTIFY: @@ -491,8 +491,8 @@ pptp_inbound_pkt(struct sk_buff **pskb, } /* server confirms disconnect */ - cid = &pptpReq->disc.callID; - DEBUGP("%s, CID=%X\n", pptp_msg_name[msg], ntohs(*cid)); + cid = pptpReq->disc.callID; + DEBUGP("%s, CID=%X\n", pptp_msg_name[msg], ntohs(cid)); info->cstate = PPTP_CALL_NONE; /* untrack this call id, unexpect GRE packets */ @@ -534,7 +534,7 @@ pptp_outbound_pkt(struct sk_buff **pskb, union pptp_ctrl_union _pptpReq, *pptpReq; struct ip_ct_pptp_master *info = &ct->help.ct_pptp_info; u_int16_t msg; - __be16 *cid, *pcid; + __be16 cid, pcid; ctlh = skb_header_pointer(*pskb, nexthdr_off, sizeof(_ctlh), &_ctlh); if (!ctlh) @@ -580,9 +580,9 @@ pptp_outbound_pkt(struct sk_buff **pskb, } info->cstate = PPTP_CALL_OUT_REQ; /* track PNS call id */ - cid = &pptpReq->ocreq.callID; - DEBUGP("%s, CID=%X\n", pptp_msg_name[msg], ntohs(*cid)); - info->pns_call_id = *cid; + cid = pptpReq->ocreq.callID; + DEBUGP("%s, CID=%X\n", pptp_msg_name[msg], ntohs(cid)); + info->pns_call_id = cid; break; case PPTP_IN_CALL_REPLY: if (reqlen < sizeof(_pptpReq.icack)) { @@ -601,16 +601,16 @@ pptp_outbound_pkt(struct sk_buff **pskb, info->cstate = PPTP_CALL_NONE; break; } - pcid = &pptpReq->icack.peersCallID; - if (info->pac_call_id != *pcid) { + pcid = pptpReq->icack.peersCallID; + if (info->pac_call_id != pcid) { DEBUGP("%s for unknown call %u\n", - pptp_msg_name[msg], ntohs(*pcid)); + pptp_msg_name[msg], ntohs(pcid)); break; } - DEBUGP("%s, CID=%X\n", pptp_msg_name[msg], ntohs(*pcid)); + DEBUGP("%s, CID=%X\n", pptp_msg_name[msg], ntohs(pcid)); /* part two of the three-way handshake */ info->cstate = PPTP_CALL_IN_REP; - info->pns_call_id = pptpReq->icack.callID; + info->pns_call_id = pcid; break; case PPTP_CALL_CLEAR_REQUEST: -- cgit v0.10.2 From cf9f81523ef3e95d9f222c896d266e4562999150 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:09:34 -0700 Subject: [NETFILTER]: PPTP conntrack: simplify expectation handling Remove duplicated expectation handling in the NAT helper and simplify the remains in the conntrack helper. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter_ipv4/ip_conntrack_pptp.h b/include/linux/netfilter_ipv4/ip_conntrack_pptp.h index 620bf06..2644b1f 100644 --- a/include/linux/netfilter_ipv4/ip_conntrack_pptp.h +++ b/include/linux/netfilter_ipv4/ip_conntrack_pptp.h @@ -315,7 +315,7 @@ extern int struct PptpControlHeader *ctlh, union pptp_ctrl_union *pptpReq); -extern int +extern void (*ip_nat_pptp_hook_exp_gre)(struct ip_conntrack_expect *exp_orig, struct ip_conntrack_expect *exp_reply); diff --git a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c index 5f7af6e..57eac6e 100644 --- a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c +++ b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c @@ -80,7 +80,7 @@ int struct PptpControlHeader *ctlh, union pptp_ctrl_union *pptpReq); -int +void (*ip_nat_pptp_hook_exp_gre)(struct ip_conntrack_expect *expect_orig, struct ip_conntrack_expect *expect_reply); @@ -219,93 +219,63 @@ static void pptp_destroy_siblings(struct ip_conntrack *ct) /* expect GRE connections (PNS->PAC and PAC->PNS direction) */ static inline int -exp_gre(struct ip_conntrack *master, +exp_gre(struct ip_conntrack *ct, __be16 callid, __be16 peer_callid) { - struct ip_conntrack_tuple inv_tuple; - struct ip_conntrack_tuple exp_tuples[] = { - /* tuple in original direction, PNS->PAC */ - { .src = { .ip = master->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.ip, - .u = { .gre = { .key = peer_callid } } - }, - .dst = { .ip = master->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.ip, - .u = { .gre = { .key = callid } }, - .protonum = IPPROTO_GRE - }, - }, - /* tuple in reply direction, PAC->PNS */ - { .src = { .ip = master->tuplehash[IP_CT_DIR_REPLY].tuple.src.ip, - .u = { .gre = { .key = callid } } - }, - .dst = { .ip = master->tuplehash[IP_CT_DIR_REPLY].tuple.dst.ip, - .u = { .gre = { .key = peer_callid } }, - .protonum = IPPROTO_GRE - }, - } - }; struct ip_conntrack_expect *exp_orig, *exp_reply; int ret = 1; - exp_orig = ip_conntrack_expect_alloc(master); + exp_orig = ip_conntrack_expect_alloc(ct); if (exp_orig == NULL) goto out; - exp_reply = ip_conntrack_expect_alloc(master); + exp_reply = ip_conntrack_expect_alloc(ct); if (exp_reply == NULL) goto out_put_orig; - memcpy(&exp_orig->tuple, &exp_tuples[0], sizeof(exp_orig->tuple)); + /* original direction, PNS->PAC */ + exp_orig->tuple.src.ip = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.src.ip; + exp_orig->tuple.src.u.gre.key = peer_callid; + exp_orig->tuple.dst.ip = ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple.dst.ip; + exp_orig->tuple.dst.u.gre.key = callid; + exp_orig->tuple.dst.protonum = IPPROTO_GRE; exp_orig->mask.src.ip = 0xffffffff; exp_orig->mask.src.u.all = 0; - exp_orig->mask.dst.u.all = 0; exp_orig->mask.dst.u.gre.key = htons(0xffff); exp_orig->mask.dst.ip = 0xffffffff; exp_orig->mask.dst.protonum = 0xff; - exp_orig->master = master; + exp_orig->master = ct; exp_orig->expectfn = pptp_expectfn; exp_orig->flags = 0; /* both expectations are identical apart from tuple */ memcpy(exp_reply, exp_orig, sizeof(*exp_reply)); - memcpy(&exp_reply->tuple, &exp_tuples[1], sizeof(exp_reply->tuple)); - - if (ip_nat_pptp_hook_exp_gre) - ret = ip_nat_pptp_hook_exp_gre(exp_orig, exp_reply); - else { - DEBUGP("calling expect_related PNS->PAC"); - DUMP_TUPLE(&exp_orig->tuple); + /* reply direction, PAC->PNS */ + exp_reply->tuple.src.ip = ct->tuplehash[IP_CT_DIR_REPLY].tuple.src.ip; + exp_reply->tuple.src.u.gre.key = callid; + exp_reply->tuple.dst.ip = ct->tuplehash[IP_CT_DIR_REPLY].tuple.dst.ip; + exp_reply->tuple.dst.u.gre.key = peer_callid; + exp_reply->tuple.dst.protonum = IPPROTO_GRE; - if (ip_conntrack_expect_related(exp_orig) != 0) { - DEBUGP("cannot expect_related()\n"); - goto out_put_both; - } - - DEBUGP("calling expect_related PAC->PNS"); - DUMP_TUPLE(&exp_reply->tuple); - - if (ip_conntrack_expect_related(exp_reply) != 0) { - DEBUGP("cannot expect_related()\n"); - goto out_unexpect_orig; - } - - /* Add GRE keymap entries */ - if (ip_ct_gre_keymap_add(master, &exp_reply->tuple, 0) != 0) { - DEBUGP("cannot keymap_add() exp\n"); - goto out_unexpect_both; - } - - invert_tuplepr(&inv_tuple, &exp_reply->tuple); - if (ip_ct_gre_keymap_add(master, &inv_tuple, 1) != 0) { - ip_ct_gre_keymap_destroy(master); - DEBUGP("cannot keymap_add() exp_inv\n"); - goto out_unexpect_both; - } - ret = 0; + if (ip_nat_pptp_hook_exp_gre) + ip_nat_pptp_hook_exp_gre(exp_orig, exp_reply); + if (ip_conntrack_expect_related(exp_orig) != 0) + goto out_put_both; + if (ip_conntrack_expect_related(exp_reply) != 0) + goto out_unexpect_orig; + + /* Add GRE keymap entries */ + if (ip_ct_gre_keymap_add(ct, &exp_orig->tuple, 0) != 0) + goto out_unexpect_both; + if (ip_ct_gre_keymap_add(ct, &exp_reply->tuple, 1) != 0) { + ip_ct_gre_keymap_destroy(ct); + goto out_unexpect_both; } + ret = 0; out_put_both: ip_conntrack_expect_put(exp_reply); diff --git a/net/ipv4/netfilter/ip_nat_helper_pptp.c b/net/ipv4/netfilter/ip_nat_helper_pptp.c index 0f5e753..84f6bd0 100644 --- a/net/ipv4/netfilter/ip_nat_helper_pptp.c +++ b/net/ipv4/netfilter/ip_nat_helper_pptp.c @@ -211,80 +211,28 @@ pptp_outbound_pkt(struct sk_buff **pskb, return NF_ACCEPT; } -static int +static void pptp_exp_gre(struct ip_conntrack_expect *expect_orig, struct ip_conntrack_expect *expect_reply) { - struct ip_ct_pptp_master *ct_pptp_info = - &expect_orig->master->help.ct_pptp_info; - struct ip_nat_pptp *nat_pptp_info = - &expect_orig->master->nat.help.nat_pptp_info; - struct ip_conntrack *ct = expect_orig->master; - - struct ip_conntrack_tuple inv_t; - struct ip_conntrack_tuple *orig_t, *reply_t; + struct ip_ct_pptp_master *ct_pptp_info = &ct->help.ct_pptp_info; + struct ip_nat_pptp *nat_pptp_info = &ct->nat.help.nat_pptp_info; /* save original PAC call ID in nat_info */ nat_pptp_info->pac_call_id = ct_pptp_info->pac_call_id; - /* alter expectation */ - orig_t = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple; - reply_t = &ct->tuplehash[IP_CT_DIR_REPLY].tuple; - /* alter expectation for PNS->PAC direction */ - invert_tuplepr(&inv_t, &expect_orig->tuple); expect_orig->saved_proto.gre.key = ct_pptp_info->pns_call_id; expect_orig->tuple.src.u.gre.key = nat_pptp_info->pns_call_id; expect_orig->tuple.dst.u.gre.key = ct_pptp_info->pac_call_id; expect_orig->dir = IP_CT_DIR_ORIGINAL; - inv_t.src.ip = reply_t->src.ip; - inv_t.dst.ip = reply_t->dst.ip; - inv_t.src.u.gre.key = nat_pptp_info->pac_call_id; - inv_t.dst.u.gre.key = ct_pptp_info->pns_call_id; - - if (!ip_conntrack_expect_related(expect_orig)) { - DEBUGP("successfully registered expect\n"); - } else { - DEBUGP("can't expect_related(expect_orig)\n"); - return 1; - } /* alter expectation for PAC->PNS direction */ - invert_tuplepr(&inv_t, &expect_reply->tuple); expect_reply->saved_proto.gre.key = nat_pptp_info->pns_call_id; expect_reply->tuple.src.u.gre.key = nat_pptp_info->pac_call_id; expect_reply->tuple.dst.u.gre.key = ct_pptp_info->pns_call_id; expect_reply->dir = IP_CT_DIR_REPLY; - inv_t.src.ip = orig_t->src.ip; - inv_t.dst.ip = orig_t->dst.ip; - inv_t.src.u.gre.key = nat_pptp_info->pns_call_id; - inv_t.dst.u.gre.key = ct_pptp_info->pac_call_id; - - if (!ip_conntrack_expect_related(expect_reply)) { - DEBUGP("successfully registered expect\n"); - } else { - DEBUGP("can't expect_related(expect_reply)\n"); - ip_conntrack_unexpect_related(expect_orig); - return 1; - } - - if (ip_ct_gre_keymap_add(ct, &expect_reply->tuple, 0) < 0) { - DEBUGP("can't register original keymap\n"); - ip_conntrack_unexpect_related(expect_orig); - ip_conntrack_unexpect_related(expect_reply); - return 1; - } - - if (ip_ct_gre_keymap_add(ct, &inv_t, 1) < 0) { - DEBUGP("can't register reply keymap\n"); - ip_conntrack_unexpect_related(expect_orig); - ip_conntrack_unexpect_related(expect_reply); - ip_ct_gre_keymap_destroy(ct); - return 1; - } - - return 0; } /* inbound packets == from PAC to PNS */ -- cgit v0.10.2 From a1073406a124c1d3b33a0f06bfb8078a9ddd1985 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:09:51 -0700 Subject: [NETFILTER]: PPTP conntrack: consolidate header size checks Also make sure not to pass undersized messages to the NAT helper. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c index 57eac6e..3b5464f 100644 --- a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c +++ b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c @@ -291,6 +291,22 @@ out_unexpect_orig: goto out_put_both; } +static const unsigned int pptp_msg_size[] = { + [PPTP_START_SESSION_REQUEST] = sizeof(struct PptpStartSessionRequest), + [PPTP_START_SESSION_REPLY] = sizeof(struct PptpStartSessionReply), + [PPTP_STOP_SESSION_REQUEST] = sizeof(struct PptpStopSessionRequest), + [PPTP_STOP_SESSION_REPLY] = sizeof(struct PptpStopSessionReply), + [PPTP_OUT_CALL_REQUEST] = sizeof(struct PptpOutCallRequest), + [PPTP_OUT_CALL_REPLY] = sizeof(struct PptpOutCallReply), + [PPTP_IN_CALL_REQUEST] = sizeof(struct PptpInCallRequest), + [PPTP_IN_CALL_REPLY] = sizeof(struct PptpInCallReply), + [PPTP_IN_CALL_CONNECT] = sizeof(struct PptpInCallConnected), + [PPTP_CALL_CLEAR_REQUEST] = sizeof(struct PptpClearCallRequest), + [PPTP_CALL_DISCONNECT_NOTIFY] = sizeof(struct PptpCallDisconnectNotify), + [PPTP_WAN_ERROR_NOTIFY] = sizeof(struct PptpWanErrorNotify), + [PPTP_SET_LINK_INFO] = sizeof(struct PptpSetLinkInfo), +}; + static inline int pptp_inbound_pkt(struct sk_buff **pskb, struct tcphdr *tcph, @@ -326,13 +342,11 @@ pptp_inbound_pkt(struct sk_buff **pskb, msg = ntohs(ctlh->messageType); DEBUGP("inbound control message %s\n", pptp_msg_name[msg]); + if (msg > 0 && msg <= PPTP_MSG_MAX && reqlen < pptp_msg_size[msg]) + return NF_ACCEPT; + switch (msg) { case PPTP_START_SESSION_REPLY: - if (reqlen < sizeof(_pptpReq.srep)) { - DEBUGP("%s: short packet\n", pptp_msg_name[msg]); - break; - } - /* server confirms new control session */ if (info->sstate < PPTP_SESSION_REQUESTED) { DEBUGP("%s without START_SESS_REQUEST\n", @@ -346,11 +360,6 @@ pptp_inbound_pkt(struct sk_buff **pskb, break; case PPTP_STOP_SESSION_REPLY: - if (reqlen < sizeof(_pptpReq.strep)) { - DEBUGP("%s: short packet\n", pptp_msg_name[msg]); - break; - } - /* server confirms end of control session */ if (info->sstate > PPTP_SESSION_STOPREQ) { DEBUGP("%s without STOP_SESS_REQUEST\n", @@ -364,11 +373,6 @@ pptp_inbound_pkt(struct sk_buff **pskb, break; case PPTP_OUT_CALL_REPLY: - if (reqlen < sizeof(_pptpReq.ocack)) { - DEBUGP("%s: short packet\n", pptp_msg_name[msg]); - break; - } - /* server accepted call, we now expect GRE frames */ if (info->sstate != PPTP_SESSION_CONFIRMED) { DEBUGP("%s but no session\n", pptp_msg_name[msg]); @@ -404,11 +408,6 @@ pptp_inbound_pkt(struct sk_buff **pskb, break; case PPTP_IN_CALL_REQUEST: - if (reqlen < sizeof(_pptpReq.icack)) { - DEBUGP("%s: short packet\n", pptp_msg_name[msg]); - break; - } - /* server tells us about incoming call request */ if (info->sstate != PPTP_SESSION_CONFIRMED) { DEBUGP("%s but no session\n", pptp_msg_name[msg]); @@ -421,11 +420,6 @@ pptp_inbound_pkt(struct sk_buff **pskb, break; case PPTP_IN_CALL_CONNECT: - if (reqlen < sizeof(_pptpReq.iccon)) { - DEBUGP("%s: short packet\n", pptp_msg_name[msg]); - break; - } - /* server tells us about incoming call established */ if (info->sstate != PPTP_SESSION_CONFIRMED) { DEBUGP("%s but no session\n", pptp_msg_name[msg]); @@ -455,11 +449,6 @@ pptp_inbound_pkt(struct sk_buff **pskb, break; case PPTP_CALL_DISCONNECT_NOTIFY: - if (reqlen < sizeof(_pptpReq.disc)) { - DEBUGP("%s: short packet\n", pptp_msg_name[msg]); - break; - } - /* server confirms disconnect */ cid = pptpReq->disc.callID; DEBUGP("%s, CID=%X\n", pptp_msg_name[msg], ntohs(cid)); @@ -470,8 +459,6 @@ pptp_inbound_pkt(struct sk_buff **pskb, break; case PPTP_WAN_ERROR_NOTIFY: - break; - case PPTP_ECHO_REQUEST: case PPTP_ECHO_REPLY: /* I don't have to explain these ;) */ @@ -522,6 +509,9 @@ pptp_outbound_pkt(struct sk_buff **pskb, msg = ntohs(ctlh->messageType); DEBUGP("outbound control message %s\n", pptp_msg_name[msg]); + if (msg > 0 && msg <= PPTP_MSG_MAX && reqlen < pptp_msg_size[msg]) + return NF_ACCEPT; + switch (msg) { case PPTP_START_SESSION_REQUEST: /* client requests for new control session */ @@ -537,11 +527,6 @@ pptp_outbound_pkt(struct sk_buff **pskb, break; case PPTP_OUT_CALL_REQUEST: - if (reqlen < sizeof(_pptpReq.ocreq)) { - DEBUGP("%s: short packet\n", pptp_msg_name[msg]); - break; - } - /* client initiating connection to server */ if (info->sstate != PPTP_SESSION_CONFIRMED) { DEBUGP("%s but no session\n", @@ -555,11 +540,6 @@ pptp_outbound_pkt(struct sk_buff **pskb, info->pns_call_id = cid; break; case PPTP_IN_CALL_REPLY: - if (reqlen < sizeof(_pptpReq.icack)) { - DEBUGP("%s: short packet\n", pptp_msg_name[msg]); - break; - } - /* client answers incoming call */ if (info->cstate != PPTP_CALL_IN_REQ && info->cstate != PPTP_CALL_IN_REP) { @@ -595,7 +575,6 @@ pptp_outbound_pkt(struct sk_buff **pskb, info->cstate = PPTP_CALL_CLEAR_REQ; break; case PPTP_SET_LINK_INFO: - break; case PPTP_ECHO_REQUEST: case PPTP_ECHO_REPLY: /* I don't have to explain these ;) */ -- cgit v0.10.2 From 4c651756d502e72a68b0bc6fb20bb18c68785227 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:10:06 -0700 Subject: [NETFILTER]: PPTP conntrack: consolidate header parsing Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c index 3b5464f..9a98a6c 100644 --- a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c +++ b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c @@ -291,60 +291,21 @@ out_unexpect_orig: goto out_put_both; } -static const unsigned int pptp_msg_size[] = { - [PPTP_START_SESSION_REQUEST] = sizeof(struct PptpStartSessionRequest), - [PPTP_START_SESSION_REPLY] = sizeof(struct PptpStartSessionReply), - [PPTP_STOP_SESSION_REQUEST] = sizeof(struct PptpStopSessionRequest), - [PPTP_STOP_SESSION_REPLY] = sizeof(struct PptpStopSessionReply), - [PPTP_OUT_CALL_REQUEST] = sizeof(struct PptpOutCallRequest), - [PPTP_OUT_CALL_REPLY] = sizeof(struct PptpOutCallReply), - [PPTP_IN_CALL_REQUEST] = sizeof(struct PptpInCallRequest), - [PPTP_IN_CALL_REPLY] = sizeof(struct PptpInCallReply), - [PPTP_IN_CALL_CONNECT] = sizeof(struct PptpInCallConnected), - [PPTP_CALL_CLEAR_REQUEST] = sizeof(struct PptpClearCallRequest), - [PPTP_CALL_DISCONNECT_NOTIFY] = sizeof(struct PptpCallDisconnectNotify), - [PPTP_WAN_ERROR_NOTIFY] = sizeof(struct PptpWanErrorNotify), - [PPTP_SET_LINK_INFO] = sizeof(struct PptpSetLinkInfo), -}; - static inline int pptp_inbound_pkt(struct sk_buff **pskb, - struct tcphdr *tcph, - unsigned int nexthdr_off, - unsigned int datalen, + struct PptpControlHeader *ctlh, + union pptp_ctrl_union *pptpReq, + unsigned int reqlen, struct ip_conntrack *ct, enum ip_conntrack_info ctinfo) { - struct PptpControlHeader _ctlh, *ctlh; - unsigned int reqlen; - union pptp_ctrl_union _pptpReq, *pptpReq; struct ip_ct_pptp_master *info = &ct->help.ct_pptp_info; u_int16_t msg; __be16 cid, pcid; - ctlh = skb_header_pointer(*pskb, nexthdr_off, sizeof(_ctlh), &_ctlh); - if (!ctlh) { - DEBUGP("error during skb_header_pointer\n"); - return NF_ACCEPT; - } - nexthdr_off += sizeof(_ctlh); - datalen -= sizeof(_ctlh); - - reqlen = datalen; - if (reqlen > sizeof(*pptpReq)) - reqlen = sizeof(*pptpReq); - pptpReq = skb_header_pointer(*pskb, nexthdr_off, reqlen, &_pptpReq); - if (!pptpReq) { - DEBUGP("error during skb_header_pointer\n"); - return NF_ACCEPT; - } - msg = ntohs(ctlh->messageType); DEBUGP("inbound control message %s\n", pptp_msg_name[msg]); - if (msg > 0 && msg <= PPTP_MSG_MAX && reqlen < pptp_msg_size[msg]) - return NF_ACCEPT; - switch (msg) { case PPTP_START_SESSION_REPLY: /* server confirms new control session */ @@ -480,38 +441,19 @@ pptp_inbound_pkt(struct sk_buff **pskb, static inline int pptp_outbound_pkt(struct sk_buff **pskb, - struct tcphdr *tcph, - unsigned int nexthdr_off, - unsigned int datalen, + struct PptpControlHeader *ctlh, + union pptp_ctrl_union *pptpReq, + unsigned int reqlen, struct ip_conntrack *ct, enum ip_conntrack_info ctinfo) { - struct PptpControlHeader _ctlh, *ctlh; - unsigned int reqlen; - union pptp_ctrl_union _pptpReq, *pptpReq; struct ip_ct_pptp_master *info = &ct->help.ct_pptp_info; u_int16_t msg; __be16 cid, pcid; - ctlh = skb_header_pointer(*pskb, nexthdr_off, sizeof(_ctlh), &_ctlh); - if (!ctlh) - return NF_ACCEPT; - nexthdr_off += sizeof(_ctlh); - datalen -= sizeof(_ctlh); - - reqlen = datalen; - if (reqlen > sizeof(*pptpReq)) - reqlen = sizeof(*pptpReq); - pptpReq = skb_header_pointer(*pskb, nexthdr_off, reqlen, &_pptpReq); - if (!pptpReq) - return NF_ACCEPT; - msg = ntohs(ctlh->messageType); DEBUGP("outbound control message %s\n", pptp_msg_name[msg]); - if (msg > 0 && msg <= PPTP_MSG_MAX && reqlen < pptp_msg_size[msg]) - return NF_ACCEPT; - switch (msg) { case PPTP_START_SESSION_REQUEST: /* client requests for new control session */ @@ -593,6 +535,21 @@ pptp_outbound_pkt(struct sk_buff **pskb, return NF_ACCEPT; } +static const unsigned int pptp_msg_size[] = { + [PPTP_START_SESSION_REQUEST] = sizeof(struct PptpStartSessionRequest), + [PPTP_START_SESSION_REPLY] = sizeof(struct PptpStartSessionReply), + [PPTP_STOP_SESSION_REQUEST] = sizeof(struct PptpStopSessionRequest), + [PPTP_STOP_SESSION_REPLY] = sizeof(struct PptpStopSessionReply), + [PPTP_OUT_CALL_REQUEST] = sizeof(struct PptpOutCallRequest), + [PPTP_OUT_CALL_REPLY] = sizeof(struct PptpOutCallReply), + [PPTP_IN_CALL_REQUEST] = sizeof(struct PptpInCallRequest), + [PPTP_IN_CALL_REPLY] = sizeof(struct PptpInCallReply), + [PPTP_IN_CALL_CONNECT] = sizeof(struct PptpInCallConnected), + [PPTP_CALL_CLEAR_REQUEST] = sizeof(struct PptpClearCallRequest), + [PPTP_CALL_DISCONNECT_NOTIFY] = sizeof(struct PptpCallDisconnectNotify), + [PPTP_WAN_ERROR_NOTIFY] = sizeof(struct PptpWanErrorNotify), + [PPTP_SET_LINK_INFO] = sizeof(struct PptpSetLinkInfo), +}; /* track caller id inside control connection, call expect_related */ static int @@ -600,16 +557,17 @@ conntrack_pptp_help(struct sk_buff **pskb, struct ip_conntrack *ct, enum ip_conntrack_info ctinfo) { - struct pptp_pkt_hdr _pptph, *pptph; - struct tcphdr _tcph, *tcph; - u_int32_t tcplen = (*pskb)->len - (*pskb)->nh.iph->ihl * 4; - u_int32_t datalen; int dir = CTINFO2DIR(ctinfo); struct ip_ct_pptp_master *info = &ct->help.ct_pptp_info; - unsigned int nexthdr_off; - + struct tcphdr _tcph, *tcph; + struct pptp_pkt_hdr _pptph, *pptph; + struct PptpControlHeader _ctlh, *ctlh; + union pptp_ctrl_union _pptpReq, *pptpReq; + unsigned int tcplen = (*pskb)->len - (*pskb)->nh.iph->ihl * 4; + unsigned int datalen, reqlen, nexthdr_off; int oldsstate, oldcstate; int ret; + u_int16_t msg; /* don't do any tracking before tcp handshake complete */ if (ctinfo != IP_CT_ESTABLISHED @@ -648,6 +606,23 @@ conntrack_pptp_help(struct sk_buff **pskb, return NF_ACCEPT; } + ctlh = skb_header_pointer(*pskb, nexthdr_off, sizeof(_ctlh), &_ctlh); + if (!ctlh) + return NF_ACCEPT; + nexthdr_off += sizeof(_ctlh); + datalen -= sizeof(_ctlh); + + reqlen = datalen; + msg = ntohs(ctlh->messageType); + if (msg > 0 && msg <= PPTP_MSG_MAX && reqlen < pptp_msg_size[msg]) + return NF_ACCEPT; + if (reqlen > sizeof(*pptpReq)) + reqlen = sizeof(*pptpReq); + + pptpReq = skb_header_pointer(*pskb, nexthdr_off, reqlen, &_pptpReq); + if (!pptpReq) + return NF_ACCEPT; + oldsstate = info->sstate; oldcstate = info->cstate; @@ -657,11 +632,11 @@ conntrack_pptp_help(struct sk_buff **pskb, * established from PNS->PAC. However, RFC makes no guarantee */ if (dir == IP_CT_DIR_ORIGINAL) /* client -> server (PNS -> PAC) */ - ret = pptp_outbound_pkt(pskb, tcph, nexthdr_off, datalen, ct, + ret = pptp_outbound_pkt(pskb, ctlh, pptpReq, reqlen, ct, ctinfo); else /* server -> client (PAC -> PNS) */ - ret = pptp_inbound_pkt(pskb, tcph, nexthdr_off, datalen, ct, + ret = pptp_inbound_pkt(pskb, ctlh, pptpReq, reqlen, ct, ctinfo); DEBUGP("sstate: %d->%d, cstate: %d->%d\n", oldsstate, info->sstate, oldcstate, info->cstate); -- cgit v0.10.2 From 87a0117afdfe64473a6c802501bc15aee145ebb8 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:10:21 -0700 Subject: [NETFILTER]: PPTP conntrack: clean up debugging cruft Also make sure not to hand packets received in an invalid state to the NAT helper since it will mangle the packet with invalid data. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c index 9a98a6c..7b6d5aa 100644 --- a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c +++ b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c @@ -301,7 +301,7 @@ pptp_inbound_pkt(struct sk_buff **pskb, { struct ip_ct_pptp_master *info = &ct->help.ct_pptp_info; u_int16_t msg; - __be16 cid, pcid; + __be16 cid = 0, pcid = 0; msg = ntohs(ctlh->messageType); DEBUGP("inbound control message %s\n", pptp_msg_name[msg]); @@ -309,11 +309,8 @@ pptp_inbound_pkt(struct sk_buff **pskb, switch (msg) { case PPTP_START_SESSION_REPLY: /* server confirms new control session */ - if (info->sstate < PPTP_SESSION_REQUESTED) { - DEBUGP("%s without START_SESS_REQUEST\n", - pptp_msg_name[msg]); - break; - } + if (info->sstate < PPTP_SESSION_REQUESTED) + goto invalid; if (pptpReq->srep.resultCode == PPTP_START_OK) info->sstate = PPTP_SESSION_CONFIRMED; else @@ -322,11 +319,8 @@ pptp_inbound_pkt(struct sk_buff **pskb, case PPTP_STOP_SESSION_REPLY: /* server confirms end of control session */ - if (info->sstate > PPTP_SESSION_STOPREQ) { - DEBUGP("%s without STOP_SESS_REQUEST\n", - pptp_msg_name[msg]); - break; - } + if (info->sstate > PPTP_SESSION_STOPREQ) + goto invalid; if (pptpReq->strep.resultCode == PPTP_STOP_OK) info->sstate = PPTP_SESSION_NONE; else @@ -335,15 +329,12 @@ pptp_inbound_pkt(struct sk_buff **pskb, case PPTP_OUT_CALL_REPLY: /* server accepted call, we now expect GRE frames */ - if (info->sstate != PPTP_SESSION_CONFIRMED) { - DEBUGP("%s but no session\n", pptp_msg_name[msg]); - break; - } + if (info->sstate != PPTP_SESSION_CONFIRMED) + goto invalid; if (info->cstate != PPTP_CALL_OUT_REQ && - info->cstate != PPTP_CALL_OUT_CONF) { - DEBUGP("%s without OUTCALL_REQ\n", pptp_msg_name[msg]); - break; - } + info->cstate != PPTP_CALL_OUT_CONF) + goto invalid; + if (pptpReq->ocack.resultCode != PPTP_OUTCALL_CONNECT) { info->cstate = PPTP_CALL_NONE; break; @@ -354,11 +345,8 @@ pptp_inbound_pkt(struct sk_buff **pskb, info->pac_call_id = cid; - if (info->pns_call_id != pcid) { - DEBUGP("%s for unknown callid %u\n", - pptp_msg_name[msg], ntohs(pcid)); - break; - } + if (info->pns_call_id != pcid) + goto invalid; DEBUGP("%s, CID=%X, PCID=%X\n", pptp_msg_name[msg], ntohs(cid), ntohs(pcid)); @@ -370,10 +358,9 @@ pptp_inbound_pkt(struct sk_buff **pskb, case PPTP_IN_CALL_REQUEST: /* server tells us about incoming call request */ - if (info->sstate != PPTP_SESSION_CONFIRMED) { - DEBUGP("%s but no session\n", pptp_msg_name[msg]); - break; - } + if (info->sstate != PPTP_SESSION_CONFIRMED) + goto invalid; + pcid = pptpReq->icack.peersCallID; DEBUGP("%s, PCID=%X\n", pptp_msg_name[msg], ntohs(pcid)); info->cstate = PPTP_CALL_IN_REQ; @@ -382,25 +369,17 @@ pptp_inbound_pkt(struct sk_buff **pskb, case PPTP_IN_CALL_CONNECT: /* server tells us about incoming call established */ - if (info->sstate != PPTP_SESSION_CONFIRMED) { - DEBUGP("%s but no session\n", pptp_msg_name[msg]); - break; - } - if (info->cstate != PPTP_CALL_IN_REP - && info->cstate != PPTP_CALL_IN_CONF) { - DEBUGP("%s but never sent IN_CALL_REPLY\n", - pptp_msg_name[msg]); - break; - } + if (info->sstate != PPTP_SESSION_CONFIRMED) + goto invalid; + if (info->cstate != PPTP_CALL_IN_REP && + info->cstate != PPTP_CALL_IN_CONF) + goto invalid; pcid = pptpReq->iccon.peersCallID; cid = info->pac_call_id; - if (info->pns_call_id != pcid) { - DEBUGP("%s for unknown CallID %u\n", - pptp_msg_name[msg], ntohs(pcid)); - break; - } + if (info->pns_call_id != pcid) + goto invalid; DEBUGP("%s, PCID=%X\n", pptp_msg_name[msg], ntohs(pcid)); info->cstate = PPTP_CALL_IN_CONF; @@ -425,18 +404,21 @@ pptp_inbound_pkt(struct sk_buff **pskb, /* I don't have to explain these ;) */ break; default: - DEBUGP("invalid %s (TY=%d)\n", (msg <= PPTP_MSG_MAX) - ? pptp_msg_name[msg]:pptp_msg_name[0], msg); - break; + goto invalid; } - if (ip_nat_pptp_hook_inbound) return ip_nat_pptp_hook_inbound(pskb, ct, ctinfo, ctlh, pptpReq); - return NF_ACCEPT; +invalid: + DEBUGP("invalid %s: type=%d cid=%u pcid=%u " + "cstate=%d sstate=%d pns_cid=%u pac_cid=%u\n", + msg <= PPTP_MSG_MAX ? pptp_msg_name[msg] : pptp_msg_name[0], + msg, ntohs(cid), ntohs(pcid), info->cstate, info->sstate, + ntohs(info->pns_call_id), ntohs(info->pac_call_id)); + return NF_ACCEPT; } static inline int @@ -449,7 +431,7 @@ pptp_outbound_pkt(struct sk_buff **pskb, { struct ip_ct_pptp_master *info = &ct->help.ct_pptp_info; u_int16_t msg; - __be16 cid, pcid; + __be16 cid = 0, pcid = 0; msg = ntohs(ctlh->messageType); DEBUGP("outbound control message %s\n", pptp_msg_name[msg]); @@ -457,10 +439,8 @@ pptp_outbound_pkt(struct sk_buff **pskb, switch (msg) { case PPTP_START_SESSION_REQUEST: /* client requests for new control session */ - if (info->sstate != PPTP_SESSION_NONE) { - DEBUGP("%s but we already have one", - pptp_msg_name[msg]); - } + if (info->sstate != PPTP_SESSION_NONE) + goto invalid; info->sstate = PPTP_SESSION_REQUESTED; break; case PPTP_STOP_SESSION_REQUEST: @@ -470,11 +450,8 @@ pptp_outbound_pkt(struct sk_buff **pskb, case PPTP_OUT_CALL_REQUEST: /* client initiating connection to server */ - if (info->sstate != PPTP_SESSION_CONFIRMED) { - DEBUGP("%s but no session\n", - pptp_msg_name[msg]); - break; - } + if (info->sstate != PPTP_SESSION_CONFIRMED) + goto invalid; info->cstate = PPTP_CALL_OUT_REQ; /* track PNS call id */ cid = pptpReq->ocreq.callID; @@ -483,22 +460,17 @@ pptp_outbound_pkt(struct sk_buff **pskb, break; case PPTP_IN_CALL_REPLY: /* client answers incoming call */ - if (info->cstate != PPTP_CALL_IN_REQ - && info->cstate != PPTP_CALL_IN_REP) { - DEBUGP("%s without incall_req\n", - pptp_msg_name[msg]); - break; - } + if (info->cstate != PPTP_CALL_IN_REQ && + info->cstate != PPTP_CALL_IN_REP) + goto invalid; + if (pptpReq->icack.resultCode != PPTP_INCALL_ACCEPT) { info->cstate = PPTP_CALL_NONE; break; } pcid = pptpReq->icack.peersCallID; - if (info->pac_call_id != pcid) { - DEBUGP("%s for unknown call %u\n", - pptp_msg_name[msg], ntohs(pcid)); - break; - } + if (info->pac_call_id != pcid) + goto invalid; DEBUGP("%s, CID=%X\n", pptp_msg_name[msg], ntohs(pcid)); /* part two of the three-way handshake */ info->cstate = PPTP_CALL_IN_REP; @@ -507,10 +479,8 @@ pptp_outbound_pkt(struct sk_buff **pskb, case PPTP_CALL_CLEAR_REQUEST: /* client requests hangup of call */ - if (info->sstate != PPTP_SESSION_CONFIRMED) { - DEBUGP("CLEAR_CALL but no session\n"); - break; - } + if (info->sstate != PPTP_SESSION_CONFIRMED) + goto invalid; /* FUTURE: iterate over all calls and check if * call ID is valid. We don't do this without newnat, * because we only know about last call */ @@ -522,16 +492,20 @@ pptp_outbound_pkt(struct sk_buff **pskb, /* I don't have to explain these ;) */ break; default: - DEBUGP("invalid %s (TY=%d)\n", (msg <= PPTP_MSG_MAX)? - pptp_msg_name[msg]:pptp_msg_name[0], msg); - /* unknown: no need to create GRE masq table entry */ - break; + goto invalid; } if (ip_nat_pptp_hook_outbound) return ip_nat_pptp_hook_outbound(pskb, ct, ctinfo, ctlh, pptpReq); + return NF_ACCEPT; +invalid: + DEBUGP("invalid %s: type=%d cid=%u pcid=%u " + "cstate=%d sstate=%d pns_cid=%u pac_cid=%u\n", + msg <= PPTP_MSG_MAX ? pptp_msg_name[msg] : pptp_msg_name[0], + msg, ntohs(cid), ntohs(pcid), info->cstate, info->sstate, + ntohs(info->pns_call_id), ntohs(info->pac_call_id)); return NF_ACCEPT; } -- cgit v0.10.2 From 750a58423309b56751076329e9edf61b93213e0f Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:10:37 -0700 Subject: [NETFILTER]: PPTP conntrack: check call ID before changing state For rejected calls the state is set to PPTP_CALL_NONE even for non-matching call ids. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c index 7b6d5aa..5cb6b61 100644 --- a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c +++ b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c @@ -335,25 +335,19 @@ pptp_inbound_pkt(struct sk_buff **pskb, info->cstate != PPTP_CALL_OUT_CONF) goto invalid; - if (pptpReq->ocack.resultCode != PPTP_OUTCALL_CONNECT) { - info->cstate = PPTP_CALL_NONE; - break; - } - cid = pptpReq->ocack.callID; pcid = pptpReq->ocack.peersCallID; - - info->pac_call_id = cid; - if (info->pns_call_id != pcid) goto invalid; - DEBUGP("%s, CID=%X, PCID=%X\n", pptp_msg_name[msg], ntohs(cid), ntohs(pcid)); - info->cstate = PPTP_CALL_OUT_CONF; - - exp_gre(ct, cid, pcid); + if (pptpReq->ocack.resultCode == PPTP_OUTCALL_CONNECT) { + info->cstate = PPTP_CALL_OUT_CONF; + info->pac_call_id = cid; + exp_gre(ct, cid, pcid); + } else + info->cstate = PPTP_CALL_NONE; break; case PPTP_IN_CALL_REQUEST: @@ -464,17 +458,17 @@ pptp_outbound_pkt(struct sk_buff **pskb, info->cstate != PPTP_CALL_IN_REP) goto invalid; - if (pptpReq->icack.resultCode != PPTP_INCALL_ACCEPT) { - info->cstate = PPTP_CALL_NONE; - break; - } pcid = pptpReq->icack.peersCallID; if (info->pac_call_id != pcid) goto invalid; DEBUGP("%s, CID=%X\n", pptp_msg_name[msg], ntohs(pcid)); - /* part two of the three-way handshake */ - info->cstate = PPTP_CALL_IN_REP; - info->pns_call_id = pcid; + + if (pptpReq->icack.resultCode == PPTP_INCALL_ACCEPT) { + /* part two of the three-way handshake */ + info->cstate = PPTP_CALL_IN_REP; + info->pns_call_id = pcid; + } else + info->cstate = PPTP_CALL_NONE; break; case PPTP_CALL_CLEAR_REQUEST: -- cgit v0.10.2 From 62fbe9c82b20197a4f9c54f7add5d368418ba277 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:10:52 -0700 Subject: [NETFILTER]: PPTP conntrack: fix PPTP_IN_CALL message types Fix incorrectly used message types and call IDs: - PPTP_IN_CALL_REQUEST (PAC->PNS) contains a PptpInCallRequest (icreq) message and the PAC call ID - PPTP_IN_CALL_REPLY (PNS->PAC) contains a PptpInCallReply (icack) message and the PNS call ID Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c index 5cb6b61..b0225b6 100644 --- a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c +++ b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c @@ -355,10 +355,10 @@ pptp_inbound_pkt(struct sk_buff **pskb, if (info->sstate != PPTP_SESSION_CONFIRMED) goto invalid; - pcid = pptpReq->icack.peersCallID; - DEBUGP("%s, PCID=%X\n", pptp_msg_name[msg], ntohs(pcid)); + cid = pptpReq->icreq.callID; + DEBUGP("%s, CID=%X\n", pptp_msg_name[msg], ntohs(cid)); info->cstate = PPTP_CALL_IN_REQ; - info->pac_call_id = pcid; + info->pac_call_id = cid; break; case PPTP_IN_CALL_CONNECT: @@ -458,15 +458,17 @@ pptp_outbound_pkt(struct sk_buff **pskb, info->cstate != PPTP_CALL_IN_REP) goto invalid; + cid = pptpReq->icack.callID; pcid = pptpReq->icack.peersCallID; if (info->pac_call_id != pcid) goto invalid; - DEBUGP("%s, CID=%X\n", pptp_msg_name[msg], ntohs(pcid)); + DEBUGP("%s, CID=%X PCID=%X\n", pptp_msg_name[msg], + ntohs(cid), ntohs(pcid)); if (pptpReq->icack.resultCode == PPTP_INCALL_ACCEPT) { /* part two of the three-way handshake */ info->cstate = PPTP_CALL_IN_REP; - info->pns_call_id = pcid; + info->pns_call_id = cid; } else info->cstate = PPTP_CALL_NONE; break; diff --git a/net/ipv4/netfilter/ip_nat_helper_pptp.c b/net/ipv4/netfilter/ip_nat_helper_pptp.c index 84f6bd0..2ff5788 100644 --- a/net/ipv4/netfilter/ip_nat_helper_pptp.c +++ b/net/ipv4/netfilter/ip_nat_helper_pptp.c @@ -172,7 +172,7 @@ pptp_outbound_pkt(struct sk_buff **pskb, ct_pptp_info->pns_call_id = new_callid; break; case PPTP_IN_CALL_REPLY: - cid_off = offsetof(union pptp_ctrl_union, icreq.callID); + cid_off = offsetof(union pptp_ctrl_union, icack.callID); break; case PPTP_CALL_CLEAR_REQUEST: cid_off = offsetof(union pptp_ctrl_union, clrreq.callID); -- cgit v0.10.2 From fd5e3befa405ea64d4db6b393b821644bf963c57 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:11:12 -0700 Subject: [NETFILTER]: PPTP conntrack: fix GRE keymap leak When destroying the GRE expectations without having seen the GRE connection the keymap entry is not freed, leading to a memory leak and, in case of a following call within the same session, failure during expectation setup. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c index b0225b6..98267b0 100644 --- a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c +++ b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c @@ -194,6 +194,7 @@ static void pptp_destroy_siblings(struct ip_conntrack *ct) { struct ip_conntrack_tuple t; + ip_ct_gre_keymap_destroy(ct); /* Since ct->sibling_list has literally rusted away in 2.6.11, * we now need another way to find out about our sibling * contrack and expects... -HW */ -- cgit v0.10.2 From 4c5de695cf7f71c85ad8cfff509f6475b8bd4d27 Mon Sep 17 00:00:00 2001 From: Patrick McHardy Date: Wed, 20 Sep 2006 12:11:30 -0700 Subject: [NETFILTER]: PPTP conntrack: fix another GRE keymap leak When the master PPTP connection times out while still having unfullfilled expectations (and a GRE keymap entry) associated with it, the keymap entry is not destroyed. Add a destroy callback to struct ip_conntrack_helper and use it to destroy PPTP siblings when the master is destroyed. Signed-off-by: Patrick McHardy Signed-off-by: David S. Miller diff --git a/include/linux/netfilter_ipv4/ip_conntrack_helper.h b/include/linux/netfilter_ipv4/ip_conntrack_helper.h index 8d69279..77fe868 100644 --- a/include/linux/netfilter_ipv4/ip_conntrack_helper.h +++ b/include/linux/netfilter_ipv4/ip_conntrack_helper.h @@ -25,6 +25,8 @@ struct ip_conntrack_helper struct ip_conntrack *ct, enum ip_conntrack_info conntrackinfo); + void (*destroy)(struct ip_conntrack *ct); + int (*to_nfattr)(struct sk_buff *skb, const struct ip_conntrack *ct); }; diff --git a/net/ipv4/netfilter/ip_conntrack_core.c b/net/ipv4/netfilter/ip_conntrack_core.c index 2b6f24f..c432b31 100644 --- a/net/ipv4/netfilter/ip_conntrack_core.c +++ b/net/ipv4/netfilter/ip_conntrack_core.c @@ -307,6 +307,7 @@ destroy_conntrack(struct nf_conntrack *nfct) { struct ip_conntrack *ct = (struct ip_conntrack *)nfct; struct ip_conntrack_protocol *proto; + struct ip_conntrack_helper *helper; DEBUGP("destroy_conntrack(%p)\n", ct); IP_NF_ASSERT(atomic_read(&nfct->use) == 0); @@ -315,6 +316,10 @@ destroy_conntrack(struct nf_conntrack *nfct) ip_conntrack_event(IPCT_DESTROY, ct); set_bit(IPS_DYING_BIT, &ct->status); + helper = ct->helper; + if (helper && helper->destroy) + helper->destroy(ct); + /* To make sure we don't get any weird locking issues here: * destroy_conntrack() MUST NOT be called with a write lock * to ip_conntrack_lock!!! -HW */ diff --git a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c index 98267b0..fb0aee6 100644 --- a/net/ipv4/netfilter/ip_conntrack_helper_pptp.c +++ b/net/ipv4/netfilter/ip_conntrack_helper_pptp.c @@ -553,15 +553,6 @@ conntrack_pptp_help(struct sk_buff **pskb, nexthdr_off += tcph->doff * 4; datalen = tcplen - tcph->doff * 4; - if (tcph->fin || tcph->rst) { - DEBUGP("RST/FIN received, timeouting GRE\n"); - /* can't do this after real newnat */ - info->cstate = PPTP_CALL_NONE; - - /* untrack this call id, unexpect GRE packets */ - pptp_destroy_siblings(ct); - } - pptph = skb_header_pointer(*pskb, nexthdr_off, sizeof(_pptph), &_pptph); if (!pptph) { DEBUGP("no full PPTP header, can't track\n"); @@ -640,7 +631,8 @@ static struct ip_conntrack_helper pptp = { .protonum = 0xff } }, - .help = conntrack_pptp_help + .help = conntrack_pptp_help, + .destroy = pptp_destroy_siblings, }; extern void ip_ct_proto_gre_fini(void); -- cgit v0.10.2 From e21e0b5f19ac7835a244c2016f7ed726f971b3e9 Mon Sep 17 00:00:00 2001 From: Ville Nuorvala Date: Fri, 22 Sep 2006 14:41:44 -0700 Subject: [IPV6] NDISC: Handle NDP messages to proxied addresses. It is required to respond to NDP messages sent directly to the "target" unicast address. Proxying node (router) is required to handle such messages. To achieve this, check if the packet in forwarding patch is NDP message. With this patch, the proxy neighbor entries are always looked up in forwarding path. We may want to optimize further. Based on MIPL2 kernel patch. Signed-off-by: Ville Nuorvala Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index c14ea1e..0f56e9e 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -308,6 +308,46 @@ static int ip6_call_ra_chain(struct sk_buff *skb, int sel) return 0; } +static int ip6_forward_proxy_check(struct sk_buff *skb) +{ + struct ipv6hdr *hdr = skb->nh.ipv6h; + u8 nexthdr = hdr->nexthdr; + int offset; + + if (ipv6_ext_hdr(nexthdr)) { + offset = ipv6_skip_exthdr(skb, sizeof(*hdr), &nexthdr); + if (offset < 0) + return 0; + } else + offset = sizeof(struct ipv6hdr); + + if (nexthdr == IPPROTO_ICMPV6) { + struct icmp6hdr *icmp6; + + if (!pskb_may_pull(skb, skb->nh.raw + offset + 1 - skb->data)) + return 0; + + icmp6 = (struct icmp6hdr *)(skb->nh.raw + offset); + + switch (icmp6->icmp6_type) { + case NDISC_ROUTER_SOLICITATION: + case NDISC_ROUTER_ADVERTISEMENT: + case NDISC_NEIGHBOUR_SOLICITATION: + case NDISC_NEIGHBOUR_ADVERTISEMENT: + case NDISC_REDIRECT: + /* For reaction involving unicast neighbor discovery + * message destined to the proxied address, pass it to + * input function. + */ + return 1; + default: + break; + } + } + + return 0; +} + static inline int ip6_forward_finish(struct sk_buff *skb) { return dst_output(skb); @@ -362,6 +402,11 @@ int ip6_forward(struct sk_buff *skb) return -ETIMEDOUT; } + if (pneigh_lookup(&nd_tbl, &hdr->daddr, skb->dev, 0)) { + if (ip6_forward_proxy_check(skb)) + return ip6_input(skb); + } + if (!xfrm6_route_forward(skb)) { IP6_INC_STATS(IPSTATS_MIB_INDISCARDS); goto drop; -- cgit v0.10.2 From 74553b09dcd9194cbda737016f0b89f245145670 Mon Sep 17 00:00:00 2001 From: Ville Nuorvala Date: Fri, 22 Sep 2006 14:42:18 -0700 Subject: [IPV6]: Don't forward packets to proxied link-local address. Proxying router can't forward traffic sent to link-local address, so signal the sender and discard the packet. This behavior is clarified by Mobile IPv6 specification (RFC3775) but might be required for all proxying router. Based on MIPL2 kernel patch. Signed-off-by: Ville Nuorvala Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 0f56e9e..b2be749 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -345,6 +345,16 @@ static int ip6_forward_proxy_check(struct sk_buff *skb) } } + /* + * The proxying router can't forward traffic sent to a link-local + * address, so signal the sender and discard the packet. This + * behavior is clarified by the MIPv6 specification. + */ + if (ipv6_addr_type(&hdr->daddr) & IPV6_ADDR_LINKLOCAL) { + dst_link_failure(skb); + return -1; + } + return 0; } @@ -403,8 +413,13 @@ int ip6_forward(struct sk_buff *skb) } if (pneigh_lookup(&nd_tbl, &hdr->daddr, skb->dev, 0)) { - if (ip6_forward_proxy_check(skb)) + int proxied = ip6_forward_proxy_check(skb); + if (proxied > 0) return ip6_input(skb); + else if (proxied < 0) { + IP6_INC_STATS(IPSTATS_MIB_INDISCARDS); + goto drop; + } } if (!xfrm6_route_forward(skb)) { -- cgit v0.10.2 From 5f3e6e9e19f50a6910aec2dbd479187aabba04b7 Mon Sep 17 00:00:00 2001 From: Ville Nuorvala Date: Fri, 22 Sep 2006 14:42:46 -0700 Subject: [IPV6] NDISC: Avoid updating neighbor cache for proxied address in receiving NA. This aims at proxying router not updating neighbor cache entry for proxied address when it receives NA because either the proxied node is off link or it has already sent a NA to the proxied router. Based on MIPL2 kernel patch. Signed-off-by: Ville Nuorvala Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c index ed01f9a..0e0d6ce 100644 --- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -952,6 +952,15 @@ static void ndisc_recv_na(struct sk_buff *skb) if (neigh->nud_state & NUD_FAILED) goto out; + /* + * Don't update the neighbor cache entry on a proxy NA from + * ourselves because either the proxied node is off link or it + * has already sent a NA to us. + */ + if (lladdr && !memcmp(lladdr, dev->dev_addr, dev->addr_len) && + pneigh_lookup(&nd_tbl, &msg->target, dev, 0)) + goto out; + neigh_update(neigh, lladdr, msg->icmph.icmp6_solicited ? NUD_REACHABLE : NUD_STALE, NEIGH_UPDATE_F_WEAK_OVERRIDE| -- cgit v0.10.2 From 62dd93181aaa1d5a501a9cebcb254f44b8a48af7 Mon Sep 17 00:00:00 2001 From: Ville Nuorvala Date: Fri, 22 Sep 2006 14:43:19 -0700 Subject: [IPV6] NDISC: Set per-entry is_router flag in Proxy NA. We have sent NA with router flag from the node-wide forwarding configuration. This is not appropriate for proxy NA, and it should be set according to each proxy entry's configuration. This is used by Mobile IPv6 home agent to support physical home link in acting as a proxy router for mobile node which is not a router, for example. Based on MIPL2 kernel patch. Signed-off-by: Ville Nuorvala Signed-off-by: Masahide NAKAMURA Signed-off-by: YOSHIFUJI Hideaki diff --git a/include/net/neighbour.h b/include/net/neighbour.h index bd187da..c8aacbd 100644 --- a/include/net/neighbour.h +++ b/include/net/neighbour.h @@ -126,6 +126,7 @@ struct pneigh_entry { struct pneigh_entry *next; struct net_device *dev; + u8 flags; u8 key[0]; }; diff --git a/net/core/neighbour.c b/net/core/neighbour.c index a45bd21..b6c69e1 100644 --- a/net/core/neighbour.c +++ b/net/core/neighbour.c @@ -1544,9 +1544,14 @@ int neigh_add(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) lladdr = tb[NDA_LLADDR] ? nla_data(tb[NDA_LLADDR]) : NULL; if (ndm->ndm_flags & NTF_PROXY) { - err = 0; - if (pneigh_lookup(tbl, dst, dev, 1) == NULL) - err = -ENOBUFS; + struct pneigh_entry *pn; + + err = -ENOBUFS; + pn = pneigh_lookup(tbl, dst, dev, 1); + if (pn) { + pn->flags = ndm->ndm_flags; + err = 0; + } goto out_dev_put; } diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c index 0e0d6ce..ddf0386 100644 --- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -736,8 +736,10 @@ static void ndisc_recv_ns(struct sk_buff *skb) struct inet6_ifaddr *ifp; struct inet6_dev *idev = NULL; struct neighbour *neigh; + struct pneigh_entry *pneigh = NULL; int dad = ipv6_addr_any(saddr); int inc; + int is_router; if (ipv6_addr_is_multicast(&msg->target)) { ND_PRINTK2(KERN_WARNING @@ -822,7 +824,8 @@ static void ndisc_recv_ns(struct sk_buff *skb) if (ipv6_chk_acast_addr(dev, &msg->target) || (idev->cnf.forwarding && - pneigh_lookup(&nd_tbl, &msg->target, dev, 0))) { + (pneigh = pneigh_lookup(&nd_tbl, + &msg->target, dev, 0)) != NULL)) { if (!(NEIGH_CB(skb)->flags & LOCALLY_ENQUEUED) && skb->pkt_type != PACKET_HOST && inc != 0 && @@ -843,12 +846,17 @@ static void ndisc_recv_ns(struct sk_buff *skb) goto out; } + if (pneigh) + is_router = pneigh->flags & NTF_ROUTER; + else + is_router = idev->cnf.forwarding; + if (dad) { struct in6_addr maddr; ipv6_addr_all_nodes(&maddr); ndisc_send_na(dev, NULL, &maddr, &msg->target, - idev->cnf.forwarding, 0, (ifp != NULL), 1); + is_router, 0, (ifp != NULL), 1); goto out; } @@ -869,7 +877,7 @@ static void ndisc_recv_ns(struct sk_buff *skb) NEIGH_UPDATE_F_OVERRIDE); if (neigh || !dev->hard_header) { ndisc_send_na(dev, neigh, saddr, &msg->target, - idev->cnf.forwarding, + is_router, 1, (ifp != NULL && inc), inc); if (neigh) neigh_release(neigh); -- cgit v0.10.2 From fbea49e1e2404baa2d88ab47e2db89e49551b53b Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Fri, 22 Sep 2006 14:43:49 -0700 Subject: [IPV6] NDISC: Add proxy_ndp sysctl. We do not always need proxy NDP functionality even we enable forwarding. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index 307cd4e..935e298 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -765,6 +765,9 @@ conf/all/forwarding - BOOLEAN This referred to as global forwarding. +proxy_ndp - BOOLEAN + Do proxy ndp. + conf/interface/*: Change special settings per interface. diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h index 1d6d3cc..caca57d 100644 --- a/include/linux/ipv6.h +++ b/include/linux/ipv6.h @@ -176,6 +176,7 @@ struct ipv6_devconf { __s32 accept_ra_rt_info_max_plen; #endif #endif + __s32 proxy_ndp; void *sysctl; }; @@ -203,6 +204,7 @@ enum { DEVCONF_ACCEPT_RA_RTR_PREF, DEVCONF_RTR_PROBE_INTERVAL, DEVCONF_ACCEPT_RA_RT_INFO_MAX_PLEN, + DEVCONF_PROXY_NDP, DEVCONF_MAX }; diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h index af61d92..736ed91 100644 --- a/include/linux/sysctl.h +++ b/include/linux/sysctl.h @@ -556,6 +556,7 @@ enum { NET_IPV6_ACCEPT_RA_RTR_PREF=20, NET_IPV6_RTR_PROBE_INTERVAL=21, NET_IPV6_ACCEPT_RA_RT_INFO_MAX_PLEN=22, + NET_IPV6_PROXY_NDP=23, __NET_IPV6_MAX }; diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 1e5a296..825a291 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -175,6 +175,7 @@ struct ipv6_devconf ipv6_devconf __read_mostly = { .accept_ra_rt_info_max_plen = 0, #endif #endif + .proxy_ndp = 0, }; static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = { @@ -205,6 +206,7 @@ static struct ipv6_devconf ipv6_devconf_dflt __read_mostly = { .accept_ra_rt_info_max_plen = 0, #endif #endif + .proxy_ndp = 0, }; /* IPv6 Wildcard Address and Loopback Address defined by RFC2553 */ @@ -3337,6 +3339,7 @@ static void inline ipv6_store_devconf(struct ipv6_devconf *cnf, array[DEVCONF_ACCEPT_RA_RT_INFO_MAX_PLEN] = cnf->accept_ra_rt_info_max_plen; #endif #endif + array[DEVCONF_PROXY_NDP] = cnf->proxy_ndp; } /* Maximum length of ifinfomsg attributes */ @@ -3860,6 +3863,14 @@ static struct addrconf_sysctl_table #endif #endif { + .ctl_name = NET_IPV6_PROXY_NDP, + .procname = "proxy_ndp", + .data = &ipv6_devconf.proxy_ndp, + .maxlen = sizeof(int), + .mode = 0644, + .proc_handler = &proc_dointvec, + }, + { .ctl_name = 0, /* sentinel */ } }, diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index b2be749..6671691 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -412,7 +412,9 @@ int ip6_forward(struct sk_buff *skb) return -ETIMEDOUT; } - if (pneigh_lookup(&nd_tbl, &hdr->daddr, skb->dev, 0)) { + /* XXX: idev->cnf.proxy_ndp? */ + if (ipv6_devconf.proxy_ndp && + pneigh_lookup(&nd_tbl, &hdr->daddr, skb->dev, 0)) { int proxied = ip6_forward_proxy_check(skb); if (proxied > 0) return ip6_input(skb); diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c index ddf0386..76517a5 100644 --- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -824,6 +824,7 @@ static void ndisc_recv_ns(struct sk_buff *skb) if (ipv6_chk_acast_addr(dev, &msg->target) || (idev->cnf.forwarding && + (ipv6_devconf.proxy_ndp || idev->cnf.proxy_ndp) && (pneigh = pneigh_lookup(&nd_tbl, &msg->target, dev, 0)) != NULL)) { if (!(NEIGH_CB(skb)->flags & LOCALLY_ENQUEUED) && @@ -966,8 +967,13 @@ static void ndisc_recv_na(struct sk_buff *skb) * has already sent a NA to us. */ if (lladdr && !memcmp(lladdr, dev->dev_addr, dev->addr_len) && - pneigh_lookup(&nd_tbl, &msg->target, dev, 0)) + ipv6_devconf.forwarding && ipv6_devconf.proxy_ndp && + pneigh_lookup(&nd_tbl, &msg->target, dev, 0)) { + /* XXX: idev->cnf.prixy_ndp */ + WARN_ON(skb->dst != NULL && + ((struct rt6_info *)skb->dst)->rt6i_idev); goto out; + } neigh_update(neigh, lladdr, msg->icmph.icmp6_solicited ? NUD_REACHABLE : NUD_STALE, -- cgit v0.10.2 From 8814c4b533817df825485ff32ce6ac406c3a54d1 Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Fri, 22 Sep 2006 14:44:24 -0700 Subject: [IPV6] ADDRCONF: Convert addrconf_lock to RCU. Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/net/addrconf.h b/include/net/addrconf.h index 5fc8627..aa2ed8f 100644 --- a/include/net/addrconf.h +++ b/include/net/addrconf.h @@ -133,20 +133,18 @@ extern int unregister_inet6addr_notifier(struct notifier_block *nb); static inline struct inet6_dev * __in6_dev_get(struct net_device *dev) { - return (struct inet6_dev *)dev->ip6_ptr; + return rcu_dereference(dev->ip6_ptr); } -extern rwlock_t addrconf_lock; - static inline struct inet6_dev * in6_dev_get(struct net_device *dev) { struct inet6_dev *idev = NULL; - read_lock(&addrconf_lock); - idev = dev->ip6_ptr; + rcu_read_lock(); + idev = __in6_dev_get(dev); if (idev) atomic_inc(&idev->refcnt); - read_unlock(&addrconf_lock); + rcu_read_unlock(); return idev; } diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h index e459e1a..34489c1 100644 --- a/include/net/if_inet6.h +++ b/include/net/if_inet6.h @@ -189,6 +189,7 @@ struct inet6_dev struct ipv6_devconf cnf; struct ipv6_devstat stats; unsigned long tstamp; /* ipv6InterfaceTable update timestamp */ + struct rcu_head rcu; }; extern struct ipv6_devconf ipv6_devconf; diff --git a/net/core/pktgen.c b/net/core/pktgen.c index 6a7320b..72145d4 100644 --- a/net/core/pktgen.c +++ b/net/core/pktgen.c @@ -1786,7 +1786,7 @@ static void pktgen_setup_inject(struct pktgen_dev *pkt_dev) * use ipv6_get_lladdr if/when it's get exported */ - read_lock(&addrconf_lock); + rcu_read_lock(); if ((idev = __in6_dev_get(pkt_dev->odev)) != NULL) { struct inet6_ifaddr *ifp; @@ -1805,7 +1805,7 @@ static void pktgen_setup_inject(struct pktgen_dev *pkt_dev) } read_unlock_bh(&idev->lock); } - read_unlock(&addrconf_lock); + rcu_read_unlock(); if (err) printk("pktgen: ERROR: IPv6 link address not availble.\n"); } diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 825a291..c09ebb7 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -119,9 +119,6 @@ static int ipv6_count_addresses(struct inet6_dev *idev); static struct inet6_ifaddr *inet6_addr_lst[IN6_ADDR_HSIZE]; static DEFINE_RWLOCK(addrconf_hash_lock); -/* Protects inet6 devices */ -DEFINE_RWLOCK(addrconf_lock); - static void addrconf_verify(unsigned long); static DEFINE_TIMER(addr_chk_timer, addrconf_verify, 0, 0); @@ -318,6 +315,12 @@ static void addrconf_mod_timer(struct inet6_ifaddr *ifp, /* Nobody refers to this device, we may destroy it. */ +static void in6_dev_finish_destroy_rcu(struct rcu_head *head) +{ + struct inet6_dev *idev = container_of(head, struct inet6_dev, rcu); + kfree(idev); +} + void in6_dev_finish_destroy(struct inet6_dev *idev) { struct net_device *dev = idev->dev; @@ -332,7 +335,7 @@ void in6_dev_finish_destroy(struct inet6_dev *idev) return; } snmp6_free_dev(idev); - kfree(idev); + call_rcu(&idev->rcu, in6_dev_finish_destroy_rcu); } static struct inet6_dev * ipv6_add_dev(struct net_device *dev) @@ -408,9 +411,8 @@ static struct inet6_dev * ipv6_add_dev(struct net_device *dev) if (netif_carrier_ok(dev)) ndev->if_flags |= IF_READY; - write_lock_bh(&addrconf_lock); - dev->ip6_ptr = ndev; - write_unlock_bh(&addrconf_lock); + /* protected by rtnl_lock */ + rcu_assign_pointer(dev->ip6_ptr, ndev); ipv6_mc_init_dev(ndev); ndev->tstamp = jiffies; @@ -474,7 +476,7 @@ static void addrconf_forward_change(void) read_lock(&dev_base_lock); for (dev=dev_base; dev; dev=dev->next) { - read_lock(&addrconf_lock); + rcu_read_lock(); idev = __in6_dev_get(dev); if (idev) { int changed = (!idev->cnf.forwarding) ^ (!ipv6_devconf.forwarding); @@ -482,7 +484,7 @@ static void addrconf_forward_change(void) if (changed) dev_forward_change(idev); } - read_unlock(&addrconf_lock); + rcu_read_unlock(); } read_unlock(&dev_base_lock); } @@ -543,7 +545,7 @@ ipv6_add_addr(struct inet6_dev *idev, const struct in6_addr *addr, int pfxlen, int hash; int err = 0; - read_lock_bh(&addrconf_lock); + rcu_read_lock_bh(); if (idev->dead) { err = -ENODEV; /*XXX*/ goto out2; @@ -612,7 +614,7 @@ ipv6_add_addr(struct inet6_dev *idev, const struct in6_addr *addr, int pfxlen, in6_ifa_hold(ifa); write_unlock(&idev->lock); out2: - read_unlock_bh(&addrconf_lock); + rcu_read_unlock_bh(); if (likely(err == 0)) atomic_notifier_call_chain(&inet6addr_chain, NETDEV_UP, ifa); @@ -915,7 +917,7 @@ int ipv6_dev_get_saddr(struct net_device *daddr_dev, memset(&hiscore, 0, sizeof(hiscore)); read_lock(&dev_base_lock); - read_lock(&addrconf_lock); + rcu_read_lock(); for (dev = dev_base; dev; dev=dev->next) { struct inet6_dev *idev; @@ -1127,7 +1129,7 @@ record_it: } read_unlock_bh(&idev->lock); } - read_unlock(&addrconf_lock); + rcu_read_unlock(); read_unlock(&dev_base_lock); if (!ifa_result) @@ -1151,7 +1153,7 @@ int ipv6_get_lladdr(struct net_device *dev, struct in6_addr *addr) struct inet6_dev *idev; int err = -EADDRNOTAVAIL; - read_lock(&addrconf_lock); + rcu_read_lock(); if ((idev = __in6_dev_get(dev)) != NULL) { struct inet6_ifaddr *ifp; @@ -1165,7 +1167,7 @@ int ipv6_get_lladdr(struct net_device *dev, struct in6_addr *addr) } read_unlock_bh(&idev->lock); } - read_unlock(&addrconf_lock); + rcu_read_unlock(); return err; } @@ -1466,7 +1468,7 @@ static void ipv6_regen_rndid(unsigned long data) struct inet6_dev *idev = (struct inet6_dev *) data; unsigned long expires; - read_lock_bh(&addrconf_lock); + rcu_read_lock_bh(); write_lock_bh(&idev->lock); if (idev->dead) @@ -1490,7 +1492,7 @@ static void ipv6_regen_rndid(unsigned long data) out: write_unlock_bh(&idev->lock); - read_unlock_bh(&addrconf_lock); + rcu_read_unlock_bh(); in6_dev_put(idev); } @@ -2342,10 +2344,10 @@ static int addrconf_ifdown(struct net_device *dev, int how) Do not dev_put! */ if (how == 1) { - write_lock_bh(&addrconf_lock); - dev->ip6_ptr = NULL; idev->dead = 1; - write_unlock_bh(&addrconf_lock); + + /* protected by rtnl_lock */ + rcu_assign_pointer(dev->ip6_ptr, NULL); /* Step 1.5: remove snmp6 entry */ snmp6_unregister_dev(idev); @@ -3573,10 +3575,10 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp) { - read_lock_bh(&addrconf_lock); + rcu_read_lock_bh(); if (likely(ifp->idev->dead == 0)) __ipv6_ifa_notify(event, ifp); - read_unlock_bh(&addrconf_lock); + rcu_read_unlock_bh(); } #ifdef CONFIG_SYSCTL diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c index b80fc50..a960476 100644 --- a/net/ipv6/anycast.c +++ b/net/ipv6/anycast.c @@ -56,7 +56,7 @@ ip6_onlink(struct in6_addr *addr, struct net_device *dev) int onlink; onlink = 0; - read_lock(&addrconf_lock); + rcu_read_lock(); idev = __in6_dev_get(dev); if (idev) { read_lock_bh(&idev->lock); @@ -68,7 +68,7 @@ ip6_onlink(struct in6_addr *addr, struct net_device *dev) } read_unlock_bh(&idev->lock); } - read_unlock(&addrconf_lock); + rcu_read_unlock(); return onlink; } diff --git a/net/ipv6/ipv6_syms.c b/net/ipv6/ipv6_syms.c index 7b7b90d..0e8e067 100644 --- a/net/ipv6/ipv6_syms.c +++ b/net/ipv6/ipv6_syms.c @@ -14,7 +14,6 @@ EXPORT_SYMBOL(ndisc_mc_map); EXPORT_SYMBOL(register_inet6addr_notifier); EXPORT_SYMBOL(unregister_inet6addr_notifier); EXPORT_SYMBOL(ip6_route_output); -EXPORT_SYMBOL(addrconf_lock); EXPORT_SYMBOL(ipv6_setsockopt); EXPORT_SYMBOL(ipv6_getsockopt); EXPORT_SYMBOL(inet6_register_protosw); diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c index fd87e3c..249e503 100644 --- a/net/sctp/ipv6.c +++ b/net/sctp/ipv6.c @@ -321,9 +321,9 @@ static void sctp_v6_copy_addrlist(struct list_head *addrlist, struct inet6_ifaddr *ifp; struct sctp_sockaddr_entry *addr; - read_lock(&addrconf_lock); + rcu_read_lock(); if ((in6_dev = __in6_dev_get(dev)) == NULL) { - read_unlock(&addrconf_lock); + rcu_read_unlock(); return; } @@ -342,7 +342,7 @@ static void sctp_v6_copy_addrlist(struct list_head *addrlist, } read_unlock(&in6_dev->lock); - read_unlock(&addrconf_lock); + rcu_read_unlock(); } /* Initialize a sockaddr_storage from in incoming skb. */ -- cgit v0.10.2 From fc26d0abd5afd2b5268a7dbdbf8be1095ce5703e Mon Sep 17 00:00:00 2001 From: YOSHIFUJI Hideaki Date: Fri, 22 Sep 2006 14:44:53 -0700 Subject: [IPV6] NDISC: Fix is_router flag setting. We did not send appropriate IsRouter flag if the forwarding setting is positive even value. Let's give 1/0 value to ndisc_send_na(). Also, existing users of ndisc_send_na() give 0/1 to override, we can omit redundant operation in that function. Bug hinted by Nicolas Dichtel . Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c index 76517a5..0304b5f 100644 --- a/net/ipv6/ndisc.c +++ b/net/ipv6/ndisc.c @@ -496,7 +496,7 @@ static void ndisc_send_na(struct net_device *dev, struct neighbour *neigh, msg->icmph.icmp6_unused = 0; msg->icmph.icmp6_router = router; msg->icmph.icmp6_solicited = solicited; - msg->icmph.icmp6_override = !!override; + msg->icmph.icmp6_override = override; /* Set the target address. */ ipv6_addr_copy(&msg->target, solicited_addr); @@ -847,10 +847,7 @@ static void ndisc_recv_ns(struct sk_buff *skb) goto out; } - if (pneigh) - is_router = pneigh->flags & NTF_ROUTER; - else - is_router = idev->cnf.forwarding; + is_router = !!(pneigh ? pneigh->flags & NTF_ROUTER : idev->cnf.forwarding); if (dad) { struct in6_addr maddr; -- cgit v0.10.2 From 55ebaef1d5db9c1c76ba01a87fd986db5dee550d Mon Sep 17 00:00:00 2001 From: Noriaki TAKAMIYA Date: Fri, 22 Sep 2006 14:45:27 -0700 Subject: [IPV6] ADDRCONF: Allow non-DAD'able addresses. IFA_F_NODAD flag, similar to IN6_IFF_NODAD in BSDs, is introduced to skip DAD. This flag should be set to Mobile IPv6 Home Address(es) on Mobile Node because DAD would fail if we should perform DAD; our Home Agent protects our Home Address(es). Signed-off-by: Noriaki TAKAMIYA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/linux/if_addr.h b/include/linux/if_addr.h index e159045..ca24b9d 100644 --- a/include/linux/if_addr.h +++ b/include/linux/if_addr.h @@ -38,6 +38,7 @@ enum #define IFA_F_SECONDARY 0x01 #define IFA_F_TEMPORARY IFA_F_SECONDARY +#define IFA_F_NODAD 0x02 #define IFA_F_DEPRECATED 0x20 #define IFA_F_TENTATIVE 0x40 #define IFA_F_PERMANENT 0x80 diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index c09ebb7..adb583a 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -1873,12 +1873,11 @@ err_exit: * Manual configuration of address on an interface */ static int inet6_addr_add(int ifindex, struct in6_addr *pfx, int plen, - __u32 prefered_lft, __u32 valid_lft) + __u8 ifa_flags, __u32 prefered_lft, __u32 valid_lft) { struct inet6_ifaddr *ifp; struct inet6_dev *idev; struct net_device *dev; - __u8 ifa_flags = 0; int scope; ASSERT_RTNL(); @@ -1971,7 +1970,7 @@ int addrconf_add_ifaddr(void __user *arg) rtnl_lock(); err = inet6_addr_add(ireq.ifr6_ifindex, &ireq.ifr6_addr, ireq.ifr6_prefixlen, - INFINITY_LIFE_TIME, INFINITY_LIFE_TIME); + IFA_F_PERMANENT, INFINITY_LIFE_TIME, INFINITY_LIFE_TIME); rtnl_unlock(); return err; } @@ -2514,7 +2513,8 @@ static void addrconf_dad_start(struct inet6_ifaddr *ifp, u32 flags) spin_lock_bh(&ifp->lock); if (dev->flags&(IFF_NOARP|IFF_LOOPBACK) || - !(ifp->flags&IFA_F_TENTATIVE)) { + !(ifp->flags&IFA_F_TENTATIVE) || + ifp->flags & IFA_F_NODAD) { ifp->flags &= ~IFA_F_TENTATIVE; spin_unlock_bh(&ifp->lock); read_unlock_bh(&idev->lock); @@ -2912,28 +2912,25 @@ inet6_rtm_deladdr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) return inet6_addr_del(ifm->ifa_index, pfx, ifm->ifa_prefixlen); } -static int inet6_addr_modify(struct inet6_ifaddr *ifp, u32 prefered_lft, - u32 valid_lft) +static int inet6_addr_modify(struct inet6_ifaddr *ifp, u8 ifa_flags, + u32 prefered_lft, u32 valid_lft) { - int ifa_flags = 0; - if (!valid_lft || (prefered_lft > valid_lft)) return -EINVAL; if (valid_lft == INFINITY_LIFE_TIME) - ifa_flags = IFA_F_PERMANENT; + ifa_flags |= IFA_F_PERMANENT; else if (valid_lft >= 0x7FFFFFFF/HZ) valid_lft = 0x7FFFFFFF/HZ; if (prefered_lft == 0) - ifa_flags = IFA_F_DEPRECATED; + ifa_flags |= IFA_F_DEPRECATED; else if ((prefered_lft >= 0x7FFFFFFF/HZ) && (prefered_lft != INFINITY_LIFE_TIME)) prefered_lft = 0x7FFFFFFF/HZ; spin_lock_bh(&ifp->lock); - ifp->flags = (ifp->flags & ~(IFA_F_DEPRECATED|IFA_F_PERMANENT)) | ifa_flags; - + ifp->flags = (ifp->flags & ~(IFA_F_DEPRECATED | IFA_F_PERMANENT | IFA_F_NODAD)) | ifa_flags; ifp->tstamp = jiffies; ifp->valid_lft = valid_lft; ifp->prefered_lft = prefered_lft; @@ -2955,7 +2952,8 @@ inet6_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) struct in6_addr *pfx; struct inet6_ifaddr *ifa; struct net_device *dev; - u32 valid_lft, preferred_lft; + u32 valid_lft = INFINITY_LIFE_TIME, preferred_lft = INFINITY_LIFE_TIME; + u8 ifa_flags; int err; err = nlmsg_parse(nlh, sizeof(*ifm), tb, IFA_MAX, ifa_ipv6_policy); @@ -2982,6 +2980,9 @@ inet6_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) if (dev == NULL) return -ENODEV; + /* We ignore other flags so far. */ + ifa_flags = ifm->ifa_flags & IFA_F_NODAD; + ifa = ipv6_get_ifaddr(pfx, dev, 1); if (ifa == NULL) { /* @@ -2989,14 +2990,14 @@ inet6_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) * userspace alreay relies on not having to provide this. */ return inet6_addr_add(ifm->ifa_index, pfx, ifm->ifa_prefixlen, - preferred_lft, valid_lft); + ifa_flags, preferred_lft, valid_lft); } if (nlh->nlmsg_flags & NLM_F_EXCL || !(nlh->nlmsg_flags & NLM_F_REPLACE)) err = -EEXIST; else - err = inet6_addr_modify(ifa, preferred_lft, valid_lft); + err = inet6_addr_modify(ifa, ifa_flags, preferred_lft, valid_lft); in6_ifa_put(ifa); -- cgit v0.10.2 From 3b9f9a1c3903b64c38505f9fed3bb11e48dbc931 Mon Sep 17 00:00:00 2001 From: Noriaki TAKAMIYA Date: Fri, 22 Sep 2006 14:45:56 -0700 Subject: [IPV6] ADDRCONF: Mobile IPv6 Home Address support. IFA_F_HOMEADDRESS is introduced for Mobile IPv6 Home Addresses on Mobile Node. The IFA_F_HOMEADDRESS flag should be set for Mobile IPv6 Home Addresses for 2 purposes. 1) We need to check this on receipt of Type 2 Routing Header (RFC3775 Secion 6.4), 2) We prefer Home Address(es) in source address selection (RFC3484 Section 5 Rule 4). Signed-off-by: Noriaki TAKAMIYA Signed-off-by: YOSHIFUJI Hideaki Signed-off-by: David S. Miller diff --git a/include/linux/if_addr.h b/include/linux/if_addr.h index ca24b9d..dbe8f61 100644 --- a/include/linux/if_addr.h +++ b/include/linux/if_addr.h @@ -39,6 +39,7 @@ enum #define IFA_F_TEMPORARY IFA_F_SECONDARY #define IFA_F_NODAD 0x02 +#define IFA_F_HOMEADDRESS 0x10 #define IFA_F_DEPRECATED 0x20 #define IFA_F_TENTATIVE 0x40 #define IFA_F_PERMANENT 0x80 diff --git a/include/net/addrconf.h b/include/net/addrconf.h index aa2ed8f..44f1b67 100644 --- a/include/net/addrconf.h +++ b/include/net/addrconf.h @@ -61,12 +61,8 @@ extern int addrconf_set_dstaddr(void __user *arg); extern int ipv6_chk_addr(struct in6_addr *addr, struct net_device *dev, int strict); -/* XXX: this is a placeholder till addrconf supports */ #ifdef CONFIG_IPV6_MIP6 -static inline int ipv6_chk_home_addr(struct in6_addr *addr) -{ - return 0; -} +extern int ipv6_chk_home_addr(struct in6_addr *addr); #endif extern struct inet6_ifaddr * ipv6_get_ifaddr(struct in6_addr *addr, struct net_device *dev, diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index adb583a..c186763 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c @@ -1038,9 +1038,27 @@ int ipv6_dev_get_saddr(struct net_device *daddr_dev, continue; } - /* Rule 4: Prefer home address -- not implemented yet */ + /* Rule 4: Prefer home address */ +#ifdef CONFIG_IPV6_MIP6 + if (hiscore.rule < 4) { + if (ifa_result->flags & IFA_F_HOMEADDRESS) + hiscore.attrs |= IPV6_SADDR_SCORE_HOA; + hiscore.rule++; + } + if (ifa->flags & IFA_F_HOMEADDRESS) { + score.attrs |= IPV6_SADDR_SCORE_HOA; + if (!(ifa_result->flags & IFA_F_HOMEADDRESS)) { + score.rule = 4; + goto record_it; + } + } else { + if (hiscore.attrs & IPV6_SADDR_SCORE_HOA) + continue; + } +#else if (hiscore.rule < 4) hiscore.rule++; +#endif /* Rule 5: Prefer outgoing interface */ if (hiscore.rule < 5) { @@ -2759,6 +2777,26 @@ void if6_proc_exit(void) } #endif /* CONFIG_PROC_FS */ +#ifdef CONFIG_IPV6_MIP6 +/* Check if address is a home address configured on any interface. */ +int ipv6_chk_home_addr(struct in6_addr *addr) +{ + int ret = 0; + struct inet6_ifaddr * ifp; + u8 hash = ipv6_addr_hash(addr); + read_lock_bh(&addrconf_hash_lock); + for (ifp = inet6_addr_lst[hash]; ifp; ifp = ifp->lst_next) { + if (ipv6_addr_cmp(&ifp->addr, addr) == 0 && + (ifp->flags & IFA_F_HOMEADDRESS)) { + ret = 1; + break; + } + } + read_unlock_bh(&addrconf_hash_lock); + return ret; +} +#endif + /* * Periodic address status verification */ @@ -2930,7 +2968,7 @@ static int inet6_addr_modify(struct inet6_ifaddr *ifp, u8 ifa_flags, prefered_lft = 0x7FFFFFFF/HZ; spin_lock_bh(&ifp->lock); - ifp->flags = (ifp->flags & ~(IFA_F_DEPRECATED | IFA_F_PERMANENT | IFA_F_NODAD)) | ifa_flags; + ifp->flags = (ifp->flags & ~(IFA_F_DEPRECATED | IFA_F_PERMANENT | IFA_F_NODAD | IFA_F_HOMEADDRESS)) | ifa_flags; ifp->tstamp = jiffies; ifp->valid_lft = valid_lft; ifp->prefered_lft = prefered_lft; @@ -2981,7 +3019,7 @@ inet6_rtm_newaddr(struct sk_buff *skb, struct nlmsghdr *nlh, void *arg) return -ENODEV; /* We ignore other flags so far. */ - ifa_flags = ifm->ifa_flags & IFA_F_NODAD; + ifa_flags = ifm->ifa_flags & (IFA_F_NODAD | IFA_F_HOMEADDRESS); ifa = ipv6_get_ifaddr(pfx, dev, 1); if (ifa == NULL) { -- cgit v0.10.2 From 1c3c07e9f6cc50dab2aeb8051325e317d4f6c70e Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Tue, 25 Jul 2006 11:28:18 -0400 Subject: NFS: Add a new ACCESS rpc call cache to the linux nfs client The current access cache only allows one entry at a time to be cached for each inode. Add a per-inode red-black tree in order to allow more than one to be cached at a time. Should significantly cut down the time spent in path traversal for shared directories such as ${PATH}, /usr/share, etc. Signed-off-by: Trond Myklebust diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index e7ffb4d..094afde 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1638,35 +1638,134 @@ out: return error; } -int nfs_access_get_cached(struct inode *inode, struct rpc_cred *cred, struct nfs_access_entry *res) +static void nfs_access_free_entry(struct nfs_access_entry *entry) +{ + put_rpccred(entry->cred); + kfree(entry); +} + +static void __nfs_access_zap_cache(struct inode *inode) { struct nfs_inode *nfsi = NFS_I(inode); - struct nfs_access_entry *cache = &nfsi->cache_access; + struct rb_root *root_node = &nfsi->access_cache; + struct rb_node *n, *dispose = NULL; + struct nfs_access_entry *entry; + + /* Unhook entries from the cache */ + while ((n = rb_first(root_node)) != NULL) { + entry = rb_entry(n, struct nfs_access_entry, rb_node); + rb_erase(n, root_node); + n->rb_left = dispose; + dispose = n; + } + nfsi->cache_validity &= ~NFS_INO_INVALID_ACCESS; + spin_unlock(&inode->i_lock); - if (cache->cred != cred - || time_after(jiffies, cache->jiffies + NFS_ATTRTIMEO(inode)) - || (nfsi->cache_validity & NFS_INO_INVALID_ACCESS)) - return -ENOENT; - memcpy(res, cache, sizeof(*res)); - return 0; + /* Now kill them all! */ + while (dispose != NULL) { + n = dispose; + dispose = n->rb_left; + nfs_access_free_entry(rb_entry(n, struct nfs_access_entry, rb_node)); + } } -void nfs_access_add_cache(struct inode *inode, struct nfs_access_entry *set) +void nfs_access_zap_cache(struct inode *inode) { - struct nfs_inode *nfsi = NFS_I(inode); - struct nfs_access_entry *cache = &nfsi->cache_access; + spin_lock(&inode->i_lock); + /* This will release the spinlock */ + __nfs_access_zap_cache(inode); +} - if (cache->cred != set->cred) { - if (cache->cred) - put_rpccred(cache->cred); - cache->cred = get_rpccred(set->cred); +static struct nfs_access_entry *nfs_access_search_rbtree(struct inode *inode, struct rpc_cred *cred) +{ + struct rb_node *n = NFS_I(inode)->access_cache.rb_node; + struct nfs_access_entry *entry; + + while (n != NULL) { + entry = rb_entry(n, struct nfs_access_entry, rb_node); + + if (cred < entry->cred) + n = n->rb_left; + else if (cred > entry->cred) + n = n->rb_right; + else + return entry; } - /* FIXME: replace current access_cache BKL reliance with inode->i_lock */ + return NULL; +} + +int nfs_access_get_cached(struct inode *inode, struct rpc_cred *cred, struct nfs_access_entry *res) +{ + struct nfs_inode *nfsi = NFS_I(inode); + struct nfs_access_entry *cache; + int err = -ENOENT; + spin_lock(&inode->i_lock); - nfsi->cache_validity &= ~NFS_INO_INVALID_ACCESS; + if (nfsi->cache_validity & NFS_INO_INVALID_ACCESS) + goto out_zap; + cache = nfs_access_search_rbtree(inode, cred); + if (cache == NULL) + goto out; + if (time_after(jiffies, cache->jiffies + NFS_ATTRTIMEO(inode))) + goto out_stale; + res->jiffies = cache->jiffies; + res->cred = cache->cred; + res->mask = cache->mask; + err = 0; +out: + spin_unlock(&inode->i_lock); + return err; +out_stale: + rb_erase(&cache->rb_node, &nfsi->access_cache); + spin_unlock(&inode->i_lock); + nfs_access_free_entry(cache); + return -ENOENT; +out_zap: + /* This will release the spinlock */ + __nfs_access_zap_cache(inode); + return -ENOENT; +} + +static void nfs_access_add_rbtree(struct inode *inode, struct nfs_access_entry *set) +{ + struct rb_root *root_node = &NFS_I(inode)->access_cache; + struct rb_node **p = &root_node->rb_node; + struct rb_node *parent = NULL; + struct nfs_access_entry *entry; + + spin_lock(&inode->i_lock); + while (*p != NULL) { + parent = *p; + entry = rb_entry(parent, struct nfs_access_entry, rb_node); + + if (set->cred < entry->cred) + p = &parent->rb_left; + else if (set->cred > entry->cred) + p = &parent->rb_right; + else + goto found; + } + rb_link_node(&set->rb_node, parent, p); + rb_insert_color(&set->rb_node, root_node); spin_unlock(&inode->i_lock); + return; +found: + rb_replace_node(parent, &set->rb_node, root_node); + spin_unlock(&inode->i_lock); + nfs_access_free_entry(entry); +} + +void nfs_access_add_cache(struct inode *inode, struct nfs_access_entry *set) +{ + struct nfs_access_entry *cache = kmalloc(sizeof(*cache), GFP_KERNEL); + if (cache == NULL) + return; + RB_CLEAR_NODE(&cache->rb_node); cache->jiffies = set->jiffies; + cache->cred = get_rpccred(set->cred); cache->mask = set->mask; + + nfs_access_add_rbtree(inode, cache); } static int nfs_do_access(struct inode *inode, struct rpc_cred *cred, int mask) diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index d349fb2..b94ab06 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -76,19 +76,14 @@ int nfs_write_inode(struct inode *inode, int sync) void nfs_clear_inode(struct inode *inode) { - struct nfs_inode *nfsi = NFS_I(inode); - struct rpc_cred *cred; - /* * The following should never happen... */ BUG_ON(nfs_have_writebacks(inode)); - BUG_ON (!list_empty(&nfsi->open_files)); + BUG_ON(!list_empty(&NFS_I(inode)->open_files)); + BUG_ON(atomic_read(&NFS_I(inode)->data_updates) != 0); nfs_zap_acl_cache(inode); - cred = nfsi->cache_access.cred; - if (cred) - put_rpccred(cred); - BUG_ON(atomic_read(&nfsi->data_updates) != 0); + nfs_access_zap_cache(inode); } /** @@ -290,7 +285,7 @@ nfs_fhget(struct super_block *sb, struct nfs_fh *fh, struct nfs_fattr *fattr) nfsi->attrtimeo = NFS_MINATTRTIMEO(inode); nfsi->attrtimeo_timestamp = jiffies; memset(nfsi->cookieverf, 0, sizeof(nfsi->cookieverf)); - nfsi->cache_access.cred = NULL; + nfsi->access_cache = RB_ROOT; unlock_new_inode(inode); } else diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index 6c2066c..cc013ed 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -42,6 +42,7 @@ #include #include #include +#include #include #include @@ -69,6 +70,7 @@ * NFSv3/v4 Access mode cache entry */ struct nfs_access_entry { + struct rb_node rb_node; unsigned long jiffies; struct rpc_cred * cred; int mask; @@ -145,7 +147,7 @@ struct nfs_inode { */ atomic_t data_updates; - struct nfs_access_entry cache_access; + struct rb_root access_cache; #ifdef CONFIG_NFS_V3_ACL struct posix_acl *acl_access; struct posix_acl *acl_default; @@ -297,6 +299,7 @@ extern int nfs_getattr(struct vfsmount *, struct dentry *, struct kstat *); extern int nfs_permission(struct inode *, int, struct nameidata *); extern int nfs_access_get_cached(struct inode *, struct rpc_cred *, struct nfs_access_entry *); extern void nfs_access_add_cache(struct inode *, struct nfs_access_entry *); +extern void nfs_access_zap_cache(struct inode *inode); extern int nfs_open(struct inode *, struct file *); extern int nfs_release(struct inode *, struct file *); extern int nfs_attribute_timeout(struct inode *inode); -- cgit v0.10.2 From cfcea3e8c66c2dcde98d5c2693d4bff50b5cac97 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Tue, 25 Jul 2006 11:28:18 -0400 Subject: NFS: Add a global LRU list for the ACCESS cache ...in order to allow the addition of a memory shrinker. Signed-off-by: Trond Myklebust diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 094afde..bf4f5ff 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1638,10 +1638,17 @@ out: return error; } +static DEFINE_SPINLOCK(nfs_access_lru_lock); +static LIST_HEAD(nfs_access_lru_list); +static atomic_long_t nfs_access_nr_entries; + static void nfs_access_free_entry(struct nfs_access_entry *entry) { put_rpccred(entry->cred); kfree(entry); + smp_mb__before_atomic_dec(); + atomic_long_dec(&nfs_access_nr_entries); + smp_mb__after_atomic_dec(); } static void __nfs_access_zap_cache(struct inode *inode) @@ -1655,6 +1662,7 @@ static void __nfs_access_zap_cache(struct inode *inode) while ((n = rb_first(root_node)) != NULL) { entry = rb_entry(n, struct nfs_access_entry, rb_node); rb_erase(n, root_node); + list_del(&entry->lru); n->rb_left = dispose; dispose = n; } @@ -1671,6 +1679,13 @@ static void __nfs_access_zap_cache(struct inode *inode) void nfs_access_zap_cache(struct inode *inode) { + /* Remove from global LRU init */ + if (test_and_clear_bit(NFS_INO_ACL_LRU_SET, &NFS_FLAGS(inode))) { + spin_lock(&nfs_access_lru_lock); + list_del_init(&NFS_I(inode)->access_cache_inode_lru); + spin_unlock(&nfs_access_lru_lock); + } + spin_lock(&inode->i_lock); /* This will release the spinlock */ __nfs_access_zap_cache(inode); @@ -1711,12 +1726,14 @@ int nfs_access_get_cached(struct inode *inode, struct rpc_cred *cred, struct nfs res->jiffies = cache->jiffies; res->cred = cache->cred; res->mask = cache->mask; + list_move_tail(&cache->lru, &nfsi->access_cache_entry_lru); err = 0; out: spin_unlock(&inode->i_lock); return err; out_stale: rb_erase(&cache->rb_node, &nfsi->access_cache); + list_del(&cache->lru); spin_unlock(&inode->i_lock); nfs_access_free_entry(cache); return -ENOENT; @@ -1728,7 +1745,8 @@ out_zap: static void nfs_access_add_rbtree(struct inode *inode, struct nfs_access_entry *set) { - struct rb_root *root_node = &NFS_I(inode)->access_cache; + struct nfs_inode *nfsi = NFS_I(inode); + struct rb_root *root_node = &nfsi->access_cache; struct rb_node **p = &root_node->rb_node; struct rb_node *parent = NULL; struct nfs_access_entry *entry; @@ -1747,10 +1765,13 @@ static void nfs_access_add_rbtree(struct inode *inode, struct nfs_access_entry * } rb_link_node(&set->rb_node, parent, p); rb_insert_color(&set->rb_node, root_node); + list_add_tail(&set->lru, &nfsi->access_cache_entry_lru); spin_unlock(&inode->i_lock); return; found: rb_replace_node(parent, &set->rb_node, root_node); + list_add_tail(&set->lru, &nfsi->access_cache_entry_lru); + list_del(&entry->lru); spin_unlock(&inode->i_lock); nfs_access_free_entry(entry); } @@ -1766,6 +1787,18 @@ void nfs_access_add_cache(struct inode *inode, struct nfs_access_entry *set) cache->mask = set->mask; nfs_access_add_rbtree(inode, cache); + + /* Update accounting */ + smp_mb__before_atomic_inc(); + atomic_long_inc(&nfs_access_nr_entries); + smp_mb__after_atomic_inc(); + + /* Add inode to global LRU list */ + if (!test_and_set_bit(NFS_INO_ACL_LRU_SET, &NFS_FLAGS(inode))) { + spin_lock(&nfs_access_lru_lock); + list_add_tail(&NFS_I(inode)->access_cache_inode_lru, &nfs_access_lru_list); + spin_unlock(&nfs_access_lru_lock); + } } static int nfs_do_access(struct inode *inode, struct rpc_cred *cred, int mask) diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index b94ab06..6ed018c 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -1104,6 +1104,8 @@ static void init_once(void * foo, kmem_cache_t * cachep, unsigned long flags) INIT_LIST_HEAD(&nfsi->dirty); INIT_LIST_HEAD(&nfsi->commit); INIT_LIST_HEAD(&nfsi->open_files); + INIT_LIST_HEAD(&nfsi->access_cache_entry_lru); + INIT_LIST_HEAD(&nfsi->access_cache_inode_lru); INIT_RADIX_TREE(&nfsi->nfs_page_tree, GFP_ATOMIC); atomic_set(&nfsi->data_updates, 0); nfsi->ndirty = 0; diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index cc013ed..a36e01cd 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -71,6 +71,7 @@ */ struct nfs_access_entry { struct rb_node rb_node; + struct list_head lru; unsigned long jiffies; struct rpc_cred * cred; int mask; @@ -148,6 +149,8 @@ struct nfs_inode { atomic_t data_updates; struct rb_root access_cache; + struct list_head access_cache_entry_lru; + struct list_head access_cache_inode_lru; #ifdef CONFIG_NFS_V3_ACL struct posix_acl *acl_access; struct posix_acl *acl_default; @@ -201,6 +204,7 @@ struct nfs_inode { #define NFS_INO_REVALIDATING (0) /* revalidating attrs */ #define NFS_INO_ADVISE_RDPLUS (1) /* advise readdirplus */ #define NFS_INO_STALE (2) /* possible stale inode */ +#define NFS_INO_ACL_LRU_SET (3) /* Inode is on the LRU list */ static inline struct nfs_inode *NFS_I(struct inode *inode) { -- cgit v0.10.2 From 979df72e6f963b42ee484f2eca049c3344da0ba7 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Tue, 25 Jul 2006 11:28:19 -0400 Subject: NFS: Add an ACCESS cache memory shrinker A pinned inode may in theory end up filling memory with cached ACCESS calls. This patch ensures that the VM may shrink away the cache in these particular cases. The shrinker works by iterating through the list of inodes on the global nfs_access_lru_list, and removing the least recently used access cache entry until it is done (or until the entire cache is empty). Signed-off-by: Trond Myklebust diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index bf4f5ff..067d144 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1651,6 +1651,50 @@ static void nfs_access_free_entry(struct nfs_access_entry *entry) smp_mb__after_atomic_dec(); } +int nfs_access_cache_shrinker(int nr_to_scan, gfp_t gfp_mask) +{ + LIST_HEAD(head); + struct nfs_inode *nfsi; + struct nfs_access_entry *cache; + + spin_lock(&nfs_access_lru_lock); +restart: + list_for_each_entry(nfsi, &nfs_access_lru_list, access_cache_inode_lru) { + struct inode *inode; + + if (nr_to_scan-- == 0) + break; + inode = igrab(&nfsi->vfs_inode); + if (inode == NULL) + continue; + spin_lock(&inode->i_lock); + if (list_empty(&nfsi->access_cache_entry_lru)) + goto remove_lru_entry; + cache = list_entry(nfsi->access_cache_entry_lru.next, + struct nfs_access_entry, lru); + list_move(&cache->lru, &head); + rb_erase(&cache->rb_node, &nfsi->access_cache); + if (!list_empty(&nfsi->access_cache_entry_lru)) + list_move_tail(&nfsi->access_cache_inode_lru, + &nfs_access_lru_list); + else { +remove_lru_entry: + list_del_init(&nfsi->access_cache_inode_lru); + clear_bit(NFS_INO_ACL_LRU_SET, &nfsi->flags); + } + spin_unlock(&inode->i_lock); + iput(inode); + goto restart; + } + spin_unlock(&nfs_access_lru_lock); + while (!list_empty(&head)) { + cache = list_entry(head.next, struct nfs_access_entry, lru); + list_del(&cache->lru); + nfs_access_free_entry(cache); + } + return (atomic_long_read(&nfs_access_nr_entries) / 100) * sysctl_vfs_cache_pressure; +} + static void __nfs_access_zap_cache(struct inode *inode) { struct nfs_inode *nfsi = NFS_I(inode); diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index e4f4e5d..660e9ff 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -66,6 +66,9 @@ extern int nfs4_proc_fs_locations(struct inode *dir, struct dentry *dentry, struct page *page); #endif +/* dir.c */ +extern int nfs_access_cache_shrinker(int nr_to_scan, gfp_t gfp_mask); + /* inode.c */ extern struct inode *nfs_alloc_inode(struct super_block *sb); extern void nfs_destroy_inode(struct inode *); diff --git a/fs/nfs/super.c b/fs/nfs/super.c index e8a9bee..06c321be 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -221,6 +221,8 @@ module_param_call(idmap_cache_timeout, param_set_idmap_timeout, param_get_int, &nfs_idmap_cache_timeout, 0644); #endif +static struct shrinker *acl_shrinker; + /* * Register the NFS filesystems */ @@ -240,6 +242,7 @@ int __init register_nfs_fs(void) if (ret < 0) goto error_2; #endif + acl_shrinker = set_shrinker(DEFAULT_SEEKS, nfs_access_cache_shrinker); return 0; #ifdef CONFIG_NFS_V4 @@ -257,6 +260,8 @@ error_0: */ void __exit unregister_nfs_fs(void) { + if (acl_shrinker != NULL) + remove_shrinker(acl_shrinker); #ifdef CONFIG_NFS_V4 unregister_filesystem(&nfs4_fs_type); nfs_unregister_sysctl(); -- cgit v0.10.2 From 770bfad846ab6628444428467b11fa6773ae9ea1 Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:07 -0400 Subject: NFS: Add dentry materialisation op The attached patch adds a new directory cache management function that prepares a disconnected anonymous function to be connected into the dentry tree. The anonymous dentry is transferred the name and parentage from another dentry. The following changes were made in [try #2]: (*) d_materialise_dentry() now switches the parentage of the two nodes around correctly when one or other of them is self-referential. The following changes were made in [try #7]: (*) d_instantiate_unique() has had the interior part split out as function __d_instantiate_unique(). Callers of this latter function must be holding the appropriate locks. (*) _d_rehash() has been added as a wrapper around __d_rehash() to call it with the most obvious hash list (the one from the name). d_rehash() now calls _d_rehash(). (*) d_materialise_dentry() is now __d_materialise_dentry() and is static. (*) d_materialise_unique() added to perform the combination of d_find_alias(), d_materialise_dentry() and d_add_unique() that the NFS client was doing twice, all within a single dcache_lock critical section. This reduces the number of times two different spinlocks were being accessed. The following further changes were made: (*) Add the dentries onto their parents d_subdirs lists. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/dcache.c b/fs/dcache.c index 1b4a3a3..17b392a 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -828,17 +828,19 @@ void d_instantiate(struct dentry *entry, struct inode * inode) * (or otherwise set) by the caller to indicate that it is now * in use by the dcache. */ -struct dentry *d_instantiate_unique(struct dentry *entry, struct inode *inode) +static struct dentry *__d_instantiate_unique(struct dentry *entry, + struct inode *inode) { struct dentry *alias; int len = entry->d_name.len; const char *name = entry->d_name.name; unsigned int hash = entry->d_name.hash; - BUG_ON(!list_empty(&entry->d_alias)); - spin_lock(&dcache_lock); - if (!inode) - goto do_negative; + if (!inode) { + entry->d_inode = NULL; + return NULL; + } + list_for_each_entry(alias, &inode->i_dentry, d_alias) { struct qstr *qstr = &alias->d_name; @@ -851,19 +853,35 @@ struct dentry *d_instantiate_unique(struct dentry *entry, struct inode *inode) if (memcmp(qstr->name, name, len)) continue; dget_locked(alias); - spin_unlock(&dcache_lock); - BUG_ON(!d_unhashed(alias)); - iput(inode); return alias; } + list_add(&entry->d_alias, &inode->i_dentry); -do_negative: entry->d_inode = inode; fsnotify_d_instantiate(entry, inode); - spin_unlock(&dcache_lock); - security_d_instantiate(entry, inode); return NULL; } + +struct dentry *d_instantiate_unique(struct dentry *entry, struct inode *inode) +{ + struct dentry *result; + + BUG_ON(!list_empty(&entry->d_alias)); + + spin_lock(&dcache_lock); + result = __d_instantiate_unique(entry, inode); + spin_unlock(&dcache_lock); + + if (!result) { + security_d_instantiate(entry, inode); + return NULL; + } + + BUG_ON(!d_unhashed(result)); + iput(inode); + return result; +} + EXPORT_SYMBOL(d_instantiate_unique); /** @@ -1235,6 +1253,11 @@ static void __d_rehash(struct dentry * entry, struct hlist_head *list) hlist_add_head_rcu(&entry->d_hash, list); } +static void _d_rehash(struct dentry * entry) +{ + __d_rehash(entry, d_hash(entry->d_parent, entry->d_name.hash)); +} + /** * d_rehash - add an entry back to the hash * @entry: dentry to add to the hash @@ -1244,11 +1267,9 @@ static void __d_rehash(struct dentry * entry, struct hlist_head *list) void d_rehash(struct dentry * entry) { - struct hlist_head *list = d_hash(entry->d_parent, entry->d_name.hash); - spin_lock(&dcache_lock); spin_lock(&entry->d_lock); - __d_rehash(entry, list); + _d_rehash(entry); spin_unlock(&entry->d_lock); spin_unlock(&dcache_lock); } @@ -1386,6 +1407,120 @@ already_unhashed: spin_unlock(&dcache_lock); } +/* + * Prepare an anonymous dentry for life in the superblock's dentry tree as a + * named dentry in place of the dentry to be replaced. + */ +static void __d_materialise_dentry(struct dentry *dentry, struct dentry *anon) +{ + struct dentry *dparent, *aparent; + + switch_names(dentry, anon); + do_switch(dentry->d_name.len, anon->d_name.len); + do_switch(dentry->d_name.hash, anon->d_name.hash); + + dparent = dentry->d_parent; + aparent = anon->d_parent; + + dentry->d_parent = (aparent == anon) ? dentry : aparent; + list_del(&dentry->d_u.d_child); + if (!IS_ROOT(dentry)) + list_add(&dentry->d_u.d_child, &dentry->d_parent->d_subdirs); + else + INIT_LIST_HEAD(&dentry->d_u.d_child); + + anon->d_parent = (dparent == dentry) ? anon : dparent; + list_del(&anon->d_u.d_child); + if (!IS_ROOT(anon)) + list_add(&anon->d_u.d_child, &anon->d_parent->d_subdirs); + else + INIT_LIST_HEAD(&anon->d_u.d_child); + + anon->d_flags &= ~DCACHE_DISCONNECTED; +} + +/** + * d_materialise_unique - introduce an inode into the tree + * @dentry: candidate dentry + * @inode: inode to bind to the dentry, to which aliases may be attached + * + * Introduces an dentry into the tree, substituting an extant disconnected + * root directory alias in its place if there is one + */ +struct dentry *d_materialise_unique(struct dentry *dentry, struct inode *inode) +{ + struct dentry *alias, *actual; + + BUG_ON(!d_unhashed(dentry)); + + spin_lock(&dcache_lock); + + if (!inode) { + actual = dentry; + dentry->d_inode = NULL; + goto found_lock; + } + + /* See if a disconnected directory already exists as an anonymous root + * that we should splice into the tree instead */ + if (S_ISDIR(inode->i_mode) && (alias = __d_find_alias(inode, 1))) { + spin_lock(&alias->d_lock); + + /* Is this a mountpoint that we could splice into our tree? */ + if (IS_ROOT(alias)) + goto connect_mountpoint; + + if (alias->d_name.len == dentry->d_name.len && + alias->d_parent == dentry->d_parent && + memcmp(alias->d_name.name, + dentry->d_name.name, + dentry->d_name.len) == 0) + goto replace_with_alias; + + spin_unlock(&alias->d_lock); + + /* Doh! Seem to be aliasing directories for some reason... */ + dput(alias); + } + + /* Add a unique reference */ + actual = __d_instantiate_unique(dentry, inode); + if (!actual) + actual = dentry; + else if (unlikely(!d_unhashed(actual))) + goto shouldnt_be_hashed; + +found_lock: + spin_lock(&actual->d_lock); +found: + _d_rehash(actual); + spin_unlock(&actual->d_lock); + spin_unlock(&dcache_lock); + + if (actual == dentry) { + security_d_instantiate(dentry, inode); + return NULL; + } + + iput(inode); + return actual; + + /* Convert the anonymous/root alias into an ordinary dentry */ +connect_mountpoint: + __d_materialise_dentry(dentry, alias); + + /* Replace the candidate dentry with the alias in the tree */ +replace_with_alias: + __d_drop(alias); + actual = alias; + goto found; + +shouldnt_be_hashed: + spin_unlock(&dcache_lock); + BUG(); + goto shouldnt_be_hashed; +} + /** * d_path - return the path of a dentry * @dentry: dentry to report @@ -1784,6 +1919,7 @@ EXPORT_SYMBOL(d_instantiate); EXPORT_SYMBOL(d_invalidate); EXPORT_SYMBOL(d_lookup); EXPORT_SYMBOL(d_move); +EXPORT_SYMBOL_GPL(d_materialise_unique); EXPORT_SYMBOL(d_path); EXPORT_SYMBOL(d_prune_aliases); EXPORT_SYMBOL(d_rehash); diff --git a/include/linux/dcache.h b/include/linux/dcache.h index 471781f..44605be 100644 --- a/include/linux/dcache.h +++ b/include/linux/dcache.h @@ -221,6 +221,7 @@ static inline int dname_external(struct dentry *dentry) */ extern void d_instantiate(struct dentry *, struct inode *); extern struct dentry * d_instantiate_unique(struct dentry *, struct inode *); +extern struct dentry * d_materialise_unique(struct dentry *, struct inode *); extern void d_delete(struct dentry *); /* allocate/de-allocate */ -- cgit v0.10.2 From 7d4e2747a0412583526a162fbbd6edeeafcceb08 Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:07 -0400 Subject: NFS: Fix up split of fs/nfs/inode.c Fix ups for the splitting of the superblock stuff out of fs/nfs/inode.c, including: (*) Move the callback tcpport module param into callback.c. (*) Move the idmap cache timeout module param into idmap.c. (*) Changes to internal.h: (*) namespace-nfs4.c was renamed to nfs4namespace.c. (*) nfs_stat_to_errno() is in nfs2xdr.c, not nfs4xdr.c. (*) nfs4xdr.c is contingent on CONFIG_NFS_V4. (*) nfs4_path() is only uses if CONFIG_NFS_V4 is set. Plus also: (*) The sec_flavours[] table should really be const. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c index fe0a6b8..d6c4bae 100644 --- a/fs/nfs/callback.c +++ b/fs/nfs/callback.c @@ -36,6 +36,21 @@ static struct svc_program nfs4_callback_program; unsigned int nfs_callback_set_tcpport; unsigned short nfs_callback_tcpport; +static const int nfs_set_port_min = 0; +static const int nfs_set_port_max = 65535; + +static int param_set_port(const char *val, struct kernel_param *kp) +{ + char *endp; + int num = simple_strtol(val, &endp, 0); + if (endp == val || *endp || num < nfs_set_port_min || num > nfs_set_port_max) + return -EINVAL; + *((int *)kp->arg) = num; + return 0; +} + +module_param_call(callback_tcpport, param_set_port, param_get_int, + &nfs_callback_set_tcpport, 0644); /* * This is the callback kernel thread. diff --git a/fs/nfs/idmap.c b/fs/nfs/idmap.c index 07a5dd5..873deb9 100644 --- a/fs/nfs/idmap.c +++ b/fs/nfs/idmap.c @@ -57,6 +57,20 @@ /* Default cache timeout is 10 minutes */ unsigned int nfs_idmap_cache_timeout = 600 * HZ; +static int param_set_idmap_timeout(const char *val, struct kernel_param *kp) +{ + char *endp; + int num = simple_strtol(val, &endp, 0); + int jif = num * HZ; + if (endp == val || *endp || num < 0 || jif < num) + return -EINVAL; + *((int *)kp->arg) = jif; + return 0; +} + +module_param_call(idmap_cache_timeout, param_set_idmap_timeout, param_get_int, + &nfs_idmap_cache_timeout, 0644); + struct idmap_hashent { unsigned long ih_expires; __u32 ih_id; diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index 660e9ff..4802157 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -15,7 +15,7 @@ struct nfs_clone_mount { rpc_authflavor_t authflavor; }; -/* namespace-nfs4.c */ +/* nfs4namespace.c */ #ifdef CONFIG_NFS_V4 extern struct vfsmount *nfs_do_refmount(const struct vfsmount *mnt_parent, struct dentry *dentry); #else @@ -46,6 +46,7 @@ extern void nfs_destroy_directcache(void); #endif /* nfs2xdr.c */ +extern int nfs_stat_to_errno(int); extern struct rpc_procinfo nfs_procedures[]; extern u32 * nfs_decode_dirent(u32 *, struct nfs_entry *, int); @@ -54,8 +55,9 @@ extern struct rpc_procinfo nfs3_procedures[]; extern u32 *nfs3_decode_dirent(u32 *, struct nfs_entry *, int); /* nfs4xdr.c */ -extern int nfs_stat_to_errno(int); +#ifdef CONFIG_NFS_V4 extern u32 *nfs4_decode_dirent(u32 *p, struct nfs_entry *entry, int plus); +#endif /* nfs4proc.c */ #ifdef CONFIG_NFS_V4 @@ -97,15 +99,13 @@ extern char *nfs_path(const char *base, const struct dentry *dentry, /* * Determine the mount path as a string */ +#ifdef CONFIG_NFS_V4 static inline char * nfs4_path(const struct dentry *dentry, char *buffer, ssize_t buflen) { -#ifdef CONFIG_NFS_V4 return nfs_path(NFS_SB(dentry->d_sb)->mnt_path, dentry, buffer, buflen); -#else - return NULL; -#endif } +#endif /* * Determine the device name as a string diff --git a/fs/nfs/super.c b/fs/nfs/super.c index 06c321be..6349734 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -187,40 +187,6 @@ static struct super_operations nfs4_sops = { }; #endif -#ifdef CONFIG_NFS_V4 -static const int nfs_set_port_min = 0; -static const int nfs_set_port_max = 65535; - -static int param_set_port(const char *val, struct kernel_param *kp) -{ - char *endp; - int num = simple_strtol(val, &endp, 0); - if (endp == val || *endp || num < nfs_set_port_min || num > nfs_set_port_max) - return -EINVAL; - *((int *)kp->arg) = num; - return 0; -} - -module_param_call(callback_tcpport, param_set_port, param_get_int, - &nfs_callback_set_tcpport, 0644); -#endif - -#ifdef CONFIG_NFS_V4 -static int param_set_idmap_timeout(const char *val, struct kernel_param *kp) -{ - char *endp; - int num = simple_strtol(val, &endp, 0); - int jif = num * HZ; - if (endp == val || *endp || num < 0 || jif < num) - return -EINVAL; - *((int *)kp->arg) = jif; - return 0; -} - -module_param_call(idmap_cache_timeout, param_set_idmap_timeout, param_get_int, - &nfs_idmap_cache_timeout, 0644); -#endif - static struct shrinker *acl_shrinker; /* @@ -328,9 +294,12 @@ static int nfs_statfs(struct dentry *dentry, struct kstatfs *buf) } +/* + * Map the security flavour number to a name + */ static const char *nfs_pseudoflavour_to_name(rpc_authflavor_t flavour) { - static struct { + static const struct { rpc_authflavor_t flavour; const char *str; } sec_flavours[] = { @@ -1368,7 +1337,6 @@ static int nfs4_get_sb(struct file_system_type *fs_type, } s = sget(fs_type, nfs4_compare_super, nfs_set_super, server); - if (IS_ERR(s)) { error = PTR_ERR(s); goto out_free; -- cgit v0.10.2 From 0a8ea4372b2868842986118ca90912f3382e6c5a Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:08 -0400 Subject: NFS: Disambiguate nfs_stat_to_errno() Rename the NFS4 version of nfs_stat_to_errno() so that it doesn't conflict with the common one used by NFS2 and NFS3. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c index 730ec8f..1dee6ef 100644 --- a/fs/nfs/nfs4xdr.c +++ b/fs/nfs/nfs4xdr.c @@ -58,7 +58,7 @@ /* Mapping from NFS error code to "errno" error code. */ #define errno_NFSERR_IO EIO -static int nfs_stat_to_errno(int); +static int nfs4_stat_to_errno(int); /* NFSv4 COMPOUND tags are only wanted for debugging purposes */ #ifdef DEBUG @@ -2127,7 +2127,7 @@ static int decode_op_hdr(struct xdr_stream *xdr, enum nfs_opnum4 expected) } READ32(nfserr); if (nfserr != NFS_OK) - return -nfs_stat_to_errno(nfserr); + return -nfs4_stat_to_errno(nfserr); return 0; } @@ -3598,7 +3598,7 @@ static int decode_setclientid(struct xdr_stream *xdr, struct nfs4_client *clp) READ_BUF(len); return -NFSERR_CLID_INUSE; } else - return -nfs_stat_to_errno(nfserr); + return -nfs4_stat_to_errno(nfserr); return 0; } @@ -4256,7 +4256,7 @@ static int nfs4_xdr_dec_fsinfo(struct rpc_rqst *req, uint32_t *p, struct nfs_fsi if (!status) status = decode_fsinfo(&xdr, fsinfo); if (!status) - status = -nfs_stat_to_errno(hdr.status); + status = -nfs4_stat_to_errno(hdr.status); return status; } @@ -4346,7 +4346,7 @@ static int nfs4_xdr_dec_setclientid(struct rpc_rqst *req, uint32_t *p, if (!status) status = decode_setclientid(&xdr, clp); if (!status) - status = -nfs_stat_to_errno(hdr.status); + status = -nfs4_stat_to_errno(hdr.status); return status; } @@ -4368,7 +4368,7 @@ static int nfs4_xdr_dec_setclientid_confirm(struct rpc_rqst *req, uint32_t *p, s if (!status) status = decode_fsinfo(&xdr, fsinfo); if (!status) - status = -nfs_stat_to_errno(hdr.status); + status = -nfs4_stat_to_errno(hdr.status); return status; } @@ -4521,7 +4521,7 @@ static struct { * This one is used jointly by NFSv2 and NFSv3. */ static int -nfs_stat_to_errno(int stat) +nfs4_stat_to_errno(int stat) { int i; for (i = 0; nfs_errtbl[i].stat != -1; i++) { -- cgit v0.10.2 From 5ae1fbce142b67bf59e15fb1af96e88a96abde7b Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:08 -0400 Subject: NFS: Fix NFS4 callback up/down prototypes Make the nfs_callback_up()/down() prototypes just do nothing if NFS4 is not enabled. Also make the down function void type since we can't really do anything if it fails. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c index d6c4bae..b1f7dc4 100644 --- a/fs/nfs/callback.c +++ b/fs/nfs/callback.c @@ -149,10 +149,8 @@ out_err: /* * Kill the server process if it is not already up. */ -int nfs_callback_down(void) +void nfs_callback_down(void) { - int ret = 0; - lock_kernel(); mutex_lock(&nfs_callback_mutex); nfs_callback_info.users--; @@ -164,7 +162,6 @@ int nfs_callback_down(void) } while (wait_for_completion_timeout(&nfs_callback_info.stopped, 5*HZ) == 0); mutex_unlock(&nfs_callback_mutex); unlock_kernel(); - return ret; } static int nfs_callback_authenticate(struct svc_rqst *rqstp) diff --git a/fs/nfs/callback.h b/fs/nfs/callback.h index b252e7f..5676163d 100644 --- a/fs/nfs/callback.h +++ b/fs/nfs/callback.h @@ -62,8 +62,13 @@ struct cb_recallargs { extern unsigned nfs4_callback_getattr(struct cb_getattrargs *args, struct cb_getattrres *res); extern unsigned nfs4_callback_recall(struct cb_recallargs *args, void *dummy); +#ifdef CONFIG_NFS_V4 extern int nfs_callback_up(void); -extern int nfs_callback_down(void); +extern void nfs_callback_down(void); +#else +#define nfs_callback_up() (0) +#define nfs_callback_down() do {} while(0) +#endif extern unsigned int nfs_callback_set_tcpport; extern unsigned short nfs_callback_tcpport; -- cgit v0.10.2 From adfa6f980bd46974e6b32b22dd0c45e3f52063f4 Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:08 -0400 Subject: NFS: Rename struct nfs4_client to struct nfs_client Rename struct nfs4_client to struct nfs_client so that it can become the basis for a general client record for NFS2 and NFS3 in addition to NFS4. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c index b1f7dc4..1b596b6 100644 --- a/fs/nfs/callback.c +++ b/fs/nfs/callback.c @@ -167,7 +167,7 @@ void nfs_callback_down(void) static int nfs_callback_authenticate(struct svc_rqst *rqstp) { struct in_addr *addr = &rqstp->rq_addr.sin_addr; - struct nfs4_client *clp; + struct nfs_client *clp; /* Don't talk to strangers */ clp = nfs4_find_client(addr); diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c index 7719483..55d6e2e 100644 --- a/fs/nfs/callback_proc.c +++ b/fs/nfs/callback_proc.c @@ -15,7 +15,7 @@ unsigned nfs4_callback_getattr(struct cb_getattrargs *args, struct cb_getattrres *res) { - struct nfs4_client *clp; + struct nfs_client *clp; struct nfs_delegation *delegation; struct nfs_inode *nfsi; struct inode *inode; @@ -56,7 +56,7 @@ out: unsigned nfs4_callback_recall(struct cb_recallargs *args, void *dummy) { - struct nfs4_client *clp; + struct nfs_client *clp; struct inode *inode; unsigned res; diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c index 9540a31..5a1105c 100644 --- a/fs/nfs/delegation.c +++ b/fs/nfs/delegation.c @@ -114,7 +114,7 @@ void nfs_inode_reclaim_delegation(struct inode *inode, struct rpc_cred *cred, st */ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct nfs_openres *res) { - struct nfs4_client *clp = NFS_SERVER(inode)->nfs4_state; + struct nfs_client *clp = NFS_SERVER(inode)->nfs4_state; struct nfs_inode *nfsi = NFS_I(inode); struct nfs_delegation *delegation; int status = 0; @@ -176,7 +176,7 @@ static void nfs_msync_inode(struct inode *inode) */ int __nfs_inode_return_delegation(struct inode *inode) { - struct nfs4_client *clp = NFS_SERVER(inode)->nfs4_state; + struct nfs_client *clp = NFS_SERVER(inode)->nfs4_state; struct nfs_inode *nfsi = NFS_I(inode); struct nfs_delegation *delegation; int res = 0; @@ -208,7 +208,7 @@ int __nfs_inode_return_delegation(struct inode *inode) */ void nfs_return_all_delegations(struct super_block *sb) { - struct nfs4_client *clp = NFS_SB(sb)->nfs4_state; + struct nfs_client *clp = NFS_SB(sb)->nfs4_state; struct nfs_delegation *delegation; struct inode *inode; @@ -232,7 +232,7 @@ restart: int nfs_do_expire_all_delegations(void *ptr) { - struct nfs4_client *clp = ptr; + struct nfs_client *clp = ptr; struct nfs_delegation *delegation; struct inode *inode; @@ -258,7 +258,7 @@ out: module_put_and_exit(0); } -void nfs_expire_all_delegations(struct nfs4_client *clp) +void nfs_expire_all_delegations(struct nfs_client *clp) { struct task_struct *task; @@ -276,7 +276,7 @@ void nfs_expire_all_delegations(struct nfs4_client *clp) /* * Return all delegations following an NFS4ERR_CB_PATH_DOWN error. */ -void nfs_handle_cb_pathdown(struct nfs4_client *clp) +void nfs_handle_cb_pathdown(struct nfs_client *clp) { struct nfs_delegation *delegation; struct inode *inode; @@ -299,7 +299,7 @@ restart: struct recall_threadargs { struct inode *inode; - struct nfs4_client *clp; + struct nfs_client *clp; const nfs4_stateid *stateid; struct completion started; @@ -310,7 +310,7 @@ static int recall_thread(void *data) { struct recall_threadargs *args = (struct recall_threadargs *)data; struct inode *inode = igrab(args->inode); - struct nfs4_client *clp = NFS_SERVER(inode)->nfs4_state; + struct nfs_client *clp = NFS_SERVER(inode)->nfs4_state; struct nfs_inode *nfsi = NFS_I(inode); struct nfs_delegation *delegation; @@ -371,7 +371,7 @@ out_module_put: /* * Retrieve the inode associated with a delegation */ -struct inode *nfs_delegation_find_inode(struct nfs4_client *clp, const struct nfs_fh *fhandle) +struct inode *nfs_delegation_find_inode(struct nfs_client *clp, const struct nfs_fh *fhandle) { struct nfs_delegation *delegation; struct inode *res = NULL; @@ -389,7 +389,7 @@ struct inode *nfs_delegation_find_inode(struct nfs4_client *clp, const struct nf /* * Mark all delegations as needing to be reclaimed */ -void nfs_delegation_mark_reclaim(struct nfs4_client *clp) +void nfs_delegation_mark_reclaim(struct nfs_client *clp) { struct nfs_delegation *delegation; spin_lock(&clp->cl_lock); @@ -401,7 +401,7 @@ void nfs_delegation_mark_reclaim(struct nfs4_client *clp) /* * Reap all unclaimed delegations after reboot recovery is done */ -void nfs_delegation_reap_unclaimed(struct nfs4_client *clp) +void nfs_delegation_reap_unclaimed(struct nfs_client *clp) { struct nfs_delegation *delegation, *n; LIST_HEAD(head); @@ -423,7 +423,7 @@ void nfs_delegation_reap_unclaimed(struct nfs4_client *clp) int nfs4_copy_delegation_stateid(nfs4_stateid *dst, struct inode *inode) { - struct nfs4_client *clp = NFS_SERVER(inode)->nfs4_state; + struct nfs_client *clp = NFS_SERVER(inode)->nfs4_state; struct nfs_inode *nfsi = NFS_I(inode); struct nfs_delegation *delegation; int res = 0; diff --git a/fs/nfs/delegation.h b/fs/nfs/delegation.h index 3858694..2cfd4b2 100644 --- a/fs/nfs/delegation.h +++ b/fs/nfs/delegation.h @@ -29,13 +29,13 @@ void nfs_inode_reclaim_delegation(struct inode *inode, struct rpc_cred *cred, st int __nfs_inode_return_delegation(struct inode *inode); int nfs_async_inode_return_delegation(struct inode *inode, const nfs4_stateid *stateid); -struct inode *nfs_delegation_find_inode(struct nfs4_client *clp, const struct nfs_fh *fhandle); +struct inode *nfs_delegation_find_inode(struct nfs_client *clp, const struct nfs_fh *fhandle); void nfs_return_all_delegations(struct super_block *sb); -void nfs_expire_all_delegations(struct nfs4_client *clp); -void nfs_handle_cb_pathdown(struct nfs4_client *clp); +void nfs_expire_all_delegations(struct nfs_client *clp); +void nfs_handle_cb_pathdown(struct nfs_client *clp); -void nfs_delegation_mark_reclaim(struct nfs4_client *clp); -void nfs_delegation_reap_unclaimed(struct nfs4_client *clp); +void nfs_delegation_mark_reclaim(struct nfs_client *clp); +void nfs_delegation_reap_unclaimed(struct nfs_client *clp); /* NFSv4 delegation-related procedures */ int nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, const nfs4_stateid *stateid); diff --git a/fs/nfs/idmap.c b/fs/nfs/idmap.c index 873deb9..d05148e 100644 --- a/fs/nfs/idmap.c +++ b/fs/nfs/idmap.c @@ -109,7 +109,7 @@ static struct rpc_pipe_ops idmap_upcall_ops = { }; void -nfs_idmap_new(struct nfs4_client *clp) +nfs_idmap_new(struct nfs_client *clp) { struct idmap *idmap; @@ -138,7 +138,7 @@ nfs_idmap_new(struct nfs4_client *clp) } void -nfs_idmap_delete(struct nfs4_client *clp) +nfs_idmap_delete(struct nfs_client *clp) { struct idmap *idmap = clp->cl_idmap; @@ -491,27 +491,27 @@ static unsigned int fnvhash32(const void *buf, size_t buflen) return (hash); } -int nfs_map_name_to_uid(struct nfs4_client *clp, const char *name, size_t namelen, __u32 *uid) +int nfs_map_name_to_uid(struct nfs_client *clp, const char *name, size_t namelen, __u32 *uid) { struct idmap *idmap = clp->cl_idmap; return nfs_idmap_id(idmap, &idmap->idmap_user_hash, name, namelen, uid); } -int nfs_map_group_to_gid(struct nfs4_client *clp, const char *name, size_t namelen, __u32 *uid) +int nfs_map_group_to_gid(struct nfs_client *clp, const char *name, size_t namelen, __u32 *uid) { struct idmap *idmap = clp->cl_idmap; return nfs_idmap_id(idmap, &idmap->idmap_group_hash, name, namelen, uid); } -int nfs_map_uid_to_name(struct nfs4_client *clp, __u32 uid, char *buf) +int nfs_map_uid_to_name(struct nfs_client *clp, __u32 uid, char *buf) { struct idmap *idmap = clp->cl_idmap; return nfs_idmap_name(idmap, &idmap->idmap_user_hash, uid, buf); } -int nfs_map_gid_to_group(struct nfs4_client *clp, __u32 uid, char *buf) +int nfs_map_gid_to_group(struct nfs_client *clp, __u32 uid, char *buf) { struct idmap *idmap = clp->cl_idmap; diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h index 9a10286..4e334cb 100644 --- a/fs/nfs/nfs4_fs.h +++ b/fs/nfs/nfs4_fs.h @@ -43,9 +43,9 @@ enum nfs4_client_state { }; /* - * The nfs4_client identifies our client state to the server. + * The nfs_client identifies our client state to the server. */ -struct nfs4_client { +struct nfs_client { struct list_head cl_servers; /* Global list of servers */ struct in_addr cl_addr; /* Server identifier */ u64 cl_clientid; /* constant */ @@ -127,7 +127,7 @@ static inline void nfs_confirm_seqid(struct nfs_seqid_counter *seqid, int status struct nfs4_state_owner { spinlock_t so_lock; struct list_head so_list; /* per-clientid list of state_owners */ - struct nfs4_client *so_client; + struct nfs_client *so_client; u32 so_id; /* 32-bit identifier, unique */ atomic_t so_count; @@ -210,10 +210,10 @@ extern ssize_t nfs4_listxattr(struct dentry *, char *, size_t); /* nfs4proc.c */ extern int nfs4_map_errors(int err); -extern int nfs4_proc_setclientid(struct nfs4_client *, u32, unsigned short, struct rpc_cred *); -extern int nfs4_proc_setclientid_confirm(struct nfs4_client *, struct rpc_cred *); -extern int nfs4_proc_async_renew(struct nfs4_client *, struct rpc_cred *); -extern int nfs4_proc_renew(struct nfs4_client *, struct rpc_cred *); +extern int nfs4_proc_setclientid(struct nfs_client *, u32, unsigned short, struct rpc_cred *); +extern int nfs4_proc_setclientid_confirm(struct nfs_client *, struct rpc_cred *); +extern int nfs4_proc_async_renew(struct nfs_client *, struct rpc_cred *); +extern int nfs4_proc_renew(struct nfs_client *, struct rpc_cred *); extern int nfs4_do_close(struct inode *inode, struct nfs4_state *state); extern struct dentry *nfs4_atomic_open(struct inode *, struct dentry *, struct nameidata *); extern int nfs4_open_revalidate(struct inode *, struct dentry *, int, struct nameidata *); @@ -231,19 +231,19 @@ extern const u32 nfs4_fsinfo_bitmap[2]; extern const u32 nfs4_fs_locations_bitmap[2]; /* nfs4renewd.c */ -extern void nfs4_schedule_state_renewal(struct nfs4_client *); +extern void nfs4_schedule_state_renewal(struct nfs_client *); extern void nfs4_renewd_prepare_shutdown(struct nfs_server *); -extern void nfs4_kill_renewd(struct nfs4_client *); +extern void nfs4_kill_renewd(struct nfs_client *); extern void nfs4_renew_state(void *); /* nfs4state.c */ extern void init_nfsv4_state(struct nfs_server *); extern void destroy_nfsv4_state(struct nfs_server *); -extern struct nfs4_client *nfs4_get_client(struct in_addr *); -extern void nfs4_put_client(struct nfs4_client *clp); -extern struct nfs4_client *nfs4_find_client(struct in_addr *); -struct rpc_cred *nfs4_get_renew_cred(struct nfs4_client *clp); -extern u32 nfs4_alloc_lockowner_id(struct nfs4_client *); +extern struct nfs_client *nfs4_get_client(struct in_addr *); +extern void nfs4_put_client(struct nfs_client *clp); +extern struct nfs_client *nfs4_find_client(struct in_addr *); +struct rpc_cred *nfs4_get_renew_cred(struct nfs_client *clp); +extern u32 nfs4_alloc_lockowner_id(struct nfs_client *); extern struct nfs4_state_owner * nfs4_get_state_owner(struct nfs_server *, struct rpc_cred *); extern void nfs4_put_state_owner(struct nfs4_state_owner *); @@ -252,7 +252,7 @@ extern struct nfs4_state * nfs4_get_open_state(struct inode *, struct nfs4_state extern void nfs4_put_open_state(struct nfs4_state *); extern void nfs4_close_state(struct nfs4_state *, mode_t); extern void nfs4_state_set_mode_locked(struct nfs4_state *, mode_t); -extern void nfs4_schedule_state_recovery(struct nfs4_client *); +extern void nfs4_schedule_state_recovery(struct nfs_client *); extern void nfs4_put_lock_state(struct nfs4_lock_state *lsp); extern int nfs4_set_lock_state(struct nfs4_state *state, struct file_lock *fl); extern void nfs4_copy_stateid(nfs4_stateid *, struct nfs4_state *, fl_owner_t); diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index b14145b..168f3ff 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -64,7 +64,7 @@ static int nfs4_do_fsinfo(struct nfs_server *, struct nfs_fh *, struct nfs_fsinf static int nfs4_async_handle_error(struct rpc_task *, const struct nfs_server *); static int _nfs4_proc_access(struct inode *inode, struct nfs_access_entry *entry); static int nfs4_handle_exception(const struct nfs_server *server, int errorcode, struct nfs4_exception *exception); -static int nfs4_wait_clnt_recover(struct rpc_clnt *clnt, struct nfs4_client *clp); +static int nfs4_wait_clnt_recover(struct rpc_clnt *clnt, struct nfs_client *clp); /* Prevent leaks of NFSv4 errors into userland */ int nfs4_map_errors(int err) @@ -195,7 +195,7 @@ static void nfs4_setup_readdir(u64 cookie, u32 *verifier, struct dentry *dentry, static void renew_lease(const struct nfs_server *server, unsigned long timestamp) { - struct nfs4_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs4_state; spin_lock(&clp->cl_lock); if (time_before(clp->cl_last_renewal,timestamp)) clp->cl_last_renewal = timestamp; @@ -792,7 +792,7 @@ out: int nfs4_recover_expired_lease(struct nfs_server *server) { - struct nfs4_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs4_state; if (test_and_clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state)) nfs4_schedule_state_recovery(clp); @@ -867,7 +867,7 @@ static int _nfs4_open_delegated(struct inode *inode, int flags, struct rpc_cred { struct nfs_delegation *delegation; struct nfs_server *server = NFS_SERVER(inode); - struct nfs4_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs4_state; struct nfs_inode *nfsi = NFS_I(inode); struct nfs4_state_owner *sp = NULL; struct nfs4_state *state = NULL; @@ -953,7 +953,7 @@ static int _nfs4_do_open(struct inode *dir, struct dentry *dentry, int flags, st struct nfs4_state_owner *sp; struct nfs4_state *state = NULL; struct nfs_server *server = NFS_SERVER(dir); - struct nfs4_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs4_state; struct nfs4_opendata *opendata; int status; @@ -2521,7 +2521,7 @@ static void nfs4_proc_commit_setup(struct nfs_write_data *data, int how) */ static void nfs4_renew_done(struct rpc_task *task, void *data) { - struct nfs4_client *clp = (struct nfs4_client *)task->tk_msg.rpc_argp; + struct nfs_client *clp = (struct nfs_client *)task->tk_msg.rpc_argp; unsigned long timestamp = (unsigned long)data; if (task->tk_status < 0) { @@ -2543,7 +2543,7 @@ static const struct rpc_call_ops nfs4_renew_ops = { .rpc_call_done = nfs4_renew_done, }; -int nfs4_proc_async_renew(struct nfs4_client *clp, struct rpc_cred *cred) +int nfs4_proc_async_renew(struct nfs_client *clp, struct rpc_cred *cred) { struct rpc_message msg = { .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_RENEW], @@ -2555,7 +2555,7 @@ int nfs4_proc_async_renew(struct nfs4_client *clp, struct rpc_cred *cred) &nfs4_renew_ops, (void *)jiffies); } -int nfs4_proc_renew(struct nfs4_client *clp, struct rpc_cred *cred) +int nfs4_proc_renew(struct nfs_client *clp, struct rpc_cred *cred) { struct rpc_message msg = { .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_RENEW], @@ -2791,7 +2791,7 @@ static int nfs4_proc_set_acl(struct inode *inode, const void *buf, size_t buflen static int nfs4_async_handle_error(struct rpc_task *task, const struct nfs_server *server) { - struct nfs4_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs4_state; if (!clp || task->tk_status >= 0) return 0; @@ -2828,7 +2828,7 @@ static int nfs4_wait_bit_interruptible(void *word) return 0; } -static int nfs4_wait_clnt_recover(struct rpc_clnt *clnt, struct nfs4_client *clp) +static int nfs4_wait_clnt_recover(struct rpc_clnt *clnt, struct nfs_client *clp) { sigset_t oldset; int res; @@ -2871,7 +2871,7 @@ static int nfs4_delay(struct rpc_clnt *clnt, long *timeout) */ int nfs4_handle_exception(const struct nfs_server *server, int errorcode, struct nfs4_exception *exception) { - struct nfs4_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs4_state; int ret = errorcode; exception->retry = 0; @@ -2898,7 +2898,7 @@ int nfs4_handle_exception(const struct nfs_server *server, int errorcode, struct return nfs4_map_errors(ret); } -int nfs4_proc_setclientid(struct nfs4_client *clp, u32 program, unsigned short port, struct rpc_cred *cred) +int nfs4_proc_setclientid(struct nfs_client *clp, u32 program, unsigned short port, struct rpc_cred *cred) { nfs4_verifier sc_verifier; struct nfs4_setclientid setclientid = { @@ -2945,7 +2945,7 @@ int nfs4_proc_setclientid(struct nfs4_client *clp, u32 program, unsigned short p return status; } -static int _nfs4_proc_setclientid_confirm(struct nfs4_client *clp, struct rpc_cred *cred) +static int _nfs4_proc_setclientid_confirm(struct nfs_client *clp, struct rpc_cred *cred) { struct nfs_fsinfo fsinfo; struct rpc_message msg = { @@ -2969,7 +2969,7 @@ static int _nfs4_proc_setclientid_confirm(struct nfs4_client *clp, struct rpc_cr return status; } -int nfs4_proc_setclientid_confirm(struct nfs4_client *clp, struct rpc_cred *cred) +int nfs4_proc_setclientid_confirm(struct nfs_client *clp, struct rpc_cred *cred) { long timeout; int err; @@ -3106,7 +3106,7 @@ static int _nfs4_proc_getlk(struct nfs4_state *state, int cmd, struct file_lock { struct inode *inode = state->inode; struct nfs_server *server = NFS_SERVER(inode); - struct nfs4_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs4_state; struct nfs_lockt_args arg = { .fh = NFS_FH(inode), .fl = request, @@ -3513,7 +3513,7 @@ static int nfs4_lock_expired(struct nfs4_state *state, struct file_lock *request static int _nfs4_proc_setlk(struct nfs4_state *state, int cmd, struct file_lock *request) { - struct nfs4_client *clp = state->owner->so_client; + struct nfs_client *clp = state->owner->so_client; unsigned char fl_flags = request->fl_flags; int status; diff --git a/fs/nfs/nfs4renewd.c b/fs/nfs/nfs4renewd.c index 5d764d8..2087640 100644 --- a/fs/nfs/nfs4renewd.c +++ b/fs/nfs/nfs4renewd.c @@ -61,7 +61,7 @@ void nfs4_renew_state(void *data) { - struct nfs4_client *clp = (struct nfs4_client *)data; + struct nfs_client *clp = (struct nfs_client *)data; struct rpc_cred *cred; long lease, timeout; unsigned long last, now; @@ -108,7 +108,7 @@ out: /* Must be called with clp->cl_sem locked for writes */ void -nfs4_schedule_state_renewal(struct nfs4_client *clp) +nfs4_schedule_state_renewal(struct nfs_client *clp) { long timeout; @@ -127,7 +127,7 @@ nfs4_schedule_state_renewal(struct nfs4_client *clp) void nfs4_renewd_prepare_shutdown(struct nfs_server *server) { - struct nfs4_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs4_state; if (!clp) return; @@ -140,7 +140,7 @@ nfs4_renewd_prepare_shutdown(struct nfs_server *server) /* Must be called with clp->cl_sem locked for writes */ void -nfs4_kill_renewd(struct nfs4_client *clp) +nfs4_kill_renewd(struct nfs_client *clp) { down_read(&clp->cl_sem); if (!list_empty(&clp->cl_superblocks)) { diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index 090a36b..c0b6439 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -83,10 +83,10 @@ destroy_nfsv4_state(struct nfs_server *server) * Since these are allocated/deallocated very rarely, we don't * bother putting them in a slab cache... */ -static struct nfs4_client * +static struct nfs_client * nfs4_alloc_client(struct in_addr *addr) { - struct nfs4_client *clp; + struct nfs_client *clp; if (nfs_callback_up() < 0) return NULL; @@ -111,7 +111,7 @@ nfs4_alloc_client(struct in_addr *addr) } static void -nfs4_free_client(struct nfs4_client *clp) +nfs4_free_client(struct nfs_client *clp) { struct nfs4_state_owner *sp; @@ -130,9 +130,9 @@ nfs4_free_client(struct nfs4_client *clp) nfs_callback_down(); } -static struct nfs4_client *__nfs4_find_client(struct in_addr *addr) +static struct nfs_client *__nfs4_find_client(struct in_addr *addr) { - struct nfs4_client *clp; + struct nfs_client *clp; list_for_each_entry(clp, &nfs4_clientid_list, cl_servers) { if (memcmp(&clp->cl_addr, addr, sizeof(clp->cl_addr)) == 0) { atomic_inc(&clp->cl_count); @@ -142,19 +142,19 @@ static struct nfs4_client *__nfs4_find_client(struct in_addr *addr) return NULL; } -struct nfs4_client *nfs4_find_client(struct in_addr *addr) +struct nfs_client *nfs4_find_client(struct in_addr *addr) { - struct nfs4_client *clp; + struct nfs_client *clp; spin_lock(&state_spinlock); clp = __nfs4_find_client(addr); spin_unlock(&state_spinlock); return clp; } -struct nfs4_client * +struct nfs_client * nfs4_get_client(struct in_addr *addr) { - struct nfs4_client *clp, *new = NULL; + struct nfs_client *clp, *new = NULL; spin_lock(&state_spinlock); for (;;) { @@ -180,7 +180,7 @@ nfs4_get_client(struct in_addr *addr) } void -nfs4_put_client(struct nfs4_client *clp) +nfs4_put_client(struct nfs_client *clp) { if (!atomic_dec_and_lock(&clp->cl_count, &state_spinlock)) return; @@ -192,7 +192,7 @@ nfs4_put_client(struct nfs4_client *clp) nfs4_free_client(clp); } -static int nfs4_init_client(struct nfs4_client *clp, struct rpc_cred *cred) +static int nfs4_init_client(struct nfs_client *clp, struct rpc_cred *cred) { int status = nfs4_proc_setclientid(clp, NFS4_CALLBACK, nfs_callback_tcpport, cred); @@ -204,13 +204,13 @@ static int nfs4_init_client(struct nfs4_client *clp, struct rpc_cred *cred) } u32 -nfs4_alloc_lockowner_id(struct nfs4_client *clp) +nfs4_alloc_lockowner_id(struct nfs_client *clp) { return clp->cl_lockowner_id ++; } static struct nfs4_state_owner * -nfs4_client_grab_unused(struct nfs4_client *clp, struct rpc_cred *cred) +nfs4_client_grab_unused(struct nfs_client *clp, struct rpc_cred *cred) { struct nfs4_state_owner *sp = NULL; @@ -224,7 +224,7 @@ nfs4_client_grab_unused(struct nfs4_client *clp, struct rpc_cred *cred) return sp; } -struct rpc_cred *nfs4_get_renew_cred(struct nfs4_client *clp) +struct rpc_cred *nfs4_get_renew_cred(struct nfs_client *clp) { struct nfs4_state_owner *sp; struct rpc_cred *cred = NULL; @@ -238,7 +238,7 @@ struct rpc_cred *nfs4_get_renew_cred(struct nfs4_client *clp) return cred; } -struct rpc_cred *nfs4_get_setclientid_cred(struct nfs4_client *clp) +struct rpc_cred *nfs4_get_setclientid_cred(struct nfs_client *clp) { struct nfs4_state_owner *sp; @@ -251,7 +251,7 @@ struct rpc_cred *nfs4_get_setclientid_cred(struct nfs4_client *clp) } static struct nfs4_state_owner * -nfs4_find_state_owner(struct nfs4_client *clp, struct rpc_cred *cred) +nfs4_find_state_owner(struct nfs_client *clp, struct rpc_cred *cred) { struct nfs4_state_owner *sp, *res = NULL; @@ -294,7 +294,7 @@ nfs4_alloc_state_owner(void) void nfs4_drop_state_owner(struct nfs4_state_owner *sp) { - struct nfs4_client *clp = sp->so_client; + struct nfs_client *clp = sp->so_client; spin_lock(&clp->cl_lock); list_del_init(&sp->so_list); spin_unlock(&clp->cl_lock); @@ -306,7 +306,7 @@ nfs4_drop_state_owner(struct nfs4_state_owner *sp) */ struct nfs4_state_owner *nfs4_get_state_owner(struct nfs_server *server, struct rpc_cred *cred) { - struct nfs4_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs4_state; struct nfs4_state_owner *sp, *new; get_rpccred(cred); @@ -337,7 +337,7 @@ struct nfs4_state_owner *nfs4_get_state_owner(struct nfs_server *server, struct */ void nfs4_put_state_owner(struct nfs4_state_owner *sp) { - struct nfs4_client *clp = sp->so_client; + struct nfs_client *clp = sp->so_client; struct rpc_cred *cred = sp->so_cred; if (!atomic_dec_and_lock(&sp->so_count, &clp->cl_lock)) @@ -540,7 +540,7 @@ __nfs4_find_lock_state(struct nfs4_state *state, fl_owner_t fl_owner) static struct nfs4_lock_state *nfs4_alloc_lock_state(struct nfs4_state *state, fl_owner_t fl_owner) { struct nfs4_lock_state *lsp; - struct nfs4_client *clp = state->owner->so_client; + struct nfs_client *clp = state->owner->so_client; lsp = kzalloc(sizeof(*lsp), GFP_KERNEL); if (lsp == NULL) @@ -752,7 +752,7 @@ out: static int reclaimer(void *); -static inline void nfs4_clear_recover_bit(struct nfs4_client *clp) +static inline void nfs4_clear_recover_bit(struct nfs_client *clp) { smp_mb__before_clear_bit(); clear_bit(NFS4CLNT_STATE_RECOVER, &clp->cl_state); @@ -764,7 +764,7 @@ static inline void nfs4_clear_recover_bit(struct nfs4_client *clp) /* * State recovery routine */ -static void nfs4_recover_state(struct nfs4_client *clp) +static void nfs4_recover_state(struct nfs_client *clp) { struct task_struct *task; @@ -782,7 +782,7 @@ static void nfs4_recover_state(struct nfs4_client *clp) /* * Schedule a state recovery attempt */ -void nfs4_schedule_state_recovery(struct nfs4_client *clp) +void nfs4_schedule_state_recovery(struct nfs_client *clp) { if (!clp) return; @@ -879,7 +879,7 @@ out_err: return status; } -static void nfs4_state_mark_reclaim(struct nfs4_client *clp) +static void nfs4_state_mark_reclaim(struct nfs_client *clp) { struct nfs4_state_owner *sp; struct nfs4_state *state; @@ -903,7 +903,7 @@ static void nfs4_state_mark_reclaim(struct nfs4_client *clp) static int reclaimer(void *ptr) { - struct nfs4_client *clp = ptr; + struct nfs_client *clp = ptr; struct nfs4_state_owner *sp; struct nfs4_state_recovery_ops *ops; struct rpc_cred *cred; diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c index 1dee6ef..04748ab9 100644 --- a/fs/nfs/nfs4xdr.c +++ b/fs/nfs/nfs4xdr.c @@ -1160,7 +1160,7 @@ static int encode_rename(struct xdr_stream *xdr, const struct qstr *oldname, con return 0; } -static int encode_renew(struct xdr_stream *xdr, const struct nfs4_client *client_stateid) +static int encode_renew(struct xdr_stream *xdr, const struct nfs_client *client_stateid) { uint32_t *p; @@ -1246,7 +1246,7 @@ static int encode_setclientid(struct xdr_stream *xdr, const struct nfs4_setclien return 0; } -static int encode_setclientid_confirm(struct xdr_stream *xdr, const struct nfs4_client *client_state) +static int encode_setclientid_confirm(struct xdr_stream *xdr, const struct nfs_client *client_state) { uint32_t *p; @@ -1945,7 +1945,7 @@ static int nfs4_xdr_enc_server_caps(struct rpc_rqst *req, uint32_t *p, const str /* * a RENEW request */ -static int nfs4_xdr_enc_renew(struct rpc_rqst *req, uint32_t *p, struct nfs4_client *clp) +static int nfs4_xdr_enc_renew(struct rpc_rqst *req, uint32_t *p, struct nfs_client *clp) { struct xdr_stream xdr; struct compound_hdr hdr = { @@ -1975,7 +1975,7 @@ static int nfs4_xdr_enc_setclientid(struct rpc_rqst *req, uint32_t *p, struct nf /* * a SETCLIENTID_CONFIRM request */ -static int nfs4_xdr_enc_setclientid_confirm(struct rpc_rqst *req, uint32_t *p, struct nfs4_client *clp) +static int nfs4_xdr_enc_setclientid_confirm(struct rpc_rqst *req, uint32_t *p, struct nfs_client *clp) { struct xdr_stream xdr; struct compound_hdr hdr = { @@ -2132,7 +2132,7 @@ static int decode_op_hdr(struct xdr_stream *xdr, enum nfs_opnum4 expected) } /* Dummy routine */ -static int decode_ace(struct xdr_stream *xdr, void *ace, struct nfs4_client *clp) +static int decode_ace(struct xdr_stream *xdr, void *ace, struct nfs_client *clp) { uint32_t *p; unsigned int strlen; @@ -2636,7 +2636,7 @@ static int decode_attr_nlink(struct xdr_stream *xdr, uint32_t *bitmap, uint32_t return 0; } -static int decode_attr_owner(struct xdr_stream *xdr, uint32_t *bitmap, struct nfs4_client *clp, int32_t *uid) +static int decode_attr_owner(struct xdr_stream *xdr, uint32_t *bitmap, struct nfs_client *clp, int32_t *uid) { uint32_t len, *p; @@ -2660,7 +2660,7 @@ static int decode_attr_owner(struct xdr_stream *xdr, uint32_t *bitmap, struct nf return 0; } -static int decode_attr_group(struct xdr_stream *xdr, uint32_t *bitmap, struct nfs4_client *clp, int32_t *gid) +static int decode_attr_group(struct xdr_stream *xdr, uint32_t *bitmap, struct nfs_client *clp, int32_t *gid) { uint32_t len, *p; @@ -3565,7 +3565,7 @@ static int decode_setattr(struct xdr_stream *xdr, struct nfs_setattrres *res) return 0; } -static int decode_setclientid(struct xdr_stream *xdr, struct nfs4_client *clp) +static int decode_setclientid(struct xdr_stream *xdr, struct nfs_client *clp) { uint32_t *p; uint32_t opnum; @@ -4335,7 +4335,7 @@ static int nfs4_xdr_dec_renew(struct rpc_rqst *rqstp, uint32_t *p, void *dummy) * a SETCLIENTID request */ static int nfs4_xdr_dec_setclientid(struct rpc_rqst *req, uint32_t *p, - struct nfs4_client *clp) + struct nfs_client *clp) { struct xdr_stream xdr; struct compound_hdr hdr; diff --git a/fs/nfs/super.c b/fs/nfs/super.c index 6349734..d03ede5 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -1099,7 +1099,7 @@ static int nfs_clone_nfs_sb(struct file_system_type *fs_type, static struct rpc_clnt *nfs4_create_client(struct nfs_server *server, struct rpc_timeout *timeparms, int proto, rpc_authflavor_t flavor) { - struct nfs4_client *clp; + struct nfs_client *clp; struct rpc_xprt *xprt = NULL; struct rpc_clnt *clnt = NULL; int err = -EIO; @@ -1416,7 +1416,7 @@ static inline char *nfs4_dup_path(const struct dentry *dentry) static struct super_block *nfs4_clone_sb(struct nfs_server *server, struct nfs_clone_mount *data) { const struct dentry *dentry = data->dentry; - struct nfs4_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs4_state; struct super_block *sb; server->fsid = data->fattr->fsid; diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index 6b4a13c..4db90df 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -43,7 +43,7 @@ struct nfs_server { */ char ip_addr[16]; char * mnt_path; - struct nfs4_client * nfs4_state; /* all NFSv4 state starts here */ + struct nfs_client * nfs4_state; /* all NFSv4 state starts here */ struct list_head nfs4_siblings; /* List of other nfs_server structs * that share the same clientid */ diff --git a/include/linux/nfs_idmap.h b/include/linux/nfs_idmap.h index 102e560..678fe68 100644 --- a/include/linux/nfs_idmap.h +++ b/include/linux/nfs_idmap.h @@ -62,15 +62,15 @@ struct idmap_msg { #ifdef __KERNEL__ /* Forward declaration to make this header independent of others */ -struct nfs4_client; +struct nfs_client; -void nfs_idmap_new(struct nfs4_client *); -void nfs_idmap_delete(struct nfs4_client *); +void nfs_idmap_new(struct nfs_client *); +void nfs_idmap_delete(struct nfs_client *); -int nfs_map_name_to_uid(struct nfs4_client *, const char *, size_t, __u32 *); -int nfs_map_group_to_gid(struct nfs4_client *, const char *, size_t, __u32 *); -int nfs_map_uid_to_name(struct nfs4_client *, __u32, char *); -int nfs_map_gid_to_group(struct nfs4_client *, __u32, char *); +int nfs_map_name_to_uid(struct nfs_client *, const char *, size_t, __u32 *); +int nfs_map_group_to_gid(struct nfs_client *, const char *, size_t, __u32 *); +int nfs_map_uid_to_name(struct nfs_client *, __u32, char *); +int nfs_map_gid_to_group(struct nfs_client *, __u32, char *); extern unsigned int nfs_idmap_cache_timeout; #endif /* __KERNEL__ */ -- cgit v0.10.2 From 7539bbab8062aadc1db95a22b377146843cfa88f Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:09 -0400 Subject: NFS: Rename nfs_server::nfs4_state Rename nfs_server::nfs4_state to nfs_client as it will be used to represent the client state for NFS2 and NFS3 also. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c index 5a1105c..cfe2397 100644 --- a/fs/nfs/delegation.c +++ b/fs/nfs/delegation.c @@ -52,7 +52,7 @@ static int nfs_delegation_claim_locks(struct nfs_open_context *ctx, struct nfs4_ case -NFS4ERR_EXPIRED: /* kill_proc(fl->fl_pid, SIGLOST, 1); */ case -NFS4ERR_STALE_CLIENTID: - nfs4_schedule_state_recovery(NFS_SERVER(inode)->nfs4_state); + nfs4_schedule_state_recovery(NFS_SERVER(inode)->nfs_client); goto out_err; } } @@ -114,7 +114,7 @@ void nfs_inode_reclaim_delegation(struct inode *inode, struct rpc_cred *cred, st */ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct nfs_openres *res) { - struct nfs_client *clp = NFS_SERVER(inode)->nfs4_state; + struct nfs_client *clp = NFS_SERVER(inode)->nfs_client; struct nfs_inode *nfsi = NFS_I(inode); struct nfs_delegation *delegation; int status = 0; @@ -176,7 +176,7 @@ static void nfs_msync_inode(struct inode *inode) */ int __nfs_inode_return_delegation(struct inode *inode) { - struct nfs_client *clp = NFS_SERVER(inode)->nfs4_state; + struct nfs_client *clp = NFS_SERVER(inode)->nfs_client; struct nfs_inode *nfsi = NFS_I(inode); struct nfs_delegation *delegation; int res = 0; @@ -208,7 +208,7 @@ int __nfs_inode_return_delegation(struct inode *inode) */ void nfs_return_all_delegations(struct super_block *sb) { - struct nfs_client *clp = NFS_SB(sb)->nfs4_state; + struct nfs_client *clp = NFS_SB(sb)->nfs_client; struct nfs_delegation *delegation; struct inode *inode; @@ -310,7 +310,7 @@ static int recall_thread(void *data) { struct recall_threadargs *args = (struct recall_threadargs *)data; struct inode *inode = igrab(args->inode); - struct nfs_client *clp = NFS_SERVER(inode)->nfs4_state; + struct nfs_client *clp = NFS_SERVER(inode)->nfs_client; struct nfs_inode *nfsi = NFS_I(inode); struct nfs_delegation *delegation; @@ -423,7 +423,7 @@ void nfs_delegation_reap_unclaimed(struct nfs_client *clp) int nfs4_copy_delegation_stateid(nfs4_stateid *dst, struct inode *inode) { - struct nfs_client *clp = NFS_SERVER(inode)->nfs4_state; + struct nfs_client *clp = NFS_SERVER(inode)->nfs_client; struct nfs_inode *nfsi = NFS_I(inode); struct nfs_delegation *delegation; int res = 0; diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 168f3ff..b46597f 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -195,7 +195,7 @@ static void nfs4_setup_readdir(u64 cookie, u32 *verifier, struct dentry *dentry, static void renew_lease(const struct nfs_server *server, unsigned long timestamp) { - struct nfs_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs_client; spin_lock(&clp->cl_lock); if (time_before(clp->cl_last_renewal,timestamp)) clp->cl_last_renewal = timestamp; @@ -252,7 +252,7 @@ static struct nfs4_opendata *nfs4_opendata_alloc(struct dentry *dentry, atomic_inc(&sp->so_count); p->o_arg.fh = NFS_FH(dir); p->o_arg.open_flags = flags, - p->o_arg.clientid = server->nfs4_state->cl_clientid; + p->o_arg.clientid = server->nfs_client->cl_clientid; p->o_arg.id = sp->so_id; p->o_arg.name = &dentry->d_name; p->o_arg.server = server; @@ -550,7 +550,7 @@ int nfs4_open_delegation_recall(struct dentry *dentry, struct nfs4_state *state) case -NFS4ERR_STALE_STATEID: case -NFS4ERR_EXPIRED: /* Don't recall a delegation if it was lost */ - nfs4_schedule_state_recovery(server->nfs4_state); + nfs4_schedule_state_recovery(server->nfs_client); return err; } err = nfs4_handle_exception(server, err, &exception); @@ -792,7 +792,7 @@ out: int nfs4_recover_expired_lease(struct nfs_server *server) { - struct nfs_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs_client; if (test_and_clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state)) nfs4_schedule_state_recovery(clp); @@ -867,7 +867,7 @@ static int _nfs4_open_delegated(struct inode *inode, int flags, struct rpc_cred { struct nfs_delegation *delegation; struct nfs_server *server = NFS_SERVER(inode); - struct nfs_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs_client; struct nfs_inode *nfsi = NFS_I(inode); struct nfs4_state_owner *sp = NULL; struct nfs4_state *state = NULL; @@ -953,7 +953,7 @@ static int _nfs4_do_open(struct inode *dir, struct dentry *dentry, int flags, st struct nfs4_state_owner *sp; struct nfs4_state *state = NULL; struct nfs_server *server = NFS_SERVER(dir); - struct nfs_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs_client; struct nfs4_opendata *opendata; int status; @@ -1133,7 +1133,7 @@ static void nfs4_close_done(struct rpc_task *task, void *data) break; case -NFS4ERR_STALE_STATEID: case -NFS4ERR_EXPIRED: - nfs4_schedule_state_recovery(server->nfs4_state); + nfs4_schedule_state_recovery(server->nfs_client); break; default: if (nfs4_async_handle_error(task, server) == -EAGAIN) { @@ -2791,7 +2791,7 @@ static int nfs4_proc_set_acl(struct inode *inode, const void *buf, size_t buflen static int nfs4_async_handle_error(struct rpc_task *task, const struct nfs_server *server) { - struct nfs_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs_client; if (!clp || task->tk_status >= 0) return 0; @@ -2871,7 +2871,7 @@ static int nfs4_delay(struct rpc_clnt *clnt, long *timeout) */ int nfs4_handle_exception(const struct nfs_server *server, int errorcode, struct nfs4_exception *exception) { - struct nfs_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs_client; int ret = errorcode; exception->retry = 0; @@ -3077,7 +3077,7 @@ int nfs4_proc_delegreturn(struct inode *inode, struct rpc_cred *cred, const nfs4 switch (err) { case -NFS4ERR_STALE_STATEID: case -NFS4ERR_EXPIRED: - nfs4_schedule_state_recovery(server->nfs4_state); + nfs4_schedule_state_recovery(server->nfs_client); case 0: return 0; } @@ -3106,7 +3106,7 @@ static int _nfs4_proc_getlk(struct nfs4_state *state, int cmd, struct file_lock { struct inode *inode = state->inode; struct nfs_server *server = NFS_SERVER(inode); - struct nfs_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs_client; struct nfs_lockt_args arg = { .fh = NFS_FH(inode), .fl = request, @@ -3231,7 +3231,7 @@ static void nfs4_locku_done(struct rpc_task *task, void *data) break; case -NFS4ERR_STALE_STATEID: case -NFS4ERR_EXPIRED: - nfs4_schedule_state_recovery(calldata->server->nfs4_state); + nfs4_schedule_state_recovery(calldata->server->nfs_client); break; default: if (nfs4_async_handle_error(task, calldata->server) == -EAGAIN) { @@ -3343,7 +3343,7 @@ static struct nfs4_lockdata *nfs4_alloc_lockdata(struct file_lock *fl, if (p->arg.lock_seqid == NULL) goto out_free; p->arg.lock_stateid = &lsp->ls_stateid; - p->arg.lock_owner.clientid = server->nfs4_state->cl_clientid; + p->arg.lock_owner.clientid = server->nfs_client->cl_clientid; p->arg.lock_owner.id = lsp->ls_id; p->lsp = lsp; atomic_inc(&lsp->ls_count); diff --git a/fs/nfs/nfs4renewd.c b/fs/nfs/nfs4renewd.c index 2087640..ff947ec 100644 --- a/fs/nfs/nfs4renewd.c +++ b/fs/nfs/nfs4renewd.c @@ -127,7 +127,7 @@ nfs4_schedule_state_renewal(struct nfs_client *clp) void nfs4_renewd_prepare_shutdown(struct nfs_server *server) { - struct nfs_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs_client; if (!clp) return; diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index c0b6439..fa51a7d 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -61,7 +61,7 @@ static LIST_HEAD(nfs4_clientid_list); void init_nfsv4_state(struct nfs_server *server) { - server->nfs4_state = NULL; + server->nfs_client = NULL; INIT_LIST_HEAD(&server->nfs4_siblings); } @@ -70,9 +70,9 @@ destroy_nfsv4_state(struct nfs_server *server) { kfree(server->mnt_path); server->mnt_path = NULL; - if (server->nfs4_state) { - nfs4_put_client(server->nfs4_state); - server->nfs4_state = NULL; + if (server->nfs_client) { + nfs4_put_client(server->nfs_client); + server->nfs_client = NULL; } } @@ -306,7 +306,7 @@ nfs4_drop_state_owner(struct nfs4_state_owner *sp) */ struct nfs4_state_owner *nfs4_get_state_owner(struct nfs_server *server, struct rpc_cred *cred) { - struct nfs_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs_client; struct nfs4_state_owner *sp, *new; get_rpccred(cred); diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c index 04748ab9..9992606 100644 --- a/fs/nfs/nfs4xdr.c +++ b/fs/nfs/nfs4xdr.c @@ -529,7 +529,7 @@ static int encode_attrs(struct xdr_stream *xdr, const struct iattr *iap, const s if (iap->ia_valid & ATTR_MODE) len += 4; if (iap->ia_valid & ATTR_UID) { - owner_namelen = nfs_map_uid_to_name(server->nfs4_state, iap->ia_uid, owner_name); + owner_namelen = nfs_map_uid_to_name(server->nfs_client, iap->ia_uid, owner_name); if (owner_namelen < 0) { printk(KERN_WARNING "nfs: couldn't resolve uid %d to string\n", iap->ia_uid); @@ -541,7 +541,7 @@ static int encode_attrs(struct xdr_stream *xdr, const struct iattr *iap, const s len += 4 + (XDR_QUADLEN(owner_namelen) << 2); } if (iap->ia_valid & ATTR_GID) { - owner_grouplen = nfs_map_gid_to_group(server->nfs4_state, iap->ia_gid, owner_group); + owner_grouplen = nfs_map_gid_to_group(server->nfs_client, iap->ia_gid, owner_group); if (owner_grouplen < 0) { printk(KERN_WARNING "nfs4: couldn't resolve gid %d to string\n", iap->ia_gid); @@ -3051,9 +3051,9 @@ static int decode_getfattr(struct xdr_stream *xdr, struct nfs_fattr *fattr, cons fattr->mode |= fmode; if ((status = decode_attr_nlink(xdr, bitmap, &fattr->nlink)) != 0) goto xdr_error; - if ((status = decode_attr_owner(xdr, bitmap, server->nfs4_state, &fattr->uid)) != 0) + if ((status = decode_attr_owner(xdr, bitmap, server->nfs_client, &fattr->uid)) != 0) goto xdr_error; - if ((status = decode_attr_group(xdr, bitmap, server->nfs4_state, &fattr->gid)) != 0) + if ((status = decode_attr_group(xdr, bitmap, server->nfs_client, &fattr->gid)) != 0) goto xdr_error; if ((status = decode_attr_rdev(xdr, bitmap, &fattr->rdev)) != 0) goto xdr_error; @@ -3254,7 +3254,7 @@ static int decode_delegation(struct xdr_stream *xdr, struct nfs_openres *res) if (decode_space_limit(xdr, &res->maxsize) < 0) return -EIO; } - return decode_ace(xdr, NULL, res->server->nfs4_state); + return decode_ace(xdr, NULL, res->server->nfs_client); } static int decode_open(struct xdr_stream *xdr, struct nfs_openres *res) diff --git a/fs/nfs/super.c b/fs/nfs/super.c index d03ede5..ab4c78e 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -1141,7 +1141,7 @@ static struct rpc_clnt *nfs4_create_client(struct nfs_server *server, list_add_tail(&server->nfs4_siblings, &clp->cl_superblocks); clnt = rpc_clone_client(clp->cl_rpcclient); if (!IS_ERR(clnt)) - server->nfs4_state = clp; + server->nfs_client = clp; up_write(&clp->cl_sem); clp = NULL; @@ -1151,7 +1151,7 @@ static struct rpc_clnt *nfs4_create_client(struct nfs_server *server, return clnt; } - if (server->nfs4_state->cl_idmap == NULL) { + if (server->nfs_client->cl_idmap == NULL) { dprintk("%s: failed to create idmapper.\n", __FUNCTION__); return ERR_PTR(-ENOMEM); } @@ -1416,7 +1416,7 @@ static inline char *nfs4_dup_path(const struct dentry *dentry) static struct super_block *nfs4_clone_sb(struct nfs_server *server, struct nfs_clone_mount *data) { const struct dentry *dentry = data->dentry; - struct nfs_client *clp = server->nfs4_state; + struct nfs_client *clp = server->nfs_client; struct super_block *sb; server->fsid = data->fattr->fsid; diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index 4db90df..fc20d6b 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -43,7 +43,7 @@ struct nfs_server { */ char ip_addr[16]; char * mnt_path; - struct nfs_client * nfs4_state; /* all NFSv4 state starts here */ + struct nfs_client * nfs_client; /* all NFSv4 state starts here */ struct list_head nfs4_siblings; /* List of other nfs_server structs * that share the same clientid */ -- cgit v0.10.2 From b7162792b5c0e0f6e91b8997f8e6bbc76ec5420a Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:09 -0400 Subject: NFS: Return an error when starting the idmapping pipe Return an error when starting the idmapping pipe so that we can detect it failing. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/idmap.c b/fs/nfs/idmap.c index d05148e..231c20f 100644 --- a/fs/nfs/idmap.c +++ b/fs/nfs/idmap.c @@ -108,15 +108,17 @@ static struct rpc_pipe_ops idmap_upcall_ops = { .destroy_msg = idmap_pipe_destroy_msg, }; -void +int nfs_idmap_new(struct nfs_client *clp) { struct idmap *idmap; + int error; if (clp->cl_idmap != NULL) - return; + return 0; + if ((idmap = kzalloc(sizeof(*idmap), GFP_KERNEL)) == NULL) - return; + return -ENOMEM; snprintf(idmap->idmap_path, sizeof(idmap->idmap_path), "%s/idmap", clp->cl_rpcclient->cl_pathname); @@ -124,8 +126,9 @@ nfs_idmap_new(struct nfs_client *clp) idmap->idmap_dentry = rpc_mkpipe(idmap->idmap_path, idmap, &idmap_upcall_ops, 0); if (IS_ERR(idmap->idmap_dentry)) { + error = PTR_ERR(idmap->idmap_dentry); kfree(idmap); - return; + return error; } mutex_init(&idmap->idmap_lock); @@ -135,6 +138,7 @@ nfs_idmap_new(struct nfs_client *clp) idmap->idmap_group_hash.h_type = IDMAP_TYPE_GROUP; clp->cl_idmap = idmap; + return 0; } void diff --git a/fs/nfs/super.c b/fs/nfs/super.c index ab4c78e..3ee85c4 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -1136,7 +1136,8 @@ static struct rpc_clnt *nfs4_create_client(struct nfs_server *server, clnt->cl_softrtry = 1; clp->cl_rpcclient = clnt; memcpy(clp->cl_ipaddr, server->ip_addr, sizeof(clp->cl_ipaddr)); - nfs_idmap_new(clp); + if (nfs_idmap_new(clp) < 0) + goto out_fail; } list_add_tail(&server->nfs4_siblings, &clp->cl_superblocks); clnt = rpc_clone_client(clp->cl_rpcclient); diff --git a/include/linux/nfs_idmap.h b/include/linux/nfs_idmap.h index 678fe68..15a9f3b 100644 --- a/include/linux/nfs_idmap.h +++ b/include/linux/nfs_idmap.h @@ -64,7 +64,7 @@ struct idmap_msg { /* Forward declaration to make this header independent of others */ struct nfs_client; -void nfs_idmap_new(struct nfs_client *); +int nfs_idmap_new(struct nfs_client *); void nfs_idmap_delete(struct nfs_client *); int nfs_map_name_to_uid(struct nfs_client *, const char *, size_t, __u32 *); -- cgit v0.10.2 From 2b3de4411b3ccaeb00018c99d1bbe7203554cf7f Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:09 -0400 Subject: NFS: Add a lookupfh NFS RPC op Add a lookup filehandle NFS RPC op so that a file handle can be looked up without requiring dentries and inodes and other VFS stuff when doing an NFS4 pathwalk during mounting. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index b46597f..de2006f 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -1583,6 +1583,52 @@ nfs4_proc_setattr(struct dentry *dentry, struct nfs_fattr *fattr, return status; } +static int _nfs4_proc_lookupfh(struct nfs_server *server, struct nfs_fh *dirfh, + struct qstr *name, struct nfs_fh *fhandle, + struct nfs_fattr *fattr) +{ + int status; + struct nfs4_lookup_arg args = { + .bitmask = server->attr_bitmask, + .dir_fh = dirfh, + .name = name, + }; + struct nfs4_lookup_res res = { + .server = server, + .fattr = fattr, + .fh = fhandle, + }; + struct rpc_message msg = { + .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_LOOKUP], + .rpc_argp = &args, + .rpc_resp = &res, + }; + + nfs_fattr_init(fattr); + + dprintk("NFS call lookupfh %s\n", name->name); + status = rpc_call_sync(server->client, &msg, 0); + dprintk("NFS reply lookupfh: %d\n", status); + if (status == -NFS4ERR_MOVED) + status = -EREMOTE; + return status; +} + +static int nfs4_proc_lookupfh(struct nfs_server *server, struct nfs_fh *dirfh, + struct qstr *name, struct nfs_fh *fhandle, + struct nfs_fattr *fattr) +{ + struct nfs4_exception exception = { }; + int err; + do { + err = nfs4_handle_exception(server, + _nfs4_proc_lookupfh(server, dirfh, name, + fhandle, fattr), + &exception); + } while (exception.retry); + return err; +} + static int _nfs4_proc_lookup(struct inode *dir, struct qstr *name, struct nfs_fh *fhandle, struct nfs_fattr *fattr) { @@ -3723,6 +3769,7 @@ struct nfs_rpc_ops nfs_v4_clientops = { .getroot = nfs4_proc_get_root, .getattr = nfs4_proc_getattr, .setattr = nfs4_proc_setattr, + .lookupfh = nfs4_proc_lookupfh, .lookup = nfs4_proc_lookup, .access = nfs4_proc_access, .readlink = nfs4_proc_readlink, diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h index 41e5a19..2687977 100644 --- a/include/linux/nfs_xdr.h +++ b/include/linux/nfs_xdr.h @@ -770,6 +770,9 @@ struct nfs_rpc_ops { int (*getroot) (struct nfs_server *, struct nfs_fh *, struct nfs_fsinfo *); + int (*lookupfh)(struct nfs_server *, struct nfs_fh *, + struct qstr *, struct nfs_fh *, + struct nfs_fattr *); int (*getattr) (struct nfs_server *, struct nfs_fh *, struct nfs_fattr *); int (*setattr) (struct dentry *, struct nfs_fattr *, -- cgit v0.10.2 From e9326dcab413848e70ab746c7c5363da13e5f801 Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:10 -0400 Subject: NFS: Add a server capabilities NFS RPC op Add a set_capabilities NFS RPC op so that the server capabilities can be set. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index de2006f..850f085 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -3790,6 +3790,7 @@ struct nfs_rpc_ops nfs_v4_clientops = { .statfs = nfs4_proc_statfs, .fsinfo = nfs4_proc_fsinfo, .pathconf = nfs4_proc_pathconf, + .set_capabilities = nfs4_server_capabilities, .decode_dirent = nfs4_decode_dirent, .read_setup = nfs4_proc_read_setup, .read_done = nfs4_read_done, diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h index 2687977..dd9ae67 100644 --- a/include/linux/nfs_xdr.h +++ b/include/linux/nfs_xdr.h @@ -809,6 +809,7 @@ struct nfs_rpc_ops { struct nfs_fsinfo *); int (*pathconf) (struct nfs_server *, struct nfs_fh *, struct nfs_pathconf *); + int (*set_capabilities)(struct nfs_server *, struct nfs_fh *); u32 * (*decode_dirent)(u32 *, struct nfs_entry *, int plus); void (*read_setup) (struct nfs_read_data *); int (*read_done) (struct rpc_task *, struct nfs_read_data *); -- cgit v0.10.2 From 24c8dbbb5f777187d660393599641ab3307b4b97 Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:10 -0400 Subject: NFS: Generalise the nfs_client structure Generalise the nfs_client structure by: (1) Moving nfs_client to a more general place (nfs_fs_sb.h). (2) Renaming its maintenance routines to be non-NFS4 specific. (3) Move those maintenance routines to a new non-NFS4 specific file (client.c) and move the declarations to internal.h. (4) Make nfs_find/get_client() take a full sockaddr_in to include the port number (will be required for NFS2/3). (5) Make nfs_find/get_client() take the NFS protocol version (again will be required to differentiate NFS2, 3 & 4 client records). Also: (6) Make nfs_client construction proceed akin to inodes, marking them as under construction and providing a function to indicate completion. (7) Make nfs_get_client() wait interruptibly if it finds a client that it can share, but that client is currently being constructed. (8) Make nfs4_create_client() use (6) and (7) instead of locking cl_sem. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/Makefile b/fs/nfs/Makefile index 0b572a0..3b993a6 100644 --- a/fs/nfs/Makefile +++ b/fs/nfs/Makefile @@ -4,9 +4,9 @@ obj-$(CONFIG_NFS_FS) += nfs.o -nfs-y := dir.o file.o inode.o super.o nfs2xdr.o pagelist.o \ - proc.o read.o symlink.o unlink.o write.o \ - namespace.o +nfs-y := client.o dir.o file.o inode.o super.o nfs2xdr.o \ + pagelist.o proc.o read.o symlink.o unlink.o \ + write.o namespace.o nfs-$(CONFIG_ROOT_NFS) += nfsroot.o mount_clnt.o nfs-$(CONFIG_NFS_V3) += nfs3proc.o nfs3xdr.o nfs-$(CONFIG_NFS_V3_ACL) += nfs3acl.o diff --git a/fs/nfs/callback.c b/fs/nfs/callback.c index 1b596b6..a3ee113 100644 --- a/fs/nfs/callback.c +++ b/fs/nfs/callback.c @@ -19,6 +19,7 @@ #include "nfs4_fs.h" #include "callback.h" +#include "internal.h" #define NFSDBG_FACILITY NFSDBG_CALLBACK @@ -166,15 +167,15 @@ void nfs_callback_down(void) static int nfs_callback_authenticate(struct svc_rqst *rqstp) { - struct in_addr *addr = &rqstp->rq_addr.sin_addr; + struct sockaddr_in *addr = &rqstp->rq_addr; struct nfs_client *clp; /* Don't talk to strangers */ - clp = nfs4_find_client(addr); + clp = nfs_find_client(addr, 4); if (clp == NULL) return SVC_DROP; - dprintk("%s: %u.%u.%u.%u NFSv4 callback!\n", __FUNCTION__, NIPQUAD(addr)); - nfs4_put_client(clp); + dprintk("%s: %u.%u.%u.%u NFSv4 callback!\n", __FUNCTION__, NIPQUAD(addr->sin_addr)); + nfs_put_client(clp); switch (rqstp->rq_authop->flavour) { case RPC_AUTH_NULL: if (rqstp->rq_proc != CB_NULL) diff --git a/fs/nfs/callback_proc.c b/fs/nfs/callback_proc.c index 55d6e2e..97cf8f7 100644 --- a/fs/nfs/callback_proc.c +++ b/fs/nfs/callback_proc.c @@ -10,6 +10,7 @@ #include "nfs4_fs.h" #include "callback.h" #include "delegation.h" +#include "internal.h" #define NFSDBG_FACILITY NFSDBG_CALLBACK @@ -22,7 +23,7 @@ unsigned nfs4_callback_getattr(struct cb_getattrargs *args, struct cb_getattrres res->bitmap[0] = res->bitmap[1] = 0; res->status = htonl(NFS4ERR_BADHANDLE); - clp = nfs4_find_client(&args->addr->sin_addr); + clp = nfs_find_client(args->addr, 4); if (clp == NULL) goto out; inode = nfs_delegation_find_inode(clp, &args->fh); @@ -48,7 +49,7 @@ out_iput: up_read(&nfsi->rwsem); iput(inode); out_putclient: - nfs4_put_client(clp); + nfs_put_client(clp); out: dprintk("%s: exit with status = %d\n", __FUNCTION__, ntohl(res->status)); return res->status; @@ -61,7 +62,7 @@ unsigned nfs4_callback_recall(struct cb_recallargs *args, void *dummy) unsigned res; res = htonl(NFS4ERR_BADHANDLE); - clp = nfs4_find_client(&args->addr->sin_addr); + clp = nfs_find_client(args->addr, 4); if (clp == NULL) goto out; inode = nfs_delegation_find_inode(clp, &args->fh); @@ -80,7 +81,7 @@ unsigned nfs4_callback_recall(struct cb_recallargs *args, void *dummy) } iput(inode); out_putclient: - nfs4_put_client(clp); + nfs_put_client(clp); out: dprintk("%s: exit with status = %d\n", __FUNCTION__, ntohl(res)); return res; diff --git a/fs/nfs/client.c b/fs/nfs/client.c new file mode 100644 index 0000000..cb5e924 --- /dev/null +++ b/fs/nfs/client.c @@ -0,0 +1,312 @@ +/* client.c: NFS client sharing and management code + * + * Copyright (C) 2006 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + + +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include + +#include "nfs4_fs.h" +#include "callback.h" +#include "delegation.h" +#include "iostat.h" +#include "internal.h" + +#define NFSDBG_FACILITY NFSDBG_CLIENT + +static DEFINE_SPINLOCK(nfs_client_lock); +static LIST_HEAD(nfs_client_list); +static DECLARE_WAIT_QUEUE_HEAD(nfs_client_active_wq); + +/* + * Allocate a shared client record + * + * Since these are allocated/deallocated very rarely, we don't + * bother putting them in a slab cache... + */ +static struct nfs_client *nfs_alloc_client(const char *hostname, + const struct sockaddr_in *addr, + int nfsversion) +{ + struct nfs_client *clp; + int error; + + if ((clp = kzalloc(sizeof(*clp), GFP_KERNEL)) == NULL) + goto error_0; + + error = rpciod_up(); + if (error < 0) { + dprintk("%s: couldn't start rpciod! Error = %d\n", + __FUNCTION__, error); + __set_bit(NFS_CS_RPCIOD, &clp->cl_res_state); + goto error_1; + } + + if (nfsversion == 4) { + if (nfs_callback_up() < 0) + goto error_2; + __set_bit(NFS_CS_CALLBACK, &clp->cl_res_state); + } + + atomic_set(&clp->cl_count, 1); + clp->cl_cons_state = NFS_CS_INITING; + + clp->cl_nfsversion = nfsversion; + memcpy(&clp->cl_addr, addr, sizeof(clp->cl_addr)); + + if (hostname) { + clp->cl_hostname = kstrdup(hostname, GFP_KERNEL); + if (!clp->cl_hostname) + goto error_3; + } + + INIT_LIST_HEAD(&clp->cl_superblocks); + clp->cl_rpcclient = ERR_PTR(-EINVAL); + +#ifdef CONFIG_NFS_V4 + init_rwsem(&clp->cl_sem); + INIT_LIST_HEAD(&clp->cl_delegations); + INIT_LIST_HEAD(&clp->cl_state_owners); + INIT_LIST_HEAD(&clp->cl_unused); + spin_lock_init(&clp->cl_lock); + INIT_WORK(&clp->cl_renewd, nfs4_renew_state, clp); + rpc_init_wait_queue(&clp->cl_rpcwaitq, "NFS client"); + clp->cl_boot_time = CURRENT_TIME; + clp->cl_state = 1 << NFS4CLNT_LEASE_EXPIRED; +#endif + + return clp; + +error_3: + nfs_callback_down(); + __clear_bit(NFS_CS_CALLBACK, &clp->cl_res_state); +error_2: + rpciod_down(); + __clear_bit(NFS_CS_RPCIOD, &clp->cl_res_state); +error_1: + kfree(clp); +error_0: + return NULL; +} + +/* + * Destroy a shared client record + */ +static void nfs_free_client(struct nfs_client *clp) +{ + dprintk("--> nfs_free_client(%d)\n", clp->cl_nfsversion); + +#ifdef CONFIG_NFS_V4 + if (__test_and_clear_bit(NFS_CS_IDMAP, &clp->cl_res_state)) { + while (!list_empty(&clp->cl_unused)) { + struct nfs4_state_owner *sp; + + sp = list_entry(clp->cl_unused.next, + struct nfs4_state_owner, + so_list); + list_del(&sp->so_list); + kfree(sp); + } + BUG_ON(!list_empty(&clp->cl_state_owners)); + nfs_idmap_delete(clp); + } +#endif + + /* -EIO all pending I/O */ + if (!IS_ERR(clp->cl_rpcclient)) + rpc_shutdown_client(clp->cl_rpcclient); + + if (__test_and_clear_bit(NFS_CS_CALLBACK, &clp->cl_res_state)) + nfs_callback_down(); + + if (__test_and_clear_bit(NFS_CS_RPCIOD, &clp->cl_res_state)) + rpciod_down(); + + kfree(clp->cl_hostname); + kfree(clp); + + dprintk("<-- nfs_free_client()\n"); +} + +/* + * Release a reference to a shared client record + */ +void nfs_put_client(struct nfs_client *clp) +{ + dprintk("--> nfs_put_client({%d})\n", atomic_read(&clp->cl_count)); + + if (atomic_dec_and_lock(&clp->cl_count, &nfs_client_lock)) { + list_del(&clp->cl_share_link); + spin_unlock(&nfs_client_lock); + + BUG_ON(!list_empty(&clp->cl_superblocks)); + + nfs_free_client(clp); + } +} + +/* + * Find a client by address + * - caller must hold nfs_client_lock + */ +static struct nfs_client *__nfs_find_client(const struct sockaddr_in *addr, int nfsversion) +{ + struct nfs_client *clp; + + list_for_each_entry(clp, &nfs_client_list, cl_share_link) { + /* Different NFS versions cannot share the same nfs_client */ + if (clp->cl_nfsversion != nfsversion) + continue; + + if (memcmp(&clp->cl_addr.sin_addr, &addr->sin_addr, + sizeof(clp->cl_addr.sin_addr)) != 0) + continue; + + if (clp->cl_addr.sin_port == addr->sin_port) + goto found; + } + + return NULL; + +found: + atomic_inc(&clp->cl_count); + return clp; +} + +/* + * Find a client by IP address and protocol version + * - returns NULL if no such client + */ +struct nfs_client *nfs_find_client(const struct sockaddr_in *addr, int nfsversion) +{ + struct nfs_client *clp; + + spin_lock(&nfs_client_lock); + clp = __nfs_find_client(addr, nfsversion); + spin_unlock(&nfs_client_lock); + + BUG_ON(clp->cl_cons_state == 0); + + return clp; +} + +/* + * Look up a client by IP address and protocol version + * - creates a new record if one doesn't yet exist + */ +struct nfs_client *nfs_get_client(const char *hostname, + const struct sockaddr_in *addr, + int nfsversion) +{ + struct nfs_client *clp, *new = NULL; + int error; + + dprintk("--> nfs_get_client(%s,"NIPQUAD_FMT":%d,%d)\n", + hostname ?: "", NIPQUAD(addr->sin_addr), + addr->sin_port, nfsversion); + + /* see if the client already exists */ + do { + spin_lock(&nfs_client_lock); + + clp = __nfs_find_client(addr, nfsversion); + if (clp) + goto found_client; + if (new) + goto install_client; + + spin_unlock(&nfs_client_lock); + + new = nfs_alloc_client(hostname, addr, nfsversion); + } while (new); + + return ERR_PTR(-ENOMEM); + + /* install a new client and return with it unready */ +install_client: + clp = new; + list_add(&clp->cl_share_link, &nfs_client_list); + spin_unlock(&nfs_client_lock); + dprintk("--> nfs_get_client() = %p [new]\n", clp); + return clp; + + /* found an existing client + * - make sure it's ready before returning + */ +found_client: + spin_unlock(&nfs_client_lock); + + if (new) + nfs_free_client(new); + + if (clp->cl_cons_state == NFS_CS_INITING) { + DECLARE_WAITQUEUE(myself, current); + + add_wait_queue(&nfs_client_active_wq, &myself); + + for (;;) { + set_current_state(TASK_INTERRUPTIBLE); + if (signal_pending(current) || + clp->cl_cons_state > NFS_CS_READY) + break; + schedule(); + } + + remove_wait_queue(&nfs_client_active_wq, &myself); + + if (signal_pending(current)) { + nfs_put_client(clp); + return ERR_PTR(-ERESTARTSYS); + } + } + + if (clp->cl_cons_state < NFS_CS_READY) { + error = clp->cl_cons_state; + nfs_put_client(clp); + return ERR_PTR(error); + } + + dprintk("--> nfs_get_client() = %p [share]\n", clp); + return clp; +} + +/* + * Mark a server as ready or failed + */ +void nfs_mark_client_ready(struct nfs_client *clp, int state) +{ + clp->cl_cons_state = state; + wake_up_all(&nfs_client_active_wq); +} diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c index cfe2397..5713367 100644 --- a/fs/nfs/delegation.c +++ b/fs/nfs/delegation.c @@ -18,6 +18,7 @@ #include "nfs4_fs.h" #include "delegation.h" +#include "internal.h" static struct nfs_delegation *nfs_alloc_delegation(void) { @@ -145,7 +146,7 @@ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct sizeof(delegation->stateid)) != 0 || delegation->type != nfsi->delegation->type) { printk("%s: server %u.%u.%u.%u, handed out a duplicate delegation!\n", - __FUNCTION__, NIPQUAD(clp->cl_addr)); + __FUNCTION__, NIPQUAD(clp->cl_addr.sin_addr)); status = -EIO; } } @@ -254,7 +255,7 @@ restart: } out: spin_unlock(&clp->cl_lock); - nfs4_put_client(clp); + nfs_put_client(clp); module_put_and_exit(0); } @@ -266,10 +267,10 @@ void nfs_expire_all_delegations(struct nfs_client *clp) atomic_inc(&clp->cl_count); task = kthread_run(nfs_do_expire_all_delegations, clp, "%u.%u.%u.%u-delegreturn", - NIPQUAD(clp->cl_addr)); + NIPQUAD(clp->cl_addr.sin_addr)); if (!IS_ERR(task)) return; - nfs4_put_client(clp); + nfs_put_client(clp); module_put(THIS_MODULE); } diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index 4802157..ac370d5 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -15,6 +15,12 @@ struct nfs_clone_mount { rpc_authflavor_t authflavor; }; +/* client.c */ +extern void nfs_put_client(struct nfs_client *); +extern struct nfs_client *nfs_find_client(const struct sockaddr_in *, int); +extern struct nfs_client *nfs_get_client(const char *, const struct sockaddr_in *, int); +extern void nfs_mark_client_ready(struct nfs_client *, int); + /* nfs4namespace.c */ #ifdef CONFIG_NFS_V4 extern struct vfsmount *nfs_do_refmount(const struct vfsmount *mnt_parent, struct dentry *dentry); diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h index 4e334cb..e787924 100644 --- a/fs/nfs/nfs4_fs.h +++ b/fs/nfs/nfs4_fs.h @@ -43,55 +43,6 @@ enum nfs4_client_state { }; /* - * The nfs_client identifies our client state to the server. - */ -struct nfs_client { - struct list_head cl_servers; /* Global list of servers */ - struct in_addr cl_addr; /* Server identifier */ - u64 cl_clientid; /* constant */ - nfs4_verifier cl_confirm; - unsigned long cl_state; - - u32 cl_lockowner_id; - - /* - * The following rwsem ensures exclusive access to the server - * while we recover the state following a lease expiration. - */ - struct rw_semaphore cl_sem; - - struct list_head cl_delegations; - struct list_head cl_state_owners; - struct list_head cl_unused; - int cl_nunused; - spinlock_t cl_lock; - atomic_t cl_count; - - struct rpc_clnt * cl_rpcclient; - - struct list_head cl_superblocks; /* List of nfs_server structs */ - - unsigned long cl_lease_time; - unsigned long cl_last_renewal; - struct work_struct cl_renewd; - struct work_struct cl_recoverd; - - struct rpc_wait_queue cl_rpcwaitq; - - /* used for the setclientid verifier */ - struct timespec cl_boot_time; - - /* idmapper */ - struct idmap * cl_idmap; - - /* Our own IP address, as a null-terminated string. - * This is used to generate the clientid, and the callback address. - */ - char cl_ipaddr[16]; - unsigned char cl_id_uniquifier; -}; - -/* * struct rpc_sequence ensures that RPC calls are sent in the exact * order that they appear on the list. */ @@ -239,9 +190,6 @@ extern void nfs4_renew_state(void *); /* nfs4state.c */ extern void init_nfsv4_state(struct nfs_server *); extern void destroy_nfsv4_state(struct nfs_server *); -extern struct nfs_client *nfs4_get_client(struct in_addr *); -extern void nfs4_put_client(struct nfs_client *clp); -extern struct nfs_client *nfs4_find_client(struct in_addr *); struct rpc_cred *nfs4_get_renew_cred(struct nfs_client *clp); extern u32 nfs4_alloc_lockowner_id(struct nfs_client *); diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 850f085..803c31b 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -2968,7 +2968,7 @@ int nfs4_proc_setclientid(struct nfs_client *clp, u32 program, unsigned short po for(;;) { setclientid.sc_name_len = scnprintf(setclientid.sc_name, sizeof(setclientid.sc_name), "%s/%u.%u.%u.%u %s %u", - clp->cl_ipaddr, NIPQUAD(clp->cl_addr.s_addr), + clp->cl_ipaddr, NIPQUAD(clp->cl_addr.sin_addr), cred->cr_ops->cr_name, clp->cl_id_uniquifier); setclientid.sc_netid_len = scnprintf(setclientid.sc_netid, diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index fa51a7d..058811e 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -50,12 +50,12 @@ #include "nfs4_fs.h" #include "callback.h" #include "delegation.h" +#include "internal.h" #define OPENOWNER_POOL_SIZE 8 const nfs4_stateid zero_stateid; -static DEFINE_SPINLOCK(state_spinlock); static LIST_HEAD(nfs4_clientid_list); void @@ -71,127 +71,11 @@ destroy_nfsv4_state(struct nfs_server *server) kfree(server->mnt_path); server->mnt_path = NULL; if (server->nfs_client) { - nfs4_put_client(server->nfs_client); + nfs_put_client(server->nfs_client); server->nfs_client = NULL; } } -/* - * nfs4_get_client(): returns an empty client structure - * nfs4_put_client(): drops reference to client structure - * - * Since these are allocated/deallocated very rarely, we don't - * bother putting them in a slab cache... - */ -static struct nfs_client * -nfs4_alloc_client(struct in_addr *addr) -{ - struct nfs_client *clp; - - if (nfs_callback_up() < 0) - return NULL; - if ((clp = kzalloc(sizeof(*clp), GFP_KERNEL)) == NULL) { - nfs_callback_down(); - return NULL; - } - memcpy(&clp->cl_addr, addr, sizeof(clp->cl_addr)); - init_rwsem(&clp->cl_sem); - INIT_LIST_HEAD(&clp->cl_delegations); - INIT_LIST_HEAD(&clp->cl_state_owners); - INIT_LIST_HEAD(&clp->cl_unused); - spin_lock_init(&clp->cl_lock); - atomic_set(&clp->cl_count, 1); - INIT_WORK(&clp->cl_renewd, nfs4_renew_state, clp); - INIT_LIST_HEAD(&clp->cl_superblocks); - rpc_init_wait_queue(&clp->cl_rpcwaitq, "NFS4 client"); - clp->cl_rpcclient = ERR_PTR(-EINVAL); - clp->cl_boot_time = CURRENT_TIME; - clp->cl_state = 1 << NFS4CLNT_LEASE_EXPIRED; - return clp; -} - -static void -nfs4_free_client(struct nfs_client *clp) -{ - struct nfs4_state_owner *sp; - - while (!list_empty(&clp->cl_unused)) { - sp = list_entry(clp->cl_unused.next, - struct nfs4_state_owner, - so_list); - list_del(&sp->so_list); - kfree(sp); - } - BUG_ON(!list_empty(&clp->cl_state_owners)); - nfs_idmap_delete(clp); - if (!IS_ERR(clp->cl_rpcclient)) - rpc_shutdown_client(clp->cl_rpcclient); - kfree(clp); - nfs_callback_down(); -} - -static struct nfs_client *__nfs4_find_client(struct in_addr *addr) -{ - struct nfs_client *clp; - list_for_each_entry(clp, &nfs4_clientid_list, cl_servers) { - if (memcmp(&clp->cl_addr, addr, sizeof(clp->cl_addr)) == 0) { - atomic_inc(&clp->cl_count); - return clp; - } - } - return NULL; -} - -struct nfs_client *nfs4_find_client(struct in_addr *addr) -{ - struct nfs_client *clp; - spin_lock(&state_spinlock); - clp = __nfs4_find_client(addr); - spin_unlock(&state_spinlock); - return clp; -} - -struct nfs_client * -nfs4_get_client(struct in_addr *addr) -{ - struct nfs_client *clp, *new = NULL; - - spin_lock(&state_spinlock); - for (;;) { - clp = __nfs4_find_client(addr); - if (clp != NULL) - break; - clp = new; - if (clp != NULL) { - list_add(&clp->cl_servers, &nfs4_clientid_list); - new = NULL; - break; - } - spin_unlock(&state_spinlock); - new = nfs4_alloc_client(addr); - spin_lock(&state_spinlock); - if (new == NULL) - break; - } - spin_unlock(&state_spinlock); - if (new) - nfs4_free_client(new); - return clp; -} - -void -nfs4_put_client(struct nfs_client *clp) -{ - if (!atomic_dec_and_lock(&clp->cl_count, &state_spinlock)) - return; - list_del(&clp->cl_servers); - spin_unlock(&state_spinlock); - BUG_ON(!list_empty(&clp->cl_superblocks)); - rpc_wake_up(&clp->cl_rpcwaitq); - nfs4_kill_renewd(clp); - nfs4_free_client(clp); -} - static int nfs4_init_client(struct nfs_client *clp, struct rpc_cred *cred) { int status = nfs4_proc_setclientid(clp, NFS4_CALLBACK, @@ -771,11 +655,11 @@ static void nfs4_recover_state(struct nfs_client *clp) __module_get(THIS_MODULE); atomic_inc(&clp->cl_count); task = kthread_run(reclaimer, clp, "%u.%u.%u.%u-reclaim", - NIPQUAD(clp->cl_addr)); + NIPQUAD(clp->cl_addr.sin_addr)); if (!IS_ERR(task)) return; nfs4_clear_recover_bit(clp); - nfs4_put_client(clp); + nfs_put_client(clp); module_put(THIS_MODULE); } @@ -970,12 +854,12 @@ out: if (status == -NFS4ERR_CB_PATH_DOWN) nfs_handle_cb_pathdown(clp); nfs4_clear_recover_bit(clp); - nfs4_put_client(clp); + nfs_put_client(clp); module_put_and_exit(0); return 0; out_error: printk(KERN_WARNING "Error: state recovery failed on NFSv4 server %u.%u.%u.%u with error %d\n", - NIPQUAD(clp->cl_addr.s_addr), -status); + NIPQUAD(clp->cl_addr.sin_addr), -status); set_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state); goto out; } diff --git a/fs/nfs/super.c b/fs/nfs/super.c index 3ee85c4..f97d7d9 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -1104,47 +1104,46 @@ static struct rpc_clnt *nfs4_create_client(struct nfs_server *server, struct rpc_clnt *clnt = NULL; int err = -EIO; - clp = nfs4_get_client(&server->addr.sin_addr); + clp = nfs_get_client(server->hostname, &server->addr, 4); if (!clp) { dprintk("%s: failed to create NFS4 client.\n", __FUNCTION__); return ERR_PTR(err); } /* Now create transport and client */ - down_write(&clp->cl_sem); - if (IS_ERR(clp->cl_rpcclient)) { + if (clp->cl_cons_state == NFS_CS_INITING) { xprt = xprt_create_proto(proto, &server->addr, timeparms); if (IS_ERR(xprt)) { - up_write(&clp->cl_sem); err = PTR_ERR(xprt); dprintk("%s: cannot create RPC transport. Error = %d\n", __FUNCTION__, err); - goto out_fail; + goto client_init_error; } /* Bind to a reserved port! */ xprt->resvport = 1; clnt = rpc_create_client(xprt, server->hostname, &nfs_program, server->rpc_ops->version, flavor); if (IS_ERR(clnt)) { - up_write(&clp->cl_sem); err = PTR_ERR(clnt); dprintk("%s: cannot create RPC client. Error = %d\n", __FUNCTION__, err); - goto out_fail; + goto client_init_error; } clnt->cl_intr = 1; clnt->cl_softrtry = 1; clp->cl_rpcclient = clnt; memcpy(clp->cl_ipaddr, server->ip_addr, sizeof(clp->cl_ipaddr)); - if (nfs_idmap_new(clp) < 0) - goto out_fail; + err = nfs_idmap_new(clp); + if (err < 0) { + dprintk("%s: failed to create idmapper.\n", + __FUNCTION__); + goto client_init_error; + } + __set_bit(NFS_CS_IDMAP, &clp->cl_res_state); + nfs_mark_client_ready(clp, 0); } - list_add_tail(&server->nfs4_siblings, &clp->cl_superblocks); + clnt = rpc_clone_client(clp->cl_rpcclient); - if (!IS_ERR(clnt)) - server->nfs_client = clp; - up_write(&clp->cl_sem); - clp = NULL; if (IS_ERR(clnt)) { dprintk("%s: cannot create RPC client. Error = %d\n", @@ -1152,11 +1151,6 @@ static struct rpc_clnt *nfs4_create_client(struct nfs_server *server, return clnt; } - if (server->nfs_client->cl_idmap == NULL) { - dprintk("%s: failed to create idmapper.\n", __FUNCTION__); - return ERR_PTR(-ENOMEM); - } - if (clnt->cl_auth->au_flavor != flavor) { struct rpc_auth *auth; @@ -1166,11 +1160,16 @@ static struct rpc_clnt *nfs4_create_client(struct nfs_server *server, return (struct rpc_clnt *)auth; } } + + server->nfs_client = clp; + down_write(&clp->cl_sem); + list_add_tail(&server->nfs4_siblings, &clp->cl_superblocks); + up_write(&clp->cl_sem); return clnt; - out_fail: - if (clp) - nfs4_put_client(clp); +client_init_error: + nfs_mark_client_ready(clp, err); + nfs_put_client(clp); return ERR_PTR(err); } @@ -1329,14 +1328,6 @@ static int nfs4_get_sb(struct file_system_type *fs_type, goto out_free; } - /* Fire up rpciod if not yet running */ - error = rpciod_up(); - if (error < 0) { - dprintk("%s: couldn't start rpciod! Error = %d\n", - __FUNCTION__, error); - goto out_free; - } - s = sget(fs_type, nfs4_compare_super, nfs_set_super, server); if (IS_ERR(s)) { error = PTR_ERR(s); @@ -1383,8 +1374,6 @@ static void nfs4_kill_super(struct super_block *sb) destroy_nfsv4_state(server); - rpciod_down(); - nfs_free_iostats(server->io_stats); kfree(server->hostname); kfree(server); diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index a36e01cd..70e1dc9 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -586,6 +586,7 @@ extern void * nfs_root_data(void); #define NFSDBG_FILE 0x0040 #define NFSDBG_ROOT 0x0080 #define NFSDBG_CALLBACK 0x0100 +#define NFSDBG_CLIENT 0x0200 #define NFSDBG_ALL 0xFFFF #ifdef __KERNEL__ diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index fc20d6b..a727657 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -7,6 +7,66 @@ struct nfs_iostats; /* + * The nfs_client identifies our client state to the server. + */ +struct nfs_client { + atomic_t cl_count; + int cl_cons_state; /* current construction state (-ve: init error) */ +#define NFS_CS_READY 0 /* ready to be used */ +#define NFS_CS_INITING 1 /* busy initialising */ + int cl_nfsversion; /* NFS protocol version */ + unsigned long cl_res_state; /* NFS resources state */ +#define NFS_CS_RPCIOD 0 /* - rpciod started */ +#define NFS_CS_CALLBACK 1 /* - callback started */ +#define NFS_CS_IDMAP 2 /* - idmap started */ + struct sockaddr_in cl_addr; /* server identifier */ + char * cl_hostname; /* hostname of server */ + struct list_head cl_share_link; /* link in global client list */ + struct list_head cl_superblocks; /* List of nfs_server structs */ + + struct rpc_clnt * cl_rpcclient; + +#ifdef CONFIG_NFS_V4 + u64 cl_clientid; /* constant */ + nfs4_verifier cl_confirm; + unsigned long cl_state; + + u32 cl_lockowner_id; + + /* + * The following rwsem ensures exclusive access to the server + * while we recover the state following a lease expiration. + */ + struct rw_semaphore cl_sem; + + struct list_head cl_delegations; + struct list_head cl_state_owners; + struct list_head cl_unused; + int cl_nunused; + spinlock_t cl_lock; + + unsigned long cl_lease_time; + unsigned long cl_last_renewal; + struct work_struct cl_renewd; + struct work_struct cl_recoverd; + + struct rpc_wait_queue cl_rpcwaitq; + + /* used for the setclientid verifier */ + struct timespec cl_boot_time; + + /* idmapper */ + struct idmap * cl_idmap; + + /* Our own IP address, as a null-terminated string. + * This is used to generate the clientid, and the callback address. + */ + char cl_ipaddr[16]; + unsigned char cl_id_uniquifier; +#endif +}; + +/* * NFS client parameters stored in the superblock. */ struct nfs_server { -- cgit v0.10.2 From 0c7d90cfed91a283228017ba6faf37ee0bcd32b1 Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:10 -0400 Subject: NFS: Use the dentry superblock directly in nfs_statfs() Use the nominated dentry's superblock directly in the NFS statfs() op to get a file handle, rather than using s_root (which will become a dummy dentry in a future patch). Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/super.c b/fs/nfs/super.c index f97d7d9..a41d516 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -240,11 +240,10 @@ void __exit unregister_nfs_fs(void) */ static int nfs_statfs(struct dentry *dentry, struct kstatfs *buf) { - struct super_block *sb = dentry->d_sb; - struct nfs_server *server = NFS_SB(sb); + struct nfs_server *server = NFS_SB(dentry->d_sb); unsigned char blockbits; unsigned long blockres; - struct nfs_fh *rootfh = NFS_FH(sb->s_root->d_inode); + struct nfs_fh *fh = NFS_FH(dentry->d_inode); struct nfs_fattr fattr; struct nfs_fsstat res = { .fattr = &fattr, @@ -253,7 +252,7 @@ static int nfs_statfs(struct dentry *dentry, struct kstatfs *buf) lock_kernel(); - error = server->rpc_ops->statfs(server, rootfh, &res); + error = server->rpc_ops->statfs(server, fh, &res); buf->f_type = NFS_SUPER_MAGIC; if (error < 0) goto out_err; @@ -263,7 +262,7 @@ static int nfs_statfs(struct dentry *dentry, struct kstatfs *buf) * case where f_frsize != f_bsize. Eventually we want to * report the value of wtmult in this field. */ - buf->f_frsize = sb->s_blocksize; + buf->f_frsize = dentry->d_sb->s_blocksize; /* * On most *nix systems, f_blocks, f_bfree, and f_bavail @@ -272,8 +271,8 @@ static int nfs_statfs(struct dentry *dentry, struct kstatfs *buf) * thus historically Linux's sys_statfs reports these * fields in units of f_bsize. */ - buf->f_bsize = sb->s_blocksize; - blockbits = sb->s_blocksize_bits; + buf->f_bsize = dentry->d_sb->s_blocksize; + blockbits = dentry->d_sb->s_blocksize_bits; blockres = (1 << blockbits) - 1; buf->f_blocks = (res.tbytes + blockres) >> blockbits; buf->f_bfree = (res.fbytes + blockres) >> blockbits; -- cgit v0.10.2 From 509de8111656a7d89b4a1a5f430f4460ce510f0f Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:11 -0400 Subject: NFS: Add extra const qualifiers Add some extra const qualifiers into NFS. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c index 86b3169..85d9ed1 100644 --- a/fs/nfs/namespace.c +++ b/fs/nfs/namespace.c @@ -172,7 +172,8 @@ void nfs_release_automount_timer(void) /* * Clone a mountpoint of the appropriate type */ -static struct vfsmount *nfs_do_clone_mount(struct nfs_server *server, char *devname, +static struct vfsmount *nfs_do_clone_mount(struct nfs_server *server, + const char *devname, struct nfs_clone_mount *mountdata) { #ifdef CONFIG_NFS_V4 diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c index 7143b1f..3e53712 100644 --- a/fs/nfs/nfs3proc.c +++ b/fs/nfs/nfs3proc.c @@ -886,7 +886,7 @@ nfs3_proc_lock(struct file *filp, int cmd, struct file_lock *fl) return nlmclnt_proc(filp->f_dentry->d_inode, cmd, fl); } -struct nfs_rpc_ops nfs_v3_clientops = { +const struct nfs_rpc_ops nfs_v3_clientops = { .version = 3, /* protocol version */ .dentry_ops = &nfs_dentry_operations, .dir_inode_ops = &nfs3_dir_inode_operations, diff --git a/fs/nfs/nfs4namespace.c b/fs/nfs/nfs4namespace.c index ea38d27..faed9bc 100644 --- a/fs/nfs/nfs4namespace.c +++ b/fs/nfs/nfs4namespace.c @@ -23,7 +23,7 @@ /* * Check if fs_root is valid */ -static inline char *nfs4_pathname_string(struct nfs4_pathname *pathname, +static inline char *nfs4_pathname_string(const struct nfs4_pathname *pathname, char *buffer, ssize_t buflen) { char *end = buffer + buflen; @@ -34,7 +34,7 @@ static inline char *nfs4_pathname_string(struct nfs4_pathname *pathname, n = pathname->ncomponents; while (--n >= 0) { - struct nfs4_string *component = &pathname->components[n]; + const struct nfs4_string *component = &pathname->components[n]; buflen -= component->len + 1; if (buflen < 0) goto Elong; @@ -60,7 +60,7 @@ Elong: */ static struct vfsmount *nfs_follow_referral(const struct vfsmount *mnt_parent, const struct dentry *dentry, - struct nfs4_fs_locations *locations) + const struct nfs4_fs_locations *locations) { struct vfsmount *mnt = ERR_PTR(-ENOENT); struct nfs_clone_mount mountdata = { @@ -108,7 +108,7 @@ static struct vfsmount *nfs_follow_referral(const struct vfsmount *mnt_parent, loc = 0; while (loc < locations->nlocations && IS_ERR(mnt)) { - struct nfs4_fs_location *location = &locations->locations[loc]; + const struct nfs4_fs_location *location = &locations->locations[loc]; char *mnt_path; if (location == NULL || location->nservers <= 0 || diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 803c31b..061be71 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -3761,7 +3761,7 @@ static struct inode_operations nfs4_file_inode_operations = { .listxattr = nfs4_listxattr, }; -struct nfs_rpc_ops nfs_v4_clientops = { +const struct nfs_rpc_ops nfs_v4_clientops = { .version = 4, /* protocol version */ .dentry_ops = &nfs4_dentry_operations, .dir_inode_ops = &nfs4_dir_inode_operations, diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c index b3899ea..7767690 100644 --- a/fs/nfs/proc.c +++ b/fs/nfs/proc.c @@ -671,7 +671,7 @@ nfs_proc_lock(struct file *filp, int cmd, struct file_lock *fl) } -struct nfs_rpc_ops nfs_v2_clientops = { +const struct nfs_rpc_ops nfs_v2_clientops = { .version = 2, /* protocol version */ .dentry_ops = &nfs_dentry_operations, .dir_inode_ops = &nfs_dir_inode_operations, diff --git a/fs/nfs/super.c b/fs/nfs/super.c index a41d516..c97f309 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -329,10 +329,10 @@ static const char *nfs_pseudoflavour_to_name(rpc_authflavor_t flavour) */ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss, int showdefaults) { - static struct proc_nfs_info { + static const struct proc_nfs_info { int flag; - char *str; - char *nostr; + const char *str; + const char *nostr; } nfs_info[] = { { NFS_MOUNT_SOFT, ",soft", ",hard" }, { NFS_MOUNT_INTR, ",intr", "" }, @@ -342,9 +342,9 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss, { NFS_MOUNT_NOACL, ",noacl", "" }, { 0, NULL, NULL } }; - struct proc_nfs_info *nfs_infop; + const struct proc_nfs_info *nfs_infop; char buf[12]; - char *proto; + const char *proto; seq_printf(m, ",vers=%d", nfss->rpc_ops->version); seq_printf(m, ",rsize=%d", nfss->rsize); diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index a727657..95f32d5 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -73,7 +73,7 @@ struct nfs_server { struct rpc_clnt * client; /* RPC client handle */ struct rpc_clnt * client_sys; /* 2nd handle for FSINFO */ struct rpc_clnt * client_acl; /* ACL RPC client handle */ - struct nfs_rpc_ops * rpc_ops; /* NFS protocol vector */ + const struct nfs_rpc_ops *rpc_ops; /* NFS protocol vector */ struct nfs_iostats * io_stats; /* I/O statistics */ struct backing_dev_info backing_dev_info; int flags; /* various flags */ diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h index dd9ae67..2426b11 100644 --- a/include/linux/nfs_xdr.h +++ b/include/linux/nfs_xdr.h @@ -833,9 +833,9 @@ struct nfs_rpc_ops { /* * Function vectors etc. for the NFS client */ -extern struct nfs_rpc_ops nfs_v2_clientops; -extern struct nfs_rpc_ops nfs_v3_clientops; -extern struct nfs_rpc_ops nfs_v4_clientops; +extern const struct nfs_rpc_ops nfs_v2_clientops; +extern const struct nfs_rpc_ops nfs_v3_clientops; +extern const struct nfs_rpc_ops nfs_v4_clientops; extern struct rpc_version nfs_version2; extern struct rpc_version nfs_version3; extern struct rpc_version nfs_version4; -- cgit v0.10.2 From 27951bd26031f6c27d38df9e94623bbe208a2464 Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:11 -0400 Subject: NFS: Maintain a common server record for NFS2/3 as well as for NFS4 Maintain a common server record for NFS2/3 as well as for NFS4 so that common stuff can be moved there from struct nfs_server. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/super.c b/fs/nfs/super.c index c97f309..d1b4a5b 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -658,11 +658,19 @@ static void nfs_init_timeout_values(struct rpc_timeout *to, int proto, unsigned static struct rpc_clnt * nfs_create_client(struct nfs_server *server, const struct nfs_mount_data *data) { + struct nfs_client *clp; struct rpc_timeout timeparms; struct rpc_xprt *xprt = NULL; struct rpc_clnt *clnt = NULL; int proto = (data->flags & NFS_MOUNT_TCP) ? IPPROTO_TCP : IPPROTO_UDP; + clp = nfs_get_client(server->hostname, &server->addr, + server->rpc_ops->version); + if (!clp) { + dprintk("%s: failed to create NFS4 client.\n", __FUNCTION__); + return ERR_PTR(PTR_ERR(clp)); + } + nfs_init_timeout_values(&timeparms, proto, data->timeo, data->retrans); server->retrans_timeo = timeparms.to_initval; @@ -673,6 +681,8 @@ nfs_create_client(struct nfs_server *server, const struct nfs_mount_data *data) if (IS_ERR(xprt)) { dprintk("%s: cannot create RPC transport. Error = %ld\n", __FUNCTION__, PTR_ERR(xprt)); + nfs_mark_client_ready(clp, PTR_ERR(xprt)); + nfs_put_client(clp); return (struct rpc_clnt *)xprt; } clnt = rpc_create_client(xprt, server->hostname, &nfs_program, @@ -686,9 +696,13 @@ nfs_create_client(struct nfs_server *server, const struct nfs_mount_data *data) clnt->cl_intr = 1; clnt->cl_softrtry = 1; + nfs_mark_client_ready(clp, 0); + server->nfs_client = clp; return clnt; out_fail: + nfs_mark_client_ready(clp, PTR_ERR(xprt)); + nfs_put_client(clp); return clnt; } @@ -764,6 +778,7 @@ static int nfs_clone_generic_sb(struct nfs_clone_mount *data, if (server == NULL) goto out_err; memcpy(server, parent, sizeof(*server)); + atomic_inc(&server->nfs_client->cl_count); hostname = (data->hostname != NULL) ? data->hostname : parent->hostname; len = strlen(hostname) + 1; server->hostname = kmalloc(len, GFP_KERNEL); @@ -796,6 +811,7 @@ out_deactivate: out_rpciod_down: rpciod_down(); kfree(server->hostname); + nfs_put_client(server->nfs_client); kfree(server); return simple_set_mnt(mnt, sb); kill_rpciod: @@ -803,6 +819,7 @@ kill_rpciod: free_hostname: kfree(server->hostname); free_server: + nfs_put_client(server->nfs_client); kfree(server); out_err: return error; @@ -1071,6 +1088,7 @@ static void nfs_kill_super(struct super_block *s) nfs_free_iostats(server->io_stats); kfree(server->hostname); + nfs_put_client(server->nfs_client); kfree(server); nfs_release_automount_timer(); } @@ -1421,7 +1439,6 @@ static struct super_block *nfs4_clone_sb(struct nfs_server *server, struct nfs_c nfs4_server_capabilities(server, &server->fh); down_write(&clp->cl_sem); - atomic_inc(&clp->cl_count); list_add_tail(&server->nfs4_siblings, &clp->cl_superblocks); up_write(&clp->cl_sem); return sb; @@ -1476,6 +1493,8 @@ static struct nfs_server *nfs4_referral_server(struct super_block *sb, struct nf retrans = 1; nfs_init_timeout_values(&timeparms, proto, timeo, retrans); + nfs_put_client(server->nfs_client); + server->nfs_client = NULL; server->client = nfs4_create_client(server, &timeparms, proto, data->authflavor); if (IS_ERR((err = server->client))) goto out_err; diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index 95f32d5..e7d7662 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -70,6 +70,7 @@ struct nfs_client { * NFS client parameters stored in the superblock. */ struct nfs_server { + struct nfs_client * nfs_client; /* shared client and NFS4 state */ struct rpc_clnt * client; /* RPC client handle */ struct rpc_clnt * client_sys; /* 2nd handle for FSINFO */ struct rpc_clnt * client_acl; /* ACL RPC client handle */ @@ -103,7 +104,6 @@ struct nfs_server { */ char ip_addr[16]; char * mnt_path; - struct nfs_client * nfs_client; /* all NFSv4 state starts here */ struct list_head nfs4_siblings; /* List of other nfs_server structs * that share the same clientid */ -- cgit v0.10.2 From 1f163415dc05983830bcc47b33c155b2528b1574 Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:11 -0400 Subject: NFS: Make better use of inode* dereferencing macros Make better use of inode* dereferencing macros to hide dereferencing chains (including NFS_PROTO and NFS_CLIENT). Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/file.c b/fs/nfs/file.c index 48e8928..a146ed3 100644 --- a/fs/nfs/file.c +++ b/fs/nfs/file.c @@ -111,7 +111,7 @@ nfs_file_open(struct inode *inode, struct file *filp) nfs_inc_stats(inode, NFSIOS_VFSOPEN); lock_kernel(); - res = NFS_SERVER(inode)->rpc_ops->file_open(inode, filp); + res = NFS_PROTO(inode)->file_open(inode, filp); unlock_kernel(); return res; } diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 061be71..b731b19 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -1268,7 +1268,7 @@ nfs4_atomic_open(struct inode *dir, struct dentry *dentry, struct nameidata *nd) BUG_ON(nd->intent.open.flags & O_CREAT); } - cred = rpcauth_lookupcred(NFS_SERVER(dir)->client->cl_auth, 0); + cred = rpcauth_lookupcred(NFS_CLIENT(dir)->cl_auth, 0); if (IS_ERR(cred)) return (struct dentry *)cred; state = nfs4_do_open(dir, dentry, nd->intent.open.flags, &attr, cred); @@ -1291,7 +1291,7 @@ nfs4_open_revalidate(struct inode *dir, struct dentry *dentry, int openflags, st struct rpc_cred *cred; struct nfs4_state *state; - cred = rpcauth_lookupcred(NFS_SERVER(dir)->client->cl_auth, 0); + cred = rpcauth_lookupcred(NFS_CLIENT(dir)->cl_auth, 0); if (IS_ERR(cred)) return PTR_ERR(cred); state = nfs4_open_delegated(dentry->d_inode, openflags, cred); @@ -1565,7 +1565,7 @@ nfs4_proc_setattr(struct dentry *dentry, struct nfs_fattr *fattr, nfs_fattr_init(fattr); - cred = rpcauth_lookupcred(NFS_SERVER(inode)->client->cl_auth, 0); + cred = rpcauth_lookupcred(NFS_CLIENT(inode)->cl_auth, 0); if (IS_ERR(cred)) return PTR_ERR(cred); @@ -1927,7 +1927,7 @@ nfs4_proc_create(struct inode *dir, struct dentry *dentry, struct iattr *sattr, struct rpc_cred *cred; int status = 0; - cred = rpcauth_lookupcred(NFS_SERVER(dir)->client->cl_auth, 0); + cred = rpcauth_lookupcred(NFS_CLIENT(dir)->cl_auth, 0); if (IS_ERR(cred)) { status = PTR_ERR(cred); goto out; @@ -2816,7 +2816,7 @@ static int __nfs4_proc_set_acl(struct inode *inode, const void *buf, size_t bufl return -EOPNOTSUPP; nfs_inode_return_delegation(inode); buf_to_pages(buf, buflen, arg.acl_pages, &arg.acl_pgbase); - ret = rpc_call_sync(NFS_SERVER(inode)->client, &msg, 0); + ret = rpc_call_sync(NFS_CLIENT(inode), &msg, 0); if (ret == 0) nfs4_write_cached_acl(inode, buf, buflen); return ret; -- cgit v0.10.2 From 8fa5c000d7f986ef9cdc6d95f9f7fcee20e0a7d6 Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:12 -0400 Subject: NFS: Move rpc_ops from nfs_server to nfs_client Move the rpc_ops from the nfs_server struct to the nfs_client struct as they're common to all server records of a particular NFS protocol version. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 067d144..1936271 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1147,7 +1147,7 @@ int nfs_instantiate(struct dentry *dentry, struct nfs_fh *fhandle, } if (!(fattr->valid & NFS_ATTR_FATTR)) { struct nfs_server *server = NFS_SB(dentry->d_sb); - error = server->rpc_ops->getattr(server, fhandle, fattr); + error = server->nfs_client->rpc_ops->getattr(server, fhandle, fattr); if (error < 0) goto out_err; } diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index 6ed018c..771c3b8 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -237,13 +237,13 @@ nfs_fhget(struct super_block *sb, struct nfs_fh *fh, struct nfs_fattr *fattr) /* Why so? Because we want revalidate for devices/FIFOs, and * that's precisely what we have in nfs_file_inode_operations. */ - inode->i_op = NFS_SB(sb)->rpc_ops->file_inode_ops; + inode->i_op = NFS_SB(sb)->nfs_client->rpc_ops->file_inode_ops; if (S_ISREG(inode->i_mode)) { inode->i_fop = &nfs_file_operations; inode->i_data.a_ops = &nfs_file_aops; inode->i_data.backing_dev_info = &NFS_SB(sb)->backing_dev_info; } else if (S_ISDIR(inode->i_mode)) { - inode->i_op = NFS_SB(sb)->rpc_ops->dir_inode_ops; + inode->i_op = NFS_SB(sb)->nfs_client->rpc_ops->dir_inode_ops; inode->i_fop = &nfs_dir_operations; if (nfs_server_capable(inode, NFS_CAP_READDIRPLUS) && fattr->size <= NFS_LIMIT_READDIRPLUS) diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c index 85d9ed1..d8b8d56 100644 --- a/fs/nfs/namespace.c +++ b/fs/nfs/namespace.c @@ -104,7 +104,9 @@ static void * nfs_follow_mountpoint(struct dentry *dentry, struct nameidata *nd) goto out_follow; /* Look it up again */ parent = dget_parent(nd->dentry); - err = server->rpc_ops->lookup(parent->d_inode, &nd->dentry->d_name, &fh, &fattr); + err = server->nfs_client->rpc_ops->lookup(parent->d_inode, + &nd->dentry->d_name, + &fh, &fattr); dput(parent); if (err != 0) goto out_err; @@ -178,7 +180,7 @@ static struct vfsmount *nfs_do_clone_mount(struct nfs_server *server, { #ifdef CONFIG_NFS_V4 struct vfsmount *mnt = NULL; - switch (server->rpc_ops->version) { + switch (server->nfs_client->cl_nfsversion) { case 2: case 3: mnt = vfs_kern_mount(&clone_nfs_fs_type, 0, devname, mountdata); diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index b731b19..1573eeb 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -758,7 +758,7 @@ static int _nfs4_proc_open(struct nfs4_opendata *data) } nfs_confirm_seqid(&data->owner->so_seqid, 0); if (!(o_res->f_attr->valid & NFS_ATTR_FATTR)) - return server->rpc_ops->getattr(server, &o_res->fh, o_res->f_attr); + return server->nfs_client->rpc_ops->getattr(server, &o_res->fh, o_res->f_attr); return 0; } diff --git a/fs/nfs/super.c b/fs/nfs/super.c index d1b4a5b..e1e5eab 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -252,7 +252,7 @@ static int nfs_statfs(struct dentry *dentry, struct kstatfs *buf) lock_kernel(); - error = server->rpc_ops->statfs(server, fh, &res); + error = server->nfs_client->rpc_ops->statfs(server, fh, &res); buf->f_type = NFS_SUPER_MAGIC; if (error < 0) goto out_err; @@ -343,10 +343,11 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss, { 0, NULL, NULL } }; const struct proc_nfs_info *nfs_infop; + struct nfs_client *clp = nfss->nfs_client; char buf[12]; const char *proto; - seq_printf(m, ",vers=%d", nfss->rpc_ops->version); + seq_printf(m, ",vers=%d", clp->rpc_ops->version); seq_printf(m, ",rsize=%d", nfss->rsize); seq_printf(m, ",wsize=%d", nfss->wsize); if (nfss->acregmin != 3*HZ || showdefaults) @@ -427,7 +428,7 @@ static int nfs_show_stats(struct seq_file *m, struct vfsmount *mnt) seq_printf(m, ",namelen=%d", nfss->namelen); #ifdef CONFIG_NFS_V4 - if (nfss->rpc_ops->version == 4) { + if (nfss->nfs_client->cl_nfsversion == 4) { seq_printf(m, "\n\tnfsv4:\t"); seq_printf(m, "bm0=0x%x", nfss->attr_bitmask[0]); seq_printf(m, ",bm1=0x%x", nfss->attr_bitmask[1]); @@ -503,7 +504,7 @@ nfs_get_root(struct super_block *sb, struct nfs_fh *rootfh, struct nfs_fsinfo *f struct nfs_server *server = NFS_SB(sb); int error; - error = server->rpc_ops->getroot(server, rootfh, fsinfo); + error = server->nfs_client->rpc_ops->getroot(server, rootfh, fsinfo); if (error < 0) { dprintk("nfs_get_root: getattr error = %d\n", -error); return ERR_PTR(error); @@ -553,14 +554,14 @@ nfs_sb_init(struct super_block *sb, rpc_authflavor_t authflavor) no_root_error = -ENOMEM; goto out_no_root; } - sb->s_root->d_op = server->rpc_ops->dentry_ops; + sb->s_root->d_op = server->nfs_client->rpc_ops->dentry_ops; /* mount time stamp, in seconds */ server->mount_time = jiffies; /* Get some general file system info */ if (server->namelen == 0 && - server->rpc_ops->pathconf(server, &server->fh, &pathinfo) >= 0) + server->nfs_client->rpc_ops->pathconf(server, &server->fh, &pathinfo) >= 0) server->namelen = pathinfo.max_namelen; /* Work out a lot of parameters */ if (server->rsize == 0) @@ -663,9 +664,14 @@ nfs_create_client(struct nfs_server *server, const struct nfs_mount_data *data) struct rpc_xprt *xprt = NULL; struct rpc_clnt *clnt = NULL; int proto = (data->flags & NFS_MOUNT_TCP) ? IPPROTO_TCP : IPPROTO_UDP; + int nfsversion = 2; - clp = nfs_get_client(server->hostname, &server->addr, - server->rpc_ops->version); +#ifdef CONFIG_NFS_V3 + if (server->flags & NFS_MOUNT_VER3) + nfsversion = 3; +#endif + + clp = nfs_get_client(server->hostname, &server->addr, nfsversion); if (!clp) { dprintk("%s: failed to create NFS4 client.\n", __FUNCTION__); return ERR_PTR(PTR_ERR(clp)); @@ -676,6 +682,19 @@ nfs_create_client(struct nfs_server *server, const struct nfs_mount_data *data) server->retrans_timeo = timeparms.to_initval; server->retrans_count = timeparms.to_retries; + /* Check NFS protocol revision and initialize RPC op vector + * and file handle pool. */ +#ifdef CONFIG_NFS_V3 + if (nfsversion == 3) { + clp->rpc_ops = &nfs_v3_clientops; + server->caps |= NFS_CAP_READDIRPLUS; + } else { + clp->rpc_ops = &nfs_v2_clientops; + } +#else + clp->rpc_ops = &nfs_v2_clientops; +#endif + /* create transport and client */ xprt = xprt_create_proto(proto, &server->addr, &timeparms); if (IS_ERR(xprt)) { @@ -686,7 +705,7 @@ nfs_create_client(struct nfs_server *server, const struct nfs_mount_data *data) return (struct rpc_clnt *)xprt; } clnt = rpc_create_client(xprt, server->hostname, &nfs_program, - server->rpc_ops->version, data->pseudoflavor); + clp->cl_nfsversion, data->pseudoflavor); if (IS_ERR(clnt)) { dprintk("%s: cannot create RPC client. Error = %ld\n", __FUNCTION__, PTR_ERR(xprt)); @@ -750,7 +769,7 @@ static struct nfs_server *nfs_clone_server(struct super_block *sb, struct nfs_cl fsinfo.fattr = data->fattr; if (NFS_PROTO(root_inode)->fsinfo(server, data->fh, &fsinfo) == 0) nfs_super_set_maxbytes(sb, fsinfo.maxfilesize); - sb->s_root->d_op = server->rpc_ops->dentry_ops; + sb->s_root->d_op = server->nfs_client->rpc_ops->dentry_ops; sb->s_flags |= MS_ACTIVE; return server; out_put_root: @@ -865,19 +884,6 @@ nfs_fill_super(struct super_block *sb, struct nfs_mount_data *data, int silent) return -ENOMEM; strcpy(server->hostname, data->hostname); - /* Check NFS protocol revision and initialize RPC op vector - * and file handle pool. */ -#ifdef CONFIG_NFS_V3 - if (server->flags & NFS_MOUNT_VER3) { - server->rpc_ops = &nfs_v3_clientops; - server->caps |= NFS_CAP_READDIRPLUS; - } else { - server->rpc_ops = &nfs_v2_clientops; - } -#else - server->rpc_ops = &nfs_v2_clientops; -#endif - /* Fill in pseudoflavor for mount version < 5 */ if (!(data->flags & NFS_MOUNT_SECFLAVOUR)) data->pseudoflavor = RPC_AUTH_UNIX; @@ -888,6 +894,7 @@ nfs_fill_super(struct super_block *sb, struct nfs_mount_data *data, int silent) server->client = nfs_create_client(server, data); if (IS_ERR(server->client)) return PTR_ERR(server->client); + /* RFC 2623, sec 2.3.2 */ if (authflavor != RPC_AUTH_UNIX) { struct rpc_auth *auth; @@ -1129,6 +1136,8 @@ static struct rpc_clnt *nfs4_create_client(struct nfs_server *server, /* Now create transport and client */ if (clp->cl_cons_state == NFS_CS_INITING) { + clp->rpc_ops = &nfs_v4_clientops; + xprt = xprt_create_proto(proto, &server->addr, timeparms); if (IS_ERR(xprt)) { err = PTR_ERR(xprt); @@ -1139,7 +1148,7 @@ static struct rpc_clnt *nfs4_create_client(struct nfs_server *server, /* Bind to a reserved port! */ xprt->resvport = 1; clnt = rpc_create_client(xprt, server->hostname, &nfs_program, - server->rpc_ops->version, flavor); + clp->cl_nfsversion, flavor); if (IS_ERR(clnt)) { err = PTR_ERR(clnt); dprintk("%s: cannot create RPC client. Error = %d\n", @@ -1215,8 +1224,6 @@ static int nfs4_fill_super(struct super_block *sb, struct nfs4_mount_data *data, server->acdirmin = data->acdirmin*HZ; server->acdirmax = data->acdirmax*HZ; - server->rpc_ops = &nfs_v4_clientops; - nfs_init_timeout_values(&timeparms, data->proto, data->timeo, data->retrans); server->retrans_timeo = timeparms.to_initval; diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index 70e1dc9..51e9bd9 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -215,7 +215,7 @@ static inline struct nfs_inode *NFS_I(struct inode *inode) #define NFS_FH(inode) (&NFS_I(inode)->fh) #define NFS_SERVER(inode) (NFS_SB(inode->i_sb)) #define NFS_CLIENT(inode) (NFS_SERVER(inode)->client) -#define NFS_PROTO(inode) (NFS_SERVER(inode)->rpc_ops) +#define NFS_PROTO(inode) (NFS_SERVER(inode)->nfs_client->rpc_ops) #define NFS_ADDR(inode) (RPC_PEERADDR(NFS_CLIENT(inode))) #define NFS_COOKIEVERF(inode) (NFS_I(inode)->cookieverf) #define NFS_READTIME(inode) (NFS_I(inode)->read_cache_jiffies) diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index e7d7662..aae7c11 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -25,6 +25,7 @@ struct nfs_client { struct list_head cl_superblocks; /* List of nfs_server structs */ struct rpc_clnt * cl_rpcclient; + const struct nfs_rpc_ops *rpc_ops; /* NFS protocol vector */ #ifdef CONFIG_NFS_V4 u64 cl_clientid; /* constant */ @@ -74,7 +75,6 @@ struct nfs_server { struct rpc_clnt * client; /* RPC client handle */ struct rpc_clnt * client_sys; /* 2nd handle for FSINFO */ struct rpc_clnt * client_acl; /* ACL RPC client handle */ - const struct nfs_rpc_ops *rpc_ops; /* NFS protocol vector */ struct nfs_iostats * io_stats; /* I/O statistics */ struct backing_dev_info backing_dev_info; int flags; /* various flags */ -- cgit v0.10.2 From 5006a76cca8f86c6975c16fcf67e83b8b0eee2b6 Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:12 -0400 Subject: NFS: Eliminate client_sys in favour of cl_rpcclient Eliminate nfs_server::client_sys in favour of nfs_client::cl_rpcclient as we only really need one per server that we're talking to since it doesn't have any security on it. The retransmission management variables are also moved to the common struct as they're required to set up the cl_rpcclient connection. The NFS2/3 client and client_acl connections are thenceforth derived by cloning the cl_rpcclient connection and post-applying the authorisation flavour. The code for setting up the initial common connection has been moved to client.c as nfs_create_rpc_client(). All the NFS program definition tables are also moved there as that's where they're now required rather than super.c. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/client.c b/fs/nfs/client.c index cb5e924..c08cab9 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -51,6 +51,48 @@ static LIST_HEAD(nfs_client_list); static DECLARE_WAIT_QUEUE_HEAD(nfs_client_active_wq); /* + * RPC cruft for NFS + */ +static struct rpc_version *nfs_version[5] = { + [2] = &nfs_version2, +#ifdef CONFIG_NFS_V3 + [3] = &nfs_version3, +#endif +#ifdef CONFIG_NFS_V4 + [4] = &nfs_version4, +#endif +}; + +struct rpc_program nfs_program = { + .name = "nfs", + .number = NFS_PROGRAM, + .nrvers = ARRAY_SIZE(nfs_version), + .version = nfs_version, + .stats = &nfs_rpcstat, + .pipe_dir_name = "/nfs", +}; + +struct rpc_stat nfs_rpcstat = { + .program = &nfs_program +}; + + +#ifdef CONFIG_NFS_V3_ACL +static struct rpc_stat nfsacl_rpcstat = { &nfsacl_program }; +static struct rpc_version * nfsacl_version[] = { + [3] = &nfsacl_version3, +}; + +struct rpc_program nfsacl_program = { + .name = "nfsacl", + .number = NFS_ACL_PROGRAM, + .nrvers = ARRAY_SIZE(nfsacl_version), + .version = nfsacl_version, + .stats = &nfsacl_rpcstat, +}; +#endif /* CONFIG_NFS_V3_ACL */ + +/* * Allocate a shared client record * * Since these are allocated/deallocated very rarely, we don't @@ -310,3 +352,80 @@ void nfs_mark_client_ready(struct nfs_client *clp, int state) clp->cl_cons_state = state; wake_up_all(&nfs_client_active_wq); } + +/* + * Initialise the timeout values for a connection + */ +static void nfs_init_timeout_values(struct rpc_timeout *to, int proto, + unsigned int timeo, unsigned int retrans) +{ + to->to_initval = timeo * HZ / 10; + to->to_retries = retrans; + if (!to->to_retries) + to->to_retries = 2; + + switch (proto) { + case IPPROTO_TCP: + if (!to->to_initval) + to->to_initval = 60 * HZ; + if (to->to_initval > NFS_MAX_TCP_TIMEOUT) + to->to_initval = NFS_MAX_TCP_TIMEOUT; + to->to_increment = to->to_initval; + to->to_maxval = to->to_initval + (to->to_increment * to->to_retries); + to->to_exponential = 0; + break; + case IPPROTO_UDP: + default: + if (!to->to_initval) + to->to_initval = 11 * HZ / 10; + if (to->to_initval > NFS_MAX_UDP_TIMEOUT) + to->to_initval = NFS_MAX_UDP_TIMEOUT; + to->to_maxval = NFS_MAX_UDP_TIMEOUT; + to->to_exponential = 1; + break; + } +} + +/* + * Create an RPC client handle + */ +int nfs_create_rpc_client(struct nfs_client *clp, int proto, + unsigned int timeo, + unsigned int retrans, + rpc_authflavor_t flavor) +{ + struct rpc_timeout timeparms; + struct rpc_xprt *xprt = NULL; + struct rpc_clnt *clnt = NULL; + + if (!IS_ERR(clp->cl_rpcclient)) + return 0; + + nfs_init_timeout_values(&timeparms, proto, timeo, retrans); + clp->retrans_timeo = timeparms.to_initval; + clp->retrans_count = timeparms.to_retries; + + /* create transport and client */ + xprt = xprt_create_proto(proto, &clp->cl_addr, &timeparms); + if (IS_ERR(xprt)) { + dprintk("%s: cannot create RPC transport. Error = %ld\n", + __FUNCTION__, PTR_ERR(xprt)); + return PTR_ERR(xprt); + } + + /* Bind to a reserved port! */ + xprt->resvport = 1; + /* Create the client RPC handle */ + clnt = rpc_create_client(xprt, clp->cl_hostname, &nfs_program, + clp->rpc_ops->version, RPC_AUTH_UNIX); + if (IS_ERR(clnt)) { + dprintk("%s: cannot create RPC client. Error = %ld\n", + __FUNCTION__, PTR_ERR(clnt)); + return PTR_ERR(clnt); + } + + clnt->cl_intr = 1; + clnt->cl_softrtry = 1; + clp->cl_rpcclient = clnt; + return 0; +} diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index ac370d5..2f3aa52 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -20,6 +20,8 @@ extern void nfs_put_client(struct nfs_client *); extern struct nfs_client *nfs_find_client(const struct sockaddr_in *, int); extern struct nfs_client *nfs_get_client(const char *, const struct sockaddr_in *, int); extern void nfs_mark_client_ready(struct nfs_client *, int); +extern int nfs_create_rpc_client(struct nfs_client *, int, unsigned int, + unsigned int, rpc_authflavor_t); /* nfs4namespace.c */ #ifdef CONFIG_NFS_V4 diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c index 3e53712..0622af0 100644 --- a/fs/nfs/nfs3proc.c +++ b/fs/nfs/nfs3proc.c @@ -90,8 +90,8 @@ nfs3_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle, int status; status = do_proc_get_root(server->client, fhandle, info); - if (status && server->client_sys != server->client) - status = do_proc_get_root(server->client_sys, fhandle, info); + if (status && server->nfs_client->cl_rpcclient != server->client) + status = do_proc_get_root(server->nfs_client->cl_rpcclient, fhandle, info); return status; } @@ -785,7 +785,7 @@ nfs3_proc_fsinfo(struct nfs_server *server, struct nfs_fh *fhandle, dprintk("NFS call fsinfo\n"); nfs_fattr_init(info->fattr); - status = rpc_call_sync(server->client_sys, &msg, 0); + status = rpc_call_sync(server->nfs_client->cl_rpcclient, &msg, 0); dprintk("NFS reply fsinfo: %d\n", status); return status; } diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c index 7767690..5a8b940 100644 --- a/fs/nfs/proc.c +++ b/fs/nfs/proc.c @@ -66,14 +66,14 @@ nfs_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle, dprintk("%s: call getattr\n", __FUNCTION__); nfs_fattr_init(fattr); - status = rpc_call_sync(server->client_sys, &msg, 0); + status = rpc_call_sync(server->nfs_client->cl_rpcclient, &msg, 0); dprintk("%s: reply getattr: %d\n", __FUNCTION__, status); if (status) return status; dprintk("%s: call statfs\n", __FUNCTION__); msg.rpc_proc = &nfs_procedures[NFSPROC_STATFS]; msg.rpc_resp = &fsinfo; - status = rpc_call_sync(server->client_sys, &msg, 0); + status = rpc_call_sync(server->nfs_client->cl_rpcclient, &msg, 0); dprintk("%s: reply statfs: %d\n", __FUNCTION__, status); if (status) return status; diff --git a/fs/nfs/super.c b/fs/nfs/super.c index e1e5eab..8558341 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -60,52 +60,6 @@ */ #define NFS_MAX_READAHEAD (RPC_DEF_SLOT_TABLE - 1) -/* - * RPC cruft for NFS - */ -static struct rpc_version * nfs_version[] = { - NULL, - NULL, - &nfs_version2, -#if defined(CONFIG_NFS_V3) - &nfs_version3, -#elif defined(CONFIG_NFS_V4) - NULL, -#endif -#if defined(CONFIG_NFS_V4) - &nfs_version4, -#endif -}; - -static struct rpc_program nfs_program = { - .name = "nfs", - .number = NFS_PROGRAM, - .nrvers = ARRAY_SIZE(nfs_version), - .version = nfs_version, - .stats = &nfs_rpcstat, - .pipe_dir_name = "/nfs", -}; - -struct rpc_stat nfs_rpcstat = { - .program = &nfs_program -}; - - -#ifdef CONFIG_NFS_V3_ACL -static struct rpc_stat nfsacl_rpcstat = { &nfsacl_program }; -static struct rpc_version * nfsacl_version[] = { - [3] = &nfsacl_version3, -}; - -struct rpc_program nfsacl_program = { - .name = "nfsacl", - .number = NFS_ACL_PROGRAM, - .nrvers = ARRAY_SIZE(nfsacl_version), - .version = nfsacl_version, - .stats = &nfsacl_rpcstat, -}; -#endif /* CONFIG_NFS_V3_ACL */ - static void nfs_umount_begin(struct vfsmount *, int); static int nfs_statfs(struct dentry *, struct kstatfs *); static int nfs_show_options(struct seq_file *, struct vfsmount *); @@ -376,8 +330,8 @@ static void nfs_show_mount_options(struct seq_file *m, struct nfs_server *nfss, proto = buf; } seq_printf(m, ",proto=%s", proto); - seq_printf(m, ",timeo=%lu", 10U * nfss->retrans_timeo / HZ); - seq_printf(m, ",retrans=%u", nfss->retrans_count); + seq_printf(m, ",timeo=%lu", 10U * clp->retrans_timeo / HZ); + seq_printf(m, ",retrans=%u", clp->retrans_count); seq_printf(m, ",sec=%s", nfs_pseudoflavour_to_name(nfss->client->cl_auth->au_flavor)); } @@ -622,49 +576,16 @@ out_no_root: } /* - * Initialise the timeout values for a connection - */ -static void nfs_init_timeout_values(struct rpc_timeout *to, int proto, unsigned int timeo, unsigned int retrans) -{ - to->to_initval = timeo * HZ / 10; - to->to_retries = retrans; - if (!to->to_retries) - to->to_retries = 2; - - switch (proto) { - case IPPROTO_TCP: - if (!to->to_initval) - to->to_initval = 60 * HZ; - if (to->to_initval > NFS_MAX_TCP_TIMEOUT) - to->to_initval = NFS_MAX_TCP_TIMEOUT; - to->to_increment = to->to_initval; - to->to_maxval = to->to_initval + (to->to_increment * to->to_retries); - to->to_exponential = 0; - break; - case IPPROTO_UDP: - default: - if (!to->to_initval) - to->to_initval = 11 * HZ / 10; - if (to->to_initval > NFS_MAX_UDP_TIMEOUT) - to->to_initval = NFS_MAX_UDP_TIMEOUT; - to->to_maxval = NFS_MAX_UDP_TIMEOUT; - to->to_exponential = 1; - break; - } -} - -/* * Create an RPC client handle. */ static struct rpc_clnt * nfs_create_client(struct nfs_server *server, const struct nfs_mount_data *data) { struct nfs_client *clp; - struct rpc_timeout timeparms; - struct rpc_xprt *xprt = NULL; - struct rpc_clnt *clnt = NULL; + struct rpc_clnt *clnt; int proto = (data->flags & NFS_MOUNT_TCP) ? IPPROTO_TCP : IPPROTO_UDP; int nfsversion = 2; + int err; #ifdef CONFIG_NFS_V3 if (server->flags & NFS_MOUNT_VER3) @@ -677,52 +598,54 @@ nfs_create_client(struct nfs_server *server, const struct nfs_mount_data *data) return ERR_PTR(PTR_ERR(clp)); } - nfs_init_timeout_values(&timeparms, proto, data->timeo, data->retrans); - - server->retrans_timeo = timeparms.to_initval; - server->retrans_count = timeparms.to_retries; - - /* Check NFS protocol revision and initialize RPC op vector - * and file handle pool. */ + if (clp->cl_cons_state == NFS_CS_INITING) { + /* Check NFS protocol revision and initialize RPC op + * vector and file handle pool. */ #ifdef CONFIG_NFS_V3 - if (nfsversion == 3) { - clp->rpc_ops = &nfs_v3_clientops; - server->caps |= NFS_CAP_READDIRPLUS; - } else { - clp->rpc_ops = &nfs_v2_clientops; - } + if (nfsversion == 3) { + clp->rpc_ops = &nfs_v3_clientops; + server->caps |= NFS_CAP_READDIRPLUS; + } else { + clp->rpc_ops = &nfs_v2_clientops; + } #else - clp->rpc_ops = &nfs_v2_clientops; + clp->rpc_ops = &nfs_v2_clientops; #endif - /* create transport and client */ - xprt = xprt_create_proto(proto, &server->addr, &timeparms); - if (IS_ERR(xprt)) { - dprintk("%s: cannot create RPC transport. Error = %ld\n", - __FUNCTION__, PTR_ERR(xprt)); - nfs_mark_client_ready(clp, PTR_ERR(xprt)); - nfs_put_client(clp); - return (struct rpc_clnt *)xprt; + /* create transport and client */ + err = nfs_create_rpc_client(clp, proto, data->timeo, + data->retrans, RPC_AUTH_UNIX); + if (err < 0) + goto client_init_error; + + nfs_mark_client_ready(clp, 0); } - clnt = rpc_create_client(xprt, server->hostname, &nfs_program, - clp->cl_nfsversion, data->pseudoflavor); + + /* create an nfs_server-specific client */ + clnt = rpc_clone_client(clp->cl_rpcclient); if (IS_ERR(clnt)) { - dprintk("%s: cannot create RPC client. Error = %ld\n", - __FUNCTION__, PTR_ERR(xprt)); - goto out_fail; + dprintk("%s: couldn't create rpc_client!\n", __FUNCTION__); + nfs_put_client(clp); + return ERR_PTR(PTR_ERR(clnt)); } - clnt->cl_intr = 1; - clnt->cl_softrtry = 1; + if (data->pseudoflavor != clp->cl_rpcclient->cl_auth->au_flavor) { + struct rpc_auth *auth; + + auth = rpcauth_create(data->pseudoflavor, server->client); + if (IS_ERR(auth)) { + dprintk("%s: couldn't create credcache!\n", __FUNCTION__); + return ERR_PTR(PTR_ERR(auth)); + } + } - nfs_mark_client_ready(clp, 0); server->nfs_client = clp; return clnt; -out_fail: - nfs_mark_client_ready(clp, PTR_ERR(xprt)); +client_init_error: + nfs_mark_client_ready(clp, err); nfs_put_client(clp); - return clnt; + return ERR_PTR(err); } /* @@ -741,7 +664,7 @@ static struct nfs_server *nfs_clone_server(struct super_block *sb, struct nfs_cl sb->s_blocksize_bits = data->sb->s_blocksize_bits; sb->s_maxbytes = data->sb->s_maxbytes; - server->client_sys = server->client_acl = ERR_PTR(-EINVAL); + server->client_acl = ERR_PTR(-EINVAL); server->io_stats = nfs_alloc_iostats(); if (server->io_stats == NULL) goto out; @@ -750,11 +673,6 @@ static struct nfs_server *nfs_clone_server(struct super_block *sb, struct nfs_cl if (IS_ERR((err = server->client))) goto out; - if (!IS_ERR(parent->client_sys)) { - server->client_sys = rpc_clone_client(parent->client_sys); - if (IS_ERR((err = server->client_sys))) - goto out; - } if (!IS_ERR(parent->client_acl)) { server->client_acl = rpc_clone_client(parent->client_acl); if (IS_ERR((err = server->client_acl))) @@ -813,7 +731,7 @@ static int nfs_clone_generic_sb(struct nfs_clone_mount *data, error = PTR_ERR(sb); goto kill_rpciod; } - + if (sb->s_root) goto out_rpciod_down; @@ -896,19 +814,6 @@ nfs_fill_super(struct super_block *sb, struct nfs_mount_data *data, int silent) return PTR_ERR(server->client); /* RFC 2623, sec 2.3.2 */ - if (authflavor != RPC_AUTH_UNIX) { - struct rpc_auth *auth; - - server->client_sys = rpc_clone_client(server->client); - if (IS_ERR(server->client_sys)) - return PTR_ERR(server->client_sys); - auth = rpcauth_create(RPC_AUTH_UNIX, server->client_sys); - if (IS_ERR(auth)) - return PTR_ERR(auth); - } else { - atomic_inc(&server->client->cl_count); - server->client_sys = server->client; - } if (server->flags & NFS_MOUNT_VER3) { #ifdef CONFIG_NFS_V3_ACL if (!(server->flags & NFS_MOUNT_NOACL)) { @@ -1012,7 +917,7 @@ static int nfs_get_sb(struct file_system_type *fs_type, goto out_err_noserver; /* Zero out the NFS state stuff */ init_nfsv4_state(server); - server->client = server->client_sys = server->client_acl = ERR_PTR(-EINVAL); + server->client = server->client_acl = ERR_PTR(-EINVAL); root = &server->fh; if (data->flags & NFS_MOUNT_VER3) @@ -1083,8 +988,6 @@ static void nfs_kill_super(struct super_block *s) if (!IS_ERR(server->client)) rpc_shutdown_client(server->client); - if (!IS_ERR(server->client_sys)) - rpc_shutdown_client(server->client_sys); if (!IS_ERR(server->client_acl)) rpc_shutdown_client(server->client_acl); @@ -1121,10 +1024,9 @@ static int nfs_clone_nfs_sb(struct file_system_type *fs_type, #ifdef CONFIG_NFS_V4 static struct rpc_clnt *nfs4_create_client(struct nfs_server *server, - struct rpc_timeout *timeparms, int proto, rpc_authflavor_t flavor) + int timeo, int retrans, int proto, rpc_authflavor_t flavor) { struct nfs_client *clp; - struct rpc_xprt *xprt = NULL; struct rpc_clnt *clnt = NULL; int err = -EIO; @@ -1138,26 +1040,10 @@ static struct rpc_clnt *nfs4_create_client(struct nfs_server *server, if (clp->cl_cons_state == NFS_CS_INITING) { clp->rpc_ops = &nfs_v4_clientops; - xprt = xprt_create_proto(proto, &server->addr, timeparms); - if (IS_ERR(xprt)) { - err = PTR_ERR(xprt); - dprintk("%s: cannot create RPC transport. Error = %d\n", - __FUNCTION__, err); + err = nfs_create_rpc_client(clp, proto, timeo, retrans, flavor); + if (err < 0) goto client_init_error; - } - /* Bind to a reserved port! */ - xprt->resvport = 1; - clnt = rpc_create_client(xprt, server->hostname, &nfs_program, - clp->cl_nfsversion, flavor); - if (IS_ERR(clnt)) { - err = PTR_ERR(clnt); - dprintk("%s: cannot create RPC client. Error = %d\n", - __FUNCTION__, err); - goto client_init_error; - } - clnt->cl_intr = 1; - clnt->cl_softrtry = 1; - clp->cl_rpcclient = clnt; + memcpy(clp->cl_ipaddr, server->ip_addr, sizeof(clp->cl_ipaddr)); err = nfs_idmap_new(clp); if (err < 0) { @@ -1205,7 +1091,6 @@ client_init_error: static int nfs4_fill_super(struct super_block *sb, struct nfs4_mount_data *data, int silent) { struct nfs_server *server; - struct rpc_timeout timeparms; rpc_authflavor_t authflavour; int err = -EIO; @@ -1224,11 +1109,6 @@ static int nfs4_fill_super(struct super_block *sb, struct nfs4_mount_data *data, server->acdirmin = data->acdirmin*HZ; server->acdirmax = data->acdirmax*HZ; - nfs_init_timeout_values(&timeparms, data->proto, data->timeo, data->retrans); - - server->retrans_timeo = timeparms.to_initval; - server->retrans_count = timeparms.to_retries; - /* Now create transport and client */ authflavour = RPC_AUTH_UNIX; if (data->auth_flavourlen != 0) { @@ -1244,7 +1124,8 @@ static int nfs4_fill_super(struct super_block *sb, struct nfs4_mount_data *data, } } - server->client = nfs4_create_client(server, &timeparms, data->proto, authflavour); + server->client = nfs4_create_client(server, data->timeo, data->retrans, + data->proto, authflavour); if (IS_ERR(server->client)) { err = PTR_ERR(server->client); dprintk("%s: cannot create RPC client. Error = %d\n", @@ -1318,7 +1199,7 @@ static int nfs4_get_sb(struct file_system_type *fs_type, return -ENOMEM; /* Zero out the NFS state stuff */ init_nfsv4_state(server); - server->client = server->client_sys = server->client_acl = ERR_PTR(-EINVAL); + server->client = server->client_acl = ERR_PTR(-EINVAL); p = nfs_copy_user_string(NULL, &data->hostname, 256); if (IS_ERR(p)) @@ -1489,7 +1370,6 @@ err: static struct nfs_server *nfs4_referral_server(struct super_block *sb, struct nfs_clone_mount *data) { struct nfs_server *server = NFS_SB(sb); - struct rpc_timeout timeparms; int proto, timeo, retrans; void *err; @@ -1498,11 +1378,11 @@ static struct nfs_server *nfs4_referral_server(struct super_block *sb, struct nf set the timeouts and retries to low values */ timeo = 2; retrans = 1; - nfs_init_timeout_values(&timeparms, proto, timeo, retrans); nfs_put_client(server->nfs_client); server->nfs_client = NULL; - server->client = nfs4_create_client(server, &timeparms, proto, data->authflavor); + server->client = nfs4_create_client(server, timeo, retrans, proto, + data->authflavor); if (IS_ERR((err = server->client))) goto out_err; diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index aae7c11..d404cec 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -26,6 +26,8 @@ struct nfs_client { struct rpc_clnt * cl_rpcclient; const struct nfs_rpc_ops *rpc_ops; /* NFS protocol vector */ + unsigned long retrans_timeo; /* retransmit timeout */ + unsigned int retrans_count; /* number of retransmit tries */ #ifdef CONFIG_NFS_V4 u64 cl_clientid; /* constant */ @@ -73,7 +75,6 @@ struct nfs_client { struct nfs_server { struct nfs_client * nfs_client; /* shared client and NFS4 state */ struct rpc_clnt * client; /* RPC client handle */ - struct rpc_clnt * client_sys; /* 2nd handle for FSINFO */ struct rpc_clnt * client_acl; /* ACL RPC client handle */ struct nfs_iostats * io_stats; /* I/O statistics */ struct backing_dev_info backing_dev_info; @@ -90,8 +91,6 @@ struct nfs_server { unsigned int acregmax; unsigned int acdirmin; unsigned int acdirmax; - unsigned long retrans_timeo; /* retransmit timeout */ - unsigned int retrans_count; /* number of retransmit tries */ unsigned int namelen; char * hostname; /* remote hostname */ struct nfs_fh fh; -- cgit v0.10.2 From cf6d7b5de8535a9f0088c5cc28ee2dae87371b4a Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:12 -0400 Subject: NFS: Start rpciod in server common management Start rpciod in the server common (nfs_client struct) management code rather than in the superblock management code. This means we only need to "start" it once per server instead of once per superblock. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/super.c b/fs/nfs/super.c index 8558341..5842d51 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -722,18 +722,15 @@ static int nfs_clone_generic_sb(struct nfs_clone_mount *data, if (server->hostname == NULL) goto free_server; memcpy(server->hostname, hostname, len); - error = rpciod_up(); - if (error != 0) - goto free_hostname; sb = fill_sb(server, data); if (IS_ERR(sb)) { error = PTR_ERR(sb); - goto kill_rpciod; + goto free_hostname; } if (sb->s_root) - goto out_rpciod_down; + goto out_share; server = fill_server(sb, data); if (IS_ERR(server)) { @@ -745,14 +742,11 @@ out_deactivate: up_write(&sb->s_umount); deactivate_super(sb); return error; -out_rpciod_down: - rpciod_down(); +out_share: kfree(server->hostname); nfs_put_client(server->nfs_client); kfree(server); return simple_set_mnt(mnt, sb); -kill_rpciod: - rpciod_down(); free_hostname: kfree(server->hostname); free_server: @@ -939,22 +933,14 @@ static int nfs_get_sb(struct file_system_type *fs_type, goto out_err; } - /* Fire up rpciod if not yet running */ - error = rpciod_up(); - if (error < 0) { - dprintk("%s: couldn't start rpciod! Error = %d\n", - __FUNCTION__, error); - goto out_err; - } - s = sget(fs_type, nfs_compare_super, nfs_set_super, server); if (IS_ERR(s)) { error = PTR_ERR(s); - goto out_err_rpciod; + goto out_err; } if (s->s_root) - goto out_rpciod_down; + goto out_share; s->s_flags = flags; @@ -967,13 +953,10 @@ static int nfs_get_sb(struct file_system_type *fs_type, s->s_flags |= MS_ACTIVE; return simple_set_mnt(mnt, s); -out_rpciod_down: - rpciod_down(); +out_share: kfree(server); return simple_set_mnt(mnt, s); -out_err_rpciod: - rpciod_down(); out_err: kfree(server); out_err_noserver: @@ -994,8 +977,6 @@ static void nfs_kill_super(struct super_block *s) if (!(server->flags & NFS_MOUNT_NONLM)) lockd_down(); /* release rpc.lockd */ - rpciod_down(); /* release rpciod */ - nfs_free_iostats(server->io_stats); kfree(server->hostname); nfs_put_client(server->nfs_client); -- cgit v0.10.2 From 54ceac4515986030c2502960be620198dd8fe25b Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:13 -0400 Subject: NFS: Share NFS superblocks per-protocol per-server per-FSID The attached patch makes NFS share superblocks between mounts from the same server and FSID over the same protocol. It does this by creating each superblock with a false root and returning the real root dentry in the vfsmount presented by get_sb(). The root dentry set starts off as an anonymous dentry if we don't already have the dentry for its inode, otherwise it simply returns the dentry we already have. We may thus end up with several trees of dentries in the superblock, and if at some later point one of anonymous tree roots is discovered by normal filesystem activity to be located in another tree within the superblock, the anonymous root is named and materialises attached to the second tree at the appropriate point. Why do it this way? Why not pass an extra argument to the mount() syscall to indicate the subpath and then pathwalk from the server root to the desired directory? You can't guarantee this will work for two reasons: (1) The root and intervening nodes may not be accessible to the client. With NFS2 and NFS3, for instance, mountd is called on the server to get the filehandle for the tip of a path. mountd won't give us handles for anything we don't have permission to access, and so we can't set up NFS inodes for such nodes, and so can't easily set up dentries (we'd have to have ghost inodes or something). With this patch we don't actually create dentries until we get handles from the server that we can use to set up their inodes, and we don't actually bind them into the tree until we know for sure where they go. (2) Inaccessible symbolic links. If we're asked to mount two exports from the server, eg: mount warthog:/warthog/aaa/xxx /mmm mount warthog:/warthog/bbb/yyy /nnn We may not be able to access anything nearer the root than xxx and yyy, but we may find out later that /mmm/www/yyy, say, is actually the same directory as the one mounted on /nnn. What we might then find out, for example, is that /warthog/bbb was actually a symbolic link to /warthog/aaa/xxx/www, but we can't actually determine that by talking to the server until /warthog is made available by NFS. This would lead to having constructed an errneous dentry tree which we can't easily fix. We can end up with a dentry marked as a directory when it should actually be a symlink, or we could end up with an apparently hardlinked directory. With this patch we need not make assumptions about the type of a dentry for which we can't retrieve information, nor need we assume we know its place in the grand scheme of things until we actually see that place. This patch reduces the possibility of aliasing in the inode and page caches for inodes that may be accessed by more than one NFS export. It also reduces the number of superblocks required for NFS where there are many NFS exports being used from a server (home directory server + autofs for example). This in turn makes it simpler to do local caching of network filesystems, as it can then be guaranteed that there won't be links from multiple inodes in separate superblocks to the same cache file. Obviously, cache aliasing between different levels of NFS protocol could still be a problem, but at least that gives us another key to use when indexing the cache. This patch makes the following changes: (1) The server record construction/destruction has been abstracted out into its own set of functions to make things easier to get right. These have been moved into fs/nfs/client.c. All the code in fs/nfs/client.c has to do with the management of connections to servers, and doesn't touch superblocks in any way; the remaining code in fs/nfs/super.c has to do with VFS superblock management. (2) The sequence of events undertaken by NFS mount is now reordered: (a) A volume representation (struct nfs_server) is allocated. (b) A server representation (struct nfs_client) is acquired. This may be allocated or shared, and is keyed on server address, port and NFS version. (c) If allocated, the client representation is initialised. The state member variable of nfs_client is used to prevent a race during initialisation from two mounts. (d) For NFS4 a simple pathwalk is performed, walking from FH to FH to find the root filehandle for the mount (fs/nfs/getroot.c). For NFS2/3 we are given the root FH in advance. (e) The volume FSID is probed for on the root FH. (f) The volume representation is initialised from the FSINFO record retrieved on the root FH. (g) sget() is called to acquire a superblock. This may be allocated or shared, keyed on client pointer and FSID. (h) If allocated, the superblock is initialised. (i) If the superblock is shared, then the new nfs_server record is discarded. (j) The root dentry for this mount is looked up from the root FH. (k) The root dentry for this mount is assigned to the vfsmount. (3) nfs_readdir_lookup() creates dentries for each of the entries readdir() returns; this function now attaches disconnected trees from alternate roots that happen to be discovered attached to a directory being read (in the same way nfs_lookup() is made to do for lookup ops). The new d_materialise_unique() function is now used to do this, thus permitting the whole thing to be done under one set of locks, and thus avoiding any race between mount and lookup operations on the same directory. (4) The client management code uses a new debug facility: NFSDBG_CLIENT which is set by echoing 1024 to /proc/net/sunrpc/nfs_debug. (5) Clone mounts are now called xdev mounts. (6) Use the dentry passed to the statfs() op as the handle for retrieving fs statistics rather than the root dentry of the superblock (which is now a dummy). Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/Makefile b/fs/nfs/Makefile index 3b993a6..f4580b4 100644 --- a/fs/nfs/Makefile +++ b/fs/nfs/Makefile @@ -4,7 +4,7 @@ obj-$(CONFIG_NFS_FS) += nfs.o -nfs-y := client.o dir.o file.o inode.o super.o nfs2xdr.o \ +nfs-y := client.o dir.o file.o getroot.o inode.o super.o nfs2xdr.o \ pagelist.o proc.o read.o symlink.o unlink.o \ write.o namespace.o nfs-$(CONFIG_ROOT_NFS) += nfsroot.o mount_clnt.o diff --git a/fs/nfs/client.c b/fs/nfs/client.c index c08cab9..dafba60 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -48,6 +48,7 @@ static DEFINE_SPINLOCK(nfs_client_lock); static LIST_HEAD(nfs_client_list); +static LIST_HEAD(nfs_volume_list); static DECLARE_WAIT_QUEUE_HEAD(nfs_client_active_wq); /* @@ -268,9 +269,9 @@ struct nfs_client *nfs_find_client(const struct sockaddr_in *addr, int nfsversio * Look up a client by IP address and protocol version * - creates a new record if one doesn't yet exist */ -struct nfs_client *nfs_get_client(const char *hostname, - const struct sockaddr_in *addr, - int nfsversion) +static struct nfs_client *nfs_get_client(const char *hostname, + const struct sockaddr_in *addr, + int nfsversion) { struct nfs_client *clp, *new = NULL; int error; @@ -340,6 +341,8 @@ found_client: return ERR_PTR(error); } + BUG_ON(clp->cl_cons_state != NFS_CS_READY); + dprintk("--> nfs_get_client() = %p [share]\n", clp); return clp; } @@ -347,7 +350,7 @@ found_client: /* * Mark a server as ready or failed */ -void nfs_mark_client_ready(struct nfs_client *clp, int state) +static void nfs_mark_client_ready(struct nfs_client *clp, int state) { clp->cl_cons_state = state; wake_up_all(&nfs_client_active_wq); @@ -389,10 +392,10 @@ static void nfs_init_timeout_values(struct rpc_timeout *to, int proto, /* * Create an RPC client handle */ -int nfs_create_rpc_client(struct nfs_client *clp, int proto, - unsigned int timeo, - unsigned int retrans, - rpc_authflavor_t flavor) +static int nfs_create_rpc_client(struct nfs_client *clp, int proto, + unsigned int timeo, + unsigned int retrans, + rpc_authflavor_t flavor) { struct rpc_timeout timeparms; struct rpc_xprt *xprt = NULL; @@ -429,3 +432,719 @@ int nfs_create_rpc_client(struct nfs_client *clp, int proto, clp->cl_rpcclient = clnt; return 0; } + +/* + * Version 2 or 3 client destruction + */ +static void nfs_destroy_server(struct nfs_server *server) +{ + if (!IS_ERR(server->client_acl)) + rpc_shutdown_client(server->client_acl); + + if (!(server->flags & NFS_MOUNT_NONLM)) + lockd_down(); /* release rpc.lockd */ +} + +/* + * Version 2 or 3 lockd setup + */ +static int nfs_start_lockd(struct nfs_server *server) +{ + int error = 0; + + if (server->nfs_client->cl_nfsversion > 3) + goto out; + if (server->flags & NFS_MOUNT_NONLM) + goto out; + error = lockd_up(); + if (error < 0) + server->flags |= NFS_MOUNT_NONLM; + else + server->destroy = nfs_destroy_server; +out: + return error; +} + +/* + * Initialise an NFSv3 ACL client connection + */ +#ifdef CONFIG_NFS_V3_ACL +static void nfs_init_server_aclclient(struct nfs_server *server) +{ + if (server->nfs_client->cl_nfsversion != 3) + goto out_noacl; + if (server->flags & NFS_MOUNT_NOACL) + goto out_noacl; + + server->client_acl = rpc_bind_new_program(server->client, &nfsacl_program, 3); + if (IS_ERR(server->client_acl)) + goto out_noacl; + + /* No errors! Assume that Sun nfsacls are supported */ + server->caps |= NFS_CAP_ACLS; + return; + +out_noacl: + server->caps &= ~NFS_CAP_ACLS; +} +#else +static inline void nfs_init_server_aclclient(struct nfs_server *server) +{ + server->flags &= ~NFS_MOUNT_NOACL; + server->caps &= ~NFS_CAP_ACLS; +} +#endif + +/* + * Create a general RPC client + */ +static int nfs_init_server_rpcclient(struct nfs_server *server, rpc_authflavor_t pseudoflavour) +{ + struct nfs_client *clp = server->nfs_client; + + server->client = rpc_clone_client(clp->cl_rpcclient); + if (IS_ERR(server->client)) { + dprintk("%s: couldn't create rpc_client!\n", __FUNCTION__); + return PTR_ERR(server->client); + } + + if (pseudoflavour != clp->cl_rpcclient->cl_auth->au_flavor) { + struct rpc_auth *auth; + + auth = rpcauth_create(pseudoflavour, server->client); + if (IS_ERR(auth)) { + dprintk("%s: couldn't create credcache!\n", __FUNCTION__); + return PTR_ERR(auth); + } + } + server->client->cl_softrtry = 0; + if (server->flags & NFS_MOUNT_SOFT) + server->client->cl_softrtry = 1; + + server->client->cl_intr = 0; + if (server->flags & NFS4_MOUNT_INTR) + server->client->cl_intr = 1; + + return 0; +} + +/* + * Initialise an NFS2 or NFS3 client + */ +static int nfs_init_client(struct nfs_client *clp, const struct nfs_mount_data *data) +{ + int proto = (data->flags & NFS_MOUNT_TCP) ? IPPROTO_TCP : IPPROTO_UDP; + int error; + + if (clp->cl_cons_state == NFS_CS_READY) { + /* the client is already initialised */ + dprintk("<-- nfs_init_client() = 0 [already %p]\n", clp); + return 0; + } + + /* Check NFS protocol revision and initialize RPC op vector */ + clp->rpc_ops = &nfs_v2_clientops; +#ifdef CONFIG_NFS_V3 + if (clp->cl_nfsversion == 3) + clp->rpc_ops = &nfs_v3_clientops; +#endif + /* + * Create a client RPC handle for doing FSSTAT with UNIX auth only + * - RFC 2623, sec 2.3.2 + */ + error = nfs_create_rpc_client(clp, proto, data->timeo, data->retrans, + RPC_AUTH_UNIX); + if (error < 0) + goto error; + nfs_mark_client_ready(clp, NFS_CS_READY); + return 0; + +error: + nfs_mark_client_ready(clp, error); + dprintk("<-- nfs_init_client() = xerror %d\n", error); + return error; +} + +/* + * Create a version 2 or 3 client + */ +static int nfs_init_server(struct nfs_server *server, const struct nfs_mount_data *data) +{ + struct nfs_client *clp; + int error, nfsvers = 2; + + dprintk("--> nfs_init_server()\n"); + +#ifdef CONFIG_NFS_V3 + if (data->flags & NFS_MOUNT_VER3) + nfsvers = 3; +#endif + + /* Allocate or find a client reference we can use */ + clp = nfs_get_client(data->hostname, &data->addr, nfsvers); + if (IS_ERR(clp)) { + dprintk("<-- nfs_init_server() = error %ld\n", PTR_ERR(clp)); + return PTR_ERR(clp); + } + + error = nfs_init_client(clp, data); + if (error < 0) + goto error; + + server->nfs_client = clp; + + /* Initialise the client representation from the mount data */ + server->flags = data->flags & NFS_MOUNT_FLAGMASK; + + if (data->rsize) + server->rsize = nfs_block_size(data->rsize, NULL); + if (data->wsize) + server->wsize = nfs_block_size(data->wsize, NULL); + + server->acregmin = data->acregmin * HZ; + server->acregmax = data->acregmax * HZ; + server->acdirmin = data->acdirmin * HZ; + server->acdirmax = data->acdirmax * HZ; + + /* Start lockd here, before we might error out */ + error = nfs_start_lockd(server); + if (error < 0) + goto error; + + error = nfs_init_server_rpcclient(server, data->pseudoflavor); + if (error < 0) + goto error; + + server->namelen = data->namlen; + /* Create a client RPC handle for the NFSv3 ACL management interface */ + nfs_init_server_aclclient(server); + if (clp->cl_nfsversion == 3) { + if (server->namelen == 0 || server->namelen > NFS3_MAXNAMLEN) + server->namelen = NFS3_MAXNAMLEN; + server->caps |= NFS_CAP_READDIRPLUS; + } else { + if (server->namelen == 0 || server->namelen > NFS2_MAXNAMLEN) + server->namelen = NFS2_MAXNAMLEN; + } + + dprintk("<-- nfs_init_server() = 0 [new %p]\n", clp); + return 0; + +error: + server->nfs_client = NULL; + nfs_put_client(clp); + dprintk("<-- nfs_init_server() = xerror %d\n", error); + return error; +} + +/* + * Load up the server record from information gained in an fsinfo record + */ +static void nfs_server_set_fsinfo(struct nfs_server *server, struct nfs_fsinfo *fsinfo) +{ + unsigned long max_rpc_payload; + + /* Work out a lot of parameters */ + if (server->rsize == 0) + server->rsize = nfs_block_size(fsinfo->rtpref, NULL); + if (server->wsize == 0) + server->wsize = nfs_block_size(fsinfo->wtpref, NULL); + + if (fsinfo->rtmax >= 512 && server->rsize > fsinfo->rtmax) + server->rsize = nfs_block_size(fsinfo->rtmax, NULL); + if (fsinfo->wtmax >= 512 && server->wsize > fsinfo->wtmax) + server->wsize = nfs_block_size(fsinfo->wtmax, NULL); + + max_rpc_payload = nfs_block_size(rpc_max_payload(server->client), NULL); + if (server->rsize > max_rpc_payload) + server->rsize = max_rpc_payload; + if (server->rsize > NFS_MAX_FILE_IO_SIZE) + server->rsize = NFS_MAX_FILE_IO_SIZE; + server->rpages = (server->rsize + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; + server->backing_dev_info.ra_pages = server->rpages * NFS_MAX_READAHEAD; + + if (server->wsize > max_rpc_payload) + server->wsize = max_rpc_payload; + if (server->wsize > NFS_MAX_FILE_IO_SIZE) + server->wsize = NFS_MAX_FILE_IO_SIZE; + server->wpages = (server->wsize + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; + server->wtmult = nfs_block_bits(fsinfo->wtmult, NULL); + + server->dtsize = nfs_block_size(fsinfo->dtpref, NULL); + if (server->dtsize > PAGE_CACHE_SIZE) + server->dtsize = PAGE_CACHE_SIZE; + if (server->dtsize > server->rsize) + server->dtsize = server->rsize; + + if (server->flags & NFS_MOUNT_NOAC) { + server->acregmin = server->acregmax = 0; + server->acdirmin = server->acdirmax = 0; + } + + server->maxfilesize = fsinfo->maxfilesize; + + /* We're airborne Set socket buffersize */ + rpc_setbufsize(server->client, server->wsize + 100, server->rsize + 100); +} + +/* + * Probe filesystem information, including the FSID on v2/v3 + */ +static int nfs_probe_fsinfo(struct nfs_server *server, struct nfs_fh *mntfh, struct nfs_fattr *fattr) +{ + struct nfs_fsinfo fsinfo; + struct nfs_client *clp = server->nfs_client; + int error; + + dprintk("--> nfs_probe_fsinfo()\n"); + + if (clp->rpc_ops->set_capabilities != NULL) { + error = clp->rpc_ops->set_capabilities(server, mntfh); + if (error < 0) + goto out_error; + } + + fsinfo.fattr = fattr; + nfs_fattr_init(fattr); + error = clp->rpc_ops->fsinfo(server, mntfh, &fsinfo); + if (error < 0) + goto out_error; + + nfs_server_set_fsinfo(server, &fsinfo); + + /* Get some general file system info */ + if (server->namelen == 0) { + struct nfs_pathconf pathinfo; + + pathinfo.fattr = fattr; + nfs_fattr_init(fattr); + + if (clp->rpc_ops->pathconf(server, mntfh, &pathinfo) >= 0) + server->namelen = pathinfo.max_namelen; + } + + dprintk("<-- nfs_probe_fsinfo() = 0\n"); + return 0; + +out_error: + dprintk("nfs_probe_fsinfo: error = %d\n", -error); + return error; +} + +/* + * Copy useful information when duplicating a server record + */ +static void nfs_server_copy_userdata(struct nfs_server *target, struct nfs_server *source) +{ + target->flags = source->flags; + target->acregmin = source->acregmin; + target->acregmax = source->acregmax; + target->acdirmin = source->acdirmin; + target->acdirmax = source->acdirmax; + target->caps = source->caps; +} + +/* + * Allocate and initialise a server record + */ +static struct nfs_server *nfs_alloc_server(void) +{ + struct nfs_server *server; + + server = kzalloc(sizeof(struct nfs_server), GFP_KERNEL); + if (!server) + return NULL; + + server->client = server->client_acl = ERR_PTR(-EINVAL); + + /* Zero out the NFS state stuff */ + INIT_LIST_HEAD(&server->client_link); + INIT_LIST_HEAD(&server->master_link); + + server->io_stats = nfs_alloc_iostats(); + if (!server->io_stats) { + kfree(server); + return NULL; + } + + return server; +} + +/* + * Free up a server record + */ +void nfs_free_server(struct nfs_server *server) +{ + dprintk("--> nfs_free_server()\n"); + + spin_lock(&nfs_client_lock); + list_del(&server->client_link); + list_del(&server->master_link); + spin_unlock(&nfs_client_lock); + + if (server->destroy != NULL) + server->destroy(server); + if (!IS_ERR(server->client)) + rpc_shutdown_client(server->client); + + nfs_put_client(server->nfs_client); + + nfs_free_iostats(server->io_stats); + kfree(server); + nfs_release_automount_timer(); + dprintk("<-- nfs_free_server()\n"); +} + +/* + * Create a version 2 or 3 volume record + * - keyed on server and FSID + */ +struct nfs_server *nfs_create_server(const struct nfs_mount_data *data, + struct nfs_fh *mntfh) +{ + struct nfs_server *server; + struct nfs_fattr fattr; + int error; + + server = nfs_alloc_server(); + if (!server) + return ERR_PTR(-ENOMEM); + + /* Get a client representation */ + error = nfs_init_server(server, data); + if (error < 0) + goto error; + + BUG_ON(!server->nfs_client); + BUG_ON(!server->nfs_client->rpc_ops); + BUG_ON(!server->nfs_client->rpc_ops->file_inode_ops); + + /* Probe the root fh to retrieve its FSID */ + error = nfs_probe_fsinfo(server, mntfh, &fattr); + if (error < 0) + goto error; + if (!(fattr.valid & NFS_ATTR_FATTR)) { + error = server->nfs_client->rpc_ops->getattr(server, mntfh, &fattr); + if (error < 0) { + dprintk("nfs_create_server: getattr error = %d\n", -error); + goto error; + } + } + memcpy(&server->fsid, &fattr.fsid, sizeof(server->fsid)); + + dprintk("Server FSID: %llx:%llx\n", server->fsid.major, server->fsid.minor); + + BUG_ON(!server->nfs_client); + BUG_ON(!server->nfs_client->rpc_ops); + BUG_ON(!server->nfs_client->rpc_ops->file_inode_ops); + + spin_lock(&nfs_client_lock); + list_add_tail(&server->client_link, &server->nfs_client->cl_superblocks); + list_add_tail(&server->master_link, &nfs_volume_list); + spin_unlock(&nfs_client_lock); + + server->mount_time = jiffies; + return server; + +error: + nfs_free_server(server); + return ERR_PTR(error); +} + +#ifdef CONFIG_NFS_V4 +/* + * Initialise an NFS4 client record + */ +static int nfs4_init_client(struct nfs_client *clp, + int proto, int timeo, int retrans, + rpc_authflavor_t authflavour) +{ + int error; + + if (clp->cl_cons_state == NFS_CS_READY) { + /* the client is initialised already */ + dprintk("<-- nfs4_init_client() = 0 [already %p]\n", clp); + return 0; + } + + /* Check NFS protocol revision and initialize RPC op vector */ + clp->rpc_ops = &nfs_v4_clientops; + + error = nfs_create_rpc_client(clp, proto, timeo, retrans, authflavour); + if (error < 0) + goto error; + + error = nfs_idmap_new(clp); + if (error < 0) { + dprintk("%s: failed to create idmapper. Error = %d\n", + __FUNCTION__, error); + __set_bit(NFS_CS_IDMAP, &clp->cl_res_state); + goto error; + } + + nfs_mark_client_ready(clp, NFS_CS_READY); + return 0; + +error: + nfs_mark_client_ready(clp, error); + dprintk("<-- nfs4_init_client() = xerror %d\n", error); + return error; +} + +/* + * Set up an NFS4 client + */ +static int nfs4_set_client(struct nfs_server *server, + const char *hostname, const struct sockaddr_in *addr, + rpc_authflavor_t authflavour, + int proto, int timeo, int retrans) +{ + struct nfs_client *clp; + int error; + + dprintk("--> nfs4_set_client()\n"); + + /* Allocate or find a client reference we can use */ + clp = nfs_get_client(hostname, addr, 4); + if (IS_ERR(clp)) { + error = PTR_ERR(clp); + goto error; + } + error = nfs4_init_client(clp, proto, timeo, retrans, authflavour); + if (error < 0) + goto error_put; + + server->nfs_client = clp; + dprintk("<-- nfs4_set_client() = 0 [new %p]\n", clp); + return 0; + +error_put: + nfs_put_client(clp); +error: + dprintk("<-- nfs4_set_client() = xerror %d\n", error); + return error; +} + +/* + * Create a version 4 volume record + */ +static int nfs4_init_server(struct nfs_server *server, + const struct nfs4_mount_data *data, rpc_authflavor_t authflavour) +{ + int error; + + dprintk("--> nfs4_init_server()\n"); + + /* Initialise the client representation from the mount data */ + server->flags = data->flags & NFS_MOUNT_FLAGMASK; + server->caps |= NFS_CAP_ATOMIC_OPEN; + + if (data->rsize) + server->rsize = nfs_block_size(data->rsize, NULL); + if (data->wsize) + server->wsize = nfs_block_size(data->wsize, NULL); + + server->acregmin = data->acregmin * HZ; + server->acregmax = data->acregmax * HZ; + server->acdirmin = data->acdirmin * HZ; + server->acdirmax = data->acdirmax * HZ; + + error = nfs_init_server_rpcclient(server, authflavour); + + /* Done */ + dprintk("<-- nfs4_init_server() = %d\n", error); + return error; +} + +/* + * Create a version 4 volume record + * - keyed on server and FSID + */ +struct nfs_server *nfs4_create_server(const struct nfs4_mount_data *data, + const char *hostname, + const struct sockaddr_in *addr, + const char *mntpath, + const char *ip_addr, + rpc_authflavor_t authflavour, + struct nfs_fh *mntfh) +{ + struct nfs_fattr fattr; + struct nfs_server *server; + int error; + + dprintk("--> nfs4_create_server()\n"); + + server = nfs_alloc_server(); + if (!server) + return ERR_PTR(-ENOMEM); + + /* Get a client record */ + error = nfs4_set_client(server, hostname, addr, authflavour, + data->proto, data->timeo, data->retrans); + if (error < 0) + goto error; + + /* set up the general RPC client */ + error = nfs4_init_server(server, data, authflavour); + if (error < 0) + goto error; + + BUG_ON(!server->nfs_client); + BUG_ON(!server->nfs_client->rpc_ops); + BUG_ON(!server->nfs_client->rpc_ops->file_inode_ops); + + /* Probe the root fh to retrieve its FSID */ + error = nfs4_path_walk(server, mntfh, mntpath); + if (error < 0) + goto error; + + dprintk("Server FSID: %llx:%llx\n", server->fsid.major, server->fsid.minor); + dprintk("Mount FH: %d\n", mntfh->size); + + error = nfs_probe_fsinfo(server, mntfh, &fattr); + if (error < 0) + goto error; + + BUG_ON(!server->nfs_client); + BUG_ON(!server->nfs_client->rpc_ops); + BUG_ON(!server->nfs_client->rpc_ops->file_inode_ops); + + spin_lock(&nfs_client_lock); + list_add_tail(&server->client_link, &server->nfs_client->cl_superblocks); + list_add_tail(&server->master_link, &nfs_volume_list); + spin_unlock(&nfs_client_lock); + + server->mount_time = jiffies; + dprintk("<-- nfs4_create_server() = %p\n", server); + return server; + +error: + nfs_free_server(server); + dprintk("<-- nfs4_create_server() = error %d\n", error); + return ERR_PTR(error); +} + +/* + * Create an NFS4 referral server record + */ +struct nfs_server *nfs4_create_referral_server(struct nfs_clone_mount *data, + struct nfs_fh *fh) +{ + struct nfs_client *parent_client; + struct nfs_server *server, *parent_server; + struct nfs_fattr fattr; + int error; + + dprintk("--> nfs4_create_referral_server()\n"); + + server = nfs_alloc_server(); + if (!server) + return ERR_PTR(-ENOMEM); + + parent_server = NFS_SB(data->sb); + parent_client = parent_server->nfs_client; + + /* Get a client representation. + * Note: NFSv4 always uses TCP, */ + error = nfs4_set_client(server, data->hostname, data->addr, + data->authflavor, + parent_server->client->cl_xprt->prot, + parent_client->retrans_timeo, + parent_client->retrans_count); + + /* Initialise the client representation from the parent server */ + nfs_server_copy_userdata(server, parent_server); + server->caps |= NFS_CAP_ATOMIC_OPEN; + + error = nfs_init_server_rpcclient(server, data->authflavor); + if (error < 0) + goto error; + + BUG_ON(!server->nfs_client); + BUG_ON(!server->nfs_client->rpc_ops); + BUG_ON(!server->nfs_client->rpc_ops->file_inode_ops); + + /* probe the filesystem info for this server filesystem */ + error = nfs_probe_fsinfo(server, fh, &fattr); + if (error < 0) + goto error; + + dprintk("Referral FSID: %llx:%llx\n", + server->fsid.major, server->fsid.minor); + + spin_lock(&nfs_client_lock); + list_add_tail(&server->client_link, &server->nfs_client->cl_superblocks); + list_add_tail(&server->master_link, &nfs_volume_list); + spin_unlock(&nfs_client_lock); + + server->mount_time = jiffies; + + dprintk("<-- nfs_create_referral_server() = %p\n", server); + return server; + +error: + nfs_free_server(server); + dprintk("<-- nfs4_create_referral_server() = error %d\n", error); + return ERR_PTR(error); +} + +#endif /* CONFIG_NFS_V4 */ + +/* + * Clone an NFS2, NFS3 or NFS4 server record + */ +struct nfs_server *nfs_clone_server(struct nfs_server *source, + struct nfs_fh *fh, + struct nfs_fattr *fattr) +{ + struct nfs_server *server; + struct nfs_fattr fattr_fsinfo; + int error; + + dprintk("--> nfs_clone_server(,%llx:%llx,)\n", + fattr->fsid.major, fattr->fsid.minor); + + server = nfs_alloc_server(); + if (!server) + return ERR_PTR(-ENOMEM); + + /* Copy data from the source */ + server->nfs_client = source->nfs_client; + atomic_inc(&server->nfs_client->cl_count); + nfs_server_copy_userdata(server, source); + + server->fsid = fattr->fsid; + + error = nfs_init_server_rpcclient(server, source->client->cl_auth->au_flavor); + if (error < 0) + goto out_free_server; + if (!IS_ERR(source->client_acl)) + nfs_init_server_aclclient(server); + + /* probe the filesystem info for this server filesystem */ + error = nfs_probe_fsinfo(server, fh, &fattr_fsinfo); + if (error < 0) + goto out_free_server; + + dprintk("Cloned FSID: %llx:%llx\n", + server->fsid.major, server->fsid.minor); + + error = nfs_start_lockd(server); + if (error < 0) + goto out_free_server; + + spin_lock(&nfs_client_lock); + list_add_tail(&server->client_link, &server->nfs_client->cl_superblocks); + list_add_tail(&server->master_link, &nfs_volume_list); + spin_unlock(&nfs_client_lock); + + server->mount_time = jiffies; + + dprintk("<-- nfs_clone_server() = %p\n", server); + return server; + +out_free_server: + nfs_free_server(server); + dprintk("<-- nfs_clone_server() = error %d\n", error); + return ERR_PTR(error); +} diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 1936271..9b496ef 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -31,6 +31,7 @@ #include #include #include +#include #include "nfs4_fs.h" #include "delegation.h" @@ -870,14 +871,14 @@ int nfs_is_exclusive_create(struct inode *dir, struct nameidata *nd) return (nd->intent.open.flags & O_EXCL) != 0; } -static inline int nfs_reval_fsid(struct inode *dir, - struct nfs_fh *fh, struct nfs_fattr *fattr) +static inline int nfs_reval_fsid(struct vfsmount *mnt, struct inode *dir, + struct nfs_fh *fh, struct nfs_fattr *fattr) { struct nfs_server *server = NFS_SERVER(dir); if (!nfs_fsid_equal(&server->fsid, &fattr->fsid)) /* Revalidate fsid on root dir */ - return __nfs_revalidate_inode(server, dir->i_sb->s_root->d_inode); + return __nfs_revalidate_inode(server, mnt->mnt_root->d_inode); return 0; } @@ -913,7 +914,7 @@ static struct dentry *nfs_lookup(struct inode *dir, struct dentry * dentry, stru res = ERR_PTR(error); goto out_unlock; } - error = nfs_reval_fsid(dir, &fhandle, &fattr); + error = nfs_reval_fsid(nd->mnt, dir, &fhandle, &fattr); if (error < 0) { res = ERR_PTR(error); goto out_unlock; @@ -922,8 +923,9 @@ static struct dentry *nfs_lookup(struct inode *dir, struct dentry * dentry, stru res = (struct dentry *)inode; if (IS_ERR(res)) goto out_unlock; + no_entry: - res = d_add_unique(dentry, inode); + res = d_materialise_unique(dentry, inode); if (res != NULL) dentry = res; nfs_renew_times(dentry); @@ -1117,11 +1119,13 @@ static struct dentry *nfs_readdir_lookup(nfs_readdir_descriptor_t *desc) dput(dentry); return NULL; } - alias = d_add_unique(dentry, inode); + + alias = d_materialise_unique(dentry, inode); if (alias != NULL) { dput(dentry); dentry = alias; } + nfs_renew_times(dentry); nfs_set_verifier(dentry, nfs_save_change_attribute(dir)); return dentry; diff --git a/fs/nfs/getroot.c b/fs/nfs/getroot.c new file mode 100644 index 0000000..977e590 --- /dev/null +++ b/fs/nfs/getroot.c @@ -0,0 +1,306 @@ +/* getroot.c: get the root dentry for an NFS mount + * + * Copyright (C) 2006 Red Hat, Inc. All Rights Reserved. + * Written by David Howells (dhowells@redhat.com) + * + * This program is free software; you can redistribute it and/or + * modify it under the terms of the GNU General Public License + * as published by the Free Software Foundation; either version + * 2 of the License, or (at your option) any later version. + */ + +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include "nfs4_fs.h" +#include "delegation.h" +#include "internal.h" + +#define NFSDBG_FACILITY NFSDBG_CLIENT +#define NFS_PARANOIA 1 + +/* + * get an NFS2/NFS3 root dentry from the root filehandle + */ +struct dentry *nfs_get_root(struct super_block *sb, struct nfs_fh *mntfh) +{ + struct nfs_server *server = NFS_SB(sb); + struct nfs_fsinfo fsinfo; + struct nfs_fattr fattr; + struct dentry *mntroot; + struct inode *inode; + int error; + + /* create a dummy root dentry with dummy inode for this superblock */ + if (!sb->s_root) { + struct nfs_fh dummyfh; + struct dentry *root; + struct inode *iroot; + + memset(&dummyfh, 0, sizeof(dummyfh)); + memset(&fattr, 0, sizeof(fattr)); + nfs_fattr_init(&fattr); + fattr.valid = NFS_ATTR_FATTR; + fattr.type = NFDIR; + fattr.mode = S_IFDIR | S_IRUSR | S_IWUSR; + fattr.nlink = 2; + + iroot = nfs_fhget(sb, &dummyfh, &fattr); + if (IS_ERR(iroot)) + return ERR_PTR(PTR_ERR(iroot)); + + root = d_alloc_root(iroot); + if (!root) { + iput(iroot); + return ERR_PTR(-ENOMEM); + } + + sb->s_root = root; + } + + /* get the actual root for this mount */ + fsinfo.fattr = &fattr; + + error = server->nfs_client->rpc_ops->getroot(server, mntfh, &fsinfo); + if (error < 0) { + dprintk("nfs_get_root: getattr error = %d\n", -error); + return ERR_PTR(error); + } + + inode = nfs_fhget(sb, mntfh, fsinfo.fattr); + if (IS_ERR(inode)) { + dprintk("nfs_get_root: get root inode failed\n"); + return ERR_PTR(PTR_ERR(inode)); + } + + /* root dentries normally start off anonymous and get spliced in later + * if the dentry tree reaches them; however if the dentry already + * exists, we'll pick it up at this point and use it as the root + */ + mntroot = d_alloc_anon(inode); + if (!mntroot) { + iput(inode); + dprintk("nfs_get_root: get root dentry failed\n"); + return ERR_PTR(-ENOMEM); + } + + if (!mntroot->d_op) + mntroot->d_op = server->nfs_client->rpc_ops->dentry_ops; + + return mntroot; +} + +#ifdef CONFIG_NFS_V4 + +/* + * Do a simple pathwalk from the root FH of the server to the nominated target + * of the mountpoint + * - give error on symlinks + * - give error on ".." occurring in the path + * - follow traversals + */ +int nfs4_path_walk(struct nfs_server *server, + struct nfs_fh *mntfh, + const char *path) +{ + struct nfs_fsinfo fsinfo; + struct nfs_fattr fattr; + struct nfs_fh lastfh; + struct qstr name; + int ret; + //int referral_count = 0; + + dprintk("--> nfs4_path_walk(,,%s)\n", path); + + fsinfo.fattr = &fattr; + nfs_fattr_init(&fattr); + + if (*path++ != '/') { + dprintk("nfs4_get_root: Path does not begin with a slash\n"); + return -EINVAL; + } + + /* Start by getting the root filehandle from the server */ + ret = server->nfs_client->rpc_ops->getroot(server, mntfh, &fsinfo); + if (ret < 0) { + dprintk("nfs4_get_root: getroot error = %d\n", -ret); + return ret; + } + + if (fattr.type != NFDIR) { + printk(KERN_ERR "nfs4_get_root:" + " getroot encountered non-directory\n"); + return -ENOTDIR; + } + + if (fattr.valid & NFS_ATTR_FATTR_V4_REFERRAL) { + printk(KERN_ERR "nfs4_get_root:" + " getroot obtained referral\n"); + return -EREMOTE; + } + +next_component: + dprintk("Next: %s\n", path); + + /* extract the next bit of the path */ + if (!*path) + goto path_walk_complete; + + name.name = path; + while (*path && *path != '/') + path++; + name.len = path - (const char *) name.name; + +eat_dot_dir: + while (*path == '/') + path++; + + if (path[0] == '.' && (path[1] == '/' || !path[1])) { + path += 2; + goto eat_dot_dir; + } + + if (path[0] == '.' && path[1] == '.' && (path[2] == '/' || !path[2]) + ) { + printk(KERN_ERR "nfs4_get_root:" + " Mount path contains reference to \"..\"\n"); + return -EINVAL; + } + + /* lookup the next FH in the sequence */ + memcpy(&lastfh, mntfh, sizeof(lastfh)); + + dprintk("LookupFH: %*.*s [%s]\n", name.len, name.len, name.name, path); + + ret = server->nfs_client->rpc_ops->lookupfh(server, &lastfh, &name, + mntfh, &fattr); + if (ret < 0) { + dprintk("nfs4_get_root: getroot error = %d\n", -ret); + return ret; + } + + if (fattr.type != NFDIR) { + printk(KERN_ERR "nfs4_get_root:" + " lookupfh encountered non-directory\n"); + return -ENOTDIR; + } + + if (fattr.valid & NFS_ATTR_FATTR_V4_REFERRAL) { + printk(KERN_ERR "nfs4_get_root:" + " lookupfh obtained referral\n"); + return -EREMOTE; + } + + goto next_component; + +path_walk_complete: + memcpy(&server->fsid, &fattr.fsid, sizeof(server->fsid)); + dprintk("<-- nfs4_path_walk() = 0\n"); + return 0; +} + +/* + * get an NFS4 root dentry from the root filehandle + */ +struct dentry *nfs4_get_root(struct super_block *sb, struct nfs_fh *mntfh) +{ + struct nfs_server *server = NFS_SB(sb); + struct nfs_fattr fattr; + struct dentry *mntroot; + struct inode *inode; + int error; + + dprintk("--> nfs4_get_root()\n"); + + /* create a dummy root dentry with dummy inode for this superblock */ + if (!sb->s_root) { + struct nfs_fh dummyfh; + struct dentry *root; + struct inode *iroot; + + memset(&dummyfh, 0, sizeof(dummyfh)); + memset(&fattr, 0, sizeof(fattr)); + nfs_fattr_init(&fattr); + fattr.valid = NFS_ATTR_FATTR; + fattr.type = NFDIR; + fattr.mode = S_IFDIR | S_IRUSR | S_IWUSR; + fattr.nlink = 2; + + iroot = nfs_fhget(sb, &dummyfh, &fattr); + if (IS_ERR(iroot)) + return ERR_PTR(PTR_ERR(iroot)); + + root = d_alloc_root(iroot); + if (!root) { + iput(iroot); + return ERR_PTR(-ENOMEM); + } + + sb->s_root = root; + } + + /* get the info about the server and filesystem */ + error = nfs4_server_capabilities(server, mntfh); + if (error < 0) { + dprintk("nfs_get_root: getcaps error = %d\n", + -error); + return ERR_PTR(error); + } + + /* get the actual root for this mount */ + error = server->nfs_client->rpc_ops->getattr(server, mntfh, &fattr); + if (error < 0) { + dprintk("nfs_get_root: getattr error = %d\n", -error); + return ERR_PTR(error); + } + + inode = nfs_fhget(sb, mntfh, &fattr); + if (IS_ERR(inode)) { + dprintk("nfs_get_root: get root inode failed\n"); + return ERR_PTR(PTR_ERR(inode)); + } + + /* root dentries normally start off anonymous and get spliced in later + * if the dentry tree reaches them; however if the dentry already + * exists, we'll pick it up at this point and use it as the root + */ + mntroot = d_alloc_anon(inode); + if (!mntroot) { + iput(inode); + dprintk("nfs_get_root: get root dentry failed\n"); + return ERR_PTR(-ENOMEM); + } + + if (!mntroot->d_op) + mntroot->d_op = server->nfs_client->rpc_ops->dentry_ops; + + dprintk("<-- nfs4_get_root()\n"); + return mntroot; +} + +#endif /* CONFIG_NFS_V4 */ diff --git a/fs/nfs/idmap.c b/fs/nfs/idmap.c index 231c20f..f96dfac 100644 --- a/fs/nfs/idmap.c +++ b/fs/nfs/idmap.c @@ -114,8 +114,7 @@ nfs_idmap_new(struct nfs_client *clp) struct idmap *idmap; int error; - if (clp->cl_idmap != NULL) - return 0; + BUG_ON(clp->cl_idmap != NULL); if ((idmap = kzalloc(sizeof(*idmap), GFP_KERNEL)) == NULL) return -ENOMEM; diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index 771c3b8..a547c58 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -1020,7 +1020,7 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr) out_fileid: printk(KERN_ERR "NFS: server %s error: fileid changed\n" "fsid %s: expected fileid 0x%Lx, got 0x%Lx\n", - NFS_SERVER(inode)->hostname, inode->i_sb->s_id, + NFS_SERVER(inode)->nfs_client->cl_hostname, inode->i_sb->s_id, (long long)nfsi->fileid, (long long)fattr->fileid); goto out_err; } diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index 2f3aa52..e73ba4f 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -4,6 +4,18 @@ #include +struct nfs_string; +struct nfs_mount_data; +struct nfs4_mount_data; + +/* Maximum number of readahead requests + * FIXME: this should really be a sysctl so that users may tune it to suit + * their needs. People that do NFS over a slow network, might for + * instance want to reduce it to something closer to 1 for improved + * interactive response. + */ +#define NFS_MAX_READAHEAD (RPC_DEF_SLOT_TABLE - 1) + struct nfs_clone_mount { const struct super_block *sb; const struct dentry *dentry; @@ -16,12 +28,25 @@ struct nfs_clone_mount { }; /* client.c */ +extern struct rpc_program nfs_program; + extern void nfs_put_client(struct nfs_client *); extern struct nfs_client *nfs_find_client(const struct sockaddr_in *, int); -extern struct nfs_client *nfs_get_client(const char *, const struct sockaddr_in *, int); -extern void nfs_mark_client_ready(struct nfs_client *, int); -extern int nfs_create_rpc_client(struct nfs_client *, int, unsigned int, - unsigned int, rpc_authflavor_t); +extern struct nfs_server *nfs_create_server(const struct nfs_mount_data *, + struct nfs_fh *); +extern struct nfs_server *nfs4_create_server(const struct nfs4_mount_data *, + const char *, + const struct sockaddr_in *, + const char *, + const char *, + rpc_authflavor_t, + struct nfs_fh *); +extern struct nfs_server *nfs4_create_referral_server(struct nfs_clone_mount *, + struct nfs_fh *); +extern void nfs_free_server(struct nfs_server *server); +extern struct nfs_server *nfs_clone_server(struct nfs_server *, + struct nfs_fh *, + struct nfs_fattr *); /* nfs4namespace.c */ #ifdef CONFIG_NFS_V4 @@ -89,10 +114,10 @@ extern void nfs4_clear_inode(struct inode *); #endif /* super.c */ -extern struct file_system_type nfs_referral_nfs4_fs_type; -extern struct file_system_type clone_nfs_fs_type; +extern struct file_system_type nfs_xdev_fs_type; #ifdef CONFIG_NFS_V4 -extern struct file_system_type clone_nfs4_fs_type; +extern struct file_system_type nfs4_xdev_fs_type; +extern struct file_system_type nfs4_referral_fs_type; #endif extern struct rpc_stat nfs_rpcstat; @@ -101,28 +126,30 @@ extern int __init register_nfs_fs(void); extern void __exit unregister_nfs_fs(void); /* namespace.c */ -extern char *nfs_path(const char *base, const struct dentry *dentry, +extern char *nfs_path(const char *base, + const struct dentry *droot, + const struct dentry *dentry, char *buffer, ssize_t buflen); -/* - * Determine the mount path as a string - */ +/* getroot.c */ +extern struct dentry *nfs_get_root(struct super_block *, struct nfs_fh *); #ifdef CONFIG_NFS_V4 -static inline char * -nfs4_path(const struct dentry *dentry, char *buffer, ssize_t buflen) -{ - return nfs_path(NFS_SB(dentry->d_sb)->mnt_path, dentry, buffer, buflen); -} +extern struct dentry *nfs4_get_root(struct super_block *, struct nfs_fh *); + +extern int nfs4_path_walk(struct nfs_server *server, + struct nfs_fh *mntfh, + const char *path); #endif /* * Determine the device name as a string */ static inline char *nfs_devname(const struct vfsmount *mnt_parent, - const struct dentry *dentry, - char *buffer, ssize_t buflen) + const struct dentry *dentry, + char *buffer, ssize_t buflen) { - return nfs_path(mnt_parent->mnt_devname, dentry, buffer, buflen); + return nfs_path(mnt_parent->mnt_devname, mnt_parent->mnt_root, + dentry, buffer, buflen); } /* @@ -178,20 +205,3 @@ void nfs_super_set_maxbytes(struct super_block *sb, __u64 maxfilesize) if (sb->s_maxbytes > MAX_LFS_FILESIZE || sb->s_maxbytes <= 0) sb->s_maxbytes = MAX_LFS_FILESIZE; } - -/* - * Check if the string represents a "valid" IPv4 address - */ -static inline int valid_ipaddr4(const char *buf) -{ - int rc, count, in[4]; - - rc = sscanf(buf, "%d.%d.%d.%d", &in[0], &in[1], &in[2], &in[3]); - if (rc != 4) - return -EINVAL; - for (count = 0; count < 4; count++) { - if (in[count] > 255) - return -EINVAL; - } - return 0; -} diff --git a/fs/nfs/namespace.c b/fs/nfs/namespace.c index d8b8d56..77b0068 100644 --- a/fs/nfs/namespace.c +++ b/fs/nfs/namespace.c @@ -2,6 +2,7 @@ * linux/fs/nfs/namespace.c * * Copyright (C) 2005 Trond Myklebust + * - Modified by David Howells * * NFS namespace */ @@ -28,6 +29,7 @@ int nfs_mountpoint_expiry_timeout = 500 * HZ; /* * nfs_path - reconstruct the path given an arbitrary dentry * @base - arbitrary string to prepend to the path + * @droot - pointer to root dentry for mountpoint * @dentry - pointer to dentry * @buffer - result buffer * @buflen - length of buffer @@ -38,7 +40,9 @@ int nfs_mountpoint_expiry_timeout = 500 * HZ; * This is mainly for use in figuring out the path on the * server side when automounting on top of an existing partition. */ -char *nfs_path(const char *base, const struct dentry *dentry, +char *nfs_path(const char *base, + const struct dentry *droot, + const struct dentry *dentry, char *buffer, ssize_t buflen) { char *end = buffer+buflen; @@ -47,7 +51,7 @@ char *nfs_path(const char *base, const struct dentry *dentry, *--end = '\0'; buflen--; spin_lock(&dcache_lock); - while (!IS_ROOT(dentry)) { + while (!IS_ROOT(dentry) && dentry != droot) { namelen = dentry->d_name.len; buflen -= namelen + 1; if (buflen < 0) @@ -96,12 +100,13 @@ static void * nfs_follow_mountpoint(struct dentry *dentry, struct nameidata *nd) struct nfs_fattr fattr; int err; + dprintk("--> nfs_follow_mountpoint()\n"); + BUG_ON(IS_ROOT(dentry)); dprintk("%s: enter\n", __FUNCTION__); dput(nd->dentry); nd->dentry = dget(dentry); - if (d_mountpoint(nd->dentry)) - goto out_follow; + /* Look it up again */ parent = dget_parent(nd->dentry); err = server->nfs_client->rpc_ops->lookup(parent->d_inode, @@ -134,6 +139,8 @@ static void * nfs_follow_mountpoint(struct dentry *dentry, struct nameidata *nd) schedule_delayed_work(&nfs_automount_task, nfs_mountpoint_expiry_timeout); out: dprintk("%s: done, returned %d\n", __FUNCTION__, err); + + dprintk("<-- nfs_follow_mountpoint() = %d\n", err); return ERR_PTR(err); out_err: path_release(nd); @@ -183,14 +190,14 @@ static struct vfsmount *nfs_do_clone_mount(struct nfs_server *server, switch (server->nfs_client->cl_nfsversion) { case 2: case 3: - mnt = vfs_kern_mount(&clone_nfs_fs_type, 0, devname, mountdata); + mnt = vfs_kern_mount(&nfs_xdev_fs_type, 0, devname, mountdata); break; case 4: - mnt = vfs_kern_mount(&clone_nfs4_fs_type, 0, devname, mountdata); + mnt = vfs_kern_mount(&nfs4_xdev_fs_type, 0, devname, mountdata); } return mnt; #else - return vfs_kern_mount(&clone_nfs_fs_type, 0, devname, mountdata); + return vfs_kern_mount(&nfs_xdev_fs_type, 0, devname, mountdata); #endif } @@ -216,6 +223,8 @@ struct vfsmount *nfs_do_submount(const struct vfsmount *mnt_parent, char *page = (char *) __get_free_page(GFP_USER); char *devname; + dprintk("--> nfs_do_submount()\n"); + dprintk("%s: submounting on %s/%s\n", __FUNCTION__, dentry->d_parent->d_name.name, dentry->d_name.name); @@ -230,5 +239,7 @@ free_page: free_page((unsigned long)page); out: dprintk("%s: done\n", __FUNCTION__); + + dprintk("<-- nfs_do_submount() = %p\n", mnt); return mnt; } diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c index 0622af0..9e8258e 100644 --- a/fs/nfs/nfs3proc.c +++ b/fs/nfs/nfs3proc.c @@ -81,7 +81,7 @@ do_proc_get_root(struct rpc_clnt *client, struct nfs_fh *fhandle, } /* - * Bare-bones access to getattr: this is for nfs_read_super. + * Bare-bones access to getattr: this is for nfs_get_root/nfs_get_sb */ static int nfs3_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle, diff --git a/fs/nfs/nfs4_fs.h b/fs/nfs/nfs4_fs.h index e787924..61095fe 100644 --- a/fs/nfs/nfs4_fs.h +++ b/fs/nfs/nfs4_fs.h @@ -188,8 +188,6 @@ extern void nfs4_kill_renewd(struct nfs_client *); extern void nfs4_renew_state(void *); /* nfs4state.c */ -extern void init_nfsv4_state(struct nfs_server *); -extern void destroy_nfsv4_state(struct nfs_server *); struct rpc_cred *nfs4_get_renew_cred(struct nfs_client *clp); extern u32 nfs4_alloc_lockowner_id(struct nfs_client *); @@ -224,10 +222,6 @@ extern struct svc_version nfs4_callback_version1; #else -#define init_nfsv4_state(server) do { } while (0) -#define destroy_nfsv4_state(server) do { } while (0) -#define nfs4_put_state_owner(inode, owner) do { } while (0) -#define nfs4_put_open_state(state) do { } while (0) #define nfs4_close_state(a, b) do { } while (0) #endif /* CONFIG_NFS_V4 */ diff --git a/fs/nfs/nfs4namespace.c b/fs/nfs/nfs4namespace.c index faed9bc..24e47f3 100644 --- a/fs/nfs/nfs4namespace.c +++ b/fs/nfs/nfs4namespace.c @@ -2,6 +2,7 @@ * linux/fs/nfs/nfs4namespace.c * * Copyright (C) 2005 Trond Myklebust + * - Modified by David Howells * * NFSv4 namespace */ @@ -47,6 +48,68 @@ Elong: return ERR_PTR(-ENAMETOOLONG); } +/* + * Determine the mount path as a string + */ +static char *nfs4_path(const struct vfsmount *mnt_parent, + const struct dentry *dentry, + char *buffer, ssize_t buflen) +{ + const char *srvpath; + + srvpath = strchr(mnt_parent->mnt_devname, ':'); + if (srvpath) + srvpath++; + else + srvpath = mnt_parent->mnt_devname; + + return nfs_path(srvpath, mnt_parent->mnt_root, dentry, buffer, buflen); +} + +/* + * Check that fs_locations::fs_root [RFC3530 6.3] is a prefix for what we + * believe to be the server path to this dentry + */ +static int nfs4_validate_fspath(const struct vfsmount *mnt_parent, + const struct dentry *dentry, + const struct nfs4_fs_locations *locations, + char *page, char *page2) +{ + const char *path, *fs_path; + + path = nfs4_path(mnt_parent, dentry, page, PAGE_SIZE); + if (IS_ERR(path)) + return PTR_ERR(path); + + fs_path = nfs4_pathname_string(&locations->fs_path, page2, PAGE_SIZE); + if (IS_ERR(fs_path)) + return PTR_ERR(fs_path); + + if (strncmp(path, fs_path, strlen(fs_path)) != 0) { + dprintk("%s: path %s does not begin with fsroot %s\n", + __FUNCTION__, path, fs_path); + return -ENOENT; + } + + return 0; +} + +/* + * Check if the string represents a "valid" IPv4 address + */ +static inline int valid_ipaddr4(const char *buf) +{ + int rc, count, in[4]; + + rc = sscanf(buf, "%d.%d.%d.%d", &in[0], &in[1], &in[2], &in[3]); + if (rc != 4) + return -EINVAL; + for (count = 0; count < 4; count++) { + if (in[count] > 255) + return -EINVAL; + } + return 0; +} /** * nfs_follow_referral - set up mountpoint when hitting a referral on moved error @@ -68,10 +131,9 @@ static struct vfsmount *nfs_follow_referral(const struct vfsmount *mnt_parent, .dentry = dentry, .authflavor = NFS_SB(mnt_parent->mnt_sb)->client->cl_auth->au_flavor, }; - char *page, *page2; - char *path, *fs_path; + char *page = NULL, *page2 = NULL; char *devname; - int loc, s; + int loc, s, error; if (locations == NULL || locations->nlocations <= 0) goto out; @@ -79,31 +141,25 @@ static struct vfsmount *nfs_follow_referral(const struct vfsmount *mnt_parent, dprintk("%s: referral at %s/%s\n", __FUNCTION__, dentry->d_parent->d_name.name, dentry->d_name.name); - /* Ensure fs path is a prefix of current dentry path */ page = (char *) __get_free_page(GFP_USER); - if (page == NULL) + if (!page) goto out; + page2 = (char *) __get_free_page(GFP_USER); - if (page2 == NULL) + if (!page2) goto out; - path = nfs4_path(dentry, page, PAGE_SIZE); - if (IS_ERR(path)) - goto out_free; - - fs_path = nfs4_pathname_string(&locations->fs_path, page2, PAGE_SIZE); - if (IS_ERR(fs_path)) - goto out_free; - - if (strncmp(path, fs_path, strlen(fs_path)) != 0) { - dprintk("%s: path %s does not begin with fsroot %s\n", __FUNCTION__, path, fs_path); - goto out_free; + /* Ensure fs path is a prefix of current dentry path */ + error = nfs4_validate_fspath(mnt_parent, dentry, locations, page, page2); + if (error < 0) { + mnt = ERR_PTR(error); + goto out; } devname = nfs_devname(mnt_parent, dentry, page, PAGE_SIZE); if (IS_ERR(devname)) { mnt = (struct vfsmount *)devname; - goto out_free; + goto out; } loc = 0; @@ -140,7 +196,7 @@ static struct vfsmount *nfs_follow_referral(const struct vfsmount *mnt_parent, addr.sin_port = htons(NFS_PORT); mountdata.addr = &addr; - mnt = vfs_kern_mount(&nfs_referral_nfs4_fs_type, 0, devname, &mountdata); + mnt = vfs_kern_mount(&nfs4_referral_fs_type, 0, devname, &mountdata); if (!IS_ERR(mnt)) { break; } @@ -149,10 +205,9 @@ static struct vfsmount *nfs_follow_referral(const struct vfsmount *mnt_parent, loc++; } -out_free: - free_page((unsigned long)page); - free_page((unsigned long)page2); out: + free_page((unsigned long) page); + free_page((unsigned long) page2); dprintk("%s: done\n", __FUNCTION__); return mnt; } @@ -165,7 +220,7 @@ out: */ struct vfsmount *nfs_do_refmount(const struct vfsmount *mnt_parent, struct dentry *dentry) { - struct vfsmount *mnt = ERR_PTR(-ENOENT); + struct vfsmount *mnt = ERR_PTR(-ENOMEM); struct dentry *parent; struct nfs4_fs_locations *fs_locations = NULL; struct page *page; @@ -183,11 +238,16 @@ struct vfsmount *nfs_do_refmount(const struct vfsmount *mnt_parent, struct dentr goto out_free; /* Get locations */ + mnt = ERR_PTR(-ENOENT); + parent = dget_parent(dentry); - dprintk("%s: getting locations for %s/%s\n", __FUNCTION__, parent->d_name.name, dentry->d_name.name); + dprintk("%s: getting locations for %s/%s\n", + __FUNCTION__, parent->d_name.name, dentry->d_name.name); + err = nfs4_proc_fs_locations(parent->d_inode, dentry, fs_locations, page); dput(parent); - if (err != 0 || fs_locations->nlocations <= 0 || + if (err != 0 || + fs_locations->nlocations <= 0 || fs_locations->fs_path.ncomponents <= 0) goto out_free; diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 1573eeb..a825547 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -1393,70 +1393,19 @@ static int nfs4_lookup_root(struct nfs_server *server, struct nfs_fh *fhandle, return err; } +/* + * get the file handle for the "/" directory on the server + */ static int nfs4_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle, - struct nfs_fsinfo *info) + struct nfs_fsinfo *info) { - struct nfs_fattr * fattr = info->fattr; - unsigned char * p; - struct qstr q; - struct nfs4_lookup_arg args = { - .dir_fh = fhandle, - .name = &q, - .bitmask = nfs4_fattr_bitmap, - }; - struct nfs4_lookup_res res = { - .server = server, - .fattr = fattr, - .fh = fhandle, - }; - struct rpc_message msg = { - .rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_LOOKUP], - .rpc_argp = &args, - .rpc_resp = &res, - }; int status; - /* - * Now we do a separate LOOKUP for each component of the mount path. - * The LOOKUPs are done separately so that we can conveniently - * catch an ERR_WRONGSEC if it occurs along the way... - */ status = nfs4_lookup_root(server, fhandle, info); - if (status) - goto out; - - p = server->mnt_path; - for (;;) { - struct nfs4_exception exception = { }; - - while (*p == '/') - p++; - if (!*p) - break; - q.name = p; - while (*p && (*p != '/')) - p++; - q.len = p - q.name; - - do { - nfs_fattr_init(fattr); - status = nfs4_handle_exception(server, - rpc_call_sync(server->client, &msg, 0), - &exception); - } while (exception.retry); - if (status == 0) - continue; - if (status == -ENOENT) { - printk(KERN_NOTICE "NFS: mount path %s does not exist!\n", server->mnt_path); - printk(KERN_NOTICE "NFS: suggestion: try mounting '/' instead.\n"); - } - break; - } if (status == 0) status = nfs4_server_capabilities(server, fhandle); if (status == 0) status = nfs4_do_fsinfo(server, fhandle, info); -out: return nfs4_map_errors(status); } diff --git a/fs/nfs/nfs4renewd.c b/fs/nfs/nfs4renewd.c index ff947ec..f2c8936 100644 --- a/fs/nfs/nfs4renewd.c +++ b/fs/nfs/nfs4renewd.c @@ -127,26 +127,13 @@ nfs4_schedule_state_renewal(struct nfs_client *clp) void nfs4_renewd_prepare_shutdown(struct nfs_server *server) { - struct nfs_client *clp = server->nfs_client; - - if (!clp) - return; flush_scheduled_work(); - down_write(&clp->cl_sem); - if (!list_empty(&server->nfs4_siblings)) - list_del_init(&server->nfs4_siblings); - up_write(&clp->cl_sem); } -/* Must be called with clp->cl_sem locked for writes */ void nfs4_kill_renewd(struct nfs_client *clp) { down_read(&clp->cl_sem); - if (!list_empty(&clp->cl_superblocks)) { - up_read(&clp->cl_sem); - return; - } cancel_delayed_work(&clp->cl_renewd); up_read(&clp->cl_sem); flush_scheduled_work(); diff --git a/fs/nfs/nfs4state.c b/fs/nfs/nfs4state.c index 058811e..5fffbdf 100644 --- a/fs/nfs/nfs4state.c +++ b/fs/nfs/nfs4state.c @@ -58,24 +58,6 @@ const nfs4_stateid zero_stateid; static LIST_HEAD(nfs4_clientid_list); -void -init_nfsv4_state(struct nfs_server *server) -{ - server->nfs_client = NULL; - INIT_LIST_HEAD(&server->nfs4_siblings); -} - -void -destroy_nfsv4_state(struct nfs_server *server) -{ - kfree(server->mnt_path); - server->mnt_path = NULL; - if (server->nfs_client) { - nfs_put_client(server->nfs_client); - server->nfs_client = NULL; - } -} - static int nfs4_init_client(struct nfs_client *clp, struct rpc_cred *cred) { int status = nfs4_proc_setclientid(clp, NFS4_CALLBACK, diff --git a/fs/nfs/read.c b/fs/nfs/read.c index f0aff82..dae33c1 100644 --- a/fs/nfs/read.c +++ b/fs/nfs/read.c @@ -171,7 +171,7 @@ static int nfs_readpage_sync(struct nfs_open_context *ctx, struct inode *inode, rdata->args.offset = page_offset(page) + rdata->args.pgbase; dprintk("NFS: nfs_proc_read(%s, (%s/%Ld), %Lu, %u)\n", - NFS_SERVER(inode)->hostname, + NFS_SERVER(inode)->nfs_client->cl_hostname, inode->i_sb->s_id, (long long)NFS_FILEID(inode), (unsigned long long)rdata->args.pgbase, diff --git a/fs/nfs/super.c b/fs/nfs/super.c index 5842d51..867b5dc 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -13,6 +13,11 @@ * * Split from inode.c by David Howells * + * - superblocks are indexed on server only - all inodes, dentries, etc. associated with a + * particular server are held in the same superblock + * - NFS superblocks can have several effective roots to the dentry tree + * - directory type roots are spliced into the tree when a path from one root reaches the root + * of another (see nfs_lookup()) */ #include @@ -52,20 +57,12 @@ #define NFSDBG_FACILITY NFSDBG_VFS -/* Maximum number of readahead requests - * FIXME: this should really be a sysctl so that users may tune it to suit - * their needs. People that do NFS over a slow network, might for - * instance want to reduce it to something closer to 1 for improved - * interactive response. - */ -#define NFS_MAX_READAHEAD (RPC_DEF_SLOT_TABLE - 1) - static void nfs_umount_begin(struct vfsmount *, int); static int nfs_statfs(struct dentry *, struct kstatfs *); static int nfs_show_options(struct seq_file *, struct vfsmount *); static int nfs_show_stats(struct seq_file *, struct vfsmount *); static int nfs_get_sb(struct file_system_type *, int, const char *, void *, struct vfsmount *); -static int nfs_clone_nfs_sb(struct file_system_type *fs_type, +static int nfs_xdev_get_sb(struct file_system_type *fs_type, int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt); static void nfs_kill_super(struct super_block *); @@ -77,10 +74,10 @@ static struct file_system_type nfs_fs_type = { .fs_flags = FS_ODD_RENAME|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, }; -struct file_system_type clone_nfs_fs_type = { +struct file_system_type nfs_xdev_fs_type = { .owner = THIS_MODULE, .name = "nfs", - .get_sb = nfs_clone_nfs_sb, + .get_sb = nfs_xdev_get_sb, .kill_sb = nfs_kill_super, .fs_flags = FS_ODD_RENAME|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, }; @@ -99,10 +96,10 @@ static struct super_operations nfs_sops = { #ifdef CONFIG_NFS_V4 static int nfs4_get_sb(struct file_system_type *fs_type, int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt); -static int nfs_clone_nfs4_sb(struct file_system_type *fs_type, - int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt); -static int nfs_referral_nfs4_sb(struct file_system_type *fs_type, - int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt); +static int nfs4_xdev_get_sb(struct file_system_type *fs_type, + int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt); +static int nfs4_referral_get_sb(struct file_system_type *fs_type, + int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt); static void nfs4_kill_super(struct super_block *sb); static struct file_system_type nfs4_fs_type = { @@ -113,18 +110,18 @@ static struct file_system_type nfs4_fs_type = { .fs_flags = FS_ODD_RENAME|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, }; -struct file_system_type clone_nfs4_fs_type = { +struct file_system_type nfs4_xdev_fs_type = { .owner = THIS_MODULE, .name = "nfs4", - .get_sb = nfs_clone_nfs4_sb, + .get_sb = nfs4_xdev_get_sb, .kill_sb = nfs4_kill_super, .fs_flags = FS_ODD_RENAME|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, }; -struct file_system_type nfs_referral_nfs4_fs_type = { +struct file_system_type nfs4_referral_fs_type = { .owner = THIS_MODULE, .name = "nfs4", - .get_sb = nfs_referral_nfs4_sb, + .get_sb = nfs4_referral_get_sb, .kill_sb = nfs4_kill_super, .fs_flags = FS_ODD_RENAME|FS_REVAL_DOT|FS_BINARY_MOUNTDATA, }; @@ -345,7 +342,7 @@ static int nfs_show_options(struct seq_file *m, struct vfsmount *mnt) nfs_show_mount_options(m, nfss, 0); seq_puts(m, ",addr="); - seq_escape(m, nfss->hostname, " \t\n\\"); + seq_escape(m, nfss->nfs_client->cl_hostname, " \t\n\\"); return 0; } @@ -429,714 +426,351 @@ static int nfs_show_stats(struct seq_file *m, struct vfsmount *mnt) /* * Begin unmount by attempting to remove all automounted mountpoints we added - * in response to traversals + * in response to xdev traversals and referrals */ static void nfs_umount_begin(struct vfsmount *vfsmnt, int flags) { - struct nfs_server *server; - struct rpc_clnt *rpc; - shrink_submounts(vfsmnt, &nfs_automount_list); - if (!(flags & MNT_FORCE)) - return; - /* -EIO all pending I/O */ - server = NFS_SB(vfsmnt->mnt_sb); - rpc = server->client; - if (!IS_ERR(rpc)) - rpc_killall_tasks(rpc); - rpc = server->client_acl; - if (!IS_ERR(rpc)) - rpc_killall_tasks(rpc); } /* - * Obtain the root inode of the file system. + * Validate the NFS2/NFS3 mount data + * - fills in the mount root filehandle */ -static struct inode * -nfs_get_root(struct super_block *sb, struct nfs_fh *rootfh, struct nfs_fsinfo *fsinfo) +static int nfs_validate_mount_data(struct nfs_mount_data *data, + struct nfs_fh *mntfh) { - struct nfs_server *server = NFS_SB(sb); - int error; - - error = server->nfs_client->rpc_ops->getroot(server, rootfh, fsinfo); - if (error < 0) { - dprintk("nfs_get_root: getattr error = %d\n", -error); - return ERR_PTR(error); + if (data == NULL) { + dprintk("%s: missing data argument\n", __FUNCTION__); + return -EINVAL; } - server->fsid = fsinfo->fattr->fsid; - return nfs_fhget(sb, rootfh, fsinfo->fattr); -} - -/* - * Do NFS version-independent mount processing, and sanity checking - */ -static int -nfs_sb_init(struct super_block *sb, rpc_authflavor_t authflavor) -{ - struct nfs_server *server; - struct inode *root_inode; - struct nfs_fattr fattr; - struct nfs_fsinfo fsinfo = { - .fattr = &fattr, - }; - struct nfs_pathconf pathinfo = { - .fattr = &fattr, - }; - int no_root_error = 0; - unsigned long max_rpc_payload; - - /* We probably want something more informative here */ - snprintf(sb->s_id, sizeof(sb->s_id), "%x:%x", MAJOR(sb->s_dev), MINOR(sb->s_dev)); - - server = NFS_SB(sb); - - sb->s_magic = NFS_SUPER_MAGIC; - - server->io_stats = nfs_alloc_iostats(); - if (server->io_stats == NULL) - return -ENOMEM; + if (data->version <= 0 || data->version > NFS_MOUNT_VERSION) { + dprintk("%s: bad mount version\n", __FUNCTION__); + return -EINVAL; + } - root_inode = nfs_get_root(sb, &server->fh, &fsinfo); - /* Did getting the root inode fail? */ - if (IS_ERR(root_inode)) { - no_root_error = PTR_ERR(root_inode); - goto out_no_root; + switch (data->version) { + case 1: + data->namlen = 0; + case 2: + data->bsize = 0; + case 3: + if (data->flags & NFS_MOUNT_VER3) { + dprintk("%s: mount structure version %d does not support NFSv3\n", + __FUNCTION__, + data->version); + return -EINVAL; + } + data->root.size = NFS2_FHSIZE; + memcpy(data->root.data, data->old_root.data, NFS2_FHSIZE); + case 4: + if (data->flags & NFS_MOUNT_SECFLAVOUR) { + dprintk("%s: mount structure version %d does not support strong security\n", + __FUNCTION__, + data->version); + return -EINVAL; + } + /* Fill in pseudoflavor for mount version < 5 */ + data->pseudoflavor = RPC_AUTH_UNIX; + case 5: + memset(data->context, 0, sizeof(data->context)); } - sb->s_root = d_alloc_root(root_inode); - if (!sb->s_root) { - no_root_error = -ENOMEM; - goto out_no_root; + +#ifndef CONFIG_NFS_V3 + /* If NFSv3 is not compiled in, return -EPROTONOSUPPORT */ + if (data->flags & NFS_MOUNT_VER3) { + dprintk("%s: NFSv3 not compiled into kernel\n", __FUNCTION__); + return -EPROTONOSUPPORT; } - sb->s_root->d_op = server->nfs_client->rpc_ops->dentry_ops; - - /* mount time stamp, in seconds */ - server->mount_time = jiffies; - - /* Get some general file system info */ - if (server->namelen == 0 && - server->nfs_client->rpc_ops->pathconf(server, &server->fh, &pathinfo) >= 0) - server->namelen = pathinfo.max_namelen; - /* Work out a lot of parameters */ - if (server->rsize == 0) - server->rsize = nfs_block_size(fsinfo.rtpref, NULL); - if (server->wsize == 0) - server->wsize = nfs_block_size(fsinfo.wtpref, NULL); - - if (fsinfo.rtmax >= 512 && server->rsize > fsinfo.rtmax) - server->rsize = nfs_block_size(fsinfo.rtmax, NULL); - if (fsinfo.wtmax >= 512 && server->wsize > fsinfo.wtmax) - server->wsize = nfs_block_size(fsinfo.wtmax, NULL); - - max_rpc_payload = nfs_block_size(rpc_max_payload(server->client), NULL); - if (server->rsize > max_rpc_payload) - server->rsize = max_rpc_payload; - if (server->rsize > NFS_MAX_FILE_IO_SIZE) - server->rsize = NFS_MAX_FILE_IO_SIZE; - server->rpages = (server->rsize + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; - - if (server->wsize > max_rpc_payload) - server->wsize = max_rpc_payload; - if (server->wsize > NFS_MAX_FILE_IO_SIZE) - server->wsize = NFS_MAX_FILE_IO_SIZE; - server->wpages = (server->wsize + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT; +#endif /* CONFIG_NFS_V3 */ - if (sb->s_blocksize == 0) - sb->s_blocksize = nfs_block_bits(server->wsize, - &sb->s_blocksize_bits); - server->wtmult = nfs_block_bits(fsinfo.wtmult, NULL); - - server->dtsize = nfs_block_size(fsinfo.dtpref, NULL); - if (server->dtsize > PAGE_CACHE_SIZE) - server->dtsize = PAGE_CACHE_SIZE; - if (server->dtsize > server->rsize) - server->dtsize = server->rsize; - - if (server->flags & NFS_MOUNT_NOAC) { - server->acregmin = server->acregmax = 0; - server->acdirmin = server->acdirmax = 0; - sb->s_flags |= MS_SYNCHRONOUS; + /* We now require that the mount process passes the remote address */ + if (data->addr.sin_addr.s_addr == INADDR_ANY) { + dprintk("%s: mount program didn't pass remote address!\n", + __FUNCTION__); + return -EINVAL; } - server->backing_dev_info.ra_pages = server->rpages * NFS_MAX_READAHEAD; - nfs_super_set_maxbytes(sb, fsinfo.maxfilesize); + /* Prepare the root filehandle */ + if (data->flags & NFS_MOUNT_VER3) + mntfh->size = data->root.size; + else + mntfh->size = NFS2_FHSIZE; - server->client->cl_intr = (server->flags & NFS_MOUNT_INTR) ? 1 : 0; - server->client->cl_softrtry = (server->flags & NFS_MOUNT_SOFT) ? 1 : 0; + if (mntfh->size > sizeof(mntfh->data)) { + dprintk("%s: invalid root filehandle\n", __FUNCTION__); + return -EINVAL; + } + + memcpy(mntfh->data, data->root.data, mntfh->size); + if (mntfh->size < sizeof(mntfh->data)) + memset(mntfh->data + mntfh->size, 0, + sizeof(mntfh->data) - mntfh->size); - /* We're airborne Set socket buffersize */ - rpc_setbufsize(server->client, server->wsize + 100, server->rsize + 100); return 0; - /* Yargs. It didn't work out. */ -out_no_root: - dprintk("nfs_sb_init: get root inode failed: errno %d\n", -no_root_error); - if (!IS_ERR(root_inode)) - iput(root_inode); - return no_root_error; } /* - * Create an RPC client handle. + * Initialise the common bits of the superblock */ -static struct rpc_clnt * -nfs_create_client(struct nfs_server *server, const struct nfs_mount_data *data) +static inline void nfs_initialise_sb(struct super_block *sb) { - struct nfs_client *clp; - struct rpc_clnt *clnt; - int proto = (data->flags & NFS_MOUNT_TCP) ? IPPROTO_TCP : IPPROTO_UDP; - int nfsversion = 2; - int err; - -#ifdef CONFIG_NFS_V3 - if (server->flags & NFS_MOUNT_VER3) - nfsversion = 3; -#endif - - clp = nfs_get_client(server->hostname, &server->addr, nfsversion); - if (!clp) { - dprintk("%s: failed to create NFS4 client.\n", __FUNCTION__); - return ERR_PTR(PTR_ERR(clp)); - } - - if (clp->cl_cons_state == NFS_CS_INITING) { - /* Check NFS protocol revision and initialize RPC op - * vector and file handle pool. */ -#ifdef CONFIG_NFS_V3 - if (nfsversion == 3) { - clp->rpc_ops = &nfs_v3_clientops; - server->caps |= NFS_CAP_READDIRPLUS; - } else { - clp->rpc_ops = &nfs_v2_clientops; - } -#else - clp->rpc_ops = &nfs_v2_clientops; -#endif - - /* create transport and client */ - err = nfs_create_rpc_client(clp, proto, data->timeo, - data->retrans, RPC_AUTH_UNIX); - if (err < 0) - goto client_init_error; - - nfs_mark_client_ready(clp, 0); - } + struct nfs_server *server = NFS_SB(sb); - /* create an nfs_server-specific client */ - clnt = rpc_clone_client(clp->cl_rpcclient); - if (IS_ERR(clnt)) { - dprintk("%s: couldn't create rpc_client!\n", __FUNCTION__); - nfs_put_client(clp); - return ERR_PTR(PTR_ERR(clnt)); - } + sb->s_magic = NFS_SUPER_MAGIC; - if (data->pseudoflavor != clp->cl_rpcclient->cl_auth->au_flavor) { - struct rpc_auth *auth; + /* We probably want something more informative here */ + snprintf(sb->s_id, sizeof(sb->s_id), + "%x:%x", MAJOR(sb->s_dev), MINOR(sb->s_dev)); - auth = rpcauth_create(data->pseudoflavor, server->client); - if (IS_ERR(auth)) { - dprintk("%s: couldn't create credcache!\n", __FUNCTION__); - return ERR_PTR(PTR_ERR(auth)); - } - } + if (sb->s_blocksize == 0) + sb->s_blocksize = nfs_block_bits(server->wsize, + &sb->s_blocksize_bits); - server->nfs_client = clp; - return clnt; + if (server->flags & NFS_MOUNT_NOAC) + sb->s_flags |= MS_SYNCHRONOUS; -client_init_error: - nfs_mark_client_ready(clp, err); - nfs_put_client(clp); - return ERR_PTR(err); + nfs_super_set_maxbytes(sb, server->maxfilesize); } /* - * Clone a server record + * Finish setting up an NFS2/3 superblock */ -static struct nfs_server *nfs_clone_server(struct super_block *sb, struct nfs_clone_mount *data) +static void nfs_fill_super(struct super_block *sb, struct nfs_mount_data *data) { struct nfs_server *server = NFS_SB(sb); - struct nfs_server *parent = NFS_SB(data->sb); - struct inode *root_inode; - struct nfs_fsinfo fsinfo; - void *err = ERR_PTR(-ENOMEM); - - sb->s_op = data->sb->s_op; - sb->s_blocksize = data->sb->s_blocksize; - sb->s_blocksize_bits = data->sb->s_blocksize_bits; - sb->s_maxbytes = data->sb->s_maxbytes; - - server->client_acl = ERR_PTR(-EINVAL); - server->io_stats = nfs_alloc_iostats(); - if (server->io_stats == NULL) - goto out; - - server->client = rpc_clone_client(parent->client); - if (IS_ERR((err = server->client))) - goto out; - - if (!IS_ERR(parent->client_acl)) { - server->client_acl = rpc_clone_client(parent->client_acl); - if (IS_ERR((err = server->client_acl))) - goto out; - } - root_inode = nfs_fhget(sb, data->fh, data->fattr); - if (!root_inode) - goto out; - sb->s_root = d_alloc_root(root_inode); - if (!sb->s_root) - goto out_put_root; - fsinfo.fattr = data->fattr; - if (NFS_PROTO(root_inode)->fsinfo(server, data->fh, &fsinfo) == 0) - nfs_super_set_maxbytes(sb, fsinfo.maxfilesize); - sb->s_root->d_op = server->nfs_client->rpc_ops->dentry_ops; - sb->s_flags |= MS_ACTIVE; - return server; -out_put_root: - iput(root_inode); -out: - return err; -} - -/* - * Copy an existing superblock and attach revised data - */ -static int nfs_clone_generic_sb(struct nfs_clone_mount *data, - struct super_block *(*fill_sb)(struct nfs_server *, struct nfs_clone_mount *), - struct nfs_server *(*fill_server)(struct super_block *, struct nfs_clone_mount *), - struct vfsmount *mnt) -{ - struct nfs_server *server; - struct nfs_server *parent = NFS_SB(data->sb); - struct super_block *sb = ERR_PTR(-EINVAL); - char *hostname; - int error = -ENOMEM; - int len; - - server = kmalloc(sizeof(struct nfs_server), GFP_KERNEL); - if (server == NULL) - goto out_err; - memcpy(server, parent, sizeof(*server)); - atomic_inc(&server->nfs_client->cl_count); - hostname = (data->hostname != NULL) ? data->hostname : parent->hostname; - len = strlen(hostname) + 1; - server->hostname = kmalloc(len, GFP_KERNEL); - if (server->hostname == NULL) - goto free_server; - memcpy(server->hostname, hostname, len); - - sb = fill_sb(server, data); - if (IS_ERR(sb)) { - error = PTR_ERR(sb); - goto free_hostname; - } - if (sb->s_root) - goto out_share; + sb->s_blocksize_bits = 0; + sb->s_blocksize = 0; + if (data->bsize) + sb->s_blocksize = nfs_block_size(data->bsize, &sb->s_blocksize_bits); - server = fill_server(sb, data); - if (IS_ERR(server)) { - error = PTR_ERR(server); - goto out_deactivate; + if (server->flags & NFS_MOUNT_VER3) { + /* The VFS shouldn't apply the umask to mode bits. We will do + * so ourselves when necessary. + */ + sb->s_flags |= MS_POSIXACL; + sb->s_time_gran = 1; } - return simple_set_mnt(mnt, sb); -out_deactivate: - up_write(&sb->s_umount); - deactivate_super(sb); - return error; -out_share: - kfree(server->hostname); - nfs_put_client(server->nfs_client); - kfree(server); - return simple_set_mnt(mnt, sb); -free_hostname: - kfree(server->hostname); -free_server: - nfs_put_client(server->nfs_client); - kfree(server); -out_err: - return error; + + sb->s_op = &nfs_sops; + nfs_initialise_sb(sb); } /* - * Set up an NFS2/3 superblock - * - * The way this works is that the mount process passes a structure - * in the data argument which contains the server's IP address - * and the root file handle obtained from the server's mount - * daemon. We stash these away in the private superblock fields. + * Finish setting up a cloned NFS2/3 superblock */ -static int -nfs_fill_super(struct super_block *sb, struct nfs_mount_data *data, int silent) +static void nfs_clone_super(struct super_block *sb, + const struct super_block *old_sb) { - struct nfs_server *server; - rpc_authflavor_t authflavor; + struct nfs_server *server = NFS_SB(sb); + + sb->s_blocksize_bits = old_sb->s_blocksize_bits; + sb->s_blocksize = old_sb->s_blocksize; + sb->s_maxbytes = old_sb->s_maxbytes; - server = NFS_SB(sb); - sb->s_blocksize_bits = 0; - sb->s_blocksize = 0; - if (data->bsize) - sb->s_blocksize = nfs_block_size(data->bsize, &sb->s_blocksize_bits); - if (data->rsize) - server->rsize = nfs_block_size(data->rsize, NULL); - if (data->wsize) - server->wsize = nfs_block_size(data->wsize, NULL); - server->flags = data->flags & NFS_MOUNT_FLAGMASK; - - server->acregmin = data->acregmin*HZ; - server->acregmax = data->acregmax*HZ; - server->acdirmin = data->acdirmin*HZ; - server->acdirmax = data->acdirmax*HZ; - - /* Start lockd here, before we might error out */ - if (!(server->flags & NFS_MOUNT_NONLM)) - lockd_up(); - - server->namelen = data->namlen; - server->hostname = kmalloc(strlen(data->hostname) + 1, GFP_KERNEL); - if (!server->hostname) - return -ENOMEM; - strcpy(server->hostname, data->hostname); - - /* Fill in pseudoflavor for mount version < 5 */ - if (!(data->flags & NFS_MOUNT_SECFLAVOUR)) - data->pseudoflavor = RPC_AUTH_UNIX; - authflavor = data->pseudoflavor; /* save for sb_init() */ - /* XXX maybe we want to add a server->pseudoflavor field */ - - /* Create RPC client handles */ - server->client = nfs_create_client(server, data); - if (IS_ERR(server->client)) - return PTR_ERR(server->client); - - /* RFC 2623, sec 2.3.2 */ if (server->flags & NFS_MOUNT_VER3) { -#ifdef CONFIG_NFS_V3_ACL - if (!(server->flags & NFS_MOUNT_NOACL)) { - server->client_acl = rpc_bind_new_program(server->client, &nfsacl_program, 3); - /* No errors! Assume that Sun nfsacls are supported */ - if (!IS_ERR(server->client_acl)) - server->caps |= NFS_CAP_ACLS; - } -#else - server->flags &= ~NFS_MOUNT_NOACL; -#endif /* CONFIG_NFS_V3_ACL */ - /* - * The VFS shouldn't apply the umask to mode bits. We will - * do so ourselves when necessary. + /* The VFS shouldn't apply the umask to mode bits. We will do + * so ourselves when necessary. */ sb->s_flags |= MS_POSIXACL; - if (server->namelen == 0 || server->namelen > NFS3_MAXNAMLEN) - server->namelen = NFS3_MAXNAMLEN; sb->s_time_gran = 1; - } else { - if (server->namelen == 0 || server->namelen > NFS2_MAXNAMLEN) - server->namelen = NFS2_MAXNAMLEN; } - sb->s_op = &nfs_sops; - return nfs_sb_init(sb, authflavor); + sb->s_op = old_sb->s_op; + nfs_initialise_sb(sb); } -static int nfs_set_super(struct super_block *s, void *data) +static int nfs_set_super(struct super_block *s, void *_server) { - s->s_fs_info = data; - return set_anon_super(s, data); + struct nfs_server *server = _server; + int ret; + + s->s_fs_info = server; + ret = set_anon_super(s, server); + if (ret == 0) + server->s_dev = s->s_dev; + return ret; } static int nfs_compare_super(struct super_block *sb, void *data) { - struct nfs_server *server = data; - struct nfs_server *old = NFS_SB(sb); + struct nfs_server *server = data, *old = NFS_SB(sb); - if (old->addr.sin_addr.s_addr != server->addr.sin_addr.s_addr) + if (old->nfs_client != server->nfs_client) return 0; - if (old->addr.sin_port != server->addr.sin_port) + if (memcmp(&old->fsid, &server->fsid, sizeof(old->fsid)) != 0) return 0; - return !nfs_compare_fh(&old->fh, &server->fh); + return 1; } static int nfs_get_sb(struct file_system_type *fs_type, int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt) { - int error; struct nfs_server *server = NULL; struct super_block *s; - struct nfs_fh *root; + struct nfs_fh mntfh; struct nfs_mount_data *data = raw_data; + struct dentry *mntroot; + int error; - error = -EINVAL; - if (data == NULL) { - dprintk("%s: missing data argument\n", __FUNCTION__); - goto out_err_noserver; - } - if (data->version <= 0 || data->version > NFS_MOUNT_VERSION) { - dprintk("%s: bad mount version\n", __FUNCTION__); - goto out_err_noserver; - } - switch (data->version) { - case 1: - data->namlen = 0; - case 2: - data->bsize = 0; - case 3: - if (data->flags & NFS_MOUNT_VER3) { - dprintk("%s: mount structure version %d does not support NFSv3\n", - __FUNCTION__, - data->version); - goto out_err_noserver; - } - data->root.size = NFS2_FHSIZE; - memcpy(data->root.data, data->old_root.data, NFS2_FHSIZE); - case 4: - if (data->flags & NFS_MOUNT_SECFLAVOUR) { - dprintk("%s: mount structure version %d does not support strong security\n", - __FUNCTION__, - data->version); - goto out_err_noserver; - } - case 5: - memset(data->context, 0, sizeof(data->context)); - } -#ifndef CONFIG_NFS_V3 - /* If NFSv3 is not compiled in, return -EPROTONOSUPPORT */ - error = -EPROTONOSUPPORT; - if (data->flags & NFS_MOUNT_VER3) { - dprintk("%s: NFSv3 not compiled into kernel\n", __FUNCTION__); - goto out_err_noserver; - } -#endif /* CONFIG_NFS_V3 */ + /* Validate the mount data */ + error = nfs_validate_mount_data(data, &mntfh); + if (error < 0) + return error; - error = -ENOMEM; - server = kzalloc(sizeof(struct nfs_server), GFP_KERNEL); - if (!server) + /* Get a volume representation */ + server = nfs_create_server(data, &mntfh); + if (IS_ERR(server)) { + error = PTR_ERR(server); goto out_err_noserver; - /* Zero out the NFS state stuff */ - init_nfsv4_state(server); - server->client = server->client_acl = ERR_PTR(-EINVAL); - - root = &server->fh; - if (data->flags & NFS_MOUNT_VER3) - root->size = data->root.size; - else - root->size = NFS2_FHSIZE; - error = -EINVAL; - if (root->size > sizeof(root->data)) { - dprintk("%s: invalid root filehandle\n", __FUNCTION__); - goto out_err; - } - memcpy(root->data, data->root.data, root->size); - - /* We now require that the mount process passes the remote address */ - memcpy(&server->addr, &data->addr, sizeof(server->addr)); - if (server->addr.sin_addr.s_addr == INADDR_ANY) { - dprintk("%s: mount program didn't pass remote address!\n", - __FUNCTION__); - goto out_err; } + /* Get a superblock - note that we may end up sharing one that already exists */ s = sget(fs_type, nfs_compare_super, nfs_set_super, server); if (IS_ERR(s)) { error = PTR_ERR(s); - goto out_err; + goto out_err_nosb; } - if (s->s_root) - goto out_share; + if (s->s_fs_info != server) { + nfs_free_server(server); + server = NULL; + } - s->s_flags = flags; + if (!s->s_root) { + /* initial superblock/root creation */ + s->s_flags = flags; + nfs_fill_super(s, data); + } - error = nfs_fill_super(s, data, flags & MS_SILENT ? 1 : 0); - if (error) { - up_write(&s->s_umount); - deactivate_super(s); - return error; + mntroot = nfs_get_root(s, &mntfh); + if (IS_ERR(mntroot)) { + error = PTR_ERR(mntroot); + goto error_splat_super; } - s->s_flags |= MS_ACTIVE; - return simple_set_mnt(mnt, s); -out_share: - kfree(server); - return simple_set_mnt(mnt, s); + s->s_flags |= MS_ACTIVE; + mnt->mnt_sb = s; + mnt->mnt_root = mntroot; + return 0; -out_err: - kfree(server); +out_err_nosb: + nfs_free_server(server); out_err_noserver: return error; + +error_splat_super: + up_write(&s->s_umount); + deactivate_super(s); + return error; } +/* + * Destroy an NFS2/3 superblock + */ static void nfs_kill_super(struct super_block *s) { struct nfs_server *server = NFS_SB(s); kill_anon_super(s); - - if (!IS_ERR(server->client)) - rpc_shutdown_client(server->client); - if (!IS_ERR(server->client_acl)) - rpc_shutdown_client(server->client_acl); - - if (!(server->flags & NFS_MOUNT_NONLM)) - lockd_down(); /* release rpc.lockd */ - - nfs_free_iostats(server->io_stats); - kfree(server->hostname); - nfs_put_client(server->nfs_client); - kfree(server); - nfs_release_automount_timer(); + nfs_free_server(server); } -static struct super_block *nfs_clone_sb(struct nfs_server *server, struct nfs_clone_mount *data) -{ - struct super_block *sb; - - server->fsid = data->fattr->fsid; - nfs_copy_fh(&server->fh, data->fh); - sb = sget(&nfs_fs_type, nfs_compare_super, nfs_set_super, server); - if (!IS_ERR(sb) && sb->s_root == NULL && !(server->flags & NFS_MOUNT_NONLM)) - lockd_up(); - return sb; -} - -static int nfs_clone_nfs_sb(struct file_system_type *fs_type, - int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt) +/* + * Clone an NFS2/3 server record on xdev traversal (FSID-change) + */ +static int nfs_xdev_get_sb(struct file_system_type *fs_type, int flags, + const char *dev_name, void *raw_data, + struct vfsmount *mnt) { struct nfs_clone_mount *data = raw_data; - return nfs_clone_generic_sb(data, nfs_clone_sb, nfs_clone_server, mnt); -} + struct super_block *s; + struct nfs_server *server; + struct dentry *mntroot; + int error; -#ifdef CONFIG_NFS_V4 -static struct rpc_clnt *nfs4_create_client(struct nfs_server *server, - int timeo, int retrans, int proto, rpc_authflavor_t flavor) -{ - struct nfs_client *clp; - struct rpc_clnt *clnt = NULL; - int err = -EIO; - - clp = nfs_get_client(server->hostname, &server->addr, 4); - if (!clp) { - dprintk("%s: failed to create NFS4 client.\n", __FUNCTION__); - return ERR_PTR(err); - } + dprintk("--> nfs_xdev_get_sb()\n"); - /* Now create transport and client */ - if (clp->cl_cons_state == NFS_CS_INITING) { - clp->rpc_ops = &nfs_v4_clientops; + /* create a new volume representation */ + server = nfs_clone_server(NFS_SB(data->sb), data->fh, data->fattr); + if (IS_ERR(server)) { + error = PTR_ERR(server); + goto out_err_noserver; + } - err = nfs_create_rpc_client(clp, proto, timeo, retrans, flavor); - if (err < 0) - goto client_init_error; + /* Get a superblock - note that we may end up sharing one that already exists */ + s = sget(&nfs_fs_type, nfs_compare_super, nfs_set_super, server); + if (IS_ERR(s)) { + error = PTR_ERR(s); + goto out_err_nosb; + } - memcpy(clp->cl_ipaddr, server->ip_addr, sizeof(clp->cl_ipaddr)); - err = nfs_idmap_new(clp); - if (err < 0) { - dprintk("%s: failed to create idmapper.\n", - __FUNCTION__); - goto client_init_error; - } - __set_bit(NFS_CS_IDMAP, &clp->cl_res_state); - nfs_mark_client_ready(clp, 0); + if (s->s_fs_info != server) { + nfs_free_server(server); + server = NULL; } - clnt = rpc_clone_client(clp->cl_rpcclient); + if (!s->s_root) { + /* initial superblock/root creation */ + s->s_flags = flags; + nfs_clone_super(s, data->sb); + } - if (IS_ERR(clnt)) { - dprintk("%s: cannot create RPC client. Error = %d\n", - __FUNCTION__, err); - return clnt; + mntroot = nfs_get_root(s, data->fh); + if (IS_ERR(mntroot)) { + error = PTR_ERR(mntroot); + goto error_splat_super; } - if (clnt->cl_auth->au_flavor != flavor) { - struct rpc_auth *auth; + s->s_flags |= MS_ACTIVE; + mnt->mnt_sb = s; + mnt->mnt_root = mntroot; - auth = rpcauth_create(flavor, clnt); - if (IS_ERR(auth)) { - dprintk("%s: couldn't create credcache!\n", __FUNCTION__); - return (struct rpc_clnt *)auth; - } - } + dprintk("<-- nfs_xdev_get_sb() = 0\n"); + return 0; - server->nfs_client = clp; - down_write(&clp->cl_sem); - list_add_tail(&server->nfs4_siblings, &clp->cl_superblocks); - up_write(&clp->cl_sem); - return clnt; +out_err_nosb: + nfs_free_server(server); +out_err_noserver: + dprintk("<-- nfs_xdev_get_sb() = %d [error]\n", error); + return error; -client_init_error: - nfs_mark_client_ready(clp, err); - nfs_put_client(clp); - return ERR_PTR(err); +error_splat_super: + up_write(&s->s_umount); + deactivate_super(s); + dprintk("<-- nfs_xdev_get_sb() = %d [splat]\n", error); + return error; } +#ifdef CONFIG_NFS_V4 + /* - * Set up an NFS4 superblock + * Finish setting up a cloned NFS4 superblock */ -static int nfs4_fill_super(struct super_block *sb, struct nfs4_mount_data *data, int silent) +static void nfs4_clone_super(struct super_block *sb, + const struct super_block *old_sb) { - struct nfs_server *server; - rpc_authflavor_t authflavour; - int err = -EIO; - - sb->s_blocksize_bits = 0; - sb->s_blocksize = 0; - server = NFS_SB(sb); - if (data->rsize != 0) - server->rsize = nfs_block_size(data->rsize, NULL); - if (data->wsize != 0) - server->wsize = nfs_block_size(data->wsize, NULL); - server->flags = data->flags & NFS_MOUNT_FLAGMASK; - server->caps = NFS_CAP_ATOMIC_OPEN; - - server->acregmin = data->acregmin*HZ; - server->acregmax = data->acregmax*HZ; - server->acdirmin = data->acdirmin*HZ; - server->acdirmax = data->acdirmax*HZ; - - /* Now create transport and client */ - authflavour = RPC_AUTH_UNIX; - if (data->auth_flavourlen != 0) { - if (data->auth_flavourlen != 1) { - dprintk("%s: Invalid number of RPC auth flavours %d.\n", - __FUNCTION__, data->auth_flavourlen); - err = -EINVAL; - goto out_fail; - } - if (copy_from_user(&authflavour, data->auth_flavours, sizeof(authflavour))) { - err = -EFAULT; - goto out_fail; - } - } - - server->client = nfs4_create_client(server, data->timeo, data->retrans, - data->proto, authflavour); - if (IS_ERR(server->client)) { - err = PTR_ERR(server->client); - dprintk("%s: cannot create RPC client. Error = %d\n", - __FUNCTION__, err); - goto out_fail; - } - + sb->s_blocksize_bits = old_sb->s_blocksize_bits; + sb->s_blocksize = old_sb->s_blocksize; + sb->s_maxbytes = old_sb->s_maxbytes; sb->s_time_gran = 1; - - sb->s_op = &nfs4_sops; - err = nfs_sb_init(sb, authflavour); - - out_fail: - return err; + sb->s_op = old_sb->s_op; + nfs_initialise_sb(sb); } -static int nfs4_compare_super(struct super_block *sb, void *data) +/* + * Set up an NFS4 superblock + */ +static void nfs4_fill_super(struct super_block *sb) { - struct nfs_server *server = data; - struct nfs_server *old = NFS_SB(sb); - - if (strcmp(server->hostname, old->hostname) != 0) - return 0; - if (strcmp(server->mnt_path, old->mnt_path) != 0) - return 0; - return 1; + sb->s_time_gran = 1; + sb->s_op = &nfs4_sops; + nfs_initialise_sb(sb); } -static void * -nfs_copy_user_string(char *dst, struct nfs_string *src, int maxlen) +static void *nfs_copy_user_string(char *dst, struct nfs_string *src, int maxlen) { void *p = NULL; @@ -1157,14 +791,22 @@ nfs_copy_user_string(char *dst, struct nfs_string *src, int maxlen) return dst; } +/* + * Get the superblock for an NFS4 mountpoint + */ static int nfs4_get_sb(struct file_system_type *fs_type, int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt) { - int error; - struct nfs_server *server; - struct super_block *s; struct nfs4_mount_data *data = raw_data; + struct super_block *s; + struct nfs_server *server; + struct sockaddr_in addr; + rpc_authflavor_t authflavour; + struct nfs_fh mntfh; + struct dentry *mntroot; + char *mntpath = NULL, *hostname = NULL, ip_addr[16]; void *p; + int error; if (data == NULL) { dprintk("%s: missing data argument\n", __FUNCTION__); @@ -1175,75 +817,107 @@ static int nfs4_get_sb(struct file_system_type *fs_type, return -EINVAL; } - server = kzalloc(sizeof(struct nfs_server), GFP_KERNEL); - if (!server) - return -ENOMEM; - /* Zero out the NFS state stuff */ - init_nfsv4_state(server); - server->client = server->client_acl = ERR_PTR(-EINVAL); + /* We now require that the mount process passes the remote address */ + if (data->host_addrlen != sizeof(addr)) + return -EINVAL; + + if (copy_from_user(&addr, data->host_addr, sizeof(addr))) + return -EFAULT; + + if (addr.sin_family != AF_INET || + addr.sin_addr.s_addr == INADDR_ANY + ) { + dprintk("%s: mount program didn't pass remote IP address!\n", + __FUNCTION__); + return -EINVAL; + } + + /* Grab the authentication type */ + authflavour = RPC_AUTH_UNIX; + if (data->auth_flavourlen != 0) { + if (data->auth_flavourlen != 1) { + dprintk("%s: Invalid number of RPC auth flavours %d.\n", + __FUNCTION__, data->auth_flavourlen); + error = -EINVAL; + goto out_err_noserver; + } + + if (copy_from_user(&authflavour, data->auth_flavours, + sizeof(authflavour))) { + error = -EFAULT; + goto out_err_noserver; + } + } p = nfs_copy_user_string(NULL, &data->hostname, 256); if (IS_ERR(p)) goto out_err; - server->hostname = p; + hostname = p; p = nfs_copy_user_string(NULL, &data->mnt_path, 1024); if (IS_ERR(p)) goto out_err; - server->mnt_path = p; + mntpath = p; - p = nfs_copy_user_string(server->ip_addr, &data->client_addr, - sizeof(server->ip_addr) - 1); + dprintk("MNTPATH: %s\n", mntpath); + + p = nfs_copy_user_string(ip_addr, &data->client_addr, + sizeof(ip_addr) - 1); if (IS_ERR(p)) goto out_err; - /* We now require that the mount process passes the remote address */ - if (data->host_addrlen != sizeof(server->addr)) { - error = -EINVAL; - goto out_free; - } - if (copy_from_user(&server->addr, data->host_addr, sizeof(server->addr))) { - error = -EFAULT; - goto out_free; - } - if (server->addr.sin_family != AF_INET || - server->addr.sin_addr.s_addr == INADDR_ANY) { - dprintk("%s: mount program didn't pass remote IP address!\n", - __FUNCTION__); - error = -EINVAL; - goto out_free; + /* Get a volume representation */ + server = nfs4_create_server(data, hostname, &addr, mntpath, ip_addr, + authflavour, &mntfh); + if (IS_ERR(server)) { + error = PTR_ERR(server); + goto out_err_noserver; } - s = sget(fs_type, nfs4_compare_super, nfs_set_super, server); + /* Get a superblock - note that we may end up sharing one that already exists */ + s = sget(fs_type, nfs_compare_super, nfs_set_super, server); if (IS_ERR(s)) { error = PTR_ERR(s); goto out_free; } - if (s->s_root) { - kfree(server->mnt_path); - kfree(server->hostname); - kfree(server); - return simple_set_mnt(mnt, s); - } + if (!s->s_root) { + /* initial superblock/root creation */ + s->s_flags = flags; - s->s_flags = flags; + nfs4_fill_super(s); + } else { + nfs_free_server(server); + } - error = nfs4_fill_super(s, data, flags & MS_SILENT ? 1 : 0); - if (error) { - up_write(&s->s_umount); - deactivate_super(s); - return error; + mntroot = nfs4_get_root(s, &mntfh); + if (IS_ERR(mntroot)) { + error = PTR_ERR(mntroot); + goto error_splat_super; } + s->s_flags |= MS_ACTIVE; - return simple_set_mnt(mnt, s); + mnt->mnt_sb = s; + mnt->mnt_root = mntroot; + kfree(mntpath); + kfree(hostname); + return 0; + out_err: error = PTR_ERR(p); + goto out_err_noserver; + out_free: - kfree(server->mnt_path); - kfree(server->hostname); - kfree(server); + nfs_free_server(server); +out_err_noserver: + kfree(mntpath); + kfree(hostname); return error; + +error_splat_super: + up_write(&s->s_umount); + deactivate_super(s); + goto out_err_noserver; } static void nfs4_kill_super(struct super_block *sb) @@ -1254,133 +928,140 @@ static void nfs4_kill_super(struct super_block *sb) kill_anon_super(sb); nfs4_renewd_prepare_shutdown(server); - - if (server->client != NULL && !IS_ERR(server->client)) - rpc_shutdown_client(server->client); - - destroy_nfsv4_state(server); - - nfs_free_iostats(server->io_stats); - kfree(server->hostname); - kfree(server); - nfs_release_automount_timer(); + nfs_free_server(server); } /* - * Constructs the SERVER-side path + * Clone an NFS4 server record on xdev traversal (FSID-change) */ -static inline char *nfs4_dup_path(const struct dentry *dentry) +static int nfs4_xdev_get_sb(struct file_system_type *fs_type, int flags, + const char *dev_name, void *raw_data, + struct vfsmount *mnt) { - char *page = (char *) __get_free_page(GFP_USER); - char *path; + struct nfs_clone_mount *data = raw_data; + struct super_block *s; + struct nfs_server *server; + struct dentry *mntroot; + int error; - path = nfs4_path(dentry, page, PAGE_SIZE); - if (!IS_ERR(path)) { - int len = PAGE_SIZE + page - path; - char *tmp = path; + dprintk("--> nfs4_xdev_get_sb()\n"); - path = kmalloc(len, GFP_KERNEL); - if (path) - memcpy(path, tmp, len); - else - path = ERR_PTR(-ENOMEM); + /* create a new volume representation */ + server = nfs_clone_server(NFS_SB(data->sb), data->fh, data->fattr); + if (IS_ERR(server)) { + error = PTR_ERR(server); + goto out_err_noserver; } - free_page((unsigned long)page); - return path; -} -static struct super_block *nfs4_clone_sb(struct nfs_server *server, struct nfs_clone_mount *data) -{ - const struct dentry *dentry = data->dentry; - struct nfs_client *clp = server->nfs_client; - struct super_block *sb; - - server->fsid = data->fattr->fsid; - nfs_copy_fh(&server->fh, data->fh); - server->mnt_path = nfs4_dup_path(dentry); - if (IS_ERR(server->mnt_path)) { - sb = (struct super_block *)server->mnt_path; - goto err; + /* Get a superblock - note that we may end up sharing one that already exists */ + s = sget(&nfs_fs_type, nfs_compare_super, nfs_set_super, server); + if (IS_ERR(s)) { + error = PTR_ERR(s); + goto out_err_nosb; } - sb = sget(&nfs4_fs_type, nfs4_compare_super, nfs_set_super, server); - if (IS_ERR(sb) || sb->s_root) - goto free_path; - nfs4_server_capabilities(server, &server->fh); - - down_write(&clp->cl_sem); - list_add_tail(&server->nfs4_siblings, &clp->cl_superblocks); - up_write(&clp->cl_sem); - return sb; -free_path: - kfree(server->mnt_path); -err: - server->mnt_path = NULL; - return sb; -} -static int nfs_clone_nfs4_sb(struct file_system_type *fs_type, - int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt) -{ - struct nfs_clone_mount *data = raw_data; - return nfs_clone_generic_sb(data, nfs4_clone_sb, nfs_clone_server, mnt); -} + if (s->s_fs_info != server) { + nfs_free_server(server); + server = NULL; + } -static struct super_block *nfs4_referral_sb(struct nfs_server *server, struct nfs_clone_mount *data) -{ - struct super_block *sb = ERR_PTR(-ENOMEM); - int len; - - len = strlen(data->mnt_path) + 1; - server->mnt_path = kmalloc(len, GFP_KERNEL); - if (server->mnt_path == NULL) - goto err; - memcpy(server->mnt_path, data->mnt_path, len); - memcpy(&server->addr, data->addr, sizeof(struct sockaddr_in)); - - sb = sget(&nfs4_fs_type, nfs4_compare_super, nfs_set_super, server); - if (IS_ERR(sb) || sb->s_root) - goto free_path; - return sb; -free_path: - kfree(server->mnt_path); -err: - server->mnt_path = NULL; - return sb; -} + if (!s->s_root) { + /* initial superblock/root creation */ + s->s_flags = flags; + nfs4_clone_super(s, data->sb); + } -static struct nfs_server *nfs4_referral_server(struct super_block *sb, struct nfs_clone_mount *data) -{ - struct nfs_server *server = NFS_SB(sb); - int proto, timeo, retrans; - void *err; - - proto = IPPROTO_TCP; - /* Since we are following a referral and there may be alternatives, - set the timeouts and retries to low values */ - timeo = 2; - retrans = 1; - - nfs_put_client(server->nfs_client); - server->nfs_client = NULL; - server->client = nfs4_create_client(server, timeo, retrans, proto, - data->authflavor); - if (IS_ERR((err = server->client))) - goto out_err; + mntroot = nfs4_get_root(s, data->fh); + if (IS_ERR(mntroot)) { + error = PTR_ERR(mntroot); + goto error_splat_super; + } - sb->s_time_gran = 1; - sb->s_op = &nfs4_sops; - err = ERR_PTR(nfs_sb_init(sb, data->authflavor)); - if (!IS_ERR(err)) - return server; -out_err: - return (struct nfs_server *)err; + s->s_flags |= MS_ACTIVE; + mnt->mnt_sb = s; + mnt->mnt_root = mntroot; + + dprintk("<-- nfs4_xdev_get_sb() = 0\n"); + return 0; + +out_err_nosb: + nfs_free_server(server); +out_err_noserver: + dprintk("<-- nfs4_xdev_get_sb() = %d [error]\n", error); + return error; + +error_splat_super: + up_write(&s->s_umount); + deactivate_super(s); + dprintk("<-- nfs4_xdev_get_sb() = %d [splat]\n", error); + return error; } -static int nfs_referral_nfs4_sb(struct file_system_type *fs_type, - int flags, const char *dev_name, void *raw_data, struct vfsmount *mnt) +/* + * Create an NFS4 server record on referral traversal + */ +static int nfs4_referral_get_sb(struct file_system_type *fs_type, int flags, + const char *dev_name, void *raw_data, + struct vfsmount *mnt) { struct nfs_clone_mount *data = raw_data; - return nfs_clone_generic_sb(data, nfs4_referral_sb, nfs4_referral_server, mnt); + struct super_block *s; + struct nfs_server *server; + struct dentry *mntroot; + struct nfs_fh mntfh; + int error; + + dprintk("--> nfs4_referral_get_sb()\n"); + + /* create a new volume representation */ + server = nfs4_create_referral_server(data, &mntfh); + if (IS_ERR(server)) { + error = PTR_ERR(server); + goto out_err_noserver; + } + + /* Get a superblock - note that we may end up sharing one that already exists */ + s = sget(&nfs_fs_type, nfs_compare_super, nfs_set_super, server); + if (IS_ERR(s)) { + error = PTR_ERR(s); + goto out_err_nosb; + } + + if (s->s_fs_info != server) { + nfs_free_server(server); + server = NULL; + } + + if (!s->s_root) { + /* initial superblock/root creation */ + s->s_flags = flags; + nfs4_fill_super(s); + } + + mntroot = nfs4_get_root(s, data->fh); + if (IS_ERR(mntroot)) { + error = PTR_ERR(mntroot); + goto error_splat_super; + } + + s->s_flags |= MS_ACTIVE; + mnt->mnt_sb = s; + mnt->mnt_root = mntroot; + + dprintk("<-- nfs4_referral_get_sb() = 0\n"); + return 0; + +out_err_nosb: + nfs_free_server(server); +out_err_noserver: + dprintk("<-- nfs4_referral_get_sb() = %d [error]\n", error); + return error; + +error_splat_super: + up_write(&s->s_umount); + deactivate_super(s); + dprintk("<-- nfs4_referral_get_sb() = %d [splat]\n", error); + return error; } -#endif +#endif /* CONFIG_NFS_V4 */ diff --git a/fs/nfs/write.c b/fs/nfs/write.c index 7084ac9..453d446 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -1273,7 +1273,7 @@ int nfs_writeback_done(struct rpc_task *task, struct nfs_write_data *data) if (time_before(complain, jiffies)) { dprintk("NFS: faulty NFS server %s:" " (committed = %d) != (stable = %d)\n", - NFS_SERVER(data->inode)->hostname, + NFS_SERVER(data->inode)->nfs_client->cl_hostname, resp->verf->committed, argp->stable); complain = jiffies + 300 * HZ; } diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index d404cec..6d0be0e 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -51,7 +51,6 @@ struct nfs_client { unsigned long cl_lease_time; unsigned long cl_last_renewal; struct work_struct cl_renewd; - struct work_struct cl_recoverd; struct rpc_wait_queue cl_rpcwaitq; @@ -74,6 +73,10 @@ struct nfs_client { */ struct nfs_server { struct nfs_client * nfs_client; /* shared client and NFS4 state */ + struct list_head client_link; /* List of other nfs_server structs + * that share the same client + */ + struct list_head master_link; /* link in master servers list */ struct rpc_clnt * client; /* RPC client handle */ struct rpc_clnt * client_acl; /* ACL RPC client handle */ struct nfs_iostats * io_stats; /* I/O statistics */ @@ -92,20 +95,13 @@ struct nfs_server { unsigned int acdirmin; unsigned int acdirmax; unsigned int namelen; - char * hostname; /* remote hostname */ - struct nfs_fh fh; - struct sockaddr_in addr; + struct nfs_fsid fsid; + __u64 maxfilesize; /* maximum file size */ unsigned long mount_time; /* when this fs was mounted */ + dev_t s_dev; /* superblock dev numbers */ + #ifdef CONFIG_NFS_V4 - /* Our own IP address, as a null-terminated string. - * This is used to generate the clientid, and the callback address. - */ - char ip_addr[16]; - char * mnt_path; - struct list_head nfs4_siblings; /* List of other nfs_server structs - * that share the same clientid - */ u32 attr_bitmask[2];/* V4 bitmask representing the set of attributes supported on this filesystem */ @@ -113,6 +109,7 @@ struct nfs_server { that are supported on this filesystem */ #endif + void (*destroy)(struct nfs_server *); }; /* Server capabilities */ -- cgit v0.10.2 From 6aaca566503296a73f956908ec98173946134fe2 Mon Sep 17 00:00:00 2001 From: David Howells Date: Tue, 22 Aug 2006 20:06:13 -0400 Subject: NFS: Add server and volume lists to /proc Make two new proc files available: /proc/fs/nfsfs/servers /proc/fs/nfsfs/volumes The first lists the servers with which we are currently dealing (struct nfs_client), and the second lists the volumes we have on those servers (struct nfs_server). Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/client.c b/fs/nfs/client.c index dafba60..27f6478 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -1148,3 +1148,287 @@ out_free_server: dprintk("<-- nfs_clone_server() = error %d\n", error); return ERR_PTR(error); } + +#ifdef CONFIG_PROC_FS +static struct proc_dir_entry *proc_fs_nfs; + +static int nfs_server_list_open(struct inode *inode, struct file *file); +static void *nfs_server_list_start(struct seq_file *p, loff_t *pos); +static void *nfs_server_list_next(struct seq_file *p, void *v, loff_t *pos); +static void nfs_server_list_stop(struct seq_file *p, void *v); +static int nfs_server_list_show(struct seq_file *m, void *v); + +static struct seq_operations nfs_server_list_ops = { + .start = nfs_server_list_start, + .next = nfs_server_list_next, + .stop = nfs_server_list_stop, + .show = nfs_server_list_show, +}; + +static struct file_operations nfs_server_list_fops = { + .open = nfs_server_list_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + +static int nfs_volume_list_open(struct inode *inode, struct file *file); +static void *nfs_volume_list_start(struct seq_file *p, loff_t *pos); +static void *nfs_volume_list_next(struct seq_file *p, void *v, loff_t *pos); +static void nfs_volume_list_stop(struct seq_file *p, void *v); +static int nfs_volume_list_show(struct seq_file *m, void *v); + +static struct seq_operations nfs_volume_list_ops = { + .start = nfs_volume_list_start, + .next = nfs_volume_list_next, + .stop = nfs_volume_list_stop, + .show = nfs_volume_list_show, +}; + +static struct file_operations nfs_volume_list_fops = { + .open = nfs_volume_list_open, + .read = seq_read, + .llseek = seq_lseek, + .release = seq_release, +}; + +/* + * open "/proc/fs/nfsfs/servers" which provides a summary of servers with which + * we're dealing + */ +static int nfs_server_list_open(struct inode *inode, struct file *file) +{ + struct seq_file *m; + int ret; + + ret = seq_open(file, &nfs_server_list_ops); + if (ret < 0) + return ret; + + m = file->private_data; + m->private = PDE(inode)->data; + + return 0; +} + +/* + * set up the iterator to start reading from the server list and return the first item + */ +static void *nfs_server_list_start(struct seq_file *m, loff_t *_pos) +{ + struct list_head *_p; + loff_t pos = *_pos; + + /* lock the list against modification */ + spin_lock(&nfs_client_lock); + + /* allow for the header line */ + if (!pos) + return SEQ_START_TOKEN; + pos--; + + /* find the n'th element in the list */ + list_for_each(_p, &nfs_client_list) + if (!pos--) + break; + + return _p != &nfs_client_list ? _p : NULL; +} + +/* + * move to next server + */ +static void *nfs_server_list_next(struct seq_file *p, void *v, loff_t *pos) +{ + struct list_head *_p; + + (*pos)++; + + _p = v; + _p = (v == SEQ_START_TOKEN) ? nfs_client_list.next : _p->next; + + return _p != &nfs_client_list ? _p : NULL; +} + +/* + * clean up after reading from the transports list + */ +static void nfs_server_list_stop(struct seq_file *p, void *v) +{ + spin_unlock(&nfs_client_lock); +} + +/* + * display a header line followed by a load of call lines + */ +static int nfs_server_list_show(struct seq_file *m, void *v) +{ + struct nfs_client *clp; + + /* display header on line 1 */ + if (v == SEQ_START_TOKEN) { + seq_puts(m, "NV SERVER PORT USE HOSTNAME\n"); + return 0; + } + + /* display one transport per line on subsequent lines */ + clp = list_entry(v, struct nfs_client, cl_share_link); + + seq_printf(m, "v%d %02x%02x%02x%02x %4hx %3d %s\n", + clp->cl_nfsversion, + NIPQUAD(clp->cl_addr.sin_addr), + ntohs(clp->cl_addr.sin_port), + atomic_read(&clp->cl_count), + clp->cl_hostname); + + return 0; +} + +/* + * open "/proc/fs/nfsfs/volumes" which provides a summary of extant volumes + */ +static int nfs_volume_list_open(struct inode *inode, struct file *file) +{ + struct seq_file *m; + int ret; + + ret = seq_open(file, &nfs_volume_list_ops); + if (ret < 0) + return ret; + + m = file->private_data; + m->private = PDE(inode)->data; + + return 0; +} + +/* + * set up the iterator to start reading from the volume list and return the first item + */ +static void *nfs_volume_list_start(struct seq_file *m, loff_t *_pos) +{ + struct list_head *_p; + loff_t pos = *_pos; + + /* lock the list against modification */ + spin_lock(&nfs_client_lock); + + /* allow for the header line */ + if (!pos) + return SEQ_START_TOKEN; + pos--; + + /* find the n'th element in the list */ + list_for_each(_p, &nfs_volume_list) + if (!pos--) + break; + + return _p != &nfs_volume_list ? _p : NULL; +} + +/* + * move to next volume + */ +static void *nfs_volume_list_next(struct seq_file *p, void *v, loff_t *pos) +{ + struct list_head *_p; + + (*pos)++; + + _p = v; + _p = (v == SEQ_START_TOKEN) ? nfs_volume_list.next : _p->next; + + return _p != &nfs_volume_list ? _p : NULL; +} + +/* + * clean up after reading from the transports list + */ +static void nfs_volume_list_stop(struct seq_file *p, void *v) +{ + spin_unlock(&nfs_client_lock); +} + +/* + * display a header line followed by a load of call lines + */ +static int nfs_volume_list_show(struct seq_file *m, void *v) +{ + struct nfs_server *server; + struct nfs_client *clp; + char dev[8], fsid[17]; + + /* display header on line 1 */ + if (v == SEQ_START_TOKEN) { + seq_puts(m, "NV SERVER PORT DEV FSID\n"); + return 0; + } + /* display one transport per line on subsequent lines */ + server = list_entry(v, struct nfs_server, master_link); + clp = server->nfs_client; + + snprintf(dev, 8, "%u:%u", + MAJOR(server->s_dev), MINOR(server->s_dev)); + + snprintf(fsid, 17, "%llx:%llx", + server->fsid.major, server->fsid.minor); + + seq_printf(m, "v%d %02x%02x%02x%02x %4hx %-7s %-17s\n", + clp->cl_nfsversion, + NIPQUAD(clp->cl_addr.sin_addr), + ntohs(clp->cl_addr.sin_port), + dev, + fsid); + + return 0; +} + +/* + * initialise the /proc/fs/nfsfs/ directory + */ +int __init nfs_fs_proc_init(void) +{ + struct proc_dir_entry *p; + + proc_fs_nfs = proc_mkdir("nfsfs", proc_root_fs); + if (!proc_fs_nfs) + goto error_0; + + proc_fs_nfs->owner = THIS_MODULE; + + /* a file of servers with which we're dealing */ + p = create_proc_entry("servers", S_IFREG|S_IRUGO, proc_fs_nfs); + if (!p) + goto error_1; + + p->proc_fops = &nfs_server_list_fops; + p->owner = THIS_MODULE; + + /* a file of volumes that we have mounted */ + p = create_proc_entry("volumes", S_IFREG|S_IRUGO, proc_fs_nfs); + if (!p) + goto error_2; + + p->proc_fops = &nfs_volume_list_fops; + p->owner = THIS_MODULE; + return 0; + +error_2: + remove_proc_entry("servers", proc_fs_nfs); +error_1: + remove_proc_entry("nfsfs", proc_root_fs); +error_0: + return -ENOMEM; +} + +/* + * clean up the /proc/fs/nfsfs/ directory + */ +void nfs_fs_proc_exit(void) +{ + remove_proc_entry("volumes", proc_fs_nfs); + remove_proc_entry("servers", proc_fs_nfs); + remove_proc_entry("nfsfs", proc_root_fs); +} + +#endif /* CONFIG_PROC_FS */ diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index a547c58..cb5c65f 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -1141,6 +1141,10 @@ static int __init init_nfs_fs(void) { int err; + err = nfs_fs_proc_init(); + if (err) + goto out5; + err = nfs_init_nfspagecache(); if (err) goto out4; @@ -1181,6 +1185,8 @@ out2: out3: nfs_destroy_nfspagecache(); out4: + nfs_fs_proc_exit(); +out5: return err; } @@ -1195,6 +1201,7 @@ static void __exit exit_nfs_fs(void) rpc_proc_unregister("nfs"); #endif unregister_nfs_fs(); + nfs_fs_proc_exit(); } /* Not quite true; I just maintain it */ diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h index e73ba4f..bea0b01 100644 --- a/fs/nfs/internal.h +++ b/fs/nfs/internal.h @@ -47,6 +47,18 @@ extern void nfs_free_server(struct nfs_server *server); extern struct nfs_server *nfs_clone_server(struct nfs_server *, struct nfs_fh *, struct nfs_fattr *); +#ifdef CONFIG_PROC_FS +extern int __init nfs_fs_proc_init(void); +extern void nfs_fs_proc_exit(void); +#else +static inline int nfs_fs_proc_init(void) +{ + return 0; +} +static inline void nfs_fs_proc_exit(void) +{ +} +#endif /* nfs4namespace.c */ #ifdef CONFIG_NFS_V4 -- cgit v0.10.2 From 27ba851244f627a302d0fc6469d1ad413fc34fcb Mon Sep 17 00:00:00 2001 From: David Howells Date: Sun, 30 Jul 2006 14:40:56 -0400 Subject: NFS: Fix error handling Fix an error handling problem: nfs_put_client() can be given a NULL pointer if nfs_free_server() is asked to destroy a partially initialised record. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 27f6478..700bd58 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -208,6 +208,9 @@ static void nfs_free_client(struct nfs_client *clp) */ void nfs_put_client(struct nfs_client *clp) { + if (!clp) + return; + dprintk("--> nfs_put_client({%d})\n", atomic_read(&clp->cl_count)); if (atomic_dec_and_lock(&clp->cl_count, &nfs_client_lock)) { -- cgit v0.10.2 From 738a35195941ecf604d3070e2a053e1df3de350b Mon Sep 17 00:00:00 2001 From: David Howells Date: Sun, 30 Jul 2006 14:58:27 -0400 Subject: NFS: Secure the roots of the NFS subtrees in a shared superblock Invoke security_d_instantiate() on root dentries after allocating them with dentry_alloc_anon(). Normally dentry_alloc_root() would do that, but we don't call that as we don't want to assign a name to the root dentry at this point (we may discover the real name later). Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/getroot.c b/fs/nfs/getroot.c index 977e590..76b08ae 100644 --- a/fs/nfs/getroot.c +++ b/fs/nfs/getroot.c @@ -33,6 +33,7 @@ #include #include #include +#include #include #include @@ -109,6 +110,8 @@ struct dentry *nfs_get_root(struct super_block *sb, struct nfs_fh *mntfh) return ERR_PTR(-ENOMEM); } + security_d_instantiate(mntroot, inode); + if (!mntroot->d_op) mntroot->d_op = server->nfs_client->rpc_ops->dentry_ops; @@ -296,6 +299,8 @@ struct dentry *nfs4_get_root(struct super_block *sb, struct nfs_fh *mntfh) return ERR_PTR(-ENOMEM); } + security_d_instantiate(mntroot, inode); + if (!mntroot->d_op) mntroot->d_op = server->nfs_client->rpc_ops->dentry_ops; -- cgit v0.10.2 From 36b15c54cd0d6f707a3ac03e4a2a60bb530a95b9 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Tue, 22 Aug 2006 20:06:14 -0400 Subject: NFS: Ensure NFSv2/v3 mounts respect the NFS_MOUNT_SECFLAVOUR flag Signed-off-by: Trond Myklebust diff --git a/fs/nfs/super.c b/fs/nfs/super.c index 867b5dc..97cfb14 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -471,12 +471,14 @@ static int nfs_validate_mount_data(struct nfs_mount_data *data, data->version); return -EINVAL; } - /* Fill in pseudoflavor for mount version < 5 */ - data->pseudoflavor = RPC_AUTH_UNIX; case 5: memset(data->context, 0, sizeof(data->context)); } + /* Set the pseudoflavor */ + if (!(data->flags & NFS_MOUNT_SECFLAVOUR)) + data->pseudoflavor = RPC_AUTH_UNIX; + #ifndef CONFIG_NFS_V3 /* If NFSv3 is not compiled in, return -EPROTONOSUPPORT */ if (data->flags & NFS_MOUNT_VER3) { -- cgit v0.10.2 From 9c5bf38d85a31b946664bcc21078ef5bb10672f7 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Tue, 22 Aug 2006 20:06:14 -0400 Subject: NFS: Fix nfs_alloc_client() The scheme to indicate which services have been started up appears to be seriously broken. Signed-off-by: Trond Myklebust diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 700bd58..471d975 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -113,9 +113,9 @@ static struct nfs_client *nfs_alloc_client(const char *hostname, if (error < 0) { dprintk("%s: couldn't start rpciod! Error = %d\n", __FUNCTION__, error); - __set_bit(NFS_CS_RPCIOD, &clp->cl_res_state); goto error_1; } + __set_bit(NFS_CS_RPCIOD, &clp->cl_res_state); if (nfsversion == 4) { if (nfs_callback_up() < 0) @@ -153,8 +153,8 @@ static struct nfs_client *nfs_alloc_client(const char *hostname, return clp; error_3: - nfs_callback_down(); - __clear_bit(NFS_CS_CALLBACK, &clp->cl_res_state); + if (__test_and_clear_bit(NFS_CS_CALLBACK, &clp->cl_res_state)) + nfs_callback_down(); error_2: rpciod_down(); __clear_bit(NFS_CS_RPCIOD, &clp->cl_res_state); @@ -195,7 +195,7 @@ static void nfs_free_client(struct nfs_client *clp) nfs_callback_down(); if (__test_and_clear_bit(NFS_CS_RPCIOD, &clp->cl_res_state)) - rpciod_down(); + rpciod_down(); kfree(clp->cl_hostname); kfree(clp); @@ -881,9 +881,9 @@ static int nfs4_init_client(struct nfs_client *clp, if (error < 0) { dprintk("%s: failed to create idmapper. Error = %d\n", __FUNCTION__, error); - __set_bit(NFS_CS_IDMAP, &clp->cl_res_state); goto error; } + __set_bit(NFS_CS_IDMAP, &clp->cl_res_state); nfs_mark_client_ready(clp, NFS_CS_READY); return 0; -- cgit v0.10.2 From ec739ef03dc926d05051c8c5838971445504470a Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:15 -0400 Subject: SUNRPC: Create a helper to tell whether a transport is bound Hide the contents and format of xprt->addr by eliminating direct uses of the xprt->addr.sin_port field. This change is required to support alternate RPC host address formats (eg IPv6). Test-plan: Destructive testing (unplugging the network temporarily). Repeated runs of Connectathon locking suite with UDP and TCP. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h index 3a0cca2..a711067 100644 --- a/include/linux/sunrpc/xprt.h +++ b/include/linux/sunrpc/xprt.h @@ -269,6 +269,7 @@ int xs_setup_tcp(struct rpc_xprt *xprt, struct rpc_timeout *to); #define XPRT_CONNECTED (1) #define XPRT_CONNECTING (2) #define XPRT_CLOSE_WAIT (3) +#define XPRT_BOUND (4) static inline void xprt_set_connected(struct rpc_xprt *xprt) { @@ -312,6 +313,21 @@ static inline int xprt_test_and_set_connecting(struct rpc_xprt *xprt) return test_and_set_bit(XPRT_CONNECTING, &xprt->state); } +static inline void xprt_set_bound(struct rpc_xprt *xprt) +{ + test_and_set_bit(XPRT_BOUND, &xprt->state); +} + +static inline int xprt_bound(struct rpc_xprt *xprt) +{ + return test_bit(XPRT_BOUND, &xprt->state); +} + +static inline void xprt_clear_bound(struct rpc_xprt *xprt) +{ + clear_bit(XPRT_BOUND, &xprt->state); +} + #endif /* __KERNEL__*/ #endif /* _LINUX_SUNRPC_XPRT_H */ diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 3e19d32..0b8d03d 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -148,7 +148,6 @@ rpc_new_client(struct rpc_xprt *xprt, char *servname, clnt->cl_maxproc = version->nrprocs; clnt->cl_protname = program->name; clnt->cl_pmap = &clnt->cl_pmap_default; - clnt->cl_port = xprt->addr.sin_port; clnt->cl_prog = program->number; clnt->cl_vers = version->number; clnt->cl_prot = xprt->prot; @@ -156,7 +155,7 @@ rpc_new_client(struct rpc_xprt *xprt, char *servname, clnt->cl_metrics = rpc_alloc_iostats(clnt); rpc_init_wait_queue(&clnt->cl_pmap_default.pm_bindwait, "bindwait"); - if (!clnt->cl_port) + if (!xprt_bound(clnt->cl_xprt)) clnt->cl_autobind = 1; clnt->cl_rtt = &clnt->cl_rtt_default; @@ -570,7 +569,7 @@ EXPORT_SYMBOL(rpc_max_payload); void rpc_force_rebind(struct rpc_clnt *clnt) { if (clnt->cl_autobind) - clnt->cl_port = 0; + xprt_clear_bound(clnt->cl_xprt); } EXPORT_SYMBOL(rpc_force_rebind); @@ -782,14 +781,15 @@ static void call_bind(struct rpc_task *task) { struct rpc_clnt *clnt = task->tk_client; + struct rpc_xprt *xprt = task->tk_xprt; dprintk("RPC: %4d call_bind (status %d)\n", task->tk_pid, task->tk_status); task->tk_action = call_connect; - if (!clnt->cl_port) { + if (!xprt_bound(xprt)) { task->tk_action = call_bind_status; - task->tk_timeout = task->tk_xprt->bind_timeout; + task->tk_timeout = xprt->bind_timeout; rpc_getport(task, clnt); } } diff --git a/net/sunrpc/pmap_clnt.c b/net/sunrpc/pmap_clnt.c index 623180f..209ffdf 100644 --- a/net/sunrpc/pmap_clnt.c +++ b/net/sunrpc/pmap_clnt.c @@ -142,15 +142,17 @@ pmap_getport_done(struct rpc_task *task) dprintk("RPC: %4d pmap_getport_done(status %d, port %d)\n", task->tk_pid, task->tk_status, clnt->cl_port); - xprt->ops->set_port(xprt, 0); if (task->tk_status < 0) { /* Make the calling task exit with an error */ + xprt->ops->set_port(xprt, 0); task->tk_action = rpc_exit_task; } else if (clnt->cl_port == 0) { /* Program not registered */ + xprt->ops->set_port(xprt, 0); rpc_exit(task, -EACCES); } else { xprt->ops->set_port(xprt, clnt->cl_port); + xprt_set_bound(xprt); clnt->cl_port = htons(clnt->cl_port); } spin_lock(&pmap_lock); @@ -218,6 +220,7 @@ pmap_create(char *hostname, struct sockaddr_in *srvaddr, int proto, int privileg if (IS_ERR(xprt)) return (struct rpc_clnt *)xprt; xprt->ops->set_port(xprt, RPC_PMAP_PORT); + xprt_set_bound(xprt); if (!privileged) xprt->resvport = 0; diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index e8c2bc4..e239ef9 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -534,7 +534,7 @@ void xprt_connect(struct rpc_task *task) dprintk("RPC: %4d xprt_connect xprt %p %s connected\n", task->tk_pid, xprt, (xprt_connected(xprt) ? "is" : "is not")); - if (!xprt->addr.sin_port) { + if (!xprt_bound(xprt)) { task->tk_status = -EIO; return; } diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index 441bd53..123ac1e 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -1016,7 +1016,7 @@ static void xs_udp_connect_worker(void *args) struct socket *sock = xprt->sock; int err, status = -EIO; - if (xprt->shutdown || xprt->addr.sin_port == 0) + if (xprt->shutdown || !xprt_bound(xprt)) goto out; dprintk("RPC: xs_udp_connect_worker for xprt %p\n", xprt); @@ -1099,7 +1099,7 @@ static void xs_tcp_connect_worker(void *args) struct socket *sock = xprt->sock; int err, status = -EIO; - if (xprt->shutdown || xprt->addr.sin_port == 0) + if (xprt->shutdown || !xprt_bound(xprt)) goto out; dprintk("RPC: xs_tcp_connect_worker for xprt %p\n", xprt); @@ -1307,8 +1307,11 @@ int xs_setup_udp(struct rpc_xprt *xprt, struct rpc_timeout *to) if (xprt->slot == NULL) return -ENOMEM; - xprt->prot = IPPROTO_UDP; + if (ntohs(xprt->addr.sin_port) != 0) + xprt_set_bound(xprt); xprt->port = xs_get_random_port(); + + xprt->prot = IPPROTO_UDP; xprt->tsh_size = 0; xprt->resvport = capable(CAP_NET_BIND_SERVICE) ? 1 : 0; /* XXX: header size can vary due to auth type, IPv6, etc. */ @@ -1348,8 +1351,11 @@ int xs_setup_tcp(struct rpc_xprt *xprt, struct rpc_timeout *to) if (xprt->slot == NULL) return -ENOMEM; - xprt->prot = IPPROTO_TCP; + if (ntohs(xprt->addr.sin_port) != 0) + xprt_set_bound(xprt); xprt->port = xs_get_random_port(); + + xprt->prot = IPPROTO_TCP; xprt->tsh_size = sizeof(rpc_fraghdr) / sizeof(u32); xprt->resvport = capable(CAP_NET_BIND_SERVICE) ? 1 : 0; xprt->max_payload = RPC_MAX_FRAGMENT_SIZE; -- cgit v0.10.2 From 4a68179d38874c37be2802442a71b847f5d1a2a9 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:15 -0400 Subject: SUNRPC: Make RPC portmapper use per-transport storage Move connection and bind state that was maintained in the rpc_clnt structure to the rpc_xprt structure. This will allow the creation of a clean API for plugging in different types of bind mechanisms. This brings improvements such as the elimination of a single spin lock to control serialization for all in-kernel RPC binding. A set of per-xprt bitops is used to serialize tasks during RPC binding, just like it now works for making RPC transport connections. Test-plan: Destructive testing (unplugging the network temporarily). Connectathon with UDP and TCP. NFSv2/3 and NFSv4 mounting should be carefully checked. Probably need to rig a server where certain services aren't running, or that returns an error for some typical operation. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h index 8fe9f35..00e9dba 100644 --- a/include/linux/sunrpc/clnt.h +++ b/include/linux/sunrpc/clnt.h @@ -18,18 +18,6 @@ #include #include -/* - * This defines an RPC port mapping - */ -struct rpc_portmap { - __u32 pm_prog; - __u32 pm_vers; - __u32 pm_prot; - __u16 pm_port; - unsigned char pm_binding : 1; /* doing a getport() */ - struct rpc_wait_queue pm_bindwait; /* waiting on getport() */ -}; - struct rpc_inode; /* @@ -40,7 +28,9 @@ struct rpc_clnt { atomic_t cl_users; /* number of references */ struct rpc_xprt * cl_xprt; /* transport */ struct rpc_procinfo * cl_procinfo; /* procedure info */ - u32 cl_maxproc; /* max procedure number */ + u32 cl_prog, /* RPC program number */ + cl_vers, /* RPC version number */ + cl_maxproc; /* max procedure number */ char * cl_server; /* server machine name */ char * cl_protname; /* protocol name */ @@ -55,7 +45,6 @@ struct rpc_clnt { cl_dead : 1;/* abandoned */ struct rpc_rtt * cl_rtt; /* RTO estimator data */ - struct rpc_portmap * cl_pmap; /* port mapping */ int cl_nodelen; /* nodename length */ char cl_nodename[UNX_MAXNODENAME]; @@ -64,14 +53,8 @@ struct rpc_clnt { struct dentry * cl_dentry; /* inode */ struct rpc_clnt * cl_parent; /* Points to parent of clones */ struct rpc_rtt cl_rtt_default; - struct rpc_portmap cl_pmap_default; char cl_inline_name[32]; }; -#define cl_timeout cl_xprt->timeout -#define cl_prog cl_pmap->pm_prog -#define cl_vers cl_pmap->pm_vers -#define cl_port cl_pmap->pm_port -#define cl_prot cl_pmap->pm_prot /* * General RPC program info diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h index a711067..4ce8261 100644 --- a/include/linux/sunrpc/xprt.h +++ b/include/linux/sunrpc/xprt.h @@ -138,6 +138,7 @@ struct rpc_xprt { unsigned int tsh_size; /* size of transport specific header */ + struct rpc_wait_queue binding; /* requests waiting on rpcbind */ struct rpc_wait_queue sending; /* requests waiting to send */ struct rpc_wait_queue resend; /* requests waiting to resend */ struct rpc_wait_queue pending; /* requests in flight */ @@ -270,6 +271,7 @@ int xs_setup_tcp(struct rpc_xprt *xprt, struct rpc_timeout *to); #define XPRT_CONNECTING (2) #define XPRT_CLOSE_WAIT (3) #define XPRT_BOUND (4) +#define XPRT_BINDING (5) static inline void xprt_set_connected(struct rpc_xprt *xprt) { @@ -328,6 +330,18 @@ static inline void xprt_clear_bound(struct rpc_xprt *xprt) clear_bit(XPRT_BOUND, &xprt->state); } +static inline void xprt_clear_binding(struct rpc_xprt *xprt) +{ + smp_mb__before_clear_bit(); + clear_bit(XPRT_BINDING, &xprt->state); + smp_mb__after_clear_bit(); +} + +static inline int xprt_test_and_set_binding(struct rpc_xprt *xprt) +{ + return test_and_set_bit(XPRT_BINDING, &xprt->state); +} + #endif /* __KERNEL__*/ #endif /* _LINUX_SUNRPC_XPRT_H */ diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 0b8d03d..cee5041 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -147,13 +147,10 @@ rpc_new_client(struct rpc_xprt *xprt, char *servname, clnt->cl_procinfo = version->procs; clnt->cl_maxproc = version->nrprocs; clnt->cl_protname = program->name; - clnt->cl_pmap = &clnt->cl_pmap_default; clnt->cl_prog = program->number; clnt->cl_vers = version->number; - clnt->cl_prot = xprt->prot; clnt->cl_stats = program->stats; clnt->cl_metrics = rpc_alloc_iostats(clnt); - rpc_init_wait_queue(&clnt->cl_pmap_default.pm_bindwait, "bindwait"); if (!xprt_bound(clnt->cl_xprt)) clnt->cl_autobind = 1; @@ -243,8 +240,6 @@ rpc_clone_client(struct rpc_clnt *clnt) atomic_set(&new->cl_users, 0); new->cl_parent = clnt; atomic_inc(&clnt->cl_count); - /* Duplicate portmapper */ - rpc_init_wait_queue(&new->cl_pmap_default.pm_bindwait, "bindwait"); /* Turn off autobind on clones */ new->cl_autobind = 0; new->cl_oneshot = 0; @@ -254,8 +249,7 @@ rpc_clone_client(struct rpc_clnt *clnt) rpc_init_rtt(&new->cl_rtt_default, clnt->cl_xprt->timeout.to_initval); if (new->cl_auth) atomic_inc(&new->cl_auth->au_count); - new->cl_pmap = &new->cl_pmap_default; - new->cl_metrics = rpc_alloc_iostats(clnt); + new->cl_metrics = rpc_alloc_iostats(clnt); return new; out_no_clnt: printk(KERN_INFO "RPC: out of memory in %s\n", __FUNCTION__); diff --git a/net/sunrpc/pmap_clnt.c b/net/sunrpc/pmap_clnt.c index 209ffdf..59d5424 100644 --- a/net/sunrpc/pmap_clnt.c +++ b/net/sunrpc/pmap_clnt.c @@ -24,11 +24,57 @@ #define PMAP_UNSET 2 #define PMAP_GETPORT 3 +struct portmap_args { + u32 pm_prog; + u32 pm_vers; + u32 pm_prot; + unsigned short pm_port; + struct rpc_task * pm_task; +}; + static struct rpc_procinfo pmap_procedures[]; static struct rpc_clnt * pmap_create(char *, struct sockaddr_in *, int, int); -static void pmap_getport_done(struct rpc_task *); +static void pmap_getport_done(struct rpc_task *, void *); static struct rpc_program pmap_program; -static DEFINE_SPINLOCK(pmap_lock); + +static void pmap_getport_prepare(struct rpc_task *task, void *calldata) +{ + struct portmap_args *map = calldata; + struct rpc_message msg = { + .rpc_proc = &pmap_procedures[PMAP_GETPORT], + .rpc_argp = map, + .rpc_resp = &map->pm_port, + }; + + rpc_call_setup(task, &msg, 0); +} + +static inline struct portmap_args *pmap_map_alloc(void) +{ + return kmalloc(sizeof(struct portmap_args), GFP_NOFS); +} + +static inline void pmap_map_free(struct portmap_args *map) +{ + kfree(map); +} + +static void pmap_map_release(void *data) +{ + pmap_map_free(data); +} + +static const struct rpc_call_ops pmap_getport_ops = { + .rpc_call_prepare = pmap_getport_prepare, + .rpc_call_done = pmap_getport_done, + .rpc_release = pmap_map_release, +}; + +static inline void pmap_wake_portmap_waiters(struct rpc_xprt *xprt) +{ + xprt_clear_binding(xprt); + rpc_wake_up(&xprt->binding); +} /* * Obtain the port for a given RPC service on a given host. This one can @@ -37,67 +83,71 @@ static DEFINE_SPINLOCK(pmap_lock); void rpc_getport(struct rpc_task *task, struct rpc_clnt *clnt) { - struct rpc_portmap *map = clnt->cl_pmap; - struct sockaddr_in *sap = &clnt->cl_xprt->addr; - struct rpc_message msg = { - .rpc_proc = &pmap_procedures[PMAP_GETPORT], - .rpc_argp = map, - .rpc_resp = &clnt->cl_port, - .rpc_cred = NULL - }; + struct rpc_xprt *xprt = task->tk_xprt; + struct sockaddr_in *sap = &xprt->addr; + struct portmap_args *map; struct rpc_clnt *pmap_clnt; - struct rpc_task *child; + struct rpc_task *child; - dprintk("RPC: %4d rpc_getport(%s, %d, %d, %d)\n", + dprintk("RPC: %4d rpc_getport(%s, %u, %u, %d)\n", task->tk_pid, clnt->cl_server, - map->pm_prog, map->pm_vers, map->pm_prot); + clnt->cl_prog, clnt->cl_vers, xprt->prot); /* Autobind on cloned rpc clients is discouraged */ BUG_ON(clnt->cl_parent != clnt); - spin_lock(&pmap_lock); - if (map->pm_binding) { - rpc_sleep_on(&map->pm_bindwait, task, NULL, NULL); - spin_unlock(&pmap_lock); + if (xprt_test_and_set_binding(xprt)) { + task->tk_status = -EACCES; /* tell caller to check again */ + rpc_sleep_on(&xprt->binding, task, NULL, NULL); return; } - map->pm_binding = 1; - spin_unlock(&pmap_lock); + + /* Someone else may have bound if we slept */ + if (xprt_bound(xprt)) { + task->tk_status = 0; + goto bailout_nofree; + } + + map = pmap_map_alloc(); + if (!map) { + task->tk_status = -ENOMEM; + goto bailout_nofree; + } + map->pm_prog = clnt->cl_prog; + map->pm_vers = clnt->cl_vers; + map->pm_prot = xprt->prot; + map->pm_port = 0; + map->pm_task = task; pmap_clnt = pmap_create(clnt->cl_server, sap, map->pm_prot, 0); if (IS_ERR(pmap_clnt)) { task->tk_status = PTR_ERR(pmap_clnt); goto bailout; } - task->tk_status = 0; - /* - * Note: rpc_new_child will release client after a failure. - */ - if (!(child = rpc_new_child(pmap_clnt, task))) + child = rpc_run_task(pmap_clnt, RPC_TASK_ASYNC, &pmap_getport_ops, map); + if (IS_ERR(child)) { + task->tk_status = -EIO; goto bailout; + } + rpc_release_task(child); - /* Setup the call info struct */ - rpc_call_setup(child, &msg, 0); + rpc_sleep_on(&xprt->binding, task, NULL, NULL); - /* ... and run the child task */ task->tk_xprt->stat.bind_count++; - rpc_run_child(task, child, pmap_getport_done); return; bailout: - spin_lock(&pmap_lock); - map->pm_binding = 0; - rpc_wake_up(&map->pm_bindwait); - spin_unlock(&pmap_lock); - rpc_exit(task, -EIO); + pmap_map_free(map); +bailout_nofree: + pmap_wake_portmap_waiters(xprt); } #ifdef CONFIG_ROOT_NFS int rpc_getport_external(struct sockaddr_in *sin, __u32 prog, __u32 vers, int prot) { - struct rpc_portmap map = { + struct portmap_args map = { .pm_prog = prog, .pm_vers = vers, .pm_prot = prot, @@ -133,32 +183,32 @@ rpc_getport_external(struct sockaddr_in *sin, __u32 prog, __u32 vers, int prot) #endif static void -pmap_getport_done(struct rpc_task *task) +pmap_getport_done(struct rpc_task *child, void *data) { - struct rpc_clnt *clnt = task->tk_client; + struct portmap_args *map = data; + struct rpc_task *task = map->pm_task; struct rpc_xprt *xprt = task->tk_xprt; - struct rpc_portmap *map = clnt->cl_pmap; - - dprintk("RPC: %4d pmap_getport_done(status %d, port %d)\n", - task->tk_pid, task->tk_status, clnt->cl_port); + int status = child->tk_status; - if (task->tk_status < 0) { - /* Make the calling task exit with an error */ + if (status < 0) { + /* Portmapper not available */ xprt->ops->set_port(xprt, 0); - task->tk_action = rpc_exit_task; - } else if (clnt->cl_port == 0) { - /* Program not registered */ + task->tk_status = status; + } else if (map->pm_port == 0) { + /* Requested RPC service wasn't registered */ xprt->ops->set_port(xprt, 0); - rpc_exit(task, -EACCES); + task->tk_status = -EACCES; } else { - xprt->ops->set_port(xprt, clnt->cl_port); + /* Succeeded */ + xprt->ops->set_port(xprt, map->pm_port); xprt_set_bound(xprt); - clnt->cl_port = htons(clnt->cl_port); + task->tk_status = 0; } - spin_lock(&pmap_lock); - map->pm_binding = 0; - rpc_wake_up(&map->pm_bindwait); - spin_unlock(&pmap_lock); + + dprintk("RPC: %4d pmap_getport_done(status %d, port %u)\n", + child->tk_pid, child->tk_status, map->pm_port); + + pmap_wake_portmap_waiters(xprt); } /* @@ -172,7 +222,7 @@ rpc_register(u32 prog, u32 vers, int prot, unsigned short port, int *okay) .sin_family = AF_INET, .sin_addr.s_addr = htonl(INADDR_LOOPBACK), }; - struct rpc_portmap map = { + struct portmap_args map = { .pm_prog = prog, .pm_vers = vers, .pm_prot = prot, @@ -239,7 +289,7 @@ pmap_create(char *hostname, struct sockaddr_in *srvaddr, int proto, int privileg * XDR encode/decode functions for PMAP */ static int -xdr_encode_mapping(struct rpc_rqst *req, u32 *p, struct rpc_portmap *map) +xdr_encode_mapping(struct rpc_rqst *req, u32 *p, struct portmap_args *map) { dprintk("RPC: xdr_encode_mapping(%d, %d, %d, %d)\n", map->pm_prog, map->pm_vers, map->pm_prot, map->pm_port); diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index e239ef9..b45abd0 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -928,6 +928,7 @@ static struct rpc_xprt *xprt_setup(int proto, struct sockaddr_in *ap, struct rpc xprt->last_used = jiffies; xprt->cwnd = RPC_INITCWND; + rpc_init_wait_queue(&xprt->binding, "xprt_binding"); rpc_init_wait_queue(&xprt->pending, "xprt_pending"); rpc_init_wait_queue(&xprt->sending, "xprt_sending"); rpc_init_wait_queue(&xprt->resend, "xprt_resend"); -- cgit v0.10.2 From c4a5692fb83f23008c720fe84454d5603e80b103 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:16 -0400 Subject: SUNRPC: Clean-up after recent changes to sunrpc/pmap_clnt.c Add comments for external functions, use modern function definition style, and fix up dprintk formatting. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/net/sunrpc/pmap_clnt.c b/net/sunrpc/pmap_clnt.c index 59d5424..0efcbf1 100644 --- a/net/sunrpc/pmap_clnt.c +++ b/net/sunrpc/pmap_clnt.c @@ -1,7 +1,9 @@ /* - * linux/net/sunrpc/pmap.c + * linux/net/sunrpc/pmap_clnt.c * - * Portmapper client. + * In-kernel RPC portmapper client. + * + * Portmapper supports version 2 of the rpcbind protocol (RFC 1833). * * Copyright (C) 1996, Olaf Kirch */ @@ -76,12 +78,15 @@ static inline void pmap_wake_portmap_waiters(struct rpc_xprt *xprt) rpc_wake_up(&xprt->binding); } -/* - * Obtain the port for a given RPC service on a given host. This one can - * be called for an ongoing RPC request. +/** + * rpc_getport - obtain the port for a given RPC service on a given host + * @task: task that is waiting for portmapper request + * @clnt: controlling rpc_clnt + * + * This one can be called for an ongoing RPC request, and can be used in + * an async (rpciod) context. */ -void -rpc_getport(struct rpc_task *task, struct rpc_clnt *clnt) +void rpc_getport(struct rpc_task *task, struct rpc_clnt *clnt) { struct rpc_xprt *xprt = task->tk_xprt; struct sockaddr_in *sap = &xprt->addr; @@ -144,8 +149,16 @@ bailout_nofree: } #ifdef CONFIG_ROOT_NFS -int -rpc_getport_external(struct sockaddr_in *sin, __u32 prog, __u32 vers, int prot) +/** + * rpc_getport_external - obtain the port for a given RPC service on a given host + * @sin: address of remote peer + * @prog: RPC program number to bind + * @vers: RPC version number to bind + * @prot: transport protocol to use to make this request + * + * This one is called from outside the RPC client in a synchronous task context. + */ +int rpc_getport_external(struct sockaddr_in *sin, __u32 prog, __u32 vers, int prot) { struct portmap_args map = { .pm_prog = prog, @@ -162,7 +175,7 @@ rpc_getport_external(struct sockaddr_in *sin, __u32 prog, __u32 vers, int prot) char hostname[32]; int status; - dprintk("RPC: rpc_getport_external(%u.%u.%u.%u, %d, %d, %d)\n", + dprintk("RPC: rpc_getport_external(%u.%u.%u.%u, %u, %u, %d)\n", NIPQUAD(sin->sin_addr.s_addr), prog, vers, prot); sprintf(hostname, "%u.%u.%u.%u", NIPQUAD(sin->sin_addr.s_addr)); @@ -182,8 +195,10 @@ rpc_getport_external(struct sockaddr_in *sin, __u32 prog, __u32 vers, int prot) } #endif -static void -pmap_getport_done(struct rpc_task *child, void *data) +/* + * Portmapper child task invokes this callback via tk_exit. + */ +static void pmap_getport_done(struct rpc_task *child, void *data) { struct portmap_args *map = data; struct rpc_task *task = map->pm_task; @@ -211,12 +226,17 @@ pmap_getport_done(struct rpc_task *child, void *data) pmap_wake_portmap_waiters(xprt); } -/* - * Set or unset a port registration with the local portmapper. +/** + * rpc_register - set or unset a port registration with the local portmapper + * @prog: RPC program number to bind + * @vers: RPC version number to bind + * @prot: transport protocol to use to make this request + * @port: port value to register + * @okay: result code + * * port == 0 means unregister, port != 0 means register. */ -int -rpc_register(u32 prog, u32 vers, int prot, unsigned short port, int *okay) +int rpc_register(u32 prog, u32 vers, int prot, unsigned short port, int *okay) { struct sockaddr_in sin = { .sin_family = AF_INET, @@ -236,7 +256,7 @@ rpc_register(u32 prog, u32 vers, int prot, unsigned short port, int *okay) struct rpc_clnt *pmap_clnt; int error = 0; - dprintk("RPC: registering (%d, %d, %d, %d) with portmapper.\n", + dprintk("RPC: registering (%u, %u, %d, %u) with portmapper.\n", prog, vers, prot, port); pmap_clnt = pmap_create("localhost", &sin, IPPROTO_UDP, 1); @@ -259,13 +279,11 @@ rpc_register(u32 prog, u32 vers, int prot, unsigned short port, int *okay) return error; } -static struct rpc_clnt * -pmap_create(char *hostname, struct sockaddr_in *srvaddr, int proto, int privileged) +static struct rpc_clnt *pmap_create(char *hostname, struct sockaddr_in *srvaddr, int proto, int privileged) { struct rpc_xprt *xprt; struct rpc_clnt *clnt; - /* printk("pmap: create xprt\n"); */ xprt = xprt_create_proto(proto, srvaddr, NULL); if (IS_ERR(xprt)) return (struct rpc_clnt *)xprt; @@ -274,7 +292,6 @@ pmap_create(char *hostname, struct sockaddr_in *srvaddr, int proto, int privileg if (!privileged) xprt->resvport = 0; - /* printk("pmap: create clnt\n"); */ clnt = rpc_new_client(xprt, hostname, &pmap_program, RPC_PMAP_VERSION, RPC_AUTH_UNIX); @@ -288,10 +305,9 @@ pmap_create(char *hostname, struct sockaddr_in *srvaddr, int proto, int privileg /* * XDR encode/decode functions for PMAP */ -static int -xdr_encode_mapping(struct rpc_rqst *req, u32 *p, struct portmap_args *map) +static int xdr_encode_mapping(struct rpc_rqst *req, u32 *p, struct portmap_args *map) { - dprintk("RPC: xdr_encode_mapping(%d, %d, %d, %d)\n", + dprintk("RPC: xdr_encode_mapping(%u, %u, %u, %u)\n", map->pm_prog, map->pm_vers, map->pm_prot, map->pm_port); *p++ = htonl(map->pm_prog); *p++ = htonl(map->pm_vers); @@ -302,15 +318,13 @@ xdr_encode_mapping(struct rpc_rqst *req, u32 *p, struct portmap_args *map) return 0; } -static int -xdr_decode_port(struct rpc_rqst *req, u32 *p, unsigned short *portp) +static int xdr_decode_port(struct rpc_rqst *req, u32 *p, unsigned short *portp) { *portp = (unsigned short) ntohl(*p++); return 0; } -static int -xdr_decode_bool(struct rpc_rqst *req, u32 *p, unsigned int *boolp) +static int xdr_decode_bool(struct rpc_rqst *req, u32 *p, unsigned int *boolp) { *boolp = (unsigned int) ntohl(*p++); return 0; -- cgit v0.10.2 From 5b1eacbcd78930d976eb50a93f1779d311b553d1 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:16 -0400 Subject: SUNRPC: Support for RPC child tasks no longer needed The previous patches removed the last user of RPC child tasks, so we can remove support for child tasks from net/sunrpc/sched.c now. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/include/linux/sunrpc/sched.h b/include/linux/sunrpc/sched.h index 82a91bb..f399c13 100644 --- a/include/linux/sunrpc/sched.h +++ b/include/linux/sunrpc/sched.h @@ -127,7 +127,6 @@ struct rpc_call_ops { */ #define RPC_TASK_ASYNC 0x0001 /* is an async task */ #define RPC_TASK_SWAPPER 0x0002 /* is swapping in/out */ -#define RPC_TASK_CHILD 0x0008 /* is child of other task */ #define RPC_CALL_MAJORSEEN 0x0020 /* major timeout seen */ #define RPC_TASK_ROOTCREDS 0x0040 /* force root creds */ #define RPC_TASK_DYNAMIC 0x0080 /* task was kmalloc'ed */ @@ -136,7 +135,6 @@ struct rpc_call_ops { #define RPC_TASK_NOINTR 0x0400 /* uninterruptible task */ #define RPC_IS_ASYNC(t) ((t)->tk_flags & RPC_TASK_ASYNC) -#define RPC_IS_CHILD(t) ((t)->tk_flags & RPC_TASK_CHILD) #define RPC_IS_SWAPPER(t) ((t)->tk_flags & RPC_TASK_SWAPPER) #define RPC_DO_ROOTOVERRIDE(t) ((t)->tk_flags & RPC_TASK_ROOTCREDS) #define RPC_ASSASSINATED(t) ((t)->tk_flags & RPC_TASK_KILLED) @@ -253,7 +251,6 @@ struct rpc_task *rpc_new_task(struct rpc_clnt *, int flags, const struct rpc_call_ops *ops, void *data); struct rpc_task *rpc_run_task(struct rpc_clnt *clnt, int flags, const struct rpc_call_ops *ops, void *data); -struct rpc_task *rpc_new_child(struct rpc_clnt *, struct rpc_task *parent); void rpc_init_task(struct rpc_task *task, struct rpc_clnt *clnt, int flags, const struct rpc_call_ops *ops, void *data); @@ -261,8 +258,6 @@ void rpc_release_task(struct rpc_task *); void rpc_exit_task(struct rpc_task *); void rpc_killall_tasks(struct rpc_clnt *); int rpc_execute(struct rpc_task *); -void rpc_run_child(struct rpc_task *parent, struct rpc_task *child, - rpc_action action); void rpc_init_priority_wait_queue(struct rpc_wait_queue *, const char *); void rpc_init_wait_queue(struct rpc_wait_queue *, const char *); void rpc_sleep_on(struct rpc_wait_queue *, struct rpc_task *, diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c index 5c3eee76..015ffe4 100644 --- a/net/sunrpc/sched.c +++ b/net/sunrpc/sched.c @@ -45,12 +45,6 @@ static void rpciod_killall(void); static void rpc_async_schedule(void *); /* - * RPC tasks that create another task (e.g. for contacting the portmapper) - * will wait on this queue for their child's completion - */ -static RPC_WAITQ(childq, "childq"); - -/* * RPC tasks sit here while waiting for conditions to improve. */ static RPC_WAITQ(delay_queue, "delayq"); @@ -324,16 +318,6 @@ static void rpc_make_runnable(struct rpc_task *task) } /* - * Place a newly initialized task on the workqueue. - */ -static inline void -rpc_schedule_run(struct rpc_task *task) -{ - rpc_set_active(task); - rpc_make_runnable(task); -} - -/* * Prepare for sleeping on a wait queue. * By always appending tasks to the list we ensure FIFO behavior. * NB: An RPC task will only receive interrupt-driven events as long @@ -933,72 +917,6 @@ struct rpc_task *rpc_run_task(struct rpc_clnt *clnt, int flags, } EXPORT_SYMBOL(rpc_run_task); -/** - * rpc_find_parent - find the parent of a child task. - * @child: child task - * @parent: parent task - * - * Checks that the parent task is still sleeping on the - * queue 'childq'. If so returns a pointer to the parent. - * Upon failure returns NULL. - * - * Caller must hold childq.lock - */ -static inline struct rpc_task *rpc_find_parent(struct rpc_task *child, struct rpc_task *parent) -{ - struct rpc_task *task; - struct list_head *le; - - task_for_each(task, le, &childq.tasks[0]) - if (task == parent) - return parent; - - return NULL; -} - -static void rpc_child_exit(struct rpc_task *child, void *calldata) -{ - struct rpc_task *parent; - - spin_lock_bh(&childq.lock); - if ((parent = rpc_find_parent(child, calldata)) != NULL) { - parent->tk_status = child->tk_status; - __rpc_wake_up_task(parent); - } - spin_unlock_bh(&childq.lock); -} - -static const struct rpc_call_ops rpc_child_ops = { - .rpc_call_done = rpc_child_exit, -}; - -/* - * Note: rpc_new_task releases the client after a failure. - */ -struct rpc_task * -rpc_new_child(struct rpc_clnt *clnt, struct rpc_task *parent) -{ - struct rpc_task *task; - - task = rpc_new_task(clnt, RPC_TASK_ASYNC | RPC_TASK_CHILD, &rpc_child_ops, parent); - if (!task) - goto fail; - return task; - -fail: - parent->tk_status = -ENOMEM; - return NULL; -} - -void rpc_run_child(struct rpc_task *task, struct rpc_task *child, rpc_action func) -{ - spin_lock_bh(&childq.lock); - /* N.B. Is it possible for the child to have already finished? */ - __rpc_sleep_on(&childq, task, func, NULL); - rpc_schedule_run(child); - spin_unlock_bh(&childq.lock); -} - /* * Kill all tasks for the given client. * XXX: kill their descendants as well? -- cgit v0.10.2 From bbf7c1dd2ae2b4040b41b1065ee9b1b6905b1605 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:16 -0400 Subject: SUNRPC: Introduce transport switch callout for pluggable rpcbind Introduce a clean transport switch API for plugging in different types of rpcbind mechanisms. For instance, rpcbind can cleanly replace the existing portmapper client, or a transport can choose to implement RPC binding any way it likes. Test plan: Destructive testing (unplugging the network temporarily). Connectathon with UDP and TCP. NFSv2/3 and NFSv4 mounting should be carefully checked. Probably need to rig a server where certain services aren't running, or that returns an error for some typical operation. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h index 00e9dba..2e68ac0 100644 --- a/include/linux/sunrpc/clnt.h +++ b/include/linux/sunrpc/clnt.h @@ -106,7 +106,7 @@ struct rpc_clnt *rpc_clone_client(struct rpc_clnt *); int rpc_shutdown_client(struct rpc_clnt *); int rpc_destroy_client(struct rpc_clnt *); void rpc_release_client(struct rpc_clnt *); -void rpc_getport(struct rpc_task *, struct rpc_clnt *); +void rpc_getport(struct rpc_task *); int rpc_register(u32, u32, int, unsigned short, int *); void rpc_call_setup(struct rpc_task *, struct rpc_message *, int); diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h index 4ce8261..8412255 100644 --- a/include/linux/sunrpc/xprt.h +++ b/include/linux/sunrpc/xprt.h @@ -105,6 +105,7 @@ struct rpc_xprt_ops { void (*set_buffer_size)(struct rpc_xprt *xprt, size_t sndsize, size_t rcvsize); int (*reserve_xprt)(struct rpc_task *task); void (*release_xprt)(struct rpc_xprt *xprt, struct rpc_task *task); + void (*rpcbind)(struct rpc_task *task); void (*set_port)(struct rpc_xprt *xprt, unsigned short port); void (*connect)(struct rpc_task *task); void * (*buf_alloc)(struct rpc_task *task, size_t size); diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index cee5041..d003c2f 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -774,7 +774,6 @@ call_encode(struct rpc_task *task) static void call_bind(struct rpc_task *task) { - struct rpc_clnt *clnt = task->tk_client; struct rpc_xprt *xprt = task->tk_xprt; dprintk("RPC: %4d call_bind (status %d)\n", @@ -784,7 +783,7 @@ call_bind(struct rpc_task *task) if (!xprt_bound(xprt)) { task->tk_action = call_bind_status; task->tk_timeout = xprt->bind_timeout; - rpc_getport(task, clnt); + xprt->ops->rpcbind(task); } } diff --git a/net/sunrpc/pmap_clnt.c b/net/sunrpc/pmap_clnt.c index 0efcbf1..f7b279a 100644 --- a/net/sunrpc/pmap_clnt.c +++ b/net/sunrpc/pmap_clnt.c @@ -81,13 +81,13 @@ static inline void pmap_wake_portmap_waiters(struct rpc_xprt *xprt) /** * rpc_getport - obtain the port for a given RPC service on a given host * @task: task that is waiting for portmapper request - * @clnt: controlling rpc_clnt * * This one can be called for an ongoing RPC request, and can be used in * an async (rpciod) context. */ -void rpc_getport(struct rpc_task *task, struct rpc_clnt *clnt) +void rpc_getport(struct rpc_task *task) { + struct rpc_clnt *clnt = task->tk_client; struct rpc_xprt *xprt = task->tk_xprt; struct sockaddr_in *sap = &xprt->addr; struct portmap_args *map; diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index 123ac1e..4c98b89 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -1262,6 +1262,7 @@ static struct rpc_xprt_ops xs_udp_ops = { .set_buffer_size = xs_udp_set_buffer_size, .reserve_xprt = xprt_reserve_xprt_cong, .release_xprt = xprt_release_xprt_cong, + .rpcbind = rpc_getport, .set_port = xs_set_port, .connect = xs_connect, .buf_alloc = rpc_malloc, @@ -1278,6 +1279,7 @@ static struct rpc_xprt_ops xs_udp_ops = { static struct rpc_xprt_ops xs_tcp_ops = { .reserve_xprt = xprt_reserve_xprt, .release_xprt = xs_tcp_release_xprt, + .rpcbind = rpc_getport, .set_port = xs_set_port, .connect = xs_connect, .buf_alloc = rpc_malloc, -- cgit v0.10.2 From ed39440a2573abc926f230267000f21fa5a87822 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:17 -0400 Subject: SUNRPC: create API for getting remote peer address Provide an API for retrieving the remote peer address without allowing direct access to the rpc_xprt struct. Test-plan: Compile kernel with CONFIG_NFS enabled. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h index 2e68ac0..65196b0 100644 --- a/include/linux/sunrpc/clnt.h +++ b/include/linux/sunrpc/clnt.h @@ -123,6 +123,7 @@ void rpc_setbufsize(struct rpc_clnt *, unsigned int, unsigned int); size_t rpc_max_payload(struct rpc_clnt *); void rpc_force_rebind(struct rpc_clnt *); int rpc_ping(struct rpc_clnt *clnt, int flags); +size_t rpc_peeraddr(struct rpc_clnt *, struct sockaddr *, size_t); /* * Helper function for NFSroot support diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index d003c2f..94768cf 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -533,6 +533,27 @@ rpc_call_setup(struct rpc_task *task, struct rpc_message *msg, int flags) task->tk_action = rpc_exit_task; } +/** + * rpc_peeraddr - extract remote peer address from clnt's xprt + * @clnt: RPC client structure + * @buf: target buffer + * @size: length of target buffer + * + * Returns the number of bytes that are actually in the stored address. + */ +size_t rpc_peeraddr(struct rpc_clnt *clnt, struct sockaddr *buf, size_t bufsize) +{ + size_t bytes; + struct rpc_xprt *xprt = clnt->cl_xprt; + + bytes = sizeof(xprt->addr); + if (bytes > bufsize) + bytes = bufsize; + memcpy(buf, &clnt->cl_xprt->addr, bytes); + return sizeof(xprt->addr); +} +EXPORT_SYMBOL(rpc_peeraddr); + void rpc_setbufsize(struct rpc_clnt *clnt, unsigned int sndsize, unsigned int rcvsize) { -- cgit v0.10.2 From 44c31be261540acf66ddd730631ead8009cc361d Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:17 -0400 Subject: LOCKD: Teach lockd to use the new rpc_peeraddr() API Hide the details of how the RPC client stores remote peer addresses from the Network Lock Manager. Test plan: Destructive testing (unplugging the network temporarily). Connectathon with UDP and TCP. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/fs/lockd/clntproc.c b/fs/lockd/clntproc.c index 89ba0df..50dbb67 100644 --- a/fs/lockd/clntproc.c +++ b/fs/lockd/clntproc.c @@ -151,11 +151,13 @@ static void nlmclnt_release_lockargs(struct nlm_rqst *req) int nlmclnt_proc(struct inode *inode, int cmd, struct file_lock *fl) { + struct rpc_clnt *client = NFS_CLIENT(inode); + struct sockaddr_in addr; struct nlm_host *host; struct nlm_rqst *call; sigset_t oldset; unsigned long flags; - int status, proto, vers; + int status, vers; vers = (NFS_PROTO(inode)->version == 3) ? 4 : 1; if (NFS_PROTO(inode)->version > 3) { @@ -163,10 +165,8 @@ nlmclnt_proc(struct inode *inode, int cmd, struct file_lock *fl) return -ENOLCK; } - /* Retrieve transport protocol from NFS client */ - proto = NFS_CLIENT(inode)->cl_xprt->prot; - - host = nlmclnt_lookup_host(NFS_ADDR(inode), proto, vers); + rpc_peeraddr(client, (struct sockaddr *) &addr, sizeof(addr)); + host = nlmclnt_lookup_host(&addr, client->cl_xprt->prot, vers); if (host == NULL) return -ENOLCK; -- cgit v0.10.2 From 081f79a9b09b634f0dc08ed014e0195464d52535 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:17 -0400 Subject: SUNRPC: Teach the RPC portmapper to use the new rpc_peeraddr() API. Hide the details of how the RPC client stores remote peer addresses from the RPC portmapper. Test plan: Destructive testing (unplugging the network temporarily). Connectathon with UDP and TCP. NFSv2/3 and NFSv4 mounting should be carefully checked. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/net/sunrpc/pmap_clnt.c b/net/sunrpc/pmap_clnt.c index f7b279a..3eee8e9 100644 --- a/net/sunrpc/pmap_clnt.c +++ b/net/sunrpc/pmap_clnt.c @@ -89,7 +89,7 @@ void rpc_getport(struct rpc_task *task) { struct rpc_clnt *clnt = task->tk_client; struct rpc_xprt *xprt = task->tk_xprt; - struct sockaddr_in *sap = &xprt->addr; + struct sockaddr_in addr; struct portmap_args *map; struct rpc_clnt *pmap_clnt; struct rpc_task *child; @@ -124,7 +124,8 @@ void rpc_getport(struct rpc_task *task) map->pm_port = 0; map->pm_task = task; - pmap_clnt = pmap_create(clnt->cl_server, sap, map->pm_prot, 0); + rpc_peeraddr(clnt, (struct sockaddr *) &addr, sizeof(addr)); + pmap_clnt = pmap_create(clnt->cl_server, &addr, map->pm_prot, 0); if (IS_ERR(pmap_clnt)) { task->tk_status = PTR_ERR(pmap_clnt); goto bailout; -- cgit v0.10.2 From 39d7bbcb5ba5e9d8d658b70903dd7939400e57db Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:18 -0400 Subject: SUNRPC: remove extraneous header inclusions include/linux/sunrpc/clnt.h already includes include/linux/sunrpc/xprt.h. We can remove xprt.h from source files that already include clnt.h. Likewise include/linux/sunrpc/timer.h. Test plan: Compile kernel with CONFIG_NFS enabled. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/fs/nfs/mount_clnt.c b/fs/nfs/mount_clnt.c index 445abb4..41274874 100644 --- a/fs/nfs/mount_clnt.c +++ b/fs/nfs/mount_clnt.c @@ -14,7 +14,6 @@ #include #include #include -#include #include #include diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h index 2426b11..0f33e62 100644 --- a/include/linux/nfs_xdr.h +++ b/include/linux/nfs_xdr.h @@ -1,7 +1,6 @@ #ifndef _LINUX_NFS_XDR_H #define _LINUX_NFS_XDR_H -#include #include /* diff --git a/net/sunrpc/pmap_clnt.c b/net/sunrpc/pmap_clnt.c index 3eee8e9..523f0e8 100644 --- a/net/sunrpc/pmap_clnt.c +++ b/net/sunrpc/pmap_clnt.c @@ -15,7 +15,6 @@ #include #include #include -#include #include #ifdef RPC_DEBUG diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c index 015ffe4..ecf3663 100644 --- a/net/sunrpc/sched.c +++ b/net/sunrpc/sched.c @@ -21,7 +21,6 @@ #include #include -#include #ifdef RPC_DEBUG #define RPCDBG_FACILITY RPCDBG_SCHED diff --git a/net/sunrpc/timer.c b/net/sunrpc/timer.c index bcbdf64..8142fdb 100644 --- a/net/sunrpc/timer.c +++ b/net/sunrpc/timer.c @@ -19,8 +19,6 @@ #include #include -#include -#include #define RPC_RTO_MAX (60*HZ) #define RPC_RTO_INIT (HZ/5) -- cgit v0.10.2 From edb267a688fcee5335d596752f117a30c7152e44 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:18 -0400 Subject: SUNRPC: add xprt switch API for printing formatted remote peer addresses Add a new method to the transport switch API to provide a way to convert the opaque contents of xprt->addr to a human-readable string. Test plan: Compile kernel with CONFIG_NFS enabled. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h index 8412255..8372ab8 100644 --- a/include/linux/sunrpc/xprt.h +++ b/include/linux/sunrpc/xprt.h @@ -51,6 +51,14 @@ struct rpc_timeout { unsigned char to_exponential; }; +enum rpc_display_format_t { + RPC_DISPLAY_ADDR = 0, + RPC_DISPLAY_PORT, + RPC_DISPLAY_PROTO, + RPC_DISPLAY_ALL, + RPC_DISPLAY_MAX, +}; + struct rpc_task; struct rpc_xprt; struct seq_file; @@ -103,6 +111,7 @@ struct rpc_rqst { struct rpc_xprt_ops { void (*set_buffer_size)(struct rpc_xprt *xprt, size_t sndsize, size_t rcvsize); + char * (*print_addr)(struct rpc_xprt *xprt, enum rpc_display_format_t format); int (*reserve_xprt)(struct rpc_task *task); void (*release_xprt)(struct rpc_xprt *xprt, struct rpc_task *task); void (*rpcbind)(struct rpc_task *task); @@ -207,6 +216,8 @@ struct rpc_xprt { void (*old_data_ready)(struct sock *, int); void (*old_state_change)(struct sock *); void (*old_write_space)(struct sock *); + + char * address_strings[RPC_DISPLAY_MAX]; }; #define XPRT_LAST_FRAG (1 << 0) diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index 4c98b89..cb8e6c3 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -125,6 +125,47 @@ static inline void xs_pktdump(char *msg, u32 *packet, unsigned int count) } #endif +static void xs_format_peer_addresses(struct rpc_xprt *xprt) +{ + struct sockaddr_in *addr = (struct sockaddr_in *) &xprt->addr; + char *buf; + + buf = kzalloc(20, GFP_KERNEL); + if (buf) { + snprintf(buf, 20, "%u.%u.%u.%u", + NIPQUAD(addr->sin_addr.s_addr)); + } + xprt->address_strings[RPC_DISPLAY_ADDR] = buf; + + buf = kzalloc(8, GFP_KERNEL); + if (buf) { + snprintf(buf, 8, "%u", + ntohs(addr->sin_port)); + } + xprt->address_strings[RPC_DISPLAY_PORT] = buf; + + if (xprt->prot == IPPROTO_UDP) + xprt->address_strings[RPC_DISPLAY_PROTO] = "udp"; + else + xprt->address_strings[RPC_DISPLAY_PROTO] = "tcp"; + + buf = kzalloc(48, GFP_KERNEL); + if (buf) { + snprintf(buf, 48, "addr=%u.%u.%u.%u port=%u proto=%s", + NIPQUAD(addr->sin_addr.s_addr), + ntohs(addr->sin_port), + xprt->prot == IPPROTO_UDP ? "udp" : "tcp"); + } + xprt->address_strings[RPC_DISPLAY_ALL] = buf; +} + +static void xs_free_peer_addresses(struct rpc_xprt *xprt) +{ + kfree(xprt->address_strings[RPC_DISPLAY_ADDR]); + kfree(xprt->address_strings[RPC_DISPLAY_PORT]); + kfree(xprt->address_strings[RPC_DISPLAY_ALL]); +} + #define XS_SENDMSG_FLAGS (MSG_DONTWAIT | MSG_NOSIGNAL) static inline int xs_send_head(struct socket *sock, struct sockaddr *addr, int addrlen, struct xdr_buf *xdr, unsigned int base, unsigned int len) @@ -490,6 +531,7 @@ static void xs_destroy(struct rpc_xprt *xprt) xprt_disconnect(xprt); xs_close(xprt); + xs_free_peer_addresses(xprt); kfree(xprt->slot); } @@ -965,6 +1007,19 @@ static unsigned short xs_get_random_port(void) } /** + * xs_print_peer_address - format an IPv4 address for printing + * @xprt: generic transport + * @format: flags field indicating which parts of the address to render + */ +static char *xs_print_peer_address(struct rpc_xprt *xprt, enum rpc_display_format_t format) +{ + if (xprt->address_strings[format] != NULL) + return xprt->address_strings[format]; + else + return "unprintable"; +} + +/** * xs_set_port - reset the port number in the remote endpoint address * @xprt: generic transport * @port: new port number @@ -1019,8 +1074,6 @@ static void xs_udp_connect_worker(void *args) if (xprt->shutdown || !xprt_bound(xprt)) goto out; - dprintk("RPC: xs_udp_connect_worker for xprt %p\n", xprt); - /* Start by resetting any existing state */ xs_close(xprt); @@ -1034,6 +1087,9 @@ static void xs_udp_connect_worker(void *args) goto out; } + dprintk("RPC: worker connecting xprt %p to address: %s\n", + xprt, xs_print_peer_address(xprt, RPC_DISPLAY_ALL)); + if (!xprt->inet) { struct sock *sk = sock->sk; @@ -1102,8 +1158,6 @@ static void xs_tcp_connect_worker(void *args) if (xprt->shutdown || !xprt_bound(xprt)) goto out; - dprintk("RPC: xs_tcp_connect_worker for xprt %p\n", xprt); - if (!xprt->sock) { /* start from scratch */ if ((err = sock_create_kern(PF_INET, SOCK_STREAM, IPPROTO_TCP, &sock)) < 0) { @@ -1119,6 +1173,9 @@ static void xs_tcp_connect_worker(void *args) /* "close" the socket, preserving the local port */ xs_tcp_reuse_connection(xprt); + dprintk("RPC: worker connecting xprt %p to address: %s\n", + xprt, xs_print_peer_address(xprt, RPC_DISPLAY_ALL)); + if (!xprt->inet) { struct sock *sk = sock->sk; @@ -1260,6 +1317,7 @@ static void xs_tcp_print_stats(struct rpc_xprt *xprt, struct seq_file *seq) static struct rpc_xprt_ops xs_udp_ops = { .set_buffer_size = xs_udp_set_buffer_size, + .print_addr = xs_print_peer_address, .reserve_xprt = xprt_reserve_xprt_cong, .release_xprt = xprt_release_xprt_cong, .rpcbind = rpc_getport, @@ -1277,6 +1335,7 @@ static struct rpc_xprt_ops xs_udp_ops = { }; static struct rpc_xprt_ops xs_tcp_ops = { + .print_addr = xs_print_peer_address, .reserve_xprt = xprt_reserve_xprt, .release_xprt = xs_tcp_release_xprt, .rpcbind = rpc_getport, @@ -1301,8 +1360,6 @@ int xs_setup_udp(struct rpc_xprt *xprt, struct rpc_timeout *to) { size_t slot_table_size; - dprintk("RPC: setting up udp-ipv4 transport...\n"); - xprt->max_reqs = xprt_udp_slot_table_entries; slot_table_size = xprt->max_reqs * sizeof(xprt->slot[0]); xprt->slot = kzalloc(slot_table_size, GFP_KERNEL); @@ -1332,6 +1389,10 @@ int xs_setup_udp(struct rpc_xprt *xprt, struct rpc_timeout *to) else xprt_set_timeout(&xprt->timeout, 5, 5 * HZ); + xs_format_peer_addresses(xprt); + dprintk("RPC: set up transport to address %s\n", + xs_print_peer_address(xprt, RPC_DISPLAY_ALL)); + return 0; } @@ -1345,8 +1406,6 @@ int xs_setup_tcp(struct rpc_xprt *xprt, struct rpc_timeout *to) { size_t slot_table_size; - dprintk("RPC: setting up tcp-ipv4 transport...\n"); - xprt->max_reqs = xprt_tcp_slot_table_entries; slot_table_size = xprt->max_reqs * sizeof(xprt->slot[0]); xprt->slot = kzalloc(slot_table_size, GFP_KERNEL); @@ -1375,5 +1434,9 @@ int xs_setup_tcp(struct rpc_xprt *xprt, struct rpc_timeout *to) else xprt_set_timeout(&xprt->timeout, 2, 60 * HZ); + xs_format_peer_addresses(xprt); + dprintk("RPC: set up transport to address %s\n", + xs_print_peer_address(xprt, RPC_DISPLAY_ALL)); + return 0; } -- cgit v0.10.2 From f425eba437f0051bde979ea2eef8bc875a77cd00 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:18 -0400 Subject: SUNRPC: Create API for displaying remote peer address Provide an API for formatting the remote peer address for printing without exposing its internal structure. The address could be dynamic, so we support a function call to get the address rather than reading it straight out of a structure. Test-plan: Destructive testing (unplugging the network temporarily). Probably need to rig a server where certain services aren't running, or that returns an error for some typical operation. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h index 65196b0..b7d47f0 100644 --- a/include/linux/sunrpc/clnt.h +++ b/include/linux/sunrpc/clnt.h @@ -124,6 +124,7 @@ size_t rpc_max_payload(struct rpc_clnt *); void rpc_force_rebind(struct rpc_clnt *); int rpc_ping(struct rpc_clnt *clnt, int flags); size_t rpc_peeraddr(struct rpc_clnt *, struct sockaddr *, size_t); +char * rpc_peeraddr2str(struct rpc_clnt *, enum rpc_display_format_t); /* * Helper function for NFSroot support diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 94768cf..e5b19e3 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -554,6 +554,19 @@ size_t rpc_peeraddr(struct rpc_clnt *clnt, struct sockaddr *buf, size_t bufsize) } EXPORT_SYMBOL(rpc_peeraddr); +/** + * rpc_peeraddr2str - return remote peer address in printable format + * @clnt: RPC client structure + * @format: address format + * + */ +char *rpc_peeraddr2str(struct rpc_clnt *clnt, enum rpc_display_format_t format) +{ + struct rpc_xprt *xprt = clnt->cl_xprt; + return xprt->ops->print_addr(xprt, format); +} +EXPORT_SYMBOL(rpc_peeraddr2str); + void rpc_setbufsize(struct rpc_clnt *clnt, unsigned int sndsize, unsigned int rcvsize) { -- cgit v0.10.2 From e7f7865743fff3d3938ec7540e5a784d662426da Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:19 -0400 Subject: SUNRPC: Teach rpc_pipe.c to use new rpc_peeraddr() API Hide the details of how the RPC client stores remote peer addresses from the RPC pipefs implementation. Test plan: Connectathon with Kerberos 5 authentication. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c index 0b1a1ac..c21dc07 100644 --- a/net/sunrpc/rpc_pipe.c +++ b/net/sunrpc/rpc_pipe.c @@ -327,10 +327,8 @@ rpc_show_info(struct seq_file *m, void *v) seq_printf(m, "RPC server: %s\n", clnt->cl_server); seq_printf(m, "service: %s (%d) version %d\n", clnt->cl_protname, clnt->cl_prog, clnt->cl_vers); - seq_printf(m, "address: %u.%u.%u.%u\n", - NIPQUAD(clnt->cl_xprt->addr.sin_addr.s_addr)); - seq_printf(m, "protocol: %s\n", - clnt->cl_xprt->prot == IPPROTO_UDP ? "udp" : "tcp"); + seq_printf(m, "address: %s\n", rpc_peeraddr2str(clnt, RPC_DISPLAY_ADDR)); + seq_printf(m, "protocol: %s\n", rpc_peeraddr2str(clnt, RPC_DISPLAY_PROTO)); return 0; } -- cgit v0.10.2 From c4efcb1d3e0bc76aeb9ca6301d19a5079893c6c9 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:19 -0400 Subject: SUNRPC: Use "sockaddr_storage" for storing RPC client's remote peer address IPv6 addresses are big (128 bytes). Now that no RPC client consumers treat the addr field in rpc_xprt structs as an opaque, and access it only via the API calls, we can safely widen the field in the rpc_xprt struct to accomodate larger addresses. Test plan: Compile kernel with CONFIG_NFS enabled. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h index 8372ab8..fc05cfb 100644 --- a/include/linux/sunrpc/xprt.h +++ b/include/linux/sunrpc/xprt.h @@ -134,7 +134,8 @@ struct rpc_xprt { struct sock * inet; /* INET layer */ struct rpc_timeout timeout; /* timeout parms */ - struct sockaddr_in addr; /* server address */ + struct sockaddr_storage addr; /* server address */ + size_t addrlen; /* size of server address */ int prot; /* IP protocol */ unsigned long cong; /* current congestion */ diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index e5b19e3..ff1e90f 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -550,7 +550,7 @@ size_t rpc_peeraddr(struct rpc_clnt *clnt, struct sockaddr *buf, size_t bufsize) if (bytes > bufsize) bytes = bufsize; memcpy(buf, &clnt->cl_xprt->addr, bytes); - return sizeof(xprt->addr); + return xprt->addrlen; } EXPORT_SYMBOL(rpc_peeraddr); diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index b45abd0..4987517 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -896,7 +896,8 @@ static struct rpc_xprt *xprt_setup(int proto, struct sockaddr_in *ap, struct rpc if ((xprt = kzalloc(sizeof(struct rpc_xprt), GFP_KERNEL)) == NULL) return ERR_PTR(-ENOMEM); - xprt->addr = *ap; + memcpy(&xprt->addr, ap, sizeof(*ap)); + xprt->addrlen = sizeof(*ap); switch (proto) { case IPPROTO_UDP: diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index cb8e6c3..17179aa 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -341,7 +341,7 @@ static int xs_udp_send_request(struct rpc_task *task) req->rq_xtime = jiffies; status = xs_sendpages(xprt->sock, (struct sockaddr *) &xprt->addr, - sizeof(xprt->addr), xdr, req->rq_bytes_sent); + xprt->addrlen, xdr, req->rq_bytes_sent); dprintk("RPC: xs_udp_send_request(%u) = %d\n", xdr->len - req->rq_bytes_sent, status); @@ -1027,8 +1027,11 @@ static char *xs_print_peer_address(struct rpc_xprt *xprt, enum rpc_display_forma */ static void xs_set_port(struct rpc_xprt *xprt, unsigned short port) { + struct sockaddr_in *sap = (struct sockaddr_in *) &xprt->addr; + dprintk("RPC: setting port for xprt %p to %u\n", xprt, port); - xprt->addr.sin_port = htons(port); + + sap->sin_port = htons(port); } static int xs_bindresvport(struct rpc_xprt *xprt, struct socket *sock) @@ -1209,7 +1212,7 @@ static void xs_tcp_connect_worker(void *args) xprt->stat.connect_count++; xprt->stat.connect_start = jiffies; status = sock->ops->connect(sock, (struct sockaddr *) &xprt->addr, - sizeof(xprt->addr), O_NONBLOCK); + xprt->addrlen, O_NONBLOCK); dprintk("RPC: %p connect status %d connected %d sock state %d\n", xprt, -status, xprt_connected(xprt), sock->sk->sk_state); if (status < 0) { @@ -1359,6 +1362,7 @@ static struct rpc_xprt_ops xs_tcp_ops = { int xs_setup_udp(struct rpc_xprt *xprt, struct rpc_timeout *to) { size_t slot_table_size; + struct sockaddr_in *addr = (struct sockaddr_in *) &xprt->addr; xprt->max_reqs = xprt_udp_slot_table_entries; slot_table_size = xprt->max_reqs * sizeof(xprt->slot[0]); @@ -1366,7 +1370,7 @@ int xs_setup_udp(struct rpc_xprt *xprt, struct rpc_timeout *to) if (xprt->slot == NULL) return -ENOMEM; - if (ntohs(xprt->addr.sin_port) != 0) + if (ntohs(addr->sin_port != 0)) xprt_set_bound(xprt); xprt->port = xs_get_random_port(); @@ -1405,6 +1409,7 @@ int xs_setup_udp(struct rpc_xprt *xprt, struct rpc_timeout *to) int xs_setup_tcp(struct rpc_xprt *xprt, struct rpc_timeout *to) { size_t slot_table_size; + struct sockaddr_in *addr = (struct sockaddr_in *) &xprt->addr; xprt->max_reqs = xprt_tcp_slot_table_entries; slot_table_size = xprt->max_reqs * sizeof(xprt->slot[0]); @@ -1412,7 +1417,7 @@ int xs_setup_tcp(struct rpc_xprt *xprt, struct rpc_timeout *to) if (xprt->slot == NULL) return -ENOMEM; - if (ntohs(xprt->addr.sin_port) != 0) + if (ntohs(addr->sin_port) != 0) xprt_set_bound(xprt); xprt->port = xs_get_random_port(); -- cgit v0.10.2 From 6ca948238724c945bd353f51d54ae7d285f3889f Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:19 -0400 Subject: SUNRPC: Clean-up after previous patches. Remove some unused macros related to accessing an RPC peer address Test plan: Compile kernel with CONFIG_NFS option enabled. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/fs/lockd/host.c b/fs/lockd/host.c index 38b0e8a..a516a01 100644 --- a/fs/lockd/host.c +++ b/fs/lockd/host.c @@ -26,7 +26,6 @@ #define NLM_HOST_REBIND (60 * HZ) #define NLM_HOST_EXPIRE ((nrhosts > NLM_HOST_MAX)? 300 * HZ : 120 * HZ) #define NLM_HOST_COLLECT ((nrhosts > NLM_HOST_MAX)? 120 * HZ : 60 * HZ) -#define NLM_HOST_ADDR(sv) (&(sv)->s_nlmclnt->cl_xprt->addr) static struct nlm_host * nlm_hosts[NLM_HOST_NRHASH]; static unsigned long next_gc; diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h index 51e9bd9..3b5b041 100644 --- a/include/linux/nfs_fs.h +++ b/include/linux/nfs_fs.h @@ -216,7 +216,6 @@ static inline struct nfs_inode *NFS_I(struct inode *inode) #define NFS_SERVER(inode) (NFS_SB(inode->i_sb)) #define NFS_CLIENT(inode) (NFS_SERVER(inode)->client) #define NFS_PROTO(inode) (NFS_SERVER(inode)->nfs_client->rpc_ops) -#define NFS_ADDR(inode) (RPC_PEERADDR(NFS_CLIENT(inode))) #define NFS_COOKIEVERF(inode) (NFS_I(inode)->cookieverf) #define NFS_READTIME(inode) (NFS_I(inode)->read_cache_jiffies) #define NFS_CHANGE_ATTR(inode) (NFS_I(inode)->change_attr) diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h index b7d47f0..a26d695 100644 --- a/include/linux/sunrpc/clnt.h +++ b/include/linux/sunrpc/clnt.h @@ -89,9 +89,6 @@ struct rpc_procinfo { char * p_name; /* name of procedure */ }; -#define RPC_CONGESTED(clnt) (RPCXPRT_CONGESTED((clnt)->cl_xprt)) -#define RPC_PEERADDR(clnt) (&(clnt)->cl_xprt->addr) - #ifdef __KERNEL__ struct rpc_clnt *rpc_create_client(struct rpc_xprt *xprt, char *servname, -- cgit v0.10.2 From c2866763b4029411d166040306691773c12d4caf Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:20 -0400 Subject: SUNRPC: use sockaddr + size when creating remote transport endpoints Prepare for more generic transport endpoint handling needed by transports that might use different forms of addressing, such as IPv6. Introduce a single function call to replace the two-call xprt_create_proto/rpc_create_client API. Define a new rpc_create_args structure that allows callers to pass in remote endpoint addresses of varying length. Test-plan: Compile kernel with CONFIG_NFS enabled. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h index a26d695..7817ba8 100644 --- a/include/linux/sunrpc/clnt.h +++ b/include/linux/sunrpc/clnt.h @@ -97,6 +97,28 @@ struct rpc_clnt *rpc_create_client(struct rpc_xprt *xprt, char *servname, struct rpc_clnt *rpc_new_client(struct rpc_xprt *xprt, char *servname, struct rpc_program *info, u32 version, rpc_authflavor_t authflavor); + +struct rpc_create_args { + int protocol; + struct sockaddr *address; + size_t addrsize; + struct rpc_timeout *timeout; + char *servername; + struct rpc_program *program; + u32 version; + rpc_authflavor_t authflavor; + unsigned long flags; +}; + +/* Values for "flags" field */ +#define RPC_CLNT_CREATE_HARDRTRY (1UL << 0) +#define RPC_CLNT_CREATE_INTR (1UL << 1) +#define RPC_CLNT_CREATE_AUTOBIND (1UL << 2) +#define RPC_CLNT_CREATE_ONESHOT (1UL << 3) +#define RPC_CLNT_CREATE_NONPRIVPORT (1UL << 4) +#define RPC_CLNT_CREATE_NOPING (1UL << 5) + +struct rpc_clnt *rpc_create(struct rpc_create_args *args); struct rpc_clnt *rpc_bind_new_program(struct rpc_clnt *, struct rpc_program *, int); struct rpc_clnt *rpc_clone_client(struct rpc_clnt *); diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h index fc05cfb..bc80fcf 100644 --- a/include/linux/sunrpc/xprt.h +++ b/include/linux/sunrpc/xprt.h @@ -237,6 +237,7 @@ void xprt_set_timeout(struct rpc_timeout *to, unsigned int retr, unsigned long /* * Generic internal transport functions */ +struct rpc_xprt * xprt_create_transport(int proto, struct sockaddr *addr, size_t size, struct rpc_timeout *toparms); void xprt_connect(struct rpc_task *task); void xprt_reserve(struct rpc_task *task); int xprt_reserve_xprt(struct rpc_task *task); diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index ff1e90f..dbb93bd 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -192,6 +192,67 @@ out_no_xprt: return ERR_PTR(err); } +/* + * rpc_create - create an RPC client and transport with one call + * @args: rpc_clnt create argument structure + * + * Creates and initializes an RPC transport and an RPC client. + * + * It can ping the server in order to determine if it is up, and to see if + * it supports this program and version. RPC_CLNT_CREATE_NOPING disables + * this behavior so asynchronous tasks can also use rpc_create. + */ +struct rpc_clnt *rpc_create(struct rpc_create_args *args) +{ + struct rpc_xprt *xprt; + struct rpc_clnt *clnt; + + xprt = xprt_create_transport(args->protocol, args->address, + args->addrsize, args->timeout); + if (IS_ERR(xprt)) + return (struct rpc_clnt *)xprt; + + /* + * By default, kernel RPC client connects from a reserved port. + * CAP_NET_BIND_SERVICE will not be set for unprivileged requesters, + * but it is always enabled for rpciod, which handles the connect + * operation. + */ + xprt->resvport = 1; + if (args->flags & RPC_CLNT_CREATE_NONPRIVPORT) + xprt->resvport = 0; + + dprintk("RPC: creating %s client for %s (xprt %p)\n", + args->program->name, args->servername, xprt); + + clnt = rpc_new_client(xprt, args->servername, args->program, + args->version, args->authflavor); + if (IS_ERR(clnt)) + return clnt; + + if (!(args->flags & RPC_CLNT_CREATE_NOPING)) { + int err = rpc_ping(clnt, RPC_TASK_SOFT|RPC_TASK_NOINTR); + if (err != 0) { + rpc_shutdown_client(clnt); + return ERR_PTR(err); + } + } + + clnt->cl_softrtry = 1; + if (args->flags & RPC_CLNT_CREATE_HARDRTRY) + clnt->cl_softrtry = 0; + + if (args->flags & RPC_CLNT_CREATE_INTR) + clnt->cl_intr = 1; + if (args->flags & RPC_CLNT_CREATE_AUTOBIND) + clnt->cl_autobind = 1; + if (args->flags & RPC_CLNT_CREATE_ONESHOT) + clnt->cl_oneshot = 1; + + return clnt; +} +EXPORT_SYMBOL(rpc_create); + /** * Create an RPC client * @xprt - pointer to xprt struct diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index 4987517..17f56cf 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -887,6 +887,81 @@ void xprt_set_timeout(struct rpc_timeout *to, unsigned int retr, unsigned long i to->to_exponential = 0; } +/** + * xprt_create_transport - create an RPC transport + * @proto: requested transport protocol + * @ap: remote peer address + * @size: length of address + * @to: timeout parameters + * + */ +struct rpc_xprt *xprt_create_transport(int proto, struct sockaddr *ap, size_t size, struct rpc_timeout *to) +{ + int result; + struct rpc_xprt *xprt; + struct rpc_rqst *req; + + if ((xprt = kzalloc(sizeof(struct rpc_xprt), GFP_KERNEL)) == NULL) { + dprintk("RPC: xprt_create_transport: no memory\n"); + return ERR_PTR(-ENOMEM); + } + if (size <= sizeof(xprt->addr)) { + memcpy(&xprt->addr, ap, size); + xprt->addrlen = size; + } else { + kfree(xprt); + dprintk("RPC: xprt_create_transport: address too large\n"); + return ERR_PTR(-EBADF); + } + + switch (proto) { + case IPPROTO_UDP: + result = xs_setup_udp(xprt, to); + break; + case IPPROTO_TCP: + result = xs_setup_tcp(xprt, to); + break; + default: + printk(KERN_ERR "RPC: unrecognized transport protocol: %d\n", + proto); + return ERR_PTR(-EIO); + } + if (result) { + kfree(xprt); + dprintk("RPC: xprt_create_transport: failed, %d\n", result); + return ERR_PTR(result); + } + + spin_lock_init(&xprt->transport_lock); + spin_lock_init(&xprt->reserve_lock); + + INIT_LIST_HEAD(&xprt->free); + INIT_LIST_HEAD(&xprt->recv); + INIT_WORK(&xprt->task_cleanup, xprt_autoclose, xprt); + init_timer(&xprt->timer); + xprt->timer.function = xprt_init_autodisconnect; + xprt->timer.data = (unsigned long) xprt; + xprt->last_used = jiffies; + xprt->cwnd = RPC_INITCWND; + + rpc_init_wait_queue(&xprt->binding, "xprt_binding"); + rpc_init_wait_queue(&xprt->pending, "xprt_pending"); + rpc_init_wait_queue(&xprt->sending, "xprt_sending"); + rpc_init_wait_queue(&xprt->resend, "xprt_resend"); + rpc_init_priority_wait_queue(&xprt->backlog, "xprt_backlog"); + + /* initialize free list */ + for (req = &xprt->slot[xprt->max_reqs-1]; req >= &xprt->slot[0]; req--) + list_add(&req->rq_list, &xprt->free); + + xprt_init_xid(xprt); + + dprintk("RPC: created transport %p with %u slots\n", xprt, + xprt->max_reqs); + + return xprt; +} + static struct rpc_xprt *xprt_setup(int proto, struct sockaddr_in *ap, struct rpc_timeout *to) { int result; -- cgit v0.10.2 From e1ec78928b4d5a31b7a847e65c6009f4229f7c0f Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:20 -0400 Subject: LOCKD: Convert to use new rpc_create() API Replace xprt_create_proto/rpc_create_client with new rpc_create() interface in the Network Lock Manager. Note that the semantics of NLM transports is now "hard" instead of "soft" to provide a better guarantee that lock requests will get to the server. Test plan: Repeated runs of Connectathon locking suite. Check network trace to ensure NLM requests are working correctly. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/fs/lockd/host.c b/fs/lockd/host.c index a516a01..703fb03 100644 --- a/fs/lockd/host.c +++ b/fs/lockd/host.c @@ -166,7 +166,6 @@ struct rpc_clnt * nlm_bind_host(struct nlm_host *host) { struct rpc_clnt *clnt; - struct rpc_xprt *xprt; dprintk("lockd: nlm_bind_host(%08x)\n", (unsigned)ntohl(host->h_addr.sin_addr.s_addr)); @@ -178,7 +177,6 @@ nlm_bind_host(struct nlm_host *host) * RPC rebind is required */ if ((clnt = host->h_rpcclnt) != NULL) { - xprt = clnt->cl_xprt; if (time_after_eq(jiffies, host->h_nextrebind)) { rpc_force_rebind(clnt); host->h_nextrebind = jiffies + NLM_HOST_REBIND; @@ -186,31 +184,37 @@ nlm_bind_host(struct nlm_host *host) host->h_nextrebind - jiffies); } } else { - xprt = xprt_create_proto(host->h_proto, &host->h_addr, NULL); - if (IS_ERR(xprt)) - goto forgetit; - - xprt_set_timeout(&xprt->timeout, 5, nlmsvc_timeout); - xprt->resvport = 1; /* NLM requires a reserved port */ - - /* Existing NLM servers accept AUTH_UNIX only */ - clnt = rpc_new_client(xprt, host->h_name, &nlm_program, - host->h_version, RPC_AUTH_UNIX); - if (IS_ERR(clnt)) - goto forgetit; - clnt->cl_autobind = 1; /* turn on pmap queries */ - clnt->cl_softrtry = 1; /* All queries are soft */ - - host->h_rpcclnt = clnt; + unsigned long increment = nlmsvc_timeout * HZ; + struct rpc_timeout timeparms = { + .to_initval = increment, + .to_increment = increment, + .to_maxval = increment * 6UL, + .to_retries = 5U, + }; + struct rpc_create_args args = { + .protocol = host->h_proto, + .address = (struct sockaddr *)&host->h_addr, + .addrsize = sizeof(host->h_addr), + .timeout = &timeparms, + .servername = host->h_name, + .program = &nlm_program, + .version = host->h_version, + .authflavor = RPC_AUTH_UNIX, + .flags = (RPC_CLNT_CREATE_HARDRTRY | + RPC_CLNT_CREATE_AUTOBIND), + }; + + clnt = rpc_create(&args); + if (!IS_ERR(clnt)) + host->h_rpcclnt = clnt; + else { + printk("lockd: couldn't create RPC handle for %s\n", host->h_name); + clnt = NULL; + } } mutex_unlock(&host->h_mutex); return clnt; - -forgetit: - printk("lockd: couldn't create RPC handle for %s\n", host->h_name); - mutex_unlock(&host->h_mutex); - return NULL; } /* diff --git a/fs/lockd/mon.c b/fs/lockd/mon.c index 3fc683f..5954dcb 100644 --- a/fs/lockd/mon.c +++ b/fs/lockd/mon.c @@ -109,30 +109,23 @@ nsm_unmonitor(struct nlm_host *host) static struct rpc_clnt * nsm_create(void) { - struct rpc_xprt *xprt; - struct rpc_clnt *clnt; - struct sockaddr_in sin; - - sin.sin_family = AF_INET; - sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK); - sin.sin_port = 0; - - xprt = xprt_create_proto(IPPROTO_UDP, &sin, NULL); - if (IS_ERR(xprt)) - return (struct rpc_clnt *)xprt; - xprt->resvport = 1; /* NSM requires a reserved port */ - - clnt = rpc_create_client(xprt, "localhost", - &nsm_program, SM_VERSION, - RPC_AUTH_NULL); - if (IS_ERR(clnt)) - goto out_err; - clnt->cl_softrtry = 1; - clnt->cl_oneshot = 1; - return clnt; - -out_err: - return clnt; + struct sockaddr_in sin = { + .sin_family = AF_INET, + .sin_addr.s_addr = htonl(INADDR_LOOPBACK), + .sin_port = 0, + }; + struct rpc_create_args args = { + .protocol = IPPROTO_UDP, + .address = (struct sockaddr *)&sin, + .addrsize = sizeof(sin), + .servername = "localhost", + .program = &nsm_program, + .version = SM_VERSION, + .authflavor = RPC_AUTH_NULL, + .flags = (RPC_CLNT_CREATE_ONESHOT), + }; + + return rpc_create(&args); } /* -- cgit v0.10.2 From 41877d207c46f050b709f452703ade20c3b4a096 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:20 -0400 Subject: NFS: Convert NFS client to use new rpc_create() API Convert NFS client mount logic to use rpc_create() instead of the old xprt_create_proto/rpc_create_client API. Test plan: Mount stress tests. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 471d975..12941a8 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -401,8 +401,17 @@ static int nfs_create_rpc_client(struct nfs_client *clp, int proto, rpc_authflavor_t flavor) { struct rpc_timeout timeparms; - struct rpc_xprt *xprt = NULL; struct rpc_clnt *clnt = NULL; + struct rpc_create_args args = { + .protocol = proto, + .address = (struct sockaddr *)&clp->cl_addr, + .addrsize = sizeof(clp->cl_addr), + .timeout = &timeparms, + .servername = clp->cl_hostname, + .program = &nfs_program, + .version = clp->rpc_ops->version, + .authflavor = flavor, + }; if (!IS_ERR(clp->cl_rpcclient)) return 0; @@ -411,27 +420,13 @@ static int nfs_create_rpc_client(struct nfs_client *clp, int proto, clp->retrans_timeo = timeparms.to_initval; clp->retrans_count = timeparms.to_retries; - /* create transport and client */ - xprt = xprt_create_proto(proto, &clp->cl_addr, &timeparms); - if (IS_ERR(xprt)) { - dprintk("%s: cannot create RPC transport. Error = %ld\n", - __FUNCTION__, PTR_ERR(xprt)); - return PTR_ERR(xprt); - } - - /* Bind to a reserved port! */ - xprt->resvport = 1; - /* Create the client RPC handle */ - clnt = rpc_create_client(xprt, clp->cl_hostname, &nfs_program, - clp->rpc_ops->version, RPC_AUTH_UNIX); + clnt = rpc_create(&args); if (IS_ERR(clnt)) { dprintk("%s: cannot create RPC client. Error = %ld\n", __FUNCTION__, PTR_ERR(clnt)); return PTR_ERR(clnt); } - clnt->cl_intr = 1; - clnt->cl_softrtry = 1; clp->cl_rpcclient = clnt; return 0; } -- cgit v0.10.2 From ae5c79476f36512d1100e162606bb5691f2cce5a Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:21 -0400 Subject: NFSD: Convert NFS server callback logic to use new rpc_create API Replace xprt_create_proto/rpc_create_client call in NFS server callback functions to use new rpc_create() API. Test plan: NFSv4 delegation functionality tests. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/fs/nfsd/nfs4callback.c b/fs/nfsd/nfs4callback.c index 54b37b1..8583d99 100644 --- a/fs/nfsd/nfs4callback.c +++ b/fs/nfsd/nfs4callback.c @@ -375,16 +375,28 @@ nfsd4_probe_callback(struct nfs4_client *clp) { struct sockaddr_in addr; struct nfs4_callback *cb = &clp->cl_callback; - struct rpc_timeout timeparms; - struct rpc_xprt * xprt; + struct rpc_timeout timeparms = { + .to_initval = (NFSD_LEASE_TIME/4) * HZ, + .to_retries = 5, + .to_maxval = (NFSD_LEASE_TIME/2) * HZ, + .to_exponential = 1, + }; struct rpc_program * program = &cb->cb_program; - struct rpc_stat * stat = &cb->cb_stat; - struct rpc_clnt * clnt; + struct rpc_create_args args = { + .protocol = IPPROTO_TCP, + .address = (struct sockaddr *)&addr, + .addrsize = sizeof(addr), + .timeout = &timeparms, + .servername = clp->cl_name.data, + .program = program, + .version = nfs_cb_version[1]->number, + .authflavor = RPC_AUTH_UNIX, /* XXX: need AUTH_GSS... */ + .flags = (RPC_CLNT_CREATE_NOPING), + }; struct rpc_message msg = { .rpc_proc = &nfs4_cb_procedures[NFSPROC4_CLNT_CB_NULL], .rpc_argp = clp, }; - char hostname[32]; int status; if (atomic_read(&cb->cb_set)) @@ -396,51 +408,27 @@ nfsd4_probe_callback(struct nfs4_client *clp) addr.sin_port = htons(cb->cb_port); addr.sin_addr.s_addr = htonl(cb->cb_addr); - /* Initialize timeout */ - timeparms.to_initval = (NFSD_LEASE_TIME/4) * HZ; - timeparms.to_retries = 0; - timeparms.to_maxval = (NFSD_LEASE_TIME/2) * HZ; - timeparms.to_exponential = 1; - - /* Create RPC transport */ - xprt = xprt_create_proto(IPPROTO_TCP, &addr, &timeparms); - if (IS_ERR(xprt)) { - dprintk("NFSD: couldn't create callback transport!\n"); - goto out_err; - } - /* Initialize rpc_program */ program->name = "nfs4_cb"; program->number = cb->cb_prog; program->nrvers = ARRAY_SIZE(nfs_cb_version); program->version = nfs_cb_version; - program->stats = stat; + program->stats = &cb->cb_stat; /* Initialize rpc_stat */ - memset(stat, 0, sizeof(struct rpc_stat)); - stat->program = program; - - /* Create RPC client - * - * XXX AUTH_UNIX only - need AUTH_GSS.... - */ - sprintf(hostname, "%u.%u.%u.%u", NIPQUAD(addr.sin_addr.s_addr)); - clnt = rpc_new_client(xprt, hostname, program, 1, RPC_AUTH_UNIX); - if (IS_ERR(clnt)) { + memset(program->stats, 0, sizeof(cb->cb_stat)); + program->stats->program = program; + + /* Create RPC client */ + cb->cb_client = rpc_create(&args); + if (!cb->cb_client) { dprintk("NFSD: couldn't create callback client\n"); goto out_err; } - clnt->cl_intr = 0; - clnt->cl_softrtry = 1; /* Kick rpciod, put the call on the wire. */ - - if (rpciod_up() != 0) { - dprintk("nfsd: couldn't start rpciod for callbacks!\n"); + if (rpciod_up() != 0) goto out_clnt; - } - - cb->cb_client = clnt; /* the task holds a reference to the nfs4_client struct */ atomic_inc(&clp->cl_count); @@ -448,7 +436,7 @@ nfsd4_probe_callback(struct nfs4_client *clp) msg.rpc_cred = nfsd4_lookupcred(clp,0); if (IS_ERR(msg.rpc_cred)) goto out_rpciod; - status = rpc_call_async(clnt, &msg, RPC_TASK_ASYNC, &nfs4_cb_null_ops, NULL); + status = rpc_call_async(cb->cb_client, &msg, RPC_TASK_ASYNC, &nfs4_cb_null_ops, NULL); put_rpccred(msg.rpc_cred); if (status != 0) { @@ -462,7 +450,7 @@ out_rpciod: rpciod_down(); cb->cb_client = NULL; out_clnt: - rpc_shutdown_client(clnt); + rpc_shutdown_client(cb->cb_client); out_err: dprintk("NFSD: warning: no callback path to client %.*s\n", (int)clp->cl_name.len, clp->cl_name.data); -- cgit v0.10.2 From 9e1968c58d72c4b85d8a69bda1e194f9701fb224 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:21 -0400 Subject: SUNRPC: Convert RPC portmapper to use new rpc_create() API Replace xprt_create_proto/rpc_create_client calls in pmap_clnt.c with new rpc_create() API. Test plan: Repeated runs of Connectathon locking suite. Check network trace for proper PMAP calls and replies. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/net/sunrpc/pmap_clnt.c b/net/sunrpc/pmap_clnt.c index 523f0e8..f476f4d 100644 --- a/net/sunrpc/pmap_clnt.c +++ b/net/sunrpc/pmap_clnt.c @@ -281,25 +281,22 @@ int rpc_register(u32 prog, u32 vers, int prot, unsigned short port, int *okay) static struct rpc_clnt *pmap_create(char *hostname, struct sockaddr_in *srvaddr, int proto, int privileged) { - struct rpc_xprt *xprt; - struct rpc_clnt *clnt; - - xprt = xprt_create_proto(proto, srvaddr, NULL); - if (IS_ERR(xprt)) - return (struct rpc_clnt *)xprt; - xprt->ops->set_port(xprt, RPC_PMAP_PORT); - xprt_set_bound(xprt); + struct rpc_create_args args = { + .protocol = proto, + .address = (struct sockaddr *)srvaddr, + .addrsize = sizeof(*srvaddr), + .servername = hostname, + .program = &pmap_program, + .version = RPC_PMAP_VERSION, + .authflavor = RPC_AUTH_UNIX, + .flags = (RPC_CLNT_CREATE_ONESHOT | + RPC_CLNT_CREATE_NOPING), + }; + + srvaddr->sin_port = htons(RPC_PMAP_PORT); if (!privileged) - xprt->resvport = 0; - - clnt = rpc_new_client(xprt, hostname, - &pmap_program, RPC_PMAP_VERSION, - RPC_AUTH_UNIX); - if (!IS_ERR(clnt)) { - clnt->cl_softrtry = 1; - clnt->cl_oneshot = 1; - } - return clnt; + args.flags |= RPC_CLNT_CREATE_NONPRIVPORT; + return rpc_create(&args); } /* -- cgit v0.10.2 From ff9aa5e56df60cc8565a93cc868fe25ae3f20e49 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:21 -0400 Subject: SUNRPC: Eliminate xprt_create_proto and rpc_create_client The two function call API for creating a new RPC client is now obsolete. Remove it. Also, remove an unnecessary check to see whether the caller is capable of using privileged network services. The kernel RPC client always uses a privileged ephemeral port by default; callers are responsible for checking the authority of users to make use of any RPC service, or for specifying that a nonprivileged port is acceptable. Test plan: Repeated runs of Connectathon locking suite. Check network trace to ensure correctness of NLM requests and replies. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h index 7817ba8..f6d1d64 100644 --- a/include/linux/sunrpc/clnt.h +++ b/include/linux/sunrpc/clnt.h @@ -91,13 +91,6 @@ struct rpc_procinfo { #ifdef __KERNEL__ -struct rpc_clnt *rpc_create_client(struct rpc_xprt *xprt, char *servname, - struct rpc_program *info, - u32 version, rpc_authflavor_t authflavor); -struct rpc_clnt *rpc_new_client(struct rpc_xprt *xprt, char *servname, - struct rpc_program *info, - u32 version, rpc_authflavor_t authflavor); - struct rpc_create_args { int protocol; struct sockaddr *address; diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h index bc80fcf..de4efea 100644 --- a/include/linux/sunrpc/xprt.h +++ b/include/linux/sunrpc/xprt.h @@ -231,7 +231,6 @@ struct rpc_xprt { /* * Transport operations used by ULPs */ -struct rpc_xprt * xprt_create_proto(int proto, struct sockaddr_in *addr, struct rpc_timeout *to); void xprt_set_timeout(struct rpc_timeout *to, unsigned int retr, unsigned long incr); /* diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index dbb93bd..428704d 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -97,17 +97,7 @@ rpc_setup_pipedir(struct rpc_clnt *clnt, char *dir_name) } } -/* - * Create an RPC client - * FIXME: This should also take a flags argument (as in task->tk_flags). - * It's called (among others) from pmap_create_client, which may in - * turn be called by an async task. In this case, rpciod should not be - * made to sleep too long. - */ -struct rpc_clnt * -rpc_new_client(struct rpc_xprt *xprt, char *servname, - struct rpc_program *program, u32 vers, - rpc_authflavor_t flavor) +static struct rpc_clnt * rpc_new_client(struct rpc_xprt *xprt, char *servname, struct rpc_program *program, u32 vers, rpc_authflavor_t flavor) { struct rpc_version *version; struct rpc_clnt *clnt = NULL; @@ -253,36 +243,6 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args) } EXPORT_SYMBOL(rpc_create); -/** - * Create an RPC client - * @xprt - pointer to xprt struct - * @servname - name of server - * @info - rpc_program - * @version - rpc_program version - * @authflavor - rpc_auth flavour to use - * - * Creates an RPC client structure, then pings the server in order to - * determine if it is up, and if it supports this program and version. - * - * This function should never be called by asynchronous tasks such as - * the portmapper. - */ -struct rpc_clnt *rpc_create_client(struct rpc_xprt *xprt, char *servname, - struct rpc_program *info, u32 version, rpc_authflavor_t authflavor) -{ - struct rpc_clnt *clnt; - int err; - - clnt = rpc_new_client(xprt, servname, info, version, authflavor); - if (IS_ERR(clnt)) - return clnt; - err = rpc_ping(clnt, RPC_TASK_SOFT|RPC_TASK_NOINTR); - if (err == 0) - return clnt; - rpc_shutdown_client(clnt); - return ERR_PTR(err); -} - /* * This function clones the RPC client structure. It allows us to share the * same transport while varying parameters such as the authentication diff --git a/net/sunrpc/sunrpc_syms.c b/net/sunrpc/sunrpc_syms.c index f38f939c..26c0531 100644 --- a/net/sunrpc/sunrpc_syms.c +++ b/net/sunrpc/sunrpc_syms.c @@ -36,8 +36,6 @@ EXPORT_SYMBOL(rpc_wake_up_status); EXPORT_SYMBOL(rpc_release_task); /* RPC client functions */ -EXPORT_SYMBOL(rpc_create_client); -EXPORT_SYMBOL(rpc_new_client); EXPORT_SYMBOL(rpc_clone_client); EXPORT_SYMBOL(rpc_bind_new_program); EXPORT_SYMBOL(rpc_destroy_client); @@ -57,7 +55,6 @@ EXPORT_SYMBOL(rpc_queue_upcall); EXPORT_SYMBOL(rpc_mkpipe); /* Client transport */ -EXPORT_SYMBOL(xprt_create_proto); EXPORT_SYMBOL(xprt_set_timeout); /* Client credential cache */ diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index 17f56cf..e4f64fb 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -962,85 +962,6 @@ struct rpc_xprt *xprt_create_transport(int proto, struct sockaddr *ap, size_t si return xprt; } -static struct rpc_xprt *xprt_setup(int proto, struct sockaddr_in *ap, struct rpc_timeout *to) -{ - int result; - struct rpc_xprt *xprt; - struct rpc_rqst *req; - - if ((xprt = kzalloc(sizeof(struct rpc_xprt), GFP_KERNEL)) == NULL) - return ERR_PTR(-ENOMEM); - - memcpy(&xprt->addr, ap, sizeof(*ap)); - xprt->addrlen = sizeof(*ap); - - switch (proto) { - case IPPROTO_UDP: - result = xs_setup_udp(xprt, to); - break; - case IPPROTO_TCP: - result = xs_setup_tcp(xprt, to); - break; - default: - printk(KERN_ERR "RPC: unrecognized transport protocol: %d\n", - proto); - result = -EIO; - break; - } - if (result) { - kfree(xprt); - return ERR_PTR(result); - } - - spin_lock_init(&xprt->transport_lock); - spin_lock_init(&xprt->reserve_lock); - - INIT_LIST_HEAD(&xprt->free); - INIT_LIST_HEAD(&xprt->recv); - INIT_WORK(&xprt->task_cleanup, xprt_autoclose, xprt); - init_timer(&xprt->timer); - xprt->timer.function = xprt_init_autodisconnect; - xprt->timer.data = (unsigned long) xprt; - xprt->last_used = jiffies; - xprt->cwnd = RPC_INITCWND; - - rpc_init_wait_queue(&xprt->binding, "xprt_binding"); - rpc_init_wait_queue(&xprt->pending, "xprt_pending"); - rpc_init_wait_queue(&xprt->sending, "xprt_sending"); - rpc_init_wait_queue(&xprt->resend, "xprt_resend"); - rpc_init_priority_wait_queue(&xprt->backlog, "xprt_backlog"); - - /* initialize free list */ - for (req = &xprt->slot[xprt->max_reqs-1]; req >= &xprt->slot[0]; req--) - list_add(&req->rq_list, &xprt->free); - - xprt_init_xid(xprt); - - dprintk("RPC: created transport %p with %u slots\n", xprt, - xprt->max_reqs); - - return xprt; -} - -/** - * xprt_create_proto - create an RPC client transport - * @proto: requested transport protocol - * @sap: remote peer's address - * @to: timeout parameters for new transport - * - */ -struct rpc_xprt *xprt_create_proto(int proto, struct sockaddr_in *sap, struct rpc_timeout *to) -{ - struct rpc_xprt *xprt; - - xprt = xprt_setup(proto, sap, to); - if (IS_ERR(xprt)) - dprintk("RPC: xprt_create_proto failed\n"); - else - dprintk("RPC: xprt_create_proto created xprt %p\n", xprt); - return xprt; -} - /** * xprt_destroy - destroy an RPC transport, killing off all requests. * @xprt: transport to destroy diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index 17179aa..0b84fab 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -1376,7 +1376,6 @@ int xs_setup_udp(struct rpc_xprt *xprt, struct rpc_timeout *to) xprt->prot = IPPROTO_UDP; xprt->tsh_size = 0; - xprt->resvport = capable(CAP_NET_BIND_SERVICE) ? 1 : 0; /* XXX: header size can vary due to auth type, IPv6, etc. */ xprt->max_payload = (1U << 16) - (MAX_HEADER << 3); @@ -1423,7 +1422,6 @@ int xs_setup_tcp(struct rpc_xprt *xprt, struct rpc_timeout *to) xprt->prot = IPPROTO_TCP; xprt->tsh_size = sizeof(rpc_fraghdr) / sizeof(u32); - xprt->resvport = capable(CAP_NET_BIND_SERVICE) ? 1 : 0; xprt->max_payload = RPC_MAX_FRAGMENT_SIZE; INIT_WORK(&xprt->connect_worker, xs_tcp_connect_worker, xprt); -- cgit v0.10.2 From b86acd501a34227e0ed2b2d54dc8002c1701ce17 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:22 -0400 Subject: SUNRPC: export new RPC client functions with _GPL This patch is optional. It has been suggested that the RPC client internal functions used by upper layer protocols (such as NFS) be exported via EXPORT_SYMBOL_GPL. This patch does that. Test plan: Compile kernel with CONFIG_NFS enabled as a module. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 428704d..87efcd2 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -241,7 +241,7 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args) return clnt; } -EXPORT_SYMBOL(rpc_create); +EXPORT_SYMBOL_GPL(rpc_create); /* * This function clones the RPC client structure. It allows us to share the @@ -573,7 +573,7 @@ size_t rpc_peeraddr(struct rpc_clnt *clnt, struct sockaddr *buf, size_t bufsize) memcpy(buf, &clnt->cl_xprt->addr, bytes); return xprt->addrlen; } -EXPORT_SYMBOL(rpc_peeraddr); +EXPORT_SYMBOL_GPL(rpc_peeraddr); /** * rpc_peeraddr2str - return remote peer address in printable format @@ -586,7 +586,7 @@ char *rpc_peeraddr2str(struct rpc_clnt *clnt, enum rpc_display_format_t format) struct rpc_xprt *xprt = clnt->cl_xprt; return xprt->ops->print_addr(xprt, format); } -EXPORT_SYMBOL(rpc_peeraddr2str); +EXPORT_SYMBOL_GPL(rpc_peeraddr2str); void rpc_setbufsize(struct rpc_clnt *clnt, unsigned int sndsize, unsigned int rcvsize) @@ -608,7 +608,7 @@ size_t rpc_max_payload(struct rpc_clnt *clnt) { return clnt->cl_xprt->max_payload; } -EXPORT_SYMBOL(rpc_max_payload); +EXPORT_SYMBOL_GPL(rpc_max_payload); /** * rpc_force_rebind - force transport to check that remote port is unchanged @@ -620,7 +620,7 @@ void rpc_force_rebind(struct rpc_clnt *clnt) if (clnt->cl_autobind) xprt_clear_bound(clnt->cl_xprt); } -EXPORT_SYMBOL(rpc_force_rebind); +EXPORT_SYMBOL_GPL(rpc_force_rebind); /* * Restart an (async) RPC call. Usually called from within the -- cgit v0.10.2 From d3db90e270791b21cd00d3c094884bffa907cc9e Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:22 -0400 Subject: NFS: remove a no-longer-needed error check in nfs_symlink() In the early days of NFS, there was no duplicate reply cache on the server. Thus retransmitted non-idempotent requests often found that the request had already completed on the server. To avoid passing an unanticipated return code to unsuspecting applications, NFS clients would often shunt error codes that implied the request had been retried but already completed. Thanks to NFS over TCP, duplicate reply caches on the server, and network performance and reliability improvements, it is safe to remove such checks. Test plan: None. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 9b496ef..084e8cb 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1476,14 +1476,10 @@ dentry->d_parent->d_name.name, dentry->d_name.name); error = NFS_PROTO(dir)->symlink(dir, &dentry->d_name, &qsymname, &attr, &sym_fh, &sym_attr); nfs_end_data_update(dir); - if (!error) { + if (!error) error = nfs_instantiate(dentry, &sym_fh, &sym_attr); - } else { - if (error == -EEXIST) - printk("nfs_proc_symlink: %s/%s already exists??\n", - dentry->d_parent->d_name.name, dentry->d_name.name); + else d_drop(dentry); - } unlock_kernel(); return error; } -- cgit v0.10.2 From 4f390c152bc87165da4b1f5b7d870b46fb106d4e Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:22 -0400 Subject: NFS: Fix double d_drop in nfs_instantiate() error path If the LOOKUP or GETATTR in nfs_instantiate fail, nfs_instantiate will do a d_drop before returning. But some callers already do a d_drop in the case of an error return. Make certain we do only one d_drop in all error paths. This issue was introduced because over time, the symlink proc API diverged slightly from the create/mkdir/mknod proc API. To prevent other coding mistakes of this type, change the symlink proc API to be more like create/mkdir/mknod and move the nfs_instantiate call into the symlink proc routines so it is used in exactly the same way for create, mkdir, mknod, and symlink. Test plan: Connectathon, all versions of NFS. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 084e8cb..affd3ae 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1147,23 +1147,20 @@ int nfs_instantiate(struct dentry *dentry, struct nfs_fh *fhandle, struct inode *dir = dentry->d_parent->d_inode; error = NFS_PROTO(dir)->lookup(dir, &dentry->d_name, fhandle, fattr); if (error) - goto out_err; + return error; } if (!(fattr->valid & NFS_ATTR_FATTR)) { struct nfs_server *server = NFS_SB(dentry->d_sb); error = server->nfs_client->rpc_ops->getattr(server, fhandle, fattr); if (error < 0) - goto out_err; + return error; } inode = nfs_fhget(dentry->d_sb, fhandle, fattr); error = PTR_ERR(inode); if (IS_ERR(inode)) - goto out_err; + return error; d_instantiate(dentry, inode); return 0; -out_err: - d_drop(dentry); - return error; } /* @@ -1448,8 +1445,6 @@ static int nfs_symlink(struct inode *dir, struct dentry *dentry, const char *symname) { struct iattr attr; - struct nfs_fattr sym_attr; - struct nfs_fh sym_fh; struct qstr qsymname; int error; @@ -1473,12 +1468,9 @@ dentry->d_parent->d_name.name, dentry->d_name.name); lock_kernel(); nfs_begin_data_update(dir); - error = NFS_PROTO(dir)->symlink(dir, &dentry->d_name, &qsymname, - &attr, &sym_fh, &sym_attr); + error = NFS_PROTO(dir)->symlink(dir, dentry, &qsymname, &attr); nfs_end_data_update(dir); if (!error) - error = nfs_instantiate(dentry, &sym_fh, &sym_attr); - else d_drop(dentry); unlock_kernel(); return error; diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c index 9e8258e..d85ac42 100644 --- a/fs/nfs/nfs3proc.c +++ b/fs/nfs/nfs3proc.c @@ -544,23 +544,23 @@ nfs3_proc_link(struct inode *inode, struct inode *dir, struct qstr *name) } static int -nfs3_proc_symlink(struct inode *dir, struct qstr *name, struct qstr *path, - struct iattr *sattr, struct nfs_fh *fhandle, - struct nfs_fattr *fattr) +nfs3_proc_symlink(struct inode *dir, struct dentry *dentry, struct qstr *path, + struct iattr *sattr) { - struct nfs_fattr dir_attr; + struct nfs_fh fhandle; + struct nfs_fattr fattr, dir_attr; struct nfs3_symlinkargs arg = { .fromfh = NFS_FH(dir), - .fromname = name->name, - .fromlen = name->len, + .fromname = dentry->d_name.name, + .fromlen = dentry->d_name.len, .topath = path->name, .tolen = path->len, .sattr = sattr }; struct nfs3_diropres res = { .dir_attr = &dir_attr, - .fh = fhandle, - .fattr = fattr + .fh = &fhandle, + .fattr = &fattr }; struct rpc_message msg = { .rpc_proc = &nfs3_procedures[NFS3PROC_SYMLINK], @@ -571,11 +571,17 @@ nfs3_proc_symlink(struct inode *dir, struct qstr *name, struct qstr *path, if (path->len > NFS3_MAXPATHLEN) return -ENAMETOOLONG; - dprintk("NFS call symlink %s -> %s\n", name->name, path->name); + + dprintk("NFS call symlink %s -> %s\n", dentry->d_name.name, + path->name); nfs_fattr_init(&dir_attr); - nfs_fattr_init(fattr); + nfs_fattr_init(&fattr); status = rpc_call_sync(NFS_CLIENT(dir), &msg, 0); nfs_post_op_update_inode(dir, &dir_attr); + if (status != 0) + goto out; + status = nfs_instantiate(dentry, &fhandle, &fattr); +out: dprintk("NFS reply symlink: %d\n", status); return status; } diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index a825547..2d18eac 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -2084,24 +2084,24 @@ static int nfs4_proc_link(struct inode *inode, struct inode *dir, struct qstr *n return err; } -static int _nfs4_proc_symlink(struct inode *dir, struct qstr *name, - struct qstr *path, struct iattr *sattr, struct nfs_fh *fhandle, - struct nfs_fattr *fattr) +static int _nfs4_proc_symlink(struct inode *dir, struct dentry *dentry, + struct qstr *path, struct iattr *sattr) { struct nfs_server *server = NFS_SERVER(dir); - struct nfs_fattr dir_fattr; + struct nfs_fh fhandle; + struct nfs_fattr fattr, dir_fattr; struct nfs4_create_arg arg = { .dir_fh = NFS_FH(dir), .server = server, - .name = name, + .name = &dentry->d_name, .attrs = sattr, .ftype = NF4LNK, .bitmask = server->attr_bitmask, }; struct nfs4_create_res res = { .server = server, - .fh = fhandle, - .fattr = fattr, + .fh = &fhandle, + .fattr = &fattr, .dir_fattr = &dir_fattr, }; struct rpc_message msg = { @@ -2113,27 +2113,28 @@ static int _nfs4_proc_symlink(struct inode *dir, struct qstr *name, if (path->len > NFS4_MAXPATHLEN) return -ENAMETOOLONG; + arg.u.symlink = path; - nfs_fattr_init(fattr); + nfs_fattr_init(&fattr); nfs_fattr_init(&dir_fattr); status = rpc_call_sync(NFS_CLIENT(dir), &msg, 0); - if (!status) + if (!status) { update_changeattr(dir, &res.dir_cinfo); - nfs_post_op_update_inode(dir, res.dir_fattr); + nfs_post_op_update_inode(dir, res.dir_fattr); + status = nfs_instantiate(dentry, &fhandle, &fattr); + } return status; } -static int nfs4_proc_symlink(struct inode *dir, struct qstr *name, - struct qstr *path, struct iattr *sattr, struct nfs_fh *fhandle, - struct nfs_fattr *fattr) +static int nfs4_proc_symlink(struct inode *dir, struct dentry *dentry, + struct qstr *path, struct iattr *sattr) { struct nfs4_exception exception = { }; int err; do { err = nfs4_handle_exception(NFS_SERVER(dir), - _nfs4_proc_symlink(dir, name, path, sattr, - fhandle, fattr), + _nfs4_proc_symlink(dir, dentry, path, sattr), &exception); } while (exception.retry); return err; diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c index 5a8b940..0b507bf 100644 --- a/fs/nfs/proc.c +++ b/fs/nfs/proc.c @@ -425,14 +425,15 @@ nfs_proc_link(struct inode *inode, struct inode *dir, struct qstr *name) } static int -nfs_proc_symlink(struct inode *dir, struct qstr *name, struct qstr *path, - struct iattr *sattr, struct nfs_fh *fhandle, - struct nfs_fattr *fattr) +nfs_proc_symlink(struct inode *dir, struct dentry *dentry, struct qstr *path, + struct iattr *sattr) { + struct nfs_fh fhandle; + struct nfs_fattr fattr; struct nfs_symlinkargs arg = { .fromfh = NFS_FH(dir), - .fromname = name->name, - .fromlen = name->len, + .fromname = dentry->d_name.name, + .fromlen = dentry->d_name.len, .topath = path->name, .tolen = path->len, .sattr = sattr @@ -445,11 +446,23 @@ nfs_proc_symlink(struct inode *dir, struct qstr *name, struct qstr *path, if (path->len > NFS2_MAXPATHLEN) return -ENAMETOOLONG; - dprintk("NFS call symlink %s -> %s\n", name->name, path->name); - nfs_fattr_init(fattr); - fhandle->size = 0; + + dprintk("NFS call symlink %s -> %s\n", dentry->d_name.name, + path->name); status = rpc_call_sync(NFS_CLIENT(dir), &msg, 0); nfs_mark_for_revalidate(dir); + + /* + * V2 SYMLINK requests don't return any attributes. Setting the + * filehandle size to zero indicates to nfs_instantiate that it + * should fill in the data with a LOOKUP call on the wire. + */ + if (status == 0) { + nfs_fattr_init(&fattr); + fhandle.size = 0; + status = nfs_instantiate(dentry, &fhandle, &fattr); + } + dprintk("NFS reply symlink: %d\n", status); return status; } diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h index 0f33e62..ddf5d75 100644 --- a/include/linux/nfs_xdr.h +++ b/include/linux/nfs_xdr.h @@ -793,9 +793,8 @@ struct nfs_rpc_ops { int (*rename) (struct inode *, struct qstr *, struct inode *, struct qstr *); int (*link) (struct inode *, struct inode *, struct qstr *); - int (*symlink) (struct inode *, struct qstr *, struct qstr *, - struct iattr *, struct nfs_fh *, - struct nfs_fattr *); + int (*symlink) (struct inode *, struct dentry *, struct qstr *, + struct iattr *); int (*mkdir) (struct inode *, struct dentry *, struct iattr *); int (*rmdir) (struct inode *, struct qstr *); int (*readdir) (struct dentry *, struct rpc_cred *, -- cgit v0.10.2 From 873101b33776780d32610fc4c90c7358a5e98f51 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:23 -0400 Subject: NFS: copy symlinks into page cache before sending NFS SYMLINK request Currently the NFS client does not cache symlinks it creates. They get cached only when the NFS client reads them back from the server. Copy the symlink into the page cache before sending it. Test plan: Connectathon, all NFS versions. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index affd3ae..b483e5d 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -30,6 +30,7 @@ #include #include #include +#include #include #include @@ -1441,39 +1442,88 @@ static int nfs_unlink(struct inode *dir, struct dentry *dentry) return error; } -static int -nfs_symlink(struct inode *dir, struct dentry *dentry, const char *symname) +/* + * To create a symbolic link, most file systems instantiate a new inode, + * add a page to it containing the path, then write it out to the disk + * using prepare_write/commit_write. + * + * Unfortunately the NFS client can't create the in-core inode first + * because it needs a file handle to create an in-core inode (see + * fs/nfs/inode.c:nfs_fhget). We only have a file handle *after* the + * symlink request has completed on the server. + * + * So instead we allocate a raw page, copy the symname into it, then do + * the SYMLINK request with the page as the buffer. If it succeeds, we + * now have a new file handle and can instantiate an in-core NFS inode + * and move the raw page into its mapping. + */ +static int nfs_symlink(struct inode *dir, struct dentry *dentry, const char *symname) { + struct pagevec lru_pvec; + struct page *page; + char *kaddr; struct iattr attr; - struct qstr qsymname; + unsigned int pathlen = strlen(symname); + struct qstr qsymname = { + .name = symname, + .len = pathlen, + }; int error; dfprintk(VFS, "NFS: symlink(%s/%ld, %s, %s)\n", dir->i_sb->s_id, dir->i_ino, dentry->d_name.name, symname); -#ifdef NFS_PARANOIA -if (dentry->d_inode) -printk("nfs_proc_symlink: %s/%s not negative!\n", -dentry->d_parent->d_name.name, dentry->d_name.name); -#endif - /* - * Fill in the sattr for the call. - * Note: SunOS 4.1.2 crashes if the mode isn't initialized! - */ - attr.ia_valid = ATTR_MODE; - attr.ia_mode = S_IFLNK | S_IRWXUGO; + if (pathlen > PAGE_SIZE) + return -ENAMETOOLONG; - qsymname.name = symname; - qsymname.len = strlen(symname); + attr.ia_mode = S_IFLNK | S_IRWXUGO; + attr.ia_valid = ATTR_MODE; lock_kernel(); + + page = alloc_page(GFP_KERNEL); + if (!page) { + unlock_kernel(); + return -ENOMEM; + } + + kaddr = kmap_atomic(page, KM_USER0); + memcpy(kaddr, symname, pathlen); + if (pathlen < PAGE_SIZE) + memset(kaddr + pathlen, 0, PAGE_SIZE - pathlen); + kunmap_atomic(kaddr, KM_USER0); + + /* XXX: eventually this will pass in {page, pathlen}, + * instead of qsymname; need XDR changes for that */ nfs_begin_data_update(dir); error = NFS_PROTO(dir)->symlink(dir, dentry, &qsymname, &attr); nfs_end_data_update(dir); - if (!error) + if (error != 0) { + dfprintk(VFS, "NFS: symlink(%s/%ld, %s, %s) error %d\n", + dir->i_sb->s_id, dir->i_ino, + dentry->d_name.name, symname, error); d_drop(dentry); + __free_page(page); + unlock_kernel(); + return error; + } + + /* + * No big deal if we can't add this page to the page cache here. + * READLINK will get the missing page from the server if needed. + */ + pagevec_init(&lru_pvec, 0); + if (!add_to_page_cache(page, dentry->d_inode->i_mapping, 0, + GFP_KERNEL)) { + if (!pagevec_add(&lru_pvec, page)) + __pagevec_lru_add(&lru_pvec); + SetPageUptodate(page); + unlock_page(page); + } else + __free_page(page); + unlock_kernel(); - return error; + return 0; } static int -- cgit v0.10.2 From 94a6d75320b3681e6e728b70e18bd186cb55e682 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Tue, 22 Aug 2006 20:06:23 -0400 Subject: NFS: Use cached page as buffer for NFS symlink requests Now that we have a copy of the symlink path in the page cache, we can pass a struct page down to the XDR routines instead of a string buffer. Test plan: Connectathon, all NFS versions. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index b483e5d..51328ae 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -1464,10 +1464,6 @@ static int nfs_symlink(struct inode *dir, struct dentry *dentry, const char *sym char *kaddr; struct iattr attr; unsigned int pathlen = strlen(symname); - struct qstr qsymname = { - .name = symname, - .len = pathlen, - }; int error; dfprintk(VFS, "NFS: symlink(%s/%ld, %s, %s)\n", dir->i_sb->s_id, @@ -1493,10 +1489,8 @@ static int nfs_symlink(struct inode *dir, struct dentry *dentry, const char *sym memset(kaddr + pathlen, 0, PAGE_SIZE - pathlen); kunmap_atomic(kaddr, KM_USER0); - /* XXX: eventually this will pass in {page, pathlen}, - * instead of qsymname; need XDR changes for that */ nfs_begin_data_update(dir); - error = NFS_PROTO(dir)->symlink(dir, dentry, &qsymname, &attr); + error = NFS_PROTO(dir)->symlink(dir, dentry, page, pathlen, &attr); nfs_end_data_update(dir); if (error != 0) { dfprintk(VFS, "NFS: symlink(%s/%ld, %s, %s) error %d\n", diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c index 67391ee..b49501f 100644 --- a/fs/nfs/nfs2xdr.c +++ b/fs/nfs/nfs2xdr.c @@ -51,7 +51,7 @@ #define NFS_createargs_sz (NFS_diropargs_sz+NFS_sattr_sz) #define NFS_renameargs_sz (NFS_diropargs_sz+NFS_diropargs_sz) #define NFS_linkargs_sz (NFS_fhandle_sz+NFS_diropargs_sz) -#define NFS_symlinkargs_sz (NFS_diropargs_sz+NFS_path_sz+NFS_sattr_sz) +#define NFS_symlinkargs_sz (NFS_diropargs_sz+1+NFS_sattr_sz) #define NFS_readdirargs_sz (NFS_fhandle_sz+2) #define NFS_attrstat_sz (1+NFS_fattr_sz) @@ -351,11 +351,26 @@ nfs_xdr_linkargs(struct rpc_rqst *req, u32 *p, struct nfs_linkargs *args) static int nfs_xdr_symlinkargs(struct rpc_rqst *req, u32 *p, struct nfs_symlinkargs *args) { + struct xdr_buf *sndbuf = &req->rq_snd_buf; + size_t pad; + p = xdr_encode_fhandle(p, args->fromfh); p = xdr_encode_array(p, args->fromname, args->fromlen); - p = xdr_encode_array(p, args->topath, args->tolen); + *p++ = htonl(args->pathlen); + sndbuf->len = xdr_adjust_iovec(sndbuf->head, p); + + xdr_encode_pages(sndbuf, args->pages, 0, args->pathlen); + + /* + * xdr_encode_pages may have added a few bytes to ensure the + * pathname ends on a 4-byte boundary. Start encoding the + * attributes after the pad bytes. + */ + pad = sndbuf->tail->iov_len; + if (pad > 0) + p++; p = xdr_encode_sattr(p, args->sattr); - req->rq_slen = xdr_adjust_iovec(req->rq_svec, p); + sndbuf->len += xdr_adjust_iovec(sndbuf->tail, p) - pad; return 0; } diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c index d85ac42..f8688ea 100644 --- a/fs/nfs/nfs3proc.c +++ b/fs/nfs/nfs3proc.c @@ -544,8 +544,8 @@ nfs3_proc_link(struct inode *inode, struct inode *dir, struct qstr *name) } static int -nfs3_proc_symlink(struct inode *dir, struct dentry *dentry, struct qstr *path, - struct iattr *sattr) +nfs3_proc_symlink(struct inode *dir, struct dentry *dentry, struct page *page, + unsigned int len, struct iattr *sattr) { struct nfs_fh fhandle; struct nfs_fattr fattr, dir_attr; @@ -553,8 +553,8 @@ nfs3_proc_symlink(struct inode *dir, struct dentry *dentry, struct qstr *path, .fromfh = NFS_FH(dir), .fromname = dentry->d_name.name, .fromlen = dentry->d_name.len, - .topath = path->name, - .tolen = path->len, + .pages = &page, + .pathlen = len, .sattr = sattr }; struct nfs3_diropres res = { @@ -569,11 +569,11 @@ nfs3_proc_symlink(struct inode *dir, struct dentry *dentry, struct qstr *path, }; int status; - if (path->len > NFS3_MAXPATHLEN) + if (len > NFS3_MAXPATHLEN) return -ENAMETOOLONG; - dprintk("NFS call symlink %s -> %s\n", dentry->d_name.name, - path->name); + dprintk("NFS call symlink %s\n", dentry->d_name.name); + nfs_fattr_init(&dir_attr); nfs_fattr_init(&fattr); status = rpc_call_sync(NFS_CLIENT(dir), &msg, 0); diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c index 0250269..16556fa 100644 --- a/fs/nfs/nfs3xdr.c +++ b/fs/nfs/nfs3xdr.c @@ -56,7 +56,7 @@ #define NFS3_writeargs_sz (NFS3_fh_sz+5) #define NFS3_createargs_sz (NFS3_diropargs_sz+NFS3_sattr_sz) #define NFS3_mkdirargs_sz (NFS3_diropargs_sz+NFS3_sattr_sz) -#define NFS3_symlinkargs_sz (NFS3_diropargs_sz+NFS3_path_sz+NFS3_sattr_sz) +#define NFS3_symlinkargs_sz (NFS3_diropargs_sz+1+NFS3_sattr_sz) #define NFS3_mknodargs_sz (NFS3_diropargs_sz+2+NFS3_sattr_sz) #define NFS3_renameargs_sz (NFS3_diropargs_sz+NFS3_diropargs_sz) #define NFS3_linkargs_sz (NFS3_fh_sz+NFS3_diropargs_sz) @@ -398,8 +398,11 @@ nfs3_xdr_symlinkargs(struct rpc_rqst *req, u32 *p, struct nfs3_symlinkargs *args p = xdr_encode_fhandle(p, args->fromfh); p = xdr_encode_array(p, args->fromname, args->fromlen); p = xdr_encode_sattr(p, args->sattr); - p = xdr_encode_array(p, args->topath, args->tolen); + *p++ = htonl(args->pathlen); req->rq_slen = xdr_adjust_iovec(req->rq_svec, p); + + /* Copy the page */ + xdr_encode_pages(&req->rq_snd_buf, args->pages, 0, args->pathlen); return 0; } diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 2d18eac..7f60beb 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -2085,7 +2085,7 @@ static int nfs4_proc_link(struct inode *inode, struct inode *dir, struct qstr *n } static int _nfs4_proc_symlink(struct inode *dir, struct dentry *dentry, - struct qstr *path, struct iattr *sattr) + struct page *page, unsigned int len, struct iattr *sattr) { struct nfs_server *server = NFS_SERVER(dir); struct nfs_fh fhandle; @@ -2111,10 +2111,11 @@ static int _nfs4_proc_symlink(struct inode *dir, struct dentry *dentry, }; int status; - if (path->len > NFS4_MAXPATHLEN) + if (len > NFS4_MAXPATHLEN) return -ENAMETOOLONG; - arg.u.symlink = path; + arg.u.symlink.pages = &page; + arg.u.symlink.len = len; nfs_fattr_init(&fattr); nfs_fattr_init(&dir_fattr); @@ -2128,13 +2129,14 @@ static int _nfs4_proc_symlink(struct inode *dir, struct dentry *dentry, } static int nfs4_proc_symlink(struct inode *dir, struct dentry *dentry, - struct qstr *path, struct iattr *sattr) + struct page *page, unsigned int len, struct iattr *sattr) { struct nfs4_exception exception = { }; int err; do { err = nfs4_handle_exception(NFS_SERVER(dir), - _nfs4_proc_symlink(dir, dentry, path, sattr), + _nfs4_proc_symlink(dir, dentry, page, + len, sattr), &exception); } while (exception.retry); return err; diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c index 9992606..3dd413f 100644 --- a/fs/nfs/nfs4xdr.c +++ b/fs/nfs/nfs4xdr.c @@ -128,7 +128,7 @@ static int nfs4_stat_to_errno(int); #define decode_link_maxsz (op_decode_hdr_maxsz + 5) #define encode_symlink_maxsz (op_encode_hdr_maxsz + \ 1 + nfs4_name_maxsz + \ - nfs4_path_maxsz + \ + 1 + \ nfs4_fattr_maxsz) #define decode_symlink_maxsz (op_decode_hdr_maxsz + 8) #define encode_create_maxsz (op_encode_hdr_maxsz + \ @@ -673,9 +673,9 @@ static int encode_create(struct xdr_stream *xdr, const struct nfs4_create_arg *c switch (create->ftype) { case NF4LNK: - RESERVE_SPACE(4 + create->u.symlink->len); - WRITE32(create->u.symlink->len); - WRITEMEM(create->u.symlink->name, create->u.symlink->len); + RESERVE_SPACE(4); + WRITE32(create->u.symlink.len); + xdr_write_pages(xdr, create->u.symlink.pages, 0, create->u.symlink.len); break; case NF4BLK: case NF4CHR: diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c index 0b507bf..630e506 100644 --- a/fs/nfs/proc.c +++ b/fs/nfs/proc.c @@ -425,8 +425,8 @@ nfs_proc_link(struct inode *inode, struct inode *dir, struct qstr *name) } static int -nfs_proc_symlink(struct inode *dir, struct dentry *dentry, struct qstr *path, - struct iattr *sattr) +nfs_proc_symlink(struct inode *dir, struct dentry *dentry, struct page *page, + unsigned int len, struct iattr *sattr) { struct nfs_fh fhandle; struct nfs_fattr fattr; @@ -434,8 +434,8 @@ nfs_proc_symlink(struct inode *dir, struct dentry *dentry, struct qstr *path, .fromfh = NFS_FH(dir), .fromname = dentry->d_name.name, .fromlen = dentry->d_name.len, - .topath = path->name, - .tolen = path->len, + .pages = &page, + .pathlen = len, .sattr = sattr }; struct rpc_message msg = { @@ -444,11 +444,11 @@ nfs_proc_symlink(struct inode *dir, struct dentry *dentry, struct qstr *path, }; int status; - if (path->len > NFS2_MAXPATHLEN) + if (len > NFS2_MAXPATHLEN) return -ENAMETOOLONG; - dprintk("NFS call symlink %s -> %s\n", dentry->d_name.name, - path->name); + dprintk("NFS call symlink %s\n", dentry->d_name.name); + status = rpc_call_sync(NFS_CLIENT(dir), &msg, 0); nfs_mark_for_revalidate(dir); diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h index ddf5d75..dc5397d 100644 --- a/include/linux/nfs_xdr.h +++ b/include/linux/nfs_xdr.h @@ -358,8 +358,8 @@ struct nfs_symlinkargs { struct nfs_fh * fromfh; const char * fromname; unsigned int fromlen; - const char * topath; - unsigned int tolen; + struct page ** pages; + unsigned int pathlen; struct iattr * sattr; }; @@ -434,8 +434,8 @@ struct nfs3_symlinkargs { struct nfs_fh * fromfh; const char * fromname; unsigned int fromlen; - const char * topath; - unsigned int tolen; + struct page ** pages; + unsigned int pathlen; struct iattr * sattr; }; @@ -533,7 +533,10 @@ struct nfs4_accessres { struct nfs4_create_arg { u32 ftype; union { - struct qstr * symlink; /* NF4LNK */ + struct { + struct page ** pages; + unsigned int len; + } symlink; /* NF4LNK */ struct { u32 specdata1; u32 specdata2; @@ -793,8 +796,8 @@ struct nfs_rpc_ops { int (*rename) (struct inode *, struct qstr *, struct inode *, struct qstr *); int (*link) (struct inode *, struct inode *, struct qstr *); - int (*symlink) (struct inode *, struct dentry *, struct qstr *, - struct iattr *); + int (*symlink) (struct inode *, struct dentry *, struct page *, + unsigned int, struct iattr *); int (*mkdir) (struct inode *, struct dentry *, struct iattr *); int (*rmdir) (struct inode *, struct qstr *); int (*readdir) (struct dentry *, struct rpc_cred *, -- cgit v0.10.2 From 275a082fe9308e710324e26ccb5363c53d8fd45f Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Tue, 22 Aug 2006 20:06:24 -0400 Subject: Add a real API for dealing with blk_congestion_wait() Signed-off-by: Trond Myklebust diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c index ddd9253..dcbd6ff 100644 --- a/block/ll_rw_blk.c +++ b/block/ll_rw_blk.c @@ -2734,6 +2734,18 @@ long blk_congestion_wait(int rw, long timeout) EXPORT_SYMBOL(blk_congestion_wait); +/** + * blk_congestion_end - wake up sleepers on a congestion queue + * @rw: READ or WRITE + */ +void blk_congestion_end(int rw) +{ + wait_queue_head_t *wqh = &congestion_wqh[rw]; + + if (waitqueue_active(wqh)) + wake_up(wqh); +} + /* * Has to be called with the request spinlock acquired */ diff --git a/fs/nfs/write.c b/fs/nfs/write.c index 453d446..38ba5c0 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -396,6 +396,7 @@ int nfs_writepages(struct address_space *mapping, struct writeback_control *wbc) out: clear_bit(BDI_write_congested, &bdi->state); wake_up_all(&nfs_write_congestion); + writeback_congestion_end(); return err; } diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index aafe827..96c9040 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -746,6 +746,7 @@ extern void blk_queue_free_tags(request_queue_t *); extern int blk_queue_resize_tags(request_queue_t *, int); extern void blk_queue_invalidate_tags(request_queue_t *); extern long blk_congestion_wait(int rw, long timeout); +extern void blk_congestion_end(int rw); extern void blk_rq_bio_prep(request_queue_t *, struct request *, struct bio *); extern int blkdev_issue_flush(struct block_device *, sector_t *); diff --git a/include/linux/writeback.h b/include/linux/writeback.h index 9e38b56..0422036 100644 --- a/include/linux/writeback.h +++ b/include/linux/writeback.h @@ -85,6 +85,7 @@ int wakeup_pdflush(long nr_pages); void laptop_io_completion(void); void laptop_sync_completion(void); void throttle_vm_writeout(void); +void writeback_congestion_end(void); /* These are exported to sysctl. */ extern int dirty_background_ratio; diff --git a/mm/page-writeback.c b/mm/page-writeback.c index e630188..77a0bc4 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -803,6 +803,15 @@ int test_set_page_writeback(struct page *page) EXPORT_SYMBOL(test_set_page_writeback); /* + * Wakes up tasks that are being throttled due to writeback congestion + */ +void writeback_congestion_end(void) +{ + blk_congestion_end(WRITE); +} +EXPORT_SYMBOL(writeback_congestion_end); + +/* * Return true if any of the pages in the mapping are marged with the * passed tag. */ -- cgit v0.10.2 From 5dd3177ae5012c1e2ad7a9ffdbd0e0d0de2f60e4 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Thu, 24 Aug 2006 01:03:05 -0400 Subject: NFSv4: Fix a use-after-free issue with the nfs server. Signed-off-by: Trond Myklebust diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 12941a8..f1ff2ae 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -164,6 +164,26 @@ error_0: return NULL; } +static void nfs4_shutdown_client(struct nfs_client *clp) +{ +#ifdef CONFIG_NFS_V4 + if (__test_and_clear_bit(NFS_CS_RENEWD, &clp->cl_res_state)) + nfs4_kill_renewd(clp); + while (!list_empty(&clp->cl_unused)) { + struct nfs4_state_owner *sp; + + sp = list_entry(clp->cl_unused.next, + struct nfs4_state_owner, + so_list); + list_del(&sp->so_list); + kfree(sp); + } + BUG_ON(!list_empty(&clp->cl_state_owners)); + if (__test_and_clear_bit(NFS_CS_IDMAP, &clp->cl_res_state)) + nfs_idmap_delete(clp); +#endif +} + /* * Destroy a shared client record */ @@ -171,21 +191,7 @@ static void nfs_free_client(struct nfs_client *clp) { dprintk("--> nfs_free_client(%d)\n", clp->cl_nfsversion); -#ifdef CONFIG_NFS_V4 - if (__test_and_clear_bit(NFS_CS_IDMAP, &clp->cl_res_state)) { - while (!list_empty(&clp->cl_unused)) { - struct nfs4_state_owner *sp; - - sp = list_entry(clp->cl_unused.next, - struct nfs4_state_owner, - so_list); - list_del(&sp->so_list); - kfree(sp); - } - BUG_ON(!list_empty(&clp->cl_state_owners)); - nfs_idmap_delete(clp); - } -#endif + nfs4_shutdown_client(clp); /* -EIO all pending I/O */ if (!IS_ERR(clp->cl_rpcclient)) diff --git a/fs/nfs/nfs4renewd.c b/fs/nfs/nfs4renewd.c index f2c8936..7b6df18 100644 --- a/fs/nfs/nfs4renewd.c +++ b/fs/nfs/nfs4renewd.c @@ -121,6 +121,7 @@ nfs4_schedule_state_renewal(struct nfs_client *clp) __FUNCTION__, (timeout + HZ - 1) / HZ); cancel_delayed_work(&clp->cl_renewd); schedule_delayed_work(&clp->cl_renewd, timeout); + set_bit(NFS_CS_RENEWD, &clp->cl_res_state); spin_unlock(&clp->cl_lock); } diff --git a/fs/nfs/super.c b/fs/nfs/super.c index 97cfb14..665949d 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -883,13 +883,15 @@ static int nfs4_get_sb(struct file_system_type *fs_type, goto out_free; } + if (s->s_fs_info != server) { + nfs_free_server(server); + server = NULL; + } + if (!s->s_root) { /* initial superblock/root creation */ s->s_flags = flags; - nfs4_fill_super(s); - } else { - nfs_free_server(server); } mntroot = nfs4_get_root(s, &mntfh); diff --git a/include/linux/nfs_fs_sb.h b/include/linux/nfs_fs_sb.h index 6d0be0e..7ccfc7e 100644 --- a/include/linux/nfs_fs_sb.h +++ b/include/linux/nfs_fs_sb.h @@ -19,6 +19,7 @@ struct nfs_client { #define NFS_CS_RPCIOD 0 /* - rpciod started */ #define NFS_CS_CALLBACK 1 /* - callback started */ #define NFS_CS_IDMAP 2 /* - idmap started */ +#define NFS_CS_RENEWD 3 /* - renewd started */ struct sockaddr_in cl_addr; /* server identifier */ char * cl_hostname; /* hostname of server */ struct list_head cl_share_link; /* link in global client list */ -- cgit v0.10.2 From 158998b6fe36f6acef087f574c96d44713499cc9 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Thu, 24 Aug 2006 01:03:17 -0400 Subject: SUNRPC: Make rpc_mkpipe() take the parent dentry as an argument Signed-off-by: Trond Myklebust diff --git a/fs/nfs/idmap.c b/fs/nfs/idmap.c index f96dfac..82ad711 100644 --- a/fs/nfs/idmap.c +++ b/fs/nfs/idmap.c @@ -84,7 +84,6 @@ struct idmap_hashtable { }; struct idmap { - char idmap_path[48]; struct dentry *idmap_dentry; wait_queue_head_t idmap_wq; struct idmap_msg idmap_im; @@ -119,10 +118,7 @@ nfs_idmap_new(struct nfs_client *clp) if ((idmap = kzalloc(sizeof(*idmap), GFP_KERNEL)) == NULL) return -ENOMEM; - snprintf(idmap->idmap_path, sizeof(idmap->idmap_path), - "%s/idmap", clp->cl_rpcclient->cl_pathname); - - idmap->idmap_dentry = rpc_mkpipe(idmap->idmap_path, + idmap->idmap_dentry = rpc_mkpipe(clp->cl_rpcclient->cl_dentry, "idmap", idmap, &idmap_upcall_ops, 0); if (IS_ERR(idmap->idmap_dentry)) { error = PTR_ERR(idmap->idmap_dentry); diff --git a/include/linux/sunrpc/rpc_pipe_fs.h b/include/linux/sunrpc/rpc_pipe_fs.h index a481472..a2eb9b4 100644 --- a/include/linux/sunrpc/rpc_pipe_fs.h +++ b/include/linux/sunrpc/rpc_pipe_fs.h @@ -43,7 +43,7 @@ extern int rpc_queue_upcall(struct inode *, struct rpc_pipe_msg *); extern struct dentry *rpc_mkdir(char *, struct rpc_clnt *); extern int rpc_rmdir(struct dentry *); -extern struct dentry *rpc_mkpipe(char *, void *, struct rpc_pipe_ops *, int flags); +extern struct dentry *rpc_mkpipe(struct dentry *, const char *, void *, struct rpc_pipe_ops *, int flags); extern int rpc_unlink(struct dentry *); extern struct vfsmount *rpc_get_mount(void); extern void rpc_put_mount(void); diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c index ef1cf5b..6eed3e1 100644 --- a/net/sunrpc/auth_gss/auth_gss.c +++ b/net/sunrpc/auth_gss/auth_gss.c @@ -88,7 +88,6 @@ struct gss_auth { struct list_head upcalls; struct rpc_clnt *client; struct dentry *dentry; - char path[48]; spinlock_t lock; }; @@ -690,10 +689,8 @@ gss_create(struct rpc_clnt *clnt, rpc_authflavor_t flavor) if (err) goto err_put_mech; - snprintf(gss_auth->path, sizeof(gss_auth->path), "%s/%s", - clnt->cl_pathname, - gss_auth->mech->gm_name); - gss_auth->dentry = rpc_mkpipe(gss_auth->path, clnt, &gss_upcall_ops, RPC_PIPE_WAIT_FOR_OPEN); + gss_auth->dentry = rpc_mkpipe(clnt->cl_dentry, gss_auth->mech->gm_name, + clnt, &gss_upcall_ops, RPC_PIPE_WAIT_FOR_OPEN); if (IS_ERR(gss_auth->dentry)) { err = PTR_ERR(gss_auth->dentry); goto err_put_mech; diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c index c21dc07..11ec12a 100644 --- a/net/sunrpc/rpc_pipe.c +++ b/net/sunrpc/rpc_pipe.c @@ -621,17 +621,13 @@ __rpc_rmdir(struct inode *dir, struct dentry *dentry) } static struct dentry * -rpc_lookup_negative(char *path, struct nameidata *nd) +rpc_lookup_create(struct dentry *parent, const char *name, int len) { + struct inode *dir = parent->d_inode; struct dentry *dentry; - struct inode *dir; - int error; - if ((error = rpc_lookup_parent(path, nd)) != 0) - return ERR_PTR(error); - dir = nd->dentry->d_inode; mutex_lock_nested(&dir->i_mutex, I_MUTEX_PARENT); - dentry = lookup_one_len(nd->last.name, nd->dentry, nd->last.len); + dentry = lookup_one_len(name, parent, len); if (IS_ERR(dentry)) goto out_err; if (dentry->d_inode) { @@ -642,7 +638,20 @@ rpc_lookup_negative(char *path, struct nameidata *nd) return dentry; out_err: mutex_unlock(&dir->i_mutex); - rpc_release_path(nd); + return dentry; +} + +static struct dentry * +rpc_lookup_negative(char *path, struct nameidata *nd) +{ + struct dentry *dentry; + int error; + + if ((error = rpc_lookup_parent(path, nd)) != 0) + return ERR_PTR(error); + dentry = rpc_lookup_create(nd->dentry, nd->last.name, nd->last.len); + if (IS_ERR(dentry)) + rpc_release_path(nd); return dentry; } @@ -701,17 +710,16 @@ rpc_rmdir(struct dentry *dentry) } struct dentry * -rpc_mkpipe(char *path, void *private, struct rpc_pipe_ops *ops, int flags) +rpc_mkpipe(struct dentry *parent, const char *name, void *private, struct rpc_pipe_ops *ops, int flags) { - struct nameidata nd; struct dentry *dentry; struct inode *dir, *inode; struct rpc_inode *rpci; - dentry = rpc_lookup_negative(path, &nd); + dentry = rpc_lookup_create(parent, name, strlen(name)); if (IS_ERR(dentry)) return dentry; - dir = nd.dentry->d_inode; + dir = parent->d_inode; inode = rpc_get_inode(dir->i_sb, S_IFSOCK | S_IRUSR | S_IWUSR); if (!inode) goto err_dput; @@ -726,13 +734,13 @@ rpc_mkpipe(char *path, void *private, struct rpc_pipe_ops *ops, int flags) dget(dentry); out: mutex_unlock(&dir->i_mutex); - rpc_release_path(&nd); return dentry; err_dput: dput(dentry); dentry = ERR_PTR(-ENOMEM); - printk(KERN_WARNING "%s: %s() failed to create pipe %s (errno = %d)\n", - __FILE__, __FUNCTION__, path, -ENOMEM); + printk(KERN_WARNING "%s: %s() failed to create pipe %s/%s (errno = %d)\n", + __FILE__, __FUNCTION__, parent->d_name.name, name, + -ENOMEM); goto out; } -- cgit v0.10.2 From 6daabf1b04c89f1fbd8eab5450261360943c8e20 Mon Sep 17 00:00:00 2001 From: David Howells Date: Thu, 24 Aug 2006 15:44:16 -0400 Subject: NFS: Fix up compiler warnings on 64-bit platforms in client.c Fix up warnings from compiling on ppc64. Signed-Off-By: David Howells Signed-off-by: Trond Myklebust diff --git a/fs/nfs/client.c b/fs/nfs/client.c index f1ff2ae..a4aa479 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -836,7 +836,9 @@ struct nfs_server *nfs_create_server(const struct nfs_mount_data *data, } memcpy(&server->fsid, &fattr.fsid, sizeof(server->fsid)); - dprintk("Server FSID: %llx:%llx\n", server->fsid.major, server->fsid.minor); + dprintk("Server FSID: %llx:%llx\n", + (unsigned long long) server->fsid.major, + (unsigned long long) server->fsid.minor); BUG_ON(!server->nfs_client); BUG_ON(!server->nfs_client->rpc_ops); @@ -1002,7 +1004,9 @@ struct nfs_server *nfs4_create_server(const struct nfs4_mount_data *data, if (error < 0) goto error; - dprintk("Server FSID: %llx:%llx\n", server->fsid.major, server->fsid.minor); + dprintk("Server FSID: %llx:%llx\n", + (unsigned long long) server->fsid.major, + (unsigned long long) server->fsid.minor); dprintk("Mount FH: %d\n", mntfh->size); error = nfs_probe_fsinfo(server, mntfh, &fattr); @@ -1074,7 +1078,8 @@ struct nfs_server *nfs4_create_referral_server(struct nfs_clone_mount *data, goto error; dprintk("Referral FSID: %llx:%llx\n", - server->fsid.major, server->fsid.minor); + (unsigned long long) server->fsid.major, + (unsigned long long) server->fsid.minor); spin_lock(&nfs_client_lock); list_add_tail(&server->client_link, &server->nfs_client->cl_superblocks); @@ -1106,7 +1111,8 @@ struct nfs_server *nfs_clone_server(struct nfs_server *source, int error; dprintk("--> nfs_clone_server(,%llx:%llx,)\n", - fattr->fsid.major, fattr->fsid.minor); + (unsigned long long) fattr->fsid.major, + (unsigned long long) fattr->fsid.minor); server = nfs_alloc_server(); if (!server) @@ -1131,7 +1137,8 @@ struct nfs_server *nfs_clone_server(struct nfs_server *source, goto out_free_server; dprintk("Cloned FSID: %llx:%llx\n", - server->fsid.major, server->fsid.minor); + (unsigned long long) server->fsid.major, + (unsigned long long) server->fsid.minor); error = nfs_start_lockd(server); if (error < 0) @@ -1375,7 +1382,8 @@ static int nfs_volume_list_show(struct seq_file *m, void *v) MAJOR(server->s_dev), MINOR(server->s_dev)); snprintf(fsid, 17, "%llx:%llx", - server->fsid.major, server->fsid.minor); + (unsigned long long) server->fsid.major, + (unsigned long long) server->fsid.minor); seq_printf(m, "v%d %02x%02x%02x%02x %4hx %-7s %-17s\n", clp->cl_nfsversion, -- cgit v0.10.2 From 058ad9cbf14b3c7480d01b20280cb4d5858f7a50 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Sun, 27 Aug 2006 17:23:53 -0400 Subject: NFS: NFS_ROOT should use the new rpc_create API Teach NFS_ROOT to use the new rpc_create API instead of the old two-call API for creating an RPC transport. Test plan: Compile the kernel with the NFS client build-in, and set CONFIG_NFS_ROOT. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/fs/nfs/mount_clnt.c b/fs/nfs/mount_clnt.c index 41274874..d507b02 100644 --- a/fs/nfs/mount_clnt.c +++ b/fs/nfs/mount_clnt.c @@ -76,22 +76,19 @@ static struct rpc_clnt * mnt_create(char *hostname, struct sockaddr_in *srvaddr, int version, int protocol) { - struct rpc_xprt *xprt; - struct rpc_clnt *clnt; - - xprt = xprt_create_proto(protocol, srvaddr, NULL); - if (IS_ERR(xprt)) - return (struct rpc_clnt *)xprt; - - clnt = rpc_create_client(xprt, hostname, - &mnt_program, version, - RPC_AUTH_UNIX); - if (!IS_ERR(clnt)) { - clnt->cl_softrtry = 1; - clnt->cl_oneshot = 1; - clnt->cl_intr = 1; - } - return clnt; + struct rpc_create_args args = { + .protocol = protocol, + .address = (struct sockaddr *)srvaddr, + .addrsize = sizeof(*srvaddr), + .servername = hostname, + .program = &mnt_program, + .version = version, + .authflavor = RPC_AUTH_UNIX, + .flags = (RPC_CLNT_CREATE_ONESHOT | + RPC_CLNT_CREATE_INTR), + }; + + return rpc_create(&args); } /* -- cgit v0.10.2 From 297de4f65698ee1e1c75e27d57933b5fa8227e72 Mon Sep 17 00:00:00 2001 From: "andros@citi.umich.edu" Date: Tue, 29 Aug 2006 12:19:41 -0400 Subject: Fix a referral error Oops Fix an oops when the referral server is not responding. Check the error return from nfs4_set_client() in nfs4_create_referral_server. Signed-off-by: Andy Adamson Signed-off-by: Trond Myklebust diff --git a/fs/nfs/client.c b/fs/nfs/client.c index a4aa479..110f80e 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -1059,6 +1059,8 @@ struct nfs_server *nfs4_create_referral_server(struct nfs_clone_mount *data, parent_server->client->cl_xprt->prot, parent_client->retrans_timeo, parent_client->retrans_count); + if (error < 0) + goto error; /* Initialise the client representation from the parent server */ nfs_server_copy_userdata(server, parent_server); -- cgit v0.10.2 From 8014793b1b2869445adfe678d64cdacd10e99d53 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Thu, 31 Aug 2006 18:24:08 -0400 Subject: SUNRPC: rpc_delay() should not clobber the rpc_task->tk_status Doing so prevents stuff like call_encode() from working correctly. Signed-off-by: Trond Myklebust diff --git a/net/sunrpc/sched.c b/net/sunrpc/sched.c index ecf3663..6390461 100644 --- a/net/sunrpc/sched.c +++ b/net/sunrpc/sched.c @@ -542,24 +542,20 @@ void rpc_wake_up_status(struct rpc_wait_queue *queue, int status) spin_unlock_bh(&queue->lock); } +static void __rpc_atrun(struct rpc_task *task) +{ + rpc_wake_up_task(task); +} + /* * Run a task at a later time */ -static void __rpc_atrun(struct rpc_task *); -void -rpc_delay(struct rpc_task *task, unsigned long delay) +void rpc_delay(struct rpc_task *task, unsigned long delay) { task->tk_timeout = delay; rpc_sleep_on(&delay_queue, task, NULL, __rpc_atrun); } -static void -__rpc_atrun(struct rpc_task *task) -{ - task->tk_status = 0; - rpc_wake_up_task(task); -} - /* * Helper to call task->tk_ops->rpc_call_prepare */ -- cgit v0.10.2 From 76303992b4701124f4cd0791ae2049ab4332f02c Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Wed, 30 Aug 2006 14:32:49 -0400 Subject: SUNRPC: Handle ENETUNREACH, EHOSTUNREACH and EHOSTDOWN socket errors In case of any of the above errors occuring, delay for 3 seconds, then handle as if it were a timeout error. Signed-off-by: Trond Myklebust diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 87efcd2..355e786 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -1030,6 +1030,14 @@ call_status(struct rpc_task *task) task->tk_status = 0; switch(status) { + case -EHOSTDOWN: + case -EHOSTUNREACH: + case -ENETUNREACH: + /* + * Delay any retries for 3 seconds, then handle as if it + * were a timeout. + */ + rpc_delay(task, 3*HZ); case -ETIMEDOUT: task->tk_action = call_timeout; break; -- cgit v0.10.2 From da45828e2835057045150b318c4fbe9bb91f18dd Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Thu, 31 Aug 2006 15:44:52 -0400 Subject: SUNRPC: Clean up soft task error handling - Ensure that the task aborts the RPC call only when it has actually timed out. - Ensure that req->rq_majortimeo is initialised correctly. Signed-off-by: Trond Myklebust diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 355e786..ceadb72 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -863,15 +863,11 @@ call_bind_status(struct rpc_task *task) dprintk("RPC: %4d remote rpcbind: RPC program/version unavailable\n", task->tk_pid); rpc_delay(task, 3*HZ); - goto retry_bind; + goto retry_timeout; case -ETIMEDOUT: dprintk("RPC: %4d rpcbind request timed out\n", task->tk_pid); - if (RPC_IS_SOFT(task)) { - status = -EIO; - break; - } - goto retry_bind; + goto retry_timeout; case -EPFNOSUPPORT: dprintk("RPC: %4d remote rpcbind service unavailable\n", task->tk_pid); @@ -884,16 +880,13 @@ call_bind_status(struct rpc_task *task) dprintk("RPC: %4d unrecognized rpcbind error (%d)\n", task->tk_pid, -task->tk_status); status = -EIO; - break; } rpc_exit(task, status); return; -retry_bind: - task->tk_status = 0; - task->tk_action = call_bind; - return; +retry_timeout: + task->tk_action = call_timeout; } /* @@ -941,14 +934,16 @@ call_connect_status(struct rpc_task *task) switch (status) { case -ENOTCONN: - case -ETIMEDOUT: case -EAGAIN: task->tk_action = call_bind; - break; - default: - rpc_exit(task, -EIO); - break; + if (!RPC_IS_SOFT(task)) + return; + /* if soft mounted, test if we've timed out */ + case -ETIMEDOUT: + task->tk_action = call_timeout; + return; } + rpc_exit(task, -EIO); } /* @@ -1057,7 +1052,6 @@ call_status(struct rpc_task *task) printk("%s: RPC call returned error %d\n", clnt->cl_protname, -status); rpc_exit(task, status); - break; } } @@ -1125,10 +1119,10 @@ call_decode(struct rpc_task *task) clnt->cl_stats->rpcretrans++; goto out_retry; } - printk(KERN_WARNING "%s: too small RPC reply size (%d bytes)\n", + dprintk("%s: too small RPC reply size (%d bytes)\n", clnt->cl_protname, task->tk_status); - rpc_exit(task, -EIO); - return; + task->tk_action = call_timeout; + goto out_retry; } /* diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index e4f64fb..a85f82b 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -585,13 +585,6 @@ static void xprt_connect_status(struct rpc_task *task) task->tk_pid, -task->tk_status, task->tk_client->cl_server); xprt_release_write(xprt, task); task->tk_status = -EIO; - return; - } - - /* if soft mounted, just cause this RPC to fail */ - if (RPC_IS_SOFT(task)) { - xprt_release_write(xprt, task); - task->tk_status = -EIO; } } @@ -829,6 +822,7 @@ static void xprt_request_init(struct rpc_task *task, struct rpc_xprt *xprt) req->rq_bufsize = 0; req->rq_xid = xprt_alloc_xid(xprt); req->rq_release_snd_buf = NULL; + xprt_reset_majortimeo(req); dprintk("RPC: %4d reserved req %p xid %08x\n", task->tk_pid, req, ntohl(req->rq_xid)); } -- cgit v0.10.2 From 6b6ca86b77b62b798cf9ca2599036420abce7796 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Tue, 5 Sep 2006 12:55:57 -0400 Subject: SUNRPC: Add refcounting to the struct rpc_xprt In a subsequent patch, this will allow the portmapper to take a reference to the rpc_xprt for which it is updating the port number, fixing an Oops. Signed-off-by: Trond Myklebust diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h index de4efea..bdeba85 100644 --- a/include/linux/sunrpc/xprt.h +++ b/include/linux/sunrpc/xprt.h @@ -12,6 +12,7 @@ #include #include #include +#include #include #include @@ -129,6 +130,7 @@ struct rpc_xprt_ops { }; struct rpc_xprt { + struct kref kref; /* Reference count */ struct rpc_xprt_ops * ops; /* transport methods */ struct socket * sock; /* BSD socket layer */ struct sock * inet; /* INET layer */ @@ -248,7 +250,8 @@ int xprt_adjust_timeout(struct rpc_rqst *req); void xprt_release_xprt(struct rpc_xprt *xprt, struct rpc_task *task); void xprt_release_xprt_cong(struct rpc_xprt *xprt, struct rpc_task *task); void xprt_release(struct rpc_task *task); -int xprt_destroy(struct rpc_xprt *xprt); +struct rpc_xprt * xprt_get(struct rpc_xprt *xprt); +void xprt_put(struct rpc_xprt *xprt); static inline u32 *xprt_skip_transport_header(struct rpc_xprt *xprt, u32 *p) { diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index ceadb72..084a0ad 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -177,7 +177,7 @@ out_no_path: kfree(clnt->cl_server); kfree(clnt); out_err: - xprt_destroy(xprt); + xprt_put(xprt); out_no_xprt: return ERR_PTR(err); } @@ -261,6 +261,7 @@ rpc_clone_client(struct rpc_clnt *clnt) atomic_set(&new->cl_users, 0); new->cl_parent = clnt; atomic_inc(&clnt->cl_count); + new->cl_xprt = xprt_get(clnt->cl_xprt); /* Turn off autobind on clones */ new->cl_autobind = 0; new->cl_oneshot = 0; @@ -337,15 +338,12 @@ rpc_destroy_client(struct rpc_clnt *clnt) rpc_rmdir(clnt->cl_dentry); rpc_put_mount(); } - if (clnt->cl_xprt) { - xprt_destroy(clnt->cl_xprt); - clnt->cl_xprt = NULL; - } if (clnt->cl_server != clnt->cl_inline_name) kfree(clnt->cl_server); out_free: rpc_free_iostats(clnt->cl_metrics); clnt->cl_metrics = NULL; + xprt_put(clnt->cl_xprt); kfree(clnt); return 0; } diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c index a85f82b..1f786f6 100644 --- a/net/sunrpc/xprt.c +++ b/net/sunrpc/xprt.c @@ -926,6 +926,7 @@ struct rpc_xprt *xprt_create_transport(int proto, struct sockaddr *ap, size_t si return ERR_PTR(result); } + kref_init(&xprt->kref); spin_lock_init(&xprt->transport_lock); spin_lock_init(&xprt->reserve_lock); @@ -958,16 +959,37 @@ struct rpc_xprt *xprt_create_transport(int proto, struct sockaddr *ap, size_t si /** * xprt_destroy - destroy an RPC transport, killing off all requests. - * @xprt: transport to destroy + * @kref: kref for the transport to destroy * */ -int xprt_destroy(struct rpc_xprt *xprt) +static void xprt_destroy(struct kref *kref) { + struct rpc_xprt *xprt = container_of(kref, struct rpc_xprt, kref); + dprintk("RPC: destroying transport %p\n", xprt); xprt->shutdown = 1; del_timer_sync(&xprt->timer); xprt->ops->destroy(xprt); kfree(xprt); +} - return 0; +/** + * xprt_put - release a reference to an RPC transport. + * @xprt: pointer to the transport + * + */ +void xprt_put(struct rpc_xprt *xprt) +{ + kref_put(&xprt->kref, xprt_destroy); +} + +/** + * xprt_get - return a reference to an RPC transport. + * @xprt: pointer to the transport + * + */ +struct rpc_xprt *xprt_get(struct rpc_xprt *xprt) +{ + kref_get(&xprt->kref); + return xprt; } -- cgit v0.10.2 From 762d4527c2fc19d821a13d9a3455ccc2d4073731 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Sun, 3 Sep 2006 00:51:55 -0400 Subject: SUNRPC: Fix Oops in pmap_getport_done There is no guarantee that the parent task still exists when we exit from the portmapper. Save the xprt instead. Signed-off-by: Trond Myklebust diff --git a/net/sunrpc/pmap_clnt.c b/net/sunrpc/pmap_clnt.c index f476f4d..c04609d 100644 --- a/net/sunrpc/pmap_clnt.c +++ b/net/sunrpc/pmap_clnt.c @@ -30,7 +30,7 @@ struct portmap_args { u32 pm_vers; u32 pm_prot; unsigned short pm_port; - struct rpc_task * pm_task; + struct rpc_xprt * pm_xprt; }; static struct rpc_procinfo pmap_procedures[]; @@ -71,10 +71,10 @@ static const struct rpc_call_ops pmap_getport_ops = { .rpc_release = pmap_map_release, }; -static inline void pmap_wake_portmap_waiters(struct rpc_xprt *xprt) +static inline void pmap_wake_portmap_waiters(struct rpc_xprt *xprt, int status) { xprt_clear_binding(xprt); - rpc_wake_up(&xprt->binding); + rpc_wake_up_status(&xprt->binding, status); } /** @@ -92,6 +92,7 @@ void rpc_getport(struct rpc_task *task) struct portmap_args *map; struct rpc_clnt *pmap_clnt; struct rpc_task *child; + int status; dprintk("RPC: %4d rpc_getport(%s, %u, %u, %d)\n", task->tk_pid, clnt->cl_server, @@ -107,34 +108,30 @@ void rpc_getport(struct rpc_task *task) } /* Someone else may have bound if we slept */ - if (xprt_bound(xprt)) { - task->tk_status = 0; + status = 0; + if (xprt_bound(xprt)) goto bailout_nofree; - } + status = -ENOMEM; map = pmap_map_alloc(); - if (!map) { - task->tk_status = -ENOMEM; + if (!map) goto bailout_nofree; - } map->pm_prog = clnt->cl_prog; map->pm_vers = clnt->cl_vers; map->pm_prot = xprt->prot; map->pm_port = 0; - map->pm_task = task; + map->pm_xprt = xprt_get(xprt); rpc_peeraddr(clnt, (struct sockaddr *) &addr, sizeof(addr)); pmap_clnt = pmap_create(clnt->cl_server, &addr, map->pm_prot, 0); - if (IS_ERR(pmap_clnt)) { - task->tk_status = PTR_ERR(pmap_clnt); + status = PTR_ERR(pmap_clnt); + if (IS_ERR(pmap_clnt)) goto bailout; - } + status = -EIO; child = rpc_run_task(pmap_clnt, RPC_TASK_ASYNC, &pmap_getport_ops, map); - if (IS_ERR(child)) { - task->tk_status = -EIO; + if (IS_ERR(child)) goto bailout; - } rpc_release_task(child); rpc_sleep_on(&xprt->binding, task, NULL, NULL); @@ -144,8 +141,10 @@ void rpc_getport(struct rpc_task *task) bailout: pmap_map_free(map); + xprt_put(xprt); bailout_nofree: - pmap_wake_portmap_waiters(xprt); + task->tk_status = status; + pmap_wake_portmap_waiters(xprt, status); } #ifdef CONFIG_ROOT_NFS @@ -201,29 +200,28 @@ int rpc_getport_external(struct sockaddr_in *sin, __u32 prog, __u32 vers, int pr static void pmap_getport_done(struct rpc_task *child, void *data) { struct portmap_args *map = data; - struct rpc_task *task = map->pm_task; - struct rpc_xprt *xprt = task->tk_xprt; + struct rpc_xprt *xprt = map->pm_xprt; int status = child->tk_status; if (status < 0) { /* Portmapper not available */ xprt->ops->set_port(xprt, 0); - task->tk_status = status; } else if (map->pm_port == 0) { /* Requested RPC service wasn't registered */ xprt->ops->set_port(xprt, 0); - task->tk_status = -EACCES; + status = -EACCES; } else { /* Succeeded */ xprt->ops->set_port(xprt, map->pm_port); xprt_set_bound(xprt); - task->tk_status = 0; + status = 0; } dprintk("RPC: %4d pmap_getport_done(status %d, port %u)\n", - child->tk_pid, child->tk_status, map->pm_port); + child->tk_pid, status, map->pm_port); - pmap_wake_portmap_waiters(xprt); + pmap_wake_portmap_waiters(xprt, status); + xprt_put(xprt); } /** -- cgit v0.10.2 From fd6840714d9cf6e93f1d42b904860a94df316b85 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Tue, 5 Sep 2006 12:27:44 -0400 Subject: NFS: nfs_lookup - don't hash dentry when optimising away the lookup If the open intents tell us that a given lookup is going to result in a, exclusive create, we currently optimize away the lookup call itself. The reason is that the lookup would not be atomic with the create RPC call, so why do it in the first place? A problem occurs, however, if the VFS aborts the exclusive create operation after the lookup, but before the call to create the file/directory: in this case we will end up with a hashed negative dentry in the dcache that has never been looked up. Fix this by only actually hashing the dentry once the create operation has been successfully completed. Signed-off-by: Trond Myklebust diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c index 51328ae..3419c2d 100644 --- a/fs/nfs/dir.c +++ b/fs/nfs/dir.c @@ -904,9 +904,15 @@ static struct dentry *nfs_lookup(struct inode *dir, struct dentry * dentry, stru lock_kernel(); - /* If we're doing an exclusive create, optimize away the lookup */ - if (nfs_is_exclusive_create(dir, nd)) - goto no_entry; + /* + * If we're doing an exclusive create, optimize away the lookup + * but don't hash the dentry. + */ + if (nfs_is_exclusive_create(dir, nd)) { + d_instantiate(dentry, NULL); + res = NULL; + goto out_unlock; + } error = NFS_PROTO(dir)->lookup(dir, &dentry->d_name, &fhandle, &fattr); if (error == -ENOENT) @@ -1161,6 +1167,8 @@ int nfs_instantiate(struct dentry *dentry, struct nfs_fh *fhandle, if (IS_ERR(inode)) return error; d_instantiate(dentry, inode); + if (d_unhashed(dentry)) + d_rehash(dentry); return 0; } -- cgit v0.10.2 From 2dec51466a08ac1c67da41bfd0518d43d983a2eb Mon Sep 17 00:00:00 2001 From: "J. Bruce Fields" Date: Tue, 12 Sep 2006 11:53:23 -0400 Subject: NFSv4: It's perfectly legal for clp to be NULL here.... Signed-off-by: J. Bruce Fields Signed-off-by: Trond Myklebust diff --git a/fs/nfs/client.c b/fs/nfs/client.c index 110f80e..ec1938d 100644 --- a/fs/nfs/client.c +++ b/fs/nfs/client.c @@ -269,7 +269,7 @@ struct nfs_client *nfs_find_client(const struct sockaddr_in *addr, int nfsversio clp = __nfs_find_client(addr, nfsversion); spin_unlock(&nfs_client_lock); - BUG_ON(clp->cl_cons_state == 0); + BUG_ON(clp && clp->cl_cons_state == 0); return clp; } -- cgit v0.10.2 From 5f004cf2aa8494708fd8d78e78142b7b2748e765 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Thu, 14 Sep 2006 14:03:14 -0400 Subject: NFS: Make read() return an ESTALE if the file has been deleted Currently, a read() request will return EIO even if the file has been deleted on the server, simply because that is what the VM will return if the call to readpage() fails to update the page. Ensure that readpage() marks the inode as stale if it receives an ESTALE. Then return that error to userland. Signed-off-by: Trond Myklebust diff --git a/fs/nfs/read.c b/fs/nfs/read.c index dae33c1..69f1549 100644 --- a/fs/nfs/read.c +++ b/fs/nfs/read.c @@ -568,8 +568,13 @@ int nfs_readpage_result(struct rpc_task *task, struct nfs_read_data *data) nfs_add_stats(data->inode, NFSIOS_SERVERREADBYTES, resp->count); - /* Is this a short read? */ - if (task->tk_status >= 0 && resp->count < argp->count && !resp->eof) { + if (task->tk_status < 0) { + if (task->tk_status == -ESTALE) { + set_bit(NFS_INO_STALE, &NFS_FLAGS(data->inode)); + nfs_mark_for_revalidate(data->inode); + } + } else if (resp->count < argp->count && !resp->eof) { + /* This is a short read! */ nfs_inc_stats(data->inode, NFSIOS_SHORTREAD); /* Has the server at least made some progress? */ if (resp->count != 0) { @@ -616,6 +621,10 @@ int nfs_readpage(struct file *file, struct page *page) if (error) goto out_error; + error = -ESTALE; + if (NFS_STALE(inode)) + goto out_error; + if (file == NULL) { ctx = nfs_find_open_context(inode, NULL, FMODE_READ); if (ctx == NULL) @@ -678,7 +687,7 @@ int nfs_readpages(struct file *filp, struct address_space *mapping, }; struct inode *inode = mapping->host; struct nfs_server *server = NFS_SERVER(inode); - int ret; + int ret = -ESTALE; dprintk("NFS: nfs_readpages (%s/%Ld %d)\n", inode->i_sb->s_id, @@ -686,6 +695,9 @@ int nfs_readpages(struct file *filp, struct address_space *mapping, nr_pages); nfs_inc_stats(inode, NFSIOS_VFSREADPAGES); + if (NFS_STALE(inode)) + goto out; + if (filp == NULL) { desc.ctx = nfs_find_open_context(inode, NULL, FMODE_READ); if (desc.ctx == NULL) @@ -701,6 +713,7 @@ int nfs_readpages(struct file *filp, struct address_space *mapping, ret = err; } put_nfs_open_context(desc.ctx); +out: return ret; } -- cgit v0.10.2 From 97db8f41792839a6912fd21be8b61dd6c50db58f Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Thu, 14 Sep 2006 14:03:14 -0400 Subject: NFS: Don't invalidate the symlink we just stuffed into the cache And slight optimisation of nfs_end_data_update(): directories never have delegations anyway. Signed-off-by: Trond Myklebust diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index cb5c65f..a56add0 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -717,13 +717,11 @@ void nfs_end_data_update(struct inode *inode) { struct nfs_inode *nfsi = NFS_I(inode); - if (!nfs_have_delegation(inode, FMODE_READ)) { - /* Directories and symlinks: invalidate page cache */ - if (S_ISDIR(inode->i_mode) || S_ISLNK(inode->i_mode)) { - spin_lock(&inode->i_lock); - nfsi->cache_validity |= NFS_INO_INVALID_DATA; - spin_unlock(&inode->i_lock); - } + /* Directories: invalidate page cache */ + if (S_ISDIR(inode->i_mode)) { + spin_lock(&inode->i_lock); + nfsi->cache_validity |= NFS_INO_INVALID_DATA; + spin_unlock(&inode->i_lock); } nfsi->cache_change_attribute = jiffies; atomic_dec(&nfsi->data_updates); -- cgit v0.10.2 From 6b30954ebb569fa1b2abdb21f2f4290eec76bf80 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Thu, 14 Sep 2006 14:03:14 -0400 Subject: NFSv4: Retry lease recovery if it failed during a synchronous operation. Signed-off-by: Trond Myklebust diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 7f60beb..c218cc4 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -793,10 +793,17 @@ out: int nfs4_recover_expired_lease(struct nfs_server *server) { struct nfs_client *clp = server->nfs_client; + int ret; - if (test_and_clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state)) + for (;;) { + ret = nfs4_wait_clnt_recover(server->client, clp); + if (ret != 0) + return ret; + if (!test_and_clear_bit(NFS4CLNT_LEASE_EXPIRED, &clp->cl_state)) + break; nfs4_schedule_state_recovery(clp); - return nfs4_wait_clnt_recover(server->client, clp); + } + return 0; } /* -- cgit v0.10.2 From c514983d8d2260020543a81589a2b8c7d4bdab4e Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Fri, 15 Sep 2006 08:25:04 -0400 Subject: NFSv4: Handle the condition NFS4ERR_FILE_OPEN Retry a few times before we give up: the error is usually due to ordering issues with asynchronous RPC calls. Signed-off-by: Trond Myklebust diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index c218cc4..c49ac3e 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -2891,6 +2891,7 @@ int nfs4_handle_exception(const struct nfs_server *server, int errorcode, struct if (ret == 0) exception->retry = 1; break; + case -NFS4ERR_FILE_OPEN: case -NFS4ERR_GRACE: case -NFS4ERR_DELAY: ret = nfs4_delay(server->client, &exception->timeout); -- cgit v0.10.2 From 2066fe89b459c3c787c811b3369df191cddd93d8 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Fri, 15 Sep 2006 08:30:46 -0400 Subject: NFSv4: Poll more aggressively when handling NFS4ERR_DELAY Change the initial retry delay from 1s to 0.1s (and then back off exponentially). Signed-off-by: Trond Myklebust diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index c49ac3e..47c7e6e 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -55,7 +55,7 @@ #define NFSDBG_FACILITY NFSDBG_PROC -#define NFS4_POLL_RETRY_MIN (1*HZ) +#define NFS4_POLL_RETRY_MIN (HZ/10) #define NFS4_POLL_RETRY_MAX (15*HZ) struct nfs4_opendata; -- cgit v0.10.2 From 51b6ded4d9a94a61035deba1d8f51a54e3a3dd86 Mon Sep 17 00:00:00 2001 From: Trond Myklebust Date: Fri, 15 Sep 2006 16:31:56 -0400 Subject: NFSv4: When mounting with a port=0 argument, substitute port=2049 RFC3530 states that the registered port 2049 for the NFS protocol should be the default configuration in order to allow clients not to use the RPC binding protocols. If the mount program sends us a port=0, we therefore substitute port=2049. Signed-off-by: Trond Myklebust diff --git a/fs/nfs/super.c b/fs/nfs/super.c index 665949d..b99113b 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -833,6 +833,9 @@ static int nfs4_get_sb(struct file_system_type *fs_type, __FUNCTION__); return -EINVAL; } + /* RFC3530: The default port for NFS is 2049 */ + if (addr.sin_port == 0) + addr.sin_port = NFS_PORT; /* Grab the authentication type */ authflavour = RPC_AUTH_UNIX; -- cgit v0.10.2 From aec5e175288c711cbe44750276f61efa3fa3d370 Mon Sep 17 00:00:00 2001 From: Josef 'Jeff' Sipek Date: Sat, 16 Sep 2006 21:09:32 -0400 Subject: NFS: Use SEEK_END instead of hardcoded value Signed-off-by: Josef 'Jeff' Sipek Signed-off-by: Trond Myklebust diff --git a/fs/nfs/file.c b/fs/nfs/file.c index a146ed3..be997d6 100644 --- a/fs/nfs/file.c +++ b/fs/nfs/file.c @@ -157,7 +157,7 @@ force_reval: static loff_t nfs_file_llseek(struct file *filp, loff_t offset, int origin) { /* origin == SEEK_END => we must revalidate the cached file length */ - if (origin == 2) { + if (origin == SEEK_END) { struct inode *inode = filp->f_mapping->host; int retval = nfs_revalidate_file_size(inode, filp); if (retval < 0) -- cgit v0.10.2 From a53a3c58fd83e572a7c768d88b4c4e9840a57e82 Mon Sep 17 00:00:00 2001 From: Steve Dickson Date: Wed, 6 Sep 2006 11:51:21 -0400 Subject: NFSv4: rpc_mkpipe creating socket inodes w/out sk buffers This patch stop rpc_mkpipe from create S_IFSOCK nodes what don't have associated sk buffers attached (which causes SELinux to oops during NFSv4 mounts). Instead the S_IFIFO mode bit is set which probably make more sense and seems to work just fine during my connectathon and fsx testing... Signed-off-by: Steve Dickson Signed-off-by: Trond Myklebust diff --git a/net/sunrpc/rpc_pipe.c b/net/sunrpc/rpc_pipe.c index 11ec12a..dfa504f 100644 --- a/net/sunrpc/rpc_pipe.c +++ b/net/sunrpc/rpc_pipe.c @@ -720,7 +720,7 @@ rpc_mkpipe(struct dentry *parent, const char *name, void *private, struct rpc_pi if (IS_ERR(dentry)) return dentry; dir = parent->d_inode; - inode = rpc_get_inode(dir->i_sb, S_IFSOCK | S_IRUSR | S_IWUSR); + inode = rpc_get_inode(dir->i_sb, S_IFIFO | S_IRUSR | S_IWUSR); if (!inode) goto err_dput; inode->i_ino = iunique(dir->i_sb, 100); -- cgit v0.10.2 From f551e44ff11d3e2ec8f37907bb88ec2433cc8b74 Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Wed, 20 Sep 2006 14:33:04 -0400 Subject: NFS: add comments clarifying the use of nfs_post_op_update() Comments-only change to clarify a detail of the NFS protocol and how it is implemented in Linux. Test plan: None. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c index a56add0..e8c143d 100644 --- a/fs/nfs/inode.c +++ b/fs/nfs/inode.c @@ -840,6 +840,12 @@ int nfs_refresh_inode(struct inode *inode, struct nfs_fattr *fattr) * * After an operation that has changed the inode metadata, mark the * attribute cache as being invalid, then try to update it. + * + * NB: if the server didn't return any post op attributes, this + * function will force the retrieval of attributes before the next + * NFS request. Thus it should be used only for operations that + * are expected to change one or more attributes, to avoid + * unnecessary NFS requests and trips through nfs_update_inode(). */ int nfs_post_op_update_inode(struct inode *inode, struct nfs_fattr *fattr) { diff --git a/fs/nfs/write.c b/fs/nfs/write.c index 38ba5c0..c12effb 100644 --- a/fs/nfs/write.c +++ b/fs/nfs/write.c @@ -1253,7 +1253,13 @@ int nfs_writeback_done(struct rpc_task *task, struct nfs_write_data *data) dprintk("NFS: %4d nfs_writeback_done (status %d)\n", task->tk_pid, task->tk_status); - /* Call the NFS version-specific code */ + /* + * ->write_done will attempt to use post-op attributes to detect + * conflicting writes by other clients. A strict interpretation + * of close-to-open would allow us to continue caching even if + * another writer had changed the file, but some applications + * depend on tighter cache coherency when writing. + */ status = NFS_PROTO(data->inode)->write_done(task, data); if (status != 0) return status; -- cgit v0.10.2 From 026ed5c9185dcc4b2df92e98c3d61a01cea19cbf Mon Sep 17 00:00:00 2001 From: Chuck Lever Date: Wed, 20 Sep 2006 14:33:07 -0400 Subject: NFS: unmark NFS direct I/O as experimental Remove the EXPERIMENTAL flag from the NFS_DIRECTIO option. Test plan: Unset the EXPERIMENTAL kernel build option and check to see that the NFS direct I/O option is still available. Signed-off-by: Chuck Lever Signed-off-by: Trond Myklebust diff --git a/fs/Kconfig b/fs/Kconfig index 5305816..a270026 100644 --- a/fs/Kconfig +++ b/fs/Kconfig @@ -1471,8 +1471,8 @@ config NFS_V4 If unsure, say N. config NFS_DIRECTIO - bool "Allow direct I/O on NFS files (EXPERIMENTAL)" - depends on NFS_FS && EXPERIMENTAL + bool "Allow direct I/O on NFS files" + depends on NFS_FS help This option enables applications to perform uncached I/O on files in NFS file systems using the O_DIRECT open() flag. When O_DIRECT -- cgit v0.10.2 From 42750b04c5baa7c5ffdf0a8be2b9b320efdf069f Mon Sep 17 00:00:00 2001 From: Jaroslav Kysela Date: Thu, 1 Jun 2006 18:34:01 +0200 Subject: [ALSA] Control API - TLV implementation for additional information like dB scale This patch implements a TLV mechanism to transfer an additional information like dB scale to the user space. The types might be extended in future. Acked-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/include/sound/asound.h b/include/sound/asound.h index 41885f4..76a2040 100644 --- a/include/sound/asound.h +++ b/include/sound/asound.h @@ -688,7 +688,7 @@ struct snd_timer_tread { * * ****************************************************************************/ -#define SNDRV_CTL_VERSION SNDRV_PROTOCOL_VERSION(2, 0, 3) +#define SNDRV_CTL_VERSION SNDRV_PROTOCOL_VERSION(2, 0, 4) struct snd_ctl_card_info { int card; /* card number */ @@ -818,6 +818,12 @@ struct snd_ctl_elem_value { unsigned char reserved[128-sizeof(struct timespec)]; }; +struct snd_ctl_tlv { + unsigned int numid; /* control element numeric identification */ + unsigned int length; /* in bytes aligned to 4 */ + unsigned int tlv[0]; /* first TLV */ +}; + enum { SNDRV_CTL_IOCTL_PVERSION = _IOR('U', 0x00, int), SNDRV_CTL_IOCTL_CARD_INFO = _IOR('U', 0x01, struct snd_ctl_card_info), @@ -831,6 +837,7 @@ enum { SNDRV_CTL_IOCTL_ELEM_ADD = _IOWR('U', 0x17, struct snd_ctl_elem_info), SNDRV_CTL_IOCTL_ELEM_REPLACE = _IOWR('U', 0x18, struct snd_ctl_elem_info), SNDRV_CTL_IOCTL_ELEM_REMOVE = _IOWR('U', 0x19, struct snd_ctl_elem_id), + SNDRV_CTL_IOCTL_TLV_READ = _IOWR('U', 0x1a, struct snd_ctl_tlv), SNDRV_CTL_IOCTL_HWDEP_NEXT_DEVICE = _IOWR('U', 0x20, int), SNDRV_CTL_IOCTL_HWDEP_INFO = _IOR('U', 0x21, struct snd_hwdep_info), SNDRV_CTL_IOCTL_PCM_NEXT_DEVICE = _IOR('U', 0x30, int), diff --git a/include/sound/control.h b/include/sound/control.h index 2489b1e..a93a58d 100644 --- a/include/sound/control.h +++ b/include/sound/control.h @@ -42,6 +42,7 @@ struct snd_kcontrol_new { snd_kcontrol_info_t *info; snd_kcontrol_get_t *get; snd_kcontrol_put_t *put; + unsigned int *tlv; unsigned long private_value; }; @@ -58,6 +59,7 @@ struct snd_kcontrol { snd_kcontrol_info_t *info; snd_kcontrol_get_t *get; snd_kcontrol_put_t *put; + unsigned int *tlv; unsigned long private_value; void *private_data; void (*private_free)(struct snd_kcontrol *kcontrol); diff --git a/include/sound/tlv.h b/include/sound/tlv.h new file mode 100644 index 0000000..b826e1d --- /dev/null +++ b/include/sound/tlv.h @@ -0,0 +1,43 @@ +#ifndef __SOUND_TLV_H +#define __SOUND_TLV_H + +/* + * Advanced Linux Sound Architecture - ALSA - Driver + * Copyright (c) 2006 by Jaroslav Kysela + * + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + */ + +/* + * TLV structure is right behind the struct snd_ctl_tlv: + * unsigned int type - see SNDRV_CTL_TLVT_* + * unsigned int length + * .... data aligned to sizeof(unsigned int), use + * block_length = (length + (sizeof(unsigned int) - 1)) & + * ~(sizeof(unsigned int) - 1)) .... + */ + +#define SNDRV_CTL_TLVT_CONTAINER 0 /* one level down - group of TLVs */ +#define SNDRV_CTL_TLVT_DB_SCALE 1 /* dB scale */ + +#define DECLARE_TLV_DB_SCALE(name, min, step, mute) \ +unsigned int name[] = { \ + SNDRV_CTL_TLVT_DB_SCALE, 2 * sizeof(unsigned int), \ + (min), ((step) & 0xffff) | ((mute) ? 0x10000 : 0) \ +} + +#endif /* __SOUND_TLV_H */ diff --git a/sound/core/control.c b/sound/core/control.c index bb397ea..e9c8854 100644 --- a/sound/core/control.c +++ b/sound/core/control.c @@ -241,6 +241,7 @@ struct snd_kcontrol *snd_ctl_new1(const struct snd_kcontrol_new *ncontrol, kctl.info = ncontrol->info; kctl.get = ncontrol->get; kctl.put = ncontrol->put; + kctl.tlv = ncontrol->tlv; kctl.private_value = ncontrol->private_value; kctl.private_data = private_data; return snd_ctl_new(&kctl, access); @@ -1067,6 +1068,40 @@ static int snd_ctl_subscribe_events(struct snd_ctl_file *file, int __user *ptr) return 0; } +static int snd_ctl_tlv_read(struct snd_card *card, + struct snd_ctl_tlv __user *_tlv) +{ + struct snd_ctl_tlv tlv; + struct snd_kcontrol *kctl; + unsigned int len; + int err = 0; + + if (copy_from_user(&tlv, _tlv, sizeof(tlv))) + return -EFAULT; + if (tlv.length < sizeof(unsigned int) * 3) + return -EINVAL; + down_read(&card->controls_rwsem); + kctl = snd_ctl_find_numid(card, tlv.numid); + if (kctl == NULL) { + err = -ENOENT; + goto __kctl_end; + } + if (kctl->tlv == NULL) { + err = -ENXIO; + goto __kctl_end; + } + len = kctl->tlv[1] + 2 * sizeof(unsigned int); + if (tlv.length < len) { + err = -ENOMEM; + goto __kctl_end; + } + if (copy_to_user(_tlv->tlv, kctl->tlv, len)) + err = -EFAULT; + __kctl_end: + up_read(&card->controls_rwsem); + return err; +} + static long snd_ctl_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { struct snd_ctl_file *ctl; @@ -1086,11 +1121,11 @@ static long snd_ctl_ioctl(struct file *file, unsigned int cmd, unsigned long arg case SNDRV_CTL_IOCTL_CARD_INFO: return snd_ctl_card_info(card, ctl, cmd, argp); case SNDRV_CTL_IOCTL_ELEM_LIST: - return snd_ctl_elem_list(ctl->card, argp); + return snd_ctl_elem_list(card, argp); case SNDRV_CTL_IOCTL_ELEM_INFO: return snd_ctl_elem_info_user(ctl, argp); case SNDRV_CTL_IOCTL_ELEM_READ: - return snd_ctl_elem_read_user(ctl->card, argp); + return snd_ctl_elem_read_user(card, argp); case SNDRV_CTL_IOCTL_ELEM_WRITE: return snd_ctl_elem_write_user(ctl, argp); case SNDRV_CTL_IOCTL_ELEM_LOCK: @@ -1105,6 +1140,8 @@ static long snd_ctl_ioctl(struct file *file, unsigned int cmd, unsigned long arg return snd_ctl_elem_remove(ctl, argp); case SNDRV_CTL_IOCTL_SUBSCRIBE_EVENTS: return snd_ctl_subscribe_events(ctl, ip); + case SNDRV_CTL_IOCTL_TLV_READ: + return snd_ctl_tlv_read(card, argp); case SNDRV_CTL_IOCTL_POWER: return -ENOPROTOOPT; case SNDRV_CTL_IOCTL_POWER_STATE: diff --git a/sound/pci/ca0106/ca0106_mixer.c b/sound/pci/ca0106/ca0106_mixer.c index 146eed70d..35309b3 100644 --- a/sound/pci/ca0106/ca0106_mixer.c +++ b/sound/pci/ca0106/ca0106_mixer.c @@ -70,9 +70,12 @@ #include #include #include +#include #include "ca0106.h" +static DECLARE_TLV_DB_SCALE(snd_ca0106_db_scale, -5150, 75, 1); + static int snd_ca0106_shared_spdif_info(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo) { @@ -472,6 +475,7 @@ static int snd_ca0106_i2c_volume_put(struct snd_kcontrol *kcontrol, .info = snd_ca0106_volume_info, \ .get = snd_ca0106_volume_get, \ .put = snd_ca0106_volume_put, \ + .tlv = snd_ca0106_db_scale, \ .private_value = ((chid) << 8) | (reg) \ } -- cgit v0.10.2 From 746d4a02e68499fc6c1f8d0c43d2271853ade181 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 23 Jun 2006 14:37:59 +0200 Subject: [ALSA] Fix disconnection of proc interface - Add the linked list to each proc entry to enable a single-shot disconnection (unregister) - Deprecate snd_info_unregister(), use snd_info_free_entry() - Removed NULL checks of snd_info_free_entry() Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/include/sound/info.h b/include/sound/info.h index 74f69967..97ffc4f 100644 --- a/include/sound/info.h +++ b/include/sound/info.h @@ -71,7 +71,6 @@ struct snd_info_entry { mode_t mode; long size; unsigned short content; - unsigned short disconnected: 1; union { struct snd_info_entry_text text; struct snd_info_entry_ops *ops; @@ -83,6 +82,8 @@ struct snd_info_entry { void (*private_free)(struct snd_info_entry *entry); struct proc_dir_entry *p; struct mutex access; + struct list_head children; + struct list_head list; }; #if defined(CONFIG_SND_OSSEMUL) && defined(CONFIG_PROC_FS) @@ -122,8 +123,8 @@ int snd_info_restore_text(struct snd_info_entry * entry); int snd_info_card_create(struct snd_card * card); int snd_info_card_register(struct snd_card * card); int snd_info_card_free(struct snd_card * card); +void snd_info_card_disconnect(struct snd_card * card); int snd_info_register(struct snd_info_entry * entry); -int snd_info_unregister(struct snd_info_entry * entry); /* for card drivers */ int snd_card_proc_new(struct snd_card *card, const char *name, struct snd_info_entry **entryp); @@ -156,8 +157,8 @@ static inline void snd_info_free_entry(struct snd_info_entry * entry) { ; } static inline int snd_info_card_create(struct snd_card * card) { return 0; } static inline int snd_info_card_register(struct snd_card * card) { return 0; } static inline int snd_info_card_free(struct snd_card * card) { return 0; } +static inline void snd_info_card_disconnect(struct snd_card * card) { } static inline int snd_info_register(struct snd_info_entry * entry) { return 0; } -static inline int snd_info_unregister(struct snd_info_entry * entry) { return 0; } static inline int snd_card_proc_new(struct snd_card *card, const char *name, struct snd_info_entry **entryp) { return -EINVAL; } diff --git a/sound/core/hwdep.c b/sound/core/hwdep.c index 8bd0dcc..cbd8a63 100644 --- a/sound/core/hwdep.c +++ b/sound/core/hwdep.c @@ -497,7 +497,7 @@ static void __init snd_hwdep_proc_init(void) static void __exit snd_hwdep_proc_done(void) { - snd_info_unregister(snd_hwdep_proc_entry); + snd_info_free_entry(snd_hwdep_proc_entry); } #else /* !CONFIG_PROC_FS */ #define snd_hwdep_proc_init() diff --git a/sound/core/info.c b/sound/core/info.c index 340332c..9663b6b 100644 --- a/sound/core/info.c +++ b/sound/core/info.c @@ -78,6 +78,7 @@ struct snd_info_private_data { static int snd_info_version_init(void); static int snd_info_version_done(void); +static void snd_info_disconnect(struct snd_info_entry *entry); /* resize the proc r/w buffer */ @@ -304,7 +305,7 @@ static int snd_info_entry_open(struct inode *inode, struct file *file) mutex_lock(&info_mutex); p = PDE(inode); entry = p == NULL ? NULL : (struct snd_info_entry *)p->data; - if (entry == NULL || entry->disconnected) { + if (entry == NULL || ! entry->p) { mutex_unlock(&info_mutex); return -ENODEV; } @@ -586,10 +587,10 @@ int __exit snd_info_done(void) snd_info_version_done(); if (snd_proc_root) { #if defined(CONFIG_SND_SEQUENCER) || defined(CONFIG_SND_SEQUENCER_MODULE) - snd_info_unregister(snd_seq_root); + snd_info_free_entry(snd_seq_root); #endif #ifdef CONFIG_SND_OSSEMUL - snd_info_unregister(snd_oss_root); + snd_info_free_entry(snd_oss_root); #endif snd_remove_proc_entry(&proc_root, snd_proc_root); } @@ -648,17 +649,28 @@ int snd_info_card_register(struct snd_card *card) * de-register the card proc file * called from init.c */ -int snd_info_card_free(struct snd_card *card) +void snd_info_card_disconnect(struct snd_card *card) { - snd_assert(card != NULL, return -ENXIO); + snd_assert(card != NULL, return); + mutex_lock(&info_mutex); if (card->proc_root_link) { snd_remove_proc_entry(snd_proc_root, card->proc_root_link); card->proc_root_link = NULL; } - if (card->proc_root) { - snd_info_unregister(card->proc_root); - card->proc_root = NULL; - } + if (card->proc_root) + snd_info_disconnect(card->proc_root); + mutex_unlock(&info_mutex); +} + +/* + * release the card proc file resources + * called from init.c + */ +int snd_info_card_free(struct snd_card *card) +{ + snd_assert(card != NULL, return -ENXIO); + snd_info_free_entry(card->proc_root); + card->proc_root = NULL; return 0; } @@ -767,6 +779,8 @@ static struct snd_info_entry *snd_info_create_entry(const char *name) entry->mode = S_IFREG | S_IRUGO; entry->content = SNDRV_INFO_CONTENT_TEXT; mutex_init(&entry->access); + INIT_LIST_HEAD(&entry->children); + INIT_LIST_HEAD(&entry->list); return entry; } @@ -819,30 +833,35 @@ struct snd_info_entry *snd_info_create_card_entry(struct snd_card *card, EXPORT_SYMBOL(snd_info_create_card_entry); -static int snd_info_dev_free_entry(struct snd_device *device) +static void snd_info_disconnect(struct snd_info_entry *entry) { - struct snd_info_entry *entry = device->device_data; - snd_info_free_entry(entry); - return 0; -} + struct list_head *p, *n; + struct proc_dir_entry *root; -static int snd_info_dev_register_entry(struct snd_device *device) -{ - struct snd_info_entry *entry = device->device_data; - return snd_info_register(entry); + list_for_each_safe(p, n, &entry->children) { + snd_info_disconnect(list_entry(p, struct snd_info_entry, list)); + } + + if (! entry->p) + return; + list_del_init(&entry->list); + root = entry->parent == NULL ? snd_proc_root : entry->parent->p; + snd_assert(root, return); + snd_remove_proc_entry(root, entry->p); + entry->p = NULL; } -static int snd_info_dev_disconnect_entry(struct snd_device *device) +static int snd_info_dev_free_entry(struct snd_device *device) { struct snd_info_entry *entry = device->device_data; - entry->disconnected = 1; + snd_info_free_entry(entry); return 0; } -static int snd_info_dev_unregister_entry(struct snd_device *device) +static int snd_info_dev_register_entry(struct snd_device *device) { struct snd_info_entry *entry = device->device_data; - return snd_info_unregister(entry); + return snd_info_register(entry); } /** @@ -871,8 +890,7 @@ int snd_card_proc_new(struct snd_card *card, const char *name, static struct snd_device_ops ops = { .dev_free = snd_info_dev_free_entry, .dev_register = snd_info_dev_register_entry, - .dev_disconnect = snd_info_dev_disconnect_entry, - .dev_unregister = snd_info_dev_unregister_entry + /* disconnect is done via snd_info_card_disconnect() */ }; struct snd_info_entry *entry; int err; @@ -901,6 +919,11 @@ void snd_info_free_entry(struct snd_info_entry * entry) { if (entry == NULL) return; + if (entry->p) { + mutex_lock(&info_mutex); + snd_info_disconnect(entry); + mutex_unlock(&info_mutex); + } kfree(entry->name); if (entry->private_free) entry->private_free(entry); @@ -935,38 +958,14 @@ int snd_info_register(struct snd_info_entry * entry) p->size = entry->size; p->data = entry; entry->p = p; + if (entry->parent) + list_add_tail(&entry->list, &entry->parent->children); mutex_unlock(&info_mutex); return 0; } EXPORT_SYMBOL(snd_info_register); -/** - * snd_info_unregister - de-register the info entry - * @entry: the info entry - * - * De-registers the info entry and releases the instance. - * - * Returns zero if successful, or a negative error code on failure. - */ -int snd_info_unregister(struct snd_info_entry * entry) -{ - struct proc_dir_entry *root; - - if (! entry) - return 0; - snd_assert(entry->p != NULL, return -ENXIO); - root = entry->parent == NULL ? snd_proc_root : entry->parent->p; - snd_assert(root, return -ENXIO); - mutex_lock(&info_mutex); - snd_remove_proc_entry(root, entry->p); - mutex_unlock(&info_mutex); - snd_info_free_entry(entry); - return 0; -} - -EXPORT_SYMBOL(snd_info_unregister); - /* */ @@ -999,8 +998,7 @@ static int __init snd_info_version_init(void) static int __exit snd_info_version_done(void) { - if (snd_info_version_entry) - snd_info_unregister(snd_info_version_entry); + snd_info_free_entry(snd_info_version_entry); return 0; } diff --git a/sound/core/info_oss.c b/sound/core/info_oss.c index bb2c40d..3ebc349 100644 --- a/sound/core/info_oss.c +++ b/sound/core/info_oss.c @@ -131,10 +131,8 @@ int snd_info_minor_register(void) int snd_info_minor_unregister(void) { - if (snd_sndstat_proc_entry) { - snd_info_unregister(snd_sndstat_proc_entry); - snd_sndstat_proc_entry = NULL; - } + snd_info_free_entry(snd_sndstat_proc_entry); + snd_sndstat_proc_entry = NULL; return 0; } diff --git a/sound/core/init.c b/sound/core/init.c index 4d92588..1ecb029 100644 --- a/sound/core/init.c +++ b/sound/core/init.c @@ -310,6 +310,7 @@ int snd_card_disconnect(struct snd_card *card) if (err < 0) snd_printk(KERN_ERR "not all devices for card %i can be disconnected\n", card->number); + snd_info_card_disconnect(card); return 0; } @@ -360,7 +361,7 @@ int snd_card_free(struct snd_card *card) } if (card->private_free) card->private_free(card); - snd_info_unregister(card->proc_id); + snd_info_free_entry(card->proc_id); if (snd_info_card_free(card) < 0) { snd_printk(KERN_WARNING "unable to free card info\n"); /* Not fatal error */ @@ -625,9 +626,9 @@ int __init snd_card_info_init(void) int __exit snd_card_info_done(void) { - snd_info_unregister(snd_card_info_entry); + snd_info_free_entry(snd_card_info_entry); #ifdef MODULE - snd_info_unregister(snd_card_module_info_entry); + snd_info_free_entry(snd_card_module_info_entry); #endif return 0; } diff --git a/sound/core/oss/mixer_oss.c b/sound/core/oss/mixer_oss.c index 75a9505..00c95de 100644 --- a/sound/core/oss/mixer_oss.c +++ b/sound/core/oss/mixer_oss.c @@ -1193,10 +1193,8 @@ static void snd_mixer_oss_proc_init(struct snd_mixer_oss *mixer) static void snd_mixer_oss_proc_done(struct snd_mixer_oss *mixer) { - if (mixer->proc_entry) { - snd_info_unregister(mixer->proc_entry); - mixer->proc_entry = NULL; - } + snd_info_free_entry(mixer->proc_entry); + mixer->proc_entry = NULL; } #else /* !CONFIG_PROC_FS */ #define snd_mixer_oss_proc_init(mix) diff --git a/sound/core/oss/pcm_oss.c b/sound/core/oss/pcm_oss.c index 472fce0..a92b93e 100644 --- a/sound/core/oss/pcm_oss.c +++ b/sound/core/oss/pcm_oss.c @@ -2846,11 +2846,9 @@ static void snd_pcm_oss_proc_done(struct snd_pcm *pcm) int stream; for (stream = 0; stream < 2; ++stream) { struct snd_pcm_str *pstr = &pcm->streams[stream]; - if (pstr->oss.proc_entry) { - snd_info_unregister(pstr->oss.proc_entry); - pstr->oss.proc_entry = NULL; - snd_pcm_oss_proc_free_setup_list(pstr); - } + snd_info_free_entry(pstr->oss.proc_entry); + pstr->oss.proc_entry = NULL; + snd_pcm_oss_proc_free_setup_list(pstr); } } #else /* !CONFIG_SND_VERBOSE_PROCFS */ diff --git a/sound/core/pcm.c b/sound/core/pcm.c index 7581edd..b860247 100644 --- a/sound/core/pcm.c +++ b/sound/core/pcm.c @@ -494,19 +494,13 @@ static int snd_pcm_stream_proc_init(struct snd_pcm_str *pstr) static int snd_pcm_stream_proc_done(struct snd_pcm_str *pstr) { #ifdef CONFIG_SND_PCM_XRUN_DEBUG - if (pstr->proc_xrun_debug_entry) { - snd_info_unregister(pstr->proc_xrun_debug_entry); - pstr->proc_xrun_debug_entry = NULL; - } + snd_info_free_entry(pstr->proc_xrun_debug_entry); + pstr->proc_xrun_debug_entry = NULL; #endif - if (pstr->proc_info_entry) { - snd_info_unregister(pstr->proc_info_entry); - pstr->proc_info_entry = NULL; - } - if (pstr->proc_root) { - snd_info_unregister(pstr->proc_root); - pstr->proc_root = NULL; - } + snd_info_free_entry(pstr->proc_info_entry); + pstr->proc_info_entry = NULL; + snd_info_free_entry(pstr->proc_root); + pstr->proc_root = NULL; return 0; } @@ -570,29 +564,19 @@ static int snd_pcm_substream_proc_init(struct snd_pcm_substream *substream) return 0; } - + static int snd_pcm_substream_proc_done(struct snd_pcm_substream *substream) { - if (substream->proc_info_entry) { - snd_info_unregister(substream->proc_info_entry); - substream->proc_info_entry = NULL; - } - if (substream->proc_hw_params_entry) { - snd_info_unregister(substream->proc_hw_params_entry); - substream->proc_hw_params_entry = NULL; - } - if (substream->proc_sw_params_entry) { - snd_info_unregister(substream->proc_sw_params_entry); - substream->proc_sw_params_entry = NULL; - } - if (substream->proc_status_entry) { - snd_info_unregister(substream->proc_status_entry); - substream->proc_status_entry = NULL; - } - if (substream->proc_root) { - snd_info_unregister(substream->proc_root); - substream->proc_root = NULL; - } + snd_info_free_entry(substream->proc_info_entry); + substream->proc_info_entry = NULL; + snd_info_free_entry(substream->proc_hw_params_entry); + substream->proc_hw_params_entry = NULL; + snd_info_free_entry(substream->proc_sw_params_entry); + substream->proc_sw_params_entry = NULL; + snd_info_free_entry(substream->proc_status_entry); + substream->proc_status_entry = NULL; + snd_info_free_entry(substream->proc_root); + substream->proc_root = NULL; return 0; } #else /* !CONFIG_SND_VERBOSE_PROCFS */ @@ -1090,8 +1074,7 @@ static void snd_pcm_proc_init(void) static void snd_pcm_proc_done(void) { - if (snd_pcm_proc_entry) - snd_info_unregister(snd_pcm_proc_entry); + snd_info_free_entry(snd_pcm_proc_entry); } #else /* !CONFIG_PROC_FS */ diff --git a/sound/core/pcm_memory.c b/sound/core/pcm_memory.c index 067d205..be030cb 100644 --- a/sound/core/pcm_memory.c +++ b/sound/core/pcm_memory.c @@ -101,7 +101,7 @@ int snd_pcm_lib_preallocate_free(struct snd_pcm_substream *substream) { snd_pcm_lib_preallocate_dma_free(substream); #ifdef CONFIG_SND_VERBOSE_PROCFS - snd_info_unregister(substream->proc_prealloc_entry); + snd_info_free_entry(substream->proc_prealloc_entry); substream->proc_prealloc_entry = NULL; #endif return 0; diff --git a/sound/core/rawmidi.c b/sound/core/rawmidi.c index 8c15c66..51577c2 100644 --- a/sound/core/rawmidi.c +++ b/sound/core/rawmidi.c @@ -1599,7 +1599,7 @@ static int snd_rawmidi_dev_unregister(struct snd_device *device) mutex_lock(®ister_mutex); list_del(&rmidi->list); if (rmidi->proc_entry) { - snd_info_unregister(rmidi->proc_entry); + snd_info_free_entry(rmidi->proc_entry); rmidi->proc_entry = NULL; } #ifdef CONFIG_SND_OSSEMUL diff --git a/sound/core/seq/oss/seq_oss.c b/sound/core/seq/oss/seq_oss.c index e723413..92858cf 100644 --- a/sound/core/seq/oss/seq_oss.c +++ b/sound/core/seq/oss/seq_oss.c @@ -303,8 +303,7 @@ register_proc(void) static void unregister_proc(void) { - if (info_entry) - snd_info_unregister(info_entry); + snd_info_free_entry(info_entry); info_entry = NULL; } #endif /* CONFIG_PROC_FS */ diff --git a/sound/core/seq/seq_device.c b/sound/core/seq/seq_device.c index 102ff54..b85954e 100644 --- a/sound/core/seq/seq_device.c +++ b/sound/core/seq/seq_device.c @@ -573,7 +573,7 @@ static void __exit alsa_seq_device_exit(void) { remove_drivers(); #ifdef CONFIG_PROC_FS - snd_info_unregister(info_entry); + snd_info_free_entry(info_entry); #endif if (num_ops) snd_printk(KERN_ERR "drivers not released (%d)\n", num_ops); diff --git a/sound/core/seq/seq_info.c b/sound/core/seq/seq_info.c index 142e9e6..8a7fe5c 100644 --- a/sound/core/seq/seq_info.c +++ b/sound/core/seq/seq_info.c @@ -64,9 +64,9 @@ int __init snd_seq_info_init(void) int __exit snd_seq_info_done(void) { - snd_info_unregister(queues_entry); - snd_info_unregister(clients_entry); - snd_info_unregister(timer_entry); + snd_info_free_entry(queues_entry); + snd_info_free_entry(clients_entry); + snd_info_free_entry(timer_entry); return 0; } #endif diff --git a/sound/core/sound.c b/sound/core/sound.c index 7edd1fc..b4430db 100644 --- a/sound/core/sound.c +++ b/sound/core/sound.c @@ -387,8 +387,7 @@ int __init snd_minor_info_init(void) int __exit snd_minor_info_done(void) { - if (snd_minor_info_entry) - snd_info_unregister(snd_minor_info_entry); + snd_info_free_entry(snd_minor_info_entry); return 0; } #endif /* CONFIG_PROC_FS */ diff --git a/sound/core/sound_oss.c b/sound/core/sound_oss.c index 74f0fe5..b2fc40a 100644 --- a/sound/core/sound_oss.c +++ b/sound/core/sound_oss.c @@ -270,8 +270,7 @@ int __init snd_minor_info_oss_init(void) int __exit snd_minor_info_oss_done(void) { - if (snd_minor_info_oss_entry) - snd_info_unregister(snd_minor_info_oss_entry); + snd_info_free_entry(snd_minor_info_oss_entry); return 0; } #endif /* CONFIG_PROC_FS */ diff --git a/sound/core/timer.c b/sound/core/timer.c index 0a984e8..52ecbe1 100644 --- a/sound/core/timer.c +++ b/sound/core/timer.c @@ -1126,7 +1126,7 @@ static void __init snd_timer_proc_init(void) static void __exit snd_timer_proc_done(void) { - snd_info_unregister(snd_timer_proc_entry); + snd_info_free_entry(snd_timer_proc_entry); } #else /* !CONFIG_PROC_FS */ #define snd_timer_proc_init() diff --git a/sound/drivers/opl4/opl4_proc.c b/sound/drivers/opl4/opl4_proc.c index e552ec3..11dd811 100644 --- a/sound/drivers/opl4/opl4_proc.c +++ b/sound/drivers/opl4/opl4_proc.c @@ -159,8 +159,7 @@ int snd_opl4_create_proc(struct snd_opl4 *opl4) void snd_opl4_free_proc(struct snd_opl4 *opl4) { - if (opl4->proc_entry) - snd_info_unregister(opl4->proc_entry); + snd_info_free_entry(opl4->proc_entry); } #endif /* CONFIG_PROC_FS */ diff --git a/sound/pci/ac97/ac97_proc.c b/sound/pci/ac97/ac97_proc.c index 2118df5..a3fdd7d 100644 --- a/sound/pci/ac97/ac97_proc.c +++ b/sound/pci/ac97/ac97_proc.c @@ -457,14 +457,10 @@ void snd_ac97_proc_init(struct snd_ac97 * ac97) void snd_ac97_proc_done(struct snd_ac97 * ac97) { - if (ac97->proc_regs) { - snd_info_unregister(ac97->proc_regs); - ac97->proc_regs = NULL; - } - if (ac97->proc) { - snd_info_unregister(ac97->proc); - ac97->proc = NULL; - } + snd_info_free_entry(ac97->proc_regs); + ac97->proc_regs = NULL; + snd_info_free_entry(ac97->proc); + ac97->proc = NULL; } void snd_ac97_bus_proc_init(struct snd_ac97_bus * bus) @@ -485,8 +481,6 @@ void snd_ac97_bus_proc_init(struct snd_ac97_bus * bus) void snd_ac97_bus_proc_done(struct snd_ac97_bus * bus) { - if (bus->proc) { - snd_info_unregister(bus->proc); - bus->proc = NULL; - } + snd_info_free_entry(bus->proc); + bus->proc = NULL; } diff --git a/sound/pci/cs46xx/dsp_spos.c b/sound/pci/cs46xx/dsp_spos.c index 5c9711c..89c4027 100644 --- a/sound/pci/cs46xx/dsp_spos.c +++ b/sound/pci/cs46xx/dsp_spos.c @@ -868,35 +868,23 @@ int cs46xx_dsp_proc_done (struct snd_cs46xx *chip) struct dsp_spos_instance * ins = chip->dsp_spos_instance; int i; - if (ins->proc_sym_info_entry) { - snd_info_unregister(ins->proc_sym_info_entry); - ins->proc_sym_info_entry = NULL; - } - - if (ins->proc_modules_info_entry) { - snd_info_unregister(ins->proc_modules_info_entry); - ins->proc_modules_info_entry = NULL; - } - - if (ins->proc_parameter_dump_info_entry) { - snd_info_unregister(ins->proc_parameter_dump_info_entry); - ins->proc_parameter_dump_info_entry = NULL; - } - - if (ins->proc_sample_dump_info_entry) { - snd_info_unregister(ins->proc_sample_dump_info_entry); - ins->proc_sample_dump_info_entry = NULL; - } - - if (ins->proc_scb_info_entry) { - snd_info_unregister(ins->proc_scb_info_entry); - ins->proc_scb_info_entry = NULL; - } - - if (ins->proc_task_info_entry) { - snd_info_unregister(ins->proc_task_info_entry); - ins->proc_task_info_entry = NULL; - } + snd_info_free_entry(ins->proc_sym_info_entry); + ins->proc_sym_info_entry = NULL; + + snd_info_free_entry(ins->proc_modules_info_entry); + ins->proc_modules_info_entry = NULL; + + snd_info_free_entry(ins->proc_parameter_dump_info_entry); + ins->proc_parameter_dump_info_entry = NULL; + + snd_info_free_entry(ins->proc_sample_dump_info_entry); + ins->proc_sample_dump_info_entry = NULL; + + snd_info_free_entry(ins->proc_scb_info_entry); + ins->proc_scb_info_entry = NULL; + + snd_info_free_entry(ins->proc_task_info_entry); + ins->proc_task_info_entry = NULL; mutex_lock(&chip->spos_mutex); for (i = 0; i < ins->nscb; ++i) { @@ -905,10 +893,8 @@ int cs46xx_dsp_proc_done (struct snd_cs46xx *chip) } mutex_unlock(&chip->spos_mutex); - if (ins->proc_dsp_dir) { - snd_info_unregister (ins->proc_dsp_dir); - ins->proc_dsp_dir = NULL; - } + snd_info_free_entry(ins->proc_dsp_dir); + ins->proc_dsp_dir = NULL; return 0; } diff --git a/sound/pci/cs46xx/dsp_spos_scb_lib.c b/sound/pci/cs46xx/dsp_spos_scb_lib.c index 232b337..343f51d 100644 --- a/sound/pci/cs46xx/dsp_spos_scb_lib.c +++ b/sound/pci/cs46xx/dsp_spos_scb_lib.c @@ -233,7 +233,7 @@ void cs46xx_dsp_proc_free_scb_desc (struct dsp_scb_descriptor * scb) snd_printdd("cs46xx_dsp_proc_free_scb_desc: freeing %s\n",scb->scb_name); - snd_info_unregister(scb->proc_info); + snd_info_free_entry(scb->proc_info); scb->proc_info = NULL; snd_assert (scb_info != NULL, return); diff --git a/sound/synth/emux/emux_proc.c b/sound/synth/emux/emux_proc.c index 58b9601..59144ec 100644 --- a/sound/synth/emux/emux_proc.c +++ b/sound/synth/emux/emux_proc.c @@ -128,10 +128,8 @@ void snd_emux_proc_init(struct snd_emux *emu, struct snd_card *card, int device) void snd_emux_proc_free(struct snd_emux *emu) { - if (emu->proc) { - snd_info_unregister(emu->proc); - emu->proc = NULL; - } + snd_info_free_entry(emu->proc); + emu->proc = NULL; } #endif /* CONFIG_PROC_FS */ -- cgit v0.10.2 From c461482c8072bb073e6146db320d3da85cdc89ad Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 23 Jun 2006 14:38:23 +0200 Subject: [ALSA] Unregister device files at disconnection Orignally proposed by Sam Revitch . Unregister device files at disconnection to avoid the futher accesses. Also, the dev_unregister callback is removed and replaced with the combination of disconnect + free. A new function snd_card_free_when_closed() is introduced, which is used in USB disconnect callback. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/include/sound/core.h b/include/sound/core.h index bab3ff4..cf4001c 100644 --- a/include/sound/core.h +++ b/include/sound/core.h @@ -71,7 +71,6 @@ struct snd_device_ops { int (*dev_free)(struct snd_device *dev); int (*dev_register)(struct snd_device *dev); int (*dev_disconnect)(struct snd_device *dev); - int (*dev_unregister)(struct snd_device *dev); }; struct snd_device { @@ -131,6 +130,7 @@ struct snd_card { state */ spinlock_t files_lock; /* lock the files for this card */ int shutdown; /* this card is going down */ + int free_on_last_close; /* free in context of file_release */ wait_queue_head_t shutdown_sleep; struct work_struct free_workq; /* for free in workqueue */ struct device *dev; @@ -244,6 +244,7 @@ struct snd_card *snd_card_new(int idx, const char *id, struct module *module, int extra_size); int snd_card_disconnect(struct snd_card *card); int snd_card_free(struct snd_card *card); +int snd_card_free_when_closed(struct snd_card *card); int snd_card_free_in_thread(struct snd_card *card); int snd_card_register(struct snd_card *card); int snd_card_info_init(void); diff --git a/include/sound/timer.h b/include/sound/timer.h index 5ece2bf..d42c083 100644 --- a/include/sound/timer.h +++ b/include/sound/timer.h @@ -129,7 +129,6 @@ void snd_timer_notify(struct snd_timer *timer, int event, struct timespec *tstam int snd_timer_global_new(char *id, int device, struct snd_timer **rtimer); int snd_timer_global_free(struct snd_timer *timer); int snd_timer_global_register(struct snd_timer *timer); -int snd_timer_global_unregister(struct snd_timer *timer); int snd_timer_open(struct snd_timer_instance **ti, char *owner, struct snd_timer_id *tid, unsigned int slave_id); int snd_timer_close(struct snd_timer_instance *timeri); diff --git a/sound/core/control.c b/sound/core/control.c index e9c8854..f0c7272 100644 --- a/sound/core/control.c +++ b/sound/core/control.c @@ -1375,6 +1375,11 @@ static int snd_ctl_dev_disconnect(struct snd_device *device) struct snd_card *card = device->device_data; struct list_head *flist; struct snd_ctl_file *ctl; + int err, cardnum; + + snd_assert(card != NULL, return -ENXIO); + cardnum = card->number; + snd_assert(cardnum >= 0 && cardnum < SNDRV_CARDS, return -ENXIO); down_read(&card->controls_rwsem); list_for_each(flist, &card->ctl_files) { @@ -1383,6 +1388,10 @@ static int snd_ctl_dev_disconnect(struct snd_device *device) kill_fasync(&ctl->fasync, SIGIO, POLL_ERR); } up_read(&card->controls_rwsem); + + if ((err = snd_unregister_device(SNDRV_DEVICE_TYPE_CONTROL, + card, -1)) < 0) + return err; return 0; } @@ -1404,23 +1413,6 @@ static int snd_ctl_dev_free(struct snd_device *device) } /* - * de-registration of the control device - */ -static int snd_ctl_dev_unregister(struct snd_device *device) -{ - struct snd_card *card = device->device_data; - int err, cardnum; - - snd_assert(card != NULL, return -ENXIO); - cardnum = card->number; - snd_assert(cardnum >= 0 && cardnum < SNDRV_CARDS, return -ENXIO); - if ((err = snd_unregister_device(SNDRV_DEVICE_TYPE_CONTROL, - card, -1)) < 0) - return err; - return snd_ctl_dev_free(device); -} - -/* * create control core: * called from init.c */ @@ -1430,7 +1422,6 @@ int snd_ctl_create(struct snd_card *card) .dev_free = snd_ctl_dev_free, .dev_register = snd_ctl_dev_register, .dev_disconnect = snd_ctl_dev_disconnect, - .dev_unregister = snd_ctl_dev_unregister }; snd_assert(card != NULL, return -ENXIO); diff --git a/sound/core/device.c b/sound/core/device.c index 6ce4da4..ccb2581 100644 --- a/sound/core/device.c +++ b/sound/core/device.c @@ -71,7 +71,7 @@ EXPORT_SYMBOL(snd_device_new); * @device_data: the data pointer to release * * Removes the device from the list on the card and invokes the - * callback, dev_unregister or dev_free, corresponding to the state. + * callbacks, dev_disconnect and dev_free, corresponding to the state. * Then release the device. * * Returns zero if successful, or a negative error code on failure or if the @@ -90,16 +90,14 @@ int snd_device_free(struct snd_card *card, void *device_data) continue; /* unlink */ list_del(&dev->list); - if ((dev->state == SNDRV_DEV_REGISTERED || - dev->state == SNDRV_DEV_DISCONNECTED) && - dev->ops->dev_unregister) { - if (dev->ops->dev_unregister(dev)) - snd_printk(KERN_ERR "device unregister failure\n"); - } else { - if (dev->ops->dev_free) { - if (dev->ops->dev_free(dev)) - snd_printk(KERN_ERR "device free failure\n"); - } + if (dev->state == SNDRV_DEV_REGISTERED && + dev->ops->dev_disconnect) + if (dev->ops->dev_disconnect(dev)) + snd_printk(KERN_ERR + "device disconnect failure\n"); + if (dev->ops->dev_free) { + if (dev->ops->dev_free(dev)) + snd_printk(KERN_ERR "device free failure\n"); } kfree(dev); return 0; diff --git a/sound/core/hwdep.c b/sound/core/hwdep.c index cbd8a63..9aa9d94 100644 --- a/sound/core/hwdep.c +++ b/sound/core/hwdep.c @@ -42,7 +42,7 @@ static DEFINE_MUTEX(register_mutex); static int snd_hwdep_free(struct snd_hwdep *hwdep); static int snd_hwdep_dev_free(struct snd_device *device); static int snd_hwdep_dev_register(struct snd_device *device); -static int snd_hwdep_dev_unregister(struct snd_device *device); +static int snd_hwdep_dev_disconnect(struct snd_device *device); static struct snd_hwdep *snd_hwdep_search(struct snd_card *card, int device) @@ -353,7 +353,7 @@ int snd_hwdep_new(struct snd_card *card, char *id, int device, static struct snd_device_ops ops = { .dev_free = snd_hwdep_dev_free, .dev_register = snd_hwdep_dev_register, - .dev_unregister = snd_hwdep_dev_unregister + .dev_disconnect = snd_hwdep_dev_disconnect, }; snd_assert(rhwdep != NULL, return -EINVAL); @@ -439,7 +439,7 @@ static int snd_hwdep_dev_register(struct snd_device *device) return 0; } -static int snd_hwdep_dev_unregister(struct snd_device *device) +static int snd_hwdep_dev_disconnect(struct snd_device *device) { struct snd_hwdep *hwdep = device->device_data; @@ -454,9 +454,9 @@ static int snd_hwdep_dev_unregister(struct snd_device *device) snd_unregister_oss_device(hwdep->oss_type, hwdep->card, hwdep->device); #endif snd_unregister_device(SNDRV_DEVICE_TYPE_HWDEP, hwdep->card, hwdep->device); - list_del(&hwdep->list); + list_del_init(&hwdep->list); mutex_unlock(®ister_mutex); - return snd_hwdep_free(hwdep); + return 0; } #ifdef CONFIG_PROC_FS diff --git a/sound/core/init.c b/sound/core/init.c index 1ecb029..5850d99 100644 --- a/sound/core/init.c +++ b/sound/core/init.c @@ -327,22 +327,10 @@ EXPORT_SYMBOL(snd_card_disconnect); * Returns zero. Frees all associated devices and frees the control * interface associated to given soundcard. */ -int snd_card_free(struct snd_card *card) +static int snd_card_do_free(struct snd_card *card) { struct snd_shutdown_f_ops *s_f_ops; - if (card == NULL) - return -EINVAL; - mutex_lock(&snd_card_mutex); - snd_cards[card->number] = NULL; - mutex_unlock(&snd_card_mutex); - -#ifdef CONFIG_PM - wake_up(&card->power_sleep); -#endif - /* wait, until all devices are ready for the free operation */ - wait_event(card->shutdown_sleep, card->files == NULL); - #if defined(CONFIG_SND_MIXER_OSS) || defined(CONFIG_SND_MIXER_OSS_MODULE) if (snd_mixer_oss_notify_callback) snd_mixer_oss_notify_callback(card, SND_MIXER_OSS_NOTIFY_FREE); @@ -371,10 +359,55 @@ int snd_card_free(struct snd_card *card) card->s_f_ops = s_f_ops->next; kfree(s_f_ops); } + kfree(card); + return 0; +} + +static int snd_card_free_prepare(struct snd_card *card) +{ + if (card == NULL) + return -EINVAL; + (void) snd_card_disconnect(card); mutex_lock(&snd_card_mutex); + snd_cards[card->number] = NULL; snd_cards_lock &= ~(1 << card->number); mutex_unlock(&snd_card_mutex); - kfree(card); +#ifdef CONFIG_PM + wake_up(&card->power_sleep); +#endif + return 0; +} + +int snd_card_free_when_closed(struct snd_card *card) +{ + int free_now = 0; + int ret = snd_card_free_prepare(card); + if (ret) + return ret; + + spin_lock(&card->files_lock); + if (card->files == NULL) + free_now = 1; + else + card->free_on_last_close = 1; + spin_unlock(&card->files_lock); + + if (free_now) + snd_card_do_free(card); + return 0; +} + +EXPORT_SYMBOL(snd_card_free_when_closed); + +int snd_card_free(struct snd_card *card) +{ + int ret = snd_card_free_prepare(card); + if (ret) + return ret; + + /* wait, until all devices are ready for the free operation */ + wait_event(card->shutdown_sleep, card->files == NULL); + snd_card_do_free(card); return 0; } @@ -718,6 +751,7 @@ EXPORT_SYMBOL(snd_card_file_add); int snd_card_file_remove(struct snd_card *card, struct file *file) { struct snd_monitor_file *mfile, *pfile = NULL; + int last_close = 0; spin_lock(&card->files_lock); mfile = card->files; @@ -732,9 +766,14 @@ int snd_card_file_remove(struct snd_card *card, struct file *file) pfile = mfile; mfile = mfile->next; } - spin_unlock(&card->files_lock); if (card->files == NULL) + last_close = 1; + spin_unlock(&card->files_lock); + if (last_close) { wake_up(&card->shutdown_sleep); + if (card->free_on_last_close) + snd_card_do_free(card); + } if (!mfile) { snd_printk(KERN_ERR "ALSA card file remove problem (%p)\n", file); return -ENOENT; diff --git a/sound/core/oss/mixer_oss.c b/sound/core/oss/mixer_oss.c index 00c95de..f4c6704 100644 --- a/sound/core/oss/mixer_oss.c +++ b/sound/core/oss/mixer_oss.c @@ -1310,21 +1310,19 @@ static int snd_mixer_oss_notify_handler(struct snd_card *card, int cmd) card->mixer_oss = mixer; snd_mixer_oss_build(mixer); snd_mixer_oss_proc_init(mixer); - } else if (cmd == SND_MIXER_OSS_NOTIFY_DISCONNECT) { - mixer = card->mixer_oss; - if (mixer == NULL || !mixer->oss_dev_alloc) - return 0; - snd_unregister_oss_device(SNDRV_OSS_DEVICE_TYPE_MIXER, mixer->card, 0); - mixer->oss_dev_alloc = 0; - } else { /* free */ + } else { mixer = card->mixer_oss; if (mixer == NULL) return 0; + if (mixer->oss_dev_alloc) { #ifdef SNDRV_OSS_INFO_DEV_MIXERS - snd_oss_info_unregister(SNDRV_OSS_INFO_DEV_MIXERS, mixer->card->number); + snd_oss_info_unregister(SNDRV_OSS_INFO_DEV_MIXERS, mixer->card->number); #endif - if (mixer->oss_dev_alloc) snd_unregister_oss_device(SNDRV_OSS_DEVICE_TYPE_MIXER, mixer->card, 0); + mixer->oss_dev_alloc = 0; + } + if (cmd == SND_MIXER_OSS_NOTIFY_DISCONNECT) + return 0; snd_mixer_oss_proc_done(mixer); return snd_mixer_oss_free1(mixer); } diff --git a/sound/core/oss/pcm_oss.c b/sound/core/oss/pcm_oss.c index a92b93e..505b23e 100644 --- a/sound/core/oss/pcm_oss.c +++ b/sound/core/oss/pcm_oss.c @@ -2929,25 +2929,23 @@ static int snd_pcm_oss_disconnect_minor(struct snd_pcm *pcm) snd_unregister_oss_device(SNDRV_OSS_DEVICE_TYPE_PCM, pcm->card, 1); } - } - return 0; -} - -static int snd_pcm_oss_unregister_minor(struct snd_pcm *pcm) -{ - snd_pcm_oss_disconnect_minor(pcm); - if (pcm->oss.reg) { if (dsp_map[pcm->card->number] == (int)pcm->device) { #ifdef SNDRV_OSS_INFO_DEV_AUDIO snd_oss_info_unregister(SNDRV_OSS_INFO_DEV_AUDIO, pcm->card->number); #endif } pcm->oss.reg = 0; - snd_pcm_oss_proc_done(pcm); } return 0; } +static int snd_pcm_oss_unregister_minor(struct snd_pcm *pcm) +{ + snd_pcm_oss_disconnect_minor(pcm); + snd_pcm_oss_proc_done(pcm); + return 0; +} + static struct snd_pcm_notify snd_pcm_oss_notify = { .n_register = snd_pcm_oss_register_minor, diff --git a/sound/core/pcm.c b/sound/core/pcm.c index b860247..f52178a 100644 --- a/sound/core/pcm.c +++ b/sound/core/pcm.c @@ -42,7 +42,6 @@ static int snd_pcm_free(struct snd_pcm *pcm); static int snd_pcm_dev_free(struct snd_device *device); static int snd_pcm_dev_register(struct snd_device *device); static int snd_pcm_dev_disconnect(struct snd_device *device); -static int snd_pcm_dev_unregister(struct snd_device *device); static struct snd_pcm *snd_pcm_search(struct snd_card *card, int device) { @@ -680,7 +679,6 @@ int snd_pcm_new(struct snd_card *card, char *id, int device, .dev_free = snd_pcm_dev_free, .dev_register = snd_pcm_dev_register, .dev_disconnect = snd_pcm_dev_disconnect, - .dev_unregister = snd_pcm_dev_unregister }; snd_assert(rpcm != NULL, return -EINVAL); @@ -724,6 +722,7 @@ static void snd_pcm_free_stream(struct snd_pcm_str * pstr) substream = pstr->substream; while (substream) { substream_next = substream->next; + snd_pcm_timer_done(substream); snd_pcm_substream_proc_done(substream); kfree(substream); substream = substream_next; @@ -740,7 +739,12 @@ static void snd_pcm_free_stream(struct snd_pcm_str * pstr) static int snd_pcm_free(struct snd_pcm *pcm) { + struct snd_pcm_notify *notify; + snd_assert(pcm != NULL, return -ENXIO); + list_for_each_entry(notify, &snd_pcm_notify_list, list) { + notify->n_unregister(pcm); + } if (pcm->private_free) pcm->private_free(pcm); snd_pcm_lib_preallocate_free_for_all(pcm); @@ -955,35 +959,22 @@ static int snd_pcm_dev_register(struct snd_device *device) static int snd_pcm_dev_disconnect(struct snd_device *device) { struct snd_pcm *pcm = device->device_data; - struct list_head *list; + struct snd_pcm_notify *notify; struct snd_pcm_substream *substream; - int cidx; + int cidx, devtype; mutex_lock(®ister_mutex); + if (list_empty(&pcm->list)) + goto unlock; + list_del_init(&pcm->list); for (cidx = 0; cidx < 2; cidx++) for (substream = pcm->streams[cidx].substream; substream; substream = substream->next) if (substream->runtime) substream->runtime->status->state = SNDRV_PCM_STATE_DISCONNECTED; - list_for_each(list, &snd_pcm_notify_list) { - struct snd_pcm_notify *notify; - notify = list_entry(list, struct snd_pcm_notify, list); + list_for_each_entry(notify, &snd_pcm_notify_list, list) { notify->n_disconnect(pcm); } - mutex_unlock(®ister_mutex); - return 0; -} - -static int snd_pcm_dev_unregister(struct snd_device *device) -{ - int cidx, devtype; - struct snd_pcm_substream *substream; - struct list_head *list; - struct snd_pcm *pcm = device->device_data; - - snd_assert(pcm != NULL, return -ENXIO); - mutex_lock(®ister_mutex); - list_del(&pcm->list); for (cidx = 0; cidx < 2; cidx++) { devtype = -1; switch (cidx) { @@ -995,23 +986,20 @@ static int snd_pcm_dev_unregister(struct snd_device *device) break; } snd_unregister_device(devtype, pcm->card, pcm->device); - for (substream = pcm->streams[cidx].substream; substream; substream = substream->next) - snd_pcm_timer_done(substream); - } - list_for_each(list, &snd_pcm_notify_list) { - struct snd_pcm_notify *notify; - notify = list_entry(list, struct snd_pcm_notify, list); - notify->n_unregister(pcm); } + unlock: mutex_unlock(®ister_mutex); - return snd_pcm_free(pcm); + return 0; } int snd_pcm_notify(struct snd_pcm_notify *notify, int nfree) { struct list_head *p; - snd_assert(notify != NULL && notify->n_register != NULL && notify->n_unregister != NULL, return -EINVAL); + snd_assert(notify != NULL && + notify->n_register != NULL && + notify->n_unregister != NULL && + notify->n_disconnect, return -EINVAL); mutex_lock(®ister_mutex); if (nfree) { list_del(¬ify->list); diff --git a/sound/core/rawmidi.c b/sound/core/rawmidi.c index 51577c2..8a2bdfa 100644 --- a/sound/core/rawmidi.c +++ b/sound/core/rawmidi.c @@ -55,7 +55,6 @@ static int snd_rawmidi_free(struct snd_rawmidi *rawmidi); static int snd_rawmidi_dev_free(struct snd_device *device); static int snd_rawmidi_dev_register(struct snd_device *device); static int snd_rawmidi_dev_disconnect(struct snd_device *device); -static int snd_rawmidi_dev_unregister(struct snd_device *device); static LIST_HEAD(snd_rawmidi_devices); static DEFINE_MUTEX(register_mutex); @@ -1426,7 +1425,6 @@ int snd_rawmidi_new(struct snd_card *card, char *id, int device, .dev_free = snd_rawmidi_dev_free, .dev_register = snd_rawmidi_dev_register, .dev_disconnect = snd_rawmidi_dev_disconnect, - .dev_unregister = snd_rawmidi_dev_unregister }; snd_assert(rrawmidi != NULL, return -EINVAL); @@ -1479,6 +1477,14 @@ static void snd_rawmidi_free_substreams(struct snd_rawmidi_str *stream) static int snd_rawmidi_free(struct snd_rawmidi *rmidi) { snd_assert(rmidi != NULL, return -ENXIO); + + snd_info_free_entry(rmidi->proc_entry); + rmidi->proc_entry = NULL; + mutex_lock(®ister_mutex); + if (rmidi->ops && rmidi->ops->dev_unregister) + rmidi->ops->dev_unregister(rmidi); + mutex_unlock(®ister_mutex); + snd_rawmidi_free_substreams(&rmidi->streams[SNDRV_RAWMIDI_STREAM_INPUT]); snd_rawmidi_free_substreams(&rmidi->streams[SNDRV_RAWMIDI_STREAM_OUTPUT]); if (rmidi->private_free) @@ -1587,21 +1593,6 @@ static int snd_rawmidi_dev_disconnect(struct snd_device *device) mutex_lock(®ister_mutex); list_del_init(&rmidi->list); - mutex_unlock(®ister_mutex); - return 0; -} - -static int snd_rawmidi_dev_unregister(struct snd_device *device) -{ - struct snd_rawmidi *rmidi = device->device_data; - - snd_assert(rmidi != NULL, return -ENXIO); - mutex_lock(®ister_mutex); - list_del(&rmidi->list); - if (rmidi->proc_entry) { - snd_info_free_entry(rmidi->proc_entry); - rmidi->proc_entry = NULL; - } #ifdef CONFIG_SND_OSSEMUL if (rmidi->ossreg) { if ((int)rmidi->device == midi_map[rmidi->card->number]) { @@ -1615,17 +1606,9 @@ static int snd_rawmidi_dev_unregister(struct snd_device *device) rmidi->ossreg = 0; } #endif /* CONFIG_SND_OSSEMUL */ - if (rmidi->ops && rmidi->ops->dev_unregister) - rmidi->ops->dev_unregister(rmidi); snd_unregister_device(SNDRV_DEVICE_TYPE_RAWMIDI, rmidi->card, rmidi->device); mutex_unlock(®ister_mutex); -#if defined(CONFIG_SND_SEQUENCER) || (defined(MODULE) && defined(CONFIG_SND_SEQUENCER_MODULE)) - if (rmidi->seq_dev) { - snd_device_free(rmidi->card, rmidi->seq_dev); - rmidi->seq_dev = NULL; - } -#endif - return snd_rawmidi_free(rmidi); + return 0; } /** diff --git a/sound/core/rtctimer.c b/sound/core/rtctimer.c index 84704cc..412dd62 100644 --- a/sound/core/rtctimer.c +++ b/sound/core/rtctimer.c @@ -156,7 +156,7 @@ static int __init rtctimer_init(void) static void __exit rtctimer_exit(void) { if (rtctimer) { - snd_timer_global_unregister(rtctimer); + snd_timer_global_free(rtctimer); rtctimer = NULL; } } diff --git a/sound/core/seq/seq_device.c b/sound/core/seq/seq_device.c index b85954e..b79d011 100644 --- a/sound/core/seq/seq_device.c +++ b/sound/core/seq/seq_device.c @@ -90,7 +90,6 @@ static int snd_seq_device_free(struct snd_seq_device *dev); static int snd_seq_device_dev_free(struct snd_device *device); static int snd_seq_device_dev_register(struct snd_device *device); static int snd_seq_device_dev_disconnect(struct snd_device *device); -static int snd_seq_device_dev_unregister(struct snd_device *device); static int init_device(struct snd_seq_device *dev, struct ops_list *ops); static int free_device(struct snd_seq_device *dev, struct ops_list *ops); @@ -189,7 +188,6 @@ int snd_seq_device_new(struct snd_card *card, int device, char *id, int argsize, .dev_free = snd_seq_device_dev_free, .dev_register = snd_seq_device_dev_register, .dev_disconnect = snd_seq_device_dev_disconnect, - .dev_unregister = snd_seq_device_dev_unregister }; if (result) @@ -309,15 +307,6 @@ static int snd_seq_device_dev_disconnect(struct snd_device *device) } /* - * unregister the existing device - */ -static int snd_seq_device_dev_unregister(struct snd_device *device) -{ - struct snd_seq_device *dev = device->device_data; - return snd_seq_device_free(dev); -} - -/* * register device driver * id = driver id * entry = driver operators - duplicated to each instance diff --git a/sound/core/timer.c b/sound/core/timer.c index 52ecbe1..7e5e562 100644 --- a/sound/core/timer.c +++ b/sound/core/timer.c @@ -88,7 +88,7 @@ static DEFINE_MUTEX(register_mutex); static int snd_timer_free(struct snd_timer *timer); static int snd_timer_dev_free(struct snd_device *device); static int snd_timer_dev_register(struct snd_device *device); -static int snd_timer_dev_unregister(struct snd_device *device); +static int snd_timer_dev_disconnect(struct snd_device *device); static void snd_timer_reschedule(struct snd_timer * timer, unsigned long ticks_left); @@ -773,7 +773,7 @@ int snd_timer_new(struct snd_card *card, char *id, struct snd_timer_id *tid, static struct snd_device_ops ops = { .dev_free = snd_timer_dev_free, .dev_register = snd_timer_dev_register, - .dev_unregister = snd_timer_dev_unregister + .dev_disconnect = snd_timer_dev_disconnect, }; snd_assert(tid != NULL, return -EINVAL); @@ -813,6 +813,21 @@ int snd_timer_new(struct snd_card *card, char *id, struct snd_timer_id *tid, static int snd_timer_free(struct snd_timer *timer) { snd_assert(timer != NULL, return -ENXIO); + + mutex_lock(®ister_mutex); + if (! list_empty(&timer->open_list_head)) { + struct list_head *p, *n; + struct snd_timer_instance *ti; + snd_printk(KERN_WARNING "timer %p is busy?\n", timer); + list_for_each_safe(p, n, &timer->open_list_head) { + list_del_init(p); + ti = list_entry(p, struct snd_timer_instance, open_list); + ti->timer = NULL; + } + } + list_del(&timer->device_list); + mutex_unlock(®ister_mutex); + if (timer->private_free) timer->private_free(timer); kfree(timer); @@ -867,30 +882,13 @@ static int snd_timer_dev_register(struct snd_device *dev) return 0; } -static int snd_timer_unregister(struct snd_timer *timer) +static int snd_timer_dev_disconnect(struct snd_device *device) { - struct list_head *p, *n; - struct snd_timer_instance *ti; - - snd_assert(timer != NULL, return -ENXIO); + struct snd_timer *timer = device->device_data; mutex_lock(®ister_mutex); - if (! list_empty(&timer->open_list_head)) { - snd_printk(KERN_WARNING "timer 0x%lx is busy?\n", (long)timer); - list_for_each_safe(p, n, &timer->open_list_head) { - list_del_init(p); - ti = list_entry(p, struct snd_timer_instance, open_list); - ti->timer = NULL; - } - } - list_del(&timer->device_list); + list_del_init(&timer->device_list); mutex_unlock(®ister_mutex); - return snd_timer_free(timer); -} - -static int snd_timer_dev_unregister(struct snd_device *device) -{ - struct snd_timer *timer = device->device_data; - return snd_timer_unregister(timer); + return 0; } void snd_timer_notify(struct snd_timer *timer, int event, struct timespec *tstamp) @@ -955,11 +953,6 @@ int snd_timer_global_register(struct snd_timer *timer) return snd_timer_dev_register(&dev); } -int snd_timer_global_unregister(struct snd_timer *timer) -{ - return snd_timer_unregister(timer); -} - /* * System timer */ @@ -1982,7 +1975,7 @@ static void __exit alsa_timer_exit(void) /* unregister the system timer */ list_for_each_safe(p, n, &snd_timer_list) { struct snd_timer *timer = list_entry(p, struct snd_timer, device_list); - snd_timer_unregister(timer); + snd_timer_free(timer); } snd_timer_proc_done(); #ifdef SNDRV_OSS_INFO_DEV_TIMERS @@ -2005,5 +1998,4 @@ EXPORT_SYMBOL(snd_timer_notify); EXPORT_SYMBOL(snd_timer_global_new); EXPORT_SYMBOL(snd_timer_global_free); EXPORT_SYMBOL(snd_timer_global_register); -EXPORT_SYMBOL(snd_timer_global_unregister); EXPORT_SYMBOL(snd_timer_interrupt); diff --git a/sound/pci/ac97/ac97_codec.c b/sound/pci/ac97/ac97_codec.c index 51e83d7..b35280c 100644 --- a/sound/pci/ac97/ac97_codec.c +++ b/sound/pci/ac97/ac97_codec.c @@ -1817,13 +1817,13 @@ static int snd_ac97_dev_register(struct snd_device *device) return 0; } -/* unregister ac97 codec */ -static int snd_ac97_dev_unregister(struct snd_device *device) +/* disconnect ac97 codec */ +static int snd_ac97_dev_disconnect(struct snd_device *device) { struct snd_ac97 *ac97 = device->device_data; if (ac97->dev.bus) device_unregister(&ac97->dev); - return snd_ac97_free(ac97); + return 0; } /* build_ops to do nothing */ @@ -1860,7 +1860,7 @@ int snd_ac97_mixer(struct snd_ac97_bus *bus, struct snd_ac97_template *template, static struct snd_device_ops ops = { .dev_free = snd_ac97_dev_free, .dev_register = snd_ac97_dev_register, - .dev_unregister = snd_ac97_dev_unregister, + .dev_disconnect = snd_ac97_dev_disconnect, }; snd_assert(rac97 != NULL, return -EINVAL); diff --git a/sound/usb/usbaudio.c b/sound/usb/usbaudio.c index 1b7f499..3144313 100644 --- a/sound/usb/usbaudio.c +++ b/sound/usb/usbaudio.c @@ -3499,7 +3499,7 @@ static void snd_usb_audio_disconnect(struct usb_device *dev, void *ptr) } usb_chip[chip->index] = NULL; mutex_unlock(®ister_mutex); - snd_card_free(card); + snd_card_free_when_closed(card); } else { mutex_unlock(®ister_mutex); } -- cgit v0.10.2 From 2b29b13c5794f648cd5e839796496704d787f5a6 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 23 Jun 2006 14:38:26 +0200 Subject: [ALSA] Deprecate snd_card_free_in_thread() Deprecated snd_card_free_in_thread(), replaced with snd_card_free_when_closed(). Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl b/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl index b8dc51c..4807ef7 100644 --- a/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl +++ b/Documentation/sound/alsa/DocBook/writing-an-alsa-driver.tmpl @@ -1054,9 +1054,8 @@ For a device which allows hotplugging, you can use - snd_card_free_in_thread. This one will - postpone the destruction and wait in a kernel-thread until all - devices are closed. + snd_card_free_when_closed. This one will + postpone the destruction until all devices are closed. diff --git a/include/sound/core.h b/include/sound/core.h index cf4001c..1359c53 100644 --- a/include/sound/core.h +++ b/include/sound/core.h @@ -25,7 +25,6 @@ #include /* wake_up() */ #include /* struct mutex */ #include /* struct rw_semaphore */ -#include /* struct workqueue_struct */ #include /* pm_message_t */ /* forward declarations */ @@ -132,7 +131,6 @@ struct snd_card { int shutdown; /* this card is going down */ int free_on_last_close; /* free in context of file_release */ wait_queue_head_t shutdown_sleep; - struct work_struct free_workq; /* for free in workqueue */ struct device *dev; #ifdef CONFIG_PM @@ -245,7 +243,6 @@ struct snd_card *snd_card_new(int idx, const char *id, int snd_card_disconnect(struct snd_card *card); int snd_card_free(struct snd_card *card); int snd_card_free_when_closed(struct snd_card *card); -int snd_card_free_in_thread(struct snd_card *card); int snd_card_register(struct snd_card *card); int snd_card_info_init(void); int snd_card_info_done(void); diff --git a/sound/core/init.c b/sound/core/init.c index 5850d99..d7607a2 100644 --- a/sound/core/init.c +++ b/sound/core/init.c @@ -81,8 +81,6 @@ static inline int init_info_for_card(struct snd_card *card) #define init_info_for_card(card) #endif -static void snd_card_free_thread(void * __card); - /** * snd_card_new - create and initialize a soundcard structure * @idx: card index (address) [0 ... (SNDRV_CARDS-1)] @@ -145,7 +143,6 @@ struct snd_card *snd_card_new(int idx, const char *xid, INIT_LIST_HEAD(&card->ctl_files); spin_lock_init(&card->files_lock); init_waitqueue_head(&card->shutdown_sleep); - INIT_WORK(&card->free_workq, snd_card_free_thread, card); #ifdef CONFIG_PM mutex_init(&card->power_lock); init_waitqueue_head(&card->power_sleep); @@ -413,53 +410,6 @@ int snd_card_free(struct snd_card *card) EXPORT_SYMBOL(snd_card_free); -static void snd_card_free_thread(void * __card) -{ - struct snd_card *card = __card; - struct module * module = card->module; - - if (!try_module_get(module)) { - snd_printk(KERN_ERR "unable to lock toplevel module for card %i in free thread\n", card->number); - module = NULL; - } - - snd_card_free(card); - - module_put(module); -} - -/** - * snd_card_free_in_thread - call snd_card_free() in thread - * @card: soundcard structure - * - * This function schedules the call of snd_card_free() function in a - * work queue. When all devices are released (non-busy), the work - * is woken up and calls snd_card_free(). - * - * When a card can be disconnected at any time by hotplug service, - * this function should be used in disconnect (or detach) callback - * instead of calling snd_card_free() directly. - * - * Returns - zero otherwise a negative error code if the start of thread failed. - */ -int snd_card_free_in_thread(struct snd_card *card) -{ - if (card->files == NULL) { - snd_card_free(card); - return 0; - } - - if (schedule_work(&card->free_workq)) - return 0; - - snd_printk(KERN_ERR "schedule_work() failed in snd_card_free_in_thread for card %i\n", card->number); - /* try to free the structure immediately */ - snd_card_free(card); - return -EFAULT; -} - -EXPORT_SYMBOL(snd_card_free_in_thread); - static void choose_default_id(struct snd_card *card) { int i, len, idx_flag = 0, loops = SNDRV_CARDS; @@ -742,9 +692,9 @@ EXPORT_SYMBOL(snd_card_file_add); * * This function removes the file formerly added to the card via * snd_card_file_add() function. - * If all files are removed and the release of the card is - * scheduled, it will wake up the the thread to call snd_card_free() - * (see snd_card_free_in_thread() function). + * If all files are removed and snd_card_free_when_closed() was + * called beforehand, it processes the pending release of + * resources. * * Returns zero or a negative error code. */ diff --git a/sound/drivers/mpu401/mpu401.c b/sound/drivers/mpu401/mpu401.c index 17cc105..2de181a 100644 --- a/sound/drivers/mpu401/mpu401.c +++ b/sound/drivers/mpu401/mpu401.c @@ -211,7 +211,7 @@ static void __devexit snd_mpu401_pnp_remove(struct pnp_dev *dev) struct snd_card *card = (struct snd_card *) pnp_get_drvdata(dev); snd_card_disconnect(card); - snd_card_free_in_thread(card); + snd_card_free_when_closed(card); } static struct pnp_driver snd_mpu401_pnp_driver = { diff --git a/sound/pcmcia/pdaudiocf/pdaudiocf.c b/sound/pcmcia/pdaudiocf/pdaudiocf.c index 1c09e5f..fd3590f 100644 --- a/sound/pcmcia/pdaudiocf/pdaudiocf.c +++ b/sound/pcmcia/pdaudiocf/pdaudiocf.c @@ -206,7 +206,7 @@ static void snd_pdacf_detach(struct pcmcia_device *link) snd_pdacf_powerdown(chip); chip->chip_status |= PDAUDIOCF_STAT_IS_STALE; /* to be sure */ snd_card_disconnect(chip->card); - snd_card_free_in_thread(chip->card); + snd_card_free_when_closed(chip->card); } /* diff --git a/sound/pcmcia/vx/vxpocket.c b/sound/pcmcia/vx/vxpocket.c index cafe664..76c85cf 100644 --- a/sound/pcmcia/vx/vxpocket.c +++ b/sound/pcmcia/vx/vxpocket.c @@ -65,7 +65,7 @@ static void vxpocket_release(struct pcmcia_device *link) } /* - * destructor, called from snd_card_free_in_thread() + * destructor, called from snd_card_free_when_closed() */ static int snd_vxpocket_dev_free(struct snd_device *device) { @@ -363,7 +363,7 @@ static void vxpocket_detach(struct pcmcia_device *link) chip->chip_status |= VX_STAT_IS_STALE; /* to be sure */ snd_card_disconnect(chip->card); vxpocket_release(link); - snd_card_free_in_thread(chip->card); + snd_card_free_when_closed(chip->card); } /* -- cgit v0.10.2 From 6dbe662874ba08585eaf732d126762c25ac8e3f7 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 27 Jun 2006 18:28:53 +0200 Subject: [ALSA] Add experimental support of aggressive AC97 power-saving mode Added CONFIG_SND_AC97_POWER_SAVE kernel config to enable the support of aggressive AC97 power-saving mode. In this mode, the AC97 powerdown register bits are dynamically controlled at each open/close of PCM streams. The mode is activated via power_save option for snd-ac97-codec driver. As default it's off. It can be turned on/off on the fly via sysfs, too. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/include/sound/ac97_codec.h b/include/sound/ac97_codec.h index 758f8bf..4c43521 100644 --- a/include/sound/ac97_codec.h +++ b/include/sound/ac97_codec.h @@ -27,6 +27,7 @@ #include #include +#include #include "pcm.h" #include "control.h" #include "info.h" @@ -140,6 +141,20 @@ #define AC97_GP_DRSS_1011 0x0000 /* LR(C) 10+11(+12) */ #define AC97_GP_DRSS_78 0x0400 /* LR 7+8 */ +/* powerdown bits */ +#define AC97_PD_ADC_STATUS 0x0001 /* ADC status (RO) */ +#define AC97_PD_DAC_STATUS 0x0002 /* DAC status (RO) */ +#define AC97_PD_MIXER_STATUS 0x0004 /* Analog mixer status (RO) */ +#define AC97_PD_VREF_STATUS 0x0008 /* Vref status (RO) */ +#define AC97_PD_PR0 0x0100 /* Power down PCM ADCs and input MUX */ +#define AC97_PD_PR1 0x0200 /* Power down PCM front DAC */ +#define AC97_PD_PR2 0x0400 /* Power down Mixer (Vref still on) */ +#define AC97_PD_PR3 0x0800 /* Power down Mixer (Vref off) */ +#define AC97_PD_PR4 0x1000 /* Power down AC-Link */ +#define AC97_PD_PR5 0x2000 /* Disable internal clock usage */ +#define AC97_PD_PR6 0x4000 /* Headphone amplifier */ +#define AC97_PD_EAPD 0x8000 /* External Amplifer Power Down (EAPD) */ + /* extended audio ID bit defines */ #define AC97_EI_VRA 0x0001 /* Variable bit rate supported */ #define AC97_EI_DRA 0x0002 /* Double rate supported */ @@ -359,6 +374,7 @@ #define AC97_SCAP_INV_EAPD (1<<7) /* inverted EAPD */ #define AC97_SCAP_DETECT_BY_VENDOR (1<<8) /* use vendor registers for read tests */ #define AC97_SCAP_NO_SPDIF (1<<9) /* don't build SPDIF controls */ +#define AC97_SCAP_EAPD_LED (1<<10) /* EAPD as mute LED */ /* ac97->flags */ #define AC97_HAS_PC_BEEP (1<<0) /* force PC Speaker usage */ @@ -491,6 +507,12 @@ struct snd_ac97 { /* jack-sharing info */ unsigned char indep_surround; unsigned char channel_mode; + +#ifdef CONFIG_SND_AC97_POWER_SAVE + unsigned int power_up; /* power states */ + struct workqueue_struct *power_workq; + struct work_struct power_work; +#endif struct device dev; }; @@ -532,6 +554,15 @@ unsigned short snd_ac97_read(struct snd_ac97 *ac97, unsigned short reg); void snd_ac97_write_cache(struct snd_ac97 *ac97, unsigned short reg, unsigned short value); int snd_ac97_update(struct snd_ac97 *ac97, unsigned short reg, unsigned short value); int snd_ac97_update_bits(struct snd_ac97 *ac97, unsigned short reg, unsigned short mask, unsigned short value); +#ifdef CONFIG_SND_AC97_POWER_SAVE +int snd_ac97_update_power(struct snd_ac97 *ac97, int reg, int powerup); +#else +static inline int snd_ac97_update_power(struct snd_ac97 *ac97, int reg, + int powerup) +{ + return 0; +} +#endif #ifdef CONFIG_PM void snd_ac97_suspend(struct snd_ac97 *ac97); void snd_ac97_resume(struct snd_ac97 *ac97); @@ -583,6 +614,7 @@ struct ac97_pcm { copy_flag: 1, /* lowlevel driver must fill all entries */ spdif: 1; /* spdif pcm */ unsigned short aslots; /* active slots */ + unsigned short cur_dbl; /* current double-rate state */ unsigned int rates; /* available rates */ struct { unsigned short slots; /* driver input: requested AC97 slot numbers */ diff --git a/sound/drivers/Kconfig b/sound/drivers/Kconfig index 395c4ef..897dc2d 100644 --- a/sound/drivers/Kconfig +++ b/sound/drivers/Kconfig @@ -100,4 +100,17 @@ config SND_MPU401 To compile this driver as a module, choose M here: the module will be called snd-mpu401. +config SND_AC97_POWER_SAVE + bool "AC97 Power-Saving Mode" + depends on SND_AC97_CODEC && EXPERIMENTAL + default n + help + Say Y here to enable the aggressive power-saving support of + AC97 codecs. In this mode, the power-mode is dynamically + controlled at each open/close. + + The mode is activated by passing power_save=1 option to + snd-ac97-codec driver. You can toggle it dynamically over + sysfs, too. + endmenu diff --git a/sound/pci/ac97/ac97_codec.c b/sound/pci/ac97/ac97_codec.c index b35280c..f82c636 100644 --- a/sound/pci/ac97/ac97_codec.c +++ b/sound/pci/ac97/ac97_codec.c @@ -47,6 +47,11 @@ static int enable_loopback; module_param(enable_loopback, bool, 0444); MODULE_PARM_DESC(enable_loopback, "Enable AC97 ADC/DAC Loopback Control"); +#ifdef CONFIG_SND_AC97_POWER_SAVE +static int power_save; +module_param(power_save, bool, 0644); +MODULE_PARM_DESC(power_save, "Enable AC97 power-saving control"); +#endif /* */ @@ -187,6 +192,8 @@ static const struct ac97_codec_id snd_ac97_codec_ids[] = { }; +static void update_power_regs(struct snd_ac97 *ac97); + /* * I/O routines */ @@ -554,6 +561,18 @@ int snd_ac97_put_volsw(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value } err = snd_ac97_update_bits(ac97, reg, val_mask, val); snd_ac97_page_restore(ac97, page_save); +#ifdef CONFIG_SND_AC97_POWER_SAVE + /* check analog mixer power-down */ + if ((val_mask & 0x8000) && + (kcontrol->private_value & (1<<30))) { + if (val & 0x8000) + ac97->power_up &= ~(1 << (reg>>1)); + else + ac97->power_up |= 1 << (reg>>1); + if (power_save) + update_power_regs(ac97); + } +#endif return err; } @@ -962,6 +981,10 @@ static int snd_ac97_bus_dev_free(struct snd_device *device) static int snd_ac97_free(struct snd_ac97 *ac97) { if (ac97) { +#ifdef CONFIG_SND_AC97_POWER_SAVE + if (ac97->power_workq) + destroy_workqueue(ac97->power_workq); +#endif snd_ac97_proc_done(ac97); if (ac97->bus) ac97->bus->codec[ac97->num] = NULL; @@ -1117,7 +1140,9 @@ struct snd_kcontrol *snd_ac97_cnew(const struct snd_kcontrol_new *_template, str /* * create mute switch(es) for normal stereo controls */ -static int snd_ac97_cmute_new_stereo(struct snd_card *card, char *name, int reg, int check_stereo, struct snd_ac97 *ac97) +static int snd_ac97_cmute_new_stereo(struct snd_card *card, char *name, int reg, + int check_stereo, int check_amix, + struct snd_ac97 *ac97) { struct snd_kcontrol *kctl; int err; @@ -1137,10 +1162,14 @@ static int snd_ac97_cmute_new_stereo(struct snd_card *card, char *name, int reg, } if (mute_mask == 0x8080) { struct snd_kcontrol_new tmp = AC97_DOUBLE(name, reg, 15, 7, 1, 1); + if (check_amix) + tmp.private_value |= (1 << 30); tmp.index = ac97->num; kctl = snd_ctl_new1(&tmp, ac97); } else { struct snd_kcontrol_new tmp = AC97_SINGLE(name, reg, 15, 1, 1); + if (check_amix) + tmp.private_value |= (1 << 30); tmp.index = ac97->num; kctl = snd_ctl_new1(&tmp, ac97); } @@ -1186,7 +1215,9 @@ static int snd_ac97_cvol_new(struct snd_card *card, char *name, int reg, unsigne /* * create a mute-switch and a volume for normal stereo/mono controls */ -static int snd_ac97_cmix_new_stereo(struct snd_card *card, const char *pfx, int reg, int check_stereo, struct snd_ac97 *ac97) +static int snd_ac97_cmix_new_stereo(struct snd_card *card, const char *pfx, + int reg, int check_stereo, int check_amix, + struct snd_ac97 *ac97) { int err; char name[44]; @@ -1197,7 +1228,9 @@ static int snd_ac97_cmix_new_stereo(struct snd_card *card, const char *pfx, int if (snd_ac97_try_bit(ac97, reg, 15)) { sprintf(name, "%s Switch", pfx); - if ((err = snd_ac97_cmute_new_stereo(card, name, reg, check_stereo, ac97)) < 0) + if ((err = snd_ac97_cmute_new_stereo(card, name, reg, + check_stereo, check_amix, + ac97)) < 0) return err; } check_volume_resolution(ac97, reg, &lo_max, &hi_max); @@ -1209,8 +1242,10 @@ static int snd_ac97_cmix_new_stereo(struct snd_card *card, const char *pfx, int return 0; } -#define snd_ac97_cmix_new(card, pfx, reg, ac97) snd_ac97_cmix_new_stereo(card, pfx, reg, 0, ac97) -#define snd_ac97_cmute_new(card, name, reg, ac97) snd_ac97_cmute_new_stereo(card, name, reg, 0, ac97) +#define snd_ac97_cmix_new(card, pfx, reg, acheck, ac97) \ + snd_ac97_cmix_new_stereo(card, pfx, reg, 0, acheck, ac97) +#define snd_ac97_cmute_new(card, name, reg, acheck, ac97) \ + snd_ac97_cmute_new_stereo(card, name, reg, 0, acheck, ac97) static unsigned int snd_ac97_determine_spdif_rates(struct snd_ac97 *ac97); @@ -1226,9 +1261,11 @@ static int snd_ac97_mixer_build(struct snd_ac97 * ac97) /* AD claims to remove this control from AD1887, although spec v2.2 does not allow this */ if (snd_ac97_try_volume_mix(ac97, AC97_MASTER)) { if (ac97->flags & AC97_HAS_NO_MASTER_VOL) - err = snd_ac97_cmute_new(card, "Master Playback Switch", AC97_MASTER, ac97); + err = snd_ac97_cmute_new(card, "Master Playback Switch", + AC97_MASTER, 0, ac97); else - err = snd_ac97_cmix_new(card, "Master Playback", AC97_MASTER, ac97); + err = snd_ac97_cmix_new(card, "Master Playback", + AC97_MASTER, 0, ac97); if (err < 0) return err; } @@ -1265,19 +1302,23 @@ static int snd_ac97_mixer_build(struct snd_ac97 * ac97) if ((snd_ac97_try_volume_mix(ac97, AC97_SURROUND_MASTER)) && !(ac97->flags & AC97_AD_MULTI)) { /* Surround Master (0x38) is with stereo mutes */ - if ((err = snd_ac97_cmix_new_stereo(card, "Surround Playback", AC97_SURROUND_MASTER, 1, ac97)) < 0) + if ((err = snd_ac97_cmix_new_stereo(card, "Surround Playback", + AC97_SURROUND_MASTER, 1, 0, + ac97)) < 0) return err; } /* build headphone controls */ if (snd_ac97_try_volume_mix(ac97, AC97_HEADPHONE)) { - if ((err = snd_ac97_cmix_new(card, "Headphone Playback", AC97_HEADPHONE, ac97)) < 0) + if ((err = snd_ac97_cmix_new(card, "Headphone Playback", + AC97_HEADPHONE, 0, ac97)) < 0) return err; } /* build master mono controls */ if (snd_ac97_try_volume_mix(ac97, AC97_MASTER_MONO)) { - if ((err = snd_ac97_cmix_new(card, "Master Mono Playback", AC97_MASTER_MONO, ac97)) < 0) + if ((err = snd_ac97_cmix_new(card, "Master Mono Playback", + AC97_MASTER_MONO, 0, ac97)) < 0) return err; } @@ -1310,7 +1351,8 @@ static int snd_ac97_mixer_build(struct snd_ac97 * ac97) /* build Phone controls */ if (!(ac97->flags & AC97_HAS_NO_PHONE)) { if (snd_ac97_try_volume_mix(ac97, AC97_PHONE)) { - if ((err = snd_ac97_cmix_new(card, "Phone Playback", AC97_PHONE, ac97)) < 0) + if ((err = snd_ac97_cmix_new(card, "Phone Playback", + AC97_PHONE, 1, ac97)) < 0) return err; } } @@ -1318,7 +1360,8 @@ static int snd_ac97_mixer_build(struct snd_ac97 * ac97) /* build MIC controls */ if (!(ac97->flags & AC97_HAS_NO_MIC)) { if (snd_ac97_try_volume_mix(ac97, AC97_MIC)) { - if ((err = snd_ac97_cmix_new(card, "Mic Playback", AC97_MIC, ac97)) < 0) + if ((err = snd_ac97_cmix_new(card, "Mic Playback", + AC97_MIC, 1, ac97)) < 0) return err; if ((err = snd_ctl_add(card, snd_ac97_cnew(&snd_ac97_controls_mic_boost, ac97))) < 0) return err; @@ -1327,14 +1370,16 @@ static int snd_ac97_mixer_build(struct snd_ac97 * ac97) /* build Line controls */ if (snd_ac97_try_volume_mix(ac97, AC97_LINE)) { - if ((err = snd_ac97_cmix_new(card, "Line Playback", AC97_LINE, ac97)) < 0) + if ((err = snd_ac97_cmix_new(card, "Line Playback", + AC97_LINE, 1, ac97)) < 0) return err; } /* build CD controls */ if (!(ac97->flags & AC97_HAS_NO_CD)) { if (snd_ac97_try_volume_mix(ac97, AC97_CD)) { - if ((err = snd_ac97_cmix_new(card, "CD Playback", AC97_CD, ac97)) < 0) + if ((err = snd_ac97_cmix_new(card, "CD Playback", + AC97_CD, 1, ac97)) < 0) return err; } } @@ -1342,7 +1387,8 @@ static int snd_ac97_mixer_build(struct snd_ac97 * ac97) /* build Video controls */ if (!(ac97->flags & AC97_HAS_NO_VIDEO)) { if (snd_ac97_try_volume_mix(ac97, AC97_VIDEO)) { - if ((err = snd_ac97_cmix_new(card, "Video Playback", AC97_VIDEO, ac97)) < 0) + if ((err = snd_ac97_cmix_new(card, "Video Playback", + AC97_VIDEO, 1, ac97)) < 0) return err; } } @@ -1350,7 +1396,8 @@ static int snd_ac97_mixer_build(struct snd_ac97 * ac97) /* build Aux controls */ if (!(ac97->flags & AC97_HAS_NO_AUX)) { if (snd_ac97_try_volume_mix(ac97, AC97_AUX)) { - if ((err = snd_ac97_cmix_new(card, "Aux Playback", AC97_AUX, ac97)) < 0) + if ((err = snd_ac97_cmix_new(card, "Aux Playback", + AC97_AUX, 1, ac97)) < 0) return err; } } @@ -1385,9 +1432,12 @@ static int snd_ac97_mixer_build(struct snd_ac97 * ac97) } else { if (!(ac97->flags & AC97_HAS_NO_STD_PCM)) { if (ac97->flags & AC97_HAS_NO_PCM_VOL) - err = snd_ac97_cmute_new(card, "PCM Playback Switch", AC97_PCM, ac97); + err = snd_ac97_cmute_new(card, + "PCM Playback Switch", + AC97_PCM, 0, ac97); else - err = snd_ac97_cmix_new(card, "PCM Playback", AC97_PCM, ac97); + err = snd_ac97_cmix_new(card, "PCM Playback", + AC97_PCM, 0, ac97); if (err < 0) return err; } @@ -1398,7 +1448,9 @@ static int snd_ac97_mixer_build(struct snd_ac97 * ac97) if ((err = snd_ctl_add(card, snd_ac97_cnew(&snd_ac97_control_capture_src, ac97))) < 0) return err; if (snd_ac97_try_bit(ac97, AC97_REC_GAIN, 15)) { - if ((err = snd_ac97_cmute_new(card, "Capture Switch", AC97_REC_GAIN, ac97)) < 0) + err = snd_ac97_cmute_new(card, "Capture Switch", + AC97_REC_GAIN, 0, ac97); + if (err < 0) return err; } if ((err = snd_ctl_add(card, snd_ac97_cnew(&snd_ac97_control_capture_vol, ac97))) < 0) @@ -1829,6 +1881,13 @@ static int snd_ac97_dev_disconnect(struct snd_device *device) /* build_ops to do nothing */ static struct snd_ac97_build_ops null_build_ops; +#ifdef CONFIG_SND_AC97_POWER_SAVE +static void do_update_power(void *data) +{ + update_power_regs(data); +} +#endif + /** * snd_ac97_mixer - create an Codec97 component * @bus: the AC97 bus which codec is attached to @@ -1883,6 +1942,10 @@ int snd_ac97_mixer(struct snd_ac97_bus *bus, struct snd_ac97_template *template, bus->codec[ac97->num] = ac97; mutex_init(&ac97->reg_mutex); mutex_init(&ac97->page_mutex); +#ifdef CONFIG_SND_AC97_POWER_SAVE + ac97->power_workq = create_workqueue("ac97"); + INIT_WORK(&ac97->power_work, do_update_power, ac97); +#endif #ifdef CONFIG_PCI if (ac97->pci) { @@ -2117,15 +2180,8 @@ int snd_ac97_mixer(struct snd_ac97_bus *bus, struct snd_ac97_template *template, return -ENOMEM; } } - /* make sure the proper powerdown bits are cleared */ - if (ac97->scaps && ac97_is_audio(ac97)) { - reg = snd_ac97_read(ac97, AC97_EXTENDED_STATUS); - if (ac97->scaps & AC97_SCAP_SURROUND_DAC) - reg &= ~AC97_EA_PRJ; - if (ac97->scaps & AC97_SCAP_CENTER_LFE_DAC) - reg &= ~(AC97_EA_PRI | AC97_EA_PRK); - snd_ac97_write_cache(ac97, AC97_EXTENDED_STATUS, reg); - } + if (ac97_is_audio(ac97)) + update_power_regs(ac97); snd_ac97_proc_init(ac97); if ((err = snd_device_new(card, SNDRV_DEV_CODEC, ac97, &ops)) < 0) { snd_ac97_free(ac97); @@ -2153,22 +2209,155 @@ static void snd_ac97_powerdown(struct snd_ac97 *ac97) snd_ac97_write(ac97, AC97_HEADPHONE, 0x9f9f); } - power = ac97->regs[AC97_POWERDOWN] | 0x8000; /* EAPD */ - power |= 0x4000; /* Headphone amplifier powerdown */ - power |= 0x0300; /* ADC & DAC powerdown */ + /* surround, CLFE, mic powerdown */ + power = ac97->regs[AC97_EXTENDED_STATUS]; + if (ac97->scaps & AC97_SCAP_SURROUND_DAC) + power |= AC97_EA_PRJ; + if (ac97->scaps & AC97_SCAP_CENTER_LFE_DAC) + power |= AC97_EA_PRI | AC97_EA_PRK; + power |= AC97_EA_PRL; + snd_ac97_write(ac97, AC97_EXTENDED_STATUS, power); + + /* powerdown external amplifier */ + if (ac97->scaps & AC97_SCAP_INV_EAPD) + power = ac97->regs[AC97_POWERDOWN] & ~AC97_PD_EAPD; + else if (! (ac97->scaps & AC97_SCAP_EAPD_LED)) + power = ac97->regs[AC97_POWERDOWN] | AC97_PD_EAPD; + power |= AC97_PD_PR6; /* Headphone amplifier powerdown */ + power |= AC97_PD_PR0 | AC97_PD_PR1; /* ADC & DAC powerdown */ snd_ac97_write(ac97, AC97_POWERDOWN, power); udelay(100); - power |= 0x0400; /* Analog Mixer powerdown (Vref on) */ - snd_ac97_write(ac97, AC97_POWERDOWN, power); - udelay(100); -#if 0 - /* FIXME: this causes click noises on some boards at resume */ - power |= 0x3800; /* AC-link powerdown, internal Clk disable */ + power |= AC97_PD_PR2 | AC97_PD_PR3; /* Analog Mixer powerdown */ snd_ac97_write(ac97, AC97_POWERDOWN, power); +#ifdef CONFIG_SND_AC97_POWER_SAVE + if (power_save) { + udelay(100); + /* AC-link powerdown, internal Clk disable */ + /* FIXME: this may cause click noises on some boards */ + power |= AC97_PD_PR4 | AC97_PD_PR5; + snd_ac97_write(ac97, AC97_POWERDOWN, power); + } #endif } +struct ac97_power_reg { + unsigned short reg; + unsigned short power_reg; + unsigned short mask; +}; + +enum { PWIDX_ADC, PWIDX_FRONT, PWIDX_CLFE, PWIDX_SURR, PWIDX_MIC, PWIDX_SIZE }; + +static struct ac97_power_reg power_regs[PWIDX_SIZE] = { + [PWIDX_ADC] = { AC97_PCM_LR_ADC_RATE, AC97_POWERDOWN, AC97_PD_PR0}, + [PWIDX_FRONT] = { AC97_PCM_FRONT_DAC_RATE, AC97_POWERDOWN, AC97_PD_PR1}, + [PWIDX_CLFE] = { AC97_PCM_LFE_DAC_RATE, AC97_EXTENDED_STATUS, + AC97_EA_PRI | AC97_EA_PRK}, + [PWIDX_SURR] = { AC97_PCM_SURR_DAC_RATE, AC97_EXTENDED_STATUS, + AC97_EA_PRJ}, + [PWIDX_MIC] = { AC97_PCM_MIC_ADC_RATE, AC97_EXTENDED_STATUS, + AC97_EA_PRL}, +}; + +#ifdef CONFIG_SND_AC97_POWER_SAVE +/** + * snd_ac97_update_power - update the powerdown register + * @ac97: the codec instance + * @reg: the rate register, e.g. AC97_PCM_FRONT_DAC_RATE + * @powerup: non-zero when power up the part + * + * Update the AC97 powerdown register bits of the given part. + */ +int snd_ac97_update_power(struct snd_ac97 *ac97, int reg, int powerup) +{ + int i; + + if (! ac97) + return 0; + + if (reg) { + /* SPDIF requires DAC power, too */ + if (reg == AC97_SPDIF) + reg = AC97_PCM_FRONT_DAC_RATE; + for (i = 0; i < PWIDX_SIZE; i++) { + if (power_regs[i].reg == reg) { + if (powerup) + ac97->power_up |= (1 << i); + else + ac97->power_up &= ~(1 << i); + break; + } + } + } + + if (! power_save) + return 0; + + if (! powerup && ac97->power_workq) + /* adjust power-down bits after two seconds delay + * (for avoiding loud click noises for many (OSS) apps + * that open/close frequently) + */ + queue_delayed_work(ac97->power_workq, &ac97->power_work, HZ*2); + else + update_power_regs(ac97); + + return 0; +} + +EXPORT_SYMBOL(snd_ac97_update_power); +#endif /* CONFIG_SND_AC97_POWER_SAVE */ + +static void update_power_regs(struct snd_ac97 *ac97) +{ + unsigned int power_up, bits; + int i; + +#ifdef CONFIG_SND_AC97_POWER_SAVE + if (power_save) + power_up = ac97->power_up; + else { +#endif + power_up = (1 << PWIDX_FRONT) | (1 << PWIDX_ADC); + power_up |= (1 << PWIDX_MIC); + if (ac97->scaps & AC97_SCAP_SURROUND_DAC) + power_up |= (1 << PWIDX_SURR); + if (ac97->scaps & AC97_SCAP_CENTER_LFE_DAC) + power_up |= (1 << PWIDX_CLFE); +#ifdef CONFIG_SND_AC97_POWER_SAVE + } +#endif + if (power_up) { + if (ac97->regs[AC97_POWERDOWN] & AC97_PD_PR2) { + /* needs power-up analog mix and vref */ + snd_ac97_update_bits(ac97, AC97_POWERDOWN, + AC97_PD_PR3, 0); + msleep(1); + snd_ac97_update_bits(ac97, AC97_POWERDOWN, + AC97_PD_PR2, 0); + } + } + for (i = 0; i < PWIDX_SIZE; i++) { + if (power_up & (1 << i)) + bits = 0; + else + bits = power_regs[i].mask; + snd_ac97_update_bits(ac97, power_regs[i].power_reg, + power_regs[i].mask, bits); + } + if (! power_up) { + if (! (ac97->regs[AC97_POWERDOWN] & AC97_PD_PR2)) { + /* power down analog mix and vref */ + snd_ac97_update_bits(ac97, AC97_POWERDOWN, + AC97_PD_PR2, AC97_PD_PR2); + snd_ac97_update_bits(ac97, AC97_POWERDOWN, + AC97_PD_PR3, AC97_PD_PR3); + } + } +} + + #ifdef CONFIG_PM /** * snd_ac97_suspend - General suspend function for AC97 codec @@ -2484,6 +2673,7 @@ static int tune_mute_led(struct snd_ac97 *ac97) msw->put = master_mute_sw_put; snd_ac97_remove_ctl(ac97, "External Amplifier", NULL); snd_ac97_update_bits(ac97, AC97_POWERDOWN, 0x8000, 0x8000); /* mute LED on */ + ac97->scaps |= AC97_SCAP_EAPD_LED; return 0; } diff --git a/sound/pci/ac97/ac97_pcm.c b/sound/pci/ac97/ac97_pcm.c index f684aa2..3758d07 100644 --- a/sound/pci/ac97/ac97_pcm.c +++ b/sound/pci/ac97/ac97_pcm.c @@ -269,6 +269,7 @@ int snd_ac97_set_rate(struct snd_ac97 *ac97, int reg, unsigned int rate) return -EINVAL; } + snd_ac97_update_power(ac97, reg, 1); switch (reg) { case AC97_PCM_MIC_ADC_RATE: if ((ac97->regs[AC97_EXTENDED_STATUS] & AC97_EA_VRM) == 0) /* MIC VRA */ @@ -606,6 +607,7 @@ int snd_ac97_pcm_open(struct ac97_pcm *pcm, unsigned int rate, goto error; } } + pcm->cur_dbl = r; spin_unlock_irq(&pcm->bus->bus_lock); for (i = 3; i < 12; i++) { if (!(slots & (1 << i))) @@ -651,6 +653,21 @@ int snd_ac97_pcm_close(struct ac97_pcm *pcm) unsigned short slots = pcm->aslots; int i, cidx; +#ifdef CONFIG_SND_AC97_POWER_SAVE + int r = pcm->cur_dbl; + for (i = 3; i < 12; i++) { + if (!(slots & (1 << i))) + continue; + for (cidx = 0; cidx < 4; cidx++) { + if (pcm->r[r].rslots[cidx] & (1 << i)) { + int reg = get_slot_reg(pcm, cidx, i, r); + snd_ac97_update_power(pcm->r[r].codec[cidx], + reg, 0); + } + } + } +#endif + bus = pcm->bus; spin_lock_irq(&pcm->bus->bus_lock); for (i = 3; i < 12; i++) { @@ -660,6 +677,7 @@ int snd_ac97_pcm_close(struct ac97_pcm *pcm) bus->used_slots[pcm->stream][cidx] &= ~(1 << i); } pcm->aslots = 0; + pcm->cur_dbl = 0; spin_unlock_irq(&pcm->bus->bus_lock); return 0; } diff --git a/sound/pci/intel8x0.c b/sound/pci/intel8x0.c index 6874263..72dbaed 100644 --- a/sound/pci/intel8x0.c +++ b/sound/pci/intel8x0.c @@ -2251,6 +2251,16 @@ static int snd_intel8x0_ich_chip_init(struct intel8x0 *chip, int probing) /* ACLink on, 2 channels */ cnt = igetdword(chip, ICHREG(GLOB_CNT)); cnt &= ~(ICH_ACLINK | ICH_PCM_246_MASK); +#ifdef CONFIG_SND_AC97_POWER_SAVE + /* do cold reset - the full ac97 powerdown may leave the controller + * in a warm state but actually it cannot communicate with the codec. + */ + iputdword(chip, ICHREG(GLOB_CNT), cnt & ~ICH_AC97COLD); + cnt = igetdword(chip, ICHREG(GLOB_CNT)); + udelay(10); + iputdword(chip, ICHREG(GLOB_CNT), cnt | ICH_AC97COLD); + msleep(1); +#else /* finish cold or do warm reset */ cnt |= (cnt & ICH_AC97COLD) == 0 ? ICH_AC97COLD : ICH_AC97WARM; iputdword(chip, ICHREG(GLOB_CNT), cnt); @@ -2265,6 +2275,7 @@ static int snd_intel8x0_ich_chip_init(struct intel8x0 *chip, int probing) return -EIO; __ok: +#endif if (probing) { /* wait for any codec ready status. * Once it becomes ready it should remain ready @@ -2485,7 +2496,7 @@ static int intel8x0_resume(struct pci_dev *pci) card->shortname, chip); chip->irq = pci->irq; synchronize_irq(chip->irq); - snd_intel8x0_chip_init(chip, 1); + snd_intel8x0_chip_init(chip, 0); /* re-initialize mixer stuff */ if (chip->device_type == DEVICE_INTEL_ICH4) { @@ -2615,6 +2626,7 @@ static void __devinit intel8x0_measure_ac97_clock(struct intel8x0 *chip) /* not 48000Hz, tuning the clock.. */ chip->ac97_bus->clock = (chip->ac97_bus->clock * 48000) / pos; printk(KERN_INFO "intel8x0: clocking to %d\n", chip->ac97_bus->clock); + snd_ac97_update_power(chip->ac97[0], AC97_PCM_FRONT_DAC_RATE, 0); } #ifdef CONFIG_PROC_FS diff --git a/sound/pci/via82xx.c b/sound/pci/via82xx.c index 08da923..2c23a66 100644 --- a/sound/pci/via82xx.c +++ b/sound/pci/via82xx.c @@ -1277,7 +1277,18 @@ static int snd_via82xx_pcm_close(struct snd_pcm_substream *substream) if (! ratep->used) ratep->rate = 0; spin_unlock_irq(&ratep->lock); - + if (! ratep->rate) { + if (! viadev->direction) { + snd_ac97_update_power(chip->ac97, + AC97_PCM_FRONT_DAC_RATE, 0); + snd_ac97_update_power(chip->ac97, + AC97_PCM_SURR_DAC_RATE, 0); + snd_ac97_update_power(chip->ac97, + AC97_PCM_LFE_DAC_RATE, 0); + } else + snd_ac97_update_power(chip->ac97, + AC97_PCM_LR_ADC_RATE, 0); + } viadev->substream = NULL; return 0; } -- cgit v0.10.2 From 82466ad76d60c35bf1c48ba1b9c98c35d82fc385 Mon Sep 17 00:00:00 2001 From: Mike Rapoport Date: Thu, 29 Jun 2006 17:15:33 +0200 Subject: [ALSA] add codec-specific controls for UCB1400 This patch adds some codec-specific controls for Philips UCB1400 codec. Signed-off-by: Mike Rapoport Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ac97/ac97_codec.c b/sound/pci/ac97/ac97_codec.c index f82c636..e5d062d 100644 --- a/sound/pci/ac97/ac97_codec.c +++ b/sound/pci/ac97/ac97_codec.c @@ -156,7 +156,7 @@ static const struct ac97_codec_id snd_ac97_codec_ids[] = { { 0x4e534300, 0xffffffff, "LM4540,43,45,46,48", NULL, NULL }, // only guess --jk { 0x4e534331, 0xffffffff, "LM4549", NULL, NULL }, { 0x4e534350, 0xffffffff, "LM4550", patch_lm4550, NULL }, // volume wrap fix -{ 0x50534304, 0xffffffff, "UCB1400", NULL, NULL }, +{ 0x50534304, 0xffffffff, "UCB1400", patch_ucb1400, NULL }, { 0x53494c20, 0xffffffe0, "Si3036,8", mpatch_si3036, mpatch_si3036, AC97_MODEM_PATCH }, { 0x54524102, 0xffffffff, "TR28022", NULL, NULL }, { 0x54524106, 0xffffffff, "TR28026", NULL, NULL }, diff --git a/sound/pci/ac97/ac97_patch.c b/sound/pci/ac97/ac97_patch.c index 094cfc1..5267b00 100644 --- a/sound/pci/ac97/ac97_patch.c +++ b/sound/pci/ac97/ac97_patch.c @@ -2872,3 +2872,41 @@ int patch_lm4550(struct snd_ac97 *ac97) ac97->res_table = lm4550_restbl; return 0; } + +/* + * UCB1400 codec (http://www.semiconductors.philips.com/acrobat_download/datasheets/UCB1400-02.pdf) + */ +static const struct snd_kcontrol_new snd_ac97_controls_ucb1400[] = { +/* enable/disable headphone driver which allows direct connection to + stereo headphone without the use of external DC blocking + capacitors */ +AC97_SINGLE("Headphone Driver", 0x6a, 6, 1, 0), +/* Filter used to compensate the DC offset is added in the ADC to remove idle + tones from the audio band. */ +AC97_SINGLE("DC Filter", 0x6a, 4, 1, 0), +/* Control smart-low-power mode feature. Allows automatic power down + of unused blocks in the ADC analog front end and the PLL. */ +AC97_SINGLE("Smart Low Power Mode", 0x6c, 4, 3, 0), +}; + +static int patch_ucb1400_specific(struct snd_ac97 * ac97) +{ + int idx, err; + for (idx = 0; idx < ARRAY_SIZE(snd_ac97_controls_ucb1400); idx++) + if ((err = snd_ctl_add(ac97->bus->card, snd_ctl_new1(&snd_ac97_controls_ucb1400[idx], ac97))) < 0) + return err; + return 0; +} + +static struct snd_ac97_build_ops patch_ucb1400_ops = { + .build_specific = patch_ucb1400_specific, +}; + +int patch_ucb1400(struct snd_ac97 * ac97) +{ + ac97->build_ops = &patch_ucb1400_ops; + /* enable headphone driver and smart low power mode by default */ + snd_ac97_write(ac97, 0x6a, 0x0050); + snd_ac97_write(ac97, 0x6c, 0x0030); + return 0; +} diff --git a/sound/pci/ac97/ac97_patch.h b/sound/pci/ac97/ac97_patch.h index adcaa04..7419792 100644 --- a/sound/pci/ac97/ac97_patch.h +++ b/sound/pci/ac97/ac97_patch.h @@ -58,5 +58,6 @@ int patch_cm9780(struct snd_ac97 * ac97); int patch_vt1616(struct snd_ac97 * ac97); int patch_vt1617a(struct snd_ac97 * ac97); int patch_it2646(struct snd_ac97 * ac97); +int patch_ucb1400(struct snd_ac97 * ac97); int mpatch_si3036(struct snd_ac97 * ac97); int patch_lm4550(struct snd_ac97 * ac97); -- cgit v0.10.2 From e0a5d82a966172c5f1dff6229d4a07be2222e8b3 Mon Sep 17 00:00:00 2001 From: Andy Shevchenko Date: Tue, 4 Jul 2006 12:05:14 +0200 Subject: [ALSA] fm801: Support FM only card Signed-off-by: Andy Shevchenko Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/fm801.c b/sound/pci/fm801.c index 13868c9..a30257f 100644 --- a/sound/pci/fm801.c +++ b/sound/pci/fm801.c @@ -2,6 +2,7 @@ * The driver for the ForteMedia FM801 based soundcards * Copyright (c) by Jaroslav Kysela * + * Support FM only card by Andy Shevchenko * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -54,6 +55,7 @@ static int enable[SNDRV_CARDS] = SNDRV_DEFAULT_ENABLE_PNP; /* Enable this card * * 1 = MediaForte 256-PCS * 2 = MediaForte 256-PCPR * 3 = MediaForte 64-PCR + * 16 = setup tuner only (this is additional bit), i.e. SF-64-PCR FM card * High 16-bits are video (radio) device number + 1 */ static int tea575x_tuner[SNDRV_CARDS]; @@ -1253,6 +1255,9 @@ static int snd_fm801_chip_init(struct fm801 *chip, int resume) int id; unsigned short cmdw; + if (tea575x_tuner & 0x0010) + goto __ac97_ok; + /* codec cold reset + AC'97 warm reset */ outw((1<<5) | (1<<6), FM801_REG(chip, CODEC_CTRL)); inw(FM801_REG(chip, CODEC_CTRL)); /* flush posting data */ @@ -1394,13 +1399,16 @@ static int __devinit snd_fm801_create(struct snd_card *card, snd_card_set_dev(card, &pci->dev); #ifdef TEA575X_RADIO - if (tea575x_tuner > 0 && (tea575x_tuner & 0xffff) < 4) { + if (tea575x_tuner > 0 && (tea575x_tuner & 0x000f) < 4) { chip->tea.dev_nr = tea575x_tuner >> 16; chip->tea.card = card; chip->tea.freq_fixup = 10700; chip->tea.private_data = chip; - chip->tea.ops = &snd_fm801_tea_ops[(tea575x_tuner & 0xffff) - 1]; + chip->tea.ops = &snd_fm801_tea_ops[(tea575x_tuner & 0x000f) - 1]; snd_tea575x_init(&chip->tea); + + /* Mute FM tuner */ + outw(0xf800, FM801_REG(chip, GPIO_CTRL)); } #endif @@ -1439,6 +1447,9 @@ static int __devinit snd_card_fm801_probe(struct pci_dev *pci, sprintf(card->longname, "%s at 0x%lx, irq %i", card->shortname, chip->port, chip->irq); + if (tea575x_tuner[dev] & 0x0010) + goto __fm801_tuner_only; + if ((err = snd_fm801_pcm(chip, 0, NULL)) < 0) { snd_card_free(card); return err; @@ -1465,6 +1476,7 @@ static int __devinit snd_card_fm801_probe(struct pci_dev *pci, return err; } + __fm801_tuner_only: if ((err = snd_card_register(card)) < 0) { snd_card_free(card); return err; -- cgit v0.10.2 From 6bbe13ecbbce4415a5a7959b3bc35b18313025e0 Mon Sep 17 00:00:00 2001 From: Jaroslav Kysela Date: Tue, 4 Jul 2006 13:39:55 +0200 Subject: [ALSA] fm801: fixed broken previous patch for the FM tuner only code - do not allocate and enable interrupt - do not do the FM tuner mute (it should be handled more cleanly) Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/fm801.c b/sound/pci/fm801.c index a30257f..88a3e9f 100644 --- a/sound/pci/fm801.c +++ b/sound/pci/fm801.c @@ -160,6 +160,7 @@ struct fm801 { unsigned int multichannel: 1, /* multichannel support */ secondary: 1; /* secondary codec */ unsigned char secondary_addr; /* address of the secondary codec */ + unsigned int tea575x_tuner; /* tuner flags */ unsigned short ply_ctrl; /* playback control */ unsigned short cap_ctrl; /* capture control */ @@ -1255,7 +1256,7 @@ static int snd_fm801_chip_init(struct fm801 *chip, int resume) int id; unsigned short cmdw; - if (tea575x_tuner & 0x0010) + if (chip->tea575x_tuner & 0x0010) goto __ac97_ok; /* codec cold reset + AC'97 warm reset */ @@ -1295,6 +1296,8 @@ static int snd_fm801_chip_init(struct fm801 *chip, int resume) wait_for_codec(chip, 0, AC97_VENDOR_ID1, msecs_to_jiffies(750)); } + __ac97_ok: + /* init volume */ outw(0x0808, FM801_REG(chip, PCM_VOL)); outw(0x9f1f, FM801_REG(chip, FM_VOL)); @@ -1303,9 +1306,12 @@ static int snd_fm801_chip_init(struct fm801 *chip, int resume) /* I2S control - I2S mode */ outw(0x0003, FM801_REG(chip, I2S_MODE)); - /* interrupt setup - unmask MPU, PLAYBACK & CAPTURE */ + /* interrupt setup */ cmdw = inw(FM801_REG(chip, IRQ_MASK)); - cmdw &= ~0x0083; + if (chip->irq < 0) + cmdw |= 0x00c3; /* mask everything, no PCM nor MPU */ + else + cmdw &= ~0x0083; /* unmask MPU, PLAYBACK & CAPTURE */ outw(cmdw, FM801_REG(chip, IRQ_MASK)); /* interrupt clear */ @@ -1370,20 +1376,23 @@ static int __devinit snd_fm801_create(struct snd_card *card, chip->card = card; chip->pci = pci; chip->irq = -1; + chip->tea575x_tuner = tea575x_tuner; if ((err = pci_request_regions(pci, "FM801")) < 0) { kfree(chip); pci_disable_device(pci); return err; } chip->port = pci_resource_start(pci, 0); - if (request_irq(pci->irq, snd_fm801_interrupt, IRQF_DISABLED|IRQF_SHARED, - "FM801", chip)) { - snd_printk(KERN_ERR "unable to grab IRQ %d\n", chip->irq); - snd_fm801_free(chip); - return -EBUSY; + if ((tea575x_tuner & 0x0010) == 0) { + if (request_irq(pci->irq, snd_fm801_interrupt, IRQF_DISABLED|IRQF_SHARED, + "FM801", chip)) { + snd_printk(KERN_ERR "unable to grab IRQ %d\n", chip->irq); + snd_fm801_free(chip); + return -EBUSY; + } + chip->irq = pci->irq; + pci_set_master(pci); } - chip->irq = pci->irq; - pci_set_master(pci); pci_read_config_byte(pci, PCI_REVISION_ID, &rev); if (rev >= 0xb1) /* FM801-AU */ @@ -1406,9 +1415,6 @@ static int __devinit snd_fm801_create(struct snd_card *card, chip->tea.private_data = chip; chip->tea.ops = &snd_fm801_tea_ops[(tea575x_tuner & 0x000f) - 1]; snd_tea575x_init(&chip->tea); - - /* Mute FM tuner */ - outw(0xf800, FM801_REG(chip, GPIO_CTRL)); } #endif -- cgit v0.10.2 From 8aa9b586e42099817163aba01d925c2660c4dbbe Mon Sep 17 00:00:00 2001 From: Jaroslav Kysela Date: Wed, 5 Jul 2006 17:34:51 +0200 Subject: [ALSA] Control API - more robust TLV implementation - added callback option - added READ/WRITE/COMMAND flags to access member - added WRITE/COMMAND ioctls - added SNDRV_CTL_EVENT_MASK_TLV for TLV change notifications - added TLV support to ELEM_ADD ioctl Signed-off-by: Jaroslav Kysela diff --git a/include/sound/asound.h b/include/sound/asound.h index 76a2040..c1621c6 100644 --- a/include/sound/asound.h +++ b/include/sound/asound.h @@ -727,10 +727,15 @@ typedef int __bitwise snd_ctl_elem_iface_t; #define SNDRV_CTL_ELEM_ACCESS_WRITE (1<<1) #define SNDRV_CTL_ELEM_ACCESS_READWRITE (SNDRV_CTL_ELEM_ACCESS_READ|SNDRV_CTL_ELEM_ACCESS_WRITE) #define SNDRV_CTL_ELEM_ACCESS_VOLATILE (1<<2) /* control value may be changed without a notification */ -#define SNDRV_CTL_ELEM_ACCESS_TIMESTAMP (1<<2) /* when was control changed */ +#define SNDRV_CTL_ELEM_ACCESS_TIMESTAMP (1<<3) /* when was control changed */ +#define SNDRV_CTL_ELEM_ACCESS_TLV_READ (1<<4) /* TLV read is possible */ +#define SNDRV_CTL_ELEM_ACCESS_TLV_WRITE (1<<5) /* TLV write is possible */ +#define SNDRV_CTL_ELEM_ACCESS_TLV_READWRITE (SNDRV_CTL_ELEM_ACCESS_TLV_READ|SNDRV_CTL_ELEM_ACCESS_TLV_WRITE) +#define SNDRV_CTL_ELEM_ACCESS_TLV_COMMAND (1<<6) /* TLV command is possible */ #define SNDRV_CTL_ELEM_ACCESS_INACTIVE (1<<8) /* control does actually nothing, but may be updated */ #define SNDRV_CTL_ELEM_ACCESS_LOCK (1<<9) /* write lock */ #define SNDRV_CTL_ELEM_ACCESS_OWNER (1<<10) /* write lock owner */ +#define SNDRV_CTL_ELEM_ACCESS_TLV_CALLBACK (1<<28) /* kernel use a TLV callback */ #define SNDRV_CTL_ELEM_ACCESS_USER (1<<29) /* user space element */ #define SNDRV_CTL_ELEM_ACCESS_DINDIRECT (1<<30) /* indirect access for matrix dimensions in the info structure */ #define SNDRV_CTL_ELEM_ACCESS_INDIRECT (1<<31) /* indirect access for element value in the value structure */ @@ -838,6 +843,8 @@ enum { SNDRV_CTL_IOCTL_ELEM_REPLACE = _IOWR('U', 0x18, struct snd_ctl_elem_info), SNDRV_CTL_IOCTL_ELEM_REMOVE = _IOWR('U', 0x19, struct snd_ctl_elem_id), SNDRV_CTL_IOCTL_TLV_READ = _IOWR('U', 0x1a, struct snd_ctl_tlv), + SNDRV_CTL_IOCTL_TLV_WRITE = _IOWR('U', 0x1b, struct snd_ctl_tlv), + SNDRV_CTL_IOCTL_TLV_COMMAND = _IOWR('U', 0x1c, struct snd_ctl_tlv), SNDRV_CTL_IOCTL_HWDEP_NEXT_DEVICE = _IOWR('U', 0x20, int), SNDRV_CTL_IOCTL_HWDEP_INFO = _IOR('U', 0x21, struct snd_hwdep_info), SNDRV_CTL_IOCTL_PCM_NEXT_DEVICE = _IOR('U', 0x30, int), @@ -862,6 +869,7 @@ enum sndrv_ctl_event_type { #define SNDRV_CTL_EVENT_MASK_VALUE (1<<0) /* element value was changed */ #define SNDRV_CTL_EVENT_MASK_INFO (1<<1) /* element info was changed */ #define SNDRV_CTL_EVENT_MASK_ADD (1<<2) /* element was added */ +#define SNDRV_CTL_EVENT_MASK_TLV (1<<3) /* element TLV tree was changed */ #define SNDRV_CTL_EVENT_MASK_REMOVE (~0U) /* element was removed */ struct snd_ctl_event { diff --git a/include/sound/control.h b/include/sound/control.h index a93a58d..e3905c5 100644 --- a/include/sound/control.h +++ b/include/sound/control.h @@ -30,6 +30,11 @@ struct snd_kcontrol; typedef int (snd_kcontrol_info_t) (struct snd_kcontrol * kcontrol, struct snd_ctl_elem_info * uinfo); typedef int (snd_kcontrol_get_t) (struct snd_kcontrol * kcontrol, struct snd_ctl_elem_value * ucontrol); typedef int (snd_kcontrol_put_t) (struct snd_kcontrol * kcontrol, struct snd_ctl_elem_value * ucontrol); +typedef int (snd_kcontrol_tlv_rw_t)(struct snd_kcontrol *kcontrol, + int op_flag, /* 0=read,1=write,-1=command */ + unsigned int size, + unsigned int __user *tlv); + struct snd_kcontrol_new { snd_ctl_elem_iface_t iface; /* interface identifier */ @@ -42,7 +47,10 @@ struct snd_kcontrol_new { snd_kcontrol_info_t *info; snd_kcontrol_get_t *get; snd_kcontrol_put_t *put; - unsigned int *tlv; + union { + snd_kcontrol_tlv_rw_t *c; + unsigned int *p; + } tlv; unsigned long private_value; }; @@ -59,7 +67,11 @@ struct snd_kcontrol { snd_kcontrol_info_t *info; snd_kcontrol_get_t *get; snd_kcontrol_put_t *put; - unsigned int *tlv; + snd_kcontrol_tlv_rw_t *tlv_rw; + union { + snd_kcontrol_tlv_rw_t *c; + unsigned int *p; + } tlv; unsigned long private_value; void *private_data; void (*private_free)(struct snd_kcontrol *kcontrol); diff --git a/sound/core/control.c b/sound/core/control.c index f0c7272..31ad581 100644 --- a/sound/core/control.c +++ b/sound/core/control.c @@ -236,12 +236,16 @@ struct snd_kcontrol *snd_ctl_new1(const struct snd_kcontrol_new *ncontrol, kctl.id.index = ncontrol->index; kctl.count = ncontrol->count ? ncontrol->count : 1; access = ncontrol->access == 0 ? SNDRV_CTL_ELEM_ACCESS_READWRITE : - (ncontrol->access & (SNDRV_CTL_ELEM_ACCESS_READWRITE|SNDRV_CTL_ELEM_ACCESS_INACTIVE| - SNDRV_CTL_ELEM_ACCESS_DINDIRECT|SNDRV_CTL_ELEM_ACCESS_INDIRECT)); + (ncontrol->access & (SNDRV_CTL_ELEM_ACCESS_READWRITE| + SNDRV_CTL_ELEM_ACCESS_INACTIVE| + SNDRV_CTL_ELEM_ACCESS_DINDIRECT| + SNDRV_CTL_ELEM_ACCESS_INDIRECT| + SNDRV_CTL_ELEM_ACCESS_TLV_READWRITE| + SNDRV_CTL_ELEM_ACCESS_TLV_CALLBACK)); kctl.info = ncontrol->info; kctl.get = ncontrol->get; kctl.put = ncontrol->put; - kctl.tlv = ncontrol->tlv; + kctl.tlv.p = ncontrol->tlv.p; kctl.private_value = ncontrol->private_value; kctl.private_data = private_data; return snd_ctl_new(&kctl, access); @@ -883,6 +887,8 @@ struct user_element { struct snd_ctl_elem_info info; void *elem_data; /* element data */ unsigned long elem_data_size; /* size of element data in bytes */ + void *tlv_data; /* TLV data */ + unsigned long tlv_data_size; /* TLV data size */ void *priv_data; /* private data (like strings for enumerated type) */ unsigned long priv_data_size; /* size of private data in bytes */ }; @@ -917,9 +923,46 @@ static int snd_ctl_elem_user_put(struct snd_kcontrol *kcontrol, return change; } +static int snd_ctl_elem_user_tlv(struct snd_kcontrol *kcontrol, + int op_flag, + unsigned int size, + unsigned int __user *tlv) +{ + struct user_element *ue = kcontrol->private_data; + int change = 0; + void *new_data; + + if (op_flag > 0) { + if (size > 1024 * 128) /* sane value */ + return -EINVAL; + new_data = kmalloc(size, GFP_KERNEL); + if (new_data == NULL) + return -ENOMEM; + if (copy_from_user(new_data, tlv, size)) { + kfree(new_data); + return -EFAULT; + } + change = ue->tlv_data_size != size; + if (!change) + change = memcmp(ue->tlv_data, new_data, size); + kfree(ue->tlv_data); + ue->tlv_data = new_data; + ue->tlv_data_size = size; + } else { + if (size < ue->tlv_data_size) + return -ENOSPC; + if (copy_to_user(tlv, ue->tlv_data, ue->tlv_data_size)) + return -EFAULT; + } + return change; +} + static void snd_ctl_elem_user_free(struct snd_kcontrol *kcontrol) { - kfree(kcontrol->private_data); + struct user_element *ue = kcontrol->private_data; + if (ue->tlv_data) + kfree(ue->tlv_data); + kfree(ue); } static int snd_ctl_elem_add(struct snd_ctl_file *file, @@ -938,7 +981,8 @@ static int snd_ctl_elem_add(struct snd_ctl_file *file, return -EINVAL; access = info->access == 0 ? SNDRV_CTL_ELEM_ACCESS_READWRITE : (info->access & (SNDRV_CTL_ELEM_ACCESS_READWRITE| - SNDRV_CTL_ELEM_ACCESS_INACTIVE)); + SNDRV_CTL_ELEM_ACCESS_INACTIVE| + SNDRV_CTL_ELEM_ACCESS_TLV_READWRITE)); info->id.numid = 0; memset(&kctl, 0, sizeof(kctl)); down_write(&card->controls_rwsem); @@ -964,6 +1008,10 @@ static int snd_ctl_elem_add(struct snd_ctl_file *file, kctl.get = snd_ctl_elem_user_get; if (access & SNDRV_CTL_ELEM_ACCESS_WRITE) kctl.put = snd_ctl_elem_user_put; + if (access & SNDRV_CTL_ELEM_ACCESS_TLV_READWRITE) { + kctl.tlv.c = snd_ctl_elem_user_tlv; + access |= SNDRV_CTL_ELEM_ACCESS_TLV_CALLBACK; + } switch (info->type) { case SNDRV_CTL_ELEM_TYPE_BOOLEAN: private_size = sizeof(char); @@ -1068,38 +1116,65 @@ static int snd_ctl_subscribe_events(struct snd_ctl_file *file, int __user *ptr) return 0; } -static int snd_ctl_tlv_read(struct snd_card *card, - struct snd_ctl_tlv __user *_tlv) +static int snd_ctl_tlv_ioctl(struct snd_ctl_file *file, + struct snd_ctl_tlv __user *_tlv, + int op_flag) { + struct snd_card *card = file->card; struct snd_ctl_tlv tlv; struct snd_kcontrol *kctl; + struct snd_kcontrol_volatile *vd; unsigned int len; int err = 0; if (copy_from_user(&tlv, _tlv, sizeof(tlv))) return -EFAULT; - if (tlv.length < sizeof(unsigned int) * 3) - return -EINVAL; - down_read(&card->controls_rwsem); - kctl = snd_ctl_find_numid(card, tlv.numid); - if (kctl == NULL) { - err = -ENOENT; - goto __kctl_end; - } - if (kctl->tlv == NULL) { - err = -ENXIO; - goto __kctl_end; - } - len = kctl->tlv[1] + 2 * sizeof(unsigned int); - if (tlv.length < len) { - err = -ENOMEM; - goto __kctl_end; - } - if (copy_to_user(_tlv->tlv, kctl->tlv, len)) - err = -EFAULT; + if (tlv.length < sizeof(unsigned int) * 3) + return -EINVAL; + down_read(&card->controls_rwsem); + kctl = snd_ctl_find_numid(card, tlv.numid); + if (kctl == NULL) { + err = -ENOENT; + goto __kctl_end; + } + if (kctl->tlv.p == NULL) { + err = -ENXIO; + goto __kctl_end; + } + vd = &kctl->vd[tlv.numid - kctl->id.numid]; + if ((op_flag == 0 && (vd->access & SNDRV_CTL_ELEM_ACCESS_TLV_READ) == 0) || + (op_flag > 0 && (vd->access & SNDRV_CTL_ELEM_ACCESS_TLV_WRITE) == 0) || + (op_flag < 0 && (vd->access & SNDRV_CTL_ELEM_ACCESS_TLV_COMMAND) == 0)) { + err = -ENXIO; + goto __kctl_end; + } + if (vd->access & SNDRV_CTL_ELEM_ACCESS_TLV_CALLBACK) { + if (file && vd->owner != NULL && vd->owner != file) { + err = -EPERM; + goto __kctl_end; + } + err = kctl->tlv.c(kctl, op_flag, tlv.length, _tlv->tlv); + if (err > 0) { + up_read(&card->controls_rwsem); + snd_ctl_notify(card, SNDRV_CTL_EVENT_MASK_TLV, &kctl->id); + return 0; + } + } else { + if (op_flag) { + err = -ENXIO; + goto __kctl_end; + } + len = kctl->tlv.p[1] + 2 * sizeof(unsigned int); + if (tlv.length < len) { + err = -ENOMEM; + goto __kctl_end; + } + if (copy_to_user(_tlv->tlv, kctl->tlv.p, len)) + err = -EFAULT; + } __kctl_end: - up_read(&card->controls_rwsem); - return err; + up_read(&card->controls_rwsem); + return err; } static long snd_ctl_ioctl(struct file *file, unsigned int cmd, unsigned long arg) @@ -1140,8 +1215,12 @@ static long snd_ctl_ioctl(struct file *file, unsigned int cmd, unsigned long arg return snd_ctl_elem_remove(ctl, argp); case SNDRV_CTL_IOCTL_SUBSCRIBE_EVENTS: return snd_ctl_subscribe_events(ctl, ip); - case SNDRV_CTL_IOCTL_TLV_READ: - return snd_ctl_tlv_read(card, argp); + case SNDRV_CTL_IOCTL_TLV_READ: + return snd_ctl_tlv_ioctl(ctl, argp, 0); + case SNDRV_CTL_IOCTL_TLV_WRITE: + return snd_ctl_tlv_ioctl(ctl, argp, 1); + case SNDRV_CTL_IOCTL_TLV_COMMAND: + return snd_ctl_tlv_ioctl(ctl, argp, -1); case SNDRV_CTL_IOCTL_POWER: return -ENOPROTOOPT; case SNDRV_CTL_IOCTL_POWER_STATE: -- cgit v0.10.2 From 7f0e2f8bb851f5e0a2e0fef465b7b6f36c7aa7be Mon Sep 17 00:00:00 2001 From: Jaroslav Kysela Date: Wed, 5 Jul 2006 17:39:14 +0200 Subject: [ALSA] HDA codec - little code & comment cleanup Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_codec.h b/sound/pci/hda/hda_codec.h index 40520e9..c12bc4e 100644 --- a/sound/pci/hda/hda_codec.h +++ b/sound/pci/hda/hda_codec.h @@ -479,7 +479,7 @@ struct hda_codec_ops { struct hda_amp_info { u32 key; /* hash key */ u32 amp_caps; /* amp capabilities */ - u16 vol[2]; /* current volume & mute*/ + u16 vol[2]; /* current volume & mute */ u16 status; /* update flag */ u16 next; /* next link */ }; diff --git a/sound/pci/hda/hda_proc.c b/sound/pci/hda/hda_proc.c index c2f0fe8..d737f17 100644 --- a/sound/pci/hda/hda_proc.c +++ b/sound/pci/hda/hda_proc.c @@ -52,10 +52,9 @@ static void print_amp_caps(struct snd_info_buffer *buffer, struct hda_codec *codec, hda_nid_t nid, int dir) { unsigned int caps; - if (dir == HDA_OUTPUT) - caps = snd_hda_param_read(codec, nid, AC_PAR_AMP_OUT_CAP); - else - caps = snd_hda_param_read(codec, nid, AC_PAR_AMP_IN_CAP); + caps = snd_hda_param_read(codec, nid, + dir == HDA_OUTPUT ? + AC_PAR_AMP_OUT_CAP : AC_PAR_AMP_IN_CAP); if (caps == -1 || caps == 0) { snd_iprintf(buffer, "N/A\n"); return; @@ -74,10 +73,7 @@ static void print_amp_vals(struct snd_info_buffer *buffer, unsigned int val; int i; - if (dir == HDA_OUTPUT) - dir = AC_AMP_GET_OUTPUT; - else - dir = AC_AMP_GET_INPUT; + dir = dir == HDA_OUTPUT ? AC_AMP_GET_OUTPUT : AC_AMP_GET_INPUT; for (i = 0; i < indices; i++) { snd_iprintf(buffer, " ["); if (stereo) { -- cgit v0.10.2 From 302e9c5af4fb3ea258917ee6a32e9e45f578b231 Mon Sep 17 00:00:00 2001 From: Jaroslav Kysela Date: Wed, 5 Jul 2006 17:39:49 +0200 Subject: [ALSA] HDA codec & CA0106 - add/fix TLV support Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ca0106/ca0106_mixer.c b/sound/pci/ca0106/ca0106_mixer.c index 35309b3..df75270 100644 --- a/sound/pci/ca0106/ca0106_mixer.c +++ b/sound/pci/ca0106/ca0106_mixer.c @@ -472,10 +472,12 @@ static int snd_ca0106_i2c_volume_put(struct snd_kcontrol *kcontrol, #define CA_VOLUME(xname,chid,reg) \ { \ .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = xname, \ + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | \ + SNDRV_CTL_ELEM_ACCESS_TLV_READ, \ .info = snd_ca0106_volume_info, \ .get = snd_ca0106_volume_get, \ .put = snd_ca0106_volume_put, \ - .tlv = snd_ca0106_db_scale, \ + .tlv.p = snd_ca0106_db_scale, \ .private_value = ((chid) << 8) | (reg) \ } diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index 23201f3..78ff457 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -29,6 +29,7 @@ #include #include "hda_codec.h" #include +#include #include #include "hda_local.h" @@ -841,6 +842,38 @@ int snd_hda_mixer_amp_volume_put(struct snd_kcontrol *kcontrol, struct snd_ctl_e return change; } +int snd_hda_mixer_amp_tlv(struct snd_kcontrol *kcontrol, int op_flag, + unsigned int size, unsigned int __user *_tlv) +{ + struct hda_codec *codec = snd_kcontrol_chip(kcontrol); + hda_nid_t nid = get_amp_nid(kcontrol); + int dir = get_amp_direction(kcontrol); + u32 caps, val1, val2; + + if (size < 4 * sizeof(unsigned int)) + return -ENOMEM; + caps = query_amp_caps(codec, nid, dir); + val2 = (((caps & AC_AMPCAP_STEP_SIZE) >> AC_AMPCAP_STEP_SIZE_SHIFT) + 1) * 25; + val1 = -((caps & AC_AMPCAP_OFFSET) >> AC_AMPCAP_OFFSET_SHIFT); + val1 = ((int)val1) * ((int)val2); + if (caps & AC_AMPCAP_MUTE) + val2 |= 0x10000; + if ((val2 & 0x10000) == 0 && dir == HDA_OUTPUT) { + caps = query_amp_caps(codec, nid, HDA_INPUT); + if (caps & AC_AMPCAP_MUTE) + val2 |= 0x10000; + } + if (put_user(SNDRV_CTL_TLVT_DB_SCALE, _tlv)) + return -EFAULT; + if (put_user(2 * sizeof(unsigned int), _tlv + 1)) + return -EFAULT; + if (put_user(val1, _tlv + 2)) + return -EFAULT; + if (put_user(val2, _tlv + 3)) + return -EFAULT; + return 0; +} + /* switch */ int snd_hda_mixer_amp_switch_info(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo) { diff --git a/sound/pci/hda/hda_local.h b/sound/pci/hda/hda_local.h index 14e8aa2..0f0ae68 100644 --- a/sound/pci/hda/hda_local.h +++ b/sound/pci/hda/hda_local.h @@ -30,9 +30,13 @@ /* mono volume with index (index=0,1,...) (channel=1,2) */ #define HDA_CODEC_VOLUME_MONO_IDX(xname, xcidx, nid, channel, xindex, direction) \ { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = xname, .index = xcidx, \ + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | \ + SNDRV_CTL_ELEM_ACCESS_TLV_READ | \ + SNDRV_CTL_ELEM_ACCESS_TLV_CALLBACK, \ .info = snd_hda_mixer_amp_volume_info, \ .get = snd_hda_mixer_amp_volume_get, \ .put = snd_hda_mixer_amp_volume_put, \ + .tlv.c = snd_hda_mixer_amp_tlv, \ .private_value = HDA_COMPOSE_AMP_VAL(nid, channel, xindex, direction) } /* stereo volume with index */ #define HDA_CODEC_VOLUME_IDX(xname, xcidx, nid, xindex, direction) \ @@ -63,6 +67,7 @@ int snd_hda_mixer_amp_volume_info(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo); int snd_hda_mixer_amp_volume_get(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol); int snd_hda_mixer_amp_volume_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol); +int snd_hda_mixer_amp_tlv(struct snd_kcontrol *kcontrol, int op_flag, unsigned int size, unsigned int __user *tlv); int snd_hda_mixer_amp_switch_info(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo); int snd_hda_mixer_amp_switch_get(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol); int snd_hda_mixer_amp_switch_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol); diff --git a/sound/pci/hda/patch_analog.c b/sound/pci/hda/patch_analog.c index 6823f2b..54506d4 100644 --- a/sound/pci/hda/patch_analog.c +++ b/sound/pci/hda/patch_analog.c @@ -452,6 +452,19 @@ static int ad1986a_pcm_amp_vol_put(struct snd_kcontrol *kcontrol, struct snd_ctl return change; } +static int ad1986a_pcm_amp_tlv(struct snd_kcontrol *kcontrol, int op_flag, + unsigned int size, unsigned int __user *_tlv) +{ + struct hda_codec *codec = snd_kcontrol_chip(kcontrol); + struct ad198x_spec *ad = codec->spec; + + mutex_lock(&ad->amp_mutex); + snd_hda_mixer_amp_tlv(kcontrol, op_flag, size, _tlv); + mutex_unlock(&ad->amp_mutex); + return 0; +} + + #define ad1986a_pcm_amp_sw_info snd_hda_mixer_amp_switch_info static int ad1986a_pcm_amp_sw_get(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) @@ -488,9 +501,13 @@ static struct snd_kcontrol_new ad1986a_mixers[] = { { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = "PCM Playback Volume", + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ | + SNDRV_CTL_ELEM_ACCESS_TLV_CALLBACK, .info = ad1986a_pcm_amp_vol_info, .get = ad1986a_pcm_amp_vol_get, .put = ad1986a_pcm_amp_vol_put, + .tlv.c = ad1986a_pcm_amp_tlv, .private_value = HDA_COMPOSE_AMP_VAL(AD1986A_FRONT_DAC, 3, 0, HDA_OUTPUT) }, { -- cgit v0.10.2 From 22a27c7f8d0752b38b315d6a192c338d45ea28d5 Mon Sep 17 00:00:00 2001 From: Matt Porter Date: Thu, 6 Jul 2006 18:49:10 +0200 Subject: [ALSA] hda: fix sigmatel 9227/8/9 codec support SigmaTel 9227/8/9 IDs must use the 927x patch. Signed-off-by: Matt Porter Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index ea99083..ac96336 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -1525,12 +1525,12 @@ struct hda_codec_preset snd_hda_preset_sigmatel[] = { { .id = 0x83847681, .name = "STAC9220D/9223D A2", .patch = patch_stac922x }, { .id = 0x83847682, .name = "STAC9221 A2", .patch = patch_stac922x }, { .id = 0x83847683, .name = "STAC9221D A2", .patch = patch_stac922x }, - { .id = 0x83847618, .name = "STAC9227", .patch = patch_stac922x }, - { .id = 0x83847619, .name = "STAC9227", .patch = patch_stac922x }, - { .id = 0x83847616, .name = "STAC9228", .patch = patch_stac922x }, - { .id = 0x83847617, .name = "STAC9228", .patch = patch_stac922x }, - { .id = 0x83847614, .name = "STAC9229", .patch = patch_stac922x }, - { .id = 0x83847615, .name = "STAC9229", .patch = patch_stac922x }, + { .id = 0x83847618, .name = "STAC9227", .patch = patch_stac927x }, + { .id = 0x83847619, .name = "STAC9227", .patch = patch_stac927x }, + { .id = 0x83847616, .name = "STAC9228", .patch = patch_stac927x }, + { .id = 0x83847617, .name = "STAC9228", .patch = patch_stac927x }, + { .id = 0x83847614, .name = "STAC9229", .patch = patch_stac927x }, + { .id = 0x83847615, .name = "STAC9229", .patch = patch_stac927x }, { .id = 0x83847620, .name = "STAC9274", .patch = patch_stac927x }, { .id = 0x83847621, .name = "STAC9274D", .patch = patch_stac927x }, { .id = 0x83847622, .name = "STAC9273X", .patch = patch_stac927x }, -- cgit v0.10.2 From 11b3a7555aa1b1629614e919889a4479dfe6f37b Mon Sep 17 00:00:00 2001 From: James Courtier-Dutton Date: Sat, 8 Jul 2006 16:39:30 +0100 Subject: [ALSA] snd-emu10k1: Implement 24bit capture via Philips 1361T ADC for SB0240 card. Signed-off-by: James Courtier-Dutton Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/emu10k1/emu10k1_main.c b/sound/pci/emu10k1/emu10k1_main.c index 79f24cd..be65d4d 100644 --- a/sound/pci/emu10k1/emu10k1_main.c +++ b/sound/pci/emu10k1/emu10k1_main.c @@ -927,6 +927,7 @@ static struct snd_emu_chip_details emu_chip_details[] = { .ca0151_chip = 1, .spk71 = 1, .spdif_bug = 1, + .adc_1361t = 1, /* 24 bit capture instead of 16bit */ .ac97_chip = 1} , {.vendor = 0x1102, .device = 0x0004, .subsystem = 0x10051102, .driver = "Audigy2", .name = "Audigy 2 EX [1005]", -- cgit v0.10.2 From 6a65d793b0a82c7e190d9fd92a479401b6a127ca Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 14 Jul 2006 14:39:34 +0200 Subject: [ALSA] Remove unused tlv_rw field from struct snd_kcontrol Remove unused tlv_rw field from struct snd_kcontrol. The callback is set in tlv.c field, instead. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/include/sound/control.h b/include/sound/control.h index e3905c5..1de148b 100644 --- a/include/sound/control.h +++ b/include/sound/control.h @@ -67,7 +67,6 @@ struct snd_kcontrol { snd_kcontrol_info_t *info; snd_kcontrol_get_t *get; snd_kcontrol_put_t *put; - snd_kcontrol_tlv_rw_t *tlv_rw; union { snd_kcontrol_tlv_rw_t *c; unsigned int *p; -- cgit v0.10.2 From 7bc5ba7e02f63a5732fdf99e7471f54738f6f918 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 14 Jul 2006 15:18:19 +0200 Subject: [ALSA] Add TLV support to snd-usb-audio driver Added TLV-read support to snd-usb-audio driver for passing the volume dB scale information to user-space. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/usb/usbmixer.c b/sound/usb/usbmixer.c index 491e975..e516d6a 100644 --- a/sound/usb/usbmixer.c +++ b/sound/usb/usbmixer.c @@ -37,6 +37,7 @@ #include #include #include +#include #include "usbaudio.h" @@ -416,6 +417,26 @@ static inline int set_cur_mix_value(struct usb_mixer_elem_info *cval, int channe return set_ctl_value(cval, SET_CUR, (cval->control << 8) | channel, value); } +/* + * TLV callback for mixer volume controls + */ +static int mixer_vol_tlv(struct snd_kcontrol *kcontrol, int op_flag, + unsigned int size, unsigned int __user *_tlv) +{ + struct usb_mixer_elem_info *cval = kcontrol->private_data; + DECLARE_TLV_DB_SCALE(scale, 0, 0, 0); + + if (size < sizeof(scale)) + return -ENOMEM; + /* USB descriptions contain the dB scale in 1/256 dB unit + * while ALSA TLV contains in 1/100 dB unit + */ + scale[2] = (convert_signed_value(cval, cval->min) * 100) / 256; + scale[3] = (convert_signed_value(cval, cval->res) * 100) / 256; + if (copy_to_user(_tlv, scale, sizeof(scale))) + return -EFAULT; + return 0; +} /* * parser routines begin here... @@ -933,6 +954,12 @@ static void build_feature_ctl(struct mixer_build *state, unsigned char *desc, } strlcat(kctl->id.name + len, control == USB_FEATURE_MUTE ? " Switch" : " Volume", sizeof(kctl->id.name)); + if (control == USB_FEATURE_VOLUME) { + kctl->tlv.c = mixer_vol_tlv; + kctl->vd[0].access |= + SNDRV_CTL_ELEM_ACCESS_TLV_READ | + SNDRV_CTL_ELEM_ACCESS_TLV_CALLBACK; + } break; default: -- cgit v0.10.2 From 17f48ec3f15ddb8080b151304ee887c68f7e4650 Mon Sep 17 00:00:00 2001 From: Clemens Ladisch Date: Mon, 17 Jul 2006 16:50:56 +0200 Subject: [ALSA] system timer: fix lost ticks correction adjustment Fix the adjustment of the lost ticks correction variable in the case when the correction has been fully taken into account in the next timer expiration value. Subtracting the scheduled ticks value would result in an underflow. Signed-off-by: Clemens Ladisch Signed-off-by: Jaroslav Kysela diff --git a/sound/core/timer.c b/sound/core/timer.c index 7e5e562..8635700 100644 --- a/sound/core/timer.c +++ b/sound/core/timer.c @@ -987,7 +987,7 @@ static int snd_timer_s_start(struct snd_timer * timer) njiff++; } else { njiff += timer->sticks - priv->correction; - priv->correction -= timer->sticks; + priv->correction = 0; } priv->last_expires = priv->tlist.expires = njiff; add_timer(&priv->tlist); -- cgit v0.10.2 From 6ed5eff025b72cb84a884d4be05f854f13b1542f Mon Sep 17 00:00:00 2001 From: Clemens Ladisch Date: Mon, 17 Jul 2006 16:51:37 +0200 Subject: [ALSA] system timer: accumulate correction for multiple lost ticks When multiple timer interrupts arrive too late, correct for all delays instead of ignoring the earlier ones. Signed-off-by: Clemens Ladisch Signed-off-by: Jaroslav Kysela diff --git a/sound/core/timer.c b/sound/core/timer.c index 8635700..0f6e672 100644 --- a/sound/core/timer.c +++ b/sound/core/timer.c @@ -971,7 +971,7 @@ static void snd_timer_s_function(unsigned long data) struct snd_timer_system_private *priv = timer->private_data; unsigned long jiff = jiffies; if (time_after(jiff, priv->last_expires)) - priv->correction = (long)jiff - (long)priv->last_expires; + priv->correction += (long)jiff - (long)priv->last_expires; snd_timer_interrupt(timer, (long)jiff - (long)priv->last_jiffies); } -- cgit v0.10.2 From de2696d8bc9c81874b3743e0c27708760cb7fb52 Mon Sep 17 00:00:00 2001 From: Clemens Ladisch Date: Mon, 17 Jul 2006 16:52:09 +0200 Subject: [ALSA] system timer: clear correction value when timer stops Do not retain the old correction value when the timer was stopped. Signed-off-by: Clemens Ladisch Signed-off-by: Jaroslav Kysela diff --git a/sound/core/timer.c b/sound/core/timer.c index 0f6e672..4fcc854 100644 --- a/sound/core/timer.c +++ b/sound/core/timer.c @@ -1006,6 +1006,7 @@ static int snd_timer_s_stop(struct snd_timer * timer) timer->sticks = priv->last_expires - jiff; else timer->sticks = 1; + priv->correction = 0; return 0; } -- cgit v0.10.2 From cd93fe4770ca607c7f39260c02941deccbd77b8b Mon Sep 17 00:00:00 2001 From: Clemens Ladisch Date: Mon, 17 Jul 2006 16:53:57 +0200 Subject: [ALSA] timer: fix timer rescheduling When checking whether a hardware timer needs to be rescheduled, we have to compare against the previously scheduled interval and not against the actual interval between the last two interrupts. Signed-off-by: Clemens Ladisch Signed-off-by: Jaroslav Kysela diff --git a/sound/core/timer.c b/sound/core/timer.c index 4fcc854..3a2f8e2 100644 --- a/sound/core/timer.c +++ b/sound/core/timer.c @@ -718,7 +718,7 @@ void snd_timer_interrupt(struct snd_timer * timer, unsigned long ticks_left) } } if (timer->flags & SNDRV_TIMER_FLG_RESCHED) - snd_timer_reschedule(timer, ticks_left); + snd_timer_reschedule(timer, timer->sticks); if (timer->running) { if (timer->hw.flags & SNDRV_TIMER_HW_STOP) { timer->hw.stop(timer); -- cgit v0.10.2 From 6e9059b05fa733045d7845ac70c5ba0a05e3c2d1 Mon Sep 17 00:00:00 2001 From: Clemens Ladisch Date: Fri, 21 Jul 2006 10:45:19 +0200 Subject: [ALSA] system timer: remove unused snd_timer_system_private.timer field Remove the snd_timer_system_private structure's timer field that was never used. Signed-off-by: Clemens Ladisch Signed-off-by: Jaroslav Kysela diff --git a/sound/core/timer.c b/sound/core/timer.c index 3a2f8e2..10a79ae 100644 --- a/sound/core/timer.c +++ b/sound/core/timer.c @@ -959,7 +959,6 @@ int snd_timer_global_register(struct snd_timer *timer) struct snd_timer_system_private { struct timer_list tlist; - struct timer * timer; unsigned long last_expires; unsigned long last_jiffies; unsigned long correction; -- cgit v0.10.2 From f1265391ea002a28933dc1a8a55948c0ed64c9d0 Mon Sep 17 00:00:00 2001 From: Clemens Ladisch Date: Fri, 21 Jul 2006 10:46:18 +0200 Subject: [ALSA] usb-audio: add more Yamaha devices Add some quirks for some unknown Yamaha USB MIDI devices. Signed-off-by: Clemens Ladisch Signed-off-by: Jaroslav Kysela diff --git a/sound/usb/usbquirks.h b/sound/usb/usbquirks.h index 9351846..a7e9563 100644 --- a/sound/usb/usbquirks.h +++ b/sound/usb/usbquirks.h @@ -123,6 +123,10 @@ YAMAHA_DEVICE(0x103e, NULL), YAMAHA_DEVICE(0x103f, NULL), YAMAHA_DEVICE(0x1040, NULL), YAMAHA_DEVICE(0x1041, NULL), +YAMAHA_DEVICE(0x1042, NULL), +YAMAHA_DEVICE(0x1043, NULL), +YAMAHA_DEVICE(0x1044, NULL), +YAMAHA_DEVICE(0x1045, NULL), YAMAHA_DEVICE(0x2000, "DGP-7"), YAMAHA_DEVICE(0x2001, "DGP-5"), YAMAHA_DEVICE(0x2002, NULL), @@ -141,6 +145,7 @@ YAMAHA_DEVICE(0x500b, "DME64N"), YAMAHA_DEVICE(0x500c, "DME24N"), YAMAHA_DEVICE(0x500d, NULL), YAMAHA_DEVICE(0x500e, NULL), +YAMAHA_DEVICE(0x500f, NULL), YAMAHA_DEVICE(0x7000, "DTX"), YAMAHA_DEVICE(0x7010, "UB99"), #undef YAMAHA_DEVICE -- cgit v0.10.2 From fff36e472b4315df77513f4339c5c199c6aad28b Mon Sep 17 00:00:00 2001 From: James Courtier-Dutton Date: Sat, 22 Jul 2006 15:02:48 +0100 Subject: [ALSA] snd-ca0106: Fix dB gain TLVs. Signed-off-by: James Courtier-Dutton Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ca0106/ca0106_mixer.c b/sound/pci/ca0106/ca0106_mixer.c index df75270..6d64438 100644 --- a/sound/pci/ca0106/ca0106_mixer.c +++ b/sound/pci/ca0106/ca0106_mixer.c @@ -74,7 +74,8 @@ #include "ca0106.h" -static DECLARE_TLV_DB_SCALE(snd_ca0106_db_scale, -5150, 75, 1); +static DECLARE_TLV_DB_SCALE(snd_ca0106_db_scale1, -5175, 25, 1); +static DECLARE_TLV_DB_SCALE(snd_ca0106_db_scale2, -10350, 50, 1); static int snd_ca0106_shared_spdif_info(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo) @@ -477,16 +478,19 @@ static int snd_ca0106_i2c_volume_put(struct snd_kcontrol *kcontrol, .info = snd_ca0106_volume_info, \ .get = snd_ca0106_volume_get, \ .put = snd_ca0106_volume_put, \ - .tlv.p = snd_ca0106_db_scale, \ + .tlv.p = snd_ca0106_db_scale1, \ .private_value = ((chid) << 8) | (reg) \ } #define I2C_VOLUME(xname,chid) \ { \ .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = xname, \ + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | \ + SNDRV_CTL_ELEM_ACCESS_TLV_READ, \ .info = snd_ca0106_i2c_volume_info, \ .get = snd_ca0106_i2c_volume_get, \ .put = snd_ca0106_i2c_volume_put, \ + .tlv.p = snd_ca0106_db_scale2, \ .private_value = chid \ } -- cgit v0.10.2 From 31508f83f591dc8764427b6321c89f8f9e84bad2 Mon Sep 17 00:00:00 2001 From: James Courtier-Dutton Date: Sat, 22 Jul 2006 17:02:10 +0100 Subject: [ALSA] snd-emu10k1: Implement dB gain infomation. Signed-off-by: James Courtier-Dutton Signed-off-by: Jaroslav Kysela diff --git a/include/sound/emu10k1.h b/include/sound/emu10k1.h index 884bbf5..892e310 100644 --- a/include/sound/emu10k1.h +++ b/include/sound/emu10k1.h @@ -1524,6 +1524,10 @@ struct snd_emu10k1_fx8010_control_gpr { unsigned int value[32]; /* initial values */ unsigned int min; /* minimum range */ unsigned int max; /* maximum range */ + union { + snd_kcontrol_tlv_rw_t *c; + unsigned int *p; + } tlv; unsigned int translation; /* translation type (EMU10K1_GPR_TRANSLATION*) */ }; diff --git a/sound/pci/emu10k1/emufx.c b/sound/pci/emu10k1/emufx.c index dfba002..00fc904 100644 --- a/sound/pci/emu10k1/emufx.c +++ b/sound/pci/emu10k1/emufx.c @@ -35,6 +35,7 @@ #include #include +#include #include #if 0 /* for testing purposes - digital out -> capture */ @@ -290,6 +291,9 @@ static const u32 db_table[101] = { 0x7fffffff, }; +/* EMU10k1/EMU10k2 DSP control db gain */ +static DECLARE_TLV_DB_SCALE(snd_emu10k1_db_scale1, -4000, 40, 1); + static const u32 onoff_table[2] = { 0x00000000, 0x00000001 }; @@ -755,6 +759,11 @@ static int snd_emu10k1_add_controls(struct snd_emu10k1 *emu, knew.device = gctl->id.device; knew.subdevice = gctl->id.subdevice; knew.info = snd_emu10k1_gpr_ctl_info; + if (gctl->tlv.p) { + knew.tlv.p = gctl->tlv.p; + knew.access = SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ; + } knew.get = snd_emu10k1_gpr_ctl_get; knew.put = snd_emu10k1_gpr_ctl_put; memset(nctl, 0, sizeof(*nctl)); @@ -1013,6 +1022,7 @@ snd_emu10k1_init_mono_control(struct snd_emu10k1_fx8010_control_gpr *ctl, ctl->gpr[0] = gpr + 0; ctl->value[0] = defval; ctl->min = 0; ctl->max = 100; + ctl->tlv.p = snd_emu10k1_db_scale1; ctl->translation = EMU10K1_GPR_TRANSLATION_TABLE100; } @@ -1027,6 +1037,7 @@ snd_emu10k1_init_stereo_control(struct snd_emu10k1_fx8010_control_gpr *ctl, ctl->gpr[1] = gpr + 1; ctl->value[1] = defval; ctl->min = 0; ctl->max = 100; + ctl->tlv.p = snd_emu10k1_db_scale1; ctl->translation = EMU10K1_GPR_TRANSLATION_TABLE100; } diff --git a/sound/pci/emu10k1/p16v.c b/sound/pci/emu10k1/p16v.c index 9905651..1e44714 100644 --- a/sound/pci/emu10k1/p16v.c +++ b/sound/pci/emu10k1/p16v.c @@ -100,6 +100,7 @@ #include #include #include +#include #include #include "p16v.h" @@ -784,12 +785,16 @@ static int snd_p16v_capture_channel_put(struct snd_kcontrol *kcontrol, } return change; } +static DECLARE_TLV_DB_SCALE(snd_p16v_db_scale1, -5175, 25, 1); #define P16V_VOL(xname,xreg,xhl) { \ .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = xname, \ + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | \ + SNDRV_CTL_ELEM_ACCESS_TLV_READ, \ .info = snd_p16v_volume_info, \ .get = snd_p16v_volume_get, \ .put = snd_p16v_volume_put, \ + .tlv.p = snd_p16v_db_scale1, \ .private_value = ((xreg) | ((xhl) << 8)) \ } -- cgit v0.10.2 From 0a197f005a27766f5c9e0d960e7650748ec1ee4f Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 25 Jul 2006 14:51:14 +0200 Subject: [ALSA] Add model entry for Samsung X10 laptop Added the proper model entry (laptop-eapd) for Samsung X10-T2300 Culesa laptop with AD1986A codec. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_analog.c b/sound/pci/hda/patch_analog.c index 54506d4..e547442 100644 --- a/sound/pci/hda/patch_analog.c +++ b/sound/pci/hda/patch_analog.c @@ -820,6 +820,8 @@ static struct hda_board_config ad1986a_cfg_tbl[] = { .config = AD1986A_LAPTOP_EAPD }, /* Samsung X60 Chane */ { .pci_subvendor = 0x144d, .pci_subdevice = 0xc024, .config = AD1986A_LAPTOP_EAPD }, /* Samsung R65-T2300 Charis */ + { .pci_subvendor = 0x144d, .pci_subdevice = 0xc026, + .config = AD1986A_LAPTOP_EAPD }, /* Samsung X10-T2300 Culesa */ { .pci_subvendor = 0x1043, .pci_subdevice = 0x1153, .config = AD1986A_LAPTOP_EAPD }, /* ASUS M9 */ { .pci_subvendor = 0x1043, .pci_subdevice = 0x1213, -- cgit v0.10.2 From 5a053d012d0576e9306009939ca81a86547ef35a Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 25 Jul 2006 14:51:15 +0200 Subject: [ALSA] Add model entry for Clevo m665n laptop Added the proper model entry for Clevo m665n laptop with ALC880 codec. Also, added a model string 'clevo' to enable the clevo-type model option. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index f61af23..74ea66d3 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -783,6 +783,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. F1734 2-jack lg LG laptop (m1 express dual) lg-lw LG LW20 laptop + clevo Clevo laptops (m520G, m665n) test for testing/debugging purpose, almost all controls can be adjusted. Appearing only when compiled with $CONFIG_SND_DEBUG=y diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 18d1052..f4c96aa 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -2156,8 +2156,13 @@ static struct hda_board_config alc880_cfg_tbl[] = { { .modelname = "3stack-digout", .config = ALC880_3ST_DIG }, { .pci_subvendor = 0x8086, .pci_subdevice = 0xe308, .config = ALC880_3ST_DIG }, { .pci_subvendor = 0x1025, .pci_subdevice = 0x0070, .config = ALC880_3ST_DIG }, - /* Clevo m520G NB */ - { .pci_subvendor = 0x1558, .pci_subdevice = 0x0520, .config = ALC880_CLEVO }, + + /* Clevo laptops */ + { .modelname = "clevo", .config = ALC880_CLEVO }, + { .pci_subvendor = 0x1558, .pci_subdevice = 0x0520, + .config = ALC880_CLEVO }, /* Clevo m520G NB */ + { .pci_subvendor = 0x1558, .pci_subdevice = 0x0660, + .config = ALC880_CLEVO }, /* Clevo m665n */ /* Back 3 jack plus 1 SPDIF out jack, front 2 jack (Internal add Aux-In)*/ { .pci_subvendor = 0x8086, .pci_subdevice = 0xe305, .config = ALC880_3ST_DIG }, -- cgit v0.10.2 From 6d177ba7839dd7ed391c2f36b121eb09d1eaee4c Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 25 Jul 2006 14:51:15 +0200 Subject: [ALSA] Add hp-bpc model type for HP laptops Added 'hp-bpc' model type for HP xw4400-compatible laptops. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 74ea66d3..c595acb 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -798,6 +798,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. ALC262 fujitsu Fujitsu Laptop + hp-bpc HP xw4400/6400/8400/9400 laptops basic fixed pin assignment w/o SPDIF auto auto-config reading BIOS (default) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index f4c96aa..51f76ee 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -5774,6 +5774,7 @@ static struct hda_board_config alc262_cfg_tbl[] = { { .modelname = "fujitsu", .config = ALC262_FUJITSU }, { .pci_subvendor = 0x10cf, .pci_subdevice = 0x1397, .config = ALC262_FUJITSU }, + { .modelname = "hp-bpc", .config = ALC262_HP_BPC }, { .pci_subvendor = 0x103c, .pci_subdevice = 0x208c, .config = ALC262_HP_BPC }, /* xw4400 */ { .pci_subvendor = 0x103c, .pci_subdevice = 0x3014, -- cgit v0.10.2 From 304dcaac91f0d26543b31fd7e63726f096c826ee Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 25 Jul 2006 14:51:16 +0200 Subject: [ALSA] Add support of Benq laptop with ALC262 Added the support of Benq laptop with ALC262 codec. A model string 'benq' is added, too. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index c595acb..885d2ed 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -799,6 +799,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. ALC262 fujitsu Fujitsu Laptop hp-bpc HP xw4400/6400/8400/9400 laptops + benq Benq ED8 basic fixed pin assignment w/o SPDIF auto auto-config reading BIOS (default) diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 51f76ee..42c4f90 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -79,6 +79,7 @@ enum { ALC262_BASIC, ALC262_FUJITSU, ALC262_HP_BPC, + ALC262_BENQ_ED8, ALC262_AUTO, ALC262_MODEL_LAST /* last tag */ }; @@ -5504,6 +5505,13 @@ static struct snd_kcontrol_new alc262_fujitsu_mixer[] = { { } /* end */ }; +/* additional init verbs for Benq laptops */ +static struct hda_verb alc262_EAPD_verbs[] = { + {0x20, AC_VERB_SET_COEF_INDEX, 0x07}, + {0x20, AC_VERB_SET_PROC_COEF, 0x3070}, + {} +}; + /* add playback controls from the parsed DAC table */ static int alc262_auto_create_multi_out_ctls(struct alc_spec *spec, const struct auto_pin_cfg *cfg) { @@ -5783,6 +5791,9 @@ static struct hda_board_config alc262_cfg_tbl[] = { .config = ALC262_HP_BPC }, /* xw8400 */ { .pci_subvendor = 0x103c, .pci_subdevice = 0x12fe, .config = ALC262_HP_BPC }, /* xw9400 */ + { .modelname = "benq", .config = ALC262_BENQ_ED8 }, + { .pci_subvendor = 0x17ff, .pci_subdevice = 0x0560, + .config = ALC262_BENQ_ED8 }, { .modelname = "auto", .config = ALC262_AUTO }, {} }; @@ -5820,6 +5831,16 @@ static struct alc_config_preset alc262_presets[] = { .channel_mode = alc262_modes, .input_mux = &alc262_HP_capture_source, }, + [ALC262_BENQ_ED8] = { + .mixers = { alc262_base_mixer }, + .init_verbs = { alc262_init_verbs, alc262_EAPD_verbs }, + .num_dacs = ARRAY_SIZE(alc262_dac_nids), + .dac_nids = alc262_dac_nids, + .hp_nid = 0x03, + .num_channel_mode = ARRAY_SIZE(alc262_modes), + .channel_mode = alc262_modes, + .input_mux = &alc262_capture_source, + }, }; static int patch_alc262(struct hda_codec *codec) -- cgit v0.10.2 From 827a56ea3d9c3d5f80c5520ba9d487f9b7069238 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 25 Jul 2006 14:51:16 +0200 Subject: [ALSA] Added model for ASUS M2NPV-VM mobo Added the proper model (3stack) for ASUS M2NPV-VM mobo with AD1986A codec. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_analog.c b/sound/pci/hda/patch_analog.c index e547442..8955397 100644 --- a/sound/pci/hda/patch_analog.c +++ b/sound/pci/hda/patch_analog.c @@ -808,6 +808,8 @@ static struct hda_board_config ad1986a_cfg_tbl[] = { .config = AD1986A_3STACK }, /* ASUS A8N-VM CSM */ { .pci_subvendor = 0x1043, .pci_subdevice = 0x81b3, .config = AD1986A_3STACK }, /* ASUS P5RD2-VM / P5GPL-X SE */ + { .pci_subvendor = 0x1043, .pci_subdevice = 0x81cb, + .config = AD1986A_3STACK }, /* ASUS M2NPV-VM */ { .modelname = "laptop", .config = AD1986A_LAPTOP }, { .pci_subvendor = 0x144d, .pci_subdevice = 0xc01e, .config = AD1986A_LAPTOP }, /* FSC V2060 */ -- cgit v0.10.2 From b7c6b03405896bc181e1e2c9c06628c3b1681af5 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 25 Jul 2006 15:29:37 +0200 Subject: [ALSA] via82xx - Add dxs_support entry for a FSC machine Added dxs_support=5 entry for a FSC machine. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/via82xx.c b/sound/pci/via82xx.c index 2c23a66..e0e3bfd 100644 --- a/sound/pci/via82xx.c +++ b/sound/pci/via82xx.c @@ -2404,6 +2404,7 @@ static int __devinit check_dxs_list(struct pci_dev *pci, int revision) { .subvendor = 0x16f3, .subdevice = 0x6405, .action = VIA_DXS_SRC }, /* Jetway K8M8MS */ { .subvendor = 0x1734, .subdevice = 0x1078, .action = VIA_DXS_SRC }, /* FSC Amilo L7300 */ { .subvendor = 0x1734, .subdevice = 0x1093, .action = VIA_DXS_SRC }, /* FSC */ + { .subvendor = 0x1734, .subdevice = 0x10ab, .action = VIA_DXS_SRC }, /* FSC */ { .subvendor = 0x1849, .subdevice = 0x3059, .action = VIA_DXS_NO_VRA }, /* ASRock K7VM2 */ { .subvendor = 0x1849, .subdevice = 0x9739, .action = VIA_DXS_SRC }, /* ASRock mobo(?) */ { .subvendor = 0x1849, .subdevice = 0x9761, .action = VIA_DXS_SRC }, /* ASRock mobo(?) */ -- cgit v0.10.2 From bc6c531eb53de8a0ba355f76ce2bd28f58e46707 Mon Sep 17 00:00:00 2001 From: Jaroslav Kysela Date: Thu, 27 Jul 2006 10:44:30 +0200 Subject: [ALSA] HDA driver - do not set mute flag for dB scale (follow HDA specification) Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index 78ff457..399860c 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -856,13 +856,6 @@ int snd_hda_mixer_amp_tlv(struct snd_kcontrol *kcontrol, int op_flag, val2 = (((caps & AC_AMPCAP_STEP_SIZE) >> AC_AMPCAP_STEP_SIZE_SHIFT) + 1) * 25; val1 = -((caps & AC_AMPCAP_OFFSET) >> AC_AMPCAP_OFFSET_SHIFT); val1 = ((int)val1) * ((int)val2); - if (caps & AC_AMPCAP_MUTE) - val2 |= 0x10000; - if ((val2 & 0x10000) == 0 && dir == HDA_OUTPUT) { - caps = query_amp_caps(codec, nid, HDA_INPUT); - if (caps & AC_AMPCAP_MUTE) - val2 |= 0x10000; - } if (put_user(SNDRV_CTL_TLVT_DB_SCALE, _tlv)) return -EFAULT; if (put_user(2 * sizeof(unsigned int), _tlv + 1)) -- cgit v0.10.2 From 9265d199616630c2eb993ffe40c9daef3d6873b3 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 27 Jul 2006 15:50:14 +0200 Subject: [ALSA] Fix Makefile of cs5535audio Use ifeq instead of ifdef in Makefile to make the maintenance of out-of-kernel tree easier. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/cs5535audio/Makefile b/sound/pci/cs5535audio/Makefile index 2911a8a..ad947b4 100644 --- a/sound/pci/cs5535audio/Makefile +++ b/sound/pci/cs5535audio/Makefile @@ -4,7 +4,7 @@ snd-cs5535audio-objs := cs5535audio.o cs5535audio_pcm.o -ifdef CONFIG_PM +ifeq ($(CONFIG_PM),y) snd-cs5535audio-objs += cs5535audio_pm.o endif -- cgit v0.10.2 From 68ab801e32bbe2caac8b8c6e6e94f41fe7d687ad Mon Sep 17 00:00:00 2001 From: Matthias Koenig Date: Thu, 27 Jul 2006 16:59:23 +0200 Subject: [ALSA] Add snd-mts64 driver for ESI Miditerminal 4140 Added snd-mts64 driver for Ego Systems (ESI) Miditerminal 4140 by Matthias Koenig . The driver requires parport (CONFIG_PARPORT). Signed-off-by: Matthias Koenig Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 885d2ed..7344815 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -1216,6 +1216,14 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. Module supports only 1 card. This module has no enable option. + Module snd-mts64 + ---------------- + + Module for Ego Systems (ESI) Miditerminal 4140 + + This module supports multiple devices. + Requires parport (CONFIG_PARPORT). + Module snd-nm256 ---------------- diff --git a/sound/drivers/Kconfig b/sound/drivers/Kconfig index 897dc2d..952c7f1 100644 --- a/sound/drivers/Kconfig +++ b/sound/drivers/Kconfig @@ -73,6 +73,19 @@ config SND_MTPAV To compile this driver as a module, choose M here: the module will be called snd-mtpav. +config SND_MTS64 + tristate "ESI Miditerminal 4140 driver" + depends on SND && PARPORT + select SND_RAWMIDI + help + The ESI Miditerminal 4140 is a 4 In 4 Out MIDI Interface with + additional SMPTE Timecode capabilities for the parallel port. + + Say 'Y' to include support for this device. + + To compile this driver as a module, chose 'M' here: the module + will be called snd-mts64. + config SND_SERIAL_U16550 tristate "UART16550 serial MIDI driver" depends on SND diff --git a/sound/drivers/Makefile b/sound/drivers/Makefile index cb98c3d..c9bad6d 100644 --- a/sound/drivers/Makefile +++ b/sound/drivers/Makefile @@ -5,6 +5,7 @@ snd-dummy-objs := dummy.o snd-mtpav-objs := mtpav.o +snd-mts64-objs := mts64.o snd-serial-u16550-objs := serial-u16550.o snd-virmidi-objs := virmidi.o @@ -13,5 +14,6 @@ obj-$(CONFIG_SND_DUMMY) += snd-dummy.o obj-$(CONFIG_SND_VIRMIDI) += snd-virmidi.o obj-$(CONFIG_SND_SERIAL_U16550) += snd-serial-u16550.o obj-$(CONFIG_SND_MTPAV) += snd-mtpav.o +obj-$(CONFIG_SND_MTS64) += snd-mts64.o obj-$(CONFIG_SND) += opl3/ opl4/ mpu401/ vx/ diff --git a/sound/drivers/mts64.c b/sound/drivers/mts64.c new file mode 100644 index 0000000..1699873 --- /dev/null +++ b/sound/drivers/mts64.c @@ -0,0 +1,1091 @@ +/* + * ALSA Driver for Ego Systems Inc. (ESI) Miditerminal 4140 + * Copyright (c) 2006 by Matthias König + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; if not, write to the Free Software + * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define CARD_NAME "Miditerminal 4140" +#define DRIVER_NAME "MTS64" +#define PLATFORM_DRIVER "snd_mts64" + +static int index[SNDRV_CARDS] = SNDRV_DEFAULT_IDX; +static char *id[SNDRV_CARDS] = SNDRV_DEFAULT_STR; +static int enable[SNDRV_CARDS] = SNDRV_DEFAULT_ENABLE_PNP; + +static struct platform_device *platform_devices[SNDRV_CARDS]; +static int device_count; + +module_param_array(index, int, NULL, S_IRUGO); +MODULE_PARM_DESC(index, "Index value for " CARD_NAME " soundcard."); +module_param_array(id, charp, NULL, S_IRUGO); +MODULE_PARM_DESC(id, "ID string for " CARD_NAME " soundcard."); +module_param_array(enable, bool, NULL, S_IRUGO); +MODULE_PARM_DESC(enable, "Enable " CARD_NAME " soundcard."); + +MODULE_AUTHOR("Matthias Koenig "); +MODULE_DESCRIPTION("ESI Miditerminal 4140"); +MODULE_LICENSE("GPL"); +MODULE_SUPPORTED_DEVICE("{{ESI,Miditerminal 4140}}"); + +/********************************************************************* + * Chip specific + *********************************************************************/ +#define MTS64_NUM_INPUT_PORTS 5 +#define MTS64_NUM_OUTPUT_PORTS 4 +#define MTS64_SMPTE_SUBSTREAM 4 + +struct mts64 { + spinlock_t lock; + struct snd_card *card; + struct snd_rawmidi *rmidi; + struct pardevice *pardev; + int pardev_claimed; + + int open_count; + int current_midi_output_port; + int current_midi_input_port; + u8 mode[MTS64_NUM_INPUT_PORTS]; + struct snd_rawmidi_substream *midi_input_substream[MTS64_NUM_INPUT_PORTS]; + int smpte_switch; + u8 time[4]; /* [0]=hh, [1]=mm, [2]=ss, [3]=ff */ + u8 fps; +}; + +static int snd_mts64_free(struct mts64 *mts) +{ + kfree(mts); + return 0; +} + +static int __devinit snd_mts64_create(struct snd_card *card, + struct pardevice *pardev, + struct mts64 **rchip) +{ + struct mts64 *mts; + + *rchip = NULL; + + mts = kzalloc(sizeof(struct mts64), GFP_KERNEL); + if (mts == NULL) + return -ENOMEM; + + /* Init chip specific data */ + spin_lock_init(&mts->lock); + mts->card = card; + mts->pardev = pardev; + mts->current_midi_output_port = -1; + mts->current_midi_input_port = -1; + + *rchip = mts; + + return 0; +} + +/********************************************************************* + * HW register related constants + *********************************************************************/ + +/* Status Bits */ +#define MTS64_STAT_BSY 0x80 +#define MTS64_STAT_BIT_SET 0x20 /* readout process, bit is set */ +#define MTS64_STAT_PORT 0x10 /* read byte is a port number */ + +/* Control Bits */ +#define MTS64_CTL_READOUT 0x08 /* enable readout */ +#define MTS64_CTL_WRITE_CMD 0x06 +#define MTS64_CTL_WRITE_DATA 0x02 +#define MTS64_CTL_STROBE 0x01 + +/* Command */ +#define MTS64_CMD_RESET 0xfe +#define MTS64_CMD_PROBE 0x8f /* Used in probing procedure */ +#define MTS64_CMD_SMPTE_SET_TIME 0xe8 +#define MTS64_CMD_SMPTE_SET_FPS 0xee +#define MTS64_CMD_SMPTE_STOP 0xef +#define MTS64_CMD_SMPTE_FPS_24 0xe3 +#define MTS64_CMD_SMPTE_FPS_25 0xe2 +#define MTS64_CMD_SMPTE_FPS_2997 0xe4 +#define MTS64_CMD_SMPTE_FPS_30D 0xe1 +#define MTS64_CMD_SMPTE_FPS_30 0xe0 +#define MTS64_CMD_COM_OPEN 0xf8 /* setting the communication mode */ +#define MTS64_CMD_COM_CLOSE1 0xff /* clearing communication mode */ +#define MTS64_CMD_COM_CLOSE2 0xf5 + +/********************************************************************* + * Hardware specific functions + *********************************************************************/ +static void mts64_enable_readout(struct parport *p); +static void mts64_disable_readout(struct parport *p); +static int mts64_device_ready(struct parport *p); +static int mts64_device_init(struct parport *p); +static int mts64_device_open(struct mts64 *mts); +static int mts64_device_close(struct mts64 *mts); +static u8 mts64_map_midi_input(u8 c); +static int mts64_probe(struct parport *p); +static u16 mts64_read(struct parport *p); +static u8 mts64_read_char(struct parport *p); +static void mts64_smpte_start(struct parport *p, + u8 hours, u8 minutes, + u8 seconds, u8 frames, + u8 idx); +static void mts64_smpte_stop(struct parport *p); +static void mts64_write_command(struct parport *p, u8 c); +static void mts64_write_data(struct parport *p, u8 c); +static void mts64_write_midi(struct mts64 *mts, u8 c, int midiport); + + +/* Enables the readout procedure + * + * Before we can read a midi byte from the device, we have to set + * bit 3 of control port. + */ +static void mts64_enable_readout(struct parport *p) +{ + u8 c; + + c = parport_read_control(p); + c |= MTS64_CTL_READOUT; + parport_write_control(p, c); +} + +/* Disables readout + * + * Readout is disabled by clearing bit 3 of control + */ +static void mts64_disable_readout(struct parport *p) +{ + u8 c; + + c = parport_read_control(p); + c &= ~MTS64_CTL_READOUT; + parport_write_control(p, c); +} + +/* waits for device ready + * + * Checks if BUSY (Bit 7 of status) is clear + * 1 device ready + * 0 failure + */ +static int mts64_device_ready(struct parport *p) +{ + int i; + u8 c; + + for (i = 0; i < 0xffff; ++i) { + c = parport_read_status(p); + c &= MTS64_STAT_BSY; + if (c != 0) + return 1; + } + + return 0; +} + +/* Init device (LED blinking startup magic) + * + * Returns: + * 0 init ok + * -EIO failure + */ +static int __devinit mts64_device_init(struct parport *p) +{ + int i; + + mts64_write_command(p, MTS64_CMD_RESET); + + for (i = 0; i < 64; ++i) { + msleep(100); + + if (mts64_probe(p) == 0) { + /* success */ + mts64_disable_readout(p); + return 0; + } + } + mts64_disable_readout(p); + + return -EIO; +} + +/* + * Opens the device (set communication mode) + */ +static int mts64_device_open(struct mts64 *mts) +{ + int i; + struct parport *p = mts->pardev->port; + + for (i = 0; i < 5; ++i) + mts64_write_command(p, MTS64_CMD_COM_OPEN); + + return 0; +} + +/* + * Close device (clear communication mode) + */ +static int mts64_device_close(struct mts64 *mts) +{ + int i; + struct parport *p = mts->pardev->port; + + for (i = 0; i < 5; ++i) { + mts64_write_command(p, MTS64_CMD_COM_CLOSE1); + mts64_write_command(p, MTS64_CMD_COM_CLOSE2); + } + + return 0; +} + +/* map hardware port to substream number + * + * When reading a byte from the device, the device tells us + * on what port the byte is. This HW port has to be mapped to + * the midiport (substream number). + * substream 0-3 are Midiports 1-4 + * substream 4 is SMPTE Timecode + * The mapping is done by the table: + * HW | 0 | 1 | 2 | 3 | 4 + * SW | 0 | 1 | 4 | 2 | 3 + */ +static u8 mts64_map_midi_input(u8 c) +{ + static u8 map[] = { 0, 1, 4, 2, 3 }; + + return map[c]; +} + + +/* Probe parport for device + * + * Do we have a Miditerminal 4140 on parport? + * Returns: + * 0 device found + * -ENODEV no device + */ +static int __devinit mts64_probe(struct parport *p) +{ + u8 c; + + mts64_smpte_stop(p); + mts64_write_command(p, MTS64_CMD_PROBE); + + msleep(50); + + c = mts64_read(p); + + c &= 0x00ff; + if (c != MTS64_CMD_PROBE) + return -ENODEV; + else + return 0; + +} + +/* Read byte incl. status from device + * + * Returns: + * data in lower 8 bits and status in upper 8 bits + */ +static u16 mts64_read(struct parport *p) +{ + u8 data, status; + + mts64_device_ready(p); + mts64_enable_readout(p); + status = parport_read_status(p); + data = mts64_read_char(p); + mts64_disable_readout(p); + + return (status << 8) | data; +} + +/* Read a byte from device + * + * Note, that readout mode has to be enabled. + * readout procedure is as follows: + * - Write number of the Bit to read to DATA + * - Read STATUS + * - Bit 5 of STATUS indicates if Bit is set + * + * Returns: + * Byte read from device + */ +static u8 mts64_read_char(struct parport *p) +{ + u8 c = 0; + u8 status; + u8 i; + + for (i = 0; i < 8; ++i) { + parport_write_data(p, i); + c >>= 1; + status = parport_read_status(p); + if (status & MTS64_STAT_BIT_SET) + c |= 0x80; + } + + return c; +} + +/* Starts SMPTE Timecode generation + * + * The device creates SMPTE Timecode by hardware. + * 0 24 fps + * 1 25 fps + * 2 29.97 fps + * 3 30 fps (Drop-frame) + * 4 30 fps + */ +static void mts64_smpte_start(struct parport *p, + u8 hours, u8 minutes, + u8 seconds, u8 frames, + u8 idx) +{ + static u8 fps[5] = { MTS64_CMD_SMPTE_FPS_24, + MTS64_CMD_SMPTE_FPS_25, + MTS64_CMD_SMPTE_FPS_2997, + MTS64_CMD_SMPTE_FPS_30D, + MTS64_CMD_SMPTE_FPS_30 }; + + mts64_write_command(p, MTS64_CMD_SMPTE_SET_TIME); + mts64_write_command(p, frames); + mts64_write_command(p, seconds); + mts64_write_command(p, minutes); + mts64_write_command(p, hours); + + mts64_write_command(p, MTS64_CMD_SMPTE_SET_FPS); + mts64_write_command(p, fps[idx]); +} + +/* Stops SMPTE Timecode generation + */ +static void mts64_smpte_stop(struct parport *p) +{ + mts64_write_command(p, MTS64_CMD_SMPTE_STOP); +} + +/* Write a command byte to device + */ +static void mts64_write_command(struct parport *p, u8 c) +{ + mts64_device_ready(p); + + parport_write_data(p, c); + + parport_write_control(p, MTS64_CTL_WRITE_CMD); + parport_write_control(p, MTS64_CTL_WRITE_CMD | MTS64_CTL_STROBE); + parport_write_control(p, MTS64_CTL_WRITE_CMD); +} + +/* Write a data byte to device + */ +static void mts64_write_data(struct parport *p, u8 c) +{ + mts64_device_ready(p); + + parport_write_data(p, c); + + parport_write_control(p, MTS64_CTL_WRITE_DATA); + parport_write_control(p, MTS64_CTL_WRITE_DATA | MTS64_CTL_STROBE); + parport_write_control(p, MTS64_CTL_WRITE_DATA); +} + +/* Write a MIDI byte to midiport + * + * midiport ranges from 0-3 and maps to Ports 1-4 + * assumptions: communication mode is on + */ +static void mts64_write_midi(struct mts64 *mts, u8 c, + int midiport) +{ + struct parport *p = mts->pardev->port; + + /* check current midiport */ + if (mts->current_midi_output_port != midiport) + mts64_write_command(p, midiport); + + /* write midi byte */ + mts64_write_data(p, c); +} + +/********************************************************************* + * Control elements + *********************************************************************/ + +/* SMPTE Switch */ +static int snd_mts64_ctl_smpte_switch_info(struct snd_kcontrol *kctl, + struct snd_ctl_elem_info *uinfo) +{ + uinfo->type = SNDRV_CTL_ELEM_TYPE_BOOLEAN; + uinfo->count = 1; + uinfo->value.integer.min = 0; + uinfo->value.integer.max = 1; + return 0; +} + +static int snd_mts64_ctl_smpte_switch_get(struct snd_kcontrol* kctl, + struct snd_ctl_elem_value *uctl) +{ + struct mts64 *mts = snd_kcontrol_chip(kctl); + + spin_lock_irq(&mts->lock); + uctl->value.integer.value[0] = mts->smpte_switch; + spin_unlock_irq(&mts->lock); + + return 0; +} + +/* smpte_switch is not accessed from IRQ handler, so we just need + to protect the HW access */ +static int snd_mts64_ctl_smpte_switch_put(struct snd_kcontrol* kctl, + struct snd_ctl_elem_value *uctl) +{ + struct mts64 *mts = snd_kcontrol_chip(kctl); + int changed = 0; + + spin_lock_irq(&mts->lock); + if (mts->smpte_switch == uctl->value.integer.value[0]) + goto __out; + + changed = 1; + mts->smpte_switch = uctl->value.integer.value[0]; + if (mts->smpte_switch) { + mts64_smpte_start(mts->pardev->port, + mts->time[0], mts->time[1], + mts->time[2], mts->time[3], + mts->fps); + } else { + mts64_smpte_stop(mts->pardev->port); + } +__out: + spin_unlock_irq(&mts->lock); + return changed; +} + +static struct snd_kcontrol_new mts64_ctl_smpte_switch __devinitdata = { + .iface = SNDRV_CTL_ELEM_IFACE_RAWMIDI, + .name = "SMPTE Playback Switch", + .index = 0, + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE, + .private_value = 0, + .info = snd_mts64_ctl_smpte_switch_info, + .get = snd_mts64_ctl_smpte_switch_get, + .put = snd_mts64_ctl_smpte_switch_put +}; + +/* Time */ +static int snd_mts64_ctl_smpte_time_h_info(struct snd_kcontrol *kctl, + struct snd_ctl_elem_info *uinfo) +{ + uinfo->type = SNDRV_CTL_ELEM_TYPE_INTEGER; + uinfo->count = 1; + uinfo->value.integer.min = 0; + uinfo->value.integer.max = 23; + return 0; +} + +static int snd_mts64_ctl_smpte_time_f_info(struct snd_kcontrol *kctl, + struct snd_ctl_elem_info *uinfo) +{ + uinfo->type = SNDRV_CTL_ELEM_TYPE_INTEGER; + uinfo->count = 1; + uinfo->value.integer.min = 0; + uinfo->value.integer.max = 99; + return 0; +} + +static int snd_mts64_ctl_smpte_time_info(struct snd_kcontrol *kctl, + struct snd_ctl_elem_info *uinfo) +{ + uinfo->type = SNDRV_CTL_ELEM_TYPE_INTEGER; + uinfo->count = 1; + uinfo->value.integer.min = 0; + uinfo->value.integer.max = 59; + return 0; +} + +static int snd_mts64_ctl_smpte_time_get(struct snd_kcontrol *kctl, + struct snd_ctl_elem_value *uctl) +{ + struct mts64 *mts = snd_kcontrol_chip(kctl); + int idx = kctl->private_value; + + spin_lock_irq(&mts->lock); + uctl->value.integer.value[0] = mts->time[idx]; + spin_unlock_irq(&mts->lock); + + return 0; +} + +static int snd_mts64_ctl_smpte_time_put(struct snd_kcontrol *kctl, + struct snd_ctl_elem_value *uctl) +{ + struct mts64 *mts = snd_kcontrol_chip(kctl); + int idx = kctl->private_value; + int changed = 0; + + spin_lock_irq(&mts->lock); + if (mts->time[idx] != uctl->value.integer.value[0]) { + changed = 1; + mts->time[idx] = uctl->value.integer.value[0]; + } + spin_unlock_irq(&mts->lock); + + return changed; +} + +static struct snd_kcontrol_new mts64_ctl_smpte_time_hours __devinitdata = { + .iface = SNDRV_CTL_ELEM_IFACE_RAWMIDI, + .name = "SMPTE Time Hours", + .index = 0, + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE, + .private_value = 0, + .info = snd_mts64_ctl_smpte_time_h_info, + .get = snd_mts64_ctl_smpte_time_get, + .put = snd_mts64_ctl_smpte_time_put +}; + +static struct snd_kcontrol_new mts64_ctl_smpte_time_minutes __devinitdata = { + .iface = SNDRV_CTL_ELEM_IFACE_RAWMIDI, + .name = "SMPTE Time Minutes", + .index = 0, + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE, + .private_value = 1, + .info = snd_mts64_ctl_smpte_time_info, + .get = snd_mts64_ctl_smpte_time_get, + .put = snd_mts64_ctl_smpte_time_put +}; + +static struct snd_kcontrol_new mts64_ctl_smpte_time_seconds __devinitdata = { + .iface = SNDRV_CTL_ELEM_IFACE_RAWMIDI, + .name = "SMPTE Time Seconds", + .index = 0, + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE, + .private_value = 2, + .info = snd_mts64_ctl_smpte_time_info, + .get = snd_mts64_ctl_smpte_time_get, + .put = snd_mts64_ctl_smpte_time_put +}; + +static struct snd_kcontrol_new mts64_ctl_smpte_time_frames __devinitdata = { + .iface = SNDRV_CTL_ELEM_IFACE_RAWMIDI, + .name = "SMPTE Time Frames", + .index = 0, + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE, + .private_value = 3, + .info = snd_mts64_ctl_smpte_time_f_info, + .get = snd_mts64_ctl_smpte_time_get, + .put = snd_mts64_ctl_smpte_time_put +}; + +/* FPS */ +static int snd_mts64_ctl_smpte_fps_info(struct snd_kcontrol *kctl, + struct snd_ctl_elem_info *uinfo) +{ + static char *texts[5] = { "24", + "25", + "29.97", + "30D", + "30" }; + + uinfo->type = SNDRV_CTL_ELEM_TYPE_ENUMERATED; + uinfo->count = 1; + uinfo->value.enumerated.items = 5; + if (uinfo->value.enumerated.item > 4) + uinfo->value.enumerated.item = 4; + strcpy(uinfo->value.enumerated.name, + texts[uinfo->value.enumerated.item]); + + return 0; +} + +static int snd_mts64_ctl_smpte_fps_get(struct snd_kcontrol *kctl, + struct snd_ctl_elem_value *uctl) +{ + struct mts64 *mts = snd_kcontrol_chip(kctl); + + spin_lock_irq(&mts->lock); + uctl->value.enumerated.item[0] = mts->fps; + spin_unlock_irq(&mts->lock); + + return 0; +} + +static int snd_mts64_ctl_smpte_fps_put(struct snd_kcontrol *kctl, + struct snd_ctl_elem_value *uctl) +{ + struct mts64 *mts = snd_kcontrol_chip(kctl); + int changed = 0; + + spin_lock_irq(&mts->lock); + if (mts->fps != uctl->value.enumerated.item[0]) { + changed = 1; + mts->fps = uctl->value.enumerated.item[0]; + } + spin_unlock_irq(&mts->lock); + + return changed; +} + +static struct snd_kcontrol_new mts64_ctl_smpte_fps __devinitdata = { + .iface = SNDRV_CTL_ELEM_IFACE_RAWMIDI, + .name = "SMPTE Fps", + .index = 0, + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE, + .private_value = 0, + .info = snd_mts64_ctl_smpte_fps_info, + .get = snd_mts64_ctl_smpte_fps_get, + .put = snd_mts64_ctl_smpte_fps_put +}; + + +static int __devinit snd_mts64_ctl_create(struct snd_card *card, + struct mts64 *mts) +{ + int err, i; + static struct snd_kcontrol_new *control[] = { + &mts64_ctl_smpte_switch, + &mts64_ctl_smpte_time_hours, + &mts64_ctl_smpte_time_minutes, + &mts64_ctl_smpte_time_seconds, + &mts64_ctl_smpte_time_frames, + &mts64_ctl_smpte_fps, + 0 }; + + for (i = 0; control[i]; ++i) { + err = snd_ctl_add(card, snd_ctl_new1(control[i], mts)); + if (err < 0) { + snd_printd("Cannot create control: %s\n", + control[i]->name); + return err; + } + } + + return 0; +} + +/********************************************************************* + * Rawmidi + *********************************************************************/ +#define MTS64_MODE_INPUT_TRIGGERED 0x01 + +static int snd_mts64_rawmidi_open(struct snd_rawmidi_substream *substream) +{ + struct mts64 *mts = substream->rmidi->private_data; + + if (mts->open_count == 0) { + /* We don't need a spinlock here, because this is just called + if the device has not been opened before. + So there aren't any IRQs from the device */ + mts64_device_open(mts); + + msleep(50); + } + ++(mts->open_count); + + return 0; +} + +static int snd_mts64_rawmidi_close(struct snd_rawmidi_substream *substream) +{ + struct mts64 *mts = substream->rmidi->private_data; + unsigned long flags; + + --(mts->open_count); + if (mts->open_count == 0) { + /* We need the spinlock_irqsave here because we can still + have IRQs at this point */ + spin_lock_irqsave(&mts->lock, flags); + mts64_device_close(mts); + spin_unlock_irqrestore(&mts->lock, flags); + + msleep(500); + + } else if (mts->open_count < 0) + mts->open_count = 0; + + return 0; +} + +static void snd_mts64_rawmidi_output_trigger(struct snd_rawmidi_substream *substream, + int up) +{ + struct mts64 *mts = substream->rmidi->private_data; + u8 data; + unsigned long flags; + + spin_lock_irqsave(&mts->lock, flags); + while (snd_rawmidi_transmit_peek(substream, &data, 1) == 1) { + mts64_write_midi(mts, data, substream->number+1); + snd_rawmidi_transmit_ack(substream, 1); + } + spin_unlock_irqrestore(&mts->lock, flags); +} + +static void snd_mts64_rawmidi_input_trigger(struct snd_rawmidi_substream *substream, + int up) +{ + struct mts64 *mts = substream->rmidi->private_data; + unsigned long flags; + + spin_lock_irqsave(&mts->lock, flags); + if (up) + mts->mode[substream->number] |= MTS64_MODE_INPUT_TRIGGERED; + else + mts->mode[substream->number] &= ~MTS64_MODE_INPUT_TRIGGERED; + + spin_unlock_irqrestore(&mts->lock, flags); +} + +static struct snd_rawmidi_ops snd_mts64_rawmidi_output_ops = { + .open = snd_mts64_rawmidi_open, + .close = snd_mts64_rawmidi_close, + .trigger = snd_mts64_rawmidi_output_trigger +}; + +static struct snd_rawmidi_ops snd_mts64_rawmidi_input_ops = { + .open = snd_mts64_rawmidi_open, + .close = snd_mts64_rawmidi_close, + .trigger = snd_mts64_rawmidi_input_trigger +}; + +/* Create and initialize the rawmidi component */ +static int __devinit snd_mts64_rawmidi_create(struct snd_card *card) +{ + struct mts64 *mts = card->private_data; + struct snd_rawmidi *rmidi; + struct snd_rawmidi_substream *substream; + struct list_head *list; + int err; + + err = snd_rawmidi_new(card, CARD_NAME, 0, + MTS64_NUM_OUTPUT_PORTS, + MTS64_NUM_INPUT_PORTS, + &rmidi); + if (err < 0) + return err; + + rmidi->private_data = mts; + strcpy(rmidi->name, CARD_NAME); + rmidi->info_flags = SNDRV_RAWMIDI_INFO_OUTPUT | + SNDRV_RAWMIDI_INFO_INPUT | + SNDRV_RAWMIDI_INFO_DUPLEX; + + mts->rmidi = rmidi; + + /* register rawmidi ops */ + snd_rawmidi_set_ops(rmidi, SNDRV_RAWMIDI_STREAM_OUTPUT, + &snd_mts64_rawmidi_output_ops); + snd_rawmidi_set_ops(rmidi, SNDRV_RAWMIDI_STREAM_INPUT, + &snd_mts64_rawmidi_input_ops); + + /* name substreams */ + /* output */ + list_for_each(list, + &rmidi->streams[SNDRV_RAWMIDI_STREAM_OUTPUT].substreams) { + substream = list_entry(list, struct snd_rawmidi_substream, list); + sprintf(substream->name, + "Miditerminal %d", substream->number+1); + } + /* input */ + list_for_each(list, + &rmidi->streams[SNDRV_RAWMIDI_STREAM_INPUT].substreams) { + substream = list_entry(list, struct snd_rawmidi_substream, list); + mts->midi_input_substream[substream->number] = substream; + switch(substream->number) { + case MTS64_SMPTE_SUBSTREAM: + strcpy(substream->name, "Miditerminal SMPTE"); + break; + default: + sprintf(substream->name, + "Miditerminal %d", substream->number+1); + } + } + + /* controls */ + err = snd_mts64_ctl_create(card, mts); + + return err; +} + +/********************************************************************* + * parport stuff + *********************************************************************/ +static void snd_mts64_interrupt(int irq, void *private, struct pt_regs *r) +{ + struct mts64 *mts = ((struct snd_card*)private)->private_data; + u16 ret; + u8 status, data; + struct snd_rawmidi_substream *substream; + + spin_lock(&mts->lock); + ret = mts64_read(mts->pardev->port); + data = ret & 0x00ff; + status = ret >> 8; + + if (status & MTS64_STAT_PORT) { + mts->current_midi_input_port = mts64_map_midi_input(data); + } else { + if (mts->current_midi_input_port == -1) + goto __out; + substream = mts->midi_input_substream[mts->current_midi_input_port]; + if (mts->mode[substream->number] & MTS64_MODE_INPUT_TRIGGERED) + snd_rawmidi_receive(substream, &data, 1); + } +__out: + spin_unlock(&mts->lock); +} + +static int __devinit snd_mts64_probe_port(struct parport *p) +{ + struct pardevice *pardev; + int res; + + pardev = parport_register_device(p, DRIVER_NAME, + NULL, NULL, NULL, + 0, NULL); + if (!pardev) + return -EIO; + + if (parport_claim(pardev)) { + parport_unregister_device(pardev); + return -EIO; + } + + res = mts64_probe(p); + + parport_release(pardev); + parport_unregister_device(pardev); + + return res; +} + +static void __devinit snd_mts64_attach(struct parport *p) +{ + struct platform_device *device; + + device = platform_device_alloc(PLATFORM_DRIVER, device_count); + if (!device) + return; + + /* Temporary assignment to forward the parport */ + platform_set_drvdata(device, p); + + if (platform_device_register(device) < 0) { + platform_device_put(device); + return; + } + + /* Since we dont get the return value of probe + * We need to check if device probing succeeded or not */ + if (!platform_get_drvdata(device)) { + platform_device_unregister(device); + return; + } + + /* register device in global table */ + platform_devices[device_count] = device; + device_count++; +} + +static void snd_mts64_detach(struct parport *p) +{ + /* nothing to do here */ +} + +static struct parport_driver mts64_parport_driver = { + .name = "mts64", + .attach = snd_mts64_attach, + .detach = snd_mts64_detach +}; + +/********************************************************************* + * platform stuff + *********************************************************************/ +static void snd_mts64_card_private_free(struct snd_card *card) +{ + struct mts64 *mts = card->private_data; + struct pardevice *pardev = mts->pardev; + + if (pardev) { + if (mts->pardev_claimed) + parport_release(pardev); + parport_unregister_device(pardev); + } + + snd_mts64_free(mts); +} + +static int __devinit snd_mts64_probe(struct platform_device *pdev) +{ + struct pardevice *pardev; + struct parport *p; + int dev = pdev->id; + struct snd_card *card = NULL; + struct mts64 *mts = NULL; + int err; + + p = platform_get_drvdata(pdev); + platform_set_drvdata(pdev, NULL); + + if (dev >= SNDRV_CARDS) + return -ENODEV; + if (!enable[dev]) + return -ENOENT; + if ((err = snd_mts64_probe_port(p)) < 0) + return err; + + card = snd_card_new(index[dev], id[dev], THIS_MODULE, 0); + if (card == NULL) { + snd_printd("Cannot create card\n"); + return -ENOMEM; + } + strcpy(card->driver, DRIVER_NAME); + strcpy(card->shortname, "ESI " CARD_NAME); + sprintf(card->longname, "%s at 0x%lx, irq %i", + card->shortname, p->base, p->irq); + + pardev = parport_register_device(p, /* port */ + DRIVER_NAME, /* name */ + NULL, /* preempt */ + NULL, /* wakeup */ + snd_mts64_interrupt, /* ISR */ + PARPORT_DEV_EXCL, /* flags */ + (void *)card); /* private */ + if (pardev == NULL) { + snd_printd("Cannot register pardevice\n"); + err = -EIO; + goto __err; + } + + if ((err = snd_mts64_create(card, pardev, &mts)) < 0) { + snd_printd("Cannot create main component\n"); + parport_unregister_device(pardev); + goto __err; + } + card->private_data = mts; + card->private_free = snd_mts64_card_private_free; + + if ((err = snd_mts64_rawmidi_create(card)) < 0) { + snd_printd("Creating Rawmidi component failed\n"); + goto __err; + } + + /* claim parport */ + if (parport_claim(pardev)) { + snd_printd("Cannot claim parport 0x%lx\n", pardev->port->base); + err = -EIO; + goto __err; + } + mts->pardev_claimed = 1; + + /* init device */ + if ((err = mts64_device_init(p)) < 0) + goto __err; + + platform_set_drvdata(pdev, card); + + /* At this point card will be usable */ + if ((err = snd_card_register(card)) < 0) { + snd_printd("Cannot register card\n"); + goto __err; + } + + snd_printk("ESI Miditerminal 4140 on 0x%lx\n", p->base); + return 0; + +__err: + snd_card_free(card); + return err; +} + +static int snd_mts64_remove(struct platform_device *pdev) +{ + struct snd_card *card = platform_get_drvdata(pdev); + + if (card) + snd_card_free(card); + + return 0; +} + + +static struct platform_driver snd_mts64_driver = { + .probe = snd_mts64_probe, + .remove = snd_mts64_remove, + .driver = { + .name = PLATFORM_DRIVER + } +}; + +/********************************************************************* + * module init stuff + *********************************************************************/ +static void snd_mts64_unregister_all(void) +{ + int i; + + for (i = 0; i < SNDRV_CARDS; ++i) { + if (platform_devices[i]) { + platform_device_unregister(platform_devices[i]); + platform_devices[i] = NULL; + } + } + platform_driver_unregister(&snd_mts64_driver); + parport_unregister_driver(&mts64_parport_driver); +} + +static int __init snd_mts64_module_init(void) +{ + int err; + + if ((err = platform_driver_register(&snd_mts64_driver)) < 0) + return err; + + if (parport_register_driver(&mts64_parport_driver) != 0) { + platform_driver_unregister(&snd_mts64_driver); + return -EIO; + } + + if (device_count == 0) { + snd_mts64_unregister_all(); + return -ENODEV; + } + + return 0; +} + +static void __exit snd_mts64_module_exit(void) +{ + snd_mts64_unregister_all(); +} + +module_init(snd_mts64_module_init); +module_exit(snd_mts64_module_exit); -- cgit v0.10.2 From 4b146cb087b4a668511f6c991da1dc40e2e04b0d Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 28 Jul 2006 14:42:36 +0200 Subject: [ALSA] Misc fixes for Realtek HD-audio codecs - Added model=arima for Arima W820Di1 with ALC882 codec chip - Added EAPD-control verbs to TCL S700 init verbs - Added missing model strings for Realtek codecs (to be specified via module option explicitly for testing/debugging) Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 7344815..d0dbc3f 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -778,11 +778,15 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. 6stack-digout 6-jack with a SPDIF out w810 3-jack z71v 3-jack (HP shared SPDIF) - asus 3-jack + asus 3-jack (ASUS Mobo) + asus-w1v ASUS W1V + asus-dig ASUS with SPDIF out + asus-dig2 ASUS with SPDIF out (using GPIO2) uniwill 3-jack F1734 2-jack lg LG laptop (m1 express dual) lg-lw LG LW20 laptop + tcl TCL S700 clevo Clevo laptops (m520G, m665n) test for testing/debugging purpose, almost all controls can be adjusted. Appearing only when compiled with @@ -791,6 +795,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. ALC260 hp HP machines + hp-3013 HP machines (3013-variant) fujitsu Fujitsu S7020 acer Acer TravelMate basic fixed pin assignment (old default model) @@ -806,18 +811,22 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. ALC882/885 3stack-dig 3-jack with SPDIF I/O 6stck-dig 6-jack digital with SPDIF I/O + arima Arima W820Di1 auto auto-config reading BIOS (default) ALC883/888 3stack-dig 3-jack with SPDIF I/O 6stack-dig 6-jack digital with SPDIF I/O - 6stack-dig-demo 6-stack digital for Intel demo board + 3stack-6ch 3-jack 6-channel + 3stack-6ch-dig 3-jack 6-channel with SPDIF I/O + 6stack-dig-demo 6-jack digital for Intel demo board auto auto-config reading BIOS (default) ALC861/660 3stack 3-jack 3stack-dig 3-jack with SPDIF I/O 6stack-dig 6-jack with SPDIF I/O + 3stack-660 3-jack (for ALC660) auto auto-config reading BIOS (default) CMI9880 diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 42c4f90..378e5f1 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -98,6 +98,7 @@ enum { enum { ALC882_3ST_DIG, ALC882_6ST_DIG, + ALC882_ARIMA, ALC882_AUTO, ALC882_MODEL_LAST, }; @@ -1349,6 +1350,10 @@ static struct hda_verb alc880_pin_clevo_init_verbs[] = { }; static struct hda_verb alc880_pin_tcl_S700_init_verbs[] = { + /* change to EAPD mode */ + {0x20, AC_VERB_SET_COEF_INDEX, 0x07}, + {0x20, AC_VERB_SET_PROC_COEF, 0x3060}, + /* Headphone output */ {0x14, AC_VERB_SET_PIN_WIDGET_CONTROL, PIN_HP}, /* Front output*/ @@ -2146,6 +2151,7 @@ static struct hda_board_config alc880_cfg_tbl[] = { { .pci_subvendor = 0x107b, .pci_subdevice = 0x4040, .config = ALC880_3ST }, { .pci_subvendor = 0x107b, .pci_subdevice = 0x4041, .config = ALC880_3ST }, /* TCL S700 */ + { .modelname = "tcl", .config = ALC880_TCL_S700 }, { .pci_subvendor = 0x19db, .pci_subdevice = 0x4188, .config = ALC880_TCL_S700 }, /* Back 3 jack, front 2 jack (Internal add Aux-In) */ @@ -2232,8 +2238,11 @@ static struct hda_board_config alc880_cfg_tbl[] = { { .pci_subvendor = 0x1043, .pci_subdevice = 0x1133, .config = ALC880_ASUS }, { .pci_subvendor = 0x1043, .pci_subdevice = 0x1123, .config = ALC880_ASUS_DIG }, { .pci_subvendor = 0x1043, .pci_subdevice = 0x1143, .config = ALC880_ASUS }, + { .modelname = "asus-w1v", .config = ALC880_ASUS_W1V }, { .pci_subvendor = 0x1043, .pci_subdevice = 0x10b3, .config = ALC880_ASUS_W1V }, + { .modelname = "asus-dig", .config = ALC880_ASUS_DIG }, { .pci_subvendor = 0x1043, .pci_subdevice = 0x8181, .config = ALC880_ASUS_DIG }, /* ASUS P4GPL-X */ + { .modelname = "asus-dig2", .config = ALC880_ASUS_DIG2 }, { .pci_subvendor = 0x1558, .pci_subdevice = 0x5401, .config = ALC880_ASUS_DIG2 }, { .modelname = "uniwill", .config = ALC880_UNIWILL_DIG }, @@ -3906,6 +3915,7 @@ static struct hda_board_config alc260_cfg_tbl[] = { { .pci_subvendor = 0x152d, .pci_subdevice = 0x0729, .config = ALC260_BASIC }, /* CTL Travel Master U553W */ { .modelname = "hp", .config = ALC260_HP }, + { .modelname = "hp-3013", .config = ALC260_HP_3013 }, { .pci_subvendor = 0x103c, .pci_subdevice = 0x3010, .config = ALC260_HP }, { .pci_subvendor = 0x103c, .pci_subdevice = 0x3011, .config = ALC260_HP }, { .pci_subvendor = 0x103c, .pci_subdevice = 0x3012, .config = ALC260_HP_3013 }, @@ -4272,6 +4282,13 @@ static struct hda_verb alc882_init_verbs[] = { { } }; +static struct hda_verb alc882_eapd_verbs[] = { + /* change to EAPD mode */ + {0x20, AC_VERB_SET_COEF_INDEX, 0x07}, + {0x20, AC_VERB_SET_PROC_COEF, 0x3060}, + { } +}; + /* * generic initialization of ADC, input mixers and output mixers */ @@ -4403,6 +4420,9 @@ static struct hda_board_config alc882_cfg_tbl[] = { .config = ALC882_6ST_DIG }, /* Foxconn */ { .pci_subvendor = 0x1019, .pci_subdevice = 0x6668, .config = ALC882_6ST_DIG }, /* ECS to Intel*/ + { .modelname = "arima", .config = ALC882_ARIMA }, + { .pci_subvendor = 0x161f, .pci_subdevice = 0x2054, + .config = ALC882_ARIMA }, /* Arima W820Di1 */ { .modelname = "auto", .config = ALC882_AUTO }, {} }; @@ -4430,6 +4450,15 @@ static struct alc_config_preset alc882_presets[] = { .channel_mode = alc882_sixstack_modes, .input_mux = &alc882_capture_source, }, + [ALC882_ARIMA] = { + .mixers = { alc882_base_mixer, alc882_chmode_mixer }, + .init_verbs = { alc882_init_verbs, alc882_eapd_verbs }, + .num_dacs = ARRAY_SIZE(alc882_dac_nids), + .dac_nids = alc882_dac_nids, + .num_channel_mode = ARRAY_SIZE(alc882_sixstack_modes), + .channel_mode = alc882_sixstack_modes, + .input_mux = &alc882_capture_source, + }, }; @@ -5005,16 +5034,18 @@ static struct snd_kcontrol_new alc883_capture_mixer[] = { */ static struct hda_board_config alc883_cfg_tbl[] = { { .modelname = "3stack-dig", .config = ALC883_3ST_2ch_DIG }, + { .modelname = "3stack-6ch-dig", .config = ALC883_3ST_6ch_DIG }, + { .pci_subvendor = 0x1019, .pci_subdevice = 0x6668, + .config = ALC883_3ST_6ch_DIG }, /* ECS to Intel*/ + { .modelname = "3stack-6ch", .config = ALC883_3ST_6ch }, + { .pci_subvendor = 0x108e, .pci_subdevice = 0x534d, + .config = ALC883_3ST_6ch }, { .modelname = "6stack-dig", .config = ALC883_6ST_DIG }, - { .modelname = "6stack-dig-demo", .config = ALC888_DEMO_BOARD }, { .pci_subvendor = 0x1462, .pci_subdevice = 0x6668, .config = ALC883_6ST_DIG }, /* MSI */ { .pci_subvendor = 0x105b, .pci_subdevice = 0x6668, .config = ALC883_6ST_DIG }, /* Foxconn */ - { .pci_subvendor = 0x1019, .pci_subdevice = 0x6668, - .config = ALC883_3ST_6ch_DIG }, /* ECS to Intel*/ - { .pci_subvendor = 0x108e, .pci_subdevice = 0x534d, - .config = ALC883_3ST_6ch }, + { .modelname = "6stack-dig-demo", .config = ALC888_DEMO_BOARD }, { .modelname = "auto", .config = ALC883_AUTO }, {} }; @@ -5223,8 +5254,10 @@ static int patch_alc883(struct hda_codec *codec) spec->stream_digital_playback = &alc883_pcm_digital_playback; spec->stream_digital_capture = &alc883_pcm_digital_capture; - spec->adc_nids = alc883_adc_nids; - spec->num_adc_nids = ARRAY_SIZE(alc883_adc_nids); + if (! spec->adc_nids && spec->input_mux) { + spec->adc_nids = alc883_adc_nids; + spec->num_adc_nids = ARRAY_SIZE(alc883_adc_nids); + } codec->patch_ops = alc_patch_ops; if (board_config == ALC883_AUTO) @@ -6504,6 +6537,7 @@ static struct hda_board_config alc861_cfg_tbl[] = { { .modelname = "3stack", .config = ALC861_3ST }, { .pci_subvendor = 0x8086, .pci_subdevice = 0xd600, .config = ALC861_3ST }, + { .modelname = "3stack-660", .config = ALC660_3ST }, { .pci_subvendor = 0x1043, .pci_subdevice = 0x81e7, .config = ALC660_3ST }, { .modelname = "3stack-dig", .config = ALC861_3ST_DIG }, -- cgit v0.10.2 From 35aec4e2affb99d52b4b744ddb09767eb6e05580 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 28 Jul 2006 14:44:31 +0200 Subject: [ALSA] Don't set up the same PID twice in snd_hda_multi_out_analog_prepare Check the hp_nid whether it's identical with front pin to avoid the setup of the same widget node twice. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index 399860c..ff29d0f 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -1942,7 +1942,7 @@ int snd_hda_multi_out_analog_prepare(struct hda_codec *codec, struct hda_multi_o /* front */ snd_hda_codec_setup_stream(codec, nids[HDA_FRONT], stream_tag, 0, format); - if (mout->hp_nid) + if (mout->hp_nid && mout->hp_nid != nids[HDA_FRONT]) /* headphone out will just decode front left/right (stereo) */ snd_hda_codec_setup_stream(codec, mout->hp_nid, stream_tag, 0, format); /* extra outputs copied from front */ -- cgit v0.10.2 From 4e195a7b78618c89b06547f3140e67a69ec23272 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 28 Jul 2006 14:47:34 +0200 Subject: [ALSA] Fix noisy output with shared channel mode with hd-audio - Fix the wrong initialization of num_dacs when changing the channel mode between 2 and multi-channel modes. It must be evaluated after calling snd_hda_ch_mode_put() - Added the similar check of num_dacs fix in Realtek code. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_analog.c b/sound/pci/hda/patch_analog.c index 8955397..077f1ce 100644 --- a/sound/pci/hda/patch_analog.c +++ b/sound/pci/hda/patch_analog.c @@ -1647,10 +1647,12 @@ static int ad198x_ch_mode_put(struct snd_kcontrol *kcontrol, { struct hda_codec *codec = snd_kcontrol_chip(kcontrol); struct ad198x_spec *spec = codec->spec; - if (spec->need_dac_fix) + int err = snd_hda_ch_mode_put(codec, ucontrol, spec->channel_mode, + spec->num_channel_mode, + &spec->multiout.max_channels); + if (! err && spec->need_dac_fix) spec->multiout.num_dacs = spec->multiout.max_channels / 2; - return snd_hda_ch_mode_put(codec, ucontrol, spec->channel_mode, - spec->num_channel_mode, &spec->multiout.max_channels); + return err; } /* 6-stack mode */ diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 378e5f1..991f107 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -155,6 +155,7 @@ struct alc_spec { /* channel model */ const struct hda_channel_mode *channel_mode; int num_channel_mode; + int need_dac_fix; /* PCM information */ struct hda_pcm pcm_rec[3]; /* used in alc_build_pcms() */ @@ -192,6 +193,7 @@ struct alc_config_preset { hda_nid_t dig_in_nid; unsigned int num_channel_mode; const struct hda_channel_mode *channel_mode; + int need_dac_fix; unsigned int num_mux_defs; const struct hda_input_mux *input_mux; void (*unsol_event)(struct hda_codec *, unsigned int); @@ -264,9 +266,12 @@ static int alc_ch_mode_put(struct snd_kcontrol *kcontrol, { struct hda_codec *codec = snd_kcontrol_chip(kcontrol); struct alc_spec *spec = codec->spec; - return snd_hda_ch_mode_put(codec, ucontrol, spec->channel_mode, - spec->num_channel_mode, - &spec->multiout.max_channels); + int err = snd_hda_ch_mode_put(codec, ucontrol, spec->channel_mode, + spec->num_channel_mode, + &spec->multiout.max_channels); + if (! err && spec->need_dac_fix) + spec->multiout.num_dacs = spec->multiout.max_channels / 2; + return err; } /* @@ -546,6 +551,7 @@ static void setup_preset(struct alc_spec *spec, spec->channel_mode = preset->channel_mode; spec->num_channel_mode = preset->num_channel_mode; + spec->need_dac_fix = preset->need_dac_fix; spec->multiout.max_channels = spec->channel_mode[0].channels; @@ -2278,6 +2284,7 @@ static struct alc_config_preset alc880_presets[] = { .dac_nids = alc880_dac_nids, .num_channel_mode = ARRAY_SIZE(alc880_threestack_modes), .channel_mode = alc880_threestack_modes, + .need_dac_fix = 1, .input_mux = &alc880_capture_source, }, [ALC880_3ST_DIG] = { @@ -2288,6 +2295,7 @@ static struct alc_config_preset alc880_presets[] = { .dig_out_nid = ALC880_DIGOUT_NID, .num_channel_mode = ARRAY_SIZE(alc880_threestack_modes), .channel_mode = alc880_threestack_modes, + .need_dac_fix = 1, .input_mux = &alc880_capture_source, }, [ALC880_TCL_S700] = { @@ -2380,6 +2388,7 @@ static struct alc_config_preset alc880_presets[] = { .dac_nids = alc880_asus_dac_nids, .num_channel_mode = ARRAY_SIZE(alc880_asus_modes), .channel_mode = alc880_asus_modes, + .need_dac_fix = 1, .input_mux = &alc880_capture_source, }, [ALC880_ASUS_DIG] = { @@ -2391,6 +2400,7 @@ static struct alc_config_preset alc880_presets[] = { .dig_out_nid = ALC880_DIGOUT_NID, .num_channel_mode = ARRAY_SIZE(alc880_asus_modes), .channel_mode = alc880_asus_modes, + .need_dac_fix = 1, .input_mux = &alc880_capture_source, }, [ALC880_ASUS_DIG2] = { @@ -2402,6 +2412,7 @@ static struct alc_config_preset alc880_presets[] = { .dig_out_nid = ALC880_DIGOUT_NID, .num_channel_mode = ARRAY_SIZE(alc880_asus_modes), .channel_mode = alc880_asus_modes, + .need_dac_fix = 1, .input_mux = &alc880_capture_source, }, [ALC880_ASUS_W1V] = { @@ -2413,6 +2424,7 @@ static struct alc_config_preset alc880_presets[] = { .dig_out_nid = ALC880_DIGOUT_NID, .num_channel_mode = ARRAY_SIZE(alc880_asus_modes), .channel_mode = alc880_asus_modes, + .need_dac_fix = 1, .input_mux = &alc880_capture_source, }, [ALC880_UNIWILL_DIG] = { @@ -2423,6 +2435,7 @@ static struct alc_config_preset alc880_presets[] = { .dig_out_nid = ALC880_DIGOUT_NID, .num_channel_mode = ARRAY_SIZE(alc880_asus_modes), .channel_mode = alc880_asus_modes, + .need_dac_fix = 1, .input_mux = &alc880_capture_source, }, [ALC880_CLEVO] = { @@ -2434,6 +2447,7 @@ static struct alc_config_preset alc880_presets[] = { .hp_nid = 0x03, .num_channel_mode = ARRAY_SIZE(alc880_threestack_modes), .channel_mode = alc880_threestack_modes, + .need_dac_fix = 1, .input_mux = &alc880_capture_source, }, [ALC880_LG] = { @@ -2445,6 +2459,7 @@ static struct alc_config_preset alc880_presets[] = { .dig_out_nid = ALC880_DIGOUT_NID, .num_channel_mode = ARRAY_SIZE(alc880_lg_ch_modes), .channel_mode = alc880_lg_ch_modes, + .need_dac_fix = 1, .input_mux = &alc880_lg_capture_source, .unsol_event = alc880_lg_unsol_event, .init_hook = alc880_lg_automute, @@ -4437,6 +4452,7 @@ static struct alc_config_preset alc882_presets[] = { .dig_in_nid = ALC882_DIGIN_NID, .num_channel_mode = ARRAY_SIZE(alc882_ch_modes), .channel_mode = alc882_ch_modes, + .need_dac_fix = 1, .input_mux = &alc882_capture_source, }, [ALC882_6ST_DIG] = { @@ -5075,6 +5091,7 @@ static struct alc_config_preset alc883_presets[] = { .dig_in_nid = ALC883_DIGIN_NID, .num_channel_mode = ARRAY_SIZE(alc883_3ST_6ch_modes), .channel_mode = alc883_3ST_6ch_modes, + .need_dac_fix = 1, .input_mux = &alc883_capture_source, }, [ALC883_3ST_6ch] = { @@ -5086,6 +5103,7 @@ static struct alc_config_preset alc883_presets[] = { .adc_nids = alc883_adc_nids, .num_channel_mode = ARRAY_SIZE(alc883_3ST_6ch_modes), .channel_mode = alc883_3ST_6ch_modes, + .need_dac_fix = 1, .input_mux = &alc883_capture_source, }, [ALC883_6ST_DIG] = { @@ -6554,6 +6572,7 @@ static struct alc_config_preset alc861_presets[] = { .dac_nids = alc861_dac_nids, .num_channel_mode = ARRAY_SIZE(alc861_threestack_modes), .channel_mode = alc861_threestack_modes, + .need_dac_fix = 1, .num_adc_nids = ARRAY_SIZE(alc861_adc_nids), .adc_nids = alc861_adc_nids, .input_mux = &alc861_capture_source, @@ -6566,6 +6585,7 @@ static struct alc_config_preset alc861_presets[] = { .dig_out_nid = ALC861_DIGOUT_NID, .num_channel_mode = ARRAY_SIZE(alc861_threestack_modes), .channel_mode = alc861_threestack_modes, + .need_dac_fix = 1, .num_adc_nids = ARRAY_SIZE(alc861_adc_nids), .adc_nids = alc861_adc_nids, .input_mux = &alc861_capture_source, @@ -6589,6 +6609,7 @@ static struct alc_config_preset alc861_presets[] = { .dac_nids = alc660_dac_nids, .num_channel_mode = ARRAY_SIZE(alc861_threestack_modes), .channel_mode = alc861_threestack_modes, + .need_dac_fix = 1, .num_adc_nids = ARRAY_SIZE(alc861_adc_nids), .adc_nids = alc861_adc_nids, .input_mux = &alc861_capture_source, -- cgit v0.10.2 From 7012b2dac71988f61b520b33c70c63be372b5994 Mon Sep 17 00:00:00 2001 From: James Courtier-Dutton Date: Fri, 28 Jul 2006 22:27:56 +0100 Subject: [ALSA] snd-emu10k1: Add a comment explaining the conversion function for dB gain. Signed-off-by: James Courtier-Dutton Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/emu10k1/emufx.c b/sound/pci/emu10k1/emufx.c index 00fc904..13cd6ce 100644 --- a/sound/pci/emu10k1/emufx.c +++ b/sound/pci/emu10k1/emufx.c @@ -267,6 +267,7 @@ static const u32 treble_table[41][5] = { { 0x37c4448b, 0xa45ef51d, 0x262f3267, 0x081e36dc, 0xfd8f5d14 } }; +/* dB gain = (float) 20 * log10( float(db_table_value) / 0x8000000 ) */ static const u32 db_table[101] = { 0x00000000, 0x01571f82, 0x01674b41, 0x01783a1b, 0x0189f540, 0x019c8651, 0x01aff763, 0x01c45306, 0x01d9a446, 0x01eff6b8, -- cgit v0.10.2 From f3302a59cf6961712658db63b66ea5902c17d5e1 Mon Sep 17 00:00:00 2001 From: Matt Porter Date: Mon, 31 Jul 2006 12:49:34 +0200 Subject: [ALSA] hda: sigmatel 9205 family support Adds support for the '9205 family' which includes some other part numbers but 9205 is the first one. These are 4 channel codecs, some have digital mic capability. Support for the digital mic feature will come later. Signed-off-by: Matt Porter Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index ac96336..d572f03 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -136,6 +136,14 @@ static hda_nid_t stac927x_mux_nids[3] = { 0x15, 0x16, 0x17 }; +static hda_nid_t stac9205_adc_nids[2] = { + 0x12, 0x13 +}; + +static hda_nid_t stac9205_mux_nids[2] = { + 0x19, 0x1a +}; + static hda_nid_t stac9200_pin_nids[8] = { 0x08, 0x09, 0x0d, 0x0e, 0x0f, 0x10, 0x11, 0x12, }; @@ -151,6 +159,13 @@ static hda_nid_t stac927x_pin_nids[14] = { 0x14, 0x21, 0x22, 0x23, }; +static hda_nid_t stac9205_pin_nids[12] = { + 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, + 0x0f, 0x14, 0x16, 0x17, 0x18, + 0x21, 0x22, + +}; + static int stac92xx_mux_enum_info(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo) { struct hda_codec *codec = snd_kcontrol_chip(kcontrol); @@ -214,6 +229,12 @@ static struct hda_verb stac927x_core_init[] = { {} }; +static struct hda_verb stac9205_core_init[] = { + /* set master volume and direct control */ + { 0x24, AC_VERB_SET_VOLUME_KNOB_CONTROL, 0xff}, + {} +}; + static struct snd_kcontrol_new stac9200_mixer[] = { HDA_CODEC_VOLUME("Master Playback Volume", 0xb, 0, HDA_OUTPUT), HDA_CODEC_MUTE("Master Playback Switch", 0xb, 0, HDA_OUTPUT), @@ -277,6 +298,21 @@ static snd_kcontrol_new_t stac927x_mixer[] = { { } /* end */ }; +static snd_kcontrol_new_t stac9205_mixer[] = { + { + .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .name = "Input Source", + .count = 1, + .info = stac92xx_mux_enum_info, + .get = stac92xx_mux_enum_get, + .put = stac92xx_mux_enum_put, + }, + HDA_CODEC_VOLUME("InMux Capture Volume", 0x19, 0x0, HDA_OUTPUT), + HDA_CODEC_VOLUME("InVol Capture Volume", 0x1b, 0x0, HDA_INPUT), + HDA_CODEC_MUTE("ADCMux Capture Switch", 0x1d, 0x0, HDA_OUTPUT), + { } /* end */ +}; + static int stac92xx_build_controls(struct hda_codec *codec) { struct sigmatel_spec *spec = codec->spec; @@ -415,6 +451,24 @@ static struct hda_board_config stac927x_cfg_tbl[] = { {} /* terminator */ }; +static unsigned int ref9205_pin_configs[12] = { + 0x40000100, 0x40000100, 0x01016011, 0x01014010, + 0x01813122, 0x01a19021, 0x40000100, 0x40000100, + 0x40000100, 0x40000100, 0x01441030, 0x01c41030 +}; + +static unsigned int *stac9205_brd_tbl[] = { + ref9205_pin_configs, +}; + +static struct hda_board_config stac9205_cfg_tbl[] = { + { .modelname = "ref", + .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2668, /* DFI LanParty */ + .config = STAC_REF }, /* SigmaTel reference board */ + {} /* terminator */ +}; + static void stac92xx_set_config_regs(struct hda_codec *codec) { int i; @@ -1354,6 +1408,46 @@ static int patch_stac927x(struct hda_codec *codec) return 0; } +static int patch_stac9205(struct hda_codec *codec) +{ + struct sigmatel_spec *spec; + int err; + + spec = kzalloc(sizeof(*spec), GFP_KERNEL); + if (spec == NULL) + return -ENOMEM; + + codec->spec = spec; + spec->board_config = snd_hda_check_board_config(codec, stac9205_cfg_tbl); + if (spec->board_config < 0) + snd_printdd(KERN_INFO "hda_codec: Unknown model for STAC9205, using BIOS defaults\n"); + else { + spec->num_pins = 14; + spec->pin_nids = stac9205_pin_nids; + spec->pin_configs = stac9205_brd_tbl[spec->board_config]; + stac92xx_set_config_regs(codec); + } + + spec->adc_nids = stac9205_adc_nids; + spec->mux_nids = stac9205_mux_nids; + spec->num_muxes = 3; + + spec->init = stac9205_core_init; + spec->mixer = stac9205_mixer; + + spec->multiout.dac_nids = spec->dac_nids; + + err = stac92xx_parse_auto_config(codec, 0x1f, 0x20); + if (err < 0) { + stac92xx_free(codec); + return err; + } + + codec->patch_ops = stac92xx_patch_ops; + + return 0; +} + /* * STAC 7661(?) hack */ @@ -1542,5 +1636,13 @@ struct hda_codec_preset snd_hda_preset_sigmatel[] = { { .id = 0x83847628, .name = "STAC9274X5NH", .patch = patch_stac927x }, { .id = 0x83847629, .name = "STAC9274D5NH", .patch = patch_stac927x }, { .id = 0x83847661, .name = "STAC7661", .patch = patch_stac7661 }, + { .id = 0x838476a0, .name = "STAC9205", .patch = patch_stac9205 }, + { .id = 0x838476a1, .name = "STAC9205D", .patch = patch_stac9205 }, + { .id = 0x838476a2, .name = "STAC9204", .patch = patch_stac9205 }, + { .id = 0x838476a3, .name = "STAC9204D", .patch = patch_stac9205 }, + { .id = 0x838476a4, .name = "STAC9255", .patch = patch_stac9205 }, + { .id = 0x838476a5, .name = "STAC9255D", .patch = patch_stac9205 }, + { .id = 0x838476a6, .name = "STAC9254", .patch = patch_stac9205 }, + { .id = 0x838476a7, .name = "STAC9254D", .patch = patch_stac9205 }, {} /* terminator */ }; -- cgit v0.10.2 From 1c3985580445ef9225c1ea7714d6d963f7626eeb Mon Sep 17 00:00:00 2001 From: Ondrej Zary Date: Mon, 31 Jul 2006 12:51:57 +0200 Subject: [ALSA] es18xx - Add PnP BIOS support This patch adds PnP BIOS support to es18xx driver. It allows ESS ES18xx sound chips integrated in some notebooks (such as DTK FortisPro TOP-5A) that don't appear as ISA cards (they aren't recognized by ISA PnP, only by PnP BIOS) to 'just work' automatically. Signed-off-by: Ondrej Zary Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/isa/es18xx.c b/sound/isa/es18xx.c index 34998de..8581820 100644 --- a/sound/isa/es18xx.c +++ b/sound/isa/es18xx.c @@ -2038,7 +2038,80 @@ MODULE_PARM_DESC(dma2, "DMA 2 # for ES18xx driver."); static struct platform_device *platform_devices[SNDRV_CARDS]; #ifdef CONFIG_PNP -static int pnp_registered; +static int pnp_registered, pnpc_registered; + +static struct pnp_device_id snd_audiodrive_pnpbiosids[] = { + { .id = "ESS1869" }, + { .id = "" } /* end */ +}; + +MODULE_DEVICE_TABLE(pnp, snd_audiodrive_pnpbiosids); + +/* PnP main device initialization */ +static int __devinit snd_audiodrive_pnp_init_main(int dev, struct pnp_dev *pdev, + struct pnp_resource_table *cfg) +{ + int err; + + pnp_init_resource_table(cfg); + if (port[dev] != SNDRV_AUTO_PORT) + pnp_resource_change(&cfg->port_resource[0], port[dev], 16); + if (fm_port[dev] != SNDRV_AUTO_PORT) + pnp_resource_change(&cfg->port_resource[1], fm_port[dev], 4); + if (mpu_port[dev] != SNDRV_AUTO_PORT) + pnp_resource_change(&cfg->port_resource[2], mpu_port[dev], 2); + if (dma1[dev] != SNDRV_AUTO_DMA) + pnp_resource_change(&cfg->dma_resource[0], dma1[dev], 1); + if (dma2[dev] != SNDRV_AUTO_DMA) + pnp_resource_change(&cfg->dma_resource[1], dma2[dev], 1); + if (irq[dev] != SNDRV_AUTO_IRQ) + pnp_resource_change(&cfg->irq_resource[0], irq[dev], 1); + if (pnp_device_is_isapnp(pdev)) { + err = pnp_manual_config_dev(pdev, cfg, 0); + if (err < 0) + snd_printk(KERN_ERR PFX "PnP manual resources are invalid, using auto config\n"); + } + err = pnp_activate_dev(pdev); + if (err < 0) { + snd_printk(KERN_ERR PFX "PnP configure failure (out of resources?)\n"); + return -EBUSY; + } + /* ok. hack using Vendor-Defined Card-Level registers */ + /* skip csn and logdev initialization - already done in isapnp_configure */ + if (pnp_device_is_isapnp(pdev)) { + isapnp_cfg_begin(isapnp_card_number(pdev), isapnp_csn_number(pdev)); + isapnp_write_byte(0x27, pnp_irq(pdev, 0)); /* Hardware Volume IRQ Number */ + if (mpu_port[dev] != SNDRV_AUTO_PORT) + isapnp_write_byte(0x28, pnp_irq(pdev, 0)); /* MPU-401 IRQ Number */ + isapnp_write_byte(0x72, pnp_irq(pdev, 0)); /* second IRQ */ + isapnp_cfg_end(); + } + port[dev] = pnp_port_start(pdev, 0); + fm_port[dev] = pnp_port_start(pdev, 1); + mpu_port[dev] = pnp_port_start(pdev, 2); + dma1[dev] = pnp_dma(pdev, 0); + dma2[dev] = pnp_dma(pdev, 1); + irq[dev] = pnp_irq(pdev, 0); + snd_printdd("PnP ES18xx: port=0x%lx, fm port=0x%lx, mpu port=0x%lx\n", port[dev], fm_port[dev], mpu_port[dev]); + snd_printdd("PnP ES18xx: dma1=%i, dma2=%i, irq=%i\n", dma1[dev], dma2[dev], irq[dev]); + return 0; +} + +static int __devinit snd_audiodrive_pnp(int dev, struct snd_audiodrive *acard, + struct pnp_dev *pdev) +{ + struct pnp_resource_table * cfg = kmalloc(sizeof(struct pnp_resource_table), GFP_KERNEL); + + if (!cfg) + return -ENOMEM; + acard->dev = pdev; + if (snd_audiodrive_pnp_init_main(dev, acard->dev, cfg) < 0) { + kfree(cfg); + return -EBUSY; + } + kfree(cfg); + return 0; +} static struct pnp_card_device_id snd_audiodrive_pnpids[] = { /* ESS 1868 (integrated on Compaq dual P-Pro motherboard and Genius 18PnP 3D) */ @@ -2061,13 +2134,11 @@ static struct pnp_card_device_id snd_audiodrive_pnpids[] = { MODULE_DEVICE_TABLE(pnp_card, snd_audiodrive_pnpids); -static int __devinit snd_audiodrive_pnp(int dev, struct snd_audiodrive *acard, +static int __devinit snd_audiodrive_pnpc(int dev, struct snd_audiodrive *acard, struct pnp_card_link *card, const struct pnp_card_device_id *id) { - struct pnp_dev *pdev; struct pnp_resource_table * cfg = kmalloc(sizeof(struct pnp_resource_table), GFP_KERNEL); - int err; if (!cfg) return -ENOMEM; @@ -2082,58 +2153,16 @@ static int __devinit snd_audiodrive_pnp(int dev, struct snd_audiodrive *acard, return -EBUSY; } /* Control port initialization */ - err = pnp_activate_dev(acard->devc); - if (err < 0) { + if (pnp_activate_dev(acard->devc) < 0) { snd_printk(KERN_ERR PFX "PnP control configure failure (out of resources?)\n"); - kfree(cfg); return -EAGAIN; } snd_printdd("pnp: port=0x%llx\n", (unsigned long long)pnp_port_start(acard->devc, 0)); - /* PnP initialization */ - pdev = acard->dev; - pnp_init_resource_table(cfg); - if (port[dev] != SNDRV_AUTO_PORT) - pnp_resource_change(&cfg->port_resource[0], port[dev], 16); - if (fm_port[dev] != SNDRV_AUTO_PORT) - pnp_resource_change(&cfg->port_resource[1], fm_port[dev], 4); - if (mpu_port[dev] != SNDRV_AUTO_PORT) - pnp_resource_change(&cfg->port_resource[2], mpu_port[dev], 2); - if (dma1[dev] != SNDRV_AUTO_DMA) - pnp_resource_change(&cfg->dma_resource[0], dma1[dev], 1); - if (dma2[dev] != SNDRV_AUTO_DMA) - pnp_resource_change(&cfg->dma_resource[1], dma2[dev], 1); - if (irq[dev] != SNDRV_AUTO_IRQ) - pnp_resource_change(&cfg->irq_resource[0], irq[dev], 1); - err = pnp_manual_config_dev(pdev, cfg, 0); - if (err < 0) - snd_printk(KERN_ERR PFX "PnP manual resources are invalid, using auto config\n"); - err = pnp_activate_dev(pdev); - if (err < 0) { - snd_printk(KERN_ERR PFX "PnP configure failure (out of resources?)\n"); + if (snd_audiodrive_pnp_init_main(dev, acard->dev, cfg) < 0) { kfree(cfg); return -EBUSY; } - /* ok. hack using Vendor-Defined Card-Level registers */ - /* skip csn and logdev initialization - already done in isapnp_configure */ - if (pnp_device_is_isapnp(pdev)) { - isapnp_cfg_begin(isapnp_card_number(pdev), isapnp_csn_number(pdev)); - isapnp_write_byte(0x27, pnp_irq(pdev, 0)); /* Hardware Volume IRQ Number */ - if (mpu_port[dev] != SNDRV_AUTO_PORT) - isapnp_write_byte(0x28, pnp_irq(pdev, 0)); /* MPU-401 IRQ Number */ - isapnp_write_byte(0x72, pnp_irq(pdev, 0)); /* second IRQ */ - isapnp_cfg_end(); - } else { - snd_printk(KERN_ERR PFX "unable to install ISA PnP hack, expect malfunction\n"); - } - port[dev] = pnp_port_start(pdev, 0); - fm_port[dev] = pnp_port_start(pdev, 1); - mpu_port[dev] = pnp_port_start(pdev, 2); - dma1[dev] = pnp_dma(pdev, 0); - dma2[dev] = pnp_dma(pdev, 1); - irq[dev] = pnp_irq(pdev, 0); - snd_printdd("PnP ES18xx: port=0x%lx, fm port=0x%lx, mpu port=0x%lx\n", port[dev], fm_port[dev], mpu_port[dev]); - snd_printdd("PnP ES18xx: dma1=%i, dma2=%i, irq=%i\n", dma1[dev], dma2[dev], irq[dev]); kfree(cfg); return 0; } @@ -2302,7 +2331,69 @@ static struct platform_driver snd_es18xx_nonpnp_driver = { #ifdef CONFIG_PNP static unsigned int __devinitdata es18xx_pnp_devices; -static int __devinit snd_audiodrive_pnp_detect(struct pnp_card_link *pcard, +static int __devinit snd_audiodrive_pnp_detect(struct pnp_dev *pdev, + const struct pnp_device_id *id) +{ + static int dev; + int err; + struct snd_card *card; + + if (pnp_device_is_isapnp(pdev)) + return -ENOENT; /* we have another procedure - card */ + for (; dev < SNDRV_CARDS; dev++) { + if (enable[dev] && isapnp[dev]) + break; + } + if (dev >= SNDRV_CARDS) + return -ENODEV; + + card = snd_es18xx_card_new(dev); + if (! card) + return -ENOMEM; + if ((err = snd_audiodrive_pnp(dev, card->private_data, pdev)) < 0) { + snd_card_free(card); + return err; + } + snd_card_set_dev(card, &pdev->dev); + if ((err = snd_audiodrive_probe(card, dev)) < 0) { + snd_card_free(card); + return err; + } + pnp_set_drvdata(pdev, card); + dev++; + es18xx_pnp_devices++; + return 0; +} + +static void __devexit snd_audiodrive_pnp_remove(struct pnp_dev * pdev) +{ + snd_card_free(pnp_get_drvdata(pdev)); + pnp_set_drvdata(pdev, NULL); +} + +#ifdef CONFIG_PM +static int snd_audiodrive_pnp_suspend(struct pnp_dev *pdev, pm_message_t state) +{ + return snd_es18xx_suspend(pnp_get_drvdata(pdev), state); +} +static int snd_audiodrive_pnp_resume(struct pnp_dev *pdev) +{ + return snd_es18xx_resume(pnp_get_drvdata(pdev)); +} +#endif + +static struct pnp_driver es18xx_pnp_driver = { + .name = "es18xx-pnpbios", + .id_table = snd_audiodrive_pnpbiosids, + .probe = snd_audiodrive_pnp_detect, + .remove = __devexit_p(snd_audiodrive_pnp_remove), +#ifdef CONFIG_PM + .suspend = snd_audiodrive_pnp_suspend, + .resume = snd_audiodrive_pnp_resume, +#endif +}; + +static int __devinit snd_audiodrive_pnpc_detect(struct pnp_card_link *pcard, const struct pnp_card_device_id *pid) { static int dev; @@ -2320,7 +2411,7 @@ static int __devinit snd_audiodrive_pnp_detect(struct pnp_card_link *pcard, if (! card) return -ENOMEM; - if ((res = snd_audiodrive_pnp(dev, card->private_data, pcard, pid)) < 0) { + if ((res = snd_audiodrive_pnpc(dev, card->private_data, pcard, pid)) < 0) { snd_card_free(card); return res; } @@ -2336,19 +2427,19 @@ static int __devinit snd_audiodrive_pnp_detect(struct pnp_card_link *pcard, return 0; } -static void __devexit snd_audiodrive_pnp_remove(struct pnp_card_link * pcard) +static void __devexit snd_audiodrive_pnpc_remove(struct pnp_card_link * pcard) { snd_card_free(pnp_get_card_drvdata(pcard)); pnp_set_card_drvdata(pcard, NULL); } #ifdef CONFIG_PM -static int snd_audiodrive_pnp_suspend(struct pnp_card_link *pcard, pm_message_t state) +static int snd_audiodrive_pnpc_suspend(struct pnp_card_link *pcard, pm_message_t state) { return snd_es18xx_suspend(pnp_get_card_drvdata(pcard), state); } -static int snd_audiodrive_pnp_resume(struct pnp_card_link *pcard) +static int snd_audiodrive_pnpc_resume(struct pnp_card_link *pcard) { return snd_es18xx_resume(pnp_get_card_drvdata(pcard)); } @@ -2359,11 +2450,11 @@ static struct pnp_card_driver es18xx_pnpc_driver = { .flags = PNP_DRIVER_RES_DISABLE, .name = "es18xx", .id_table = snd_audiodrive_pnpids, - .probe = snd_audiodrive_pnp_detect, - .remove = __devexit_p(snd_audiodrive_pnp_remove), + .probe = snd_audiodrive_pnpc_detect, + .remove = __devexit_p(snd_audiodrive_pnpc_remove), #ifdef CONFIG_PM - .suspend = snd_audiodrive_pnp_suspend, - .resume = snd_audiodrive_pnp_resume, + .suspend = snd_audiodrive_pnpc_suspend, + .resume = snd_audiodrive_pnpc_resume, #endif }; #endif /* CONFIG_PNP */ @@ -2373,8 +2464,10 @@ static void __init_or_module snd_es18xx_unregister_all(void) int i; #ifdef CONFIG_PNP - if (pnp_registered) + if (pnpc_registered) pnp_unregister_card_driver(&es18xx_pnpc_driver); + if (pnp_registered) + pnp_unregister_driver(&es18xx_pnp_driver); #endif for (i = 0; i < ARRAY_SIZE(platform_devices); ++i) platform_device_unregister(platform_devices[i]); @@ -2405,11 +2498,13 @@ static int __init alsa_card_es18xx_init(void) } #ifdef CONFIG_PNP - err = pnp_register_card_driver(&es18xx_pnpc_driver); - if (!err) { + err = pnp_register_driver(&es18xx_pnp_driver); + if (!err) pnp_registered = 1; - cards += es18xx_pnp_devices; - } + err = pnp_register_card_driver(&es18xx_pnpc_driver); + if (!err) + pnpc_registered = 1; + cards += es18xx_pnp_devices; #endif if(!cards) { -- cgit v0.10.2 From 548a648b98318e4b843b636dd2c7f42377e19a00 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Mon, 31 Jul 2006 16:51:51 +0200 Subject: [ALSA] Fix control/status mmap with shared PCM substream The flag to avoid 32bit-incompatible mmap for control/status records should be outside the pcm substream instance since a substream can be shared among multiple opens. Now it's flagged in pcm_file list that is directly assigned to file->private_data. Also, removed snd_pcm_add_file() and remove_file() functions and substream.files field that are not really used in the code. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/include/sound/pcm.h b/include/sound/pcm.h index f84d849..60d40b3 100644 --- a/include/sound/pcm.h +++ b/include/sound/pcm.h @@ -190,7 +190,7 @@ struct snd_pcm_ops { struct snd_pcm_file { struct snd_pcm_substream *substream; - struct snd_pcm_file *next; + int no_compat_mmap; }; struct snd_pcm_hw_rule; @@ -384,7 +384,6 @@ struct snd_pcm_substream { struct snd_info_entry *proc_prealloc_entry; #endif /* misc flags */ - unsigned int no_mmap_ctrl: 1; unsigned int hw_opened: 1; }; @@ -402,7 +401,6 @@ struct snd_pcm_str { /* -- OSS things -- */ struct snd_pcm_oss_stream oss; #endif - struct snd_pcm_file *files; #ifdef CONFIG_SND_VERBOSE_PROCFS struct snd_info_entry *proc_root; struct snd_info_entry *proc_info_entry; diff --git a/sound/core/pcm_compat.c b/sound/core/pcm_compat.c index 2b8aab6..2b53979 100644 --- a/sound/core/pcm_compat.c +++ b/sound/core/pcm_compat.c @@ -478,7 +478,7 @@ static long snd_pcm_ioctl_compat(struct file *file, unsigned int cmd, unsigned l * mmap of PCM status/control records because of the size * incompatibility. */ - substream->no_mmap_ctrl = 1; + pcm_file->no_compat_mmap = 1; switch (cmd) { case SNDRV_PCM_IOCTL_PVERSION: diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c index 439f047..0224c70 100644 --- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -1992,35 +1992,9 @@ int snd_pcm_hw_constraints_complete(struct snd_pcm_substream *substream) return 0; } -static void snd_pcm_add_file(struct snd_pcm_str *str, - struct snd_pcm_file *pcm_file) -{ - pcm_file->next = str->files; - str->files = pcm_file; -} - -static void snd_pcm_remove_file(struct snd_pcm_str *str, - struct snd_pcm_file *pcm_file) -{ - struct snd_pcm_file * pcm_file1; - if (str->files == pcm_file) { - str->files = pcm_file->next; - } else { - pcm_file1 = str->files; - while (pcm_file1 && pcm_file1->next != pcm_file) - pcm_file1 = pcm_file1->next; - if (pcm_file1 != NULL) - pcm_file1->next = pcm_file->next; - } -} - static void pcm_release_private(struct snd_pcm_substream *substream) { - struct snd_pcm_file *pcm_file = substream->file; - snd_pcm_unlink(substream); - snd_pcm_remove_file(substream->pstr, pcm_file); - kfree(pcm_file); } void snd_pcm_release_substream(struct snd_pcm_substream *substream) @@ -2060,7 +2034,6 @@ int snd_pcm_open_substream(struct snd_pcm *pcm, int stream, return 0; } - substream->no_mmap_ctrl = 0; err = snd_pcm_hw_constraints_init(substream); if (err < 0) { snd_printd("snd_pcm_hw_constraints_init failed\n"); @@ -2105,19 +2078,16 @@ static int snd_pcm_open_file(struct file *file, if (err < 0) return err; - if (substream->ref_count > 1) - pcm_file = substream->file; - else { - pcm_file = kzalloc(sizeof(*pcm_file), GFP_KERNEL); - if (pcm_file == NULL) { - snd_pcm_release_substream(substream); - return -ENOMEM; - } + pcm_file = kzalloc(sizeof(*pcm_file), GFP_KERNEL); + if (pcm_file == NULL) { + snd_pcm_release_substream(substream); + return -ENOMEM; + } + pcm_file->substream = substream; + if (substream->ref_count == 1) { str = substream->pstr; substream->file = pcm_file; substream->pcm_release = pcm_release_private; - pcm_file->substream = substream; - snd_pcm_add_file(str, pcm_file); } file->private_data = pcm_file; *rpcm_file = pcm_file; @@ -2209,6 +2179,7 @@ static int snd_pcm_release(struct inode *inode, struct file *file) fasync_helper(-1, file, 0, &substream->runtime->fasync); mutex_lock(&pcm->open_mutex); snd_pcm_release_substream(substream); + kfree(pcm_file); mutex_unlock(&pcm->open_mutex); wake_up(&pcm->open_wait); module_put(pcm->card->module); @@ -3270,11 +3241,11 @@ static int snd_pcm_mmap(struct file *file, struct vm_area_struct *area) offset = area->vm_pgoff << PAGE_SHIFT; switch (offset) { case SNDRV_PCM_MMAP_OFFSET_STATUS: - if (substream->no_mmap_ctrl) + if (pcm_file->no_compat_mmap) return -ENXIO; return snd_pcm_mmap_status(substream, file, area); case SNDRV_PCM_MMAP_OFFSET_CONTROL: - if (substream->no_mmap_ctrl) + if (pcm_file->no_compat_mmap) return -ENXIO; return snd_pcm_mmap_control(substream, file, area); default: -- cgit v0.10.2 From f03d68fe343d70bb06ecdb3d70dcf0e678ed99f9 Mon Sep 17 00:00:00 2001 From: Dmitry Torokhov Date: Thu, 3 Aug 2006 15:06:14 +0200 Subject: [ALSA] ppc-beep - handle errors from input_register_device() ppc-beep: handle errors from input_register_device() (Also fixed the wrong memory release in the error path.) Signed-off-by: Dmitry Torokhov Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/ppc/beep.c b/sound/ppc/beep.c index 5fec1e5..5f38f67 100644 --- a/sound/ppc/beep.c +++ b/sound/ppc/beep.c @@ -215,15 +215,18 @@ int __init snd_pmac_attach_beep(struct snd_pmac *chip) { struct pmac_beep *beep; struct input_dev *input_dev; + struct snd_kcontrol *beep_ctl; void *dmabuf; int err = -ENOMEM; beep = kzalloc(sizeof(*beep), GFP_KERNEL); + if (! beep) + return -ENOMEM; dmabuf = dma_alloc_coherent(&chip->pdev->dev, BEEP_BUFLEN * 4, &beep->addr, GFP_KERNEL); input_dev = input_allocate_device(); - if (!beep || !dmabuf || !input_dev) - goto fail; + if (! dmabuf || ! input_dev) + goto fail1; /* FIXME: set more better values */ input_dev->name = "PowerMac Beep"; @@ -244,17 +247,24 @@ int __init snd_pmac_attach_beep(struct snd_pmac *chip) beep->volume = BEEP_VOLUME; beep->running = 0; - err = snd_ctl_add(chip->card, snd_ctl_new1(&snd_pmac_beep_mixer, chip)); + beep_ctl = snd_ctl_new1(&snd_pmac_beep_mixer, chip); + err = snd_ctl_add(chip->card, beep_ctl); if (err < 0) - goto fail; + goto fail1; + + chip->beep = beep; - chip->beep = beep; - input_register_device(beep->dev); - - return 0; - - fail: input_free_device(input_dev); - kfree(dmabuf); + err = input_register_device(beep->dev); + if (err) + goto fail2; + + return 0; + + fail2: snd_ctl_remove(chip->card, beep_ctl); + fail1: input_free_device(input_dev); + if (dmabuf) + dma_free_coherent(&chip->pdev->dev, BEEP_BUFLEN * 4, + dmabuf, beep->addr); kfree(beep); return err; } -- cgit v0.10.2 From 2529bba7606b23c1b7161d3c2ad486162e8650f9 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 4 Aug 2006 12:57:19 +0200 Subject: [ALSA] Fix substream selection in PCM and rawmidi The PCM and rawmidi substreams can be selected explicitly by opening control handle and set via *_PREFER_SUBDEVICE ioctl. But, when multiple controls are opened, the driver gets confused. The patch fixes the initialization of prefer_*_subdevice and the check of multiple controls. The first set subdevice is picked up as the valid one. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/core/control.c b/sound/core/control.c index 31ad581..ac14426 100644 --- a/sound/core/control.c +++ b/sound/core/control.c @@ -75,6 +75,8 @@ static int snd_ctl_open(struct inode *inode, struct file *file) init_waitqueue_head(&ctl->change_sleep); spin_lock_init(&ctl->read_lock); ctl->card = card; + ctl->prefer_pcm_subdevice = -1; + ctl->prefer_rawmidi_subdevice = -1; ctl->pid = current->pid; file->private_data = ctl; write_lock_irqsave(&card->ctl_files_rwlock, flags); diff --git a/sound/core/pcm.c b/sound/core/pcm.c index f52178a..ed3b094 100644 --- a/sound/core/pcm.c +++ b/sound/core/pcm.c @@ -792,7 +792,8 @@ int snd_pcm_attach_substream(struct snd_pcm *pcm, int stream, kctl = snd_ctl_file(list); if (kctl->pid == current->pid) { prefer_subdevice = kctl->prefer_pcm_subdevice; - break; + if (prefer_subdevice != -1) + break; } } up_read(&card->controls_rwsem); diff --git a/sound/core/rawmidi.c b/sound/core/rawmidi.c index 8a2bdfa..269c467 100644 --- a/sound/core/rawmidi.c +++ b/sound/core/rawmidi.c @@ -430,7 +430,8 @@ static int snd_rawmidi_open(struct inode *inode, struct file *file) kctl = snd_ctl_file(list); if (kctl->pid == current->pid) { subdevice = kctl->prefer_rawmidi_subdevice; - break; + if (subdevice != -1) + break; } } up_read(&card->controls_rwsem); -- cgit v0.10.2 From 727f317a10da74b4e5c6d968bbba07767bfea794 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 4 Aug 2006 19:08:03 +0200 Subject: [ALSA] usb-audio - Fix a typo of CONFIG_PROC_FS Fixed a typo of CONFIG_PROC_FS in usbaudio.c. The stream proc file appears again. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/usb/usbaudio.c b/sound/usb/usbaudio.c index 3144313..087f9b6 100644 --- a/sound/usb/usbaudio.c +++ b/sound/usb/usbaudio.c @@ -2049,7 +2049,7 @@ static struct usb_driver usb_audio_driver = { }; -#if defined(CONFIG_PROCFS) && defined(CONFIG_SND_VERBOSE_PROCFS) +#if defined(CONFIG_PROC_FS) && defined(CONFIG_SND_VERBOSE_PROCFS) /* * proc interface for list the supported pcm formats -- cgit v0.10.2 From 25b6c43b3d6258f3e87244eeb2b9347dc5e83c40 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 8 Aug 2006 13:01:14 +0200 Subject: [ALSA] Fix the preselected model for HP machine Fixed the preselected model for a HP machine with SSID 103c:3010 to use hp-3013 (ALSA bug#2157). Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 991f107..ac561a5 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -3931,7 +3931,7 @@ static struct hda_board_config alc260_cfg_tbl[] = { .config = ALC260_BASIC }, /* CTL Travel Master U553W */ { .modelname = "hp", .config = ALC260_HP }, { .modelname = "hp-3013", .config = ALC260_HP_3013 }, - { .pci_subvendor = 0x103c, .pci_subdevice = 0x3010, .config = ALC260_HP }, + { .pci_subvendor = 0x103c, .pci_subdevice = 0x3010, .config = ALC260_HP_3013 }, { .pci_subvendor = 0x103c, .pci_subdevice = 0x3011, .config = ALC260_HP }, { .pci_subvendor = 0x103c, .pci_subdevice = 0x3012, .config = ALC260_HP_3013 }, { .pci_subvendor = 0x103c, .pci_subdevice = 0x3013, .config = ALC260_HP_3013 }, -- cgit v0.10.2 From f5a5ffad072ec3c1fd636174c30f0ba52fe0259f Mon Sep 17 00:00:00 2001 From: Danny Tholen Date: Tue, 8 Aug 2006 18:59:07 +0200 Subject: [ALSA] [snd-hda-intel] fix sound on some Asus W6A chips This patch adds support in ALSA snd-hda-intel driver for Asus W6A motherboard as reported in MDV Bugzilla #19962 (see http://qa.mandriva.com/show_bug.cgi?id=19962) Signed-off-by: Danny Tholen Signed-off-by: Thomas Backlund Signed-off-by: Thierry Vignaud Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index ac561a5..a917573 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -2240,6 +2240,7 @@ static struct hda_board_config alc880_cfg_tbl[] = { { .pci_subvendor = 0x1043, .pci_subdevice = 0x1113, .config = ALC880_ASUS_DIG }, { .pci_subvendor = 0x1043, .pci_subdevice = 0x1173, .config = ALC880_ASUS_DIG }, { .pci_subvendor = 0x1043, .pci_subdevice = 0x1993, .config = ALC880_ASUS }, + { .pci_subvendor = 0x1043, .pci_subdevice = 0x10c2, .config = ALC880_ASUS_DIG }, /* Asus W6A */ { .pci_subvendor = 0x1043, .pci_subdevice = 0x10c3, .config = ALC880_ASUS_DIG }, { .pci_subvendor = 0x1043, .pci_subdevice = 0x1133, .config = ALC880_ASUS }, { .pci_subvendor = 0x1043, .pci_subdevice = 0x1123, .config = ALC880_ASUS_DIG }, -- cgit v0.10.2 From 683fe1537e660c322c8af953773921e814791193 Mon Sep 17 00:00:00 2001 From: Jochen Voss Date: Tue, 8 Aug 2006 21:12:44 +0200 Subject: [ALSA] Revolution 5.1 - add AK5365 ADC support Add support for the AK5365 ADC. Signed-off-by: Jochen Voss Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/include/sound/ak4xxx-adda.h b/include/sound/ak4xxx-adda.h index 3d98884..65ddfa3 100644 --- a/include/sound/ak4xxx-adda.h +++ b/include/sound/ak4xxx-adda.h @@ -53,7 +53,8 @@ struct snd_akm4xxx { unsigned int idx_offset; /* control index offset */ enum { SND_AK4524, SND_AK4528, SND_AK4529, - SND_AK4355, SND_AK4358, SND_AK4381 + SND_AK4355, SND_AK4358, SND_AK4381, + SND_AK5365 } type; unsigned int *num_stereo; /* array of combined counts * for the mixer diff --git a/sound/i2c/other/ak4xxx-adda.c b/sound/i2c/other/ak4xxx-adda.c index dc7cc20..7d562f0 100644 --- a/sound/i2c/other/ak4xxx-adda.c +++ b/sound/i2c/other/ak4xxx-adda.c @@ -598,6 +598,31 @@ int snd_akm4xxx_build_controls(struct snd_akm4xxx *ak) if (err < 0) goto __error; } + + if (ak->type == SND_AK5365) { + memset(ctl, 0, sizeof(*ctl)); + if (ak->channel_names == NULL) + strcpy(ctl->id.name, "Capture Volume"); + else + strcpy(ctl->id.name, ak->channel_names[0]); + ctl->id.index = ak->idx_offset * 2; + ctl->id.iface = SNDRV_CTL_ELEM_IFACE_MIXER; + ctl->count = 1; + ctl->info = snd_akm4xxx_stereo_volume_info; + ctl->get = snd_akm4xxx_stereo_volume_get; + ctl->put = snd_akm4xxx_stereo_volume_put; + /* Registers 4 & 5 (see AK5365 data sheet, pages 34 and 35): + * valid values are from 0x00 (mute) to 0x98 (+12dB). */ + ctl->private_value = + AK_COMPOSE(0, 4, 0, 0x98); + ctl->private_data = ak; + err = snd_ctl_add(ak->card, + snd_ctl_new(ctl, SNDRV_CTL_ELEM_ACCESS_READ| + SNDRV_CTL_ELEM_ACCESS_WRITE)); + if (err < 0) + goto __error; + } + if (ak->type == SND_AK4355 || ak->type == SND_AK4358) num_emphs = 1; else -- cgit v0.10.2 From 96d9e9347c9c5ca980bef22b4add7d437d79034f Mon Sep 17 00:00:00 2001 From: Jochen Voss Date: Tue, 8 Aug 2006 21:13:42 +0200 Subject: [ALSA] Revolution 5.1 - register the AK5365 ADC with ALSA Enable capture support for the M-Audio Revolution 5.1 card, by registering the ADC with ALSA. Signed-off-by: Jochen Voss Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ice1712/revo.c b/sound/pci/ice1712/revo.c index fec9440..ef64be4 100644 --- a/sound/pci/ice1712/revo.c +++ b/sound/pci/ice1712/revo.c @@ -98,6 +98,9 @@ static unsigned int revo51_num_stereo[] = {2, 1, 1, 2}; static char *revo51_channel_names[] = {"PCM Playback Volume", "PCM Center Playback Volume", "PCM LFE Playback Volume", "PCM Rear Playback Volume"}; +static unsigned int revo51_adc_num_stereo[] = {2}; +static char *revo51_adc_channel_names[] = {"PCM Capture Volume"}; + static struct snd_akm4xxx akm_revo_front __devinitdata = { .type = SND_AK4381, .num_dacs = 2, @@ -159,7 +162,26 @@ static struct snd_ak4xxx_private akm_revo51_priv __devinitdata = { .data_mask = VT1724_REVO_CDOUT, .clk_mask = VT1724_REVO_CCLK, .cs_mask = VT1724_REVO_CS0 | VT1724_REVO_CS1 | VT1724_REVO_CS2, - .cs_addr = 0, + .cs_addr = VT1724_REVO_CS1 | VT1724_REVO_CS2, + .cs_none = VT1724_REVO_CS0 | VT1724_REVO_CS1 | VT1724_REVO_CS2, + .add_flags = VT1724_REVO_CCLK, /* high at init */ + .mask_flags = 0, +}; + +static struct snd_akm4xxx akm_revo51_adc __devinitdata = { + .type = SND_AK5365, + .num_adcs = 2, + .num_stereo = revo51_adc_num_stereo, + .channel_names = revo51_adc_channel_names +}; + +static struct snd_ak4xxx_private akm_revo51_adc_priv __devinitdata = { + .caddr = 2, + .cif = 0, + .data_mask = VT1724_REVO_CDOUT, + .clk_mask = VT1724_REVO_CCLK, + .cs_mask = VT1724_REVO_CS0 | VT1724_REVO_CS1 | VT1724_REVO_CS2, + .cs_addr = VT1724_REVO_CS0 | VT1724_REVO_CS2, .cs_none = VT1724_REVO_CS0 | VT1724_REVO_CS1 | VT1724_REVO_CS2, .add_flags = VT1724_REVO_CCLK, /* high at init */ .mask_flags = 0, @@ -202,9 +224,13 @@ static int __devinit revo_init(struct snd_ice1712 *ice) snd_ice1712_gpio_write_bits(ice, VT1724_REVO_MUTE, VT1724_REVO_MUTE); break; case VT1724_SUBDEVICE_REVOLUTION51: - ice->akm_codecs = 1; + ice->akm_codecs = 2; if ((err = snd_ice1712_akm4xxx_init(ak, &akm_revo51, &akm_revo51_priv, ice)) < 0) return err; + err = snd_ice1712_akm4xxx_init(ak + 1, &akm_revo51_adc, + &akm_revo51_adc_priv, ice); + if (err < 0) + return err; /* unmute all codecs - needed! */ snd_ice1712_gpio_write_bits(ice, VT1724_REVO_MUTE, VT1724_REVO_MUTE); break; diff --git a/sound/pci/ice1712/revo.h b/sound/pci/ice1712/revo.h index dea52ea..efbb86e 100644 --- a/sound/pci/ice1712/revo.h +++ b/sound/pci/ice1712/revo.h @@ -42,7 +42,7 @@ extern struct snd_ice1712_card_info snd_vt1724_revo_cards[]; #define VT1724_REVO_CCLK 0x02 #define VT1724_REVO_CDIN 0x04 /* not used */ #define VT1724_REVO_CDOUT 0x08 -#define VT1724_REVO_CS0 0x10 /* not used */ +#define VT1724_REVO_CS0 0x10 /* AK5365 chipselect for Rev. 5.1 */ #define VT1724_REVO_CS1 0x20 /* front AKM4381 chipselect */ #define VT1724_REVO_CS2 0x40 /* surround AKM4355 chipselect */ #define VT1724_REVO_MUTE (1<<22) /* 0 = all mute, 1 = normal operation */ -- cgit v0.10.2 From 30ba6e207a915a6c70f22ccb3f9169d1cce88466 Mon Sep 17 00:00:00 2001 From: Jochen Voss Date: Wed, 9 Aug 2006 14:26:26 +0200 Subject: [ALSA] Revolution 5.1 - complete the AK5365 support Complete the AK5365 support. This adds a boolean control to toggle the soft mute feature of the AK5365 chip. Signed-off-by: Jochen Voss Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/i2c/other/ak4xxx-adda.c b/sound/i2c/other/ak4xxx-adda.c index 7d562f0..d76d8b0 100644 --- a/sound/i2c/other/ak4xxx-adda.c +++ b/sound/i2c/other/ak4xxx-adda.c @@ -472,6 +472,57 @@ static int snd_akm4xxx_deemphasis_put(struct snd_kcontrol *kcontrol, return change; } +static int ak4xxx_switch_info(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_info *uinfo) +{ + uinfo->type = SNDRV_CTL_ELEM_TYPE_BOOLEAN; + uinfo->count = 1; + uinfo->value.integer.min = 0; + uinfo->value.integer.max = 1; + return 0; +} + +static int ak4xxx_switch_get(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_value *ucontrol) +{ + struct snd_akm4xxx *ak = snd_kcontrol_chip(kcontrol); + int chip = AK_GET_CHIP(kcontrol->private_value); + int addr = AK_GET_ADDR(kcontrol->private_value); + int shift = AK_GET_SHIFT(kcontrol->private_value); + int invert = AK_GET_INVERT(kcontrol->private_value); + unsigned char val = snd_akm4xxx_get(ak, chip, addr); + + if (invert) + val = ! val; + ucontrol->value.integer.value[0] = (val & (1<private_value); + int addr = AK_GET_ADDR(kcontrol->private_value); + int shift = AK_GET_SHIFT(kcontrol->private_value); + int invert = AK_GET_INVERT(kcontrol->private_value); + long flag = ucontrol->value.integer.value[0]; + unsigned char val, oval; + int change; + + if (invert) + flag = ! flag; + oval = snd_akm4xxx_get(ak, chip, addr); + if (flag) + val = oval | (1<channel_names == NULL) + strcpy(ctl->id.name, "Capture Switch"); + else + strcpy(ctl->id.name, ak->channel_names[1]); + ctl->id.index = ak->idx_offset * 2; + ctl->id.iface = SNDRV_CTL_ELEM_IFACE_MIXER; + ctl->count = 1; + ctl->info = ak4xxx_switch_info; + ctl->get = ak4xxx_switch_get; + ctl->put = ak4xxx_switch_put; + /* register 2, bit 0 (SMUTE): 0 = normal operation, 1 = mute */ + ctl->private_value = + AK_COMPOSE(0, 2, 0, 0) | AK_INVERT; + ctl->private_data = ak; + err = snd_ctl_add(ak->card, + snd_ctl_new(ctl, SNDRV_CTL_ELEM_ACCESS_READ| + SNDRV_CTL_ELEM_ACCESS_WRITE)); + if (err < 0) + goto __error; } if (ak->type == SND_AK4355 || ak->type == SND_AK4358) diff --git a/sound/pci/ice1712/revo.c b/sound/pci/ice1712/revo.c index ef64be4..1134a57 100644 --- a/sound/pci/ice1712/revo.c +++ b/sound/pci/ice1712/revo.c @@ -99,7 +99,7 @@ static char *revo51_channel_names[] = {"PCM Playback Volume", "PCM Center Playba "PCM LFE Playback Volume", "PCM Rear Playback Volume"}; static unsigned int revo51_adc_num_stereo[] = {2}; -static char *revo51_adc_channel_names[] = {"PCM Capture Volume"}; +static char *revo51_adc_channel_names[] = {"PCM Capture Volume","PCM Capture Switch"}; static struct snd_akm4xxx akm_revo_front __devinitdata = { .type = SND_AK4381, -- cgit v0.10.2 From cf93907b98c82c2157e5bbe766bee8f1c5bb87b2 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 9 Aug 2006 14:33:27 +0200 Subject: [ALSA] Fix compile warnings in ak4xxx-adda.c Fixed compile warnings in ak4xxx-adda.c reagarding missing enum cases in switch. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/i2c/other/ak4xxx-adda.c b/sound/i2c/other/ak4xxx-adda.c index d76d8b0..0aea536 100644 --- a/sound/i2c/other/ak4xxx-adda.c +++ b/sound/i2c/other/ak4xxx-adda.c @@ -137,6 +137,8 @@ void snd_akm4xxx_reset(struct snd_akm4xxx *ak, int state) case SND_AK4381: ak4381_reset(ak, state); break; + default: + break; } } @@ -727,6 +729,9 @@ int snd_akm4xxx_build_controls(struct snd_akm4xxx *ak) case SND_AK4381: ctl->private_value = AK_COMPOSE(idx, 1, 1, 0); break; + default: + err = -EINVAL; + goto __error; } ctl->private_data = ak; err = snd_ctl_add(ak->card, -- cgit v0.10.2 From f24e9f586b377749dff37554696cf3a105540c94 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 9 Aug 2006 14:51:14 +0200 Subject: [ALSA] Select I2C and I2C_POWERMAC in aoa/codecs/Kconfig Added the missing selection of I2C and I2C_POWERMAC for Onyx and TAS codecs in aoa/codecs/Kconfig. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/aoa/codecs/Kconfig b/sound/aoa/codecs/Kconfig index 90cf58f..d5fbd60 100644 --- a/sound/aoa/codecs/Kconfig +++ b/sound/aoa/codecs/Kconfig @@ -1,6 +1,8 @@ config SND_AOA_ONYX tristate "support Onyx chip" depends on SND_AOA + select I2C + select I2C_POWERMAC ---help--- This option enables support for the Onyx (pcm3052) codec chip found in the latest Apple machines @@ -18,6 +20,8 @@ config SND_AOA_ONYX config SND_AOA_TAS tristate "support TAS chips" depends on SND_AOA + select I2C + select I2C_POWERMAC ---help--- This option enables support for the tas chips found in a lot of Apple Machines, especially -- cgit v0.10.2 From 22309c3e0c8911865cad0aa94f53a9afadaad7ee Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 9 Aug 2006 16:57:28 +0200 Subject: [ALSA] Added model for Uniwill laptop with ALC861 Added a new model 'uniwill-m31' for Uniwill laptops with ALC861 codec chip. The patch is taken from ALSA bug#2035, and modifeid. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index d0dbc3f..74be228 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -827,6 +827,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. 3stack-dig 3-jack with SPDIF I/O 6stack-dig 6-jack with SPDIF I/O 3stack-660 3-jack (for ALC660) + uniwill-m31 Uniwill M31 laptop auto auto-config reading BIOS (default) CMI9880 diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index a917573..f857e96 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -90,6 +90,7 @@ enum { ALC660_3ST, ALC861_3ST_DIG, ALC861_6ST_DIG, + ALC861_UNIWILL_M31, ALC861_AUTO, ALC861_MODEL_LAST, }; @@ -6021,6 +6022,23 @@ static struct hda_channel_mode alc861_threestack_modes[2] = { { 2, alc861_threestack_ch2_init }, { 6, alc861_threestack_ch6_init }, }; +/* Set mic1 as input and unmute the mixer */ +static struct hda_verb alc861_uniwill_m31_ch2_init[] = { + { 0x0d, AC_VERB_SET_PIN_WIDGET_CONTROL, 0x24 }, + { 0x15, AC_VERB_SET_AMP_GAIN_MUTE, (0x7080 | (0x01 << 8)) }, /*mic*/ + { } /* end */ +}; +/* Set mic1 as output and mute mixer */ +static struct hda_verb alc861_uniwill_m31_ch4_init[] = { + { 0x0d, AC_VERB_SET_PIN_WIDGET_CONTROL, 0x40 }, + { 0x15, AC_VERB_SET_AMP_GAIN_MUTE, (0x7000 | (0x01 << 8)) }, /*mic*/ + { } /* end */ +}; + +static struct hda_channel_mode alc861_uniwill_m31_modes[2] = { + { 2, alc861_uniwill_m31_ch2_init }, + { 4, alc861_uniwill_m31_ch4_init }, +}; /* patch-ALC861 */ @@ -6099,6 +6117,47 @@ static struct snd_kcontrol_new alc861_3ST_mixer[] = { }, { } /* end */ }; +static struct snd_kcontrol_new alc861_uniwill_m31_mixer[] = { + /* output mixer control */ + HDA_CODEC_MUTE("Front Playback Switch", 0x03, 0x0, HDA_OUTPUT), + HDA_CODEC_MUTE("Surround Playback Switch", 0x06, 0x0, HDA_OUTPUT), + HDA_CODEC_MUTE_MONO("Center Playback Switch", 0x05, 1, 0x0, HDA_OUTPUT), + HDA_CODEC_MUTE_MONO("LFE Playback Switch", 0x05, 2, 0x0, HDA_OUTPUT), + /*HDA_CODEC_MUTE("Side Playback Switch", 0x04, 0x0, HDA_OUTPUT), */ + + /* Input mixer control */ + /* HDA_CODEC_VOLUME("Input Playback Volume", 0x15, 0x0, HDA_OUTPUT), + HDA_CODEC_MUTE("Input Playback Switch", 0x15, 0x0, HDA_OUTPUT), */ + HDA_CODEC_VOLUME("CD Playback Volume", 0x15, 0x0, HDA_INPUT), + HDA_CODEC_MUTE("CD Playback Switch", 0x15, 0x0, HDA_INPUT), + HDA_CODEC_VOLUME("Line Playback Volume", 0x15, 0x02, HDA_INPUT), + HDA_CODEC_MUTE("Line Playback Switch", 0x15, 0x02, HDA_INPUT), + HDA_CODEC_VOLUME("Mic Playback Volume", 0x15, 0x01, HDA_INPUT), + HDA_CODEC_MUTE("Mic Playback Switch", 0x15, 0x01, HDA_INPUT), + HDA_CODEC_MUTE("Front Mic Playback Switch", 0x10, 0x01, HDA_OUTPUT), + HDA_CODEC_MUTE("Headphone Playback Switch", 0x1a, 0x03, HDA_INPUT), + + /* Capture mixer control */ + HDA_CODEC_VOLUME("Capture Volume", 0x08, 0x0, HDA_INPUT), + HDA_CODEC_MUTE("Capture Switch", 0x08, 0x0, HDA_INPUT), + { + .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .name = "Capture Source", + .count = 1, + .info = alc_mux_enum_info, + .get = alc_mux_enum_get, + .put = alc_mux_enum_put, + }, + { + .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .name = "Channel Mode", + .info = alc_ch_mode_info, + .get = alc_ch_mode_get, + .put = alc_ch_mode_put, + .private_value = ARRAY_SIZE(alc861_uniwill_m31_modes), + }, + { } /* end */ +}; /* * generic initialization of ADC, input mixers and output mixers @@ -6227,6 +6286,67 @@ static struct hda_verb alc861_threestack_init_verbs[] = { {0x1a, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(2)}, { } }; + +static struct hda_verb alc861_uniwill_m31_init_verbs[] = { + /* + * Unmute ADC0 and set the default input to mic-in + */ + /* port-A for surround (rear panel) */ + { 0x0e, AC_VERB_SET_PIN_WIDGET_CONTROL, 0x00 }, + /* port-B for mic-in (rear panel) with vref */ + { 0x0d, AC_VERB_SET_PIN_WIDGET_CONTROL, 0x24 }, + /* port-C for line-in (rear panel) */ + { 0x0c, AC_VERB_SET_PIN_WIDGET_CONTROL, 0x20 }, + /* port-D for Front */ + { 0x0b, AC_VERB_SET_PIN_WIDGET_CONTROL, 0x40 }, + { 0x0b, AC_VERB_SET_CONNECT_SEL, 0x00 }, + /* port-E for HP out (front panel) */ + { 0x0f, AC_VERB_SET_PIN_WIDGET_CONTROL, 0x24 }, // this has to be set to VREF80 + /* route front PCM to HP */ + { 0x0f, AC_VERB_SET_CONNECT_SEL, 0x01 }, + /* port-F for mic-in (front panel) with vref */ + { 0x10, AC_VERB_SET_PIN_WIDGET_CONTROL, 0x24 }, + /* port-G for CLFE (rear panel) */ + { 0x1f, AC_VERB_SET_PIN_WIDGET_CONTROL, 0x00 }, + /* port-H for side (rear panel) */ + { 0x20, AC_VERB_SET_PIN_WIDGET_CONTROL, 0x00 }, + /* CD-in */ + { 0x11, AC_VERB_SET_PIN_WIDGET_CONTROL, 0x20 }, + /* route front mic to ADC1*/ + {0x08, AC_VERB_SET_CONNECT_SEL, 0x00}, + {0x08, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(0)}, + /* Unmute DAC0~3 & spdif out*/ + {0x03, AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_UNMUTE}, + {0x04, AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_UNMUTE}, + {0x05, AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_UNMUTE}, + {0x06, AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_UNMUTE}, + {0x07, AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_UNMUTE}, + + /* Unmute Mixer 14 (mic) 1c (Line in)*/ + {0x014, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(0)}, + {0x014, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(1)}, + {0x01c, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(0)}, + {0x01c, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(1)}, + + /* Unmute Stereo Mixer 15 */ + {0x15, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(0)}, + {0x15, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(1)}, + {0x15, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(2)}, + {0x15, AC_VERB_SET_AMP_GAIN_MUTE, 0xb00c }, //Output 0~12 step + + {0x16, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(0)}, + {0x16, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(1)}, + {0x17, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(0)}, + {0x17, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(1)}, + {0x18, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(0)}, + {0x18, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(1)}, + {0x19, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(0)}, + {0x19, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(1)}, + {0x1a, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(3)}, // hp used DAC 3 (Front) + {0x1a, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(2)}, + { } +}; + /* * generic initialization of ADC, input mixers and output mixers */ @@ -6561,6 +6681,9 @@ static struct hda_board_config alc861_cfg_tbl[] = { .config = ALC660_3ST }, { .modelname = "3stack-dig", .config = ALC861_3ST_DIG }, { .modelname = "6stack-dig", .config = ALC861_6ST_DIG }, + { .modelname = "uniwill-m31", .config = ALC861_UNIWILL_M31}, + { .pci_subvendor = 0x1584, .pci_subdevice = 0x9072, + .config = ALC861_UNIWILL_M31 }, { .modelname = "auto", .config = ALC861_AUTO }, {} }; @@ -6615,6 +6738,20 @@ static struct alc_config_preset alc861_presets[] = { .adc_nids = alc861_adc_nids, .input_mux = &alc861_capture_source, }, + [ALC861_UNIWILL_M31] = { + .mixers = { alc861_uniwill_m31_mixer }, + .init_verbs = { alc861_uniwill_m31_init_verbs }, + .num_dacs = ARRAY_SIZE(alc861_dac_nids), + .dac_nids = alc861_dac_nids, + .dig_out_nid = ALC861_DIGOUT_NID, + .num_channel_mode = ARRAY_SIZE(alc861_uniwill_m31_modes), + .channel_mode = alc861_uniwill_m31_modes, + .need_dac_fix = 1, + .num_adc_nids = ARRAY_SIZE(alc861_adc_nids), + .adc_nids = alc861_adc_nids, + .input_mux = &alc861_capture_source, + }, + }; -- cgit v0.10.2 From fe25befde9723ba7d921c100bf00d7643323e5a7 Mon Sep 17 00:00:00 2001 From: Jaroslav Kysela Date: Tue, 15 Aug 2006 14:39:07 +0200 Subject: [ALSA] ice1712 - fix 1600->16000Hz value typo Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ice1712/ice1712.c b/sound/pci/ice1712/ice1712.c index bf20858..9b8325d 100644 --- a/sound/pci/ice1712/ice1712.c +++ b/sound/pci/ice1712/ice1712.c @@ -1857,7 +1857,7 @@ static int snd_ice1712_pro_internal_clock_put(struct snd_kcontrol *kcontrol, { struct snd_ice1712 *ice = snd_kcontrol_chip(kcontrol); static unsigned int xrate[13] = { - 8000, 9600, 11025, 12000, 1600, 22050, 24000, + 8000, 9600, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000, 64000, 88200, 96000 }; unsigned char oval; @@ -1924,7 +1924,7 @@ static int snd_ice1712_pro_internal_clock_default_get(struct snd_kcontrol *kcont { int val; static unsigned int xrate[13] = { - 8000, 9600, 11025, 12000, 1600, 22050, 24000, + 8000, 9600, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000, 64000, 88200, 96000 }; @@ -1941,7 +1941,7 @@ static int snd_ice1712_pro_internal_clock_default_put(struct snd_kcontrol *kcont struct snd_ctl_elem_value *ucontrol) { static unsigned int xrate[13] = { - 8000, 9600, 11025, 12000, 1600, 22050, 24000, + 8000, 9600, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000, 64000, 88200, 96000 }; unsigned char oval; -- cgit v0.10.2 From 42fe7647911d0bcaf81aac46db73a3b24387df6d Mon Sep 17 00:00:00 2001 From: Krzysztof Helt Date: Wed, 16 Aug 2006 12:53:34 +0200 Subject: [ALSA] dbri driver cleanup This is a small clean up of the dbri driver for sparc machines. It contains also a fix to DBRI interrupt queue initialization. Signed-off-by: Krzysztof Helt Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/sparc/dbri.c b/sound/sparc/dbri.c index f3ae6e2..652f433 100644 --- a/sound/sparc/dbri.c +++ b/sound/sparc/dbri.c @@ -104,17 +104,15 @@ static char *cmds[] = { #define dprintk(a, x...) if(dbri_debug & a) printk(KERN_DEBUG x) -#define DBRI_CMD(cmd, intr, value) ((cmd << 28) | \ - (1 << 27) | \ - value) #else #define dprintk(a, x...) -#define DBRI_CMD(cmd, intr, value) ((cmd << 28) | \ - (intr << 27) | \ - value) #endif /* DBRI_DEBUG */ +#define DBRI_CMD(cmd, intr, value) ((cmd << 28) | \ + (intr << 27) | \ + value) + /*************************************************************************** CS4215 specific definitions and structures ****************************************************************************/ @@ -690,7 +688,6 @@ static volatile s32 *dbri_cmdlock(struct snd_dbri * dbri, enum dbri_lock get) static void dbri_cmdsend(struct snd_dbri * dbri, volatile s32 * cmd) { volatile s32 *ptr; - u32 reg; for (ptr = &dbri->dma->cmd[0]; ptr < cmd; ptr++) { dprintk(D_CMD, "cmd: %lx:%08x\n", (unsigned long)ptr, *ptr); @@ -709,9 +706,6 @@ static void dbri_cmdsend(struct snd_dbri * dbri, volatile s32 * cmd) /* Set command pointer and signal it is valid. */ sbus_writel(dbri->dma_dvma, dbri->regs + REG8); - reg = sbus_readl(dbri->regs + REG0); - reg |= D_P; - sbus_writel(reg, dbri->regs + REG0); /*spin_unlock(&dbri->lock); */ } @@ -752,7 +746,7 @@ static void dbri_initialize(struct snd_dbri * dbri) */ for (n = 0; n < DBRI_NO_INTS - 1; n++) { dma_addr = dbri->dma_dvma; - dma_addr += dbri_dma_off(intr, ((n + 1) & DBRI_INT_BLK)); + dma_addr += dbri_dma_off(intr, ((n + 1) * DBRI_INT_BLK)); dbri->dma->intr[n * DBRI_INT_BLK] = dma_addr; } dma_addr = dbri->dma_dvma + dbri_dma_off(intr, 0); -- cgit v0.10.2 From 6fb982803522bc86ca61774c6edf317f77165453 Mon Sep 17 00:00:00 2001 From: Krzysztof Helt Date: Wed, 16 Aug 2006 12:54:29 +0200 Subject: [ALSA] sparc dbri removal of DBRI_NO_INTS This patch removes define DBR_NO_INTS and all code related to handling more than one dbri irq statuses block. Signed-off-by: Krzysztof Helt Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/sparc/dbri.c b/sound/sparc/dbri.c index 652f433..4651ff5 100644 --- a/sound/sparc/dbri.c +++ b/sound/sparc/dbri.c @@ -238,12 +238,6 @@ static struct { #define REG9 0x24UL /* Interrupt Queue Pointer */ #define DBRI_NO_CMDS 64 -#define DBRI_NO_INTS 1 /* Note: the value of this define was - * originally 2. The ringbuffer to store - * interrupts in dma is currently broken. - * This is a temporary fix until the ringbuffer - * is fixed. - */ #define DBRI_INT_BLK 64 #define DBRI_NO_DESCS 64 #define DBRI_NO_PIPES 32 @@ -268,7 +262,7 @@ struct dbri_mem { */ struct dbri_dma { volatile s32 cmd[DBRI_NO_CMDS]; /* Place for commands */ - volatile s32 intr[DBRI_NO_INTS * DBRI_INT_BLK]; /* Interrupt field */ + volatile s32 intr[DBRI_INT_BLK]; /* Interrupt field */ struct dbri_mem desc[DBRI_NO_DESCS]; /* Xmit/receive descriptors */ }; @@ -741,18 +735,6 @@ static void dbri_initialize(struct snd_dbri * dbri) dprintk(D_GEN, "init: cmd: %p, int: %p\n", &dbri->dma->cmd[0], &dbri->dma->intr[0]); - /* - * Initialize the interrupt ringbuffer. - */ - for (n = 0; n < DBRI_NO_INTS - 1; n++) { - dma_addr = dbri->dma_dvma; - dma_addr += dbri_dma_off(intr, ((n + 1) * DBRI_INT_BLK)); - dbri->dma->intr[n * DBRI_INT_BLK] = dma_addr; - } - dma_addr = dbri->dma_dvma + dbri_dma_off(intr, 0); - dbri->dma->intr[n * DBRI_INT_BLK] = dma_addr; - dbri->dbri_irqp = 1; - /* Initialize pipes */ for (n = 0; n < DBRI_NO_PIPES; n++) dbri->pipes[n].desc = dbri->pipes[n].first_desc = -1; @@ -765,9 +747,14 @@ static void dbri_initialize(struct snd_dbri * dbri) sbus_writel(tmp, dbri->regs + REG0); /* - * Set up the interrupt queue + * Initialize the interrupt ringbuffer. */ dma_addr = dbri->dma_dvma + dbri_dma_off(intr, 0); + dbri->dma->intr[0] = dma_addr; + dbri->dbri_irqp = 1; + /* + * Set up the interrupt queue + */ *(cmd++) = DBRI_CMD(D_IIQ, 0, 0); *(cmd++) = dma_addr; @@ -1951,10 +1938,8 @@ static void dbri_process_interrupt_buffer(struct snd_dbri * dbri) while ((x = dbri->dma->intr[dbri->dbri_irqp]) != 0) { dbri->dma->intr[dbri->dbri_irqp] = 0; dbri->dbri_irqp++; - if (dbri->dbri_irqp == (DBRI_NO_INTS * DBRI_INT_BLK)) + if (dbri->dbri_irqp == DBRI_INT_BLK) dbri->dbri_irqp = 1; - else if ((dbri->dbri_irqp & (DBRI_INT_BLK - 1)) == 0) - dbri->dbri_irqp++; dbri_process_one_interrupt(dbri, x); } -- cgit v0.10.2 From 5e4968e24ced93b7b130e7e1fc947a79f82776bf Mon Sep 17 00:00:00 2001 From: Tobias Klauser Date: Wed, 16 Aug 2006 12:56:16 +0200 Subject: [ALSA] sound/pci/fm801: Use ARRAY_SIZE macro Use ARRAY_SIZE macro instead of sizeof(x)/sizeof(x[0]) Signed-off-by: Tobias Klauser Signed-off-by: Andrew Morton Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/fm801.c b/sound/pci/fm801.c index 88a3e9f..f3f2b2c 100644 --- a/sound/pci/fm801.c +++ b/sound/pci/fm801.c @@ -321,10 +321,8 @@ static unsigned int channels[] = { 2, 4, 6 }; -#define CHANNELS sizeof(channels) / sizeof(channels[0]) - static struct snd_pcm_hw_constraint_list hw_constraints_channels = { - .count = CHANNELS, + .count = ARRAY_SIZE(channels), .list = channels, .mask = 0, }; -- cgit v0.10.2 From 80b556f26b3830ad5bd6ff9f701675ac8afcb263 Mon Sep 17 00:00:00 2001 From: Alexey Dobriyan Date: Wed, 16 Aug 2006 12:56:53 +0200 Subject: [ALSA] emu10k1x: simplify around pci_register_driver() Report errors to modprobe as side effect. Signed-off-by: Alexey Dobriyan Signed-off-by: Andrew Morton Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/emu10k1/emu10k1x.c b/sound/pci/emu10k1/emu10k1x.c index bda8bdf..da1610a 100644 --- a/sound/pci/emu10k1/emu10k1x.c +++ b/sound/pci/emu10k1/emu10k1x.c @@ -1626,12 +1626,7 @@ static struct pci_driver driver = { // initialization of the module static int __init alsa_card_emu10k1x_init(void) { - int err; - - if ((err = pci_register_driver(&driver)) > 0) - return err; - - return 0; + return pci_register_driver(&driver); } // clean up the module -- cgit v0.10.2 From d244bf897b2e7933112067ec8d8dc1d47b86145f Mon Sep 17 00:00:00 2001 From: Magnus Sandin Date: Wed, 16 Aug 2006 15:25:23 +0200 Subject: [ALSA] Fix for LG K1 Express Laptop Attached is the patch for the LG K1 Express (K1-2333V) laptop that enables sound output. Signed-off-by: Magnus Sandin Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ac97/ac97_patch.c b/sound/pci/ac97/ac97_patch.c index 5267b00..37c6be4 100644 --- a/sound/pci/ac97/ac97_patch.c +++ b/sound/pci/ac97/ac97_patch.c @@ -2208,7 +2208,8 @@ int patch_alc655(struct snd_ac97 * ac97) val &= ~(1 << 1); /* Pin 47 is spdif input pin */ else { /* ALC655 */ if (ac97->subsystem_vendor == 0x1462 && - ac97->subsystem_device == 0x0131) /* MSI S270 laptop */ + (ac97->subsystem_device == 0x0131 || /* MSI S270 laptop */ + ac97->subsystem_device == 0x0161)) /* LG K1 Express */ val &= ~(1 << 1); /* Pin 47 is EAPD (for internal speaker) */ else val |= (1 << 1); /* Pin 47 is spdif input pin */ -- cgit v0.10.2 From 99ccc560b73ff7381153dc1391d18391373931d3 Mon Sep 17 00:00:00 2001 From: Guillaume Munch Date: Wed, 16 Aug 2006 19:35:12 +0200 Subject: [ALSA] Add support for Sony Vaio AR 11B This patch adds automatic detection for Sigmatel ID 7664, the sound chip in Sony Vaio AR 11B (european name). - patch_stac7661 becomes patch_stac766x - .id = 0x83847664 is added Signed-off-by: Guillaume Munch Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 74be228..d7e95f1 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -856,10 +856,10 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. 3stack-dig ditto with SPDIF laptop 3-jack with hp-jack automute laptop-dig ditto with SPDIF - auto auto-confgi reading BIOS (default) + auto auto-config reading BIOS (default) - STAC7661(?) - vaio Setup for VAIO FE550G/SZ110 + STAC7664/7661(?) + vaio Setup for VAIO FE550G/SZ110/AR11B If the default configuration doesn't work and one of the above matches with your device, report it together with the PCI diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index d572f03..7eaf755 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -1449,10 +1449,10 @@ static int patch_stac9205(struct hda_codec *codec) } /* - * STAC 7661(?) hack + * STAC 7661(?) and 7664 hack */ -/* static config for Sony VAIO FE550G */ +/* static config for Sony VAIO FE550G and Sony VAIO AR */ static hda_nid_t vaio_dacs[] = { 0x2 }; #define VAIO_HP_DAC 0x5 static hda_nid_t vaio_adcs[] = { 0x8 /*,0x6*/ }; @@ -1552,7 +1552,7 @@ static struct snd_kcontrol_new vaio_mixer[] = { {} }; -static struct hda_codec_ops stac7661_patch_ops = { +static struct hda_codec_ops stac766x_patch_ops = { .build_controls = stac92xx_build_controls, .build_pcms = stac92xx_build_pcms, .init = stac92xx_init, @@ -1562,23 +1562,25 @@ static struct hda_codec_ops stac7661_patch_ops = { #endif }; -enum { STAC7661_VAIO }; +enum { STAC766x_VAIO }; -static struct hda_board_config stac7661_cfg_tbl[] = { - { .modelname = "vaio", .config = STAC7661_VAIO }, +static struct hda_board_config stac766x_cfg_tbl[] = { + { .modelname = "vaio", .config = STAC766x_VAIO }, { .pci_subvendor = 0x104d, .pci_subdevice = 0x81e6, - .config = STAC7661_VAIO }, + .config = STAC766x_VAIO }, { .pci_subvendor = 0x104d, .pci_subdevice = 0x81ef, - .config = STAC7661_VAIO }, + .config = STAC766x_VAIO }, + { .pci_subvendor = 0x104d, .pci_subdevice = 0x81fd, + .config = STAC766x_VAIO }, {} }; -static int patch_stac7661(struct hda_codec *codec) +static int patch_stac766x(struct hda_codec *codec) { struct sigmatel_spec *spec; int board_config; - board_config = snd_hda_check_board_config(codec, stac7661_cfg_tbl); + board_config = snd_hda_check_board_config(codec, stac766x_cfg_tbl); if (board_config < 0) /* unknown config, let generic-parser do its job... */ return snd_hda_parse_generic_codec(codec); @@ -1589,7 +1591,7 @@ static int patch_stac7661(struct hda_codec *codec) codec->spec = spec; switch (board_config) { - case STAC7661_VAIO: + case STAC766x_VAIO: spec->mixer = vaio_mixer; spec->init = vaio_init; spec->multiout.max_channels = 2; @@ -1603,7 +1605,7 @@ static int patch_stac7661(struct hda_codec *codec) break; } - codec->patch_ops = stac7661_patch_ops; + codec->patch_ops = stac766x_patch_ops; return 0; } @@ -1635,7 +1637,7 @@ struct hda_codec_preset snd_hda_preset_sigmatel[] = { { .id = 0x83847627, .name = "STAC9271D", .patch = patch_stac927x }, { .id = 0x83847628, .name = "STAC9274X5NH", .patch = patch_stac927x }, { .id = 0x83847629, .name = "STAC9274D5NH", .patch = patch_stac927x }, - { .id = 0x83847661, .name = "STAC7661", .patch = patch_stac7661 }, + { .id = 0x83847661, .name = "STAC7661", .patch = patch_stac766x }, { .id = 0x838476a0, .name = "STAC9205", .patch = patch_stac9205 }, { .id = 0x838476a1, .name = "STAC9205D", .patch = patch_stac9205 }, { .id = 0x838476a2, .name = "STAC9204", .patch = patch_stac9205 }, @@ -1644,5 +1646,6 @@ struct hda_codec_preset snd_hda_preset_sigmatel[] = { { .id = 0x838476a5, .name = "STAC9255D", .patch = patch_stac9205 }, { .id = 0x838476a6, .name = "STAC9254", .patch = patch_stac9205 }, { .id = 0x838476a7, .name = "STAC9254D", .patch = patch_stac9205 }, + { .id = 0x83847664, .name = "STAC7664", .patch = patch_stac766x }, {} /* terminator */ }; -- cgit v0.10.2 From 7cf0a95310f21f3c986288a483801b1d5694dee1 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 17 Aug 2006 16:23:07 +0200 Subject: [ALSA] Fix compile errors with older gcc Fixed compile errors with older gcc for initialization of a union. sound/pci/ca0106/ca0106_mixer.c: At top level: sound/pci/ca0106/ca0106_mixer.c:499: unknown field 'p' specified in initializer sound/pci/ca0106/ca0106_mixer.c:499: warning: missing braces around initializer sound/pci/ca0106/ca0106_mixer.c:499: warning: (near initialization for 'snd_ca0106_volume_ctls[0].tlv') Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ca0106/ca0106_mixer.c b/sound/pci/ca0106/ca0106_mixer.c index 6d64438..9855f52 100644 --- a/sound/pci/ca0106/ca0106_mixer.c +++ b/sound/pci/ca0106/ca0106_mixer.c @@ -478,7 +478,7 @@ static int snd_ca0106_i2c_volume_put(struct snd_kcontrol *kcontrol, .info = snd_ca0106_volume_info, \ .get = snd_ca0106_volume_get, \ .put = snd_ca0106_volume_put, \ - .tlv.p = snd_ca0106_db_scale1, \ + .tlv = { .p = snd_ca0106_db_scale1 }, \ .private_value = ((chid) << 8) | (reg) \ } @@ -490,7 +490,7 @@ static int snd_ca0106_i2c_volume_put(struct snd_kcontrol *kcontrol, .info = snd_ca0106_i2c_volume_info, \ .get = snd_ca0106_i2c_volume_get, \ .put = snd_ca0106_i2c_volume_put, \ - .tlv.p = snd_ca0106_db_scale2, \ + .tlv = { .p = snd_ca0106_db_scale2 }, \ .private_value = chid \ } diff --git a/sound/pci/emu10k1/p16v.c b/sound/pci/emu10k1/p16v.c index 1e44714..4e0f954 100644 --- a/sound/pci/emu10k1/p16v.c +++ b/sound/pci/emu10k1/p16v.c @@ -794,7 +794,7 @@ static DECLARE_TLV_DB_SCALE(snd_p16v_db_scale1, -5175, 25, 1); .info = snd_p16v_volume_info, \ .get = snd_p16v_volume_get, \ .put = snd_p16v_volume_put, \ - .tlv.p = snd_p16v_db_scale1, \ + .tlv = { .p = snd_p16v_db_scale1 }, \ .private_value = ((xreg) | ((xhl) << 8)) \ } diff --git a/sound/pci/hda/hda_local.h b/sound/pci/hda/hda_local.h index 0f0ae68..ff24266 100644 --- a/sound/pci/hda/hda_local.h +++ b/sound/pci/hda/hda_local.h @@ -36,7 +36,7 @@ .info = snd_hda_mixer_amp_volume_info, \ .get = snd_hda_mixer_amp_volume_get, \ .put = snd_hda_mixer_amp_volume_put, \ - .tlv.c = snd_hda_mixer_amp_tlv, \ + .tlv = { .c = snd_hda_mixer_amp_tlv }, \ .private_value = HDA_COMPOSE_AMP_VAL(nid, channel, xindex, direction) } /* stereo volume with index */ #define HDA_CODEC_VOLUME_IDX(xname, xcidx, nid, xindex, direction) \ diff --git a/sound/pci/hda/patch_analog.c b/sound/pci/hda/patch_analog.c index 077f1ce..043256c 100644 --- a/sound/pci/hda/patch_analog.c +++ b/sound/pci/hda/patch_analog.c @@ -507,7 +507,7 @@ static struct snd_kcontrol_new ad1986a_mixers[] = { .info = ad1986a_pcm_amp_vol_info, .get = ad1986a_pcm_amp_vol_get, .put = ad1986a_pcm_amp_vol_put, - .tlv.c = ad1986a_pcm_amp_tlv, + .tlv = { .c = ad1986a_pcm_amp_tlv }, .private_value = HDA_COMPOSE_AMP_VAL(AD1986A_FRONT_DAC, 3, 0, HDA_OUTPUT) }, { -- cgit v0.10.2 From 5fc3a2b250716b34ca7c0128475bbedf795f1ac2 Mon Sep 17 00:00:00 2001 From: Krzysztof Helt Date: Thu, 17 Aug 2006 16:58:45 +0200 Subject: [ALSA] sparc dbri: removal of unused struct members It removes unused or rarely used members of defined structures. Signed-off-by: Krzysztof Helt Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/sparc/dbri.c b/sound/sparc/dbri.c index 4651ff5..66b4d45 100644 --- a/sound/sparc/dbri.c +++ b/sound/sparc/dbri.c @@ -34,7 +34,7 @@ * (the second one is a monitor/tee pipe, valid only for serial input). * * The mmcodec is connected via the CHI bus and needs the data & some - * parameters (volume, balance, output selection) timemultiplexed in 8 byte + * parameters (volume, output selection) timemultiplexed in 8 byte * chunks. It also has a control mode, which serves for audio format setting. * * Looking at the CS4215 data sheet it is easy to set up 2 or 4 codecs on @@ -274,9 +274,7 @@ enum in_or_out { PIPEinput, PIPEoutput }; struct dbri_pipe { u32 sdp; /* SDP command word */ - enum in_or_out direction; int nextpipe; /* Next pipe in linked list */ - int prevpipe; int cycle; /* Offset of timeslot (bits) */ int length; /* Length of timeslot (bits) */ int first_desc; /* Index of first descriptor */ @@ -300,13 +298,11 @@ struct dbri_streaminfo { int pipe; /* Data pipe used */ int left_gain; /* mixer elements */ int right_gain; - int balance; }; /* This structure holds the information for both chips (DBRI & CS4215) */ struct snd_dbri { struct snd_card *card; /* ALSA card */ - struct snd_pcm *pcm; int regs_size, irq; /* Needed for unload */ struct sbus_dev *sdev; /* SBUS device info */ @@ -316,7 +312,6 @@ struct snd_dbri { u32 dma_dvma; /* DBRI visible DMA address */ void __iomem *regs; /* dbri HW regs */ - int dbri_version; /* 'e' and up is OK */ int dbri_irqp; /* intr queue pointer */ int wait_send; /* sequence of command buffers send */ int wait_ackd; /* sequence of command buffers acknowledged */ @@ -337,8 +332,6 @@ struct snd_dbri { #define DBRI_MAX_VOLUME 63 /* Output volume */ #define DBRI_MAX_GAIN 15 /* Input gain */ -#define DBRI_RIGHT_BALANCE 255 -#define DBRI_MID_BALANCE (DBRI_RIGHT_BALANCE >> 1) /* DBRI Reg0 - Status Control Register - defines. (Page 17) */ #define D_P (1<<15) /* Program command & queue pointer valid */ @@ -841,10 +834,6 @@ static void setup_pipe(struct snd_dbri * dbri, int pipe, int sdp) dbri->pipes[pipe].sdp = sdp; dbri->pipes[pipe].desc = -1; dbri->pipes[pipe].first_desc = -1; - if (sdp & D_SDP_TO_SER) - dbri->pipes[pipe].direction = PIPEoutput; - else - dbri->pipes[pipe].direction = PIPEinput; reset_pipe(dbri, pipe); } @@ -1363,14 +1352,6 @@ static void cs4215_setdata(struct snd_dbri * dbri, int muted) int left_gain = info->left_gain % 64; int right_gain = info->right_gain % 64; - if (info->balance < DBRI_MID_BALANCE) { - right_gain *= info->balance; - right_gain /= DBRI_MID_BALANCE; - } else { - left_gain *= DBRI_RIGHT_BALANCE - info->balance; - left_gain /= DBRI_MID_BALANCE; - } - dbri->mm.data[0] &= ~0x3f; /* Reset the volume bits */ dbri->mm.data[1] &= ~0x3f; dbri->mm.data[0] |= (DBRI_MAX_VOLUME - left_gain); @@ -2233,7 +2214,6 @@ static int __devinit snd_dbri_pcm(struct snd_dbri * dbri) pcm->private_data = dbri; pcm->info_flags = 0; strcpy(pcm->name, dbri->card->shortname); - dbri->pcm = pcm; if ((err = snd_pcm_lib_preallocate_pages_for_all(pcm, SNDRV_DMA_TYPE_CONTINUOUS, @@ -2452,7 +2432,6 @@ static int __init snd_dbri_mixer(struct snd_dbri * dbri) for (idx = DBRI_REC; idx < DBRI_NO_STREAMS; idx++) { dbri->stream_info[idx].left_gain = 0; dbri->stream_info[idx].right_gain = 0; - dbri->stream_info[idx].balance = DBRI_MID_BALANCE; } return 0; @@ -2484,12 +2463,11 @@ static void dbri_debug_read(struct snd_info_entry * entry, struct dbri_pipe *pptr = &dbri->pipes[pipe]; snd_iprintf(buffer, "Pipe %d: %s SDP=0x%x desc=%d, " - "len=%d @ %d prev: %d next %d\n", + "len=%d @ %d next %d\n", pipe, - (pptr->direction == - PIPEinput ? "input" : "output"), pptr->sdp, - pptr->desc, pptr->length, pptr->cycle, - pptr->prevpipe, pptr->nextpipe); + ((pptr->sdp & D_SDP_TO_SER) ? "output" : "input"), + pptr->sdp, pptr->desc, + pptr->length, pptr->cycle, pptr->nextpipe); } } } @@ -2528,7 +2506,6 @@ static int __init snd_dbri_create(struct snd_card *card, dbri->card = card; dbri->sdev = sdev; dbri->irq = irq->pri; - dbri->dbri_version = sdev->prom_name[9]; dbri->dma = sbus_alloc_consistent(sdev, sizeof(struct dbri_dma), &dbri->dma_dvma); @@ -2648,7 +2625,7 @@ static int __init dbri_attach(int prom_node, struct sbus_dev *sdev) printk(KERN_INFO "audio%d at %p (irq %d) is DBRI(%c)+CS4215(%d)\n", dev, dbri->regs, - dbri->irq, dbri->dbri_version, dbri->mm.version); + dbri->irq, sdev->prom_name[9], dbri->mm.version); dev++; return 0; -- cgit v0.10.2 From 16727d94adf9a1376775fd34d982778c7f3506df Mon Sep 17 00:00:00 2001 From: Krzysztof Helt Date: Thu, 17 Aug 2006 16:59:28 +0200 Subject: [ALSA] sparc dbri: removal of redudant volatile keywords It removes redudant volatile keywords. Signed-off-by: Krzysztof Helt Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/sparc/dbri.c b/sound/sparc/dbri.c index 66b4d45..405c603 100644 --- a/sound/sparc/dbri.c +++ b/sound/sparc/dbri.c @@ -252,8 +252,8 @@ static struct { /* One transmit/receive descriptor */ struct dbri_mem { volatile __u32 word1; - volatile __u32 ba; /* Transmit/Receive Buffer Address */ - volatile __u32 nda; /* Next Descriptor Address */ + __u32 ba; /* Transmit/Receive Buffer Address */ + __u32 nda; /* Next Descriptor Address */ volatile __u32 word4; }; @@ -308,7 +308,7 @@ struct snd_dbri { struct sbus_dev *sdev; /* SBUS device info */ spinlock_t lock; - volatile struct dbri_dma *dma; /* Pointer to our DMA block */ + struct dbri_dma *dma; /* Pointer to our DMA block */ u32 dma_dvma; /* DBRI visible DMA address */ void __iomem *regs; /* dbri HW regs */ -- cgit v0.10.2 From e05d696424f21b59eccff35d04938f0d6588cd94 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 17 Aug 2006 17:12:19 +0200 Subject: [ALSA] Fix some typos in snd-dummy driver Fixed some typos in snd-dummy driver. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/drivers/dummy.c b/sound/drivers/dummy.c index ffeafaf..73b1613 100644 --- a/sound/drivers/dummy.c +++ b/sound/drivers/dummy.c @@ -285,7 +285,7 @@ static struct snd_pcm_hardware snd_card_dummy_playback = .channels_max = USE_CHANNELS_MAX, .buffer_bytes_max = MAX_BUFFER_SIZE, .period_bytes_min = 64, - .period_bytes_max = MAX_BUFFER_SIZE, + .period_bytes_max = MAX_PERIOD_SIZE, .periods_min = USE_PERIODS_MIN, .periods_max = USE_PERIODS_MAX, .fifo_size = 0, @@ -547,13 +547,13 @@ static struct snd_kcontrol_new snd_dummy_controls[] = { DUMMY_VOLUME("Master Volume", 0, MIXER_ADDR_MASTER), DUMMY_CAPSRC("Master Capture Switch", 0, MIXER_ADDR_MASTER), DUMMY_VOLUME("Synth Volume", 0, MIXER_ADDR_SYNTH), -DUMMY_CAPSRC("Synth Capture Switch", 0, MIXER_ADDR_MASTER), +DUMMY_CAPSRC("Synth Capture Switch", 0, MIXER_ADDR_SYNTH), DUMMY_VOLUME("Line Volume", 0, MIXER_ADDR_LINE), -DUMMY_CAPSRC("Line Capture Switch", 0, MIXER_ADDR_MASTER), +DUMMY_CAPSRC("Line Capture Switch", 0, MIXER_ADDR_LINE), DUMMY_VOLUME("Mic Volume", 0, MIXER_ADDR_MIC), -DUMMY_CAPSRC("Mic Capture Switch", 0, MIXER_ADDR_MASTER), +DUMMY_CAPSRC("Mic Capture Switch", 0, MIXER_ADDR_MIC), DUMMY_VOLUME("CD Volume", 0, MIXER_ADDR_CD), -DUMMY_CAPSRC("CD Capture Switch", 0, MIXER_ADDR_MASTER) +DUMMY_CAPSRC("CD Capture Switch", 0, MIXER_ADDR_CD) }; static int __init snd_card_dummy_new_mixer(struct snd_dummy *dummy) -- cgit v0.10.2 From c256652466127872f1b2e510431dc25524ba40ba Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 17 Aug 2006 18:21:36 +0200 Subject: [ALSA] Add missing TLV callbacks for HD-audio codecs Added missing TLV callbacks for HD-audio codec supports. Also cleaned up the tlv callback for ad1986a (no mutex is needed there). Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_analog.c b/sound/pci/hda/patch_analog.c index 043256c..71abc2a 100644 --- a/sound/pci/hda/patch_analog.c +++ b/sound/pci/hda/patch_analog.c @@ -452,19 +452,6 @@ static int ad1986a_pcm_amp_vol_put(struct snd_kcontrol *kcontrol, struct snd_ctl return change; } -static int ad1986a_pcm_amp_tlv(struct snd_kcontrol *kcontrol, int op_flag, - unsigned int size, unsigned int __user *_tlv) -{ - struct hda_codec *codec = snd_kcontrol_chip(kcontrol); - struct ad198x_spec *ad = codec->spec; - - mutex_lock(&ad->amp_mutex); - snd_hda_mixer_amp_tlv(kcontrol, op_flag, size, _tlv); - mutex_unlock(&ad->amp_mutex); - return 0; -} - - #define ad1986a_pcm_amp_sw_info snd_hda_mixer_amp_switch_info static int ad1986a_pcm_amp_sw_get(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) @@ -507,7 +494,7 @@ static struct snd_kcontrol_new ad1986a_mixers[] = { .info = ad1986a_pcm_amp_vol_info, .get = ad1986a_pcm_amp_vol_get, .put = ad1986a_pcm_amp_vol_put, - .tlv = { .c = ad1986a_pcm_amp_tlv }, + .tlv = { .c = snd_hda_mixer_amp_tlv }, .private_value = HDA_COMPOSE_AMP_VAL(AD1986A_FRONT_DAC, 3, 0, HDA_OUTPUT) }, { @@ -654,6 +641,7 @@ static struct snd_kcontrol_new ad1986a_laptop_eapd_mixers[] = { .info = snd_hda_mixer_amp_volume_info, .get = snd_hda_mixer_amp_volume_get, .put = ad1986a_laptop_master_vol_put, + .tlv = { .c = snd_hda_mixer_amp_tlv }, .private_value = HDA_COMPOSE_AMP_VAL(0x1a, 3, 0, HDA_OUTPUT), }, { diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index f857e96..79d3612 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -5540,6 +5540,7 @@ static struct snd_kcontrol_new alc262_fujitsu_mixer[] = { .info = snd_hda_mixer_amp_volume_info, .get = snd_hda_mixer_amp_volume_get, .put = alc262_fujitsu_master_vol_put, + .tlv = { .c = snd_hda_mixer_amp_tlv }, .private_value = HDA_COMPOSE_AMP_VAL(0x0c, 3, 0, HDA_OUTPUT), }, { diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index 7eaf755..887b52e 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -1528,6 +1528,7 @@ static struct snd_kcontrol_new vaio_mixer[] = { .info = snd_hda_mixer_amp_volume_info, .get = snd_hda_mixer_amp_volume_get, .put = vaio_master_vol_put, + .tlv = { .c = snd_hda_mixer_amp_tlv }, .private_value = HDA_COMPOSE_AMP_VAL(0x02, 3, 0, HDA_OUTPUT), }, { -- cgit v0.10.2 From adf75dcab1deb9625538f74906508c1f6136fd98 Mon Sep 17 00:00:00 2001 From: Clemens Ladisch Date: Fri, 18 Aug 2006 09:03:45 +0200 Subject: [ALSA] riptide: fix compile errors with older gcc Change the syntax of a union initialization that is not understood by gcc 2.x. Signed-off-by: Clemens Ladisch Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/riptide/riptide.c b/sound/pci/riptide/riptide.c index f435fcd..fe210c8 100644 --- a/sound/pci/riptide/riptide.c +++ b/sound/pci/riptide/riptide.c @@ -673,9 +673,13 @@ static struct lbuspath lbus_rec_path = { #define FIRMWARE_VERSIONS 1 static union firmware_version firmware_versions[] = { { - .firmware.ASIC = 3,.firmware.CODEC = 2, - .firmware.AUXDSP = 3,.firmware.PROG = 773, - }, + .firmware = { + .ASIC = 3, + .CODEC = 2, + .AUXDSP = 3, + .PROG = 773, + }, + }, }; static u32 atoh(unsigned char *in, unsigned int len) -- cgit v0.10.2 From 2aaeee8bd1cf51b6ed7c751a8472cb77f3ddc642 Mon Sep 17 00:00:00 2001 From: Tobin Davis Date: Mon, 21 Aug 2006 19:01:12 +0200 Subject: [ALSA] hda-codec - add missing device ids This patch adds missing device ids for Intel 915 and D102GGC motherboards. Signed-off-by: Tobin Davis Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 79d3612..53aa57f 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -2143,7 +2143,10 @@ static struct hda_board_config alc880_cfg_tbl[] = { { .pci_subvendor = 0x8086, .pci_subdevice = 0xe20f, .config = ALC880_3ST }, { .pci_subvendor = 0x8086, .pci_subdevice = 0xe210, .config = ALC880_3ST }, { .pci_subvendor = 0x8086, .pci_subdevice = 0xe211, .config = ALC880_3ST }, + { .pci_subvendor = 0x8086, .pci_subdevice = 0xe212, .config = ALC880_3ST }, + { .pci_subvendor = 0x8086, .pci_subdevice = 0xe213, .config = ALC880_3ST }, { .pci_subvendor = 0x8086, .pci_subdevice = 0xe214, .config = ALC880_3ST }, + { .pci_subvendor = 0x8086, .pci_subdevice = 0xe234, .config = ALC880_3ST }, { .pci_subvendor = 0x8086, .pci_subdevice = 0xe302, .config = ALC880_3ST }, { .pci_subvendor = 0x8086, .pci_subdevice = 0xe303, .config = ALC880_3ST }, { .pci_subvendor = 0x8086, .pci_subdevice = 0xe304, .config = ALC880_3ST }, @@ -5058,6 +5061,8 @@ static struct hda_board_config alc883_cfg_tbl[] = { { .modelname = "3stack-6ch", .config = ALC883_3ST_6ch }, { .pci_subvendor = 0x108e, .pci_subdevice = 0x534d, .config = ALC883_3ST_6ch }, + { .pci_subvendor = 0x8086, .pci_subdevice = 0xd601, + .config = ALC883_3ST_6ch }, /* D102GGC */ { .modelname = "6stack-dig", .config = ALC883_6ST_DIG }, { .pci_subvendor = 0x1462, .pci_subdevice = 0x6668, .config = ALC883_6ST_DIG }, /* MSI */ -- cgit v0.10.2 From 68a6abd97f8b9aa072e36b1901531e7bb69b6efc Mon Sep 17 00:00:00 2001 From: Tobin Davis Date: Mon, 21 Aug 2006 19:02:10 +0200 Subject: [ALSA] hda-codec - Fix headphone output for some Intel 945 systems This patch enables headphone output at initialization for Intel 945 based systems that don't have proper detection circuitry. Signed-off-by: Tobin Davis Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index 887b52e..d709389 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -1143,6 +1143,8 @@ static int stac92xx_init(struct hda_codec *codec) STAC_UNSOL_ENABLE); /* fake event to set up pins */ codec->patch_ops.unsol_event(codec, STAC_HP_EVENT << 26); + /* enable the headphones by default. If/when unsol_event detection works, this will be ignored */ + stac92xx_auto_init_hp_out(codec); } else { stac92xx_auto_init_multi_out(codec); stac92xx_auto_init_hp_out(codec); -- cgit v0.10.2 From 79cf0d376fbf1cdf8e9c7c70c3a7c7434a716879 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 22 Aug 2006 16:35:19 +0200 Subject: [ALSA] Fix missing selection of CONFIG_VIDEO_DEV from SND_FM801_TEA575X Fixed the missing selection of CONFIG_VIDEO_DEV from SND_FM801_TEA575X. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/Kconfig b/sound/pci/Kconfig index e49c0fe..dffb6be 100644 --- a/sound/pci/Kconfig +++ b/sound/pci/Kconfig @@ -475,6 +475,7 @@ config SND_FM801_TEA575X depends on SND_FM801_TEA575X_BOOL default SND_FM801 select VIDEO_V4L1 + select VIDEO_DEV config SND_HDA_INTEL tristate "Intel HD Audio" -- cgit v0.10.2 From 948a4db217235ba51c41d8e7c2ffcf9432e57274 Mon Sep 17 00:00:00 2001 From: Tobin Davis Date: Tue, 22 Aug 2006 19:43:46 +0200 Subject: [ALSA] hda-codec - add missing device ids for Intel 945 boards This patch adds missing device ids for Intel 945 motherboards. Signed-off-by: Tobin Davis Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index d709389..7b29288 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -396,19 +396,53 @@ static struct hda_board_config stac922x_cfg_tbl[] = { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2668, /* DFI LanParty */ .config = STAC_REF }, /* SigmaTel reference board */ + /* Intel 945G based systems */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x0101, .config = STAC_D945GTP3 }, /* Intel D945GTP - 3 Stack */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x0202, - .config = STAC_D945GTP3 }, /* Intel D945GNT - 3 Stack, 9221 A1 */ + .config = STAC_D945GTP3 }, /* Intel D945GNT - 3 Stack */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, - .pci_subdevice = 0x0b0b, - .config = STAC_D945GTP3 }, /* Intel D945PSN - 3 Stack, 9221 A1 */ + .pci_subdevice = 0x0606, + .config = STAC_D945GTP3 }, /* Intel D945GTP - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x0601, + .config = STAC_D945GTP3 }, /* Intel D945GTP - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x0111, + .config = STAC_D945GTP3 }, /* Intel D945GZP - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x1115, + .config = STAC_D945GTP3 }, /* Intel D945GPM - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x1116, + .config = STAC_D945GTP3 }, /* Intel D945GBO - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x1117, + .config = STAC_D945GTP3 }, /* Intel D945GPM - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x1118, + .config = STAC_D945GTP3 }, /* Intel D945GPM - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x1119, + .config = STAC_D945GTP3 }, /* Intel D945GPM - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x8826, + .config = STAC_D945GTP3 }, /* Intel D945GPM - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x5049, + .config = STAC_D945GTP3 }, /* Intel D945GCZ - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x5055, + .config = STAC_D945GTP3 }, /* Intel D945GCZ - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x5048, + .config = STAC_D945GTP3 }, /* Intel D945GPB - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x0110, + .config = STAC_D945GTP3 }, /* Intel D945GLR - 3 Stack */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, - .pci_subdevice = 0x0707, - .config = STAC_D945GTP5 }, /* Intel D945PSV - 5 Stack */ - { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x0404, .config = STAC_D945GTP5 }, /* Intel D945GTP - 5 Stack */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, @@ -420,6 +454,26 @@ static struct hda_board_config stac922x_cfg_tbl[] = { { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x0417, .config = STAC_D945GTP5 }, /* Intel D975XBK - 5 Stack */ + /* Intel 945P based systems */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x0b0b, + .config = STAC_D945GTP3 }, /* Intel D945PSN - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x0112, + .config = STAC_D945GTP3 }, /* Intel D945PLN - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x0d0d, + .config = STAC_D945GTP3 }, /* Intel D945PLM - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x0909, + .config = STAC_D945GTP3 }, /* Intel D945PAW - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x0505, + .config = STAC_D945GTP3 }, /* Intel D945PLM - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x0707, + .config = STAC_D945GTP5 }, /* Intel D945PSV - 5 Stack */ + /* other systems */ { .pci_subvendor = 0x8384, .pci_subdevice = 0x7680, .config = STAC_MACMINI }, /* Apple Mac Mini (early 2006) */ -- cgit v0.10.2 From 81d3dbde76eedcd3ede8a73eb72790d67fa254a9 Mon Sep 17 00:00:00 2001 From: Tobin Davis Date: Tue, 22 Aug 2006 19:44:45 +0200 Subject: [ALSA] hda-codec - Add support for new Intel boards with Stac9227 codec This patch adds full 5.1 audio support for Intel boards with the SigmaTel 9227 codec chip (946, 963, 965 series). Signed-off-by: Tobin Davis Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index 7b29288..73ca566 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -377,18 +377,11 @@ static unsigned int d945gtp5_pin_configs[10] = { 0x02a19320, 0x40000100, }; -static unsigned int d965_2112_pin_configs[10] = { - 0x0221401f, 0x40000100, 0x40000100, 0x01014011, - 0x01a19021, 0x01813024, 0x01452130, 0x40000100, - 0x02a19320, 0x40000100, -}; - static unsigned int *stac922x_brd_tbl[STAC_922X_MODELS] = { [STAC_REF] = ref922x_pin_configs, [STAC_D945GTP3] = d945gtp3_pin_configs, [STAC_D945GTP5] = d945gtp5_pin_configs, [STAC_MACMINI] = d945gtp5_pin_configs, - [STAC_D965_2112] = d965_2112_pin_configs, }; static struct hda_board_config stac922x_cfg_tbl[] = { @@ -493,8 +486,16 @@ static unsigned int ref927x_pin_configs[14] = { 0x01c41030, 0x40000100, }; +static unsigned int d965_2112_pin_configs[14] = { + 0x0221401f, 0x02a19120, 0x40000100, 0x01014011, + 0x01a19021, 0x01813024, 0x40000100, 0x40000100, + 0x40000100, 0x40000100, 0x40000100, 0x40000100, + 0x40000100, 0x40000100 +}; + static unsigned int *stac927x_brd_tbl[] = { - ref927x_pin_configs, + [STAC_REF] = ref927x_pin_configs, + [STAC_D965_2112] = d965_2112_pin_configs, }; static struct hda_board_config stac927x_cfg_tbl[] = { @@ -502,6 +503,66 @@ static struct hda_board_config stac927x_cfg_tbl[] = { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2668, /* DFI LanParty */ .config = STAC_REF }, /* SigmaTel reference board */ + /* SigmaTel 9227 reference board */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x284b, + .config = STAC_D965_284B }, + /* Intel 946 based systems */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x3d01, + .config = STAC_D965_2112 }, /* D946 configuration */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0xa301, + .config = STAC_D965_2112 }, /* Intel D946GZT - 3 stack */ + /* 965 based systems */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2116, + .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2115, + .config = STAC_D965_2112 }, /* Intel DQ965WC - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2114, + .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2113, + .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2112, + .config = STAC_D965_2112 }, /* Intel DG965MS - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2111, + .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2110, + .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2009, + .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2008, + .config = STAC_D965_2112 }, /* Intel DQ965GF - 3 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2007, + .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2006, + .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2005, + .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2004, + .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2003, + .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2002, + .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2001, + .config = STAC_D965_2112 }, /* Intel DQ965GF - 3 Stackg */ {} /* terminator */ }; @@ -1391,25 +1452,6 @@ static int patch_stac922x(struct hda_codec *codec) spec->multiout.dac_nids = spec->dac_nids; - switch (spec->board_config) { - case STAC_D965_2112: - spec->adc_nids = stac9227_adc_nids; - spec->mux_nids = stac9227_mux_nids; -#if 0 - spec->multiout.dac_nids = d965_2112_dac_nids; - spec->multiout.num_dacs = ARRAY_SIZE(d965_2112_dac_nids); -#endif - spec->init = d965_2112_core_init; - spec->mixer = stac9227_mixer; - break; - case STAC_D965_284B: - spec->adc_nids = stac9227_adc_nids; - spec->mux_nids = stac9227_mux_nids; - spec->init = stac9227_core_init; - spec->mixer = stac9227_mixer; - break; - } - err = stac92xx_parse_auto_config(codec, 0x08, 0x09); if (err < 0) { stac92xx_free(codec); @@ -1437,19 +1479,35 @@ static int patch_stac927x(struct hda_codec *codec) spec->board_config = snd_hda_check_board_config(codec, stac927x_cfg_tbl); if (spec->board_config < 0) snd_printdd(KERN_INFO "hda_codec: Unknown model for STAC927x, using BIOS defaults\n"); - else { + else if (stac927x_brd_tbl[spec->board_config] != NULL) { spec->num_pins = 14; spec->pin_nids = stac927x_pin_nids; spec->pin_configs = stac927x_brd_tbl[spec->board_config]; stac92xx_set_config_regs(codec); } - spec->adc_nids = stac927x_adc_nids; - spec->mux_nids = stac927x_mux_nids; - spec->num_muxes = 3; - - spec->init = stac927x_core_init; - spec->mixer = stac927x_mixer; + switch (spec->board_config) { + case STAC_D965_2112: + spec->adc_nids = stac927x_adc_nids; + spec->mux_nids = stac927x_mux_nids; + spec->num_muxes = 3; + spec->init = d965_2112_core_init; + spec->mixer = stac9227_mixer; + break; + case STAC_D965_284B: + spec->adc_nids = stac9227_adc_nids; + spec->mux_nids = stac9227_mux_nids; + spec->num_muxes = 2; + spec->init = stac9227_core_init; + spec->mixer = stac9227_mixer; + break; + default: + spec->adc_nids = stac927x_adc_nids; + spec->mux_nids = stac927x_mux_nids; + spec->num_muxes = 3; + spec->init = stac927x_core_init; + spec->mixer = stac927x_mixer; + } spec->multiout.dac_nids = spec->dac_nids; -- cgit v0.10.2 From e96224ae974844d3f4e84f927ca4b17f1a2079a3 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Mon, 21 Aug 2006 17:57:44 +0200 Subject: [ALSA] hda-intel - Switch to polling mode for CORB/RIRB communication Automatically switch to polling mode for CORB/RIRB communication if the irq-driven mode seems not working well. If the polling mode still doesn't work, switch to single_cmd mode as fallback. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 79d63c9..ce75e07 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -332,6 +332,7 @@ struct azx { int position_fix; unsigned int initialized: 1; unsigned int single_cmd: 1; + unsigned int polling_mode: 1; }; /* driver types */ @@ -518,8 +519,23 @@ static unsigned int azx_rirb_get_response(struct hda_codec *codec) struct azx *chip = codec->bus->private_data; int timeout = 50; - while (chip->rirb.cmds) { + for (;;) { + if (chip->polling_mode) { + spin_lock_irq(&chip->reg_lock); + azx_update_rirb(chip); + spin_unlock_irq(&chip->reg_lock); + } + if (! chip->rirb.cmds) + break; if (! --timeout) { + if (! chip->polling_mode) { + snd_printk(KERN_WARNING "hda_intel: " + "azx_get_response timeout, " + "switching to polling mode...\n"); + chip->polling_mode = 1; + timeout = 50; + continue; + } snd_printk(KERN_ERR "hda_intel: azx_get_response timeout, " "switching to single_cmd mode...\n"); -- cgit v0.10.2 From c27354460b1e0cbcd9dfc9232a76bd56c46dce89 Mon Sep 17 00:00:00 2001 From: Krzysztof Helt Date: Mon, 21 Aug 2006 19:27:35 +0200 Subject: [ALSA] sparc dbri: removal of dri_desc struct The structure is in big part redudant. Signed-off-by: Krzysztof Helt Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/sparc/dbri.c b/sound/sparc/dbri.c index 405c603..0b8545a 100644 --- a/sound/sparc/dbri.c +++ b/sound/sparc/dbri.c @@ -250,6 +250,7 @@ static struct { #define DBRI_NO_STREAMS 2 /* One transmit/receive descriptor */ +/* When ba != 0 descriptor is used */ struct dbri_mem { volatile __u32 word1; __u32 ba; /* Transmit/Receive Buffer Address */ @@ -282,12 +283,6 @@ struct dbri_pipe { volatile __u32 *recv_fixed_ptr; /* Ptr to receive fixed data */ }; -struct dbri_desc { - int inuse; /* Boolean flag */ - int next; /* Index of next desc, or -1 */ - unsigned int len; -}; - /* Per stream (playback or record) information */ struct dbri_streaminfo { struct snd_pcm_substream *substream; @@ -317,7 +312,7 @@ struct snd_dbri { int wait_ackd; /* sequence of command buffers acknowledged */ struct dbri_pipe pipes[DBRI_NO_PIPES]; /* DBRI's 32 data pipes */ - struct dbri_desc descs[DBRI_NO_DESCS]; + int next_desc[DBRI_NO_DESCS]; /* Index of next desc, or -1 */ int chi_in_pipe; int chi_out_pipe; @@ -803,8 +798,8 @@ static void reset_pipe(struct snd_dbri * dbri, int pipe) desc = dbri->pipes[pipe].first_desc; while (desc != -1) { - dbri->descs[desc].inuse = 0; - desc = dbri->descs[desc].next; + dbri->dma->desc[desc].nda = dbri->dma->desc[desc].ba = 0; + desc = dbri->next_desc[desc]; } dbri->pipes[pipe].desc = -1; @@ -1093,7 +1088,7 @@ static int setup_descs(struct snd_dbri * dbri, int streamno, unsigned int period int mylen; for (; desc < DBRI_NO_DESCS; desc++) { - if (!dbri->descs[desc].inuse) + if (!dbri->dma->desc[desc].ba) break; } if (desc == DBRI_NO_DESCS) { @@ -1110,19 +1105,16 @@ static int setup_descs(struct snd_dbri * dbri, int streamno, unsigned int period mylen = period; } - dbri->descs[desc].inuse = 1; - dbri->descs[desc].next = -1; + dbri->next_desc[desc] = -1; dbri->dma->desc[desc].ba = dvma_buffer; dbri->dma->desc[desc].nda = 0; if (streamno == DBRI_PLAY) { - dbri->descs[desc].len = mylen; dbri->dma->desc[desc].word1 = DBRI_TD_CNT(mylen); dbri->dma->desc[desc].word4 = 0; if (first_desc != -1) dbri->dma->desc[desc].word1 |= DBRI_TD_M; } else { - dbri->descs[desc].len = 0; dbri->dma->desc[desc].word1 = 0; dbri->dma->desc[desc].word4 = DBRI_RD_B | DBRI_RD_BCNT(mylen); @@ -1131,7 +1123,7 @@ static int setup_descs(struct snd_dbri * dbri, int streamno, unsigned int period if (first_desc == -1) { first_desc = desc; } else { - dbri->descs[last_desc].next = desc; + dbri->next_desc[last_desc] = desc; dbri->dma->desc[last_desc].nda = dbri->dma_dvma + dbri_dma_off(desc, desc); } @@ -1154,7 +1146,7 @@ static int setup_descs(struct snd_dbri * dbri, int streamno, unsigned int period dbri->pipes[info->pipe].first_desc = first_desc; dbri->pipes[info->pipe].desc = first_desc; - for (desc = first_desc; desc != -1; desc = dbri->descs[desc].next) { + for (desc = first_desc; desc != -1; desc = dbri->next_desc[desc]) { dprintk(D_DESC, "DESC %d: %08x %08x %08x %08x\n", desc, dbri->dma->desc[desc].word1, @@ -1747,6 +1739,7 @@ static void transmission_complete_intr(struct snd_dbri * dbri, int pipe) struct dbri_streaminfo *info; int td; int status; + int len; info = &dbri->stream_info[DBRI_PLAY]; @@ -1765,11 +1758,12 @@ static void transmission_complete_intr(struct snd_dbri * dbri, int pipe) dprintk(D_INT, "TD %d, status 0x%02x\n", td, status); dbri->dma->desc[td].word4 = 0; /* Reset it for next time. */ - info->offset += dbri->descs[td].len; - info->left -= dbri->descs[td].len; + len = DBRI_RD_CNT(dbri->dma->desc[td].word1); + info->offset += len; + info->left -= len; /* On the last TD, transmit them all again. */ - if (dbri->descs[td].next == -1) { + if (dbri->next_desc[td] == -1) { if (info->left > 0) { printk(KERN_WARNING "%d bytes left after last transfer.\n", @@ -1779,7 +1773,7 @@ static void transmission_complete_intr(struct snd_dbri * dbri, int pipe) tasklet_schedule(&xmit_descs_task); } - td = dbri->descs[td].next; + td = dbri->next_desc[td]; dbri->pipes[pipe].desc = td; } @@ -1803,8 +1797,8 @@ static void reception_complete_intr(struct snd_dbri * dbri, int pipe) return; } - dbri->descs[rd].inuse = 0; - dbri->pipes[pipe].desc = dbri->descs[rd].next; + dbri->dma->desc[rd].ba = 0; + dbri->pipes[pipe].desc = dbri->next_desc[rd]; status = dbri->dma->desc[rd].word1; dbri->dma->desc[rd].word1 = 0; /* Reset it for next time. */ @@ -1818,7 +1812,7 @@ static void reception_complete_intr(struct snd_dbri * dbri, int pipe) rd, DBRI_RD_STATUS(status), DBRI_RD_CNT(status)); /* On the last TD, transmit them all again. */ - if (dbri->descs[rd].next == -1) { + if (dbri->next_desc[rd] == -1) { if (info->left > info->size) { printk(KERN_WARNING "%d bytes recorded in %d size buffer.\n", -- cgit v0.10.2 From 470f1f1a1c2597fab98339ab0966dbf602d604f0 Mon Sep 17 00:00:00 2001 From: Krzysztof Helt Date: Mon, 21 Aug 2006 19:28:16 +0200 Subject: [ALSA] sparc dbri: more driver cleanup A general clean up and redudant code removal. Signed-off-by: Krzysztof Helt Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/sparc/dbri.c b/sound/sparc/dbri.c index 0b8545a..6fc37c9 100644 --- a/sound/sparc/dbri.c +++ b/sound/sparc/dbri.c @@ -241,9 +241,7 @@ static struct { #define DBRI_INT_BLK 64 #define DBRI_NO_DESCS 64 #define DBRI_NO_PIPES 32 - -#define DBRI_MM_ONB 1 -#define DBRI_MM_SB 2 +#define DBRI_MAX_PIPE (DBRI_NO_PIPES - 1) #define DBRI_REC 0 #define DBRI_PLAY 1 @@ -650,10 +648,6 @@ static volatile s32 *dbri_cmdlock(struct snd_dbri * dbri, enum dbri_lock get) /* Delay if previous commands are still being processed */ while ((--maxloops) > 0 && (dbri->wait_send != dbri->wait_ackd)) { msleep_interruptible(1); - /* If dbri_cmdlock() got called from inside the - * interrupt handler, this will do the processing. - */ - dbri_process_interrupt_buffer(dbri); } if (maxloops == 0) { printk(KERN_ERR "DBRI: Chip never completed command buffer %d\n", @@ -780,7 +774,7 @@ static void reset_pipe(struct snd_dbri * dbri, int pipe) int desc; volatile int *cmd; - if (pipe < 0 || pipe > 31) { + if (pipe < 0 || pipe > DBRI_MAX_PIPE) { printk(KERN_ERR "DBRI: reset_pipe called with illegal pipe number\n"); return; } @@ -806,10 +800,9 @@ static void reset_pipe(struct snd_dbri * dbri, int pipe) dbri->pipes[pipe].first_desc = -1; } -/* FIXME: direction as an argument? */ static void setup_pipe(struct snd_dbri * dbri, int pipe, int sdp) { - if (pipe < 0 || pipe > 31) { + if (pipe < 0 || pipe > DBRI_MAX_PIPE) { printk(KERN_ERR "DBRI: setup_pipe called with illegal pipe number\n"); return; } @@ -843,7 +836,7 @@ static void link_time_slot(struct snd_dbri * dbri, int pipe, int prevpipe; int nextpipe; - if (pipe < 0 || pipe > 31 || basepipe < 0 || basepipe > 31) { + if (pipe < 0 || pipe > DBRI_MAX_PIPE || basepipe < 0 || basepipe > DBRI_MAX_PIPE) { printk(KERN_ERR "DBRI: link_time_slot called with illegal pipe number\n"); return; @@ -931,7 +924,8 @@ static void unlink_time_slot(struct snd_dbri * dbri, int pipe, volatile s32 *cmd; int val; - if (pipe < 0 || pipe > 31 || prevpipe < 0 || prevpipe > 31) { + if (pipe < 0 || pipe > DBRI_MAX_PIPE + || prevpipe < 0 || prevpipe > DBRI_MAX_PIPE) { printk(KERN_ERR "DBRI: unlink_time_slot called with illegal pipe number\n"); return; @@ -972,7 +966,7 @@ static void xmit_fixed(struct snd_dbri * dbri, int pipe, unsigned int data) { volatile s32 *cmd; - if (pipe < 16 || pipe > 31) { + if (pipe < 16 || pipe > DBRI_MAX_PIPE) { printk(KERN_ERR "DBRI: xmit_fixed: Illegal pipe number\n"); return; } @@ -1007,7 +1001,7 @@ static void xmit_fixed(struct snd_dbri * dbri, int pipe, unsigned int data) static void recv_fixed(struct snd_dbri * dbri, int pipe, volatile __u32 * ptr) { - if (pipe < 16 || pipe > 31) { + if (pipe < 16 || pipe > DBRI_MAX_PIPE) { printk(KERN_ERR "DBRI: recv_fixed called with illegal pipe number\n"); return; } @@ -1182,20 +1176,14 @@ static void reset_chi(struct snd_dbri * dbri, enum master_or_slave master_or_sla /* Set CHI Anchor: Pipe 16 */ - val = D_DTS_VI | D_DTS_INS | D_DTS_PRVIN(16) | D_PIPE(16); + val = D_DTS_VO | D_DTS_VI | D_DTS_INS + | D_DTS_PRVIN(16) | D_PIPE(16) | D_DTS_PRVOUT(16); *(cmd++) = DBRI_CMD(D_DTS, 0, val); *(cmd++) = D_TS_ANCHOR | D_TS_NEXT(16); - *(cmd++) = 0; - - val = D_DTS_VO | D_DTS_INS | D_DTS_PRVOUT(16) | D_PIPE(16); - *(cmd++) = DBRI_CMD(D_DTS, 0, val); - *(cmd++) = 0; *(cmd++) = D_TS_ANCHOR | D_TS_NEXT(16); dbri->pipes[16].sdp = 1; dbri->pipes[16].nextpipe = 16; - dbri->chi_in_pipe = 16; - dbri->chi_out_pipe = 16; #if 0 chi_initialized++; @@ -1214,11 +1202,10 @@ static void reset_chi(struct snd_dbri * dbri, enum master_or_slave master_or_sla 16, dbri->pipes[pipe].nextpipe); } - dbri->chi_in_pipe = 16; - dbri->chi_out_pipe = 16; - cmd = dbri_cmdlock(dbri, GetLock); } + dbri->chi_in_pipe = 16; + dbri->chi_out_pipe = 16; if (master_or_slave == CHIslave) { /* Setup DBRI for CHI Slave - receive clock, frame sync (FS) @@ -1341,8 +1328,8 @@ static void cs4215_setdata(struct snd_dbri * dbri, int muted) } else { /* Start by setting the playback attenuation. */ struct dbri_streaminfo *info = &dbri->stream_info[DBRI_PLAY]; - int left_gain = info->left_gain % 64; - int right_gain = info->right_gain % 64; + int left_gain = info->left_gain & 0x3f; + int right_gain = info->right_gain & 0x3f; dbri->mm.data[0] &= ~0x3f; /* Reset the volume bits */ dbri->mm.data[1] &= ~0x3f; @@ -1351,8 +1338,8 @@ static void cs4215_setdata(struct snd_dbri * dbri, int muted) /* Now set the recording gain. */ info = &dbri->stream_info[DBRI_REC]; - left_gain = info->left_gain % 16; - right_gain = info->right_gain % 16; + left_gain = info->left_gain & 0xf; + right_gain = info->right_gain & 0xf; dbri->mm.data[2] |= CS4215_LG(left_gain); dbri->mm.data[3] |= CS4215_RG(right_gain); } -- cgit v0.10.2 From d1fdf07e22efdb9fa53739c0f0fec1f6b24c2056 Mon Sep 17 00:00:00 2001 From: Krzysztof Helt Date: Mon, 21 Aug 2006 19:29:18 +0200 Subject: [ALSA] sparc dbri: fixed setting of burst size after reset A proper way to set DBRI's burst size. The size must be set after each chip reset. Signed-off-by: Krzysztof Helt Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/sparc/dbri.c b/sound/sparc/dbri.c index 6fc37c9..810f8b9 100644 --- a/sound/sparc/dbri.c +++ b/sound/sparc/dbri.c @@ -690,6 +690,7 @@ static void dbri_cmdsend(struct snd_dbri * dbri, volatile s32 * cmd) static void dbri_reset(struct snd_dbri * dbri) { int i; + u32 tmp; dprintk(D_GEN, "reset 0:%x 2:%x 8:%x 9:%x\n", sbus_readl(dbri->regs + REG0), @@ -699,13 +700,20 @@ static void dbri_reset(struct snd_dbri * dbri) sbus_writel(D_R, dbri->regs + REG0); /* Soft Reset */ for (i = 0; (sbus_readl(dbri->regs + REG0) & D_R) && i < 64; i++) udelay(10); + + /* A brute approach - DBRI falls back to working burst size by itself + * On SS20 D_S does not work, so do not try so high. */ + tmp = sbus_readl(dbri->regs + REG0); + tmp |= D_G | D_E; + tmp &= ~D_S; + sbus_writel(tmp, dbri->regs + REG0); } /* Lock must not be held before calling this */ static void dbri_initialize(struct snd_dbri * dbri) { volatile s32 *cmd; - u32 dma_addr, tmp; + u32 dma_addr; unsigned long flags; int n; @@ -721,13 +729,6 @@ static void dbri_initialize(struct snd_dbri * dbri) for (n = 0; n < DBRI_NO_PIPES; n++) dbri->pipes[n].desc = dbri->pipes[n].first_desc = -1; - /* A brute approach - DBRI falls back to working burst size by itself - * On SS20 D_S does not work, so do not try so high. */ - tmp = sbus_readl(dbri->regs + REG0); - tmp |= D_G | D_E; - tmp &= ~D_S; - sbus_writel(tmp, dbri->regs + REG0); - /* * Initialize the interrupt ringbuffer. */ -- cgit v0.10.2 From 294a30dc8cf13c492913f2ed3a6540bdf6e84e39 Mon Sep 17 00:00:00 2001 From: Krzysztof Helt Date: Mon, 21 Aug 2006 19:29:59 +0200 Subject: [ALSA] sparc dbri: simplifed linking time slot function A simplified routines to link and unlink time slots. Signed-off-by: Krzysztof Helt Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/sparc/dbri.c b/sound/sparc/dbri.c index 810f8b9..5696f79 100644 --- a/sound/sparc/dbri.c +++ b/sound/sparc/dbri.c @@ -274,7 +274,6 @@ enum in_or_out { PIPEinput, PIPEoutput }; struct dbri_pipe { u32 sdp; /* SDP command word */ int nextpipe; /* Next pipe in linked list */ - int cycle; /* Offset of timeslot (bits) */ int length; /* Length of timeslot (bits) */ int first_desc; /* Index of first descriptor */ int desc; /* Index of active descriptor */ @@ -312,8 +311,6 @@ struct snd_dbri { struct dbri_pipe pipes[DBRI_NO_PIPES]; /* DBRI's 32 data pipes */ int next_desc[DBRI_NO_DESCS]; /* Index of next desc, or -1 */ - int chi_in_pipe; - int chi_out_pipe; int chi_bpf; struct cs4215 mm; /* mmcodec special info */ @@ -827,92 +824,55 @@ static void setup_pipe(struct snd_dbri * dbri, int pipe, int sdp) reset_pipe(dbri, pipe); } -/* FIXME: direction not needed */ static void link_time_slot(struct snd_dbri * dbri, int pipe, - enum in_or_out direction, int basepipe, + int prevpipe, int nextpipe, int length, int cycle) { volatile s32 *cmd; int val; - int prevpipe; - int nextpipe; - if (pipe < 0 || pipe > DBRI_MAX_PIPE || basepipe < 0 || basepipe > DBRI_MAX_PIPE) { + if (pipe < 0 || pipe > DBRI_MAX_PIPE + || prevpipe < 0 || prevpipe > DBRI_MAX_PIPE + || nextpipe < 0 || nextpipe > DBRI_MAX_PIPE) { printk(KERN_ERR "DBRI: link_time_slot called with illegal pipe number\n"); return; } - if (dbri->pipes[pipe].sdp == 0 || dbri->pipes[basepipe].sdp == 0) { + if (dbri->pipes[pipe].sdp == 0 + || dbri->pipes[prevpipe].sdp == 0 + || dbri->pipes[nextpipe].sdp == 0) { printk(KERN_ERR "DBRI: link_time_slot called on uninitialized pipe\n"); return; } - /* Deal with CHI special case: - * "If transmission on edges 0 or 1 is desired, then cycle n - * (where n = # of bit times per frame...) must be used." - * - DBRI data sheet, page 11 - */ - if (basepipe == 16 && direction == PIPEoutput && cycle == 0) - cycle = dbri->chi_bpf; - - if (basepipe == pipe) { - prevpipe = pipe; - nextpipe = pipe; - } else { - /* We're not initializing a new linked list (basepipe != pipe), - * so run through the linked list and find where this pipe - * should be sloted in, based on its cycle. CHI confuses - * things a bit, since it has a single anchor for both its - * transmit and receive lists. - */ - if (basepipe == 16) { - if (direction == PIPEinput) { - prevpipe = dbri->chi_in_pipe; - } else { - prevpipe = dbri->chi_out_pipe; - } - } else { - prevpipe = basepipe; - } - - nextpipe = dbri->pipes[prevpipe].nextpipe; - - while (dbri->pipes[nextpipe].cycle < cycle - && dbri->pipes[nextpipe].nextpipe != basepipe) { - prevpipe = nextpipe; - nextpipe = dbri->pipes[nextpipe].nextpipe; - } - } - - if (prevpipe == 16) { - if (direction == PIPEinput) { - dbri->chi_in_pipe = pipe; - } else { - dbri->chi_out_pipe = pipe; - } - } else { - dbri->pipes[prevpipe].nextpipe = pipe; - } + dbri->pipes[prevpipe].nextpipe = pipe; dbri->pipes[pipe].nextpipe = nextpipe; - dbri->pipes[pipe].cycle = cycle; dbri->pipes[pipe].length = length; cmd = dbri_cmdlock(dbri, NoGetLock); - if (direction == PIPEinput) { - val = D_DTS_VI | D_DTS_INS | D_DTS_PRVIN(prevpipe) | pipe; + if (dbri->pipes[pipe].sdp & D_SDP_TO_SER) { + /* Deal with CHI special case: + * "If transmission on edges 0 or 1 is desired, then cycle n + * (where n = # of bit times per frame...) must be used." + * - DBRI data sheet, page 11 + */ + if (prevpipe == 16 && cycle == 0) + cycle = dbri->chi_bpf; + + val = D_DTS_VO | D_DTS_INS | D_DTS_PRVOUT(prevpipe) | pipe; *(cmd++) = DBRI_CMD(D_DTS, 0, val); + *(cmd++) = 0; *(cmd++) = D_TS_LEN(length) | D_TS_CYCLE(cycle) | D_TS_NEXT(nextpipe); - *(cmd++) = 0; } else { - val = D_DTS_VO | D_DTS_INS | D_DTS_PRVOUT(prevpipe) | pipe; + val = D_DTS_VI | D_DTS_INS | D_DTS_PRVIN(prevpipe) | pipe; *(cmd++) = DBRI_CMD(D_DTS, 0, val); - *(cmd++) = 0; *(cmd++) = D_TS_LEN(length) | D_TS_CYCLE(cycle) | D_TS_NEXT(nextpipe); + *(cmd++) = 0; } dbri_cmdsend(dbri, cmd); @@ -1192,21 +1152,18 @@ static void reset_chi(struct snd_dbri * dbri, enum master_or_slave master_or_sla } else { int pipe; - for (pipe = dbri->chi_in_pipe; - pipe != 16; pipe = dbri->pipes[pipe].nextpipe) { - unlink_time_slot(dbri, pipe, PIPEinput, - 16, dbri->pipes[pipe].nextpipe); - } - for (pipe = dbri->chi_out_pipe; - pipe != 16; pipe = dbri->pipes[pipe].nextpipe) { - unlink_time_slot(dbri, pipe, PIPEoutput, - 16, dbri->pipes[pipe].nextpipe); - } - - cmd = dbri_cmdlock(dbri, GetLock); + for (pipe = 0; pipe < DBRI_NO_PIPES; pipe++ ) + if ( pipe != 16 ) { + if (dbri->pipes[pipe].sdp & D_SDP_TO_SER) + unlink_time_slot(dbri, pipe, PIPEoutput, + 16, dbri->pipes[pipe].nextpipe); + else + unlink_time_slot(dbri, pipe, PIPEinput, + 16, dbri->pipes[pipe].nextpipe); + } + + cmd = dbri_cmdlock(dbri, GetLock); } - dbri->chi_in_pipe = 16; - dbri->chi_out_pipe = 16; if (master_or_slave == CHIslave) { /* Setup DBRI for CHI Slave - receive clock, frame sync (FS) @@ -1397,10 +1354,10 @@ static void cs4215_open(struct snd_dbri * dbri) */ data_width = dbri->mm.channels * dbri->mm.precision; - link_time_slot(dbri, 20, PIPEoutput, 16, 32, dbri->mm.offset + 32); - link_time_slot(dbri, 4, PIPEoutput, 16, data_width, dbri->mm.offset); - link_time_slot(dbri, 6, PIPEinput, 16, data_width, dbri->mm.offset); - link_time_slot(dbri, 21, PIPEinput, 16, 16, dbri->mm.offset + 40); + link_time_slot(dbri, 4, 16, 16, data_width, dbri->mm.offset); + link_time_slot(dbri, 20, 4, 16, 32, dbri->mm.offset + 32); + link_time_slot(dbri, 6, 16, 16, data_width, dbri->mm.offset); + link_time_slot(dbri, 21, 6, 16, 16, dbri->mm.offset + 40); /* FIXME: enable CHI after _setdata? */ tmp = sbus_readl(dbri->regs + REG0); @@ -1466,9 +1423,9 @@ static int cs4215_setctrl(struct snd_dbri * dbri) * Pipe 19: Receive timeslot 7 (version). */ - link_time_slot(dbri, 17, PIPEoutput, 16, 32, dbri->mm.offset); - link_time_slot(dbri, 18, PIPEinput, 16, 8, dbri->mm.offset); - link_time_slot(dbri, 19, PIPEinput, 16, 8, dbri->mm.offset + 48); + link_time_slot(dbri, 17, 16, 16, 32, dbri->mm.offset); + link_time_slot(dbri, 18, 16, 16, 8, dbri->mm.offset); + link_time_slot(dbri, 19, 18, 16, 8, dbri->mm.offset + 48); /* Wait for the chip to echo back CLB (Control Latch Bit) as zero */ dbri->mm.ctrl[0] &= ~CS4215_CLB; @@ -2445,11 +2402,11 @@ static void dbri_debug_read(struct snd_info_entry * entry, struct dbri_pipe *pptr = &dbri->pipes[pipe]; snd_iprintf(buffer, "Pipe %d: %s SDP=0x%x desc=%d, " - "len=%d @ %d next %d\n", + "len=%d next %d\n", pipe, ((pptr->sdp & D_SDP_TO_SER) ? "output" : "input"), pptr->sdp, pptr->desc, - pptr->length, pptr->cycle, pptr->nextpipe); + pptr->length, pptr->nextpipe); } } } -- cgit v0.10.2 From 1be54c824be9b5e163cd83dabdf0ad3ac81c72a8 Mon Sep 17 00:00:00 2001 From: Krzysztof Helt Date: Mon, 21 Aug 2006 19:30:57 +0200 Subject: [ALSA] sparc dbri: ring buffered version It is a complete rework of low level layer to work on ring buffers for comands and data descriptors. This removes annoying noise due to delay in data buffer switching. Signed-off-by: Krzysztof Helt Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/sparc/dbri.c b/sound/sparc/dbri.c index 5696f79..3fb2ede 100644 --- a/sound/sparc/dbri.c +++ b/sound/sparc/dbri.c @@ -2,6 +2,8 @@ * Driver for DBRI sound chip found on Sparcs. * Copyright (C) 2004, 2005 Martin Habets (mhabets@users.sourceforge.net) * + * Converted to ring buffered version by Krzysztof Helt (krzysztof.h1@wp.pl) + * * Based entirely upon drivers/sbus/audio/dbri.c which is: * Copyright (C) 1997 Rudolf Koenig (rfkoenig@immd4.informatik.uni-erlangen.de) * Copyright (C) 1998, 1999 Brent Baccala (baccala@freesoft.org) @@ -260,7 +262,7 @@ struct dbri_mem { * the CPU and the DBRI */ struct dbri_dma { - volatile s32 cmd[DBRI_NO_CMDS]; /* Place for commands */ + s32 cmd[DBRI_NO_CMDS]; /* Place for commands */ volatile s32 intr[DBRI_INT_BLK]; /* Interrupt field */ struct dbri_mem desc[DBRI_NO_DESCS]; /* Xmit/receive descriptors */ }; @@ -284,7 +286,6 @@ struct dbri_pipe { struct dbri_streaminfo { struct snd_pcm_substream *substream; u32 dvma_buffer; /* Device view of Alsa DMA buffer */ - int left; /* # of bytes left in DMA buffer */ int size; /* Size of DMA buffer */ size_t offset; /* offset in user buffer */ int pipe; /* Data pipe used */ @@ -305,11 +306,11 @@ struct snd_dbri { void __iomem *regs; /* dbri HW regs */ int dbri_irqp; /* intr queue pointer */ - int wait_send; /* sequence of command buffers send */ - int wait_ackd; /* sequence of command buffers acknowledged */ struct dbri_pipe pipes[DBRI_NO_PIPES]; /* DBRI's 32 data pipes */ int next_desc[DBRI_NO_DESCS]; /* Index of next desc, or -1 */ + spinlock_t cmdlock; /* Protects cmd queue accesses */ + s32 *cmdptr; /* Pointer to the last queued cmd */ int chi_bpf; @@ -544,7 +545,7 @@ struct snd_dbri { #define DBRI_TD_TBC (1<<0) /* Transmit buffer Complete */ #define DBRI_TD_STATUS(v) ((v)&0xff) /* Transmit status */ /* Maximum buffer size per TD: almost 8Kb */ -#define DBRI_TD_MAXCNT ((1 << 13) - 1) +#define DBRI_TD_MAXCNT ((1 << 13) - 4) /* Receive descriptor defines */ #define DBRI_RD_F (1<<31) /* End of Frame */ @@ -608,79 +609,110 @@ The list is terminated with a WAIT command, which generates a CPU interrupt to signal completion. Since the DBRI can run in parallel with the CPU, several means of -synchronization present themselves. The method implemented here is close -to the original scheme (Rudolf's), and uses 2 counters (wait_send and -wait_ackd) to synchronize the command buffer between the CPU and the DBRI. +synchronization present themselves. The method implemented here is only +to use the dbri_cmdwait() to wait for execution of batch of sent commands. -A more sophisticated scheme might involve a circular command buffer -or an array of command buffers. A routine could fill one with -commands and link it onto a list. When a interrupt signaled -completion of the current command buffer, look on the list for -the next one. +A circular command buffer is used here. A new command is being added +while other can be executed. The scheme works by adding two WAIT commands +after each sent batch of commands. When the next batch is prepared it is +added after the WAIT commands then the WAITs are replaced with single JUMP +command to the new batch. The the DBRI is forced to reread the last WAIT +command (replaced by the JUMP by then). If the DBRI is still executing +previous commands the request to reread the WAIT command is ignored. Every time a routine wants to write commands to the DBRI, it must -first call dbri_cmdlock() and get an initial pointer into dbri->dma->cmd -in return. dbri_cmdlock() will block if the previous commands have not -been completed yet. After this the commands can be written to the buffer, -and dbri_cmdsend() is called with the final pointer value to send them -to the DBRI. +first call dbri_cmdlock() and get pointer to a free space in +dbri->dma->cmd buffer. After this, the commands can be written to +the buffer, and dbri_cmdsend() is called with the final pointer value +to send them to the DBRI. */ static void dbri_process_interrupt_buffer(struct snd_dbri * dbri); -enum dbri_lock { NoGetLock, GetLock }; #define MAXLOOPS 10 - -static volatile s32 *dbri_cmdlock(struct snd_dbri * dbri, enum dbri_lock get) +/* + * Wait for the current command string to execute + */ +static void dbri_cmdwait(struct snd_dbri *dbri) { int maxloops = MAXLOOPS; -#ifndef SMP - if ((get == GetLock) && spin_is_locked(&dbri->lock)) { - printk(KERN_ERR "DBRI: cmdlock called while in spinlock."); - } -#endif - /* Delay if previous commands are still being processed */ - while ((--maxloops) > 0 && (dbri->wait_send != dbri->wait_ackd)) { + while ((--maxloops) > 0 && (sbus_readl(dbri->regs + REG0) & D_P)) msleep_interruptible(1); - } + if (maxloops == 0) { - printk(KERN_ERR "DBRI: Chip never completed command buffer %d\n", - dbri->wait_send); + printk(KERN_ERR "DBRI: Chip never completed command buffer\n"); } else { dprintk(D_CMD, "Chip completed command buffer (%d)\n", MAXLOOPS - maxloops - 1); } +} +/* + * Lock the command queue and returns pointer to a space for len cmd words + * It locks the cmdlock spinlock. + */ +static s32 *dbri_cmdlock(struct snd_dbri * dbri, int len) +{ + /* Space for 2 WAIT cmds (replaced later by 1 JUMP cmd) */ + len += 2; + spin_lock(&dbri->cmdlock); + if (dbri->cmdptr - dbri->dma->cmd + len < DBRI_NO_CMDS - 2) + return dbri->cmdptr + 2; + else if (len < sbus_readl(dbri->regs + REG8) - dbri->dma_dvma) + return dbri->dma->cmd; + else + printk(KERN_ERR "DBRI: no space for commands."); - /*if (get == GetLock) spin_lock(&dbri->lock); */ - return &dbri->dma->cmd[0]; + return 0; } -static void dbri_cmdsend(struct snd_dbri * dbri, volatile s32 * cmd) +/* + * Send prepared cmd string. It works by writting a JMP cmd into + * the last WAIT cmd and force DBRI to reread the cmd. + * The JMP cmd points to the new cmd string. + * It also releases the cmdlock spinlock. + */ +static void dbri_cmdsend(struct snd_dbri * dbri, s32 * cmd,int len) { - volatile s32 *ptr; + s32 *ptr; + s32 tmp, addr; + static int wait_id = 0; - for (ptr = &dbri->dma->cmd[0]; ptr < cmd; ptr++) { - dprintk(D_CMD, "cmd: %lx:%08x\n", (unsigned long)ptr, *ptr); - } + wait_id++; + wait_id &= 0xffff; /* restrict it to a 16 bit counter. */ + *(cmd) = DBRI_CMD(D_WAIT, 1, wait_id); + *(cmd+1) = DBRI_CMD(D_WAIT, 1, wait_id); - if ((cmd - &dbri->dma->cmd[0]) >= DBRI_NO_CMDS - 1) { - printk(KERN_ERR "DBRI: Command buffer overflow! (bug in driver)\n"); - /* Ignore the last part. */ - cmd = &dbri->dma->cmd[DBRI_NO_CMDS - 3]; - } + /* Replace the last command with JUMP */ + addr = dbri->dma_dvma + (cmd - len - dbri->dma->cmd) * sizeof(s32); + *(dbri->cmdptr+1) = addr; + *(dbri->cmdptr) = DBRI_CMD(D_JUMP, 0, 0); - dbri->wait_send++; - dbri->wait_send &= 0xffff; /* restrict it to a 16 bit counter. */ - *(cmd++) = DBRI_CMD(D_PAUSE, 0, 0); - *(cmd++) = DBRI_CMD(D_WAIT, 1, dbri->wait_send); +#ifdef DBRI_DEBUG + if (cmd > dbri->cmdptr ) + for (ptr = dbri->cmdptr; ptr < cmd+2; ptr++) { + dprintk(D_CMD, "cmd: %lx:%08x\n", (unsigned long)ptr, *ptr); + } + else { + ptr = dbri->cmdptr; + dprintk(D_CMD, "cmd: %lx:%08x\n", (unsigned long)ptr, *ptr); + ptr = dbri->cmdptr+1; + dprintk(D_CMD, "cmd: %lx:%08x\n", (unsigned long)ptr, *ptr); + for (ptr = dbri->dma->cmd; ptr < cmd+2; ptr++) { + dprintk(D_CMD, "cmd: %lx:%08x\n", (unsigned long)ptr, *ptr); + } + } +#endif - /* Set command pointer and signal it is valid. */ - sbus_writel(dbri->dma_dvma, dbri->regs + REG8); + /* Reread the last command */ + tmp = sbus_readl(dbri->regs + REG0); + tmp |= D_P; + sbus_writel(tmp, dbri->regs + REG0); - /*spin_unlock(&dbri->lock); */ + dbri->cmdptr = cmd; + spin_unlock(&dbri->cmdlock); } /* Lock must be held when calling this */ @@ -709,7 +741,7 @@ static void dbri_reset(struct snd_dbri * dbri) /* Lock must not be held before calling this */ static void dbri_initialize(struct snd_dbri * dbri) { - volatile s32 *cmd; + s32 *cmd; u32 dma_addr; unsigned long flags; int n; @@ -718,14 +750,11 @@ static void dbri_initialize(struct snd_dbri * dbri) dbri_reset(dbri); - cmd = dbri_cmdlock(dbri, NoGetLock); - dprintk(D_GEN, "init: cmd: %p, int: %p\n", - &dbri->dma->cmd[0], &dbri->dma->intr[0]); - /* Initialize pipes */ for (n = 0; n < DBRI_NO_PIPES; n++) dbri->pipes[n].desc = dbri->pipes[n].first_desc = -1; + spin_lock_init(&dbri->cmdlock); /* * Initialize the interrupt ringbuffer. */ @@ -735,10 +764,19 @@ static void dbri_initialize(struct snd_dbri * dbri) /* * Set up the interrupt queue */ + spin_lock(&dbri->cmdlock); + cmd = dbri->cmdptr = dbri->dma->cmd; *(cmd++) = DBRI_CMD(D_IIQ, 0, 0); *(cmd++) = dma_addr; + *(cmd++) = DBRI_CMD(D_PAUSE, 0, 0); + dbri->cmdptr = cmd; + *(cmd++) = DBRI_CMD(D_WAIT, 1, 0); + *(cmd++) = DBRI_CMD(D_WAIT, 1, 0); + dma_addr = dbri->dma_dvma + dbri_dma_off(cmd, 0); + sbus_writel(dma_addr, dbri->regs + REG8); + spin_unlock(&dbri->cmdlock); + dbri_cmdwait(dbri); - dbri_cmdsend(dbri, cmd); spin_unlock_irqrestore(&dbri->lock, flags); } @@ -770,7 +808,7 @@ static void reset_pipe(struct snd_dbri * dbri, int pipe) { int sdp; int desc; - volatile int *cmd; + s32 *cmd; if (pipe < 0 || pipe > DBRI_MAX_PIPE) { printk(KERN_ERR "DBRI: reset_pipe called with illegal pipe number\n"); @@ -783,16 +821,18 @@ static void reset_pipe(struct snd_dbri * dbri, int pipe) return; } - cmd = dbri_cmdlock(dbri, NoGetLock); + cmd = dbri_cmdlock(dbri, 3); *(cmd++) = DBRI_CMD(D_SDP, 0, sdp | D_SDP_C | D_SDP_P); *(cmd++) = 0; - dbri_cmdsend(dbri, cmd); + *(cmd++) = DBRI_CMD(D_PAUSE, 0, 0); + dbri_cmdsend(dbri, cmd, 3); desc = dbri->pipes[pipe].first_desc; - while (desc != -1) { - dbri->dma->desc[desc].nda = dbri->dma->desc[desc].ba = 0; - desc = dbri->next_desc[desc]; - } + if ( desc >= 0) + do { + dbri->dma->desc[desc].nda = dbri->dma->desc[desc].ba = 0; + desc = dbri->next_desc[desc]; + } while (desc != -1 && desc != dbri->pipes[pipe].first_desc); dbri->pipes[pipe].desc = -1; dbri->pipes[pipe].first_desc = -1; @@ -828,7 +868,7 @@ static void link_time_slot(struct snd_dbri * dbri, int pipe, int prevpipe, int nextpipe, int length, int cycle) { - volatile s32 *cmd; + s32 *cmd; int val; if (pipe < 0 || pipe > DBRI_MAX_PIPE @@ -847,11 +887,10 @@ static void link_time_slot(struct snd_dbri * dbri, int pipe, } dbri->pipes[prevpipe].nextpipe = pipe; - dbri->pipes[pipe].nextpipe = nextpipe; dbri->pipes[pipe].length = length; - cmd = dbri_cmdlock(dbri, NoGetLock); + cmd = dbri_cmdlock(dbri, 4); if (dbri->pipes[pipe].sdp & D_SDP_TO_SER) { /* Deal with CHI special case: @@ -874,25 +913,27 @@ static void link_time_slot(struct snd_dbri * dbri, int pipe, D_TS_LEN(length) | D_TS_CYCLE(cycle) | D_TS_NEXT(nextpipe); *(cmd++) = 0; } + *(cmd++) = DBRI_CMD(D_PAUSE, 0, 0); - dbri_cmdsend(dbri, cmd); + dbri_cmdsend(dbri, cmd, 4); } static void unlink_time_slot(struct snd_dbri * dbri, int pipe, enum in_or_out direction, int prevpipe, int nextpipe) { - volatile s32 *cmd; + s32 *cmd; int val; if (pipe < 0 || pipe > DBRI_MAX_PIPE - || prevpipe < 0 || prevpipe > DBRI_MAX_PIPE) { + || prevpipe < 0 || prevpipe > DBRI_MAX_PIPE + || nextpipe < 0 || nextpipe > DBRI_MAX_PIPE) { printk(KERN_ERR "DBRI: unlink_time_slot called with illegal pipe number\n"); return; } - cmd = dbri_cmdlock(dbri, NoGetLock); + cmd = dbri_cmdlock(dbri, 4); if (direction == PIPEinput) { val = D_DTS_VI | D_DTS_DEL | D_DTS_PRVIN(prevpipe) | pipe; @@ -905,8 +946,9 @@ static void unlink_time_slot(struct snd_dbri * dbri, int pipe, *(cmd++) = 0; *(cmd++) = D_TS_NEXT(nextpipe); } + *(cmd++) = DBRI_CMD(D_PAUSE, 0, 0); - dbri_cmdsend(dbri, cmd); + dbri_cmdsend(dbri, cmd, 4); } /* xmit_fixed() / recv_fixed() @@ -925,7 +967,7 @@ static void unlink_time_slot(struct snd_dbri * dbri, int pipe, */ static void xmit_fixed(struct snd_dbri * dbri, int pipe, unsigned int data) { - volatile s32 *cmd; + s32 *cmd; if (pipe < 16 || pipe > DBRI_MAX_PIPE) { printk(KERN_ERR "DBRI: xmit_fixed: Illegal pipe number\n"); @@ -952,12 +994,14 @@ static void xmit_fixed(struct snd_dbri * dbri, int pipe, unsigned int data) if (dbri->pipes[pipe].sdp & D_SDP_MSB) data = reverse_bytes(data, dbri->pipes[pipe].length); - cmd = dbri_cmdlock(dbri, GetLock); + cmd = dbri_cmdlock(dbri, 3); *(cmd++) = DBRI_CMD(D_SSP, 0, pipe); *(cmd++) = data; + *(cmd++) = DBRI_CMD(D_PAUSE, 0, 0); - dbri_cmdsend(dbri, cmd); + dbri_cmdsend(dbri, cmd, 3); + dbri_cmdwait(dbri); } static void recv_fixed(struct snd_dbri * dbri, int pipe, volatile __u32 * ptr) @@ -991,6 +1035,8 @@ static void recv_fixed(struct snd_dbri * dbri, int pipe, volatile __u32 * ptr) * and work by building chains of descriptors which identify the * data buffers. Buffers too large for a single descriptor will * be spread across multiple descriptors. + * + * All descriptors create a ring buffer. */ static int setup_descs(struct snd_dbri * dbri, int streamno, unsigned int period) { @@ -1051,14 +1097,13 @@ static int setup_descs(struct snd_dbri * dbri, int streamno, unsigned int period return -1; } - if (len > DBRI_TD_MAXCNT) { - mylen = DBRI_TD_MAXCNT; /* 8KB - 1 */ - } else { + if (len > DBRI_TD_MAXCNT) + mylen = DBRI_TD_MAXCNT; /* 8KB - 4 */ + else mylen = len; - } - if (mylen > period) { + + if (mylen > period) mylen = period; - } dbri->next_desc[desc] = -1; dbri->dma->desc[desc].ba = dvma_buffer; @@ -1067,17 +1112,17 @@ static int setup_descs(struct snd_dbri * dbri, int streamno, unsigned int period if (streamno == DBRI_PLAY) { dbri->dma->desc[desc].word1 = DBRI_TD_CNT(mylen); dbri->dma->desc[desc].word4 = 0; - if (first_desc != -1) - dbri->dma->desc[desc].word1 |= DBRI_TD_M; + dbri->dma->desc[desc].word1 |= + DBRI_TD_F | DBRI_TD_B; } else { dbri->dma->desc[desc].word1 = 0; dbri->dma->desc[desc].word4 = DBRI_RD_B | DBRI_RD_BCNT(mylen); } - if (first_desc == -1) { + if (first_desc == -1) first_desc = desc; - } else { + else { dbri->next_desc[last_desc] = desc; dbri->dma->desc[last_desc].nda = dbri->dma_dvma + dbri_dma_off(desc, desc); @@ -1093,21 +1138,28 @@ static int setup_descs(struct snd_dbri * dbri, int streamno, unsigned int period return -1; } - dbri->dma->desc[last_desc].word1 &= ~DBRI_TD_M; if (streamno == DBRI_PLAY) { dbri->dma->desc[last_desc].word1 |= - DBRI_TD_I | DBRI_TD_F | DBRI_TD_B; + DBRI_TD_F | DBRI_TD_B; + dbri->dma->desc[last_desc].nda = + dbri->dma_dvma + dbri_dma_off(desc, first_desc); + dbri->next_desc[last_desc] = first_desc; } dbri->pipes[info->pipe].first_desc = first_desc; dbri->pipes[info->pipe].desc = first_desc; - for (desc = first_desc; desc != -1; desc = dbri->next_desc[desc]) { +#ifdef DBRI_DEBUG + for (desc = first_desc; desc != -1; ) { dprintk(D_DESC, "DESC %d: %08x %08x %08x %08x\n", desc, dbri->dma->desc[desc].word1, dbri->dma->desc[desc].ba, dbri->dma->desc[desc].nda, dbri->dma->desc[desc].word4); + desc = dbri->next_desc[desc]; + if ( desc == first_desc ) + break; } +#endif return 0; } @@ -1127,43 +1179,24 @@ enum master_or_slave { CHImaster, CHIslave }; static void reset_chi(struct snd_dbri * dbri, enum master_or_slave master_or_slave, int bits_per_frame) { - volatile s32 *cmd; + s32 *cmd; int val; - static int chi_initialized = 0; /* FIXME: mutex? */ - - if (!chi_initialized) { - cmd = dbri_cmdlock(dbri, GetLock); + /* Set CHI Anchor: Pipe 16 */ - /* Set CHI Anchor: Pipe 16 */ - - val = D_DTS_VO | D_DTS_VI | D_DTS_INS - | D_DTS_PRVIN(16) | D_PIPE(16) | D_DTS_PRVOUT(16); - *(cmd++) = DBRI_CMD(D_DTS, 0, val); - *(cmd++) = D_TS_ANCHOR | D_TS_NEXT(16); - *(cmd++) = D_TS_ANCHOR | D_TS_NEXT(16); + cmd = dbri_cmdlock(dbri, 4); + val = D_DTS_VO | D_DTS_VI | D_DTS_INS + | D_DTS_PRVIN(16) | D_PIPE(16) | D_DTS_PRVOUT(16); + *(cmd++) = DBRI_CMD(D_DTS, 0, val); + *(cmd++) = D_TS_ANCHOR | D_TS_NEXT(16); + *(cmd++) = D_TS_ANCHOR | D_TS_NEXT(16); + *(cmd++) = DBRI_CMD(D_PAUSE, 0, 0); + dbri_cmdsend(dbri, cmd, 4); - dbri->pipes[16].sdp = 1; - dbri->pipes[16].nextpipe = 16; + dbri->pipes[16].sdp = 1; + dbri->pipes[16].nextpipe = 16; -#if 0 - chi_initialized++; -#endif - } else { - int pipe; - - for (pipe = 0; pipe < DBRI_NO_PIPES; pipe++ ) - if ( pipe != 16 ) { - if (dbri->pipes[pipe].sdp & D_SDP_TO_SER) - unlink_time_slot(dbri, pipe, PIPEoutput, - 16, dbri->pipes[pipe].nextpipe); - else - unlink_time_slot(dbri, pipe, PIPEinput, - 16, dbri->pipes[pipe].nextpipe); - } - - cmd = dbri_cmdlock(dbri, GetLock); - } + cmd = dbri_cmdlock(dbri, 4); if (master_or_slave == CHIslave) { /* Setup DBRI for CHI Slave - receive clock, frame sync (FS) @@ -1202,8 +1235,9 @@ static void reset_chi(struct snd_dbri * dbri, enum master_or_slave master_or_sla *(cmd++) = DBRI_CMD(D_PAUSE, 0, 0); *(cmd++) = DBRI_CMD(D_CDM, 0, D_CDM_XCE | D_CDM_XEN | D_CDM_REN); + *(cmd++) = DBRI_CMD(D_PAUSE, 0, 0); - dbri_cmdsend(dbri, cmd); + dbri_cmdsend(dbri, cmd, 4); } /* @@ -1240,6 +1274,8 @@ static void cs4215_setup_pipes(struct snd_dbri * dbri) setup_pipe(dbri, 17, D_SDP_FIXED | D_SDP_TO_SER | D_SDP_MSB); setup_pipe(dbri, 18, D_SDP_FIXED | D_SDP_FROM_SER | D_SDP_MSB); setup_pipe(dbri, 19, D_SDP_FIXED | D_SDP_FROM_SER | D_SDP_MSB); + + dbri_cmdwait(dbri); } static int cs4215_init_data(struct cs4215 *mm) @@ -1271,7 +1307,7 @@ static int cs4215_init_data(struct cs4215 *mm) mm->status = 0; mm->version = 0xff; mm->precision = 8; /* For ULAW */ - mm->channels = 2; + mm->channels = 1; return 0; } @@ -1554,7 +1590,6 @@ static int cs4215_init(struct snd_dbri * dbri) } cs4215_setup_pipes(dbri); - cs4215_init_data(&dbri->mm); /* Enable capture of the status & version timeslots. */ @@ -1583,9 +1618,7 @@ buffer and calls dbri_process_one_interrupt() for each interrupt word. Complicated interrupts are handled by dedicated functions (which appear first in this file). Any pending interrupts can be serviced by calling dbri_process_interrupt_buffer(), which works even if the CPU's -interrupts are disabled. This function is used by dbri_cmdlock() -to make sure we're synced up with the chip before each command sequence, -even if we're running cli'ed. +interrupts are disabled. */ @@ -1594,11 +1627,10 @@ even if we're running cli'ed. * Transmit the current TD's for recording/playing, if needed. * For playback, ALSA has filled the DMA memory with new data (we hope). */ -static void xmit_descs(unsigned long data) +static void xmit_descs(struct snd_dbri *dbri) { - struct snd_dbri *dbri = (struct snd_dbri *) data; struct dbri_streaminfo *info; - volatile s32 *cmd; + s32 *cmd; unsigned long flags; int first_td; @@ -1609,7 +1641,7 @@ static void xmit_descs(unsigned long data) info = &dbri->stream_info[DBRI_REC]; spin_lock_irqsave(&dbri->lock, flags); - if ((info->left >= info->size) && (info->pipe >= 0)) { + if (info->pipe >= 0) { first_td = dbri->pipes[info->pipe].first_desc; dprintk(D_DESC, "xmit_descs rec @ TD %d\n", first_td); @@ -1619,16 +1651,15 @@ static void xmit_descs(unsigned long data) goto play; } - cmd = dbri_cmdlock(dbri, NoGetLock); + cmd = dbri_cmdlock(dbri, 2); *(cmd++) = DBRI_CMD(D_SDP, 0, dbri->pipes[info->pipe].sdp | D_SDP_P | D_SDP_EVERY | D_SDP_C); *(cmd++) = dbri->dma_dvma + dbri_dma_off(desc, first_td); - dbri_cmdsend(dbri, cmd); + dbri_cmdsend(dbri, cmd, 2); /* Reset our admin of the pipe & bytes read. */ dbri->pipes[info->pipe].desc = first_td; - info->left = 0; } play: @@ -1638,33 +1669,27 @@ play: info = &dbri->stream_info[DBRI_PLAY]; spin_lock_irqsave(&dbri->lock, flags); - if ((info->left <= 0) && (info->pipe >= 0)) { + if (info->pipe >= 0) { first_td = dbri->pipes[info->pipe].first_desc; dprintk(D_DESC, "xmit_descs play @ TD %d\n", first_td); /* Stream could be closed by the time we run. */ - if (first_td < 0) { - spin_unlock_irqrestore(&dbri->lock, flags); - return; - } - - cmd = dbri_cmdlock(dbri, NoGetLock); - *(cmd++) = DBRI_CMD(D_SDP, 0, - dbri->pipes[info->pipe].sdp - | D_SDP_P | D_SDP_EVERY | D_SDP_C); - *(cmd++) = dbri->dma_dvma + dbri_dma_off(desc, first_td); - dbri_cmdsend(dbri, cmd); + if (first_td >= 0) { + cmd = dbri_cmdlock(dbri, 2); + *(cmd++) = DBRI_CMD(D_SDP, 0, + dbri->pipes[info->pipe].sdp + | D_SDP_P | D_SDP_EVERY | D_SDP_C); + *(cmd++) = dbri->dma_dvma + dbri_dma_off(desc, first_td); + dbri_cmdsend(dbri, cmd, 2); - /* Reset our admin of the pipe & bytes written. */ - dbri->pipes[info->pipe].desc = first_td; - info->left = info->size; + /* Reset our admin of the pipe & bytes written. */ + dbri->pipes[info->pipe].desc = first_td; + } } spin_unlock_irqrestore(&dbri->lock, flags); } -static DECLARE_TASKLET(xmit_descs_task, xmit_descs, 0); - /* transmission_complete_intr() * * Called by main interrupt handler when DBRI signals transmission complete @@ -1684,7 +1709,6 @@ static void transmission_complete_intr(struct snd_dbri * dbri, int pipe) struct dbri_streaminfo *info; int td; int status; - int len; info = &dbri->stream_info[DBRI_PLAY]; @@ -1703,20 +1727,7 @@ static void transmission_complete_intr(struct snd_dbri * dbri, int pipe) dprintk(D_INT, "TD %d, status 0x%02x\n", td, status); dbri->dma->desc[td].word4 = 0; /* Reset it for next time. */ - len = DBRI_RD_CNT(dbri->dma->desc[td].word1); - info->offset += len; - info->left -= len; - - /* On the last TD, transmit them all again. */ - if (dbri->next_desc[td] == -1) { - if (info->left > 0) { - printk(KERN_WARNING - "%d bytes left after last transfer.\n", - info->left); - info->left = 0; - } - tasklet_schedule(&xmit_descs_task); - } + info->offset += DBRI_RD_CNT(dbri->dma->desc[td].word1); td = dbri->next_desc[td]; dbri->pipes[pipe].desc = td; @@ -1749,7 +1760,6 @@ static void reception_complete_intr(struct snd_dbri * dbri, int pipe) info = &dbri->stream_info[DBRI_REC]; info->offset += DBRI_RD_CNT(status); - info->left += DBRI_RD_CNT(status); /* FIXME: Check status */ @@ -1757,6 +1767,7 @@ static void reception_complete_intr(struct snd_dbri * dbri, int pipe) rd, DBRI_RD_STATUS(status), DBRI_RD_CNT(status)); /* On the last TD, transmit them all again. */ +#if 0 if (dbri->next_desc[rd] == -1) { if (info->left > info->size) { printk(KERN_WARNING @@ -1765,6 +1776,7 @@ static void reception_complete_intr(struct snd_dbri * dbri, int pipe) } tasklet_schedule(&xmit_descs_task); } +#endif /* Notify ALSA */ if (spin_is_locked(&dbri->lock)) { @@ -1793,16 +1805,11 @@ static void dbri_process_one_interrupt(struct snd_dbri * dbri, int x) channel, code, rval); } - if (channel == D_INTR_CMD && command == D_WAIT) { - dbri->wait_ackd = val; - if (dbri->wait_send != val) { - printk(KERN_ERR "Processing wait command %d when %d was send.\n", - val, dbri->wait_send); - } - return; - } - switch (code) { + case D_INTR_CMDI: + if (command != D_WAIT) + printk(KERN_ERR "DBRI: Command read interrupt\n"); + break; case D_INTR_BRDY: reception_complete_intr(dbri, channel); break; @@ -1815,8 +1822,10 @@ static void dbri_process_one_interrupt(struct snd_dbri * dbri, int x) * resend SDP command with clear pipe bit (C) set */ { - volatile s32 *cmd; - + /* FIXME: do something useful in case of underrun */ + printk(KERN_ERR "DBRI: Underrun error\n"); +#if 0 + s32 *cmd; int pipe = channel; int td = dbri->pipes[pipe].desc; @@ -1827,6 +1836,7 @@ static void dbri_process_one_interrupt(struct snd_dbri * dbri, int x) | D_SDP_P | D_SDP_C | D_SDP_2SAME); *(cmd++) = dbri->dma_dvma + dbri_dma_off(desc, td); dbri_cmdsend(dbri, cmd); +#endif } break; case D_INTR_FXDT: @@ -1847,9 +1857,7 @@ static void dbri_process_one_interrupt(struct snd_dbri * dbri, int x) /* dbri_process_interrupt_buffer advances through the DBRI's interrupt * buffer until it finds a zero word (indicating nothing more to do * right now). Non-zero words require processing and are handed off - * to dbri_process_one_interrupt AFTER advancing the pointer. This - * order is important since we might recurse back into this function - * and need to make sure the pointer has been advanced first. + * to dbri_process_one_interrupt AFTER advancing the pointer. */ static void dbri_process_interrupt_buffer(struct snd_dbri * dbri) { @@ -1919,8 +1927,6 @@ static irqreturn_t snd_dbri_interrupt(int irq, void *dev_id, dbri_process_interrupt_buffer(dbri); - /* FIXME: Write 0 into regs to ACK interrupt */ - spin_unlock(&dbri->lock); return IRQ_HANDLED; @@ -1962,7 +1968,6 @@ static int snd_dbri_open(struct snd_pcm_substream *substream) spin_lock_irqsave(&dbri->lock, flags); info->substream = substream; - info->left = 0; info->offset = 0; info->dvma_buffer = 0; info->pipe = -1; @@ -1980,7 +1985,6 @@ static int snd_dbri_close(struct snd_pcm_substream *substream) dprintk(D_USR, "close audio output.\n"); info->substream = NULL; - info->left = 0; info->offset = 0; return 0; @@ -2062,10 +2066,8 @@ static int snd_dbri_prepare(struct snd_pcm_substream *substream) info->size = snd_pcm_lib_buffer_bytes(substream); if (DBRI_STREAMNO(substream) == DBRI_PLAY) info->pipe = 4; /* Send pipe */ - else { + else info->pipe = 6; /* Receive pipe */ - info->left = info->size; /* To trigger submittal */ - } spin_lock_irq(&dbri->lock); @@ -2093,14 +2095,11 @@ static int snd_dbri_trigger(struct snd_pcm_substream *substream, int cmd) case SNDRV_PCM_TRIGGER_START: dprintk(D_USR, "start audio, period is %d bytes\n", (int)snd_pcm_lib_period_bytes(substream)); - /* Enable & schedule the tasklet that re-submits the TDs. */ - xmit_descs_task.data = (unsigned long)dbri; - tasklet_schedule(&xmit_descs_task); + /* Re-submit the TDs. */ + xmit_descs(dbri); break; case SNDRV_PCM_TRIGGER_STOP: dprintk(D_USR, "stop audio.\n"); - /* Make the tasklet bail out immediately. */ - xmit_descs_task.data = 0; reset_pipe(dbri, info->pipe); break; default: @@ -2118,8 +2117,8 @@ static snd_pcm_uframes_t snd_dbri_pointer(struct snd_pcm_substream *substream) ret = bytes_to_frames(substream->runtime, info->offset) % substream->runtime->buffer_size; - dprintk(D_USR, "I/O pointer: %ld frames, %d bytes left.\n", - ret, info->left); + dprintk(D_USR, "I/O pointer: %ld frames of %ld.\n", + ret, substream->runtime->buffer_size); return ret; } -- cgit v0.10.2 From ab93c7ae54a81bcecb77608ca89eea140f1d45ad Mon Sep 17 00:00:00 2001 From: Krzysztof Helt Date: Wed, 23 Aug 2006 11:37:36 +0200 Subject: [ALSA] sparc dbri: hardware constrains added This patch adds ALSA hardware constrains so stereo is possible only with 16-bit format. It contains small cleanups to ring buffered code as well. Signed-off-by: Krzysztof Helt Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/sparc/dbri.c b/sound/sparc/dbri.c index 3fb2ede..3e6ad50 100644 --- a/sound/sparc/dbri.c +++ b/sound/sparc/dbri.c @@ -85,7 +85,7 @@ MODULE_PARM_DESC(id, "ID string for Sun DBRI soundcard."); module_param_array(enable, bool, NULL, 0444); MODULE_PARM_DESC(enable, "Enable Sun DBRI soundcard."); -#define DBRI_DEBUG +#undef DBRI_DEBUG #define D_INT (1<<0) #define D_GEN (1<<1) @@ -160,7 +160,7 @@ static struct { /* { NA, (1 << 4), (5 << 3) }, */ { 48000, (1 << 4), (6 << 3) }, { 9600, (1 << 4), (7 << 3) }, - { 5513, (2 << 4), (0 << 3) }, /* Actually 5512.5 */ + { 5512, (2 << 4), (0 << 3) }, /* Actually 5512.5 */ { 11025, (2 << 4), (1 << 3) }, { 18900, (2 << 4), (2 << 3) }, { 22050, (2 << 4), (3 << 3) }, @@ -628,8 +628,6 @@ to send them to the DBRI. */ -static void dbri_process_interrupt_buffer(struct snd_dbri * dbri); - #define MAXLOOPS 10 /* * Wait for the current command string to execute @@ -669,15 +667,15 @@ static s32 *dbri_cmdlock(struct snd_dbri * dbri, int len) } /* - * Send prepared cmd string. It works by writting a JMP cmd into + * Send prepared cmd string. It works by writting a JUMP cmd into * the last WAIT cmd and force DBRI to reread the cmd. - * The JMP cmd points to the new cmd string. + * The JUMP cmd points to the new cmd string. * It also releases the cmdlock spinlock. */ static void dbri_cmdsend(struct snd_dbri * dbri, s32 * cmd,int len) { - s32 *ptr; s32 tmp, addr; + unsigned long flags; static int wait_id = 0; wait_id++; @@ -691,14 +689,17 @@ static void dbri_cmdsend(struct snd_dbri * dbri, s32 * cmd,int len) *(dbri->cmdptr) = DBRI_CMD(D_JUMP, 0, 0); #ifdef DBRI_DEBUG - if (cmd > dbri->cmdptr ) + if (cmd > dbri->cmdptr) { + s32 *ptr; + for (ptr = dbri->cmdptr; ptr < cmd+2; ptr++) { dprintk(D_CMD, "cmd: %lx:%08x\n", (unsigned long)ptr, *ptr); } - else { - ptr = dbri->cmdptr; + } else { + s32 *ptr = dbri->cmdptr; + dprintk(D_CMD, "cmd: %lx:%08x\n", (unsigned long)ptr, *ptr); - ptr = dbri->cmdptr+1; + ptr++; dprintk(D_CMD, "cmd: %lx:%08x\n", (unsigned long)ptr, *ptr); for (ptr = dbri->dma->cmd; ptr < cmd+2; ptr++) { dprintk(D_CMD, "cmd: %lx:%08x\n", (unsigned long)ptr, *ptr); @@ -706,10 +707,12 @@ static void dbri_cmdsend(struct snd_dbri * dbri, s32 * cmd,int len) } #endif + spin_lock_irqsave(&dbri->lock, flags); /* Reread the last command */ tmp = sbus_readl(dbri->regs + REG0); tmp |= D_P; sbus_writel(tmp, dbri->regs + REG0); + spin_unlock_irqrestore(&dbri->lock, flags); dbri->cmdptr = cmd; spin_unlock(&dbri->cmdlock); @@ -1549,8 +1552,7 @@ static int cs4215_prepare(struct snd_dbri * dbri, unsigned int rate, CS4215_BSEL_128 | CS4215_FREQ[freq_idx].xtal; dbri->mm.channels = channels; - /* Stereo bit: 8 bit stereo not working yet. */ - if ((channels > 1) && (dbri->mm.precision == 16)) + if (channels == 2) dbri->mm.ctrl[1] |= CS4215_DFR_STEREO; ret = cs4215_setctrl(dbri); @@ -1624,7 +1626,7 @@ interrupts are disabled. /* xmit_descs() * - * Transmit the current TD's for recording/playing, if needed. + * Starts transmiting the current TD's for recording/playing. * For playback, ALSA has filled the DMA memory with new data (we hope). */ static void xmit_descs(struct snd_dbri *dbri) @@ -1699,9 +1701,9 @@ play: * them as available. Stops when the first descriptor is found without * TBC (Transmit Buffer Complete) set, or we've run through them all. * - * The DMA buffers are not released, but re-used. Since the transmit buffer - * descriptors are not clobbered, they can be re-submitted as is. This is - * done by the xmit_descs() tasklet above since that could take longer. + * The DMA buffers are not released. They form a ring buffer and + * they are filled by ALSA while others are transmitted by DMA. + * */ static void transmission_complete_intr(struct snd_dbri * dbri, int pipe) @@ -1944,8 +1946,8 @@ static struct snd_pcm_hardware snd_dbri_pcm_hw = { SNDRV_PCM_FMTBIT_A_LAW | SNDRV_PCM_FMTBIT_U8 | SNDRV_PCM_FMTBIT_S16_BE, - .rates = SNDRV_PCM_RATE_8000_48000, - .rate_min = 8000, + .rates = SNDRV_PCM_RATE_8000_48000 | SNDRV_PCM_RATE_5512, + .rate_min = 5512, .rate_max = 48000, .channels_min = 1, .channels_max = 2, @@ -1956,6 +1958,39 @@ static struct snd_pcm_hardware snd_dbri_pcm_hw = { .periods_max = 1024, }; +static int snd_hw_rule_format(struct snd_pcm_hw_params *params, + struct snd_pcm_hw_rule *rule) +{ + struct snd_interval *c = hw_param_interval(params, + SNDRV_PCM_HW_PARAM_CHANNELS); + struct snd_mask *f = hw_param_mask(params, SNDRV_PCM_HW_PARAM_FORMAT); + struct snd_mask fmt; + + snd_mask_any(&fmt); + if (c->min > 1) { + fmt.bits[0] &= SNDRV_PCM_FMTBIT_S16_BE; + return snd_mask_refine(f, &fmt); + } + return 0; +} + +static int snd_hw_rule_channels(struct snd_pcm_hw_params *params, + struct snd_pcm_hw_rule *rule) +{ + struct snd_interval *c = hw_param_interval(params, + SNDRV_PCM_HW_PARAM_CHANNELS); + struct snd_mask *f = hw_param_mask(params, SNDRV_PCM_HW_PARAM_FORMAT); + struct snd_interval ch; + + snd_interval_any(&ch); + if (!(f->bits[0] & SNDRV_PCM_FMTBIT_S16_BE)) { + ch.min = ch.max = 1; + ch.integer = 1; + return snd_interval_refine(c, &ch); + } + return 0; +} + static int snd_dbri_open(struct snd_pcm_substream *substream) { struct snd_dbri *dbri = snd_pcm_substream_chip(substream); @@ -1973,6 +2008,14 @@ static int snd_dbri_open(struct snd_pcm_substream *substream) info->pipe = -1; spin_unlock_irqrestore(&dbri->lock, flags); + snd_pcm_hw_rule_add(runtime,0,SNDRV_PCM_HW_PARAM_CHANNELS, + snd_hw_rule_format, 0, SNDRV_PCM_HW_PARAM_FORMAT, + -1); + snd_pcm_hw_rule_add(runtime,0,SNDRV_PCM_HW_PARAM_FORMAT, + snd_hw_rule_channels, 0, + SNDRV_PCM_HW_PARAM_CHANNELS, + -1); + cs4215_open(dbri); return 0; -- cgit v0.10.2 From 93f09c4cc111506db2ffa6220b7a3d7f73e41aa3 Mon Sep 17 00:00:00 2001 From: Adrian Bunk Date: Mon, 21 Aug 2006 19:22:45 +0200 Subject: [ALSA] make sound/pci/emu10k1/emu10k1.c:snd_emu10k1_resume() static This patch makes the needlessly global snd_emu10k1_resume() static. Signed-off-by: Adrian Bunk Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/emu10k1/emu10k1.c b/sound/pci/emu10k1/emu10k1.c index 289bcd9..493ec08 100644 --- a/sound/pci/emu10k1/emu10k1.c +++ b/sound/pci/emu10k1/emu10k1.c @@ -232,7 +232,7 @@ static int snd_emu10k1_suspend(struct pci_dev *pci, pm_message_t state) return 0; } -int snd_emu10k1_resume(struct pci_dev *pci) +static int snd_emu10k1_resume(struct pci_dev *pci) { struct snd_card *card = pci_get_drvdata(pci); struct snd_emu10k1 *emu = card->private_data; -- cgit v0.10.2 From 7376d013fc6d3a45d748e0ce758ca9412b01b9dd Mon Sep 17 00:00:00 2001 From: Stephen Hemminger Date: Mon, 21 Aug 2006 19:17:46 +0200 Subject: [ALSA] intel_hda: MSI support Simple patch to enable Message Signalled Interrupts for the HDA Intel audio controller. Tested with: Intel Corporation 82801FB/FBM/FR/FW/FRW (ICH6 Family) High Definition Audio Controller (rev 03) MSI is better because it means audio doesn't end up sharing IRQ with USB. Signed-off-by: Stephen Hemminger Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index ce75e07..c9ae9f7 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -55,6 +55,7 @@ static char *model; static int position_fix; static int probe_mask = -1; static int single_cmd; +static int disable_msi; module_param(index, int, 0444); MODULE_PARM_DESC(index, "Index value for Intel HD audio interface."); @@ -68,6 +69,8 @@ module_param(probe_mask, int, 0444); MODULE_PARM_DESC(probe_mask, "Bitmask to probe codecs (default = -1)."); module_param(single_cmd, bool, 0444); MODULE_PARM_DESC(single_cmd, "Use single command to communicate with codecs (for debugging only)."); +module_param(disable_msi, int, 0); +MODULE_PARM_DESC(disable_msi, "Disable Message Signaled Interrupt (MSI)"); /* just for backward compatibility */ @@ -1418,8 +1421,10 @@ static int azx_free(struct azx *chip) msleep(1); } - if (chip->irq >= 0) + if (chip->irq >= 0) { + pci_disable_msi(chip->pci); free_irq(chip->irq, (void*)chip); + } if (chip->remap_addr) iounmap(chip->remap_addr); @@ -1502,6 +1507,9 @@ static int __devinit azx_create(struct snd_card *card, struct pci_dev *pci, goto errout; } + if (!disable_msi) + pci_enable_msi(pci); + if (request_irq(pci->irq, azx_interrupt, IRQF_DISABLED|IRQF_SHARED, "HDA Intel", (void*)chip)) { snd_printk(KERN_ERR SFX "unable to grab IRQ %d\n", pci->irq); -- cgit v0.10.2 From 2f3482fbbd5dac7d0e86fe5b7ac5c1e51d52b084 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Mon, 21 Aug 2006 18:44:31 +0200 Subject: [ALSA] Add TLV support to AC97 codec driver Added the TLV support to AC97 codec driver for addition of dB range information. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ac97/ac97_codec.c b/sound/pci/ac97/ac97_codec.c index e5d062d..c47f43d 100644 --- a/sound/pci/ac97/ac97_codec.c +++ b/sound/pci/ac97/ac97_codec.c @@ -31,6 +31,7 @@ #include #include #include +#include #include #include #include @@ -1182,6 +1183,32 @@ static int snd_ac97_cmute_new_stereo(struct snd_card *card, char *name, int reg, } /* + * set dB information + */ +static DECLARE_TLV_DB_SCALE(db_scale_4bit, -4500, 300, 0); +static DECLARE_TLV_DB_SCALE(db_scale_5bit, -4650, 150, 0); +static DECLARE_TLV_DB_SCALE(db_scale_6bit, -9450, 150, 0); +static DECLARE_TLV_DB_SCALE(db_scale_5bit_12db_max, -3450, 150, 0); +static DECLARE_TLV_DB_SCALE(db_scale_rec_gain, 0, 150, 0); + +static unsigned int *find_db_scale(unsigned int maxval) +{ + switch (maxval) { + case 0x0f: return db_scale_4bit; + case 0x1f: return db_scale_5bit; + case 0x3f: return db_scale_6bit; + } + return NULL; +} + +static void set_tlv_db_scale(struct snd_kcontrol *kctl, unsigned int *tlv) +{ + kctl->tlv.p = tlv; + if (tlv) + kctl->vd[0].access |= SNDRV_CTL_ELEM_ACCESS_TLV_READ; +} + +/* * create a volume for normal stereo/mono controls */ static int snd_ac97_cvol_new(struct snd_card *card, char *name, int reg, unsigned int lo_max, @@ -1203,6 +1230,10 @@ static int snd_ac97_cvol_new(struct snd_card *card, char *name, int reg, unsigne tmp.index = ac97->num; kctl = snd_ctl_new1(&tmp, ac97); } + if (reg >= AC97_PHONE && reg <= AC97_PCM) + set_tlv_db_scale(kctl, db_scale_5bit_12db_max); + else + set_tlv_db_scale(kctl, find_db_scale(lo_max)); err = snd_ctl_add(card, kctl); if (err < 0) return err; @@ -1282,6 +1313,7 @@ static int snd_ac97_mixer_build(struct snd_ac97 * ac97) snd_ac97_change_volume_params2(ac97, AC97_CENTER_LFE_MASTER, 0, &max); kctl->private_value &= ~(0xff << 16); kctl->private_value |= (int)max << 16; + set_tlv_db_scale(kctl, find_db_scale(max)); snd_ac97_write_cache(ac97, AC97_CENTER_LFE_MASTER, ac97->regs[AC97_CENTER_LFE_MASTER] | max); } @@ -1295,6 +1327,7 @@ static int snd_ac97_mixer_build(struct snd_ac97 * ac97) snd_ac97_change_volume_params2(ac97, AC97_CENTER_LFE_MASTER, 8, &max); kctl->private_value &= ~(0xff << 16); kctl->private_value |= (int)max << 16; + set_tlv_db_scale(kctl, find_db_scale(max)); snd_ac97_write_cache(ac97, AC97_CENTER_LFE_MASTER, ac97->regs[AC97_CENTER_LFE_MASTER] | max << 8); } @@ -1342,8 +1375,9 @@ static int snd_ac97_mixer_build(struct snd_ac97 * ac97) ((ac97->flags & AC97_HAS_PC_BEEP) || snd_ac97_try_volume_mix(ac97, AC97_PC_BEEP))) { for (idx = 0; idx < 2; idx++) - if ((err = snd_ctl_add(card, snd_ac97_cnew(&snd_ac97_controls_pc_beep[idx], ac97))) < 0) + if ((err = snd_ctl_add(card, kctl = snd_ac97_cnew(&snd_ac97_controls_pc_beep[idx], ac97))) < 0) return err; + set_tlv_db_scale(kctl, db_scale_4bit); snd_ac97_write_cache(ac97, AC97_PC_BEEP, snd_ac97_read(ac97, AC97_PC_BEEP) | 0x801e); } @@ -1410,22 +1444,26 @@ static int snd_ac97_mixer_build(struct snd_ac97 * ac97) else init_val = 0x9f1f; for (idx = 0; idx < 2; idx++) - if ((err = snd_ctl_add(card, snd_ac97_cnew(&snd_ac97_controls_ad18xx_pcm[idx], ac97))) < 0) + if ((err = snd_ctl_add(card, kctl = snd_ac97_cnew(&snd_ac97_controls_ad18xx_pcm[idx], ac97))) < 0) return err; + set_tlv_db_scale(kctl, db_scale_5bit); ac97->spec.ad18xx.pcmreg[0] = init_val; if (ac97->scaps & AC97_SCAP_SURROUND_DAC) { for (idx = 0; idx < 2; idx++) - if ((err = snd_ctl_add(card, snd_ac97_cnew(&snd_ac97_controls_ad18xx_surround[idx], ac97))) < 0) + if ((err = snd_ctl_add(card, kctl = snd_ac97_cnew(&snd_ac97_controls_ad18xx_surround[idx], ac97))) < 0) return err; + set_tlv_db_scale(kctl, db_scale_5bit); ac97->spec.ad18xx.pcmreg[1] = init_val; } if (ac97->scaps & AC97_SCAP_CENTER_LFE_DAC) { for (idx = 0; idx < 2; idx++) - if ((err = snd_ctl_add(card, snd_ac97_cnew(&snd_ac97_controls_ad18xx_center[idx], ac97))) < 0) + if ((err = snd_ctl_add(card, kctl = snd_ac97_cnew(&snd_ac97_controls_ad18xx_center[idx], ac97))) < 0) return err; + set_tlv_db_scale(kctl, db_scale_5bit); for (idx = 0; idx < 2; idx++) - if ((err = snd_ctl_add(card, snd_ac97_cnew(&snd_ac97_controls_ad18xx_lfe[idx], ac97))) < 0) + if ((err = snd_ctl_add(card, kctl = snd_ac97_cnew(&snd_ac97_controls_ad18xx_lfe[idx], ac97))) < 0) return err; + set_tlv_db_scale(kctl, db_scale_5bit); ac97->spec.ad18xx.pcmreg[2] = init_val; } snd_ac97_write_cache(ac97, AC97_PCM, init_val); @@ -1453,16 +1491,18 @@ static int snd_ac97_mixer_build(struct snd_ac97 * ac97) if (err < 0) return err; } - if ((err = snd_ctl_add(card, snd_ac97_cnew(&snd_ac97_control_capture_vol, ac97))) < 0) + if ((err = snd_ctl_add(card, kctl = snd_ac97_cnew(&snd_ac97_control_capture_vol, ac97))) < 0) return err; + set_tlv_db_scale(kctl, db_scale_rec_gain); snd_ac97_write_cache(ac97, AC97_REC_SEL, 0x0000); snd_ac97_write_cache(ac97, AC97_REC_GAIN, 0x0000); } /* build MIC Capture controls */ if (snd_ac97_try_volume_mix(ac97, AC97_REC_GAIN_MIC)) { for (idx = 0; idx < 2; idx++) - if ((err = snd_ctl_add(card, snd_ac97_cnew(&snd_ac97_controls_mic_capture[idx], ac97))) < 0) + if ((err = snd_ctl_add(card, kctl = snd_ac97_cnew(&snd_ac97_controls_mic_capture[idx], ac97))) < 0) return err; + set_tlv_db_scale(kctl, db_scale_rec_gain); snd_ac97_write_cache(ac97, AC97_REC_GAIN_MIC, 0x0000); } diff --git a/sound/pci/ac97/ac97_patch.c b/sound/pci/ac97/ac97_patch.c index 37c6be4..392f6cc 100644 --- a/sound/pci/ac97/ac97_patch.c +++ b/sound/pci/ac97/ac97_patch.c @@ -32,6 +32,7 @@ #include #include #include +#include #include #include "ac97_patch.h" #include "ac97_id.h" @@ -51,6 +52,20 @@ static int patch_build_controls(struct snd_ac97 * ac97, const struct snd_kcontro return 0; } +/* replace with a new TLV */ +static void reset_tlv(struct snd_ac97 *ac97, const char *name, + unsigned int *tlv) +{ + struct snd_ctl_elem_id sid; + struct snd_kcontrol *kctl; + memset(&sid, 0, sizeof(sid)); + strcpy(sid.name, name); + sid.iface = SNDRV_CTL_ELEM_IFACE_MIXER; + kctl = snd_ctl_find_id(ac97->bus->card, &sid); + if (kctl && kctl->tlv.p) + kctl->tlv.p = tlv; +} + /* set to the page, update bits and restore the page */ static int ac97_update_bits_page(struct snd_ac97 *ac97, unsigned short reg, unsigned short mask, unsigned short value, unsigned short page) { @@ -1522,12 +1537,16 @@ static const struct snd_kcontrol_new snd_ac97_controls_ad1885[] = { AC97_SINGLE("Line Jack Sense", AC97_AD_JACK_SPDIF, 8, 1, 1), /* inverted */ }; +static DECLARE_TLV_DB_SCALE(db_scale_6bit_6db_max, -8850, 150, 0); + static int patch_ad1885_specific(struct snd_ac97 * ac97) { int err; if ((err = patch_build_controls(ac97, snd_ac97_controls_ad1885, ARRAY_SIZE(snd_ac97_controls_ad1885))) < 0) return err; + reset_tlv(ac97, "Headphone Playback Volume", + db_scale_6bit_6db_max); return 0; } @@ -1551,12 +1570,27 @@ int patch_ad1885(struct snd_ac97 * ac97) return 0; } +static int patch_ad1886_specific(struct snd_ac97 * ac97) +{ + reset_tlv(ac97, "Headphone Playback Volume", + db_scale_6bit_6db_max); + return 0; +} + +static struct snd_ac97_build_ops patch_ad1886_build_ops = { + .build_specific = &patch_ad1886_specific, +#ifdef CONFIG_PM + .resume = ad18xx_resume +#endif +}; + int patch_ad1886(struct snd_ac97 * ac97) { patch_ad1881(ac97); /* Presario700 workaround */ /* for Jack Sense/SPDIF Register misetting causing */ snd_ac97_write_cache(ac97, AC97_AD_JACK_SPDIF, 0x0010); + ac97->build_ops = &patch_ad1886_build_ops; return 0; } @@ -2015,6 +2049,8 @@ static const struct snd_kcontrol_new snd_ac97_spdif_controls_alc650[] = { /* AC97_SINGLE("IEC958 Input Monitor", AC97_ALC650_MULTICH, 13, 1, 0), */ }; +static DECLARE_TLV_DB_SCALE(db_scale_5bit_3db_max, -4350, 150, 0); + static int patch_alc650_specific(struct snd_ac97 * ac97) { int err; @@ -2025,6 +2061,9 @@ static int patch_alc650_specific(struct snd_ac97 * ac97) if ((err = patch_build_controls(ac97, snd_ac97_spdif_controls_alc650, ARRAY_SIZE(snd_ac97_spdif_controls_alc650))) < 0) return err; } + if (ac97->id != AC97_ID_ALC650F) + reset_tlv(ac97, "Master Playback Volume", + db_scale_5bit_3db_max); return 0; } -- cgit v0.10.2 From 7058c042001e111c601e1b031d9bcb8b5d392b74 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Mon, 21 Aug 2006 18:44:54 +0200 Subject: [ALSA] Added TLV support to VIA82xx driver Added the TLV support to VIA82xx driver for addition of dB range information. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/via82xx.c b/sound/pci/via82xx.c index e0e3bfd..6db3d4c 100644 --- a/sound/pci/via82xx.c +++ b/sound/pci/via82xx.c @@ -59,6 +59,7 @@ #include #include #include +#include #include #include #include @@ -1698,21 +1699,29 @@ static int snd_via8233_pcmdxs_volume_put(struct snd_kcontrol *kcontrol, return change; } +static DECLARE_TLV_DB_SCALE(db_scale_dxs, -9450, 150, 1); + static struct snd_kcontrol_new snd_via8233_pcmdxs_volume_control __devinitdata = { .name = "PCM Playback Volume", .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .info = snd_via8233_dxs_volume_info, .get = snd_via8233_pcmdxs_volume_get, .put = snd_via8233_pcmdxs_volume_put, + .tlv = { .p = db_scale_dxs } }; static struct snd_kcontrol_new snd_via8233_dxs_volume_control __devinitdata = { .name = "VIA DXS Playback Volume", .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .count = 4, .info = snd_via8233_dxs_volume_info, .get = snd_via8233_dxs_volume_get, .put = snd_via8233_dxs_volume_put, + .tlv = { .p = db_scale_dxs } }; /* -- cgit v0.10.2 From 9107226d2ca8a15534da96313a1d370fb1eb8f9e Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 23 Aug 2006 12:04:34 +0200 Subject: [ALSA] Add dB scale information to ak4531 codec Added the dB scale information to ak4531 codec driver. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ac97/ak4531_codec.c b/sound/pci/ac97/ak4531_codec.c index 94c26ec..c153cb7 100644 --- a/sound/pci/ac97/ak4531_codec.c +++ b/sound/pci/ac97/ak4531_codec.c @@ -27,6 +27,7 @@ #include #include +#include MODULE_AUTHOR("Jaroslav Kysela "); MODULE_DESCRIPTION("Universal routines for AK4531 codec"); @@ -63,6 +64,14 @@ static void snd_ak4531_dump(struct snd_ak4531 *ak4531) .info = snd_ak4531_info_single, \ .get = snd_ak4531_get_single, .put = snd_ak4531_put_single, \ .private_value = reg | (shift << 16) | (mask << 24) | (invert << 22) } +#define AK4531_SINGLE_TLV(xname, xindex, reg, shift, mask, invert, xtlv) \ +{ .iface = SNDRV_CTL_ELEM_IFACE_MIXER, \ + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | SNDRV_CTL_ELEM_ACCESS_TLV_READ, \ + .name = xname, .index = xindex, \ + .info = snd_ak4531_info_single, \ + .get = snd_ak4531_get_single, .put = snd_ak4531_put_single, \ + .private_value = reg | (shift << 16) | (mask << 24) | (invert << 22), \ + .tlv = { .p = (xtlv) } } static int snd_ak4531_info_single(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo) { @@ -122,6 +131,14 @@ static int snd_ak4531_put_single(struct snd_kcontrol *kcontrol, struct snd_ctl_e .info = snd_ak4531_info_double, \ .get = snd_ak4531_get_double, .put = snd_ak4531_put_double, \ .private_value = left_reg | (right_reg << 8) | (left_shift << 16) | (right_shift << 19) | (mask << 24) | (invert << 22) } +#define AK4531_DOUBLE_TLV(xname, xindex, left_reg, right_reg, left_shift, right_shift, mask, invert, xtlv) \ +{ .iface = SNDRV_CTL_ELEM_IFACE_MIXER, \ + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | SNDRV_CTL_ELEM_ACCESS_TLV_READ, \ + .name = xname, .index = xindex, \ + .info = snd_ak4531_info_double, \ + .get = snd_ak4531_get_double, .put = snd_ak4531_put_double, \ + .private_value = left_reg | (right_reg << 8) | (left_shift << 16) | (right_shift << 19) | (mask << 24) | (invert << 22), \ + .tlv = { .p = (xtlv) } } static int snd_ak4531_info_double(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo) { @@ -250,50 +267,62 @@ static int snd_ak4531_put_input_sw(struct snd_kcontrol *kcontrol, struct snd_ctl return change; } +static DECLARE_TLV_DB_SCALE(db_scale_master, -6200, 200, 0); +static DECLARE_TLV_DB_SCALE(db_scale_mono, -2800, 400, 0); +static DECLARE_TLV_DB_SCALE(db_scale_input, -5000, 200, 0); + static struct snd_kcontrol_new snd_ak4531_controls[] = { -AK4531_DOUBLE("Master Playback Switch", 0, AK4531_LMASTER, AK4531_RMASTER, 7, 7, 1, 1), +AK4531_DOUBLE_TLV("Master Playback Switch", 0, + AK4531_LMASTER, AK4531_RMASTER, 7, 7, 1, 1, + db_scale_master), AK4531_DOUBLE("Master Playback Volume", 0, AK4531_LMASTER, AK4531_RMASTER, 0, 0, 0x1f, 1), -AK4531_SINGLE("Master Mono Playback Switch", 0, AK4531_MONO_OUT, 7, 1, 1), +AK4531_SINGLE_TLV("Master Mono Playback Switch", 0, AK4531_MONO_OUT, 7, 1, 1, + db_scale_mono), AK4531_SINGLE("Master Mono Playback Volume", 0, AK4531_MONO_OUT, 0, 0x07, 1), AK4531_DOUBLE("PCM Switch", 0, AK4531_LVOICE, AK4531_RVOICE, 7, 7, 1, 1), -AK4531_DOUBLE("PCM Volume", 0, AK4531_LVOICE, AK4531_RVOICE, 0, 0, 0x1f, 1), +AK4531_DOUBLE_TLV("PCM Volume", 0, AK4531_LVOICE, AK4531_RVOICE, 0, 0, 0x1f, 1, + db_scale_input), AK4531_DOUBLE("PCM Playback Switch", 0, AK4531_OUT_SW2, AK4531_OUT_SW2, 3, 2, 1, 0), AK4531_DOUBLE("PCM Capture Switch", 0, AK4531_LIN_SW2, AK4531_RIN_SW2, 2, 2, 1, 0), AK4531_DOUBLE("PCM Switch", 1, AK4531_LFM, AK4531_RFM, 7, 7, 1, 1), -AK4531_DOUBLE("PCM Volume", 1, AK4531_LFM, AK4531_RFM, 0, 0, 0x1f, 1), +AK4531_DOUBLE_TLV("PCM Volume", 1, AK4531_LFM, AK4531_RFM, 0, 0, 0x1f, 1, + db_scale_input), AK4531_DOUBLE("PCM Playback Switch", 1, AK4531_OUT_SW1, AK4531_OUT_SW1, 6, 5, 1, 0), AK4531_INPUT_SW("PCM Capture Route", 1, AK4531_LIN_SW1, AK4531_RIN_SW1, 6, 5), AK4531_DOUBLE("CD Switch", 0, AK4531_LCD, AK4531_RCD, 7, 7, 1, 1), -AK4531_DOUBLE("CD Volume", 0, AK4531_LCD, AK4531_RCD, 0, 0, 0x1f, 1), +AK4531_DOUBLE_TLV("CD Volume", 0, AK4531_LCD, AK4531_RCD, 0, 0, 0x1f, 1, + db_scale_input), AK4531_DOUBLE("CD Playback Switch", 0, AK4531_OUT_SW1, AK4531_OUT_SW1, 2, 1, 1, 0), AK4531_INPUT_SW("CD Capture Route", 0, AK4531_LIN_SW1, AK4531_RIN_SW1, 2, 1), AK4531_DOUBLE("Line Switch", 0, AK4531_LLINE, AK4531_RLINE, 7, 7, 1, 1), -AK4531_DOUBLE("Line Volume", 0, AK4531_LLINE, AK4531_RLINE, 0, 0, 0x1f, 1), +AK4531_DOUBLE_TLV("Line Volume", 0, AK4531_LLINE, AK4531_RLINE, 0, 0, 0x1f, 1, + db_scale_input), AK4531_DOUBLE("Line Playback Switch", 0, AK4531_OUT_SW1, AK4531_OUT_SW1, 4, 3, 1, 0), AK4531_INPUT_SW("Line Capture Route", 0, AK4531_LIN_SW1, AK4531_RIN_SW1, 4, 3), AK4531_DOUBLE("Aux Switch", 0, AK4531_LAUXA, AK4531_RAUXA, 7, 7, 1, 1), -AK4531_DOUBLE("Aux Volume", 0, AK4531_LAUXA, AK4531_RAUXA, 0, 0, 0x1f, 1), +AK4531_DOUBLE_TLV("Aux Volume", 0, AK4531_LAUXA, AK4531_RAUXA, 0, 0, 0x1f, 1, + db_scale_input), AK4531_DOUBLE("Aux Playback Switch", 0, AK4531_OUT_SW2, AK4531_OUT_SW2, 5, 4, 1, 0), AK4531_INPUT_SW("Aux Capture Route", 0, AK4531_LIN_SW2, AK4531_RIN_SW2, 4, 3), AK4531_SINGLE("Mono Switch", 0, AK4531_MONO1, 7, 1, 1), -AK4531_SINGLE("Mono Volume", 0, AK4531_MONO1, 0, 0x1f, 1), +AK4531_SINGLE_TLV("Mono Volume", 0, AK4531_MONO1, 0, 0x1f, 1, db_scale_input), AK4531_SINGLE("Mono Playback Switch", 0, AK4531_OUT_SW2, 0, 1, 0), AK4531_DOUBLE("Mono Capture Switch", 0, AK4531_LIN_SW2, AK4531_RIN_SW2, 0, 0, 1, 0), AK4531_SINGLE("Mono Switch", 1, AK4531_MONO2, 7, 1, 1), -AK4531_SINGLE("Mono Volume", 1, AK4531_MONO2, 0, 0x1f, 1), +AK4531_SINGLE_TLV("Mono Volume", 1, AK4531_MONO2, 0, 0x1f, 1, db_scale_input), AK4531_SINGLE("Mono Playback Switch", 1, AK4531_OUT_SW2, 1, 1, 0), AK4531_DOUBLE("Mono Capture Switch", 1, AK4531_LIN_SW2, AK4531_RIN_SW2, 1, 1, 1, 0), -AK4531_SINGLE("Mic Volume", 0, AK4531_MIC, 0, 0x1f, 1), +AK4531_SINGLE_TLV("Mic Volume", 0, AK4531_MIC, 0, 0x1f, 1, db_scale_input), AK4531_SINGLE("Mic Switch", 0, AK4531_MIC, 7, 1, 1), AK4531_SINGLE("Mic Playback Switch", 0, AK4531_OUT_SW1, 0, 1, 0), AK4531_DOUBLE("Mic Capture Switch", 0, AK4531_LIN_SW1, AK4531_RIN_SW1, 0, 0, 1, 0), -- cgit v0.10.2 From 9f6ab25063f04597e02968ae8393e8f4703c1563 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 23 Aug 2006 12:14:25 +0200 Subject: [ALSA] Add dB scale information to cs4281 driver Added the dB scale information to cs4281 driver. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/cs4281.c b/sound/pci/cs4281.c index 9631456..1990430 100644 --- a/sound/pci/cs4281.c +++ b/sound/pci/cs4281.c @@ -33,6 +33,7 @@ #include #include #include +#include #include #include @@ -1054,6 +1055,8 @@ static int snd_cs4281_put_volume(struct snd_kcontrol *kcontrol, return change; } +static DECLARE_TLV_DB_SCALE(db_scale_dsp, -4650, 150, 0); + static struct snd_kcontrol_new snd_cs4281_fm_vol = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1062,6 +1065,7 @@ static struct snd_kcontrol_new snd_cs4281_fm_vol = .get = snd_cs4281_get_volume, .put = snd_cs4281_put_volume, .private_value = ((BA0_FMLVC << 16) | BA0_FMRVC), + .tlv = { .p = db_scale_dsp }, }; static struct snd_kcontrol_new snd_cs4281_pcm_vol = @@ -1072,6 +1076,7 @@ static struct snd_kcontrol_new snd_cs4281_pcm_vol = .get = snd_cs4281_get_volume, .put = snd_cs4281_put_volume, .private_value = ((BA0_PPLVC << 16) | BA0_PPRVC), + .tlv = { .p = db_scale_dsp }, }; static void snd_cs4281_mixer_free_ac97_bus(struct snd_ac97_bus *bus) -- cgit v0.10.2 From 666c70ffd1c4be795de988f26a8ab13524d4ed47 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 23 Aug 2006 12:32:06 +0200 Subject: [ALSA] Add dB scale information to fm801 driver Added the dB scale information to fm801 driver. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/fm801.c b/sound/pci/fm801.c index f3f2b2c..bdfda19 100644 --- a/sound/pci/fm801.c +++ b/sound/pci/fm801.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include #include @@ -1053,6 +1054,13 @@ static int snd_fm801_put_single(struct snd_kcontrol *kcontrol, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = xname, .info = snd_fm801_info_double, \ .get = snd_fm801_get_double, .put = snd_fm801_put_double, \ .private_value = reg | (shift_left << 8) | (shift_right << 12) | (mask << 16) | (invert << 24) } +#define FM801_DOUBLE_TLV(xname, reg, shift_left, shift_right, mask, invert, xtlv) \ +{ .iface = SNDRV_CTL_ELEM_IFACE_MIXER, \ + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | SNDRV_CTL_ELEM_ACCESS_TLV_READ, \ + .name = xname, .info = snd_fm801_info_double, \ + .get = snd_fm801_get_double, .put = snd_fm801_put_double, \ + .private_value = reg | (shift_left << 8) | (shift_right << 12) | (mask << 16) | (invert << 24), \ + .tlv = { .p = (xtlv) } } static int snd_fm801_info_double(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo) @@ -1149,14 +1157,19 @@ static int snd_fm801_put_mux(struct snd_kcontrol *kcontrol, return snd_fm801_update_bits(chip, FM801_REC_SRC, 7, val); } +static DECLARE_TLV_DB_SCALE(db_scale_dsp, -3450, 150, 0); + #define FM801_CONTROLS ARRAY_SIZE(snd_fm801_controls) static struct snd_kcontrol_new snd_fm801_controls[] __devinitdata = { -FM801_DOUBLE("Wave Playback Volume", FM801_PCM_VOL, 0, 8, 31, 1), +FM801_DOUBLE_TLV("Wave Playback Volume", FM801_PCM_VOL, 0, 8, 31, 1, + db_scale_dsp), FM801_SINGLE("Wave Playback Switch", FM801_PCM_VOL, 15, 1, 1), -FM801_DOUBLE("I2S Playback Volume", FM801_I2S_VOL, 0, 8, 31, 1), +FM801_DOUBLE_TLV("I2S Playback Volume", FM801_I2S_VOL, 0, 8, 31, 1, + db_scale_dsp), FM801_SINGLE("I2S Playback Switch", FM801_I2S_VOL, 15, 1, 1), -FM801_DOUBLE("FM Playback Volume", FM801_FM_VOL, 0, 8, 31, 1), +FM801_DOUBLE_TLV("FM Playback Volume", FM801_FM_VOL, 0, 8, 31, 1, + db_scale_dsp), FM801_SINGLE("FM Playback Switch", FM801_FM_VOL, 15, 1, 1), { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, -- cgit v0.10.2 From a0aef8edfc9d6d682dba557fe42599297cbc329a Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 23 Aug 2006 13:01:37 +0200 Subject: [ALSA] Add dB scale information to trident driver Added the dB scale information to trident driver. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/trident/trident_main.c b/sound/pci/trident/trident_main.c index 4930cc6..ebbe12d 100644 --- a/sound/pci/trident/trident_main.c +++ b/sound/pci/trident/trident_main.c @@ -40,6 +40,7 @@ #include #include #include +#include #include #include @@ -2627,6 +2628,8 @@ static int snd_trident_vol_control_get(struct snd_kcontrol *kcontrol, return 0; } +static DECLARE_TLV_DB_SCALE(db_scale_gvol, -6375, 25, 0); + static int snd_trident_vol_control_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { @@ -2653,6 +2656,7 @@ static struct snd_kcontrol_new snd_trident_vol_music_control __devinitdata = .get = snd_trident_vol_control_get, .put = snd_trident_vol_control_put, .private_value = 16, + .tlv = { .p = db_scale_gvol }, }; static struct snd_kcontrol_new snd_trident_vol_wave_control __devinitdata = @@ -2663,6 +2667,7 @@ static struct snd_kcontrol_new snd_trident_vol_wave_control __devinitdata = .get = snd_trident_vol_control_get, .put = snd_trident_vol_control_put, .private_value = 0, + .tlv = { .p = db_scale_gvol }, }; /*--------------------------------------------------------------------------- @@ -2730,6 +2735,7 @@ static struct snd_kcontrol_new snd_trident_pcm_vol_control __devinitdata = .info = snd_trident_pcm_vol_control_info, .get = snd_trident_pcm_vol_control_get, .put = snd_trident_pcm_vol_control_put, + /* FIXME: no tlv yet */ }; /*--------------------------------------------------------------------------- @@ -2839,6 +2845,8 @@ static int snd_trident_pcm_rvol_control_put(struct snd_kcontrol *kcontrol, return change; } +static DECLARE_TLV_DB_SCALE(db_scale_crvol, -3175, 25, 1); + static struct snd_kcontrol_new snd_trident_pcm_rvol_control __devinitdata = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -2848,6 +2856,7 @@ static struct snd_kcontrol_new snd_trident_pcm_rvol_control __devinitdata = .info = snd_trident_pcm_rvol_control_info, .get = snd_trident_pcm_rvol_control_get, .put = snd_trident_pcm_rvol_control_put, + .tlv = { .p = db_scale_crvol }, }; /*--------------------------------------------------------------------------- @@ -2903,6 +2912,7 @@ static struct snd_kcontrol_new snd_trident_pcm_cvol_control __devinitdata = .info = snd_trident_pcm_cvol_control_info, .get = snd_trident_pcm_cvol_control_get, .put = snd_trident_pcm_cvol_control_put, + .tlv = { .p = db_scale_crvol }, }; static void snd_trident_notify_pcm_change1(struct snd_card *card, -- cgit v0.10.2 From fb567a8e4f077b7b084c0558706339c35a4fb186 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 23 Aug 2006 13:07:19 +0200 Subject: [ALSA] Add dB scale information to dummy driver Added the dB scale information to dummy driver. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/drivers/dummy.c b/sound/drivers/dummy.c index 73b1613..42001ef 100644 --- a/sound/drivers/dummy.c +++ b/sound/drivers/dummy.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include #include @@ -443,10 +444,13 @@ static int __init snd_card_dummy_pcm(struct snd_dummy *dummy, int device, int su } #define DUMMY_VOLUME(xname, xindex, addr) \ -{ .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = xname, .index = xindex, \ +{ .iface = SNDRV_CTL_ELEM_IFACE_MIXER, \ + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | SNDRV_CTL_ELEM_ACCESS_TLV_READ, \ + .name = xname, .index = xindex, \ .info = snd_dummy_volume_info, \ .get = snd_dummy_volume_get, .put = snd_dummy_volume_put, \ - .private_value = addr } + .private_value = addr, \ + .tlv = { .p = db_scale_dummy } } static int snd_dummy_volume_info(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo) @@ -497,6 +501,8 @@ static int snd_dummy_volume_put(struct snd_kcontrol *kcontrol, return change; } +static DECLARE_TLV_DB_SCALE(db_scale_dummy, -4500, 30, 0); + #define DUMMY_CAPSRC(xname, xindex, addr) \ { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = xname, .index = xindex, \ .info = snd_dummy_capsrc_info, \ -- cgit v0.10.2 From 0e7febf15851fb438b9518654340d1f704d202e5 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 22 Aug 2006 13:16:01 +0200 Subject: [ALSA] Add dB scale information to ad1816a driver Added the dB scale information to ad1816a driver. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/isa/ad1816a/ad1816a_lib.c b/sound/isa/ad1816a/ad1816a_lib.c index 8fcf2c1..fd9b61e 100644 --- a/sound/isa/ad1816a/ad1816a_lib.c +++ b/sound/isa/ad1816a/ad1816a_lib.c @@ -24,6 +24,7 @@ #include #include #include +#include #include #include @@ -765,6 +766,13 @@ static int snd_ad1816a_put_mux(struct snd_kcontrol *kcontrol, struct snd_ctl_ele return change; } +#define AD1816A_SINGLE_TLV(xname, reg, shift, mask, invert, xtlv) \ +{ .iface = SNDRV_CTL_ELEM_IFACE_MIXER, \ + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | SNDRV_CTL_ELEM_ACCESS_TLV_READ, \ + .name = xname, .info = snd_ad1816a_info_single, \ + .get = snd_ad1816a_get_single, .put = snd_ad1816a_put_single, \ + .private_value = reg | (shift << 8) | (mask << 16) | (invert << 24), \ + .tlv = { .p = (xtlv) } } #define AD1816A_SINGLE(xname, reg, shift, mask, invert) \ { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = xname, .info = snd_ad1816a_info_single, \ .get = snd_ad1816a_get_single, .put = snd_ad1816a_put_single, \ @@ -822,6 +830,14 @@ static int snd_ad1816a_put_single(struct snd_kcontrol *kcontrol, struct snd_ctl_ return change; } +#define AD1816A_DOUBLE_TLV(xname, reg, shift_left, shift_right, mask, invert, xtlv) \ +{ .iface = SNDRV_CTL_ELEM_IFACE_MIXER, \ + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | SNDRV_CTL_ELEM_ACCESS_TLV_READ, \ + .name = xname, .info = snd_ad1816a_info_double, \ + .get = snd_ad1816a_get_double, .put = snd_ad1816a_put_double, \ + .private_value = reg | (shift_left << 8) | (shift_right << 12) | (mask << 16) | (invert << 24), \ + .tlv = { .p = (xtlv) } } + #define AD1816A_DOUBLE(xname, reg, shift_left, shift_right, mask, invert) \ { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = xname, .info = snd_ad1816a_info_double, \ .get = snd_ad1816a_get_double, .put = snd_ad1816a_put_double, \ @@ -890,28 +906,44 @@ static int snd_ad1816a_put_double(struct snd_kcontrol *kcontrol, struct snd_ctl_ return change; } +static DECLARE_TLV_DB_SCALE(db_scale_4bit, -4500, 300, 0); +static DECLARE_TLV_DB_SCALE(db_scale_5bit, -4650, 150, 0); +static DECLARE_TLV_DB_SCALE(db_scale_6bit, -9450, 150, 0); +static DECLARE_TLV_DB_SCALE(db_scale_5bit_12db_max, -3450, 150, 0); +static DECLARE_TLV_DB_SCALE(db_scale_rec_gain, 0, 150, 0); + static struct snd_kcontrol_new snd_ad1816a_controls[] __devinitdata = { AD1816A_DOUBLE("Master Playback Switch", AD1816A_MASTER_ATT, 15, 7, 1, 1), -AD1816A_DOUBLE("Master Playback Volume", AD1816A_MASTER_ATT, 8, 0, 31, 1), +AD1816A_DOUBLE_TLV("Master Playback Volume", AD1816A_MASTER_ATT, 8, 0, 31, 1, + db_scale_5bit), AD1816A_DOUBLE("PCM Playback Switch", AD1816A_VOICE_ATT, 15, 7, 1, 1), -AD1816A_DOUBLE("PCM Playback Volume", AD1816A_VOICE_ATT, 8, 0, 63, 1), +AD1816A_DOUBLE_TLV("PCM Playback Volume", AD1816A_VOICE_ATT, 8, 0, 63, 1, + db_scale_6bit), AD1816A_DOUBLE("Line Playback Switch", AD1816A_LINE_GAIN_ATT, 15, 7, 1, 1), -AD1816A_DOUBLE("Line Playback Volume", AD1816A_LINE_GAIN_ATT, 8, 0, 31, 1), +AD1816A_DOUBLE_TLV("Line Playback Volume", AD1816A_LINE_GAIN_ATT, 8, 0, 31, 1, + db_scale_5bit_12db_max), AD1816A_DOUBLE("CD Playback Switch", AD1816A_CD_GAIN_ATT, 15, 7, 1, 1), -AD1816A_DOUBLE("CD Playback Volume", AD1816A_CD_GAIN_ATT, 8, 0, 31, 1), +AD1816A_DOUBLE_TLV("CD Playback Volume", AD1816A_CD_GAIN_ATT, 8, 0, 31, 1, + db_scale_5bit_12db_max), AD1816A_DOUBLE("Synth Playback Switch", AD1816A_SYNTH_GAIN_ATT, 15, 7, 1, 1), -AD1816A_DOUBLE("Synth Playback Volume", AD1816A_SYNTH_GAIN_ATT, 8, 0, 31, 1), +AD1816A_DOUBLE_TLV("Synth Playback Volume", AD1816A_SYNTH_GAIN_ATT, 8, 0, 31, 1, + db_scale_5bit_12db_max), AD1816A_DOUBLE("FM Playback Switch", AD1816A_FM_ATT, 15, 7, 1, 1), -AD1816A_DOUBLE("FM Playback Volume", AD1816A_FM_ATT, 8, 0, 63, 1), +AD1816A_DOUBLE_TLV("FM Playback Volume", AD1816A_FM_ATT, 8, 0, 63, 1, + db_scale_6bit), AD1816A_SINGLE("Mic Playback Switch", AD1816A_MIC_GAIN_ATT, 15, 1, 1), -AD1816A_SINGLE("Mic Playback Volume", AD1816A_MIC_GAIN_ATT, 8, 31, 1), +AD1816A_SINGLE_TLV("Mic Playback Volume", AD1816A_MIC_GAIN_ATT, 8, 31, 1, + db_scale_5bit_12db_max), AD1816A_SINGLE("Mic Boost", AD1816A_MIC_GAIN_ATT, 14, 1, 0), AD1816A_DOUBLE("Video Playback Switch", AD1816A_VID_GAIN_ATT, 15, 7, 1, 1), -AD1816A_DOUBLE("Video Playback Volume", AD1816A_VID_GAIN_ATT, 8, 0, 31, 1), +AD1816A_DOUBLE_TLV("Video Playback Volume", AD1816A_VID_GAIN_ATT, 8, 0, 31, 1, + db_scale_5bit_12db_max), AD1816A_SINGLE("Phone Capture Switch", AD1816A_PHONE_IN_GAIN_ATT, 15, 1, 1), -AD1816A_SINGLE("Phone Capture Volume", AD1816A_PHONE_IN_GAIN_ATT, 0, 15, 1), +AD1816A_SINGLE_TLV("Phone Capture Volume", AD1816A_PHONE_IN_GAIN_ATT, 0, 15, 1, + db_scale_4bit), AD1816A_SINGLE("Phone Playback Switch", AD1816A_PHONE_OUT_ATT, 7, 1, 1), -AD1816A_SINGLE("Phone Playback Volume", AD1816A_PHONE_OUT_ATT, 0, 31, 1), +AD1816A_SINGLE_TLV("Phone Playback Volume", AD1816A_PHONE_OUT_ATT, 0, 31, 1, + db_scale_5bit), { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = "Capture Source", @@ -920,7 +952,8 @@ AD1816A_SINGLE("Phone Playback Volume", AD1816A_PHONE_OUT_ATT, 0, 31, 1), .put = snd_ad1816a_put_mux, }, AD1816A_DOUBLE("Capture Switch", AD1816A_ADC_PGA, 15, 7, 1, 1), -AD1816A_DOUBLE("Capture Volume", AD1816A_ADC_PGA, 8, 0, 15, 0), +AD1816A_DOUBLE_TLV("Capture Volume", AD1816A_ADC_PGA, 8, 0, 15, 0, + db_scale_rec_gain), AD1816A_SINGLE("3D Control - Switch", AD1816A_3D_PHAT_CTRL, 15, 1, 1), AD1816A_SINGLE("3D Control - Level", AD1816A_3D_PHAT_CTRL, 0, 15, 0), }; -- cgit v0.10.2 From eac06a10d2b814dfacc36a8fff35ef07bf4eec8e Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 22 Aug 2006 13:16:25 +0200 Subject: [ALSA] Add dB scale information to ad1848 driver Added the dB scale information to ad1848 driver. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/include/sound/ad1848.h b/include/sound/ad1848.h index 57af1fe..c8de6f8 100644 --- a/include/sound/ad1848.h +++ b/include/sound/ad1848.h @@ -179,14 +179,13 @@ enum { AD1848_MIX_SINGLE, AD1848_MIX_DOUBLE, AD1848_MIX_CAPTURE }; #define AD1848_MIXVAL_DOUBLE(left_reg, right_reg, shift_left, shift_right, mask, invert) \ ((left_reg) | ((right_reg) << 8) | ((shift_left) << 16) | ((shift_right) << 19) | ((mask) << 24) | ((invert) << 22)) -int snd_ad1848_add_ctl(struct snd_ad1848 *chip, const char *name, int index, int type, unsigned long value); - /* for ease of use */ struct ad1848_mix_elem { const char *name; int index; int type; unsigned long private_value; + unsigned int *tlv; }; #define AD1848_SINGLE(xname, xindex, reg, shift, mask, invert) \ @@ -195,15 +194,26 @@ struct ad1848_mix_elem { .type = AD1848_MIX_SINGLE, \ .private_value = AD1848_MIXVAL_SINGLE(reg, shift, mask, invert) } +#define AD1848_SINGLE_TLV(xname, xindex, reg, shift, mask, invert, xtlv) \ +{ .name = xname, \ + .index = xindex, \ + .type = AD1848_MIX_SINGLE, \ + .private_value = AD1848_MIXVAL_SINGLE(reg, shift, mask, invert), \ + .tlv = xtlv } + #define AD1848_DOUBLE(xname, xindex, left_reg, right_reg, shift_left, shift_right, mask, invert) \ { .name = xname, \ .index = xindex, \ .type = AD1848_MIX_DOUBLE, \ .private_value = AD1848_MIXVAL_DOUBLE(left_reg, right_reg, shift_left, shift_right, mask, invert) } -static inline int snd_ad1848_add_ctl_elem(struct snd_ad1848 *chip, const struct ad1848_mix_elem *c) -{ - return snd_ad1848_add_ctl(chip, c->name, c->index, c->type, c->private_value); -} +#define AD1848_DOUBLE_TLV(xname, xindex, left_reg, right_reg, shift_left, shift_right, mask, invert, xtlv) \ +{ .name = xname, \ + .index = xindex, \ + .type = AD1848_MIX_DOUBLE, \ + .private_value = AD1848_MIXVAL_DOUBLE(left_reg, right_reg, shift_left, shift_right, mask, invert), \ + .tlv = xtlv } + +int snd_ad1848_add_ctl_elem(struct snd_ad1848 *chip, const struct ad1848_mix_elem *c); #endif /* __SOUND_AD1848_H */ diff --git a/sound/isa/ad1848/ad1848_lib.c b/sound/isa/ad1848/ad1848_lib.c index e711f87..a6fbd5d 100644 --- a/sound/isa/ad1848/ad1848_lib.c +++ b/sound/isa/ad1848/ad1848_lib.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include @@ -118,6 +119,8 @@ void snd_ad1848_out(struct snd_ad1848 *chip, #endif } +EXPORT_SYMBOL(snd_ad1848_out); + static void snd_ad1848_dout(struct snd_ad1848 *chip, unsigned char reg, unsigned char value) { @@ -941,6 +944,8 @@ int snd_ad1848_create(struct snd_card *card, return 0; } +EXPORT_SYMBOL(snd_ad1848_create); + static struct snd_pcm_ops snd_ad1848_playback_ops = { .open = snd_ad1848_playback_open, .close = snd_ad1848_playback_close, @@ -988,12 +993,16 @@ int snd_ad1848_pcm(struct snd_ad1848 *chip, int device, struct snd_pcm **rpcm) return 0; } +EXPORT_SYMBOL(snd_ad1848_pcm); + const struct snd_pcm_ops *snd_ad1848_get_pcm_ops(int direction) { return direction == SNDRV_PCM_STREAM_PLAYBACK ? &snd_ad1848_playback_ops : &snd_ad1848_capture_ops; } +EXPORT_SYMBOL(snd_ad1848_get_pcm_ops); + /* * MIXER part */ @@ -1171,7 +1180,8 @@ static int snd_ad1848_put_double(struct snd_kcontrol *kcontrol, struct snd_ctl_e /* */ -int snd_ad1848_add_ctl(struct snd_ad1848 *chip, const char *name, int index, int type, unsigned long value) +int snd_ad1848_add_ctl_elem(struct snd_ad1848 *chip, + const struct ad1848_mix_elem *c) { static struct snd_kcontrol_new newctls[] = { [AD1848_MIX_SINGLE] = { @@ -1196,32 +1206,46 @@ int snd_ad1848_add_ctl(struct snd_ad1848 *chip, const char *name, int index, int struct snd_kcontrol *ctl; int err; - ctl = snd_ctl_new1(&newctls[type], chip); + ctl = snd_ctl_new1(&newctls[c->type], chip); if (! ctl) return -ENOMEM; - strlcpy(ctl->id.name, name, sizeof(ctl->id.name)); - ctl->id.index = index; - ctl->private_value = value; + strlcpy(ctl->id.name, c->name, sizeof(ctl->id.name)); + ctl->id.index = c->index; + ctl->private_value = c->private_value; + if (c->tlv) { + ctl->vd[0].access |= SNDRV_CTL_ELEM_ACCESS_TLV_READ; + ctl->tlv.p = c->tlv; + } if ((err = snd_ctl_add(chip->card, ctl)) < 0) return err; return 0; } +EXPORT_SYMBOL(snd_ad1848_add_ctl_elem); + +static DECLARE_TLV_DB_SCALE(db_scale_6bit, -9450, 150, 0); +static DECLARE_TLV_DB_SCALE(db_scale_5bit_12db_max, -3450, 150, 0); +static DECLARE_TLV_DB_SCALE(db_scale_rec_gain, 0, 150, 0); static struct ad1848_mix_elem snd_ad1848_controls[] = { AD1848_DOUBLE("PCM Playback Switch", 0, AD1848_LEFT_OUTPUT, AD1848_RIGHT_OUTPUT, 7, 7, 1, 1), -AD1848_DOUBLE("PCM Playback Volume", 0, AD1848_LEFT_OUTPUT, AD1848_RIGHT_OUTPUT, 0, 0, 63, 1), +AD1848_DOUBLE_TLV("PCM Playback Volume", 0, AD1848_LEFT_OUTPUT, AD1848_RIGHT_OUTPUT, 0, 0, 63, 1, + db_scale_6bit), AD1848_DOUBLE("Aux Playback Switch", 0, AD1848_AUX1_LEFT_INPUT, AD1848_AUX1_RIGHT_INPUT, 7, 7, 1, 1), -AD1848_DOUBLE("Aux Playback Volume", 0, AD1848_AUX1_LEFT_INPUT, AD1848_AUX1_RIGHT_INPUT, 0, 0, 31, 1), +AD1848_DOUBLE_TLV("Aux Playback Volume", 0, AD1848_AUX1_LEFT_INPUT, AD1848_AUX1_RIGHT_INPUT, 0, 0, 31, 1, + db_scale_5bit_12db_max), AD1848_DOUBLE("Aux Playback Switch", 1, AD1848_AUX2_LEFT_INPUT, AD1848_AUX2_RIGHT_INPUT, 7, 7, 1, 1), -AD1848_DOUBLE("Aux Playback Volume", 1, AD1848_AUX2_LEFT_INPUT, AD1848_AUX2_RIGHT_INPUT, 0, 0, 31, 1), -AD1848_DOUBLE("Capture Volume", 0, AD1848_LEFT_INPUT, AD1848_RIGHT_INPUT, 0, 0, 15, 0), +AD1848_DOUBLE_TLV("Aux Playback Volume", 1, AD1848_AUX2_LEFT_INPUT, AD1848_AUX2_RIGHT_INPUT, 0, 0, 31, 1, + db_scale_5bit_12db_max), +AD1848_DOUBLE_TLV("Capture Volume", 0, AD1848_LEFT_INPUT, AD1848_RIGHT_INPUT, 0, 0, 15, 0, + db_scale_rec_gain), { .name = "Capture Source", .type = AD1848_MIX_CAPTURE, }, AD1848_SINGLE("Loopback Capture Switch", 0, AD1848_LOOPBACK, 0, 1, 0), -AD1848_SINGLE("Loopback Capture Volume", 0, AD1848_LOOPBACK, 1, 63, 0) +AD1848_SINGLE_TLV("Loopback Capture Volume", 0, AD1848_LOOPBACK, 1, 63, 0, + db_scale_6bit), }; int snd_ad1848_mixer(struct snd_ad1848 *chip) @@ -1245,12 +1269,7 @@ int snd_ad1848_mixer(struct snd_ad1848 *chip) return 0; } -EXPORT_SYMBOL(snd_ad1848_out); -EXPORT_SYMBOL(snd_ad1848_create); -EXPORT_SYMBOL(snd_ad1848_pcm); -EXPORT_SYMBOL(snd_ad1848_get_pcm_ops); EXPORT_SYMBOL(snd_ad1848_mixer); -EXPORT_SYMBOL(snd_ad1848_add_ctl); /* * INIT part -- cgit v0.10.2 From a1c7a7d890634eaec106e5fb3a7d9c92b8f85b0d Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 22 Aug 2006 13:16:39 +0200 Subject: [ALSA] Add dB scale information to opl3sa2 driver Added the dB scale information to opl3sa2 driver. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/isa/opl3sa2.c b/sound/isa/opl3sa2.c index 4031b61..da92bf6 100644 --- a/sound/isa/opl3sa2.c +++ b/sound/isa/opl3sa2.c @@ -33,6 +33,7 @@ #include #include #include +#include #include @@ -337,6 +338,14 @@ static irqreturn_t snd_opl3sa2_interrupt(int irq, void *dev_id, struct pt_regs * .info = snd_opl3sa2_info_single, \ .get = snd_opl3sa2_get_single, .put = snd_opl3sa2_put_single, \ .private_value = reg | (shift << 8) | (mask << 16) | (invert << 24) } +#define OPL3SA2_SINGLE_TLV(xname, xindex, reg, shift, mask, invert, xtlv) \ +{ .iface = SNDRV_CTL_ELEM_IFACE_MIXER, \ + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | SNDRV_CTL_ELEM_ACCESS_TLV_READ, \ + .name = xname, .index = xindex, \ + .info = snd_opl3sa2_info_single, \ + .get = snd_opl3sa2_get_single, .put = snd_opl3sa2_put_single, \ + .private_value = reg | (shift << 8) | (mask << 16) | (invert << 24), \ + .tlv = { .p = (xtlv) } } static int snd_opl3sa2_info_single(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo) { @@ -395,6 +404,14 @@ static int snd_opl3sa2_put_single(struct snd_kcontrol *kcontrol, struct snd_ctl_ .info = snd_opl3sa2_info_double, \ .get = snd_opl3sa2_get_double, .put = snd_opl3sa2_put_double, \ .private_value = left_reg | (right_reg << 8) | (shift_left << 16) | (shift_right << 19) | (mask << 24) | (invert << 22) } +#define OPL3SA2_DOUBLE_TLV(xname, xindex, left_reg, right_reg, shift_left, shift_right, mask, invert, xtlv) \ +{ .iface = SNDRV_CTL_ELEM_IFACE_MIXER, \ + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | SNDRV_CTL_ELEM_ACCESS_TLV_READ, \ + .name = xname, .index = xindex, \ + .info = snd_opl3sa2_info_double, \ + .get = snd_opl3sa2_get_double, .put = snd_opl3sa2_put_double, \ + .private_value = left_reg | (right_reg << 8) | (shift_left << 16) | (shift_right << 19) | (mask << 24) | (invert << 22), \ + .tlv = { .p = (xtlv) } } static int snd_opl3sa2_info_double(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo) { @@ -469,11 +486,16 @@ static int snd_opl3sa2_put_double(struct snd_kcontrol *kcontrol, struct snd_ctl_ return change; } +static DECLARE_TLV_DB_SCALE(db_scale_master, -3000, 200, 0); +static DECLARE_TLV_DB_SCALE(db_scale_5bit_12db_max, -3450, 150, 0); + static struct snd_kcontrol_new snd_opl3sa2_controls[] = { OPL3SA2_DOUBLE("Master Playback Switch", 0, 0x07, 0x08, 7, 7, 1, 1), -OPL3SA2_DOUBLE("Master Playback Volume", 0, 0x07, 0x08, 0, 0, 15, 1), +OPL3SA2_DOUBLE_TLV("Master Playback Volume", 0, 0x07, 0x08, 0, 0, 15, 1, + db_scale_master), OPL3SA2_SINGLE("Mic Playback Switch", 0, 0x09, 7, 1, 1), -OPL3SA2_SINGLE("Mic Playback Volume", 0, 0x09, 0, 31, 1) +OPL3SA2_SINGLE_TLV("Mic Playback Volume", 0, 0x09, 0, 31, 1, + db_scale_5bit_12db_max), }; static struct snd_kcontrol_new snd_opl3sa2_tone_controls[] = { -- cgit v0.10.2 From bab282b912baf372d8f705357946ef691b621899 Mon Sep 17 00:00:00 2001 From: Vladimir Avdonin Date: Tue, 22 Aug 2006 13:31:58 +0200 Subject: [ALSA] hda-codec - Fix for Acer laptops with ALC883 codec Patch enables the internal speaker on acer laptops with ALC883. Signed-off-by: Vladimir Avdonin Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index d7e95f1..504ebbc 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -820,6 +820,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. 3stack-6ch 3-jack 6-channel 3stack-6ch-dig 3-jack 6-channel with SPDIF I/O 6stack-dig-demo 6-jack digital for Intel demo board + acer Acer laptops (Travelmate 3012WTMi, Aspire 5600, etc) auto auto-config reading BIOS (default) ALC861/660 diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 53aa57f..6590381 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -111,6 +111,7 @@ enum { ALC883_3ST_6ch, ALC883_6ST_DIG, ALC888_DEMO_BOARD, + ALC883_ACER, ALC883_AUTO, ALC883_MODEL_LAST, }; @@ -5069,6 +5070,9 @@ static struct hda_board_config alc883_cfg_tbl[] = { { .pci_subvendor = 0x105b, .pci_subdevice = 0x6668, .config = ALC883_6ST_DIG }, /* Foxconn */ { .modelname = "6stack-dig-demo", .config = ALC888_DEMO_BOARD }, + { .modelname = "acer", .config = ALC883_ACER }, + { .pci_subvendor = 0x1025, .pci_subdevice = 0/*0x0102*/, + .config = ALC883_ACER }, { .modelname = "auto", .config = ALC883_AUTO }, {} }; @@ -5139,6 +5143,23 @@ static struct alc_config_preset alc883_presets[] = { .channel_mode = alc883_sixstack_modes, .input_mux = &alc883_capture_source, }, + [ALC883_ACER] = { + .mixers = { alc883_base_mixer, + alc883_chmode_mixer }, + /* On TravelMate laptops, GPIO 0 enables the internal speaker + * and the headphone jack. Turn this on and rely on the + * standard mute methods whenever the user wants to turn + * these outputs off. + */ + .init_verbs = { alc883_init_verbs, alc880_gpio1_init_verbs }, + .num_dacs = ARRAY_SIZE(alc883_dac_nids), + .dac_nids = alc883_dac_nids, + .num_adc_nids = ARRAY_SIZE(alc883_adc_nids), + .adc_nids = alc883_adc_nids, + .num_channel_mode = ARRAY_SIZE(alc883_3ST_2ch_modes), + .channel_mode = alc883_3ST_2ch_modes, + .input_mux = &alc883_capture_source, + }, }; -- cgit v0.10.2 From 7b89190cf6ecd5075c272b4ec12f65a4ce45a762 Mon Sep 17 00:00:00 2001 From: Magnus Sandin Date: Tue, 22 Aug 2006 13:33:12 +0200 Subject: [ALSA] ac97 - Enable S/PDIF on ASUS P5P800-VM mobo The attached patch will force building the S/PDIF controls on the PCU SSID for Asus P5P800-VM motherboard, even if the AC97_EI_SPDIF bit is not set. Signed-off-by: Magnus Sandin Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ac97/ac97_codec.c b/sound/pci/ac97/ac97_codec.c index c47f43d..a79e918 100644 --- a/sound/pci/ac97/ac97_codec.c +++ b/sound/pci/ac97/ac97_codec.c @@ -1573,6 +1573,12 @@ static int snd_ac97_mixer_build(struct snd_ac97 * ac97) } /* build S/PDIF controls */ + + /* Hack for ASUS P5P800-VM, which does not indicate S/PDIF capability */ + if (ac97->subsystem_vendor == 0x1043 && + ac97->subsystem_device == 0x810f) + ac97->ext_id |= AC97_EI_SPDIF; + if ((ac97->ext_id & AC97_EI_SPDIF) && !(ac97->scaps & AC97_SCAP_NO_SPDIF)) { if (ac97->build_ops->build_spdif) { if ((err = ac97->build_ops->build_spdif(ac97)) < 0) -- cgit v0.10.2 From 6d8590650eb81d2c869c7adf4b469071cec11eee Mon Sep 17 00:00:00 2001 From: Guillaume Munch Date: Tue, 22 Aug 2006 17:15:47 +0200 Subject: [ALSA] hda-codec - Support for SigmaTel 9872 - AR11M and AR11S uses the same chip hence we claim to support the AR Series. - Added commentary about STAC9225s which shares the same id as CXD9872RD. - Added entry for 7662 but won't work automatically until pci_subdevice is known. - 'vaio' model now corresponds to CXD9872RD_VAIO for backward compat. - Replaced STAC766x_VAIO with CXD9872RD_VAIO, STAC9872AK_VAIO, STAC9872K_VAIO and CXD9872AKD_VAIO - Added 'vaio-ar' model for potential future modifications. Signed-off-by: Guillaume Munch Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 504ebbc..48d3bdf 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -859,8 +859,9 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. laptop-dig ditto with SPDIF auto auto-config reading BIOS (default) - STAC7664/7661(?) - vaio Setup for VAIO FE550G/SZ110/AR11B + STAC9872 + vaio Setup for VAIO FE550G/SZ110 + vaio-ar Setup for VAIO AR If the default configuration doesn't work and one of the above matches with your device, report it together with the PCI diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index 73ca566..139d73e 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -1563,7 +1563,7 @@ static int patch_stac9205(struct hda_codec *codec) } /* - * STAC 7661(?) and 7664 hack + * STAC9872 hack */ /* static config for Sony VAIO FE550G and Sony VAIO AR */ @@ -1597,6 +1597,23 @@ static struct hda_verb vaio_init[] = { {} }; +static struct hda_verb vaio_ar_init[] = { + {0x0a, AC_VERB_SET_PIN_WIDGET_CONTROL, PIN_HP }, /* HP <- 0x2 */ + {0x0f, AC_VERB_SET_PIN_WIDGET_CONTROL, PIN_OUT }, /* Speaker <- 0x5 */ + {0x0d, AC_VERB_SET_PIN_WIDGET_CONTROL, PIN_VREF80 }, /* Mic? (<- 0x2) */ + {0x0e, AC_VERB_SET_PIN_WIDGET_CONTROL, PIN_IN }, /* CD */ +/* {0x11, AC_VERB_SET_PIN_WIDGET_CONTROL, PIN_OUT },*/ /* Optical Out */ + {0x14, AC_VERB_SET_PIN_WIDGET_CONTROL, PIN_VREF80 }, /* Mic? */ + {0x15, AC_VERB_SET_CONNECT_SEL, 0x2}, /* mic-sel: 0a,0d,14,02 */ + {0x02, AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_MUTE}, /* HP */ + {0x05, AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_MUTE}, /* Speaker */ +/* {0x10, AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_MUTE},*/ /* Optical Out */ + {0x09, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_MUTE(0)}, /* capture sw/vol -> 0x8 */ + {0x07, AC_VERB_SET_AMP_GAIN_MUTE, AMP_IN_UNMUTE(0)}, /* CD-in -> 0x6 */ + {0x15, AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_UNMUTE}, /* Mic-in -> 0x9 */ + {} +}; + /* bind volumes of both NID 0x02 and 0x05 */ static int vaio_master_vol_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) @@ -1667,7 +1684,40 @@ static struct snd_kcontrol_new vaio_mixer[] = { {} }; -static struct hda_codec_ops stac766x_patch_ops = { +static struct snd_kcontrol_new vaio_ar_mixer[] = { + { + .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .name = "Master Playback Volume", + .info = snd_hda_mixer_amp_volume_info, + .get = snd_hda_mixer_amp_volume_get, + .put = vaio_master_vol_put, + .private_value = HDA_COMPOSE_AMP_VAL(0x02, 3, 0, HDA_OUTPUT), + }, + { + .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .name = "Master Playback Switch", + .info = snd_hda_mixer_amp_switch_info, + .get = snd_hda_mixer_amp_switch_get, + .put = vaio_master_sw_put, + .private_value = HDA_COMPOSE_AMP_VAL(0x02, 3, 0, HDA_OUTPUT), + }, + /* HDA_CODEC_VOLUME("CD Capture Volume", 0x07, 0, HDA_INPUT), */ + HDA_CODEC_VOLUME("Capture Volume", 0x09, 0, HDA_INPUT), + HDA_CODEC_MUTE("Capture Switch", 0x09, 0, HDA_INPUT), + /*HDA_CODEC_MUTE("Optical Out Switch", 0x10, 0, HDA_OUTPUT), + HDA_CODEC_VOLUME("Optical Out Volume", 0x10, 0, HDA_OUTPUT),*/ + { + .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .name = "Capture Source", + .count = 1, + .info = stac92xx_mux_enum_info, + .get = stac92xx_mux_enum_get, + .put = stac92xx_mux_enum_put, + }, + {} +}; + +static struct hda_codec_ops stac9872_patch_ops = { .build_controls = stac92xx_build_controls, .build_pcms = stac92xx_build_pcms, .init = stac92xx_init, @@ -1677,25 +1727,34 @@ static struct hda_codec_ops stac766x_patch_ops = { #endif }; -enum { STAC766x_VAIO }; - -static struct hda_board_config stac766x_cfg_tbl[] = { - { .modelname = "vaio", .config = STAC766x_VAIO }, +enum { /* FE and SZ series. id=0x83847661 and subsys=0x104D0700 or 104D1000. */ + CXD9872RD_VAIO, + /* Unknown. id=0x83847662 and subsys=0x104D1200 or 104D1000. */ + STAC9872AK_VAIO, + /* Unknown. id=0x83847661 and subsys=0x104D1200. */ + STAC9872K_VAIO, + /* AR Series. id=0x83847664 and subsys=104D1300 */ + CXD9872AKD_VAIO + }; + +static struct hda_board_config stac9872_cfg_tbl[] = { + { .modelname = "vaio", .config = CXD9872RD_VAIO }, + { .modelname = "vaio-ar", .config = CXD9872AKD_VAIO }, { .pci_subvendor = 0x104d, .pci_subdevice = 0x81e6, - .config = STAC766x_VAIO }, + .config = CXD9872RD_VAIO }, { .pci_subvendor = 0x104d, .pci_subdevice = 0x81ef, - .config = STAC766x_VAIO }, + .config = CXD9872RD_VAIO }, { .pci_subvendor = 0x104d, .pci_subdevice = 0x81fd, - .config = STAC766x_VAIO }, + .config = CXD9872AKD_VAIO }, {} }; -static int patch_stac766x(struct hda_codec *codec) +static int patch_stac9872(struct hda_codec *codec) { struct sigmatel_spec *spec; int board_config; - board_config = snd_hda_check_board_config(codec, stac766x_cfg_tbl); + board_config = snd_hda_check_board_config(codec, stac9872_cfg_tbl); if (board_config < 0) /* unknown config, let generic-parser do its job... */ return snd_hda_parse_generic_codec(codec); @@ -1706,7 +1765,9 @@ static int patch_stac766x(struct hda_codec *codec) codec->spec = spec; switch (board_config) { - case STAC766x_VAIO: + case CXD9872RD_VAIO: + case STAC9872AK_VAIO: + case STAC9872K_VAIO: spec->mixer = vaio_mixer; spec->init = vaio_init; spec->multiout.max_channels = 2; @@ -1718,9 +1779,22 @@ static int patch_stac766x(struct hda_codec *codec) spec->input_mux = &vaio_mux; spec->mux_nids = vaio_mux_nids; break; + + case CXD9872AKD_VAIO: + spec->mixer = vaio_ar_mixer; + spec->init = vaio_ar_init; + spec->multiout.max_channels = 2; + spec->multiout.num_dacs = ARRAY_SIZE(vaio_dacs); + spec->multiout.dac_nids = vaio_dacs; + spec->multiout.hp_nid = VAIO_HP_DAC; + spec->num_adcs = ARRAY_SIZE(vaio_adcs); + spec->adc_nids = vaio_adcs; + spec->input_mux = &vaio_mux; + spec->mux_nids = vaio_mux_nids; + break; } - codec->patch_ops = stac766x_patch_ops; + codec->patch_ops = stac9872_patch_ops; return 0; } @@ -1752,7 +1826,13 @@ struct hda_codec_preset snd_hda_preset_sigmatel[] = { { .id = 0x83847627, .name = "STAC9271D", .patch = patch_stac927x }, { .id = 0x83847628, .name = "STAC9274X5NH", .patch = patch_stac927x }, { .id = 0x83847629, .name = "STAC9274D5NH", .patch = patch_stac927x }, - { .id = 0x83847661, .name = "STAC7661", .patch = patch_stac766x }, + /* The following does not take into account .id=0x83847661 when subsys = + * 104D0C00 which is STAC9225s. Because of this, some SZ Notebooks are + * currently not fully supported. + */ + { .id = 0x83847661, .name = "CXD9872RD/K", .patch = patch_stac9872 }, + { .id = 0x83847662, .name = "STAC9872AK", .patch = patch_stac9872 }, + { .id = 0x83847664, .name = "CXD9872AKD", .patch = patch_stac9872 }, { .id = 0x838476a0, .name = "STAC9205", .patch = patch_stac9205 }, { .id = 0x838476a1, .name = "STAC9205D", .patch = patch_stac9205 }, { .id = 0x838476a2, .name = "STAC9204", .patch = patch_stac9205 }, @@ -1761,6 +1841,5 @@ struct hda_codec_preset snd_hda_preset_sigmatel[] = { { .id = 0x838476a5, .name = "STAC9255D", .patch = patch_stac9205 }, { .id = 0x838476a6, .name = "STAC9254", .patch = patch_stac9205 }, { .id = 0x838476a7, .name = "STAC9254D", .patch = patch_stac9205 }, - { .id = 0x83847664, .name = "STAC7664", .patch = patch_stac766x }, {} /* terminator */ }; -- cgit v0.10.2 From 11b44bbde52b6c50ed8c9ba579d7ee9ff5b48cd8 Mon Sep 17 00:00:00 2001 From: Richard Fish Date: Wed, 23 Aug 2006 18:31:34 +0200 Subject: [ALSA] hda-codec - restore HDA sigmatel pin configs on resume This patch restores the Intel HDA Sigmatel codec pin configuration on resume. Most of it is dedicated to saving the BIOS pin configuration if necessary, so that even unrecognized chips can be resumed correctly. Signed-off-by: Richard Fish Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index 139d73e..239ae3f 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -73,6 +73,7 @@ struct sigmatel_spec { hda_nid_t *pin_nids; unsigned int num_pins; unsigned int *pin_configs; + unsigned int *bios_pin_configs; /* codec specific stuff */ struct hda_verb *init; @@ -584,13 +585,42 @@ static struct hda_board_config stac9205_cfg_tbl[] = { {} /* terminator */ }; +static int stac92xx_save_bios_config_regs(struct hda_codec *codec) +{ + int i; + struct sigmatel_spec *spec = codec->spec; + + if (! spec->bios_pin_configs) { + spec->bios_pin_configs = kcalloc(spec->num_pins, + sizeof(*spec->bios_pin_configs), GFP_KERNEL); + if (! spec->bios_pin_configs) + return -ENOMEM; + } + + for (i = 0; i < spec->num_pins; i++) { + hda_nid_t nid = spec->pin_nids[i]; + unsigned int pin_cfg; + + pin_cfg = snd_hda_codec_read(codec, nid, 0, + AC_VERB_GET_CONFIG_DEFAULT, 0x00); + snd_printdd(KERN_INFO "hda_codec: pin nid %2.2x bios pin config %8.8x\n", + nid, pin_cfg); + spec->bios_pin_configs[i] = pin_cfg; + } + + return 0; +} + static void stac92xx_set_config_regs(struct hda_codec *codec) { int i; struct sigmatel_spec *spec = codec->spec; unsigned int pin_cfg; - for (i=0; i < spec->num_pins; i++) { + if (! spec->pin_nids || ! spec->pin_configs) + return; + + for (i = 0; i < spec->num_pins; i++) { snd_hda_codec_write(codec, spec->pin_nids[i], 0, AC_VERB_SET_CONFIG_DEFAULT_BYTES_0, spec->pin_configs[i] & 0x000000ff); @@ -1302,6 +1332,9 @@ static void stac92xx_free(struct hda_codec *codec) kfree(spec->kctl_alloc); } + if (spec->bios_pin_configs) + kfree(spec->bios_pin_configs); + kfree(spec); } @@ -1359,6 +1392,7 @@ static int stac92xx_resume(struct hda_codec *codec) int i; stac92xx_init(codec); + stac92xx_set_config_regs(codec); for (i = 0; i < spec->num_mixers; i++) snd_hda_resume_ctls(codec, spec->mixers[i]); if (spec->multiout.dig_out_nid) @@ -1391,12 +1425,18 @@ static int patch_stac9200(struct hda_codec *codec) return -ENOMEM; codec->spec = spec; + spec->num_pins = 8; + spec->pin_nids = stac9200_pin_nids; spec->board_config = snd_hda_check_board_config(codec, stac9200_cfg_tbl); - if (spec->board_config < 0) - snd_printdd(KERN_INFO "hda_codec: Unknown model for STAC9200, using BIOS defaults\n"); - else { - spec->num_pins = 8; - spec->pin_nids = stac9200_pin_nids; + if (spec->board_config < 0) { + snd_printdd(KERN_INFO "hda_codec: Unknown model for STAC9200, using BIOS defaults\n"); + err = stac92xx_save_bios_config_regs(codec); + if (err < 0) { + stac92xx_free(codec); + return err; + } + spec->pin_configs = spec->bios_pin_configs; + } else { spec->pin_configs = stac9200_brd_tbl[spec->board_config]; stac92xx_set_config_regs(codec); } @@ -1432,13 +1472,19 @@ static int patch_stac922x(struct hda_codec *codec) return -ENOMEM; codec->spec = spec; + spec->num_pins = 10; + spec->pin_nids = stac922x_pin_nids; spec->board_config = snd_hda_check_board_config(codec, stac922x_cfg_tbl); - if (spec->board_config < 0) - snd_printdd(KERN_INFO "hda_codec: Unknown model for STAC922x, " - "using BIOS defaults\n"); - else if (stac922x_brd_tbl[spec->board_config] != NULL) { - spec->num_pins = 10; - spec->pin_nids = stac922x_pin_nids; + if (spec->board_config < 0) { + snd_printdd(KERN_INFO "hda_codec: Unknown model for STAC922x, " + "using BIOS defaults\n"); + err = stac92xx_save_bios_config_regs(codec); + if (err < 0) { + stac92xx_free(codec); + return err; + } + spec->pin_configs = spec->bios_pin_configs; + } else if (stac922x_brd_tbl[spec->board_config] != NULL) { spec->pin_configs = stac922x_brd_tbl[spec->board_config]; stac92xx_set_config_regs(codec); } @@ -1476,12 +1522,18 @@ static int patch_stac927x(struct hda_codec *codec) return -ENOMEM; codec->spec = spec; + spec->num_pins = 14; + spec->pin_nids = stac927x_pin_nids; spec->board_config = snd_hda_check_board_config(codec, stac927x_cfg_tbl); - if (spec->board_config < 0) + if (spec->board_config < 0) { snd_printdd(KERN_INFO "hda_codec: Unknown model for STAC927x, using BIOS defaults\n"); - else if (stac927x_brd_tbl[spec->board_config] != NULL) { - spec->num_pins = 14; - spec->pin_nids = stac927x_pin_nids; + err = stac92xx_save_bios_config_regs(codec); + if (err < 0) { + stac92xx_free(codec); + return err; + } + spec->pin_configs = spec->bios_pin_configs; + } else if (stac927x_brd_tbl[spec->board_config] != NULL) { spec->pin_configs = stac927x_brd_tbl[spec->board_config]; stac92xx_set_config_regs(codec); } @@ -1532,12 +1584,18 @@ static int patch_stac9205(struct hda_codec *codec) return -ENOMEM; codec->spec = spec; + spec->num_pins = 14; + spec->pin_nids = stac9205_pin_nids; spec->board_config = snd_hda_check_board_config(codec, stac9205_cfg_tbl); - if (spec->board_config < 0) - snd_printdd(KERN_INFO "hda_codec: Unknown model for STAC9205, using BIOS defaults\n"); - else { - spec->num_pins = 14; - spec->pin_nids = stac9205_pin_nids; + if (spec->board_config < 0) { + snd_printdd(KERN_INFO "hda_codec: Unknown model for STAC9205, using BIOS defaults\n"); + err = stac92xx_save_bios_config_regs(codec); + if (err < 0) { + stac92xx_free(codec); + return err; + } + spec->pin_configs = spec->bios_pin_configs; + } else { spec->pin_configs = stac9205_brd_tbl[spec->board_config]; stac92xx_set_config_regs(codec); } -- cgit v0.10.2 From 071c73ad5fce436ee00c9422b7ca0c5d629451fb Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 23 Aug 2006 18:34:06 +0200 Subject: [ALSA] hda-codec - Fix mic capture with generic parser Fixed the mic capture with generic parser of hda-codec driver - Use VREF80 for mic pins if available - Handle multiple inputs correctly on audio-input widget node. Confirmed on a conexant codec chip. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_generic.c b/sound/pci/hda/hda_generic.c index 85ad164a..dedfc5b 100644 --- a/sound/pci/hda/hda_generic.c +++ b/sound/pci/hda/hda_generic.c @@ -461,14 +461,19 @@ static const char *get_input_type(struct hda_gnode *node, unsigned int *pinctl) return "Front Line"; return "Line"; case AC_JACK_CD: +#if 0 if (pinctl) *pinctl |= AC_PINCTL_VREF_GRD; +#endif return "CD"; case AC_JACK_AUX: if ((location & 0x0f) == AC_JACK_LOC_FRONT) return "Front Aux"; return "Aux"; case AC_JACK_MIC_IN: + if (node->pin_caps & + (AC_PINCAP_VREF_80 << AC_PINCAP_VREF_SHIFT)) + *pinctl |= AC_PINCTL_VREF_80; if ((location & 0x0f) == AC_JACK_LOC_FRONT) return "Front Mic"; return "Mic"; @@ -556,6 +561,29 @@ static int parse_adc_sub_nodes(struct hda_codec *codec, struct hda_gspec *spec, return 1; /* found */ } +/* add a capture source element */ +static void add_cap_src(struct hda_gspec *spec, int idx) +{ + struct hda_input_mux_item *csrc; + char *buf; + int num, ocap; + + num = spec->input_mux.num_items; + csrc = &spec->input_mux.items[num]; + buf = spec->cap_labels[num]; + for (ocap = 0; ocap < num; ocap++) { + if (! strcmp(buf, spec->cap_labels[ocap])) { + /* same label already exists, + * put the index number to be unique + */ + sprintf(buf, "%s %d", spec->cap_labels[ocap], num); + break; + } + } + csrc->index = idx; + spec->input_mux.num_items++; +} + /* * parse input */ @@ -576,28 +604,26 @@ static int parse_input_path(struct hda_codec *codec, struct hda_gnode *adc_node) * if it reaches to a proper input PIN, add the path as the * input path. */ + /* first, check the direct connections to PIN widgets */ for (i = 0; i < adc_node->nconns; i++) { node = hda_get_node(spec, adc_node->conn_list[i]); - if (! node) - continue; - err = parse_adc_sub_nodes(codec, spec, node); - if (err < 0) - return err; - else if (err > 0) { - struct hda_input_mux_item *csrc = &spec->input_mux.items[spec->input_mux.num_items]; - char *buf = spec->cap_labels[spec->input_mux.num_items]; - int ocap; - for (ocap = 0; ocap < spec->input_mux.num_items; ocap++) { - if (! strcmp(buf, spec->cap_labels[ocap])) { - /* same label already exists, - * put the index number to be unique - */ - sprintf(buf, "%s %d", spec->cap_labels[ocap], - spec->input_mux.num_items); - } - } - csrc->index = i; - spec->input_mux.num_items++; + if (node && node->type == AC_WID_PIN) { + err = parse_adc_sub_nodes(codec, spec, node); + if (err < 0) + return err; + else if (err > 0) + add_cap_src(spec, i); + } + } + /* ... then check the rests, more complicated connections */ + for (i = 0; i < adc_node->nconns; i++) { + node = hda_get_node(spec, adc_node->conn_list[i]); + if (node && node->type != AC_WID_PIN) { + err = parse_adc_sub_nodes(codec, spec, node); + if (err < 0) + return err; + else if (err > 0) + add_cap_src(spec, i); } } @@ -647,9 +673,6 @@ static int parse_input(struct hda_codec *codec) /* * create mixer controls if possible */ -#define DIR_OUT 0x1 -#define DIR_IN 0x2 - static int create_mixer(struct hda_codec *codec, struct hda_gnode *node, unsigned int index, const char *type, const char *dir_sfx) { @@ -743,28 +766,57 @@ static int build_input_controls(struct hda_codec *codec) { struct hda_gspec *spec = codec->spec; struct hda_gnode *adc_node = spec->adc_node; - int err; - - if (! adc_node) + int i, err; + static struct snd_kcontrol_new cap_sel = { + .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .name = "Capture Source", + .info = capture_source_info, + .get = capture_source_get, + .put = capture_source_put, + }; + + if (! adc_node || ! spec->input_mux.num_items) return 0; /* not found */ + spec->cur_cap_src = 0; + select_input_connection(codec, adc_node, + spec->input_mux.items[0].index); + /* create capture volume and switch controls if the ADC has an amp */ - err = create_mixer(codec, adc_node, 0, NULL, "Capture"); + /* do we have only a single item? */ + if (spec->input_mux.num_items == 1) { + err = create_mixer(codec, adc_node, + spec->input_mux.items[0].index, + NULL, "Capture"); + if (err < 0) + return err; + return 0; + } /* create input MUX if multiple sources are available */ - if (spec->input_mux.num_items > 1) { - static struct snd_kcontrol_new cap_sel = { - .iface = SNDRV_CTL_ELEM_IFACE_MIXER, - .name = "Capture Source", - .info = capture_source_info, - .get = capture_source_get, - .put = capture_source_put, - }; - if ((err = snd_ctl_add(codec->bus->card, snd_ctl_new1(&cap_sel, codec))) < 0) + if ((err = snd_ctl_add(codec->bus->card, + snd_ctl_new1(&cap_sel, codec))) < 0) + return err; + + /* no volume control? */ + if (! (adc_node->wid_caps & AC_WCAP_IN_AMP) || + ! (adc_node->amp_in_caps & AC_AMPCAP_NUM_STEPS)) + return 0; + + for (i = 0; i < spec->input_mux.num_items; i++) { + struct snd_kcontrol_new knew; + char name[32]; + sprintf(name, "%s Capture Volume", + spec->input_mux.items[i].label); + knew = (struct snd_kcontrol_new) + HDA_CODEC_VOLUME(name, adc_node->nid, + spec->input_mux.items[i].index, + HDA_INPUT); + if ((err = snd_ctl_add(codec->bus->card, + snd_ctl_new1(&knew, codec))) < 0) return err; - spec->cur_cap_src = 0; - select_input_connection(codec, adc_node, spec->input_mux.items[0].index); } + return 0; } -- cgit v0.10.2 From 3479307f8ca3cbf4181b8bf7d8c824156a9e63b7 Mon Sep 17 00:00:00 2001 From: Jochen Voss Date: Wed, 23 Aug 2006 18:35:35 +0200 Subject: [ALSA] Fix volume control for the AK4358 DAC Fix volume control for the AK4358 DAC. The attenuation control registers of the AK4358 use only 7bit for the volume, the msb is used to enable attenuation output. Without this patch there are 256 volume levels the lower 128 of which are mute. Signed-off-by: Jochen Voss Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/i2c/other/ak4xxx-adda.c b/sound/i2c/other/ak4xxx-adda.c index 0aea536..89fc3cb 100644 --- a/sound/i2c/other/ak4xxx-adda.c +++ b/sound/i2c/other/ak4xxx-adda.c @@ -284,11 +284,13 @@ EXPORT_SYMBOL(snd_akm4xxx_init); #define AK_GET_CHIP(val) (((val) >> 8) & 0xff) #define AK_GET_ADDR(val) ((val) & 0xff) -#define AK_GET_SHIFT(val) (((val) >> 16) & 0x7f) +#define AK_GET_SHIFT(val) (((val) >> 16) & 0x3f) +#define AK_GET_NEEDSMSB(val) (((val) >> 22) & 1) #define AK_GET_INVERT(val) (((val) >> 23) & 1) #define AK_GET_MASK(val) (((val) >> 24) & 0xff) #define AK_COMPOSE(chip,addr,shift,mask) \ (((chip) << 8) | (addr) | ((shift) << 16) | ((mask) << 24)) +#define AK_NEEDSMSB (1<<22) #define AK_INVERT (1<<23) static int snd_akm4xxx_volume_info(struct snd_kcontrol *kcontrol, @@ -309,10 +311,13 @@ static int snd_akm4xxx_volume_get(struct snd_kcontrol *kcontrol, struct snd_akm4xxx *ak = snd_kcontrol_chip(kcontrol); int chip = AK_GET_CHIP(kcontrol->private_value); int addr = AK_GET_ADDR(kcontrol->private_value); + int needsmsb = AK_GET_NEEDSMSB(kcontrol->private_value); int invert = AK_GET_INVERT(kcontrol->private_value); unsigned int mask = AK_GET_MASK(kcontrol->private_value); unsigned char val = snd_akm4xxx_get(ak, chip, addr); - + + if (needsmsb) + val &= 0x7f; ucontrol->value.integer.value[0] = invert ? mask - val : val; return 0; } @@ -323,6 +328,7 @@ static int snd_akm4xxx_volume_put(struct snd_kcontrol *kcontrol, struct snd_akm4xxx *ak = snd_kcontrol_chip(kcontrol); int chip = AK_GET_CHIP(kcontrol->private_value); int addr = AK_GET_ADDR(kcontrol->private_value); + int needsmsb = AK_GET_NEEDSMSB(kcontrol->private_value); int invert = AK_GET_INVERT(kcontrol->private_value); unsigned int mask = AK_GET_MASK(kcontrol->private_value); unsigned char nval = ucontrol->value.integer.value[0] % (mask+1); @@ -330,6 +336,8 @@ static int snd_akm4xxx_volume_put(struct snd_kcontrol *kcontrol, if (invert) nval = mask - nval; + if (needsmsb) + nval |= 0x80; change = snd_akm4xxx_get(ak, chip, addr) != nval; if (change) snd_akm4xxx_write(ak, chip, addr, nval); @@ -354,13 +362,19 @@ static int snd_akm4xxx_stereo_volume_get(struct snd_kcontrol *kcontrol, struct snd_akm4xxx *ak = snd_kcontrol_chip(kcontrol); int chip = AK_GET_CHIP(kcontrol->private_value); int addr = AK_GET_ADDR(kcontrol->private_value); + int needsmsb = AK_GET_NEEDSMSB(kcontrol->private_value); int invert = AK_GET_INVERT(kcontrol->private_value); unsigned int mask = AK_GET_MASK(kcontrol->private_value); - unsigned char val = snd_akm4xxx_get(ak, chip, addr); - + unsigned char val; + + val = snd_akm4xxx_get(ak, chip, addr); + if (needsmsb) + val &= 0x7f; ucontrol->value.integer.value[0] = invert ? mask - val : val; val = snd_akm4xxx_get(ak, chip, addr+1); + if (needsmsb) + val &= 0x7f; ucontrol->value.integer.value[1] = invert ? mask - val : val; return 0; @@ -372,6 +386,7 @@ static int snd_akm4xxx_stereo_volume_put(struct snd_kcontrol *kcontrol, struct snd_akm4xxx *ak = snd_kcontrol_chip(kcontrol); int chip = AK_GET_CHIP(kcontrol->private_value); int addr = AK_GET_ADDR(kcontrol->private_value); + int needsmsb = AK_GET_NEEDSMSB(kcontrol->private_value); int invert = AK_GET_INVERT(kcontrol->private_value); unsigned int mask = AK_GET_MASK(kcontrol->private_value); unsigned char nval = ucontrol->value.integer.value[0] % (mask+1); @@ -379,6 +394,8 @@ static int snd_akm4xxx_stereo_volume_put(struct snd_kcontrol *kcontrol, if (invert) nval = mask - nval; + if (needsmsb) + nval |= 0x80; change0 = snd_akm4xxx_get(ak, chip, addr) != nval; if (change0) snd_akm4xxx_write(ak, chip, addr, nval); @@ -386,6 +403,8 @@ static int snd_akm4xxx_stereo_volume_put(struct snd_kcontrol *kcontrol, nval = ucontrol->value.integer.value[1] % (mask+1); if (invert) nval = mask - nval; + if (needsmsb) + nval |= 0x80; change1 = snd_akm4xxx_get(ak, chip, addr+1) != nval; if (change1) snd_akm4xxx_write(ak, chip, addr+1, nval); @@ -585,16 +604,13 @@ int snd_akm4xxx_build_controls(struct snd_akm4xxx *ak) /* register 4-9, chip #0 only */ ctl->private_value = AK_COMPOSE(0, idx + 4, 0, 255); break; - case SND_AK4358: - if (idx >= 6) - /* register 4-9, chip #0 only */ - ctl->private_value = - AK_COMPOSE(0, idx + 5, 0, 255); - else - /* register 4-9, chip #0 only */ - ctl->private_value = - AK_COMPOSE(0, idx + 4, 0, 255); + case SND_AK4358: { + /* register 4-9 and 11-12, chip #0 only */ + int addr = idx < 6 ? idx + 4 : idx + 5; + ctl->private_value = + AK_COMPOSE(0, addr, 0, 127) | AK_NEEDSMSB; break; + } case SND_AK4381: /* register 3 & 4 */ ctl->private_value = -- cgit v0.10.2 From c6ff77f71fe692fa48fe02dbfe74a01f3d5e55e2 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 23 Aug 2006 19:53:02 +0200 Subject: [ALSA] Add dB scale information to pcxhr driver Added the dB scale information to pcxhr driver. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/pcxhr/pcxhr_mixer.c b/sound/pci/pcxhr/pcxhr_mixer.c index 94e63a1..b133ad9 100644 --- a/sound/pci/pcxhr/pcxhr_mixer.c +++ b/sound/pci/pcxhr/pcxhr_mixer.c @@ -31,6 +31,7 @@ #include "pcxhr_hwdep.h" #include "pcxhr_core.h" #include +#include #include #include "pcxhr_mixer.h" @@ -43,6 +44,9 @@ #define PCXHR_ANALOG_PLAYBACK_LEVEL_MAX 128 /* 0.0 dB */ #define PCXHR_ANALOG_PLAYBACK_ZERO_LEVEL 104 /* -24.0 dB ( 0.0 dB - fix level +24.0 dB ) */ +static DECLARE_TLV_DB_SCALE(db_scale_analog_capture, -9600, 50, 0); +static DECLARE_TLV_DB_SCALE(db_scale_analog_playback, -12800, 100, 0); + static int pcxhr_update_analog_audio_level(struct snd_pcxhr *chip, int is_capture, int channel) { int err, vol; @@ -130,10 +134,13 @@ static int pcxhr_analog_vol_put(struct snd_kcontrol *kcontrol, static struct snd_kcontrol_new pcxhr_control_analog_level = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), /* name will be filled later */ .info = pcxhr_analog_vol_info, .get = pcxhr_analog_vol_get, .put = pcxhr_analog_vol_put, + /* tlv will be filled later */ }; /* shared */ @@ -188,6 +195,7 @@ static struct snd_kcontrol_new pcxhr_control_output_switch = { #define PCXHR_DIGITAL_LEVEL_MAX 0x1ff /* +18 dB */ #define PCXHR_DIGITAL_ZERO_LEVEL 0x1b7 /* 0 dB */ +static DECLARE_TLV_DB_SCALE(db_scale_digital, -10950, 50, 0); #define MORE_THAN_ONE_STREAM_LEVEL 0x000001 #define VALID_STREAM_PAN_LEVEL_MASK 0x800000 @@ -343,11 +351,14 @@ static int pcxhr_pcm_vol_put(struct snd_kcontrol *kcontrol, static struct snd_kcontrol_new snd_pcxhr_pcm_vol = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), /* name will be filled later */ /* count will be filled later */ .info = pcxhr_digital_vol_info, /* shared */ .get = pcxhr_pcm_vol_get, .put = pcxhr_pcm_vol_put, + .tlv = { .p = db_scale_digital }, }; @@ -433,10 +444,13 @@ static int pcxhr_monitor_vol_put(struct snd_kcontrol *kcontrol, static struct snd_kcontrol_new pcxhr_control_monitor_vol = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Monitoring Volume", .info = pcxhr_digital_vol_info, /* shared */ .get = pcxhr_monitor_vol_get, .put = pcxhr_monitor_vol_put, + .tlv = { .p = db_scale_digital }, }; /* @@ -928,6 +942,7 @@ int pcxhr_create_mixer(struct pcxhr_mgr *mgr) temp = pcxhr_control_analog_level; temp.name = "Master Playback Volume"; temp.private_value = 0; /* playback */ + temp.tlv.p = db_scale_analog_playback; if ((err = snd_ctl_add(chip->card, snd_ctl_new1(&temp, chip))) < 0) return err; /* output mute controls */ @@ -963,6 +978,7 @@ int pcxhr_create_mixer(struct pcxhr_mgr *mgr) temp = pcxhr_control_analog_level; temp.name = "Master Capture Volume"; temp.private_value = 1; /* capture */ + temp.tlv.p = db_scale_analog_capture; if ((err = snd_ctl_add(chip->card, snd_ctl_new1(&temp, chip))) < 0) return err; -- cgit v0.10.2 From 1186ed8c7dc9c0185e783beddf241509cc224f1a Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 23 Aug 2006 19:53:28 +0200 Subject: [ALSA] Add dB scale information to vxpocket and vx222 drivers Added the dB scale information to vxpocket and vx222 drivers. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/include/sound/vx_core.h b/include/sound/vx_core.h index 9821a61..dbca141 100644 --- a/include/sound/vx_core.h +++ b/include/sound/vx_core.h @@ -128,6 +128,7 @@ struct snd_vx_hardware { unsigned int num_ins; unsigned int num_outs; unsigned int output_level_max; + unsigned int *output_level_db_scale; }; /* hwdep id string */ diff --git a/sound/drivers/vx/vx_mixer.c b/sound/drivers/vx/vx_mixer.c index c1d7fcd..1613ed8 100644 --- a/sound/drivers/vx/vx_mixer.c +++ b/sound/drivers/vx/vx_mixer.c @@ -23,6 +23,7 @@ #include #include #include +#include #include #include "vx_cmd.h" @@ -455,10 +456,13 @@ static int vx_output_level_put(struct snd_kcontrol *kcontrol, struct snd_ctl_ele static struct snd_kcontrol_new vx_control_output_level = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Master Playback Volume", .info = vx_output_level_info, .get = vx_output_level_get, .put = vx_output_level_put, + /* tlv will be filled later */ }; /* @@ -712,12 +716,17 @@ static int vx_monitor_sw_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_ return 0; } +static DECLARE_TLV_DB_SCALE(db_scale_audio_gain, -10975, 25, 0); + static struct snd_kcontrol_new vx_control_audio_gain = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), /* name will be filled later */ .info = vx_audio_gain_info, .get = vx_audio_gain_get, - .put = vx_audio_gain_put + .put = vx_audio_gain_put, + .tlv = { .p = db_scale_audio_gain }, }; static struct snd_kcontrol_new vx_control_output_switch = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -729,9 +738,12 @@ static struct snd_kcontrol_new vx_control_output_switch = { static struct snd_kcontrol_new vx_control_monitor_gain = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = "Monitoring Volume", + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .info = vx_audio_gain_info, /* shared */ .get = vx_audio_monitor_get, - .put = vx_audio_monitor_put + .put = vx_audio_monitor_put, + .tlv = { .p = db_scale_audio_gain }, }; static struct snd_kcontrol_new vx_control_monitor_switch = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -918,6 +930,7 @@ int snd_vx_mixer_new(struct vx_core *chip) for (i = 0; i < chip->hw->num_outs; i++) { temp = vx_control_output_level; temp.index = i; + temp.tlv.p = chip->hw->output_level_db_scale; if ((err = snd_ctl_add(card, snd_ctl_new1(&temp, chip))) < 0) return err; } diff --git a/sound/pci/vx222/vx222.c b/sound/pci/vx222/vx222.c index 9c03c6b..e7cd8ac 100644 --- a/sound/pci/vx222/vx222.c +++ b/sound/pci/vx222/vx222.c @@ -26,6 +26,7 @@ #include #include #include +#include #include "vx222.h" #define CARD_NAME "VX222" @@ -72,6 +73,9 @@ MODULE_DEVICE_TABLE(pci, snd_vx222_ids); /* */ +static DECLARE_TLV_DB_SCALE(db_scale_old_vol, -11350, 50, 0); +static DECLARE_TLV_DB_SCALE(db_scale_akm, -7350, 50, 0); + static struct snd_vx_hardware vx222_old_hw = { .name = "VX222/Old", @@ -81,6 +85,7 @@ static struct snd_vx_hardware vx222_old_hw = { .num_ins = 1, .num_outs = 1, .output_level_max = VX_ANALOG_OUT_LEVEL_MAX, + .output_level_db_scale = db_scale_old_vol, }; static struct snd_vx_hardware vx222_v2_hw = { @@ -92,6 +97,7 @@ static struct snd_vx_hardware vx222_v2_hw = { .num_ins = 1, .num_outs = 1, .output_level_max = VX2_AKM_LEVEL_MAX, + .output_level_db_scale = db_scale_akm, }; static struct snd_vx_hardware vx222_mic_hw = { @@ -103,6 +109,7 @@ static struct snd_vx_hardware vx222_mic_hw = { .num_ins = 1, .num_outs = 1, .output_level_max = VX2_AKM_LEVEL_MAX, + .output_level_db_scale = db_scale_akm, }; diff --git a/sound/pci/vx222/vx222_ops.c b/sound/pci/vx222/vx222_ops.c index 9b6d345..5e51950 100644 --- a/sound/pci/vx222/vx222_ops.c +++ b/sound/pci/vx222/vx222_ops.c @@ -28,6 +28,7 @@ #include #include +#include #include #include "vx222.h" @@ -845,6 +846,8 @@ static void vx2_set_input_level(struct snd_vx222 *chip) #define MIC_LEVEL_MAX 0xff +static DECLARE_TLV_DB_SCALE(db_scale_mic, -6450, 50, 0); + /* * controls API for input levels */ @@ -922,18 +925,24 @@ static int vx_mic_level_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_v static struct snd_kcontrol_new vx_control_input_level = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Capture Volume", .info = vx_input_level_info, .get = vx_input_level_get, .put = vx_input_level_put, + .tlv = { .p = db_scale_mic }, }; static struct snd_kcontrol_new vx_control_mic_level = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Mic Capture Volume", .info = vx_mic_level_info, .get = vx_mic_level_get, .put = vx_mic_level_put, + .tlv = { .p = db_scale_mic }, }; /* diff --git a/sound/pcmcia/vx/vxpocket.c b/sound/pcmcia/vx/vxpocket.c index 76c85cf..3089fcc 100644 --- a/sound/pcmcia/vx/vxpocket.c +++ b/sound/pcmcia/vx/vxpocket.c @@ -27,6 +27,7 @@ #include #include #include +#include /* */ @@ -90,6 +91,8 @@ static int snd_vxpocket_dev_free(struct snd_device *device) * Only output levels can be modified */ +static DECLARE_TLV_DB_SCALE(db_scale_old_vol, -11350, 50, 0); + static struct snd_vx_hardware vxpocket_hw = { .name = "VXPocket", .type = VX_TYPE_VXPOCKET, @@ -99,6 +102,7 @@ static struct snd_vx_hardware vxpocket_hw = { .num_ins = 1, .num_outs = 1, .output_level_max = VX_ANALOG_OUT_LEVEL_MAX, + .output_level_db_scale = db_scale_old_vol, }; /* VX-pocket 440 @@ -120,6 +124,7 @@ static struct snd_vx_hardware vxp440_hw = { .num_ins = 2, .num_outs = 2, .output_level_max = VX_ANALOG_OUT_LEVEL_MAX, + .output_level_db_scale = db_scale_old_vol, }; -- cgit v0.10.2 From 86148e84c218e49b54521e8dae7bb78eb66c4281 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 24 Aug 2006 12:36:36 +0200 Subject: [ALSA] Fix errors with user TLV_WRITE Fixed the errors at checking info.access field during user TLV_WRITE call. It should have been zero-initialized. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/core/control.c b/sound/core/control.c index ac14426..3030aaa 100644 --- a/sound/core/control.c +++ b/sound/core/control.c @@ -1048,6 +1048,7 @@ static int snd_ctl_elem_add(struct snd_ctl_file *file, if (ue == NULL) return -ENOMEM; ue->info = *info; + ue->info.access = 0; ue->elem_data = (char *)ue + sizeof(*ue); ue->elem_data_size = private_size; kctl.private_free = snd_ctl_elem_user_free; -- cgit v0.10.2 From 18c1c3f694105ab2a6f43e054e23f9a751b2f869 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 25 Aug 2006 11:39:34 +0200 Subject: [ALSA] Return error if no user TLV is defined Retrun error to user TLV_READ ioctl if no TLV is defined. (Until now, nothing was written and rerunred successfully.) Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/core/control.c b/sound/core/control.c index 3030aaa..6973a96 100644 --- a/sound/core/control.c +++ b/sound/core/control.c @@ -951,6 +951,8 @@ static int snd_ctl_elem_user_tlv(struct snd_kcontrol *kcontrol, ue->tlv_data = new_data; ue->tlv_data_size = size; } else { + if (! ue->tlv_data_size || ! ue->tlv_data) + return -ENXIO; if (size < ue->tlv_data_size) return -ENOSPC; if (copy_to_user(tlv, ue->tlv_data, ue->tlv_data_size)) -- cgit v0.10.2 From 2c7782b420ee137057eeec7c24a565ac85fc1988 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 25 Aug 2006 13:11:26 +0200 Subject: [ALSA] hda-codec - Use model=ref for some Dell laptops Force to choose model=ref for some Dell laptops with STAC9200 codec chip for fixing the silent mic recording problem (possibly due to a BIOS bug). Reference: ALSA bug#2038 So far, applied to Inspiron 630m, Latitude D620 and 120L. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index 239ae3f..8d5ad7c 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -582,6 +582,13 @@ static struct hda_board_config stac9205_cfg_tbl[] = { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2668, /* DFI LanParty */ .config = STAC_REF }, /* SigmaTel reference board */ + /* Dell laptops have BIOS problem */ + { .pci_subvendor = PCI_VENDOR_ID_DELL, .pci_subdevice = 0x01b5, + .config = STAC_REF }, /* Dell Inspiron 630m */ + { .pci_subvendor = PCI_VENDOR_ID_DELL, .pci_subdevice = 0x01c2, + .config = STAC_REF }, /* Dell Latitude D620 */ + { .pci_subvendor = PCI_VENDOR_ID_DELL, .pci_subdevice = 0x01cb, + .config = STAC_REF }, /* Dell Latitude 120L */ {} /* terminator */ }; -- cgit v0.10.2 From aaad3653a5f073ce9eaef4efd387cf7fc3a53d18 Mon Sep 17 00:00:00 2001 From: Krzysztof Helt Date: Mon, 28 Aug 2006 12:59:23 +0200 Subject: [ALSA] sparc dbri: recording is back This patch fixes sound recording after the driver convertion to ring buffered version. It also contains small clean ups to the driver. Signed-off-by: Krzysztof Helt Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/sparc/dbri.c b/sound/sparc/dbri.c index 3e6ad50..cdca8e4 100644 --- a/sound/sparc/dbri.c +++ b/sound/sparc/dbri.c @@ -107,7 +107,7 @@ static char *cmds[] = { #define dprintk(a, x...) if(dbri_debug & a) printk(KERN_DEBUG x) #else -#define dprintk(a, x...) +#define dprintk(a, x...) do { } while (0) #endif /* DBRI_DEBUG */ @@ -610,10 +610,10 @@ CPU interrupt to signal completion. Since the DBRI can run in parallel with the CPU, several means of synchronization present themselves. The method implemented here is only -to use the dbri_cmdwait() to wait for execution of batch of sent commands. +use of the dbri_cmdwait() to wait for execution of batch of sent commands. A circular command buffer is used here. A new command is being added -while other can be executed. The scheme works by adding two WAIT commands +while another can be executed. The scheme works by adding two WAIT commands after each sent batch of commands. When the next batch is prepared it is added after the WAIT commands then the WAITs are replaced with single JUMP command to the new batch. The the DBRI is forced to reread the last WAIT @@ -628,7 +628,7 @@ to send them to the DBRI. */ -#define MAXLOOPS 10 +#define MAXLOOPS 20 /* * Wait for the current command string to execute */ @@ -692,9 +692,8 @@ static void dbri_cmdsend(struct snd_dbri * dbri, s32 * cmd,int len) if (cmd > dbri->cmdptr) { s32 *ptr; - for (ptr = dbri->cmdptr; ptr < cmd+2; ptr++) { + for (ptr = dbri->cmdptr; ptr < cmd+2; ptr++) dprintk(D_CMD, "cmd: %lx:%08x\n", (unsigned long)ptr, *ptr); - } } else { s32 *ptr = dbri->cmdptr; @@ -1141,13 +1140,9 @@ static int setup_descs(struct snd_dbri * dbri, int streamno, unsigned int period return -1; } - if (streamno == DBRI_PLAY) { - dbri->dma->desc[last_desc].word1 |= - DBRI_TD_F | DBRI_TD_B; - dbri->dma->desc[last_desc].nda = - dbri->dma_dvma + dbri_dma_off(desc, first_desc); - dbri->next_desc[last_desc] = first_desc; - } + dbri->dma->desc[last_desc].nda = + dbri->dma_dvma + dbri_dma_off(desc, first_desc); + dbri->next_desc[last_desc] = first_desc; dbri->pipes[info->pipe].first_desc = first_desc; dbri->pipes[info->pipe].desc = first_desc; @@ -1639,7 +1634,6 @@ static void xmit_descs(struct snd_dbri *dbri) if (dbri == NULL) return; /* Disabled */ - /* First check the recording stream for buffer overflow */ info = &dbri->stream_info[DBRI_REC]; spin_lock_irqsave(&dbri->lock, flags); @@ -1649,27 +1643,20 @@ static void xmit_descs(struct snd_dbri *dbri) dprintk(D_DESC, "xmit_descs rec @ TD %d\n", first_td); /* Stream could be closed by the time we run. */ - if (first_td < 0) { - goto play; - } - - cmd = dbri_cmdlock(dbri, 2); - *(cmd++) = DBRI_CMD(D_SDP, 0, - dbri->pipes[info->pipe].sdp - | D_SDP_P | D_SDP_EVERY | D_SDP_C); - *(cmd++) = dbri->dma_dvma + dbri_dma_off(desc, first_td); - dbri_cmdsend(dbri, cmd, 2); + if (first_td >= 0) { + cmd = dbri_cmdlock(dbri, 2); + *(cmd++) = DBRI_CMD(D_SDP, 0, + dbri->pipes[info->pipe].sdp + | D_SDP_P | D_SDP_EVERY | D_SDP_C); + *(cmd++) = dbri->dma_dvma + dbri_dma_off(desc, first_td); + dbri_cmdsend(dbri, cmd, 2); - /* Reset our admin of the pipe & bytes read. */ - dbri->pipes[info->pipe].desc = first_td; + /* Reset our admin of the pipe. */ + dbri->pipes[info->pipe].desc = first_td; + } } -play: - spin_unlock_irqrestore(&dbri->lock, flags); - - /* Now check the playback stream for buffer underflow */ info = &dbri->stream_info[DBRI_PLAY]; - spin_lock_irqsave(&dbri->lock, flags); if (info->pipe >= 0) { first_td = dbri->pipes[info->pipe].first_desc; @@ -1685,7 +1672,7 @@ play: *(cmd++) = dbri->dma_dvma + dbri_dma_off(desc, first_td); dbri_cmdsend(dbri, cmd, 2); - /* Reset our admin of the pipe & bytes written. */ + /* Reset our admin of the pipe. */ dbri->pipes[info->pipe].desc = first_td; } } @@ -1755,7 +1742,6 @@ static void reception_complete_intr(struct snd_dbri * dbri, int pipe) return; } - dbri->dma->desc[rd].ba = 0; dbri->pipes[pipe].desc = dbri->next_desc[rd]; status = dbri->dma->desc[rd].word1; dbri->dma->desc[rd].word1 = 0; /* Reset it for next time. */ @@ -1768,18 +1754,6 @@ static void reception_complete_intr(struct snd_dbri * dbri, int pipe) dprintk(D_INT, "Recv RD %d, status 0x%02x, len %d\n", rd, DBRI_RD_STATUS(status), DBRI_RD_CNT(status)); - /* On the last TD, transmit them all again. */ -#if 0 - if (dbri->next_desc[rd] == -1) { - if (info->left > info->size) { - printk(KERN_WARNING - "%d bytes recorded in %d size buffer.\n", - info->left, info->size); - } - tasklet_schedule(&xmit_descs_task); - } -#endif - /* Notify ALSA */ if (spin_is_locked(&dbri->lock)) { spin_unlock(&dbri->lock); @@ -2113,6 +2087,7 @@ static int snd_dbri_prepare(struct snd_pcm_substream *substream) info->pipe = 6; /* Receive pipe */ spin_lock_irq(&dbri->lock); + info->offset = 0; /* Setup the all the transmit/receive desciptors to cover the * whole DMA buffer. -- cgit v0.10.2 From 99dabfe716002c54b4dffa545460dc74bc632c22 Mon Sep 17 00:00:00 2001 From: Krzysztof Helt Date: Mon, 28 Aug 2006 13:00:45 +0200 Subject: [ALSA] dbri sparc: fixes TS leak This patch fixes time slot leak in the dbri driver. Signed-off-by: Krzysztof Helt Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/sparc/dbri.c b/sound/sparc/dbri.c index cdca8e4..6b090fb 100644 --- a/sound/sparc/dbri.c +++ b/sound/sparc/dbri.c @@ -1044,7 +1044,7 @@ static int setup_descs(struct snd_dbri * dbri, int streamno, unsigned int period { struct dbri_streaminfo *info = &dbri->stream_info[streamno]; __u32 dvma_buffer; - int desc = 0; + int desc; int len; int first_desc = -1; int last_desc = -1; @@ -1087,6 +1087,18 @@ static int setup_descs(struct snd_dbri * dbri, int streamno, unsigned int period len &= ~3; } + /* Free descriptors if pipe has any */ + desc = dbri->pipes[info->pipe].first_desc; + if ( desc >= 0) + do { + dbri->dma->desc[desc].nda = dbri->dma->desc[desc].ba = 0; + desc = dbri->next_desc[desc]; + } while (desc != -1 && desc != dbri->pipes[info->pipe].first_desc); + + dbri->pipes[info->pipe].desc = -1; + dbri->pipes[info->pipe].first_desc = -1; + + desc = 0; while (len > 0) { int mylen; @@ -2054,6 +2066,7 @@ static int snd_dbri_hw_free(struct snd_pcm_substream *substream) struct snd_dbri *dbri = snd_pcm_substream_chip(substream); struct dbri_streaminfo *info = DBRI_STREAM(dbri, substream); int direction; + dprintk(D_USR, "hw_free.\n"); /* hw_free can get called multiple times. Only unmap the DMA once. @@ -2068,7 +2081,10 @@ static int snd_dbri_hw_free(struct snd_pcm_substream *substream) substream->runtime->buffer_size, direction); info->dvma_buffer = 0; } - info->pipe = -1; + if (info->pipe != -1) { + reset_pipe(dbri, info->pipe); + info->pipe = -1; + } return snd_pcm_lib_free_pages(substream); } -- cgit v0.10.2 From 1f14d167f0233342eab53bb1a429ddad1e848de4 Mon Sep 17 00:00:00 2001 From: Krzysztof Helt Date: Mon, 28 Aug 2006 13:01:31 +0200 Subject: [ALSA] sparc dbri: OSS layer fix This patch removes setting of incorrect stop_threshold value inside the driver. After the change, playback through the OSS layer works correctly. Signed-off-by: Krzysztof Helt Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/sparc/dbri.c b/sound/sparc/dbri.c index 6b090fb..82d5e80 100644 --- a/sound/sparc/dbri.c +++ b/sound/sparc/dbri.c @@ -2111,8 +2111,6 @@ static int snd_dbri_prepare(struct snd_pcm_substream *substream) ret = setup_descs(dbri, DBRI_STREAMNO(substream), snd_pcm_lib_period_bytes(substream)); - runtime->stop_threshold = DBRI_TD_MAXCNT / runtime->channels; - spin_unlock_irq(&dbri->lock); dprintk(D_USR, "prepare audio output. %d bytes\n", info->size); -- cgit v0.10.2 From 063a40d9111ce7558f2fdfa4f85acfc47eb27353 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Mon, 28 Aug 2006 13:20:13 +0200 Subject: [ALSA] Add the definition of linear volume TLV Added the definition of linear volume TLV type. Some DSP chips and codecs (e.g. AK codec) use linear volume control. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/include/sound/tlv.h b/include/sound/tlv.h index b826e1d..7905841 100644 --- a/include/sound/tlv.h +++ b/include/sound/tlv.h @@ -33,6 +33,7 @@ #define SNDRV_CTL_TLVT_CONTAINER 0 /* one level down - group of TLVs */ #define SNDRV_CTL_TLVT_DB_SCALE 1 /* dB scale */ +#define SNDRV_CTL_TLVT_DB_LINEAR 2 /* linear volume */ #define DECLARE_TLV_DB_SCALE(name, min, step, mute) \ unsigned int name[] = { \ @@ -40,4 +41,13 @@ unsigned int name[] = { \ (min), ((step) & 0xffff) | ((mute) ? 0x10000 : 0) \ } +/* linear volume between min_dB and max_dB (.01dB unit) */ +#define DECLARE_TLV_DB_LINEAR(name, min_dB, max_dB) \ +unsigned int name[] = { \ + SNDRV_CTL_TLVT_DB_LINEAR, 2 * sizeof(unsigned int), \ + (min_dB), (max_dB) \ +} + +#define TLV_DB_GAIN_MUTE -9999999 + #endif /* __SOUND_TLV_H */ -- cgit v0.10.2 From 33925186d843e7004288cd3d87843c5a1dbf55a4 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Mon, 28 Aug 2006 13:29:42 +0200 Subject: [ALSA] ymfpci - Add TLV entries for native volume controls Added the linear volume TLV entries for YMFPCI native volume controls. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ymfpci/ymfpci_main.c b/sound/pci/ymfpci/ymfpci_main.c index a55b5fd..24f6fc5 100644 --- a/sound/pci/ymfpci/ymfpci_main.c +++ b/sound/pci/ymfpci/ymfpci_main.c @@ -36,6 +36,7 @@ #include #include #include +#include #include #include #include @@ -1477,11 +1478,15 @@ static int snd_ymfpci_put_single(struct snd_kcontrol *kcontrol, return change; } +static DECLARE_TLV_DB_LINEAR(db_scale_native, TLV_DB_GAIN_MUTE, 0); + #define YMFPCI_DOUBLE(xname, xindex, reg) \ { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = xname, .index = xindex, \ + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | SNDRV_CTL_ELEM_ACCESS_TLV_READ, \ .info = snd_ymfpci_info_double, \ .get = snd_ymfpci_get_double, .put = snd_ymfpci_put_double, \ - .private_value = reg } + .private_value = reg, \ + .tlv = { .p = db_scale_native } } static int snd_ymfpci_info_double(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo) { -- cgit v0.10.2 From 9f458e7fb5b92385d348fb6039ba7211a6d6ba6e Mon Sep 17 00:00:00 2001 From: Andrey Liakhovets Date: Mon, 28 Aug 2006 16:52:41 +0200 Subject: [ALSA] ac97 - Fix VIA EPIA sound problem Fix the bad sound quality on VIA EPIA system using VIA VT1617A (ALSA bug#2381). Signed-off-by: Andrey Liakhovets Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ac97/ac97_patch.c b/sound/pci/ac97/ac97_patch.c index 392f6cc..bdd7f89 100644 --- a/sound/pci/ac97/ac97_patch.c +++ b/sound/pci/ac97/ac97_patch.c @@ -2799,6 +2799,10 @@ int patch_vt1616(struct snd_ac97 * ac97) */ int patch_vt1617a(struct snd_ac97 * ac97) { + /* bring analog power consumption to normal, like WinXP driver + * for EPIA SP + */ + snd_ac97_write_cache(ac97, 0x5c, 0x20); ac97->ext_id |= AC97_EI_SPDIF; /* force the detection of spdif */ ac97->rates[AC97_RATES_SPDIF] = SNDRV_PCM_RATE_44100 | SNDRV_PCM_RATE_48000; return 0; -- cgit v0.10.2 From a79eee8d3d8a80c37d235e1181d67c3705c7bbfe Mon Sep 17 00:00:00 2001 From: Luke Ross Date: Tue, 29 Aug 2006 10:46:32 +0200 Subject: [ALSA] Support for non-standard rates in USB audio driver There's at least one USB audio chipset out there which supports only one non-standard rate (ID 0e6a:0310 supports 46875Hz). There's a few other patches for this card which are unsatisfactory because they attempt to map this rate to 44.1k leading to sound distortion. The patch below uses SNDRV_PCM_RATE_KNOT to properly support the non-standard rates where they are available. Signed-off-by: Luke Ross Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/usb/usbaudio.c b/sound/usb/usbaudio.c index 087f9b6..664dd4c 100644 --- a/sound/usb/usbaudio.c +++ b/sound/usb/usbaudio.c @@ -123,6 +123,7 @@ struct audioformat { unsigned int rate_min, rate_max; /* min/max rates */ unsigned int nr_rates; /* number of rate table entries */ unsigned int *rate_table; /* rate table */ + unsigned int needs_knot; /* any unusual rates? */ }; struct snd_usb_substream; @@ -1759,6 +1760,9 @@ static int check_hw_params_convention(struct snd_usb_substream *subs) } channels[f->format] |= (1 << f->channels); rates[f->format] |= f->rates; + /* needs knot? */ + if (f->needs_knot) + goto __out; } /* check whether channels and rates match for all formats */ cmaster = rmaster = 0; @@ -1799,6 +1803,38 @@ static int check_hw_params_convention(struct snd_usb_substream *subs) return err; } +/* + * If the device supports unusual bit rates, does the request meet these? + */ +static int snd_usb_pcm_check_knot(struct snd_pcm_runtime *runtime, + struct snd_usb_substream *subs) +{ + struct list_head *p; + struct snd_pcm_hw_constraint_list constraints_rates; + int err; + + list_for_each(p, &subs->fmt_list) { + struct audioformat *fp; + fp = list_entry(p, struct audioformat, list); + + if (!fp->needs_knot) + continue; + + constraints_rates.count = fp->nr_rates; + constraints_rates.list = fp->rate_table; + constraints_rates.mask = 0; + + err = snd_pcm_hw_constraint_list(runtime, 0, + SNDRV_PCM_HW_PARAM_RATE, + &constraints_rates); + + if (err < 0) + return err; + } + + return 0; +} + /* * set up the runtime hardware information. @@ -1861,6 +1897,8 @@ static int setup_hw_info(struct snd_pcm_runtime *runtime, struct snd_usb_substre SNDRV_PCM_HW_PARAM_CHANNELS, -1)) < 0) return err; + if ((err = snd_usb_pcm_check_knot(runtime, subs)) < 0) + return err; } return 0; } @@ -2406,6 +2444,7 @@ static int parse_audio_format_rates(struct snd_usb_audio *chip, struct audioform unsigned char *fmt, int offset) { int nr_rates = fmt[offset]; + int found; if (fmt[0] < offset + 1 + 3 * (nr_rates ? nr_rates : 2)) { snd_printk(KERN_ERR "%d:%u:%d : invalid FORMAT_TYPE desc\n", chip->dev->devnum, fp->iface, fp->altsetting); @@ -2428,6 +2467,7 @@ static int parse_audio_format_rates(struct snd_usb_audio *chip, struct audioform return -1; } + fp->needs_knot = 0; fp->nr_rates = nr_rates; fp->rate_min = fp->rate_max = combine_triple(&fmt[8]); for (r = 0, idx = offset + 1; r < nr_rates; r++, idx += 3) { @@ -2436,13 +2476,19 @@ static int parse_audio_format_rates(struct snd_usb_audio *chip, struct audioform fp->rate_min = rate; else if (rate > fp->rate_max) fp->rate_max = rate; + found = 0; for (c = 0; c < (int)ARRAY_SIZE(conv_rates); c++) { if (rate == conv_rates[c]) { + found = 1; fp->rates |= (1 << c); break; } } + if (!found) + fp->needs_knot = 1; } + if (fp->needs_knot) + fp->rates |= SNDRV_PCM_RATE_KNOT; } else { /* continuous rates */ fp->rates = SNDRV_PCM_RATE_CONTINUOUS; -- cgit v0.10.2 From d0ae48471570c680333cbe28c143bbab887a4ec2 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Tue, 29 Aug 2006 18:15:15 +0200 Subject: [ALSA] Add missing dB scale information to vxpocket driver Added the missing dB scale information for Mic volume to vxpocket driver. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pcmcia/vx/vxp_mixer.c b/sound/pcmcia/vx/vxp_mixer.c index e237f6c..bced7b6 100644 --- a/sound/pcmcia/vx/vxp_mixer.c +++ b/sound/pcmcia/vx/vxp_mixer.c @@ -23,6 +23,7 @@ #include #include #include +#include #include "vxpocket.h" #define MIC_LEVEL_MIN 0 @@ -63,12 +64,17 @@ static int vx_mic_level_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_v return 0; } +static DECLARE_TLV_DB_SCALE(db_scale_mic, -21, 3, 0); + static struct snd_kcontrol_new vx_control_mic_level = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Mic Capture Volume", .info = vx_mic_level_info, .get = vx_mic_level_get, .put = vx_mic_level_put, + .tlv = { .p = db_scale_mic }, }; /* -- cgit v0.10.2 From 723b2b0d36fa7cea81a962af2d40d88520d5a5f1 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 30 Aug 2006 16:49:54 +0200 Subject: [ALSA] Clean up and add TLV support to AK4xxx i2c driver - Clean up the code in AK4xxx-ADDA i2c code. - Fix capture gain controls for AK5365 - Changed the static table for DAC/ADC mixer labels to use structs - Implemented TLV entries for each AK codec The volumes in AK4524, AK4528 and AK5365 are corrected with a table to be suitable for dB conversion. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/include/sound/ak4xxx-adda.h b/include/sound/ak4xxx-adda.h index 65ddfa3..026e4072 100644 --- a/include/sound/ak4xxx-adda.h +++ b/include/sound/ak4xxx-adda.h @@ -39,14 +39,26 @@ struct snd_ak4xxx_ops { #define AK4XXX_IMAGE_SIZE (AK4XXX_MAX_CHIPS * 16) /* 64 bytes */ +/* DAC label and channels */ +struct snd_akm4xxx_dac_channel { + char *name; /* mixer volume name */ + unsigned int num_channels; +}; + +/* ADC labels and channels */ +struct snd_akm4xxx_adc_channel { + char *name; /* capture gain volume label */ + char *gain_name; /* IPGA */ + char *switch_name; /* capture switch */ + unsigned int num_channels; +}; + struct snd_akm4xxx { struct snd_card *card; unsigned int num_adcs; /* AK4524 or AK4528 ADCs */ unsigned int num_dacs; /* AK4524 or AK4528 DACs */ unsigned char images[AK4XXX_IMAGE_SIZE]; /* saved register image */ - unsigned char ipga_gain[AK4XXX_MAX_CHIPS][2]; /* saved register image - * for IPGA (AK4528) - */ + unsigned char volumes[AK4XXX_IMAGE_SIZE]; /* saved volume values */ unsigned long private_value[AK4XXX_MAX_CHIPS]; /* helper for driver */ void *private_data[AK4XXX_MAX_CHIPS]; /* helper for driver */ /* template should fill the following fields */ @@ -56,10 +68,11 @@ struct snd_akm4xxx { SND_AK4355, SND_AK4358, SND_AK4381, SND_AK5365 } type; - unsigned int *num_stereo; /* array of combined counts - * for the mixer - */ - char **channel_names; /* array of mixer channel names */ + + /* (array) information of combined codecs */ + struct snd_akm4xxx_dac_channel *dac_info; + struct snd_akm4xxx_adc_channel *adc_info; + struct snd_ak4xxx_ops ops; }; @@ -73,9 +86,18 @@ int snd_akm4xxx_build_controls(struct snd_akm4xxx *ak); (ak)->images[(chip) * 16 + (reg)] #define snd_akm4xxx_set(ak,chip,reg,val) \ ((ak)->images[(chip) * 16 + (reg)] = (val)) +#define snd_akm4xxx_get_vol(ak,chip,reg) \ + (ak)->volumes[(chip) * 16 + (reg)] +#define snd_akm4xxx_set_vol(ak,chip,reg,val) \ + ((ak)->volumes[(chip) * 16 + (reg)] = (val)) + +/* Warning: IPGA is tricky - we assume the addr + 4 is unused + * so far, it's OK for all AK codecs with IPGA: + * AK4524, AK4528 and EK5365 + */ #define snd_akm4xxx_get_ipga(ak,chip,reg) \ - (ak)->ipga_gain[chip][(reg)-4] + snd_akm4xxx_get_vol(ak, chip, (reg) + 4) #define snd_akm4xxx_set_ipga(ak,chip,reg,val) \ - ((ak)->ipga_gain[chip][(reg)-4] = (val)) + snd_akm4xxx_set_vol(ak, chip, (reg) + 4, val) #endif /* __SOUND_AK4XXX_ADDA_H */ diff --git a/sound/i2c/other/ak4xxx-adda.c b/sound/i2c/other/ak4xxx-adda.c index 89fc3cb..c34cb46 100644 --- a/sound/i2c/other/ak4xxx-adda.c +++ b/sound/i2c/other/ak4xxx-adda.c @@ -28,12 +28,14 @@ #include #include #include +#include #include MODULE_AUTHOR("Jaroslav Kysela , Takashi Iwai "); MODULE_DESCRIPTION("Routines for control of AK452x / AK43xx AD/DA converters"); MODULE_LICENSE("GPL"); +/* write the given register and save the data to the cache */ void snd_akm4xxx_write(struct snd_akm4xxx *ak, int chip, unsigned char reg, unsigned char val) { @@ -41,15 +43,10 @@ void snd_akm4xxx_write(struct snd_akm4xxx *ak, int chip, unsigned char reg, ak->ops.write(ak, chip, reg, val); /* save the data */ - if (ak->type == SND_AK4524 || ak->type == SND_AK4528) { - if ((reg != 0x04 && reg != 0x05) || (val & 0x80) == 0) - snd_akm4xxx_set(ak, chip, reg, val); - else - snd_akm4xxx_set_ipga(ak, chip, reg, val); - } else { - /* AK4529, or else */ + /* don't overwrite with IPGA data */ + if ((ak->type != SND_AK4524 && ak->type != SND_AK5365) || + (reg != 0x04 && reg != 0x05) || (val & 0x80) == 0) snd_akm4xxx_set(ak, chip, reg, val); - } ak->ops.unlock(ak, chip); } @@ -78,7 +75,7 @@ static void ak4524_reset(struct snd_akm4xxx *ak, int state) /* IPGA */ for (reg = 0x04; reg < 0x06; reg++) snd_akm4xxx_write(ak, chip, reg, - snd_akm4xxx_get_ipga(ak, chip, reg)); + snd_akm4xxx_get_ipga(ak, chip, reg) | 0x80); } } @@ -144,6 +141,42 @@ void snd_akm4xxx_reset(struct snd_akm4xxx *ak, int state) EXPORT_SYMBOL(snd_akm4xxx_reset); + +/* + * Volume conversion table for non-linear volumes + * from -63.5dB (mute) to 0dB step 0.5dB + * + * Used for AK4524 input/ouput attenuation, AK4528, and + * AK5365 input attenuation + */ +static unsigned char vol_cvt_datt[128] = { + 0x00, 0x01, 0x01, 0x02, 0x02, 0x03, 0x03, 0x04, + 0x04, 0x04, 0x04, 0x05, 0x05, 0x05, 0x06, 0x06, + 0x06, 0x07, 0x07, 0x08, 0x08, 0x08, 0x09, 0x0a, + 0x0a, 0x0b, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x0f, + 0x10, 0x10, 0x11, 0x12, 0x12, 0x13, 0x13, 0x14, + 0x15, 0x16, 0x17, 0x17, 0x18, 0x19, 0x1a, 0x1c, + 0x1d, 0x1e, 0x1f, 0x20, 0x21, 0x22, 0x23, 0x23, + 0x24, 0x25, 0x26, 0x28, 0x29, 0x2a, 0x2b, 0x2d, + 0x2e, 0x30, 0x30, 0x31, 0x32, 0x33, 0x34, 0x35, + 0x37, 0x38, 0x39, 0x3b, 0x3c, 0x3e, 0x3f, 0x40, + 0x41, 0x42, 0x43, 0x44, 0x46, 0x47, 0x48, 0x4a, + 0x4b, 0x4d, 0x4e, 0x50, 0x51, 0x52, 0x53, 0x54, + 0x55, 0x56, 0x58, 0x59, 0x5b, 0x5c, 0x5e, 0x5f, + 0x60, 0x61, 0x62, 0x64, 0x65, 0x66, 0x67, 0x69, + 0x6a, 0x6c, 0x6d, 0x6f, 0x70, 0x71, 0x72, 0x73, + 0x75, 0x76, 0x77, 0x79, 0x7a, 0x7c, 0x7d, 0x7f, +}; + +/* + * dB tables + */ +static DECLARE_TLV_DB_SCALE(db_scale_vol_datt, -6350, 50, 1); +static DECLARE_TLV_DB_SCALE(db_scale_8bit, -12750, 50, 1); +static DECLARE_TLV_DB_SCALE(db_scale_7bit, -6350, 50, 1); +static DECLARE_TLV_DB_LINEAR(db_scale_linear, TLV_DB_GAIN_MUTE, 0); +static DECLARE_TLV_DB_SCALE(db_scale_ipga, 0, 50, 0); + /* * initialize all the ak4xxx chips */ @@ -240,6 +273,9 @@ void snd_akm4xxx_init(struct snd_akm4xxx *ak) int chip, num_chips; unsigned char *ptr, reg, data, *inits; + memset(ak->images, 0, sizeof(ak->images)); + memset(ak->volumes, 0, sizeof(ak->volumes)); + switch (ak->type) { case SND_AK4524: inits = inits_ak4524; @@ -265,6 +301,9 @@ void snd_akm4xxx_init(struct snd_akm4xxx *ak) inits = inits_ak4381; num_chips = ak->num_dacs / 2; break; + case SND_AK5365: + /* FIXME: any init sequence? */ + return; default: snd_BUG(); return; @@ -282,16 +321,21 @@ void snd_akm4xxx_init(struct snd_akm4xxx *ak) EXPORT_SYMBOL(snd_akm4xxx_init); +/* + * Mixer callbacks + */ +#define AK_VOL_CVT (1<<21) /* need dB conversion */ +#define AK_NEEDSMSB (1<<22) /* need MSB update bit */ +#define AK_INVERT (1<<23) /* data is inverted */ #define AK_GET_CHIP(val) (((val) >> 8) & 0xff) #define AK_GET_ADDR(val) ((val) & 0xff) -#define AK_GET_SHIFT(val) (((val) >> 16) & 0x3f) +#define AK_GET_SHIFT(val) (((val) >> 16) & 0x1f) +#define AK_GET_VOL_CVT(val) (((val) >> 21) & 1) #define AK_GET_NEEDSMSB(val) (((val) >> 22) & 1) #define AK_GET_INVERT(val) (((val) >> 23) & 1) #define AK_GET_MASK(val) (((val) >> 24) & 0xff) #define AK_COMPOSE(chip,addr,shift,mask) \ (((chip) << 8) | (addr) | ((shift) << 16) | ((mask) << 24)) -#define AK_NEEDSMSB (1<<22) -#define AK_INVERT (1<<23) static int snd_akm4xxx_volume_info(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo) @@ -311,37 +355,37 @@ static int snd_akm4xxx_volume_get(struct snd_kcontrol *kcontrol, struct snd_akm4xxx *ak = snd_kcontrol_chip(kcontrol); int chip = AK_GET_CHIP(kcontrol->private_value); int addr = AK_GET_ADDR(kcontrol->private_value); - int needsmsb = AK_GET_NEEDSMSB(kcontrol->private_value); - int invert = AK_GET_INVERT(kcontrol->private_value); - unsigned int mask = AK_GET_MASK(kcontrol->private_value); - unsigned char val = snd_akm4xxx_get(ak, chip, addr); - if (needsmsb) - val &= 0x7f; - ucontrol->value.integer.value[0] = invert ? mask - val : val; + ucontrol->value.integer.value[0] = snd_akm4xxx_get_vol(ak, chip, addr); return 0; } -static int snd_akm4xxx_volume_put(struct snd_kcontrol *kcontrol, - struct snd_ctl_elem_value *ucontrol) +static int put_ak_reg(struct snd_kcontrol *kcontrol, int addr, + unsigned char nval) { struct snd_akm4xxx *ak = snd_kcontrol_chip(kcontrol); - int chip = AK_GET_CHIP(kcontrol->private_value); - int addr = AK_GET_ADDR(kcontrol->private_value); - int needsmsb = AK_GET_NEEDSMSB(kcontrol->private_value); - int invert = AK_GET_INVERT(kcontrol->private_value); unsigned int mask = AK_GET_MASK(kcontrol->private_value); - unsigned char nval = ucontrol->value.integer.value[0] % (mask+1); - int change; + int chip = AK_GET_CHIP(kcontrol->private_value); - if (invert) + if (snd_akm4xxx_get_vol(ak, chip, addr) == nval) + return 0; + + snd_akm4xxx_set_vol(ak, chip, addr, nval); + if (AK_GET_VOL_CVT(kcontrol->private_value)) + nval = vol_cvt_datt[nval]; + if (AK_GET_INVERT(kcontrol->private_value)) nval = mask - nval; - if (needsmsb) + if (AK_GET_NEEDSMSB(kcontrol->private_value)) nval |= 0x80; - change = snd_akm4xxx_get(ak, chip, addr) != nval; - if (change) - snd_akm4xxx_write(ak, chip, addr, nval); - return change; + snd_akm4xxx_write(ak, chip, addr, nval); + return 1; +} + +static int snd_akm4xxx_volume_put(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_value *ucontrol) +{ + return put_ak_reg(kcontrol, AK_GET_ADDR(kcontrol->private_value), + ucontrol->value.integer.value[0]); } static int snd_akm4xxx_stereo_volume_info(struct snd_kcontrol *kcontrol, @@ -362,66 +406,25 @@ static int snd_akm4xxx_stereo_volume_get(struct snd_kcontrol *kcontrol, struct snd_akm4xxx *ak = snd_kcontrol_chip(kcontrol); int chip = AK_GET_CHIP(kcontrol->private_value); int addr = AK_GET_ADDR(kcontrol->private_value); - int needsmsb = AK_GET_NEEDSMSB(kcontrol->private_value); - int invert = AK_GET_INVERT(kcontrol->private_value); - unsigned int mask = AK_GET_MASK(kcontrol->private_value); - unsigned char val; - - val = snd_akm4xxx_get(ak, chip, addr); - if (needsmsb) - val &= 0x7f; - ucontrol->value.integer.value[0] = invert ? mask - val : val; - - val = snd_akm4xxx_get(ak, chip, addr+1); - if (needsmsb) - val &= 0x7f; - ucontrol->value.integer.value[1] = invert ? mask - val : val; + ucontrol->value.integer.value[0] = snd_akm4xxx_get_vol(ak, chip, addr); + ucontrol->value.integer.value[1] = snd_akm4xxx_get_vol(ak, chip, addr+1); return 0; } static int snd_akm4xxx_stereo_volume_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { - struct snd_akm4xxx *ak = snd_kcontrol_chip(kcontrol); - int chip = AK_GET_CHIP(kcontrol->private_value); int addr = AK_GET_ADDR(kcontrol->private_value); - int needsmsb = AK_GET_NEEDSMSB(kcontrol->private_value); - int invert = AK_GET_INVERT(kcontrol->private_value); - unsigned int mask = AK_GET_MASK(kcontrol->private_value); - unsigned char nval = ucontrol->value.integer.value[0] % (mask+1); - int change0, change1; - - if (invert) - nval = mask - nval; - if (needsmsb) - nval |= 0x80; - change0 = snd_akm4xxx_get(ak, chip, addr) != nval; - if (change0) - snd_akm4xxx_write(ak, chip, addr, nval); - - nval = ucontrol->value.integer.value[1] % (mask+1); - if (invert) - nval = mask - nval; - if (needsmsb) - nval |= 0x80; - change1 = snd_akm4xxx_get(ak, chip, addr+1) != nval; - if (change1) - snd_akm4xxx_write(ak, chip, addr+1, nval); - + int change; - return change0 || change1; + change = put_ak_reg(kcontrol, addr, ucontrol->value.integer.value[0]); + change |= put_ak_reg(kcontrol, addr + 1, + ucontrol->value.integer.value[1]); + return change; } -static int snd_akm4xxx_ipga_gain_info(struct snd_kcontrol *kcontrol, - struct snd_ctl_elem_info *uinfo) -{ - uinfo->type = SNDRV_CTL_ELEM_TYPE_INTEGER; - uinfo->count = 1; - uinfo->value.integer.min = 0; - uinfo->value.integer.max = 36; - return 0; -} +#define snd_akm4xxx_ipga_gain_info snd_akm4xxx_volume_info static int snd_akm4xxx_ipga_gain_get(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) @@ -429,21 +432,57 @@ static int snd_akm4xxx_ipga_gain_get(struct snd_kcontrol *kcontrol, struct snd_akm4xxx *ak = snd_kcontrol_chip(kcontrol); int chip = AK_GET_CHIP(kcontrol->private_value); int addr = AK_GET_ADDR(kcontrol->private_value); + ucontrol->value.integer.value[0] = - snd_akm4xxx_get_ipga(ak, chip, addr) & 0x7f; + snd_akm4xxx_get_ipga(ak, chip, addr); return 0; } +static int put_ak_ipga(struct snd_kcontrol *kcontrol, int addr, + unsigned char nval) +{ + struct snd_akm4xxx *ak = snd_kcontrol_chip(kcontrol); + int chip = AK_GET_CHIP(kcontrol->private_value); + + if (snd_akm4xxx_get_ipga(ak, chip, addr) == nval) + return 0; + snd_akm4xxx_set_ipga(ak, chip, addr, nval); + snd_akm4xxx_write(ak, chip, addr, nval | 0x80); /* need MSB */ + return 1; +} + static int snd_akm4xxx_ipga_gain_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { + return put_ak_ipga(kcontrol, AK_GET_ADDR(kcontrol->private_value), + ucontrol->value.integer.value[0]); +} + +#define snd_akm4xxx_stereo_gain_info snd_akm4xxx_stereo_volume_info + +static int snd_akm4xxx_stereo_gain_get(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_value *ucontrol) +{ struct snd_akm4xxx *ak = snd_kcontrol_chip(kcontrol); int chip = AK_GET_CHIP(kcontrol->private_value); int addr = AK_GET_ADDR(kcontrol->private_value); - unsigned char nval = (ucontrol->value.integer.value[0] % 37) | 0x80; - int change = snd_akm4xxx_get_ipga(ak, chip, addr) != nval; - if (change) - snd_akm4xxx_write(ak, chip, addr, nval); + + ucontrol->value.integer.value[0] = + snd_akm4xxx_get_ipga(ak, chip, addr); + ucontrol->value.integer.value[1] = + snd_akm4xxx_get_ipga(ak, chip, addr + 1); + return 0; +} + +static int snd_akm4xxx_stereo_gain_put(struct snd_kcontrol *kcontrol, + struct snd_ctl_elem_value *ucontrol) +{ + int addr = AK_GET_ADDR(kcontrol->private_value); + int change; + + change = put_ak_ipga(kcontrol, addr, ucontrol->value.integer.value[0]); + change |= put_ak_ipga(kcontrol, addr + 1, + ucontrol->value.integer.value[1]); return change; } @@ -548,221 +587,247 @@ static int ak4xxx_switch_put(struct snd_kcontrol *kcontrol, * build AK4xxx controls */ -int snd_akm4xxx_build_controls(struct snd_akm4xxx *ak) +static int build_dac_controls(struct snd_akm4xxx *ak) { - unsigned int idx, num_emphs; - struct snd_kcontrol *ctl; - int err; - int mixer_ch = 0; - int num_stereo; - - ctl = kmalloc(sizeof(*ctl), GFP_KERNEL); - if (! ctl) - return -ENOMEM; + int idx, err, mixer_ch, num_stereo; + struct snd_kcontrol_new knew; + mixer_ch = 0; for (idx = 0; idx < ak->num_dacs; ) { - memset(ctl, 0, sizeof(*ctl)); - if (ak->channel_names == NULL) { - strcpy(ctl->id.name, "DAC Volume"); + memset(&knew, 0, sizeof(knew)); + if (! ak->dac_info || ! ak->dac_info[mixer_ch].name) { + knew.name = "DAC Volume"; + knew.index = mixer_ch + ak->idx_offset * 2; num_stereo = 1; - ctl->id.index = mixer_ch + ak->idx_offset * 2; } else { - strcpy(ctl->id.name, ak->channel_names[mixer_ch]); - num_stereo = ak->num_stereo[mixer_ch]; - ctl->id.index = 0; + knew.name = ak->dac_info[mixer_ch].name; + num_stereo = ak->dac_info[mixer_ch].num_channels; } - ctl->id.iface = SNDRV_CTL_ELEM_IFACE_MIXER; - ctl->count = 1; + knew.iface = SNDRV_CTL_ELEM_IFACE_MIXER; + knew.count = 1; + knew.access = SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ; if (num_stereo == 2) { - ctl->info = snd_akm4xxx_stereo_volume_info; - ctl->get = snd_akm4xxx_stereo_volume_get; - ctl->put = snd_akm4xxx_stereo_volume_put; + knew.info = snd_akm4xxx_stereo_volume_info; + knew.get = snd_akm4xxx_stereo_volume_get; + knew.put = snd_akm4xxx_stereo_volume_put; } else { - ctl->info = snd_akm4xxx_volume_info; - ctl->get = snd_akm4xxx_volume_get; - ctl->put = snd_akm4xxx_volume_put; + knew.info = snd_akm4xxx_volume_info; + knew.get = snd_akm4xxx_volume_get; + knew.put = snd_akm4xxx_volume_put; } switch (ak->type) { case SND_AK4524: /* register 6 & 7 */ - ctl->private_value = - AK_COMPOSE(idx/2, (idx%2) + 6, 0, 127); + knew.private_value = + AK_COMPOSE(idx/2, (idx%2) + 6, 0, 127) | + AK_VOL_CVT; + knew.tlv.p = db_scale_vol_datt; break; case SND_AK4528: /* register 4 & 5 */ - ctl->private_value = - AK_COMPOSE(idx/2, (idx%2) + 4, 0, 127); + knew.private_value = + AK_COMPOSE(idx/2, (idx%2) + 4, 0, 127) | + AK_VOL_CVT; + knew.tlv.p = db_scale_vol_datt; break; case SND_AK4529: { /* registers 2-7 and b,c */ int val = idx < 6 ? idx + 2 : (idx - 6) + 0xb; - ctl->private_value = + knew.private_value = AK_COMPOSE(0, val, 0, 255) | AK_INVERT; + knew.tlv.p = db_scale_8bit; break; } case SND_AK4355: /* register 4-9, chip #0 only */ - ctl->private_value = AK_COMPOSE(0, idx + 4, 0, 255); + knew.private_value = AK_COMPOSE(0, idx + 4, 0, 255); + knew.tlv.p = db_scale_8bit; break; case SND_AK4358: { /* register 4-9 and 11-12, chip #0 only */ int addr = idx < 6 ? idx + 4 : idx + 5; - ctl->private_value = + knew.private_value = AK_COMPOSE(0, addr, 0, 127) | AK_NEEDSMSB; + knew.tlv.p = db_scale_7bit; break; } case SND_AK4381: /* register 3 & 4 */ - ctl->private_value = + knew.private_value = AK_COMPOSE(idx/2, (idx%2) + 3, 0, 255); + knew.tlv.p = db_scale_linear; break; default: - err = -EINVAL; - goto __error; + return -EINVAL; } - ctl->private_data = ak; - err = snd_ctl_add(ak->card, - snd_ctl_new(ctl, SNDRV_CTL_ELEM_ACCESS_READ| - SNDRV_CTL_ELEM_ACCESS_WRITE)); + err = snd_ctl_add(ak->card, snd_ctl_new1(&knew, ak)); if (err < 0) - goto __error; + return err; idx += num_stereo; mixer_ch++; } - for (idx = 0; idx < ak->num_adcs && ak->type == SND_AK4524; ++idx) { - memset(ctl, 0, sizeof(*ctl)); - strcpy(ctl->id.name, "ADC Volume"); - ctl->id.index = idx + ak->idx_offset * 2; - ctl->id.iface = SNDRV_CTL_ELEM_IFACE_MIXER; - ctl->count = 1; - ctl->info = snd_akm4xxx_volume_info; - ctl->get = snd_akm4xxx_volume_get; - ctl->put = snd_akm4xxx_volume_put; - /* register 4 & 5 */ - ctl->private_value = - AK_COMPOSE(idx/2, (idx%2) + 4, 0, 127); - ctl->private_data = ak; - err = snd_ctl_add(ak->card, - snd_ctl_new(ctl, SNDRV_CTL_ELEM_ACCESS_READ| - SNDRV_CTL_ELEM_ACCESS_WRITE)); - if (err < 0) - goto __error; - - memset(ctl, 0, sizeof(*ctl)); - strcpy(ctl->id.name, "IPGA Analog Capture Volume"); - ctl->id.index = idx + ak->idx_offset * 2; - ctl->id.iface = SNDRV_CTL_ELEM_IFACE_MIXER; - ctl->count = 1; - ctl->info = snd_akm4xxx_ipga_gain_info; - ctl->get = snd_akm4xxx_ipga_gain_get; - ctl->put = snd_akm4xxx_ipga_gain_put; + return 0; +} + +static int build_adc_controls(struct snd_akm4xxx *ak) +{ + int idx, err, mixer_ch, num_stereo; + struct snd_kcontrol_new knew; + + mixer_ch = 0; + for (idx = 0; idx < ak->num_adcs;) { + memset(&knew, 0, sizeof(knew)); + if (! ak->adc_info || ! ak->adc_info[mixer_ch].name) { + knew.name = "ADC Volume"; + knew.index = mixer_ch + ak->idx_offset * 2; + num_stereo = 1; + } else { + knew.name = ak->adc_info[mixer_ch].name; + num_stereo = ak->adc_info[mixer_ch].num_channels; + } + knew.iface = SNDRV_CTL_ELEM_IFACE_MIXER; + knew.count = 1; + knew.access = SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ; + if (num_stereo == 2) { + knew.info = snd_akm4xxx_stereo_volume_info; + knew.get = snd_akm4xxx_stereo_volume_get; + knew.put = snd_akm4xxx_stereo_volume_put; + } else { + knew.info = snd_akm4xxx_volume_info; + knew.get = snd_akm4xxx_volume_get; + knew.put = snd_akm4xxx_volume_put; + } /* register 4 & 5 */ - ctl->private_value = AK_COMPOSE(idx/2, (idx%2) + 4, 0, 0); - ctl->private_data = ak; - err = snd_ctl_add(ak->card, - snd_ctl_new(ctl, SNDRV_CTL_ELEM_ACCESS_READ| - SNDRV_CTL_ELEM_ACCESS_WRITE)); + knew.private_value = + AK_COMPOSE(idx/2, (idx%2) + 4, 0, 127) | + AK_VOL_CVT; + knew.tlv.p = db_scale_vol_datt; + err = snd_ctl_add(ak->card, snd_ctl_new1(&knew, ak)); if (err < 0) - goto __error; - } + return err; - if (ak->type == SND_AK5365) { - memset(ctl, 0, sizeof(*ctl)); - if (ak->channel_names == NULL) - strcpy(ctl->id.name, "Capture Volume"); + if (! ak->adc_info || ! ak->adc_info[mixer_ch].gain_name) + knew.name = "IPGA Analog Capture Volume"; else - strcpy(ctl->id.name, ak->channel_names[0]); - ctl->id.index = ak->idx_offset * 2; - ctl->id.iface = SNDRV_CTL_ELEM_IFACE_MIXER; - ctl->count = 1; - ctl->info = snd_akm4xxx_stereo_volume_info; - ctl->get = snd_akm4xxx_stereo_volume_get; - ctl->put = snd_akm4xxx_stereo_volume_put; - /* Registers 4 & 5 (see AK5365 data sheet, pages 34 and 35): - * valid values are from 0x00 (mute) to 0x98 (+12dB). */ - ctl->private_value = - AK_COMPOSE(0, 4, 0, 0x98); - ctl->private_data = ak; - err = snd_ctl_add(ak->card, - snd_ctl_new(ctl, SNDRV_CTL_ELEM_ACCESS_READ| - SNDRV_CTL_ELEM_ACCESS_WRITE)); + knew.name = ak->adc_info[mixer_ch].gain_name; + if (num_stereo == 2) { + knew.info = snd_akm4xxx_stereo_gain_info; + knew.get = snd_akm4xxx_stereo_gain_get; + knew.put = snd_akm4xxx_stereo_gain_put; + } else { + knew.info = snd_akm4xxx_ipga_gain_info; + knew.get = snd_akm4xxx_ipga_gain_get; + knew.put = snd_akm4xxx_ipga_gain_put; + } + /* register 4 & 5 */ + if (ak->type == SND_AK4524) + knew.private_value = AK_COMPOSE(idx/2, (idx%2) + 4, 0, + 24); + else /* AK5365 */ + knew.private_value = AK_COMPOSE(idx/2, (idx%2) + 4, 0, + 36); + knew.tlv.p = db_scale_ipga; + err = snd_ctl_add(ak->card, snd_ctl_new1(&knew, ak)); if (err < 0) - goto __error; + return err; + + if (ak->type == SND_AK5365 && (idx % 2) == 0) { + if (! ak->adc_info || + ! ak->adc_info[mixer_ch].switch_name) + knew.name = "Capture Switch"; + else + knew.name = ak->adc_info[mixer_ch].switch_name; + knew.info = ak4xxx_switch_info; + knew.get = ak4xxx_switch_get; + knew.put = ak4xxx_switch_put; + knew.access = 0; + /* register 2, bit 0 (SMUTE): 0 = normal operation, + 1 = mute */ + knew.private_value = + AK_COMPOSE(idx/2, 2, 0, 0) | AK_INVERT; + err = snd_ctl_add(ak->card, snd_ctl_new1(&knew, ak)); + if (err < 0) + return err; + } - memset(ctl, 0, sizeof(*ctl)); - if (ak->channel_names == NULL) - strcpy(ctl->id.name, "Capture Switch"); - else - strcpy(ctl->id.name, ak->channel_names[1]); - ctl->id.index = ak->idx_offset * 2; - ctl->id.iface = SNDRV_CTL_ELEM_IFACE_MIXER; - ctl->count = 1; - ctl->info = ak4xxx_switch_info; - ctl->get = ak4xxx_switch_get; - ctl->put = ak4xxx_switch_put; - /* register 2, bit 0 (SMUTE): 0 = normal operation, 1 = mute */ - ctl->private_value = - AK_COMPOSE(0, 2, 0, 0) | AK_INVERT; - ctl->private_data = ak; - err = snd_ctl_add(ak->card, - snd_ctl_new(ctl, SNDRV_CTL_ELEM_ACCESS_READ| - SNDRV_CTL_ELEM_ACCESS_WRITE)); - if (err < 0) - goto __error; + idx += num_stereo; + mixer_ch++; } + return 0; +} + +static int build_deemphasis(struct snd_akm4xxx *ak, int num_emphs) +{ + int idx, err; + struct snd_kcontrol_new knew; - if (ak->type == SND_AK4355 || ak->type == SND_AK4358) - num_emphs = 1; - else - num_emphs = ak->num_dacs / 2; for (idx = 0; idx < num_emphs; idx++) { - memset(ctl, 0, sizeof(*ctl)); - strcpy(ctl->id.name, "Deemphasis"); - ctl->id.index = idx + ak->idx_offset; - ctl->id.iface = SNDRV_CTL_ELEM_IFACE_MIXER; - ctl->count = 1; - ctl->info = snd_akm4xxx_deemphasis_info; - ctl->get = snd_akm4xxx_deemphasis_get; - ctl->put = snd_akm4xxx_deemphasis_put; + memset(&knew, 0, sizeof(knew)); + knew.name = "Deemphasis"; + knew.index = idx + ak->idx_offset; + knew.iface = SNDRV_CTL_ELEM_IFACE_MIXER; + knew.count = 1; + knew.info = snd_akm4xxx_deemphasis_info; + knew.get = snd_akm4xxx_deemphasis_get; + knew.put = snd_akm4xxx_deemphasis_put; switch (ak->type) { case SND_AK4524: case SND_AK4528: /* register 3 */ - ctl->private_value = AK_COMPOSE(idx, 3, 0, 0); + knew.private_value = AK_COMPOSE(idx, 3, 0, 0); break; case SND_AK4529: { int shift = idx == 3 ? 6 : (2 - idx) * 2; /* register 8 with shift */ - ctl->private_value = AK_COMPOSE(0, 8, shift, 0); + knew.private_value = AK_COMPOSE(0, 8, shift, 0); break; } case SND_AK4355: case SND_AK4358: - ctl->private_value = AK_COMPOSE(idx, 3, 0, 0); + knew.private_value = AK_COMPOSE(idx, 3, 0, 0); break; case SND_AK4381: - ctl->private_value = AK_COMPOSE(idx, 1, 1, 0); + knew.private_value = AK_COMPOSE(idx, 1, 1, 0); break; default: - err = -EINVAL; - goto __error; + return -EINVAL; } - ctl->private_data = ak; - err = snd_ctl_add(ak->card, - snd_ctl_new(ctl, SNDRV_CTL_ELEM_ACCESS_READ| - SNDRV_CTL_ELEM_ACCESS_WRITE)); + err = snd_ctl_add(ak->card, snd_ctl_new1(&knew, ak)); if (err < 0) - goto __error; + return err; } - err = 0; - - __error: - kfree(ctl); - return err; + return 0; } +int snd_akm4xxx_build_controls(struct snd_akm4xxx *ak) +{ + int err, num_emphs; + + err = build_dac_controls(ak); + if (err < 0) + return err; + + if (ak->type == SND_AK4524 || ak->type == SND_AK5365) { + err = build_adc_controls(ak); + if (err < 0) + return err; + } + + if (ak->type == SND_AK4355 || ak->type == SND_AK4358) + num_emphs = 1; + else + num_emphs = ak->num_dacs / 2; + err = build_deemphasis(ak, num_emphs); + if (err < 0) + return err; + + return 0; +} + EXPORT_SYMBOL(snd_akm4xxx_build_controls); static int __init alsa_akm4xxx_module_init(void) diff --git a/sound/pci/ice1712/revo.c b/sound/pci/ice1712/revo.c index 1134a57..c9eefa9 100644 --- a/sound/pci/ice1712/revo.c +++ b/sound/pci/ice1712/revo.c @@ -87,19 +87,34 @@ static void revo_set_rate_val(struct snd_akm4xxx *ak, unsigned int rate) * initialize the chips on M-Audio Revolution cards */ -static unsigned int revo71_num_stereo_front[] = {2}; -static char *revo71_channel_names_front[] = {"PCM Playback Volume"}; +#define AK_DAC(xname,xch) { .name = xname, .num_channels = xch } -static unsigned int revo71_num_stereo_surround[] = {1, 1, 2, 2}; -static char *revo71_channel_names_surround[] = {"PCM Center Playback Volume", "PCM LFE Playback Volume", - "PCM Side Playback Volume", "PCM Rear Playback Volume"}; +static struct snd_akm4xxx_dac_channel revo71_front[] = { + AK_DAC("PCM Playback Volume", 2) +}; + +static struct snd_akm4xxx_dac_channel revo71_surround[] = { + AK_DAC("PCM Center Playback Volume", 1), + AK_DAC("PCM LFE Playback Volume", 1), + AK_DAC("PCM Side Playback Volume", 2), + AK_DAC("PCM Rear Playback Volume", 2), +}; -static unsigned int revo51_num_stereo[] = {2, 1, 1, 2}; -static char *revo51_channel_names[] = {"PCM Playback Volume", "PCM Center Playback Volume", - "PCM LFE Playback Volume", "PCM Rear Playback Volume"}; +static struct snd_akm4xxx_dac_channel revo51_dac[] = { + AK_DAC("PCM Playback Volume", 2), + AK_DAC("PCM Center Playback Volume", 1), + AK_DAC("PCM LFE Playback Volume", 1), + AK_DAC("PCM Rear Playback Volume", 2), +}; -static unsigned int revo51_adc_num_stereo[] = {2}; -static char *revo51_adc_channel_names[] = {"PCM Capture Volume","PCM Capture Switch"}; +static struct snd_akm4xxx_adc_channel revo51_adc[] = { + { + .name = "PCM Capture Volume", + .gain_name = "PCM Capture Gain Volume", + .switch_name = "PCM Capture Switch", + .num_channels = 2 + }, +}; static struct snd_akm4xxx akm_revo_front __devinitdata = { .type = SND_AK4381, @@ -107,8 +122,7 @@ static struct snd_akm4xxx akm_revo_front __devinitdata = { .ops = { .set_rate_val = revo_set_rate_val }, - .num_stereo = revo71_num_stereo_front, - .channel_names = revo71_channel_names_front + .dac_info = revo71_front, }; static struct snd_ak4xxx_private akm_revo_front_priv __devinitdata = { @@ -130,8 +144,7 @@ static struct snd_akm4xxx akm_revo_surround __devinitdata = { .ops = { .set_rate_val = revo_set_rate_val }, - .num_stereo = revo71_num_stereo_surround, - .channel_names = revo71_channel_names_surround + .dac_info = revo71_surround, }; static struct snd_ak4xxx_private akm_revo_surround_priv __devinitdata = { @@ -152,8 +165,7 @@ static struct snd_akm4xxx akm_revo51 __devinitdata = { .ops = { .set_rate_val = revo_set_rate_val }, - .num_stereo = revo51_num_stereo, - .channel_names = revo51_channel_names + .dac_info = revo51_dac, }; static struct snd_ak4xxx_private akm_revo51_priv __devinitdata = { @@ -171,8 +183,7 @@ static struct snd_ak4xxx_private akm_revo51_priv __devinitdata = { static struct snd_akm4xxx akm_revo51_adc __devinitdata = { .type = SND_AK5365, .num_adcs = 2, - .num_stereo = revo51_adc_num_stereo, - .channel_names = revo51_adc_channel_names + .adc_info = revo51_adc, }; static struct snd_ak4xxx_private akm_revo51_adc_priv __devinitdata = { -- cgit v0.10.2 From 680ef792a1afdb3bf38e4a0296cce996a5b95317 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 30 Aug 2006 16:56:30 +0200 Subject: [ALSA] Add dB scale information to ice1712 driver Added the dB scale information for native digital volumes of ice1712 driver. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ice1712/ice1712.c b/sound/pci/ice1712/ice1712.c index 9b8325d..dc69392 100644 --- a/sound/pci/ice1712/ice1712.c +++ b/sound/pci/ice1712/ice1712.c @@ -62,6 +62,7 @@ #include #include #include +#include #include @@ -1377,6 +1378,7 @@ static int snd_ice1712_pro_mixer_volume_put(struct snd_kcontrol *kcontrol, struc return change; } +static DECLARE_TLV_DB_SCALE(db_scale_playback, -14400, 150, 0); static struct snd_kcontrol_new snd_ice1712_multi_playback_ctrls[] __devinitdata = { { @@ -1390,12 +1392,15 @@ static struct snd_kcontrol_new snd_ice1712_multi_playback_ctrls[] __devinitdata }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Multi Playback Volume", .info = snd_ice1712_pro_mixer_volume_info, .get = snd_ice1712_pro_mixer_volume_get, .put = snd_ice1712_pro_mixer_volume_put, .private_value = 0, .count = 10, + .tlv = { .p = db_scale_playback } }, }; @@ -1420,11 +1425,14 @@ static struct snd_kcontrol_new snd_ice1712_multi_capture_spdif_switch __devinitd static struct snd_kcontrol_new snd_ice1712_multi_capture_analog_volume __devinitdata = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "H/W Multi Capture Volume", .info = snd_ice1712_pro_mixer_volume_info, .get = snd_ice1712_pro_mixer_volume_get, .put = snd_ice1712_pro_mixer_volume_put, .private_value = 10, + .tlv = { .p = db_scale_playback } }; static struct snd_kcontrol_new snd_ice1712_multi_capture_spdif_volume __devinitdata = { -- cgit v0.10.2 From f640c3205aca4fe231beccc9e719c946cf3fee7a Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 30 Aug 2006 16:57:37 +0200 Subject: [ALSA] Add dB scale information to ice1724 driver Added the dB scale information to each board support code of ice1724 driver. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ice1712/aureon.c b/sound/pci/ice1712/aureon.c index 9492f3d..9e76ceb 100644 --- a/sound/pci/ice1712/aureon.c +++ b/sound/pci/ice1712/aureon.c @@ -60,6 +60,7 @@ #include "ice1712.h" #include "envy24ht.h" #include "aureon.h" +#include /* WM8770 registers */ #define WM_DAC_ATTEN 0x00 /* DAC1-8 analog attenuation */ @@ -660,6 +661,12 @@ static int aureon_ac97_mmute_put(struct snd_kcontrol *kcontrol, struct snd_ctl_e return change; } +static DECLARE_TLV_DB_SCALE(db_scale_wm_dac, -12700, 100, 1); +static DECLARE_TLV_DB_SCALE(db_scale_wm_pcm, -6400, 50, 1); +static DECLARE_TLV_DB_SCALE(db_scale_wm_adc, -1200, 100, 0); +static DECLARE_TLV_DB_SCALE(db_scale_ac97_master, -4650, 150, 0); +static DECLARE_TLV_DB_SCALE(db_scale_ac97_gain, -3450, 150, 0); + /* * Logarithmic volume values for WM8770 * Computed as 20 * Log10(255 / x) @@ -1409,10 +1416,13 @@ static struct snd_kcontrol_new aureon_dac_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Master Playback Volume", .info = wm_master_vol_info, .get = wm_master_vol_get, - .put = wm_master_vol_put + .put = wm_master_vol_put, + .tlv = { .p = db_scale_wm_dac } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1424,11 +1434,14 @@ static struct snd_kcontrol_new aureon_dac_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Front Playback Volume", .info = wm_vol_info, .get = wm_vol_get, .put = wm_vol_put, - .private_value = (2 << 8) | 0 + .private_value = (2 << 8) | 0, + .tlv = { .p = db_scale_wm_dac } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1440,11 +1453,14 @@ static struct snd_kcontrol_new aureon_dac_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Rear Playback Volume", .info = wm_vol_info, .get = wm_vol_get, .put = wm_vol_put, - .private_value = (2 << 8) | 2 + .private_value = (2 << 8) | 2, + .tlv = { .p = db_scale_wm_dac } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1456,11 +1472,14 @@ static struct snd_kcontrol_new aureon_dac_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Center Playback Volume", .info = wm_vol_info, .get = wm_vol_get, .put = wm_vol_put, - .private_value = (1 << 8) | 4 + .private_value = (1 << 8) | 4, + .tlv = { .p = db_scale_wm_dac } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1472,11 +1491,14 @@ static struct snd_kcontrol_new aureon_dac_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "LFE Playback Volume", .info = wm_vol_info, .get = wm_vol_get, .put = wm_vol_put, - .private_value = (1 << 8) | 5 + .private_value = (1 << 8) | 5, + .tlv = { .p = db_scale_wm_dac } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1488,11 +1510,14 @@ static struct snd_kcontrol_new aureon_dac_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Side Playback Volume", .info = wm_vol_info, .get = wm_vol_get, .put = wm_vol_put, - .private_value = (2 << 8) | 6 + .private_value = (2 << 8) | 6, + .tlv = { .p = db_scale_wm_dac } } }; @@ -1506,10 +1531,13 @@ static struct snd_kcontrol_new wm_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "PCM Playback Volume", .info = wm_pcm_vol_info, .get = wm_pcm_vol_get, - .put = wm_pcm_vol_put + .put = wm_pcm_vol_put, + .tlv = { .p = db_scale_wm_pcm } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1520,10 +1548,13 @@ static struct snd_kcontrol_new wm_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Capture Volume", .info = wm_adc_vol_info, .get = wm_adc_vol_get, - .put = wm_adc_vol_put + .put = wm_adc_vol_put, + .tlv = { .p = db_scale_wm_adc } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1567,11 +1598,14 @@ static struct snd_kcontrol_new ac97_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "AC97 Playback Volume", .info = aureon_ac97_vol_info, .get = aureon_ac97_vol_get, .put = aureon_ac97_vol_put, - .private_value = AC97_MASTER|AUREON_AC97_STEREO + .private_value = AC97_MASTER|AUREON_AC97_STEREO, + .tlv = { .p = db_scale_ac97_master } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1583,11 +1617,14 @@ static struct snd_kcontrol_new ac97_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "CD Playback Volume", .info = aureon_ac97_vol_info, .get = aureon_ac97_vol_get, .put = aureon_ac97_vol_put, - .private_value = AC97_CD|AUREON_AC97_STEREO + .private_value = AC97_CD|AUREON_AC97_STEREO, + .tlv = { .p = db_scale_ac97_gain } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1599,11 +1636,14 @@ static struct snd_kcontrol_new ac97_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Aux Playback Volume", .info = aureon_ac97_vol_info, .get = aureon_ac97_vol_get, .put = aureon_ac97_vol_put, - .private_value = AC97_AUX|AUREON_AC97_STEREO + .private_value = AC97_AUX|AUREON_AC97_STEREO, + .tlv = { .p = db_scale_ac97_gain } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1615,11 +1655,14 @@ static struct snd_kcontrol_new ac97_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Line Playback Volume", .info = aureon_ac97_vol_info, .get = aureon_ac97_vol_get, .put = aureon_ac97_vol_put, - .private_value = AC97_LINE|AUREON_AC97_STEREO + .private_value = AC97_LINE|AUREON_AC97_STEREO, + .tlv = { .p = db_scale_ac97_gain } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1631,11 +1674,14 @@ static struct snd_kcontrol_new ac97_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Mic Playback Volume", .info = aureon_ac97_vol_info, .get = aureon_ac97_vol_get, .put = aureon_ac97_vol_put, - .private_value = AC97_MIC + .private_value = AC97_MIC, + .tlv = { .p = db_scale_ac97_gain } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1657,11 +1703,14 @@ static struct snd_kcontrol_new universe_ac97_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "AC97 Playback Volume", .info = aureon_ac97_vol_info, .get = aureon_ac97_vol_get, .put = aureon_ac97_vol_put, - .private_value = AC97_MASTER|AUREON_AC97_STEREO + .private_value = AC97_MASTER|AUREON_AC97_STEREO, + .tlv = { .p = db_scale_ac97_master } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1673,11 +1722,14 @@ static struct snd_kcontrol_new universe_ac97_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "CD Playback Volume", .info = aureon_ac97_vol_info, .get = aureon_ac97_vol_get, .put = aureon_ac97_vol_put, - .private_value = AC97_AUX|AUREON_AC97_STEREO + .private_value = AC97_AUX|AUREON_AC97_STEREO, + .tlv = { .p = db_scale_ac97_gain } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1685,15 +1737,18 @@ static struct snd_kcontrol_new universe_ac97_controls[] __devinitdata = { .info = aureon_ac97_mute_info, .get = aureon_ac97_mute_get, .put = aureon_ac97_mute_put, - .private_value = AC97_CD, + .private_value = AC97_CD }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Phono Playback Volume", .info = aureon_ac97_vol_info, .get = aureon_ac97_vol_get, .put = aureon_ac97_vol_put, - .private_value = AC97_CD|AUREON_AC97_STEREO + .private_value = AC97_CD|AUREON_AC97_STEREO, + .tlv = { .p = db_scale_ac97_gain } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1705,11 +1760,14 @@ static struct snd_kcontrol_new universe_ac97_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Line Playback Volume", .info = aureon_ac97_vol_info, .get = aureon_ac97_vol_get, .put = aureon_ac97_vol_put, - .private_value = AC97_LINE|AUREON_AC97_STEREO + .private_value = AC97_LINE|AUREON_AC97_STEREO, + .tlv = { .p = db_scale_ac97_gain } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1721,11 +1779,14 @@ static struct snd_kcontrol_new universe_ac97_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Mic Playback Volume", .info = aureon_ac97_vol_info, .get = aureon_ac97_vol_get, .put = aureon_ac97_vol_put, - .private_value = AC97_MIC + .private_value = AC97_MIC, + .tlv = { .p = db_scale_ac97_gain } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1744,11 +1805,14 @@ static struct snd_kcontrol_new universe_ac97_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Aux Playback Volume", .info = aureon_ac97_vol_info, .get = aureon_ac97_vol_get, .put = aureon_ac97_vol_put, - .private_value = AC97_VIDEO|AUREON_AC97_STEREO + .private_value = AC97_VIDEO|AUREON_AC97_STEREO, + .tlv = { .p = db_scale_ac97_gain } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, diff --git a/sound/pci/ice1712/phase.c b/sound/pci/ice1712/phase.c index 502da1c..e08d73f 100644 --- a/sound/pci/ice1712/phase.c +++ b/sound/pci/ice1712/phase.c @@ -46,6 +46,7 @@ #include "ice1712.h" #include "envy24ht.h" #include "phase.h" +#include /* WM8770 registers */ #define WM_DAC_ATTEN 0x00 /* DAC1-8 analog attenuation */ @@ -696,6 +697,9 @@ static int phase28_oversampling_put(struct snd_kcontrol *kcontrol, struct snd_ct return 0; } +static DECLARE_TLV_DB_SCALE(db_scale_wm_dac, -12700, 100, 1); +static DECLARE_TLV_DB_SCALE(db_scale_wm_pcm, -6400, 50, 1); + static struct snd_kcontrol_new phase28_dac_controls[] __devinitdata = { { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -706,10 +710,13 @@ static struct snd_kcontrol_new phase28_dac_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Master Playback Volume", .info = wm_master_vol_info, .get = wm_master_vol_get, - .put = wm_master_vol_put + .put = wm_master_vol_put, + .tlv = { .p = db_scale_wm_dac } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -721,11 +728,14 @@ static struct snd_kcontrol_new phase28_dac_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Front Playback Volume", .info = wm_vol_info, .get = wm_vol_get, .put = wm_vol_put, - .private_value = (2 << 8) | 0 + .private_value = (2 << 8) | 0, + .tlv = { .p = db_scale_wm_dac } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -737,11 +747,14 @@ static struct snd_kcontrol_new phase28_dac_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Rear Playback Volume", .info = wm_vol_info, .get = wm_vol_get, .put = wm_vol_put, - .private_value = (2 << 8) | 2 + .private_value = (2 << 8) | 2, + .tlv = { .p = db_scale_wm_dac } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -753,11 +766,14 @@ static struct snd_kcontrol_new phase28_dac_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Center Playback Volume", .info = wm_vol_info, .get = wm_vol_get, .put = wm_vol_put, - .private_value = (1 << 8) | 4 + .private_value = (1 << 8) | 4, + .tlv = { .p = db_scale_wm_dac } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -769,11 +785,14 @@ static struct snd_kcontrol_new phase28_dac_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "LFE Playback Volume", .info = wm_vol_info, .get = wm_vol_get, .put = wm_vol_put, - .private_value = (1 << 8) | 5 + .private_value = (1 << 8) | 5, + .tlv = { .p = db_scale_wm_dac } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -785,11 +804,14 @@ static struct snd_kcontrol_new phase28_dac_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Side Playback Volume", .info = wm_vol_info, .get = wm_vol_get, .put = wm_vol_put, - .private_value = (2 << 8) | 6 + .private_value = (2 << 8) | 6, + .tlv = { .p = db_scale_wm_dac } } }; @@ -803,10 +825,13 @@ static struct snd_kcontrol_new wm_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "PCM Playback Volume", .info = wm_pcm_vol_info, .get = wm_pcm_vol_get, - .put = wm_pcm_vol_put + .put = wm_pcm_vol_put, + .tlv = { .p = db_scale_wm_pcm } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, diff --git a/sound/pci/ice1712/pontis.c b/sound/pci/ice1712/pontis.c index 0efcad9..6c74c2d 100644 --- a/sound/pci/ice1712/pontis.c +++ b/sound/pci/ice1712/pontis.c @@ -31,6 +31,7 @@ #include #include +#include #include "ice1712.h" #include "envy24ht.h" @@ -564,6 +565,8 @@ static int pontis_gpio_data_put(struct snd_kcontrol *kcontrol, struct snd_ctl_el return changed; } +static DECLARE_TLV_DB_SCALE(db_scale_volume, -6400, 50, 1); + /* * mixers */ @@ -571,17 +574,23 @@ static int pontis_gpio_data_put(struct snd_kcontrol *kcontrol, struct snd_ctl_el static struct snd_kcontrol_new pontis_controls[] __devinitdata = { { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "PCM Playback Volume", .info = wm_dac_vol_info, .get = wm_dac_vol_get, .put = wm_dac_vol_put, + .tlv = { .p = db_scale_volume }, }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Capture Volume", .info = wm_adc_vol_info, .get = wm_adc_vol_get, .put = wm_adc_vol_put, + .tlv = { .p = db_scale_volume }, }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, diff --git a/sound/pci/ice1712/prodigy192.c b/sound/pci/ice1712/prodigy192.c index fdb5cb8..41b2605 100644 --- a/sound/pci/ice1712/prodigy192.c +++ b/sound/pci/ice1712/prodigy192.c @@ -35,6 +35,7 @@ #include "envy24ht.h" #include "prodigy192.h" #include "stac946x.h" +#include static inline void stac9460_put(struct snd_ice1712 *ice, int reg, unsigned char val) { @@ -356,6 +357,9 @@ static int aureon_oversampling_put(struct snd_kcontrol *kcontrol, struct snd_ctl } #endif +static DECLARE_TLV_DB_SCALE(db_scale_dac, -19125, 75, 0); +static DECLARE_TLV_DB_SCALE(db_scale_adc, 0, 150, 0); + /* * mixers */ @@ -368,14 +372,18 @@ static struct snd_kcontrol_new stac_controls[] __devinitdata = { .get = stac9460_dac_mute_get, .put = stac9460_dac_mute_put, .private_value = 1, + .tlv = { .p = db_scale_dac } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Master Playback Volume", .info = stac9460_dac_vol_info, .get = stac9460_dac_vol_get, .put = stac9460_dac_vol_put, .private_value = 1, + .tlv = { .p = db_scale_dac } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -387,11 +395,14 @@ static struct snd_kcontrol_new stac_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "DAC Volume", .count = 6, .info = stac9460_dac_vol_info, .get = stac9460_dac_vol_get, .put = stac9460_dac_vol_put, + .tlv = { .p = db_scale_dac } }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -404,11 +415,14 @@ static struct snd_kcontrol_new stac_controls[] __devinitdata = { }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "ADC Volume", .count = 1, .info = stac9460_adc_vol_info, .get = stac9460_adc_vol_get, .put = stac9460_adc_vol_put, + .tlv = { .p = db_scale_adc } }, #if 0 { -- cgit v0.10.2 From 929861c669a443cf667ec0d80ac73a567ed4543c Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 31 Aug 2006 16:55:40 +0200 Subject: [ALSA] hda-intel - Remove volatile Removed volatile from the position buffer pointer. Also, use synchronize_irq() instead of unreliable msleep(1) in the driver remove callback. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index c9ae9f7..d56ea21 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -255,7 +255,7 @@ enum { struct azx_dev { u32 *bdl; /* virtual address of the BDL */ dma_addr_t bdl_addr; /* physical address of the BDL */ - volatile u32 *posbuf; /* position buffer pointer */ + u32 *posbuf; /* position buffer pointer */ unsigned int bufsize; /* size of the play buffer in bytes */ unsigned int fragsize; /* size of each period in bytes */ @@ -1197,7 +1197,7 @@ static snd_pcm_uframes_t azx_pcm_pointer(struct snd_pcm_substream *substream) if (chip->position_fix == POS_FIX_POSBUF || chip->position_fix == POS_FIX_AUTO) { /* use the position buffer */ - pos = *azx_dev->posbuf; + pos = le32_to_cpu(*azx_dev->posbuf); if (chip->position_fix == POS_FIX_AUTO && azx_dev->period_intr == 1 && ! pos) { printk(KERN_WARNING @@ -1345,7 +1345,7 @@ static int __devinit azx_init_stream(struct azx *chip) struct azx_dev *azx_dev = &chip->azx_dev[i]; azx_dev->bdl = (u32 *)(chip->bdl.area + off); azx_dev->bdl_addr = chip->bdl.addr + off; - azx_dev->posbuf = (volatile u32 *)(chip->posbuf.area + i * 8); + azx_dev->posbuf = (u32 __iomem *)(chip->posbuf.area + i * 8); /* offset: SDI0=0x80, SDI1=0xa0, ... SDO3=0x160 */ azx_dev->sd_addr = chip->remap_addr + (0x20 * i + 0x80); /* int mask: SDI0=0x01, SDI1=0x02, ... SDO3=0x80 */ @@ -1417,8 +1417,7 @@ static int azx_free(struct azx *chip) azx_writel(chip, DPLBASE, 0); azx_writel(chip, DPUBASE, 0); - /* wait a little for interrupts to finish */ - msleep(1); + synchronize_irq(chip->irq); } if (chip->irq >= 0) { -- cgit v0.10.2 From 927fc866025857c109219d4ed62d8c3cbc02713a Mon Sep 17 00:00:00 2001 From: Pavel Machek Date: Thu, 31 Aug 2006 17:03:43 +0200 Subject: [ALSA] sound/pci/hda/intel_hda: small cleanups Cleanup whitespace. Signed-off-by: Pavel Machek Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index d56ea21..cc50d13 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -274,8 +274,8 @@ struct azx_dev { /* for sanity check of position buffer */ unsigned int period_intr; - unsigned int opened: 1; - unsigned int running: 1; + unsigned int opened :1; + unsigned int running :1; }; /* CORB/RIRB */ @@ -333,9 +333,9 @@ struct azx { /* flags */ int position_fix; - unsigned int initialized: 1; - unsigned int single_cmd: 1; - unsigned int polling_mode: 1; + unsigned int initialized :1; + unsigned int single_cmd :1; + unsigned int polling_mode :1; }; /* driver types */ @@ -661,14 +661,14 @@ static int azx_reset(struct azx *chip) azx_writeb(chip, GCTL, azx_readb(chip, GCTL) | ICH6_GCTL_RESET); count = 50; - while (! azx_readb(chip, GCTL) && --count) + while (!azx_readb(chip, GCTL) && --count) msleep(1); - /* Brent Chartrand said to wait >= 540us for codecs to intialize */ + /* Brent Chartrand said to wait >= 540us for codecs to initialize */ msleep(1); /* check to see if controller is ready */ - if (! azx_readb(chip, GCTL)) { + if (!azx_readb(chip, GCTL)) { snd_printd("azx_reset: controller not ready!\n"); return -EBUSY; } @@ -677,7 +677,7 @@ static int azx_reset(struct azx *chip) azx_writel(chip, GCTL, azx_readl(chip, GCTL) | ICH6_GCTL_UREN); /* detect codecs */ - if (! chip->codec_mask) { + if (!chip->codec_mask) { chip->codec_mask = azx_readw(chip, STATESTS); snd_printdd("codec_mask = 0x%x\n", chip->codec_mask); } @@ -785,7 +785,7 @@ static void azx_init_chip(struct azx *chip) azx_int_enable(chip); /* initialize the codec command I/O */ - if (! chip->single_cmd) + if (!chip->single_cmd) azx_init_cmd_io(chip); /* program the position buffer */ @@ -813,7 +813,7 @@ static void azx_init_chip(struct azx *chip) /* * interrupt handler */ -static irqreturn_t azx_interrupt(int irq, void* dev_id, struct pt_regs *regs) +static irqreturn_t azx_interrupt(int irq, void *dev_id, struct pt_regs *regs) { struct azx *chip = dev_id; struct azx_dev *azx_dev; @@ -1018,8 +1018,9 @@ static struct snd_pcm_hardware azx_pcm_hw = { .info = (SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_INTERLEAVED | SNDRV_PCM_INFO_BLOCK_TRANSFER | SNDRV_PCM_INFO_MMAP_VALID | - SNDRV_PCM_INFO_PAUSE /*|*/ - /*SNDRV_PCM_INFO_RESUME*/), + /* No full-resume yet implemented */ + /* SNDRV_PCM_INFO_RESUME |*/ + SNDRV_PCM_INFO_PAUSE), .formats = SNDRV_PCM_FMTBIT_S16_LE, .rates = SNDRV_PCM_RATE_48000, .rate_min = 48000, @@ -1454,19 +1455,19 @@ static int __devinit azx_create(struct snd_card *card, struct pci_dev *pci, struct azx **rchip) { struct azx *chip; - int err = 0; + int err; static struct snd_device_ops ops = { .dev_free = azx_dev_free, }; *rchip = NULL; - if ((err = pci_enable_device(pci)) < 0) + err = pci_enable_device(pci); + if (err < 0) return err; chip = kzalloc(sizeof(*chip), GFP_KERNEL); - - if (NULL == chip) { + if (!chip) { snd_printk(KERN_ERR SFX "cannot allocate chip\n"); pci_disable_device(pci); return -ENOMEM; @@ -1492,13 +1493,14 @@ static int __devinit azx_create(struct snd_card *card, struct pci_dev *pci, } #endif - if ((err = pci_request_regions(pci, "ICH HD audio")) < 0) { + err = pci_request_regions(pci, "ICH HD audio"); + if (err < 0) { kfree(chip); pci_disable_device(pci); return err; } - chip->addr = pci_resource_start(pci,0); + chip->addr = pci_resource_start(pci, 0); chip->remap_addr = ioremap_nocache(chip->addr, pci_resource_len(pci,0)); if (chip->remap_addr == NULL) { snd_printk(KERN_ERR SFX "ioremap error\n"); @@ -1542,7 +1544,7 @@ static int __devinit azx_create(struct snd_card *card, struct pci_dev *pci, } chip->num_streams = chip->playback_streams + chip->capture_streams; chip->azx_dev = kcalloc(chip->num_streams, sizeof(*chip->azx_dev), GFP_KERNEL); - if (! chip->azx_dev) { + if (!chip->azx_dev) { snd_printk(KERN_ERR "cannot malloc azx_dev\n"); goto errout; } @@ -1573,7 +1575,7 @@ static int __devinit azx_create(struct snd_card *card, struct pci_dev *pci, chip->initialized = 1; /* codec detection */ - if (! chip->codec_mask) { + if (!chip->codec_mask) { snd_printk(KERN_ERR SFX "no codecs found!\n"); err = -ENODEV; goto errout; @@ -1600,16 +1602,16 @@ static int __devinit azx_probe(struct pci_dev *pci, const struct pci_device_id * { struct snd_card *card; struct azx *chip; - int err = 0; + int err; card = snd_card_new(index, id, THIS_MODULE, 0); - if (NULL == card) { + if (!card) { snd_printk(KERN_ERR SFX "Error creating card!\n"); return -ENOMEM; } - if ((err = azx_create(card, pci, pci_id->driver_data, - &chip)) < 0) { + err = azx_create(card, pci, pci_id->driver_data, &chip); + if (err < 0) { snd_card_free(card); return err; } -- cgit v0.10.2 From 2fd53a7e9b1392f9cc3002a24f3c13b2796e70c3 Mon Sep 17 00:00:00 2001 From: Andreas Schwab Date: Fri, 1 Sep 2006 17:15:36 +0200 Subject: [ALSA] [PPC,SOUND] Fix audio gpio state detection When booting with line out or headphone plugged, you won't hear anything. The problem is that after reset all channels are muted, but the actual value of the gpio port doesn't exactly match the active_val settings as expected by check_audio_gpio. For example, the line_mute port is set to 7, but check_audio_gpio would expect 0xd or 0xf, thus its return value indicates that it is not active, even though it is. AFAICS only looking at the low bit is enough to determine whether the port is active. Signed-off-by: Andreas Schwab Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/ppc/tumbler.c b/sound/ppc/tumbler.c index 6ae2d5b..cdff53e 100644 --- a/sound/ppc/tumbler.c +++ b/sound/ppc/tumbler.c @@ -190,7 +190,7 @@ static int check_audio_gpio(struct pmac_gpio *gp) ret = do_gpio_read(gp); - return (ret & 0xd) == (gp->active_val & 0xd); + return (ret & 0x1) == (gp->active_val & 0x1); } static int read_audio_gpio(struct pmac_gpio *gp) @@ -198,7 +198,8 @@ static int read_audio_gpio(struct pmac_gpio *gp) int ret; if (! gp->addr) return 0; - ret = ((do_gpio_read(gp) & 0x02) !=0); + ret = do_gpio_read(gp); + ret = (ret & 0x02) !=0; return ret == gp->active_state; } -- cgit v0.10.2 From 35a49934a7180fd80fb0bb3777d125dd939df50e Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 1 Sep 2006 17:09:44 +0200 Subject: [ALSA] Add dB scale information to mixart driver Added the dB scale information to mixart driver. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/mixart/mixart_mixer.c b/sound/pci/mixart/mixart_mixer.c index ed47b73..13de0f7 100644 --- a/sound/pci/mixart/mixart_mixer.c +++ b/sound/pci/mixart/mixart_mixer.c @@ -31,6 +31,7 @@ #include "mixart_core.h" #include "mixart_hwdep.h" #include +#include #include "mixart_mixer.h" static u32 mixart_analog_level[256] = { @@ -388,12 +389,17 @@ static int mixart_analog_vol_put(struct snd_kcontrol *kcontrol, struct snd_ctl_e return changed; } +static DECLARE_TLV_DB_SCALE(db_scale_analog, -9600, 50, 0); + static struct snd_kcontrol_new mixart_control_analog_level = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), /* name will be filled later */ .info = mixart_analog_vol_info, .get = mixart_analog_vol_get, .put = mixart_analog_vol_put, + .tlv = { .p = db_scale_analog }, }; /* shared */ @@ -866,14 +872,19 @@ static int mixart_pcm_vol_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem return changed; } +static DECLARE_TLV_DB_SCALE(db_scale_digital, -10950, 50, 0); + static struct snd_kcontrol_new snd_mixart_pcm_vol = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), /* name will be filled later */ /* count will be filled later */ .info = mixart_digital_vol_info, /* shared */ .get = mixart_pcm_vol_get, .put = mixart_pcm_vol_put, + .tlv = { .p = db_scale_digital }, }; @@ -984,10 +995,13 @@ static int mixart_monitor_vol_put(struct snd_kcontrol *kcontrol, struct snd_ctl_ static struct snd_kcontrol_new mixart_control_monitor_vol = { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Monitoring Volume", .info = mixart_digital_vol_info, /* shared */ .get = mixart_monitor_vol_get, .put = mixart_monitor_vol_put, + .tlv = { .p = db_scale_digital }, }; /* -- cgit v0.10.2 From 93ed150375187ae7917ed1e3b9b830b9d4065bad Mon Sep 17 00:00:00 2001 From: Tobin Davis Date: Fri, 1 Sep 2006 21:03:12 +0200 Subject: [ALSA] hda-codec - Add 5 stack audio support for Intel 965 systems This patch renames the 965_2112 function ids to 965_3ST, and adds functional support for 965_5ST (5 stack 7.1 surround). Signed-off-by: Tobin Davis Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 48d3bdf..a788dd7 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -859,6 +859,16 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. laptop-dig ditto with SPDIF auto auto-config reading BIOS (default) + STAC9200/9205/9220/9221/9254 + ref Reference board + 3stack D945 3stack + 5stack D945 5stack + SPDIF + + STAC9227/9228/9229/927x + ref Reference board + 3stack D965 3stack + 5stack D965 5stack + SPDIF + STAC9872 vaio Setup for VAIO FE550G/SZ110 vaio-ar Setup for VAIO AR diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index 8d5ad7c..8716903 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -42,9 +42,10 @@ #define STAC_D945GTP3 1 #define STAC_D945GTP5 2 #define STAC_MACMINI 3 -#define STAC_D965_2112 4 -#define STAC_D965_284B 5 -#define STAC_922X_MODELS 6 /* number of 922x models */ +#define STAC_922X_MODELS 4 /* number of 922x models */ +#define STAC_D965_3ST 4 +#define STAC_D965_5ST 5 +#define STAC_927X_MODELS 6 /* number of 922x models */ struct sigmatel_spec { struct snd_kcontrol_new *mixers[4]; @@ -111,24 +112,10 @@ static hda_nid_t stac922x_adc_nids[2] = { 0x06, 0x07, }; -static hda_nid_t stac9227_adc_nids[2] = { - 0x07, 0x08, -}; - -#if 0 -static hda_nid_t d965_2112_dac_nids[3] = { - 0x02, 0x03, 0x05, -}; -#endif - static hda_nid_t stac922x_mux_nids[2] = { 0x12, 0x13, }; -static hda_nid_t stac9227_mux_nids[2] = { - 0x15, 0x16, -}; - static hda_nid_t stac927x_adc_nids[3] = { 0x07, 0x08, 0x09 }; @@ -146,7 +133,8 @@ static hda_nid_t stac9205_mux_nids[2] = { }; static hda_nid_t stac9200_pin_nids[8] = { - 0x08, 0x09, 0x0d, 0x0e, 0x0f, 0x10, 0x11, 0x12, + 0x08, 0x09, 0x0d, 0x0e, + 0x0f, 0x10, 0x11, 0x12, }; static hda_nid_t stac922x_pin_nids[10] = { @@ -206,17 +194,9 @@ static struct hda_verb stac922x_core_init[] = { {} }; -static struct hda_verb stac9227_core_init[] = { - /* set master volume and direct control */ - { 0x16, AC_VERB_SET_VOLUME_KNOB_CONTROL, 0xff}, - /* unmute node 0x1b */ - { 0x1b, AC_VERB_SET_AMP_GAIN_MUTE, 0xb000}, - {} -}; - -static struct hda_verb d965_2112_core_init[] = { +static struct hda_verb d965_core_init[] = { /* set master volume and direct control */ - { 0x16, AC_VERB_SET_VOLUME_KNOB_CONTROL, 0xff}, + { 0x24, AC_VERB_SET_VOLUME_KNOB_CONTROL, 0xff}, /* unmute node 0x1b */ { 0x1b, AC_VERB_SET_AMP_GAIN_MUTE, 0xb000}, /* select node 0x03 as DAC */ @@ -386,6 +366,8 @@ static unsigned int *stac922x_brd_tbl[STAC_922X_MODELS] = { }; static struct hda_board_config stac922x_cfg_tbl[] = { + { .modelname = "5stack", .config = STAC_D945GTP5 }, + { .modelname = "3stack", .config = STAC_D945GTP3 }, { .modelname = "ref", .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2668, /* DFI LanParty */ @@ -471,99 +453,127 @@ static struct hda_board_config stac922x_cfg_tbl[] = { { .pci_subvendor = 0x8384, .pci_subdevice = 0x7680, .config = STAC_MACMINI }, /* Apple Mac Mini (early 2006) */ - { .pci_subvendor = PCI_VENDOR_ID_INTEL, - .pci_subdevice = 0x2112, - .config = STAC_D965_2112 }, - { .pci_subvendor = PCI_VENDOR_ID_INTEL, - .pci_subdevice = 0x284b, - .config = STAC_D965_284B }, {} /* terminator */ }; static unsigned int ref927x_pin_configs[14] = { - 0x01813122, 0x01a19021, 0x01014010, 0x01016011, - 0x01012012, 0x01011014, 0x40000100, 0x40000100, - 0x40000100, 0x40000100, 0x40000100, 0x01441030, - 0x01c41030, 0x40000100, + 0x02214020, 0x02a19080, 0x0181304e, 0x01014010, + 0x01a19040, 0x01011012, 0x01016011, 0x0101201f, + 0x183301f0, 0x18a001f0, 0x18a001f0, 0x01442070, + 0x01c42190, 0x40000100, }; -static unsigned int d965_2112_pin_configs[14] = { +static unsigned int d965_3st_pin_configs[14] = { 0x0221401f, 0x02a19120, 0x40000100, 0x01014011, 0x01a19021, 0x01813024, 0x40000100, 0x40000100, 0x40000100, 0x40000100, 0x40000100, 0x40000100, 0x40000100, 0x40000100 }; -static unsigned int *stac927x_brd_tbl[] = { +static unsigned int d965_5st_pin_configs[14] = { + 0x02214020, 0x02a19080, 0x0181304e, 0x01014010, + 0x01a19040, 0x01011012, 0x01016011, 0x40000100, + 0x40000100, 0x40000100, 0x40000100, 0x01442070, + 0x40000100, 0x40000100 +}; + +static unsigned int *stac927x_brd_tbl[STAC_927X_MODELS] = { [STAC_REF] = ref927x_pin_configs, - [STAC_D965_2112] = d965_2112_pin_configs, + [STAC_D965_3ST] = d965_3st_pin_configs, + [STAC_D965_5ST] = d965_5st_pin_configs, }; static struct hda_board_config stac927x_cfg_tbl[] = { + { .modelname = "5stack", .config = STAC_D965_5ST }, + { .modelname = "3stack", .config = STAC_D965_3ST }, { .modelname = "ref", .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2668, /* DFI LanParty */ .config = STAC_REF }, /* SigmaTel reference board */ - /* SigmaTel 9227 reference board */ - { .pci_subvendor = PCI_VENDOR_ID_INTEL, - .pci_subdevice = 0x284b, - .config = STAC_D965_284B }, /* Intel 946 based systems */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x3d01, - .config = STAC_D965_2112 }, /* D946 configuration */ + .config = STAC_D965_3ST }, /* D946 configuration */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0xa301, - .config = STAC_D965_2112 }, /* Intel D946GZT - 3 stack */ - /* 965 based systems */ + .config = STAC_D965_3ST }, /* Intel D946GZT - 3 stack */ + /* 965 based 3 stack systems */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2116, - .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + .config = STAC_D965_3ST }, /* Intel D965 3Stack config */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2115, - .config = STAC_D965_2112 }, /* Intel DQ965WC - 3 Stack */ + .config = STAC_D965_3ST }, /* Intel DQ965WC - 3 Stack */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2114, - .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + .config = STAC_D965_3ST }, /* Intel D965 3Stack config */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2113, - .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + .config = STAC_D965_3ST }, /* Intel D965 3Stack config */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2112, - .config = STAC_D965_2112 }, /* Intel DG965MS - 3 Stack */ + .config = STAC_D965_3ST }, /* Intel DG965MS - 3 Stack */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2111, - .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + .config = STAC_D965_3ST }, /* Intel D965 3Stack config */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2110, - .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + .config = STAC_D965_3ST }, /* Intel D965 3Stack config */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2009, - .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + .config = STAC_D965_3ST }, /* Intel D965 3Stack config */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2008, - .config = STAC_D965_2112 }, /* Intel DQ965GF - 3 Stack */ + .config = STAC_D965_3ST }, /* Intel DQ965GF - 3 Stack */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2007, - .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + .config = STAC_D965_3ST }, /* Intel D965 3Stack config */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2006, - .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + .config = STAC_D965_3ST }, /* Intel D965 3Stack config */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2005, - .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + .config = STAC_D965_3ST }, /* Intel D965 3Stack config */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2004, - .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + .config = STAC_D965_3ST }, /* Intel D965 3Stack config */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2003, - .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + .config = STAC_D965_3ST }, /* Intel D965 3Stack config */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2002, - .config = STAC_D965_2112 }, /* Intel D965 3Stack config */ + .config = STAC_D965_3ST }, /* Intel D965 3Stack config */ { .pci_subvendor = PCI_VENDOR_ID_INTEL, .pci_subdevice = 0x2001, - .config = STAC_D965_2112 }, /* Intel DQ965GF - 3 Stackg */ + .config = STAC_D965_3ST }, /* Intel DQ965GF - 3 Stack */ + /* 965 based 5 stack systems */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2301, + .config = STAC_D965_5ST }, /* Intel DG965 - 5 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2302, + .config = STAC_D965_5ST }, /* Intel DG965 - 5 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2303, + .config = STAC_D965_5ST }, /* Intel DG965 - 5 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2304, + .config = STAC_D965_5ST }, /* Intel DG965 - 5 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2305, + .config = STAC_D965_5ST }, /* Intel DG965 - 5 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2501, + .config = STAC_D965_5ST }, /* Intel DG965MQ - 5 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2502, + .config = STAC_D965_5ST }, /* Intel DG965 - 5 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2503, + .config = STAC_D965_5ST }, /* Intel DG965 - 5 Stack */ + { .pci_subvendor = PCI_VENDOR_ID_INTEL, + .pci_subdevice = 0x2504, + .config = STAC_D965_5ST }, /* Intel DQ965GF - 5 Stack */ {} /* terminator */ }; @@ -1546,18 +1556,18 @@ static int patch_stac927x(struct hda_codec *codec) } switch (spec->board_config) { - case STAC_D965_2112: + case STAC_D965_3ST: spec->adc_nids = stac927x_adc_nids; spec->mux_nids = stac927x_mux_nids; spec->num_muxes = 3; - spec->init = d965_2112_core_init; + spec->init = d965_core_init; spec->mixer = stac9227_mixer; break; - case STAC_D965_284B: - spec->adc_nids = stac9227_adc_nids; - spec->mux_nids = stac9227_mux_nids; - spec->num_muxes = 2; - spec->init = stac9227_core_init; + case STAC_D965_5ST: + spec->adc_nids = stac927x_adc_nids; + spec->mux_nids = stac927x_mux_nids; + spec->num_muxes = 3; + spec->init = d965_core_init; spec->mixer = stac9227_mixer; break; default: -- cgit v0.10.2 From bd25b7cae1e763b292f359170e16bccd01c7ee5c Mon Sep 17 00:00:00 2001 From: Ville Syrjala Date: Mon, 4 Sep 2006 12:28:24 +0200 Subject: [ALSA] ac97: Fix AD1819 volume range AD1819 volume registers can hold extra bits which do not affect the actual volume. Add a res_table to the codec patch to fix the problem. PCM, line and mic volume were tested. Signed-off-by: Ville Syrjala Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ac97/ac97_patch.c b/sound/pci/ac97/ac97_patch.c index bdd7f89..9be4ceb 100644 --- a/sound/pci/ac97/ac97_patch.c +++ b/sound/pci/ac97/ac97_patch.c @@ -1395,6 +1395,17 @@ static void ad1888_resume(struct snd_ac97 *ac97) #endif +static const struct snd_ac97_res_table ad1819_restbl[] = { + { AC97_PHONE, 0x9f1f }, + { AC97_MIC, 0x9f1f }, + { AC97_LINE, 0x9f1f }, + { AC97_CD, 0x9f1f }, + { AC97_VIDEO, 0x9f1f }, + { AC97_AUX, 0x9f1f }, + { AC97_PCM, 0x9f1f }, + { } /* terminator */ +}; + int patch_ad1819(struct snd_ac97 * ac97) { unsigned short scfg; @@ -1402,6 +1413,7 @@ int patch_ad1819(struct snd_ac97 * ac97) // patch for Analog Devices scfg = snd_ac97_read(ac97, AC97_AD_SERIAL_CFG); snd_ac97_write_cache(ac97, AC97_AD_SERIAL_CFG, scfg | 0x7000); /* select all codecs */ + ac97->res_table = ad1819_restbl; return 0; } -- cgit v0.10.2 From 679e28eef835cbd30de78c2f80bf488cba1b7e40 Mon Sep 17 00:00:00 2001 From: Ville Syrjala Date: Mon, 4 Sep 2006 12:28:51 +0200 Subject: [ALSA] es1968: Fix hw volume Fix maestro2 hardware volume control. Tested on a Dell Inspiron 7000. Signed-off-by: Ville Syrjala Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/es1968.c b/sound/pci/es1968.c index 3c5ab7c..f3c4038 100644 --- a/sound/pci/es1968.c +++ b/sound/pci/es1968.c @@ -1905,7 +1905,7 @@ static void es1968_update_hw_volume(unsigned long private_data) /* Figure out which volume control button was pushed, based on differences from the default register values. */ - x = inb(chip->io_port + 0x1c); + x = inb(chip->io_port + 0x1c) & 0xee; /* Reset the volume control registers. */ outb(0x88, chip->io_port + 0x1c); outb(0x88, chip->io_port + 0x1d); @@ -1921,7 +1921,8 @@ static void es1968_update_hw_volume(unsigned long private_data) /* FIXME: we can't call snd_ac97_* functions since here is in tasklet. */ spin_lock_irqsave(&chip->ac97_lock, flags); val = chip->ac97->regs[AC97_MASTER]; - if (x & 1) { + switch (x) { + case 0x88: /* mute */ val ^= 0x8000; chip->ac97->regs[AC97_MASTER] = val; @@ -1929,26 +1930,31 @@ static void es1968_update_hw_volume(unsigned long private_data) outb(AC97_MASTER, chip->io_port + ESM_AC97_INDEX); snd_ctl_notify(chip->card, SNDRV_CTL_EVENT_MASK_VALUE, &chip->master_switch->id); - } else { - val &= 0x7fff; - if (((x>>1) & 7) > 4) { - /* volume up */ - if ((val & 0xff) > 0) - val--; - if ((val & 0xff00) > 0) - val -= 0x0100; - } else { - /* volume down */ - if ((val & 0xff) < 0x1f) - val++; - if ((val & 0xff00) < 0x1f00) - val += 0x0100; - } + break; + case 0xaa: + /* volume up */ + if ((val & 0x7f) > 0) + val--; + if ((val & 0x7f00) > 0) + val -= 0x0100; + chip->ac97->regs[AC97_MASTER] = val; + outw(val, chip->io_port + ESM_AC97_DATA); + outb(AC97_MASTER, chip->io_port + ESM_AC97_INDEX); + snd_ctl_notify(chip->card, SNDRV_CTL_EVENT_MASK_VALUE, + &chip->master_volume->id); + break; + case 0x66: + /* volume down */ + if ((val & 0x7f) < 0x1f) + val++; + if ((val & 0x7f00) < 0x1f00) + val += 0x0100; chip->ac97->regs[AC97_MASTER] = val; outw(val, chip->io_port + ESM_AC97_DATA); outb(AC97_MASTER, chip->io_port + ESM_AC97_INDEX); snd_ctl_notify(chip->card, SNDRV_CTL_EVENT_MASK_VALUE, &chip->master_volume->id); + break; } spin_unlock_irqrestore(&chip->ac97_lock, flags); } -- cgit v0.10.2 From ea543f1ee61bbfdf6cac4b79d66c7840d5b00037 Mon Sep 17 00:00:00 2001 From: Krzysztof Helt Date: Tue, 5 Sep 2006 20:25:05 +0200 Subject: [ALSA] sparc dbri: SMP fixes The dbri driver hangs when used in kernel compiled with SMP support due to inproper locking. The patch fixes it. Signed-off-by: Krzysztof Helt Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/sparc/dbri.c b/sound/sparc/dbri.c index 82d5e80..e4935fc 100644 --- a/sound/sparc/dbri.c +++ b/sound/sparc/dbri.c @@ -635,10 +635,16 @@ to send them to the DBRI. static void dbri_cmdwait(struct snd_dbri *dbri) { int maxloops = MAXLOOPS; + unsigned long flags; /* Delay if previous commands are still being processed */ - while ((--maxloops) > 0 && (sbus_readl(dbri->regs + REG0) & D_P)) + spin_lock_irqsave(&dbri->lock, flags); + while ((--maxloops) > 0 && (sbus_readl(dbri->regs + REG0) & D_P)) { + spin_unlock_irqrestore(&dbri->lock, flags); msleep_interruptible(1); + spin_lock_irqsave(&dbri->lock, flags); + } + spin_unlock_irqrestore(&dbri->lock, flags); if (maxloops == 0) { printk(KERN_ERR "DBRI: Chip never completed command buffer\n"); @@ -671,11 +677,12 @@ static s32 *dbri_cmdlock(struct snd_dbri * dbri, int len) * the last WAIT cmd and force DBRI to reread the cmd. * The JUMP cmd points to the new cmd string. * It also releases the cmdlock spinlock. + * + * Lock must not be held before calling this. */ static void dbri_cmdsend(struct snd_dbri * dbri, s32 * cmd,int len) { s32 tmp, addr; - unsigned long flags; static int wait_id = 0; wait_id++; @@ -706,12 +713,10 @@ static void dbri_cmdsend(struct snd_dbri * dbri, s32 * cmd,int len) } #endif - spin_lock_irqsave(&dbri->lock, flags); /* Reread the last command */ tmp = sbus_readl(dbri->regs + REG0); tmp |= D_P; sbus_writel(tmp, dbri->regs + REG0); - spin_unlock_irqrestore(&dbri->lock, flags); dbri->cmdptr = cmd; spin_unlock(&dbri->cmdlock); @@ -777,9 +782,9 @@ static void dbri_initialize(struct snd_dbri * dbri) dma_addr = dbri->dma_dvma + dbri_dma_off(cmd, 0); sbus_writel(dma_addr, dbri->regs + REG8); spin_unlock(&dbri->cmdlock); - dbri_cmdwait(dbri); spin_unlock_irqrestore(&dbri->lock, flags); + dbri_cmdwait(dbri); } /* @@ -840,6 +845,9 @@ static void reset_pipe(struct snd_dbri * dbri, int pipe) dbri->pipes[pipe].first_desc = -1; } +/* + * Lock must be held before calling this. + */ static void setup_pipe(struct snd_dbri * dbri, int pipe, int sdp) { if (pipe < 0 || pipe > DBRI_MAX_PIPE) { @@ -866,6 +874,9 @@ static void setup_pipe(struct snd_dbri * dbri, int pipe, int sdp) reset_pipe(dbri, pipe); } +/* + * Lock must be held before calling this. + */ static void link_time_slot(struct snd_dbri * dbri, int pipe, int prevpipe, int nextpipe, int length, int cycle) @@ -920,6 +931,10 @@ static void link_time_slot(struct snd_dbri * dbri, int pipe, dbri_cmdsend(dbri, cmd, 4); } +#if 0 +/* + * Lock must be held before calling this. + */ static void unlink_time_slot(struct snd_dbri * dbri, int pipe, enum in_or_out direction, int prevpipe, int nextpipe) @@ -952,6 +967,7 @@ static void unlink_time_slot(struct snd_dbri * dbri, int pipe, dbri_cmdsend(dbri, cmd, 4); } +#endif /* xmit_fixed() / recv_fixed() * @@ -965,11 +981,14 @@ static void unlink_time_slot(struct snd_dbri * dbri, int pipe, * the actual time slot is. The interrupt handler takes care of bit * ordering and alignment. An 8-bit time slot will always end up * in the low-order 8 bits, filled either MSB-first or LSB-first, - * depending on the settings passed to setup_pipe() + * depending on the settings passed to setup_pipe(). + * + * Lock must not be held before calling it. */ static void xmit_fixed(struct snd_dbri * dbri, int pipe, unsigned int data) { s32 *cmd; + unsigned long flags; if (pipe < 16 || pipe > DBRI_MAX_PIPE) { printk(KERN_ERR "DBRI: xmit_fixed: Illegal pipe number\n"); @@ -1002,8 +1021,11 @@ static void xmit_fixed(struct snd_dbri * dbri, int pipe, unsigned int data) *(cmd++) = data; *(cmd++) = DBRI_CMD(D_PAUSE, 0, 0); + spin_lock_irqsave(&dbri->lock, flags); dbri_cmdsend(dbri, cmd, 3); + spin_unlock_irqrestore(&dbri->lock, flags); dbri_cmdwait(dbri); + } static void recv_fixed(struct snd_dbri * dbri, int pipe, volatile __u32 * ptr) @@ -1039,6 +1061,8 @@ static void recv_fixed(struct snd_dbri * dbri, int pipe, volatile __u32 * ptr) * be spread across multiple descriptors. * * All descriptors create a ring buffer. + * + * Lock must be held before calling this. */ static int setup_descs(struct snd_dbri * dbri, int streamno, unsigned int period) { @@ -1186,6 +1210,9 @@ multiplexed serial interface which the DBRI can operate in either master enum master_or_slave { CHImaster, CHIslave }; +/* + * Lock must not be held before calling it. + */ static void reset_chi(struct snd_dbri * dbri, enum master_or_slave master_or_slave, int bits_per_frame) { @@ -1258,9 +1285,14 @@ static void reset_chi(struct snd_dbri * dbri, enum master_or_slave master_or_sla In the standard SPARC audio configuration, the CS4215 codec is attached to the DBRI via the CHI interface and few of the DBRI's PIO pins. + * Lock must not be held before calling it. + */ static void cs4215_setup_pipes(struct snd_dbri * dbri) { + unsigned long flags; + + spin_lock_irqsave(&dbri->lock, flags); /* * Data mode: * Pipe 4: Send timeslots 1-4 (audio data) @@ -1284,6 +1316,7 @@ static void cs4215_setup_pipes(struct snd_dbri * dbri) setup_pipe(dbri, 17, D_SDP_FIXED | D_SDP_TO_SER | D_SDP_MSB); setup_pipe(dbri, 18, D_SDP_FIXED | D_SDP_FROM_SER | D_SDP_MSB); setup_pipe(dbri, 19, D_SDP_FIXED | D_SDP_FROM_SER | D_SDP_MSB); + spin_unlock_irqrestore(&dbri->lock, flags); dbri_cmdwait(dbri); } @@ -1358,6 +1391,7 @@ static void cs4215_open(struct snd_dbri * dbri) { int data_width; u32 tmp; + unsigned long flags; dprintk(D_MM, "cs4215_open: %d channels, %d bits\n", dbri->mm.channels, dbri->mm.precision); @@ -1382,6 +1416,7 @@ static void cs4215_open(struct snd_dbri * dbri) * bits. The CS4215, it seems, observes TSIN (the delayed signal) * even if it's the CHI master. Don't ask me... */ + spin_lock_irqsave(&dbri->lock, flags); tmp = sbus_readl(dbri->regs + REG0); tmp &= ~(D_C); /* Disable CHI */ sbus_writel(tmp, dbri->regs + REG0); @@ -1409,6 +1444,7 @@ static void cs4215_open(struct snd_dbri * dbri) tmp = sbus_readl(dbri->regs + REG0); tmp |= D_C; /* Enable CHI */ sbus_writel(tmp, dbri->regs + REG0); + spin_unlock_irqrestore(&dbri->lock, flags); cs4215_setdata(dbri, 0); } @@ -1420,6 +1456,7 @@ static int cs4215_setctrl(struct snd_dbri * dbri) { int i, val; u32 tmp; + unsigned long flags; /* FIXME - let the CPU do something useful during these delays */ @@ -1456,6 +1493,7 @@ static int cs4215_setctrl(struct snd_dbri * dbri) * done in hardware by a TI 248 that delays the DBRI->4215 * frame sync signal by eight clock cycles. Anybody know why? */ + spin_lock_irqsave(&dbri->lock, flags); tmp = sbus_readl(dbri->regs + REG0); tmp &= ~D_C; /* Disable CHI */ sbus_writel(tmp, dbri->regs + REG0); @@ -1472,14 +1510,17 @@ static int cs4215_setctrl(struct snd_dbri * dbri) link_time_slot(dbri, 17, 16, 16, 32, dbri->mm.offset); link_time_slot(dbri, 18, 16, 16, 8, dbri->mm.offset); link_time_slot(dbri, 19, 18, 16, 8, dbri->mm.offset + 48); + spin_unlock_irqrestore(&dbri->lock, flags); /* Wait for the chip to echo back CLB (Control Latch Bit) as zero */ dbri->mm.ctrl[0] &= ~CS4215_CLB; xmit_fixed(dbri, 17, *(int *)dbri->mm.ctrl); + spin_lock_irqsave(&dbri->lock, flags); tmp = sbus_readl(dbri->regs + REG0); tmp |= D_C; /* Enable CHI */ sbus_writel(tmp, dbri->regs + REG0); + spin_unlock_irqrestore(&dbri->lock, flags); for (i = 10; ((dbri->mm.status & 0xe4) != 0x20); --i) { msleep_interruptible(1); @@ -1688,6 +1729,7 @@ static void xmit_descs(struct snd_dbri *dbri) dbri->pipes[info->pipe].desc = first_td; } } + spin_unlock_irqrestore(&dbri->lock, flags); } @@ -2093,7 +2135,6 @@ static int snd_dbri_prepare(struct snd_pcm_substream *substream) { struct snd_dbri *dbri = snd_pcm_substream_chip(substream); struct dbri_streaminfo *info = DBRI_STREAM(dbri, substream); - struct snd_pcm_runtime *runtime = substream->runtime; int ret; info->size = snd_pcm_lib_buffer_bytes(substream); @@ -2232,7 +2273,6 @@ static int snd_cs4215_put_volume(struct snd_kcontrol *kcontrol, { struct snd_dbri *dbri = snd_kcontrol_chip(kcontrol); struct dbri_streaminfo *info = &dbri->stream_info[kcontrol->private_value]; - unsigned long flags; int changed = 0; if (info->left_gain != ucontrol->value.integer.value[0]) { @@ -2247,13 +2287,9 @@ static int snd_cs4215_put_volume(struct snd_kcontrol *kcontrol, /* First mute outputs, and wait 1/8000 sec (125 us) * to make sure this takes. This avoids clicking noises. */ - spin_lock_irqsave(&dbri->lock, flags); - cs4215_setdata(dbri, 1); udelay(125); cs4215_setdata(dbri, 0); - - spin_unlock_irqrestore(&dbri->lock, flags); } return changed; } @@ -2300,7 +2336,6 @@ static int snd_cs4215_put_single(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { struct snd_dbri *dbri = snd_kcontrol_chip(kcontrol); - unsigned long flags; int elem = kcontrol->private_value & 0xff; int shift = (kcontrol->private_value >> 8) & 0xff; int mask = (kcontrol->private_value >> 16) & 0xff; @@ -2333,13 +2368,9 @@ static int snd_cs4215_put_single(struct snd_kcontrol *kcontrol, /* First mute outputs, and wait 1/8000 sec (125 us) * to make sure this takes. This avoids clicking noises. */ - spin_lock_irqsave(&dbri->lock, flags); - cs4215_setdata(dbri, 1); udelay(125); cs4215_setdata(dbri, 0); - - spin_unlock_irqrestore(&dbri->lock, flags); } return changed; } -- cgit v0.10.2 From 311e70a4741c736795da082da7290164d9cf3726 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 6 Sep 2006 12:13:37 +0200 Subject: [ALSA] hdsp - Fix auto-updating of firmware Fixed the auto-updating of firmware if the breakout box was switched off/on. The firmware binary itself was already cached but it wasn't loaded properly. Also, request_firmware() is issued if the box was with firmware at module loading time but later it's erased. The auto-update is triggered at each PCM action (open, prepare, etc) and at opening proc files. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/rme9652/hdsp.c b/sound/pci/rme9652/hdsp.c index e5a52da..d3e07de 100644 --- a/sound/pci/rme9652/hdsp.c +++ b/sound/pci/rme9652/hdsp.c @@ -726,22 +726,36 @@ static int hdsp_get_iobox_version (struct hdsp *hdsp) } -static int hdsp_check_for_firmware (struct hdsp *hdsp, int show_err) +#ifdef HDSP_FW_LOADER +static int __devinit hdsp_request_fw_loader(struct hdsp *hdsp); +#endif + +static int hdsp_check_for_firmware (struct hdsp *hdsp, int load_on_demand) { - if (hdsp->io_type == H9652 || hdsp->io_type == H9632) return 0; + if (hdsp->io_type == H9652 || hdsp->io_type == H9632) + return 0; if ((hdsp_read (hdsp, HDSP_statusRegister) & HDSP_DllError) != 0) { - snd_printk(KERN_ERR "Hammerfall-DSP: firmware not present.\n"); hdsp->state &= ~HDSP_FirmwareLoaded; - if (! show_err) + if (! load_on_demand) return -EIO; + snd_printk(KERN_ERR "Hammerfall-DSP: firmware not present.\n"); /* try to load firmware */ - if (hdsp->state & HDSP_FirmwareCached) { - if (snd_hdsp_load_firmware_from_cache(hdsp) != 0) - snd_printk(KERN_ERR "Hammerfall-DSP: Firmware loading from cache failed, please upload manually.\n"); - } else { - snd_printk(KERN_ERR "Hammerfall-DSP: No firmware loaded nor cached, please upload firmware.\n"); + if (! (hdsp->state & HDSP_FirmwareCached)) { +#ifdef HDSP_FW_LOADER + if (! hdsp_request_fw_loader(hdsp)) + return 0; +#endif + snd_printk(KERN_ERR + "Hammerfall-DSP: No firmware loaded nor " + "cached, please upload firmware.\n"); + return -EIO; + } + if (snd_hdsp_load_firmware_from_cache(hdsp) != 0) { + snd_printk(KERN_ERR + "Hammerfall-DSP: Firmware loading from " + "cache failed, please upload manually.\n"); + return -EIO; } - return -EIO; } return 0; } @@ -3181,8 +3195,16 @@ snd_hdsp_proc_read(struct snd_info_entry *entry, struct snd_info_buffer *buffer) return; } } else { - snd_iprintf(buffer, "No firmware loaded nor cached, please upload firmware.\n"); - return; + int err = -EINVAL; +#ifdef HDSP_FW_LOADER + err = hdsp_request_fw_loader(hdsp); +#endif + if (err < 0) { + snd_iprintf(buffer, + "No firmware loaded nor cached, " + "please upload firmware.\n"); + return; + } } } @@ -3851,7 +3873,7 @@ static int snd_hdsp_trigger(struct snd_pcm_substream *substream, int cmd) if (hdsp_check_for_iobox (hdsp)) return -EIO; - if (hdsp_check_for_firmware(hdsp, 1)) + if (hdsp_check_for_firmware(hdsp, 0)) /* no auto-loading in trigger */ return -EIO; spin_lock(&hdsp->lock); -- cgit v0.10.2 From 55a29af5ed5d914f017e6a7c613a4d7cc34f82d9 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 6 Sep 2006 12:15:34 +0200 Subject: [ALSA] Add definition of TLV dB range compound Added the definition of TLV dB range compound. It contains one or more dB-range or linear-volume TLV entries with min/max ranges. Used for volume controls with non-linear curves. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/include/sound/tlv.h b/include/sound/tlv.h index 7905841..d93a96b 100644 --- a/include/sound/tlv.h +++ b/include/sound/tlv.h @@ -34,19 +34,26 @@ #define SNDRV_CTL_TLVT_CONTAINER 0 /* one level down - group of TLVs */ #define SNDRV_CTL_TLVT_DB_SCALE 1 /* dB scale */ #define SNDRV_CTL_TLVT_DB_LINEAR 2 /* linear volume */ +#define SNDRV_CTL_TLVT_DB_RANGE 3 /* dB range container */ +#define TLV_DB_SCALE_ITEM(min, step, mute) \ + SNDRV_CTL_TLVT_DB_SCALE, 2 * sizeof(unsigned int), \ + (min), ((step) & 0xffff) | ((mute) ? 0x10000 : 0) #define DECLARE_TLV_DB_SCALE(name, min, step, mute) \ -unsigned int name[] = { \ - SNDRV_CTL_TLVT_DB_SCALE, 2 * sizeof(unsigned int), \ - (min), ((step) & 0xffff) | ((mute) ? 0x10000 : 0) \ -} + unsigned int name[] = { TLV_DB_SCALE_ITEM(min, step, mute) } /* linear volume between min_dB and max_dB (.01dB unit) */ +#define TLV_DB_LINEAR_ITEM(min_dB, max_dB) \ + SNDRV_CTL_TLVT_DB_LINEAR, 2 * sizeof(unsigned int), \ + (min_dB), (max_dB) #define DECLARE_TLV_DB_LINEAR(name, min_dB, max_dB) \ -unsigned int name[] = { \ - SNDRV_CTL_TLVT_DB_LINEAR, 2 * sizeof(unsigned int), \ - (min_dB), (max_dB) \ -} + unsigned int name[] = { TLV_DB_LINEAR_ITEM(min_dB, max_dB) } + +/* dB range container */ +/* Each item is: */ +/* The below assumes that each item TLV is 4 words like DB_SCALE or LINEAR */ +#define TLV_DB_RANGE_HEAD(num) \ + SNDRV_CTL_TLVT_DB_RANGE, 6 * (num) * sizeof(unsigned int) #define TLV_DB_GAIN_MUTE -9999999 -- cgit v0.10.2 From 0b59397268ed418e139db3806f7956ffcb18b33d Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 6 Sep 2006 13:35:27 +0200 Subject: [ALSA] Add dB information to es1938 driver Added the dB information to ESS Solo (es1938) driver. The new compound dB range TLVs are used for non-linear native volume controls. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/es1938.c b/sound/pci/es1938.c index cc0f34f..3784088 100644 --- a/sound/pci/es1938.c +++ b/sound/pci/es1938.c @@ -62,6 +62,7 @@ #include #include #include +#include #include @@ -1164,6 +1165,14 @@ static int snd_es1938_reg_read(struct es1938 *chip, unsigned char reg) return snd_es1938_read(chip, reg); } +#define ES1938_SINGLE_TLV(xname, xindex, reg, shift, mask, invert, xtlv) \ +{ .iface = SNDRV_CTL_ELEM_IFACE_MIXER, \ + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | SNDRV_CTL_ELEM_ACCESS_TLV_READ,\ + .name = xname, .index = xindex, \ + .info = snd_es1938_info_single, \ + .get = snd_es1938_get_single, .put = snd_es1938_put_single, \ + .private_value = reg | (shift << 8) | (mask << 16) | (invert << 24), \ + .tlv = { .p = xtlv } } #define ES1938_SINGLE(xname, xindex, reg, shift, mask, invert) \ { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = xname, .index = xindex, \ .info = snd_es1938_info_single, \ @@ -1217,6 +1226,14 @@ static int snd_es1938_put_single(struct snd_kcontrol *kcontrol, return snd_es1938_reg_bits(chip, reg, mask, val) != val; } +#define ES1938_DOUBLE_TLV(xname, xindex, left_reg, right_reg, shift_left, shift_right, mask, invert, xtlv) \ +{ .iface = SNDRV_CTL_ELEM_IFACE_MIXER, \ + .access = SNDRV_CTL_ELEM_ACCESS_READWRITE | SNDRV_CTL_ELEM_ACCESS_TLV_READ,\ + .name = xname, .index = xindex, \ + .info = snd_es1938_info_double, \ + .get = snd_es1938_get_double, .put = snd_es1938_put_double, \ + .private_value = left_reg | (right_reg << 8) | (shift_left << 16) | (shift_right << 19) | (mask << 24) | (invert << 22), \ + .tlv = { .p = xtlv } } #define ES1938_DOUBLE(xname, xindex, left_reg, right_reg, shift_left, shift_right, mask, invert) \ { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = xname, .index = xindex, \ .info = snd_es1938_info_double, \ @@ -1297,8 +1314,41 @@ static int snd_es1938_put_double(struct snd_kcontrol *kcontrol, return change; } +static unsigned int db_scale_master[] = { + TLV_DB_RANGE_HEAD(2), + 0, 54, TLV_DB_SCALE_ITEM(-3600, 50, 1), + 54, 63, TLV_DB_SCALE_ITEM(-900, 100, 0), +}; + +static unsigned int db_scale_audio1[] = { + TLV_DB_RANGE_HEAD(2), + 0, 8, TLV_DB_SCALE_ITEM(-3300, 300, 1), + 8, 15, TLV_DB_SCALE_ITEM(-900, 150, 0), +}; + +static unsigned int db_scale_audio2[] = { + TLV_DB_RANGE_HEAD(2), + 0, 8, TLV_DB_SCALE_ITEM(-3450, 300, 1), + 8, 15, TLV_DB_SCALE_ITEM(-1050, 150, 0), +}; + +static unsigned int db_scale_mic[] = { + TLV_DB_RANGE_HEAD(2), + 0, 8, TLV_DB_SCALE_ITEM(-2400, 300, 1), + 8, 15, TLV_DB_SCALE_ITEM(0, 150, 0), +}; + +static unsigned int db_scale_line[] = { + TLV_DB_RANGE_HEAD(2), + 0, 8, TLV_DB_SCALE_ITEM(-3150, 300, 1), + 8, 15, TLV_DB_SCALE_ITEM(-750, 150, 0), +}; + +static DECLARE_TLV_DB_SCALE(db_scale_capture, 0, 150, 0); + static struct snd_kcontrol_new snd_es1938_controls[] = { -ES1938_DOUBLE("Master Playback Volume", 0, 0x60, 0x62, 0, 0, 63, 0), +ES1938_DOUBLE_TLV("Master Playback Volume", 0, 0x60, 0x62, 0, 0, 63, 0, + db_scale_master), ES1938_DOUBLE("Master Playback Switch", 0, 0x60, 0x62, 6, 6, 1, 1), { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, @@ -1309,19 +1359,28 @@ ES1938_DOUBLE("Master Playback Switch", 0, 0x60, 0x62, 6, 6, 1, 1), }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, + .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Hardware Master Playback Switch", .access = SNDRV_CTL_ELEM_ACCESS_READ, .info = snd_es1938_info_hw_switch, .get = snd_es1938_get_hw_switch, + .tlv = { .p = db_scale_master }, }, ES1938_SINGLE("Hardware Volume Split", 0, 0x64, 7, 1, 0), -ES1938_DOUBLE("Line Playback Volume", 0, 0x3e, 0x3e, 4, 0, 15, 0), +ES1938_DOUBLE_TLV("Line Playback Volume", 0, 0x3e, 0x3e, 4, 0, 15, 0, + db_scale_line), ES1938_DOUBLE("CD Playback Volume", 0, 0x38, 0x38, 4, 0, 15, 0), -ES1938_DOUBLE("FM Playback Volume", 0, 0x36, 0x36, 4, 0, 15, 0), -ES1938_DOUBLE("Mono Playback Volume", 0, 0x6d, 0x6d, 4, 0, 15, 0), -ES1938_DOUBLE("Mic Playback Volume", 0, 0x1a, 0x1a, 4, 0, 15, 0), -ES1938_DOUBLE("Aux Playback Volume", 0, 0x3a, 0x3a, 4, 0, 15, 0), -ES1938_DOUBLE("Capture Volume", 0, 0xb4, 0xb4, 4, 0, 15, 0), +ES1938_DOUBLE_TLV("FM Playback Volume", 0, 0x36, 0x36, 4, 0, 15, 0, + db_scale_mic), +ES1938_DOUBLE_TLV("Mono Playback Volume", 0, 0x6d, 0x6d, 4, 0, 15, 0, + db_scale_line), +ES1938_DOUBLE_TLV("Mic Playback Volume", 0, 0x1a, 0x1a, 4, 0, 15, 0, + db_scale_mic), +ES1938_DOUBLE_TLV("Aux Playback Volume", 0, 0x3a, 0x3a, 4, 0, 15, 0, + db_scale_line), +ES1938_DOUBLE_TLV("Capture Volume", 0, 0xb4, 0xb4, 4, 0, 15, 0, + db_scale_capture), ES1938_SINGLE("PC Speaker Volume", 0, 0x3c, 0, 7, 0), ES1938_SINGLE("Record Monitor", 0, 0xa8, 3, 1, 0), ES1938_SINGLE("Capture Switch", 0, 0x1c, 4, 1, 1), @@ -1332,16 +1391,26 @@ ES1938_SINGLE("Capture Switch", 0, 0x1c, 4, 1, 1), .get = snd_es1938_get_mux, .put = snd_es1938_put_mux, }, -ES1938_DOUBLE("Mono Input Playback Volume", 0, 0x6d, 0x6d, 4, 0, 15, 0), -ES1938_DOUBLE("PCM Capture Volume", 0, 0x69, 0x69, 4, 0, 15, 0), -ES1938_DOUBLE("Mic Capture Volume", 0, 0x68, 0x68, 4, 0, 15, 0), -ES1938_DOUBLE("Line Capture Volume", 0, 0x6e, 0x6e, 4, 0, 15, 0), -ES1938_DOUBLE("FM Capture Volume", 0, 0x6b, 0x6b, 4, 0, 15, 0), -ES1938_DOUBLE("Mono Capture Volume", 0, 0x6f, 0x6f, 4, 0, 15, 0), -ES1938_DOUBLE("CD Capture Volume", 0, 0x6a, 0x6a, 4, 0, 15, 0), -ES1938_DOUBLE("Aux Capture Volume", 0, 0x6c, 0x6c, 4, 0, 15, 0), -ES1938_DOUBLE("PCM Playback Volume", 0, 0x7c, 0x7c, 4, 0, 15, 0), -ES1938_DOUBLE("PCM Playback Volume", 1, 0x14, 0x14, 4, 0, 15, 0), +ES1938_DOUBLE_TLV("Mono Input Playback Volume", 0, 0x6d, 0x6d, 4, 0, 15, 0, + db_scale_line), +ES1938_DOUBLE_TLV("PCM Capture Volume", 0, 0x69, 0x69, 4, 0, 15, 0, + db_scale_audio2), +ES1938_DOUBLE_TLV("Mic Capture Volume", 0, 0x68, 0x68, 4, 0, 15, 0, + db_scale_mic), +ES1938_DOUBLE_TLV("Line Capture Volume", 0, 0x6e, 0x6e, 4, 0, 15, 0, + db_scale_line), +ES1938_DOUBLE_TLV("FM Capture Volume", 0, 0x6b, 0x6b, 4, 0, 15, 0, + db_scale_mic), +ES1938_DOUBLE_TLV("Mono Capture Volume", 0, 0x6f, 0x6f, 4, 0, 15, 0, + db_scale_line), +ES1938_DOUBLE_TLV("CD Capture Volume", 0, 0x6a, 0x6a, 4, 0, 15, 0, + db_scale_line), +ES1938_DOUBLE_TLV("Aux Capture Volume", 0, 0x6c, 0x6c, 4, 0, 15, 0, + db_scale_line), +ES1938_DOUBLE_TLV("PCM Playback Volume", 0, 0x7c, 0x7c, 4, 0, 15, 0, + db_scale_audio2), +ES1938_DOUBLE_TLV("PCM Playback Volume", 1, 0x14, 0x14, 4, 0, 15, 0, + db_scale_audio1), ES1938_SINGLE("3D Control - Level", 0, 0x52, 0, 63, 0), { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, -- cgit v0.10.2 From 160ea0dc6b86e2c0c4d325c06bf402bfdde7c1c7 Mon Sep 17 00:00:00 2001 From: Richard Fish Date: Wed, 6 Sep 2006 13:58:25 +0200 Subject: [ALSA] [snd-intel-hda] enable center/LFE speaker on some laptops This patch adds LFE mixer controls for laptops with a stac9200 and a mono speaker pin with amplifier. Signed-off-by: Richard Fish Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index 8716903..bcbbe11 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -1223,6 +1223,66 @@ static int stac9200_auto_create_hp_ctls(struct hda_codec *codec, return 0; } +/* add playback controls for LFE output */ +static int stac9200_auto_create_lfe_ctls(struct hda_codec *codec, + struct auto_pin_cfg *cfg) +{ + struct sigmatel_spec *spec = codec->spec; + int err; + hda_nid_t lfe_pin = 0x0; + int i; + + /* + * search speaker outs and line outs for a mono speaker pin + * with an amp. If one is found, add LFE controls + * for it. + */ + for (i = 0; i < spec->autocfg.speaker_outs && lfe_pin == 0x0; i++) { + hda_nid_t pin = spec->autocfg.speaker_pins[i]; + unsigned long wcaps = get_wcaps(codec, pin); + wcaps &= (AC_WCAP_STEREO | AC_WCAP_OUT_AMP); + if (wcaps == AC_WCAP_OUT_AMP) + /* found a mono speaker with an amp, must be lfe */ + lfe_pin = pin; + } + + /* if speaker_outs is 0, then speakers may be in line_outs */ + if (lfe_pin == 0 && spec->autocfg.speaker_outs == 0) { + for (i = 0; i < spec->autocfg.line_outs && lfe_pin == 0x0; i++) { + hda_nid_t pin = spec->autocfg.line_out_pins[i]; + unsigned long cfg; + cfg = snd_hda_codec_read(codec, pin, 0, + AC_VERB_GET_CONFIG_DEFAULT, + 0x00); + if (get_defcfg_device(cfg) == AC_JACK_SPEAKER) { + unsigned long wcaps = get_wcaps(codec, pin); + wcaps &= (AC_WCAP_STEREO | AC_WCAP_OUT_AMP); + if (wcaps == AC_WCAP_OUT_AMP) + /* found a mono speaker with an amp, + must be lfe */ + lfe_pin = pin; + } + } + } + + if (lfe_pin) { + err = stac92xx_add_control(spec, STAC_CTL_WIDGET_VOL, + "LFE Playback Volume", + HDA_COMPOSE_AMP_VAL(lfe_pin, 1, 0, + HDA_OUTPUT)); + if (err < 0) + return err; + err = stac92xx_add_control(spec, STAC_CTL_WIDGET_MUTE, + "LFE Playback Switch", + HDA_COMPOSE_AMP_VAL(lfe_pin, 1, 0, + HDA_OUTPUT)); + if (err < 0) + return err; + } + + return 0; +} + static int stac9200_parse_auto_config(struct hda_codec *codec) { struct sigmatel_spec *spec = codec->spec; @@ -1237,6 +1297,9 @@ static int stac9200_parse_auto_config(struct hda_codec *codec) if ((err = stac9200_auto_create_hp_ctls(codec, &spec->autocfg)) < 0) return err; + if ((err = stac9200_auto_create_lfe_ctls(codec, &spec->autocfg)) < 0) + return err; + if (spec->autocfg.dig_out_pin) spec->multiout.dig_out_nid = 0x05; if (spec->autocfg.dig_in_pin) -- cgit v0.10.2 From a7da6ce564a80952d9c0b210deca5a8cd3474a31 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 6 Sep 2006 14:03:14 +0200 Subject: [ALSA] hda-codec - Add independent headphone volume control This patch addes the support of the independent 'Headphone' volume control to the generic codec parser. Some codecs (e.g. Conexant) have separate connections to the headphone and the independent amp adjustment is needed. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_generic.c b/sound/pci/hda/hda_generic.c index dedfc5b..97e9af1 100644 --- a/sound/pci/hda/hda_generic.c +++ b/sound/pci/hda/hda_generic.c @@ -46,11 +46,18 @@ struct hda_gnode { }; /* patch-specific record */ + +#define MAX_PCM_VOLS 2 +struct pcm_vol { + struct hda_gnode *node; /* Node for PCM volume */ + unsigned int index; /* connection of PCM volume */ +}; + struct hda_gspec { struct hda_gnode *dac_node[2]; /* DAC node */ struct hda_gnode *out_pin_node[2]; /* Output pin (Line-Out) node */ - struct hda_gnode *pcm_vol_node[2]; /* Node for PCM volume */ - unsigned int pcm_vol_index[2]; /* connection of PCM volume */ + struct pcm_vol pcm_vol[MAX_PCM_VOLS]; /* PCM volumes */ + unsigned int pcm_vol_nodes; /* number of PCM volumes */ struct hda_gnode *adc_node; /* ADC node */ struct hda_gnode *cap_vol_node; /* Node for capture volume */ @@ -285,9 +292,11 @@ static int parse_output_path(struct hda_codec *codec, struct hda_gspec *spec, return node == spec->dac_node[dac_idx]; } spec->dac_node[dac_idx] = node; - if (node->wid_caps & AC_WCAP_OUT_AMP) { - spec->pcm_vol_node[dac_idx] = node; - spec->pcm_vol_index[dac_idx] = 0; + if ((node->wid_caps & AC_WCAP_OUT_AMP) && + spec->pcm_vol_nodes < MAX_PCM_VOLS) { + spec->pcm_vol[spec->pcm_vol_nodes].node = node; + spec->pcm_vol[spec->pcm_vol_nodes].index = 0; + spec->pcm_vol_nodes++; } return 1; /* found */ } @@ -307,13 +316,16 @@ static int parse_output_path(struct hda_codec *codec, struct hda_gspec *spec, select_input_connection(codec, node, i); unmute_input(codec, node, i); unmute_output(codec, node); - if (! spec->pcm_vol_node[dac_idx]) { - if (node->wid_caps & AC_WCAP_IN_AMP) { - spec->pcm_vol_node[dac_idx] = node; - spec->pcm_vol_index[dac_idx] = i; - } else if (node->wid_caps & AC_WCAP_OUT_AMP) { - spec->pcm_vol_node[dac_idx] = node; - spec->pcm_vol_index[dac_idx] = 0; + if (spec->dac_node[dac_idx] && + spec->pcm_vol_nodes < MAX_PCM_VOLS && + !(spec->dac_node[dac_idx]->wid_caps & + AC_WCAP_OUT_AMP)) { + if ((node->wid_caps & AC_WCAP_IN_AMP) || + (node->wid_caps & AC_WCAP_OUT_AMP)) { + int n = spec->pcm_vol_nodes; + spec->pcm_vol[n].node = node; + spec->pcm_vol[n].index = i; + spec->pcm_vol_nodes++; } } return 1; @@ -370,7 +382,9 @@ static struct hda_gnode *parse_output_jack(struct hda_codec *codec, /* set PIN-Out enable */ snd_hda_codec_write(codec, node->nid, 0, AC_VERB_SET_PIN_WIDGET_CONTROL, - AC_PINCTL_OUT_EN | AC_PINCTL_HP_EN); + AC_PINCTL_OUT_EN | + ((node->pin_caps & AC_PINCAP_HP_DRV) ? + AC_PINCTL_HP_EN : 0)); return node; } } @@ -745,22 +759,41 @@ static int check_existing_control(struct hda_codec *codec, const char *type, con /* * build output mixer controls */ -static int build_output_controls(struct hda_codec *codec) +static int create_output_mixers(struct hda_codec *codec, const char **names) { struct hda_gspec *spec = codec->spec; - static const char *types[2] = { "Master", "Headphone" }; int i, err; - for (i = 0; i < 2 && spec->pcm_vol_node[i]; i++) { - err = create_mixer(codec, spec->pcm_vol_node[i], - spec->pcm_vol_index[i], - types[i], "Playback"); + for (i = 0; i < spec->pcm_vol_nodes; i++) { + err = create_mixer(codec, spec->pcm_vol[i].node, + spec->pcm_vol[i].index, + names[i], "Playback"); if (err < 0) return err; } return 0; } +static int build_output_controls(struct hda_codec *codec) +{ + struct hda_gspec *spec = codec->spec; + static const char *types_speaker[] = { "Speaker", "Headphone" }; + static const char *types_line[] = { "Front", "Headphone" }; + + switch (spec->pcm_vol_nodes) { + case 1: + return create_mixer(codec, spec->pcm_vol[0].node, + spec->pcm_vol[0].index, + "Master", "Playback"); + case 2: + if (defcfg_type(spec->out_pin_node[0]) == AC_JACK_SPEAKER) + return create_output_mixers(codec, types_speaker); + else + return create_output_mixers(codec, types_line); + } + return 0; +} + /* create capture volume/switch */ static int build_input_controls(struct hda_codec *codec) { -- cgit v0.10.2 From 9d19f48cfe2570562c2c6226780a7ca627b0f1f1 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 6 Sep 2006 14:27:46 +0200 Subject: [ALSA] Add pcm_class attribute to PCM sysfs entry This patch adds a new attribute, pcm_class, to each PCM sysfs entry. It's useful to detect what kind of PCM stream is, for example, HAL can check whether it's a modem or not. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/include/sound/core.h b/include/sound/core.h index 1359c53..b056ea9 100644 --- a/include/sound/core.h +++ b/include/sound/core.h @@ -26,6 +26,7 @@ #include /* struct mutex */ #include /* struct rw_semaphore */ #include /* pm_message_t */ +#include /* forward declarations */ #ifdef CONFIG_PCI @@ -186,6 +187,7 @@ struct snd_minor { int device; /* device number */ const struct file_operations *f_ops; /* file operations */ void *private_data; /* private data for f_ops->open */ + struct class_device *class_dev; /* class device for sysfs */ }; /* sound.c */ @@ -200,6 +202,8 @@ int snd_register_device(int type, struct snd_card *card, int dev, const char *name); int snd_unregister_device(int type, struct snd_card *card, int dev); void *snd_lookup_minor_data(unsigned int minor, int type); +int snd_add_device_sysfs_file(int type, struct snd_card *card, int dev, + const struct class_device_attribute *attr); #ifdef CONFIG_SND_OSSEMUL int snd_register_oss_device(int type, struct snd_card *card, int dev, diff --git a/sound/core/pcm.c b/sound/core/pcm.c index ed3b094..bf8f412 100644 --- a/sound/core/pcm.c +++ b/sound/core/pcm.c @@ -907,6 +907,28 @@ void snd_pcm_detach_substream(struct snd_pcm_substream *substream) substream->pstr->substream_opened--; } +static ssize_t show_pcm_class(struct class_device *class_device, char *buf) +{ + struct snd_pcm *pcm; + const char *str; + static const char *strs[SNDRV_PCM_CLASS_LAST + 1] = { + [SNDRV_PCM_CLASS_GENERIC] = "generic", + [SNDRV_PCM_CLASS_MULTI] = "multi", + [SNDRV_PCM_CLASS_MODEM] = "modem", + [SNDRV_PCM_CLASS_DIGITIZER] = "digitizer", + }; + + if (! (pcm = class_get_devdata(class_device)) || + pcm->dev_class > SNDRV_PCM_CLASS_LAST) + str = "none"; + else + str = strs[pcm->dev_class]; + return snprintf(buf, PAGE_SIZE, "%s\n", str); +} + +static struct class_device_attribute pcm_attrs = + __ATTR(pcm_class, S_IRUGO, show_pcm_class, NULL); + static int snd_pcm_dev_register(struct snd_device *device) { int cidx, err; @@ -945,6 +967,8 @@ static int snd_pcm_dev_register(struct snd_device *device) mutex_unlock(®ister_mutex); return err; } + snd_add_device_sysfs_file(devtype, pcm->card, pcm->device, + &pcm_attrs); for (substream = pcm->streams[cidx].substream; substream; substream = substream->next) snd_pcm_timer_init(substream); } diff --git a/sound/core/sound.c b/sound/core/sound.c index b4430db..efa476c 100644 --- a/sound/core/sound.c +++ b/sound/core/sound.c @@ -268,7 +268,11 @@ int snd_register_device(int type, struct snd_card *card, int dev, snd_minors[minor] = preg; if (card) device = card->dev; - class_device_create(sound_class, NULL, MKDEV(major, minor), device, "%s", name); + preg->class_dev = class_device_create(sound_class, NULL, + MKDEV(major, minor), + device, "%s", name); + if (preg->class_dev) + class_set_devdata(preg->class_dev, private_data); mutex_unlock(&sound_mutex); return 0; @@ -276,6 +280,24 @@ int snd_register_device(int type, struct snd_card *card, int dev, EXPORT_SYMBOL(snd_register_device); +/* find the matching minor record + * return the index of snd_minor, or -1 if not found + */ +static int find_snd_minor(int type, struct snd_card *card, int dev) +{ + int cardnum, minor; + struct snd_minor *mptr; + + cardnum = card ? card->number : -1; + for (minor = 0; minor < ARRAY_SIZE(snd_minors); ++minor) + if ((mptr = snd_minors[minor]) != NULL && + mptr->type == type && + mptr->card == cardnum && + mptr->device == dev) + return minor; + return -1; +} + /** * snd_unregister_device - unregister the device on the given card * @type: the device type, SNDRV_DEVICE_TYPE_XXX @@ -289,32 +311,42 @@ EXPORT_SYMBOL(snd_register_device); */ int snd_unregister_device(int type, struct snd_card *card, int dev) { - int cardnum, minor; - struct snd_minor *mptr; + int minor; - cardnum = card ? card->number : -1; mutex_lock(&sound_mutex); - for (minor = 0; minor < ARRAY_SIZE(snd_minors); ++minor) - if ((mptr = snd_minors[minor]) != NULL && - mptr->type == type && - mptr->card == cardnum && - mptr->device == dev) - break; - if (minor == ARRAY_SIZE(snd_minors)) { + minor = find_snd_minor(type, card, dev); + if (minor < 0) { mutex_unlock(&sound_mutex); return -EINVAL; } class_device_destroy(sound_class, MKDEV(major, minor)); + kfree(snd_minors[minor]); snd_minors[minor] = NULL; mutex_unlock(&sound_mutex); - kfree(mptr); return 0; } EXPORT_SYMBOL(snd_unregister_device); +int snd_add_device_sysfs_file(int type, struct snd_card *card, int dev, + const struct class_device_attribute *attr) +{ + int minor, ret = -EINVAL; + struct class_device *cdev; + + mutex_lock(&sound_mutex); + minor = find_snd_minor(type, card, dev); + if (minor >= 0 && (cdev = snd_minors[minor]->class_dev) != NULL) + ret = class_device_create_file(cdev, attr); + mutex_unlock(&sound_mutex); + return ret; + +} + +EXPORT_SYMBOL(snd_add_device_sysfs_file); + #ifdef CONFIG_PROC_FS /* * INFO PART -- cgit v0.10.2 From cd417d4fe89638a2848980cb389b9781d4913173 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 6 Sep 2006 16:03:11 +0200 Subject: [ALSA] hda-codec - Add support for LG LW25 laptop Added the support for LG LW25 laptop with ALC880 codec. It's the same codec model as LG LW20 (model=lg-lw). Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index a788dd7..1b74994 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -785,7 +785,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. uniwill 3-jack F1734 2-jack lg LG laptop (m1 express dual) - lg-lw LG LW20 laptop + lg-lw LG LW20/LW25 laptop tcl TCL S700 clevo Clevo laptops (m520G, m665n) test for testing/debugging purpose, almost all controls can be diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index 6590381..d037051 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -2270,6 +2270,7 @@ static struct hda_board_config alc880_cfg_tbl[] = { { .modelname = "lg-lw", .config = ALC880_LG_LW }, { .pci_subvendor = 0x1854, .pci_subdevice = 0x0018, .config = ALC880_LG_LW }, + { .pci_subvendor = 0x1854, .pci_subdevice = 0x0077, .config = ALC880_LG_LW }, #ifdef CONFIG_SND_DEBUG { .modelname = "test", .config = ALC880_TEST }, -- cgit v0.10.2 From dafbbb1fdbf103b24d0f7aa645625b6bd558c896 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 7 Sep 2006 12:40:00 +0200 Subject: [ALSA] hda-intel - Fix pci_disable_msi() call Fix the order to call pci_disable_msi() to be after free_irq(). (Otherwise pci_disable_msi() bugs you.) Also, added a description of disable_msi option to documentation. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/Documentation/sound/alsa/ALSA-Configuration.txt b/Documentation/sound/alsa/ALSA-Configuration.txt index 1b74994..e6b57dd 100644 --- a/Documentation/sound/alsa/ALSA-Configuration.txt +++ b/Documentation/sound/alsa/ALSA-Configuration.txt @@ -758,6 +758,7 @@ Prior to version 0.9.0rc4 options had a 'snd_' prefix. This was removed. position_fix - Fix DMA pointer (0 = auto, 1 = none, 2 = POSBUF, 3 = FIFO size) single_cmd - Use single immediate commands to communicate with codecs (for debugging only) + disable_msi - Disable Message Signaled Interrupt (MSI) This module supports one card and autoprobe. diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index cc50d13..bfd74a5 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1422,8 +1422,9 @@ static int azx_free(struct azx *chip) } if (chip->irq >= 0) { - pci_disable_msi(chip->pci); free_irq(chip->irq, (void*)chip); + if (!disable_msi) + pci_disable_msi(chip->pci); } if (chip->remap_addr) iounmap(chip->remap_addr); -- cgit v0.10.2 From e08a007d1041e0bc3df6b855043d8efde91851aa Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 7 Sep 2006 17:52:14 +0200 Subject: [ALSA] hda-codec - Fix SPDIF device number of ALC codecs Assign the SPDIF always to the secondary device (dev#1) to keep the same configuration. Move the optional capture device to the third device (dev#2). hda_intel now just ignores the NULL entries in the pcm arrays from codecs. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index bfd74a5..6309e0c 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1242,7 +1242,12 @@ static int __devinit create_codec_pcm(struct azx *chip, struct hda_codec *codec, struct snd_pcm *pcm; struct azx_pcm *apcm; - snd_assert(cpcm->stream[0].substreams || cpcm->stream[1].substreams, return -EINVAL); + /* if no substreams are defined for both playback and capture, + * it's just a placeholder. ignore it. + */ + if (!cpcm->stream[0].substreams && !cpcm->stream[1].substreams) + return 0; + snd_assert(cpcm->name, return -EINVAL); err = snd_pcm_new(chip->card, cpcm->name, pcm_dev, @@ -1268,7 +1273,8 @@ static int __devinit create_codec_pcm(struct azx *chip, struct hda_codec *codec, snd_dma_pci_data(chip->pci), 1024 * 64, 1024 * 128); chip->pcm[pcm_dev] = pcm; - chip->pcm_devs = pcm_dev + 1; + if (chip->pcm_devs < pcm_dev + 1) + chip->pcm_devs = pcm_dev + 1; return 0; } diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index d037051..ba9e050 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -1796,25 +1796,9 @@ static int alc_build_pcms(struct hda_codec *codec) } } - /* If the use of more than one ADC is requested for the current - * model, configure a second analog capture-only PCM. - */ - if (spec->num_adc_nids > 1) { - codec->num_pcms++; - info++; - info->name = spec->stream_name_analog; - /* No playback stream for second PCM */ - info->stream[SNDRV_PCM_STREAM_PLAYBACK] = alc_pcm_null_playback; - info->stream[SNDRV_PCM_STREAM_PLAYBACK].nid = 0; - if (spec->stream_analog_capture) { - snd_assert(spec->adc_nids, return -EINVAL); - info->stream[SNDRV_PCM_STREAM_CAPTURE] = *(spec->stream_analog_capture); - info->stream[SNDRV_PCM_STREAM_CAPTURE].nid = spec->adc_nids[1]; - } - } - + /* SPDIF for stream index #1 */ if (spec->multiout.dig_out_nid || spec->dig_in_nid) { - codec->num_pcms++; + codec->num_pcms = 2; info++; info->name = spec->stream_name_digital; if (spec->multiout.dig_out_nid && @@ -1829,6 +1813,24 @@ static int alc_build_pcms(struct hda_codec *codec) } } + /* If the use of more than one ADC is requested for the current + * model, configure a second analog capture-only PCM. + */ + /* Additional Analaog capture for index #2 */ + if (spec->num_adc_nids > 1 && spec->stream_analog_capture && + spec->adc_nids) { + codec->num_pcms = 3; + info++; + info->name = spec->stream_name_analog; + /* No playback stream for second PCM */ + info->stream[SNDRV_PCM_STREAM_PLAYBACK] = alc_pcm_null_playback; + info->stream[SNDRV_PCM_STREAM_PLAYBACK].nid = 0; + if (spec->stream_analog_capture) { + info->stream[SNDRV_PCM_STREAM_CAPTURE] = *(spec->stream_analog_capture); + info->stream[SNDRV_PCM_STREAM_CAPTURE].nid = spec->adc_nids[1]; + } + } + return 0; } -- cgit v0.10.2 From 8f88820ee49359ea33af42845456ce9dbf54d39a Mon Sep 17 00:00:00 2001 From: Liam Girdwood Date: Thu, 7 Sep 2006 18:07:46 +0200 Subject: [ALSA] Fix WM9705 AC97 patch build error This patch fixes a build error (introduced by me) in ac97_patch.c wrt WM9705 touchscreen. o Removed spurious '3D' from character after |= operation (0x3D is ASCII for '=') Signed-off-by: Liam Girdwood Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/ac97/ac97_patch.c b/sound/pci/ac97/ac97_patch.c index 9be4ceb..dc28b11 100644 --- a/sound/pci/ac97/ac97_patch.c +++ b/sound/pci/ac97/ac97_patch.c @@ -481,7 +481,7 @@ int patch_wolfson05(struct snd_ac97 * ac97) ac97->build_ops = &patch_wolfson_wm9705_ops; #ifdef CONFIG_TOUCHSCREEN_WM9705 /* WM9705 touchscreen uses AUX and VIDEO for touch */ - ac97->flags |=3D AC97_HAS_NO_VIDEO | AC97_HAS_NO_AUX; + ac97->flags |= AC97_HAS_NO_VIDEO | AC97_HAS_NO_AUX; #endif return 0; } -- cgit v0.10.2 From 854b66e44260320c21ebe4b8a18e189f2e45b5be Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 8 Sep 2006 12:27:38 +0200 Subject: [ALSA] ak4xxx - Remove bogus IPGA controls Remove IPGA volume controls and merge the IPGA range to ADC volume controls. These two volumes are not really independent but connected simply in different ranges 0-0x7f and 0x80-max. It doesn't make sense to provide two controls. Since both 0x7f and 0x80 specify 0dB, a hack is needed for IPGA range to skip 0x80 (increment one) for such controls. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/include/sound/ak4xxx-adda.h b/include/sound/ak4xxx-adda.h index 026e4072..d0deca6 100644 --- a/include/sound/ak4xxx-adda.h +++ b/include/sound/ak4xxx-adda.h @@ -48,7 +48,6 @@ struct snd_akm4xxx_dac_channel { /* ADC labels and channels */ struct snd_akm4xxx_adc_channel { char *name; /* capture gain volume label */ - char *gain_name; /* IPGA */ char *switch_name; /* capture switch */ unsigned int num_channels; }; @@ -91,13 +90,4 @@ int snd_akm4xxx_build_controls(struct snd_akm4xxx *ak); #define snd_akm4xxx_set_vol(ak,chip,reg,val) \ ((ak)->volumes[(chip) * 16 + (reg)] = (val)) -/* Warning: IPGA is tricky - we assume the addr + 4 is unused - * so far, it's OK for all AK codecs with IPGA: - * AK4524, AK4528 and EK5365 - */ -#define snd_akm4xxx_get_ipga(ak,chip,reg) \ - snd_akm4xxx_get_vol(ak, chip, (reg) + 4) -#define snd_akm4xxx_set_ipga(ak,chip,reg,val) \ - snd_akm4xxx_set_vol(ak, chip, (reg) + 4, val) - #endif /* __SOUND_AK4XXX_ADDA_H */ diff --git a/sound/i2c/other/ak4xxx-adda.c b/sound/i2c/other/ak4xxx-adda.c index c34cb46..5da49e2 100644 --- a/sound/i2c/other/ak4xxx-adda.c +++ b/sound/i2c/other/ak4xxx-adda.c @@ -43,10 +43,7 @@ void snd_akm4xxx_write(struct snd_akm4xxx *ak, int chip, unsigned char reg, ak->ops.write(ak, chip, reg, val); /* save the data */ - /* don't overwrite with IPGA data */ - if ((ak->type != SND_AK4524 && ak->type != SND_AK5365) || - (reg != 0x04 && reg != 0x05) || (val & 0x80) == 0) - snd_akm4xxx_set(ak, chip, reg, val); + snd_akm4xxx_set(ak, chip, reg, val); ak->ops.unlock(ak, chip); } @@ -70,12 +67,6 @@ static void ak4524_reset(struct snd_akm4xxx *ak, int state) for (reg = 0x04; reg < maxreg; reg++) snd_akm4xxx_write(ak, chip, reg, snd_akm4xxx_get(ak, chip, reg)); - if (ak->type == SND_AK4528) - continue; - /* IPGA */ - for (reg = 0x04; reg < 0x06; reg++) - snd_akm4xxx_write(ak, chip, reg, - snd_akm4xxx_get_ipga(ak, chip, reg) | 0x80); } } @@ -175,7 +166,6 @@ static DECLARE_TLV_DB_SCALE(db_scale_vol_datt, -6350, 50, 1); static DECLARE_TLV_DB_SCALE(db_scale_8bit, -12750, 50, 1); static DECLARE_TLV_DB_SCALE(db_scale_7bit, -6350, 50, 1); static DECLARE_TLV_DB_LINEAR(db_scale_linear, TLV_DB_GAIN_MUTE, 0); -static DECLARE_TLV_DB_SCALE(db_scale_ipga, 0, 50, 0); /* * initialize all the ak4xxx chips @@ -190,8 +180,6 @@ void snd_akm4xxx_init(struct snd_akm4xxx *ak) 0x01, 0x03, /* 1: ADC/DAC enable */ 0x04, 0x00, /* 4: ADC left muted */ 0x05, 0x00, /* 5: ADC right muted */ - 0x04, 0x80, /* 4: ADC IPGA gain 0dB */ - 0x05, 0x80, /* 5: ADC IPGA gain 0dB */ 0x06, 0x00, /* 6: DAC left muted */ 0x07, 0x00, /* 7: DAC right muted */ 0xff, 0xff @@ -324,13 +312,15 @@ EXPORT_SYMBOL(snd_akm4xxx_init); /* * Mixer callbacks */ +#define AK_IPGA (1<<20) /* including IPGA */ #define AK_VOL_CVT (1<<21) /* need dB conversion */ #define AK_NEEDSMSB (1<<22) /* need MSB update bit */ #define AK_INVERT (1<<23) /* data is inverted */ #define AK_GET_CHIP(val) (((val) >> 8) & 0xff) #define AK_GET_ADDR(val) ((val) & 0xff) -#define AK_GET_SHIFT(val) (((val) >> 16) & 0x1f) +#define AK_GET_SHIFT(val) (((val) >> 16) & 0x0f) #define AK_GET_VOL_CVT(val) (((val) >> 21) & 1) +#define AK_GET_IPGA(val) (((val) >> 20) & 1) #define AK_GET_NEEDSMSB(val) (((val) >> 22) & 1) #define AK_GET_INVERT(val) (((val) >> 23) & 1) #define AK_GET_MASK(val) (((val) >> 24) & 0xff) @@ -371,8 +361,10 @@ static int put_ak_reg(struct snd_kcontrol *kcontrol, int addr, return 0; snd_akm4xxx_set_vol(ak, chip, addr, nval); - if (AK_GET_VOL_CVT(kcontrol->private_value)) + if (AK_GET_VOL_CVT(kcontrol->private_value) && nval < 128) nval = vol_cvt_datt[nval]; + if (AK_GET_IPGA(kcontrol->private_value) && nval >= 128) + nval++; /* need to correct + 1 since both 127 and 128 are 0dB */ if (AK_GET_INVERT(kcontrol->private_value)) nval = mask - nval; if (AK_GET_NEEDSMSB(kcontrol->private_value)) @@ -424,68 +416,6 @@ static int snd_akm4xxx_stereo_volume_put(struct snd_kcontrol *kcontrol, return change; } -#define snd_akm4xxx_ipga_gain_info snd_akm4xxx_volume_info - -static int snd_akm4xxx_ipga_gain_get(struct snd_kcontrol *kcontrol, - struct snd_ctl_elem_value *ucontrol) -{ - struct snd_akm4xxx *ak = snd_kcontrol_chip(kcontrol); - int chip = AK_GET_CHIP(kcontrol->private_value); - int addr = AK_GET_ADDR(kcontrol->private_value); - - ucontrol->value.integer.value[0] = - snd_akm4xxx_get_ipga(ak, chip, addr); - return 0; -} - -static int put_ak_ipga(struct snd_kcontrol *kcontrol, int addr, - unsigned char nval) -{ - struct snd_akm4xxx *ak = snd_kcontrol_chip(kcontrol); - int chip = AK_GET_CHIP(kcontrol->private_value); - - if (snd_akm4xxx_get_ipga(ak, chip, addr) == nval) - return 0; - snd_akm4xxx_set_ipga(ak, chip, addr, nval); - snd_akm4xxx_write(ak, chip, addr, nval | 0x80); /* need MSB */ - return 1; -} - -static int snd_akm4xxx_ipga_gain_put(struct snd_kcontrol *kcontrol, - struct snd_ctl_elem_value *ucontrol) -{ - return put_ak_ipga(kcontrol, AK_GET_ADDR(kcontrol->private_value), - ucontrol->value.integer.value[0]); -} - -#define snd_akm4xxx_stereo_gain_info snd_akm4xxx_stereo_volume_info - -static int snd_akm4xxx_stereo_gain_get(struct snd_kcontrol *kcontrol, - struct snd_ctl_elem_value *ucontrol) -{ - struct snd_akm4xxx *ak = snd_kcontrol_chip(kcontrol); - int chip = AK_GET_CHIP(kcontrol->private_value); - int addr = AK_GET_ADDR(kcontrol->private_value); - - ucontrol->value.integer.value[0] = - snd_akm4xxx_get_ipga(ak, chip, addr); - ucontrol->value.integer.value[1] = - snd_akm4xxx_get_ipga(ak, chip, addr + 1); - return 0; -} - -static int snd_akm4xxx_stereo_gain_put(struct snd_kcontrol *kcontrol, - struct snd_ctl_elem_value *ucontrol) -{ - int addr = AK_GET_ADDR(kcontrol->private_value); - int change; - - change = put_ak_ipga(kcontrol, addr, ucontrol->value.integer.value[0]); - change |= put_ak_ipga(kcontrol, addr + 1, - ucontrol->value.integer.value[1]); - return change; -} - static int snd_akm4xxx_deemphasis_info(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_info *uinfo) { @@ -702,35 +632,15 @@ static int build_adc_controls(struct snd_akm4xxx *ak) knew.put = snd_akm4xxx_volume_put; } /* register 4 & 5 */ - knew.private_value = - AK_COMPOSE(idx/2, (idx%2) + 4, 0, 127) | - AK_VOL_CVT; - knew.tlv.p = db_scale_vol_datt; - err = snd_ctl_add(ak->card, snd_ctl_new1(&knew, ak)); - if (err < 0) - return err; - - if (! ak->adc_info || ! ak->adc_info[mixer_ch].gain_name) - knew.name = "IPGA Analog Capture Volume"; + if (ak->type == SND_AK5365) + knew.private_value = + AK_COMPOSE(idx/2, (idx%2) + 4, 0, 151) | + AK_VOL_CVT | AK_IPGA; else - knew.name = ak->adc_info[mixer_ch].gain_name; - if (num_stereo == 2) { - knew.info = snd_akm4xxx_stereo_gain_info; - knew.get = snd_akm4xxx_stereo_gain_get; - knew.put = snd_akm4xxx_stereo_gain_put; - } else { - knew.info = snd_akm4xxx_ipga_gain_info; - knew.get = snd_akm4xxx_ipga_gain_get; - knew.put = snd_akm4xxx_ipga_gain_put; - } - /* register 4 & 5 */ - if (ak->type == SND_AK4524) - knew.private_value = AK_COMPOSE(idx/2, (idx%2) + 4, 0, - 24); - else /* AK5365 */ - knew.private_value = AK_COMPOSE(idx/2, (idx%2) + 4, 0, - 36); - knew.tlv.p = db_scale_ipga; + knew.private_value = + AK_COMPOSE(idx/2, (idx%2) + 4, 0, 163) | + AK_VOL_CVT | AK_IPGA; + knew.tlv.p = db_scale_vol_datt; err = snd_ctl_add(ak->card, snd_ctl_new1(&knew, ak)); if (err < 0) return err; @@ -811,11 +721,9 @@ int snd_akm4xxx_build_controls(struct snd_akm4xxx *ak) if (err < 0) return err; - if (ak->type == SND_AK4524 || ak->type == SND_AK5365) { - err = build_adc_controls(ak); - if (err < 0) - return err; - } + err = build_adc_controls(ak); + if (err < 0) + return err; if (ak->type == SND_AK4355 || ak->type == SND_AK4358) num_emphs = 1; diff --git a/sound/pci/ice1712/revo.c b/sound/pci/ice1712/revo.c index c9eefa9..bf98ea3 100644 --- a/sound/pci/ice1712/revo.c +++ b/sound/pci/ice1712/revo.c @@ -110,7 +110,6 @@ static struct snd_akm4xxx_dac_channel revo51_dac[] = { static struct snd_akm4xxx_adc_channel revo51_adc[] = { { .name = "PCM Capture Volume", - .gain_name = "PCM Capture Gain Volume", .switch_name = "PCM Capture Switch", .num_channels = 2 }, -- cgit v0.10.2 From 43001c9515cf87935c50e84b3e27b1f3b3776b5d Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 8 Sep 2006 12:30:03 +0200 Subject: [ALSA] hda-intel - Fix suspend/resume with MSI Fixed suspend/resume with MSI enablement. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 6309e0c..4d2df77 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -1381,6 +1381,10 @@ static int azx_suspend(struct pci_dev *pci, pm_message_t state) snd_pcm_suspend_all(chip->pcm[i]); snd_hda_suspend(chip->bus, state); azx_free_cmd_io(chip); + if (chip->irq >= 0) + free_irq(chip->irq, chip); + if (!disable_msi) + pci_disable_msi(chip->pci); pci_disable_device(pci); pci_save_state(pci); return 0; @@ -1393,6 +1397,12 @@ static int azx_resume(struct pci_dev *pci) pci_restore_state(pci); pci_enable_device(pci); + if (!disable_msi) + pci_enable_msi(pci); + /* FIXME: need proper error handling */ + request_irq(pci->irq, azx_interrupt, IRQF_DISABLED|IRQF_SHARED, + "HDA Intel", chip); + chip->irq = pci->irq; pci_set_master(pci); azx_init_chip(chip); snd_hda_resume(chip->bus); -- cgit v0.10.2 From 307192065c55dbc70159037c1e3006a9f761192b Mon Sep 17 00:00:00 2001 From: Johannes Berg Date: Sun, 17 Sep 2006 21:59:25 +0200 Subject: [ALSA] aoa: add locking to tas codec Looks like I completely forgot to do this. This patch adds locking to the tas codec so two userspace programs can't hit the controls at the same time. Tested on my powerbook, but I obviously can't find any problems even without it since it doesn't do SMP. Signed-off-by: Johannes Berg Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/aoa/codecs/snd-aoa-codec-tas.c b/sound/aoa/codecs/snd-aoa-codec-tas.c index 16c0b6b..2ef55a1 100644 --- a/sound/aoa/codecs/snd-aoa-codec-tas.c +++ b/sound/aoa/codecs/snd-aoa-codec-tas.c @@ -66,6 +66,8 @@ #include #include #include +#include + MODULE_AUTHOR("Johannes Berg "); MODULE_LICENSE("GPL"); MODULE_DESCRIPTION("tas codec driver for snd-aoa"); @@ -91,6 +93,10 @@ struct tas { u8 bass, treble; u8 acr; int drc_range; + /* protects hardware access against concurrency from + * userspace when hitting controls and during + * codec init/suspend/resume */ + struct mutex mtx; }; static int tas_reset_init(struct tas *tas); @@ -231,8 +237,10 @@ static int tas_snd_vol_get(struct snd_kcontrol *kcontrol, { struct tas *tas = snd_kcontrol_chip(kcontrol); + mutex_lock(&tas->mtx); ucontrol->value.integer.value[0] = tas->cached_volume_l; ucontrol->value.integer.value[1] = tas->cached_volume_r; + mutex_unlock(&tas->mtx); return 0; } @@ -241,14 +249,18 @@ static int tas_snd_vol_put(struct snd_kcontrol *kcontrol, { struct tas *tas = snd_kcontrol_chip(kcontrol); + mutex_lock(&tas->mtx); if (tas->cached_volume_l == ucontrol->value.integer.value[0] - && tas->cached_volume_r == ucontrol->value.integer.value[1]) + && tas->cached_volume_r == ucontrol->value.integer.value[1]) { + mutex_unlock(&tas->mtx); return 0; + } tas->cached_volume_l = ucontrol->value.integer.value[0]; tas->cached_volume_r = ucontrol->value.integer.value[1]; if (tas->hw_enabled) tas_set_volume(tas); + mutex_unlock(&tas->mtx); return 1; } @@ -276,8 +288,10 @@ static int tas_snd_mute_get(struct snd_kcontrol *kcontrol, { struct tas *tas = snd_kcontrol_chip(kcontrol); + mutex_lock(&tas->mtx); ucontrol->value.integer.value[0] = !tas->mute_l; ucontrol->value.integer.value[1] = !tas->mute_r; + mutex_unlock(&tas->mtx); return 0; } @@ -286,14 +300,18 @@ static int tas_snd_mute_put(struct snd_kcontrol *kcontrol, { struct tas *tas = snd_kcontrol_chip(kcontrol); + mutex_lock(&tas->mtx); if (tas->mute_l == !ucontrol->value.integer.value[0] - && tas->mute_r == !ucontrol->value.integer.value[1]) + && tas->mute_r == !ucontrol->value.integer.value[1]) { + mutex_unlock(&tas->mtx); return 0; + } tas->mute_l = !ucontrol->value.integer.value[0]; tas->mute_r = !ucontrol->value.integer.value[1]; if (tas->hw_enabled) tas_set_volume(tas); + mutex_unlock(&tas->mtx); return 1; } @@ -322,8 +340,10 @@ static int tas_snd_mixer_get(struct snd_kcontrol *kcontrol, struct tas *tas = snd_kcontrol_chip(kcontrol); int idx = kcontrol->private_value; + mutex_lock(&tas->mtx); ucontrol->value.integer.value[0] = tas->mixer_l[idx]; ucontrol->value.integer.value[1] = tas->mixer_r[idx]; + mutex_unlock(&tas->mtx); return 0; } @@ -334,15 +354,19 @@ static int tas_snd_mixer_put(struct snd_kcontrol *kcontrol, struct tas *tas = snd_kcontrol_chip(kcontrol); int idx = kcontrol->private_value; + mutex_lock(&tas->mtx); if (tas->mixer_l[idx] == ucontrol->value.integer.value[0] - && tas->mixer_r[idx] == ucontrol->value.integer.value[1]) + && tas->mixer_r[idx] == ucontrol->value.integer.value[1]) { + mutex_unlock(&tas->mtx); return 0; + } tas->mixer_l[idx] = ucontrol->value.integer.value[0]; tas->mixer_r[idx] = ucontrol->value.integer.value[1]; if (tas->hw_enabled) tas_set_mixer(tas); + mutex_unlock(&tas->mtx); return 1; } @@ -375,7 +399,9 @@ static int tas_snd_drc_range_get(struct snd_kcontrol *kcontrol, { struct tas *tas = snd_kcontrol_chip(kcontrol); + mutex_lock(&tas->mtx); ucontrol->value.integer.value[0] = tas->drc_range; + mutex_unlock(&tas->mtx); return 0; } @@ -384,12 +410,16 @@ static int tas_snd_drc_range_put(struct snd_kcontrol *kcontrol, { struct tas *tas = snd_kcontrol_chip(kcontrol); - if (tas->drc_range == ucontrol->value.integer.value[0]) + mutex_lock(&tas->mtx); + if (tas->drc_range == ucontrol->value.integer.value[0]) { + mutex_unlock(&tas->mtx); return 0; + } tas->drc_range = ucontrol->value.integer.value[0]; if (tas->hw_enabled) tas3004_set_drc(tas); + mutex_unlock(&tas->mtx); return 1; } @@ -417,7 +447,9 @@ static int tas_snd_drc_switch_get(struct snd_kcontrol *kcontrol, { struct tas *tas = snd_kcontrol_chip(kcontrol); + mutex_lock(&tas->mtx); ucontrol->value.integer.value[0] = tas->drc_enabled; + mutex_unlock(&tas->mtx); return 0; } @@ -426,12 +458,16 @@ static int tas_snd_drc_switch_put(struct snd_kcontrol *kcontrol, { struct tas *tas = snd_kcontrol_chip(kcontrol); - if (tas->drc_enabled == ucontrol->value.integer.value[0]) + mutex_lock(&tas->mtx); + if (tas->drc_enabled == ucontrol->value.integer.value[0]) { + mutex_unlock(&tas->mtx); return 0; + } tas->drc_enabled = ucontrol->value.integer.value[0]; if (tas->hw_enabled) tas3004_set_drc(tas); + mutex_unlock(&tas->mtx); return 1; } @@ -463,7 +499,9 @@ static int tas_snd_capture_source_get(struct snd_kcontrol *kcontrol, { struct tas *tas = snd_kcontrol_chip(kcontrol); + mutex_lock(&tas->mtx); ucontrol->value.enumerated.item[0] = !!(tas->acr & TAS_ACR_INPUT_B); + mutex_unlock(&tas->mtx); return 0; } @@ -471,15 +509,21 @@ static int tas_snd_capture_source_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { struct tas *tas = snd_kcontrol_chip(kcontrol); - int oldacr = tas->acr; + int oldacr; + + mutex_lock(&tas->mtx); + oldacr = tas->acr; tas->acr &= ~TAS_ACR_INPUT_B; if (ucontrol->value.enumerated.item[0]) tas->acr |= TAS_ACR_INPUT_B; - if (oldacr == tas->acr) + if (oldacr == tas->acr) { + mutex_unlock(&tas->mtx); return 0; + } if (tas->hw_enabled) tas_write_reg(tas, TAS_REG_ACR, 1, &tas->acr); + mutex_unlock(&tas->mtx); return 1; } @@ -518,7 +562,9 @@ static int tas_snd_treble_get(struct snd_kcontrol *kcontrol, { struct tas *tas = snd_kcontrol_chip(kcontrol); + mutex_lock(&tas->mtx); ucontrol->value.integer.value[0] = tas->treble; + mutex_unlock(&tas->mtx); return 0; } @@ -527,12 +573,16 @@ static int tas_snd_treble_put(struct snd_kcontrol *kcontrol, { struct tas *tas = snd_kcontrol_chip(kcontrol); - if (tas->treble == ucontrol->value.integer.value[0]) + mutex_lock(&tas->mtx); + if (tas->treble == ucontrol->value.integer.value[0]) { + mutex_unlock(&tas->mtx); return 0; + } tas->treble = ucontrol->value.integer.value[0]; if (tas->hw_enabled) tas_set_treble(tas); + mutex_unlock(&tas->mtx); return 1; } @@ -560,7 +610,9 @@ static int tas_snd_bass_get(struct snd_kcontrol *kcontrol, { struct tas *tas = snd_kcontrol_chip(kcontrol); + mutex_lock(&tas->mtx); ucontrol->value.integer.value[0] = tas->bass; + mutex_unlock(&tas->mtx); return 0; } @@ -569,12 +621,16 @@ static int tas_snd_bass_put(struct snd_kcontrol *kcontrol, { struct tas *tas = snd_kcontrol_chip(kcontrol); - if (tas->bass == ucontrol->value.integer.value[0]) + mutex_lock(&tas->mtx); + if (tas->bass == ucontrol->value.integer.value[0]) { + mutex_unlock(&tas->mtx); return 0; + } tas->bass = ucontrol->value.integer.value[0]; if (tas->hw_enabled) tas_set_bass(tas); + mutex_unlock(&tas->mtx); return 1; } @@ -628,16 +684,16 @@ static int tas_reset_init(struct tas *tas) tmp = TAS_MCS_SCLK64 | TAS_MCS_SPORT_MODE_I2S | TAS_MCS_SPORT_WL_24BIT; if (tas_write_reg(tas, TAS_REG_MCS, 1, &tmp)) - return -ENODEV; + goto outerr; tas->acr |= TAS_ACR_ANALOG_PDOWN | TAS_ACR_B_MONAUREAL | TAS_ACR_B_MON_SEL_RIGHT; if (tas_write_reg(tas, TAS_REG_ACR, 1, &tas->acr)) - return -ENODEV; + goto outerr; tmp = 0; if (tas_write_reg(tas, TAS_REG_MCS2, 1, &tmp)) - return -ENODEV; + goto outerr; tas3004_set_drc(tas); @@ -649,9 +705,11 @@ static int tas_reset_init(struct tas *tas) tas->acr &= ~TAS_ACR_ANALOG_PDOWN; if (tas_write_reg(tas, TAS_REG_ACR, 1, &tas->acr)) - return -ENODEV; + goto outerr; return 0; + outerr: + return -ENODEV; } static int tas_switch_clock(struct codec_info_item *cii, enum clock_switch clock) @@ -666,11 +724,13 @@ static int tas_switch_clock(struct codec_info_item *cii, enum clock_switch clock break; case CLOCK_SWITCH_SLAVE: /* Clocks are back, re-init the codec */ + mutex_lock(&tas->mtx); tas_reset_init(tas); tas_set_volume(tas); tas_set_mixer(tas); tas->hw_enabled = 1; tas->codec.gpio->methods->all_amps_restore(tas->codec.gpio); + mutex_unlock(&tas->mtx); break; default: /* doesn't happen as of now */ @@ -684,19 +744,23 @@ static int tas_switch_clock(struct codec_info_item *cii, enum clock_switch clock * our i2c device is suspended, and then take note of that! */ static int tas_suspend(struct tas *tas) { + mutex_lock(&tas->mtx); tas->hw_enabled = 0; tas->acr |= TAS_ACR_ANALOG_PDOWN; tas_write_reg(tas, TAS_REG_ACR, 1, &tas->acr); + mutex_unlock(&tas->mtx); return 0; } static int tas_resume(struct tas *tas) { /* reset codec */ + mutex_lock(&tas->mtx); tas_reset_init(tas); tas_set_volume(tas); tas_set_mixer(tas); tas->hw_enabled = 1; + mutex_unlock(&tas->mtx); return 0; } @@ -739,11 +803,14 @@ static int tas_init_codec(struct aoa_codec *codec) return -EINVAL; } + mutex_lock(&tas->mtx); if (tas_reset_init(tas)) { printk(KERN_ERR PFX "tas failed to initialise\n"); + mutex_unlock(&tas->mtx); return -ENXIO; } tas->hw_enabled = 1; + mutex_unlock(&tas->mtx); if (tas->codec.soundbus_dev->attach_codec(tas->codec.soundbus_dev, aoa_get_card(), @@ -822,6 +889,7 @@ static int tas_create(struct i2c_adapter *adapter, if (!tas) return -ENOMEM; + mutex_init(&tas->mtx); tas->i2c.driver = &tas_driver; tas->i2c.adapter = adapter; tas->i2c.addr = addr; @@ -850,6 +918,7 @@ static int tas_create(struct i2c_adapter *adapter, detach: i2c_detach_client(&tas->i2c); fail: + mutex_destroy(&tas->mtx); kfree(tas); return -EINVAL; } @@ -908,6 +977,7 @@ static int tas_i2c_detach(struct i2c_client *client) /* power down codec chip */ tas_write_reg(tas, TAS_REG_ACR, 1, &tmp); + mutex_destroy(&tas->mtx); kfree(tas); return 0; } -- cgit v0.10.2 From 783eaf4671a4f5a95102aedb5a45e1f8adab945c Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Sun, 17 Sep 2006 22:00:51 +0200 Subject: [ALSA] powermac - Fix Oops when conflicting with aoa driver Fixed Oops when conflictin with aoa driver due to lack of i2c initialization. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/ppc/keywest.c b/sound/ppc/keywest.c index 59482a4..272ae38 100644 --- a/sound/ppc/keywest.c +++ b/sound/ppc/keywest.c @@ -117,6 +117,9 @@ int __init snd_pmac_tumbler_post_init(void) { int err; + if (!keywest_ctx || !keywest_ctx->client) + return -ENXIO; + if ((err = keywest_ctx->init_client(keywest_ctx)) < 0) { snd_printk(KERN_ERR "tumbler: %i :cannot initialize the MCS\n", err); return err; -- cgit v0.10.2 From 2b1181ed83ee8b0afbf9ba3e4f789f00375b2a17 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Sun, 17 Sep 2006 22:02:22 +0200 Subject: [ALSA] Add missing compat ioctls for ALSA control API Added the missing 32bit-compat ioctl entries for ALSA control API (espcially for recent additions of TLV stuff). Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/core/control_compat.c b/sound/core/control_compat.c index 3c0161b..ab48962 100644 --- a/sound/core/control_compat.c +++ b/sound/core/control_compat.c @@ -407,6 +407,10 @@ static inline long snd_ctl_ioctl_compat(struct file *file, unsigned int cmd, uns case SNDRV_CTL_IOCTL_POWER_STATE: case SNDRV_CTL_IOCTL_ELEM_LOCK: case SNDRV_CTL_IOCTL_ELEM_UNLOCK: + case SNDRV_CTL_IOCTL_ELEM_REMOVE: + case SNDRV_CTL_IOCTL_TLV_READ: + case SNDRV_CTL_IOCTL_TLV_WRITE: + case SNDRV_CTL_IOCTL_TLV_COMMAND: return snd_ctl_ioctl(file, cmd, (unsigned long)argp); case SNDRV_CTL_IOCTL_ELEM_LIST32: return snd_ctl_elem_list_compat(ctl->card, argp); -- cgit v0.10.2 From 5720fddd62367bb44335ec83f6371ce91e9ead12 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Sun, 17 Sep 2006 22:04:17 +0200 Subject: [ALSA] hda-codec - Add device id for Motorola si3054-compatible codec Added the device id for Motorola si3054-compatible modem codec on a Gateway laptop. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_si3054.c b/sound/pci/hda/patch_si3054.c index 250242c..76ec3d7 100644 --- a/sound/pci/hda/patch_si3054.c +++ b/sound/pci/hda/patch_si3054.c @@ -298,6 +298,7 @@ struct hda_codec_preset snd_hda_preset_si3054[] = { { .id = 0x163c3055, .name = "Si3054", .patch = patch_si3054 }, { .id = 0x163c3155, .name = "Si3054", .patch = patch_si3054 }, { .id = 0x11c13026, .name = "Si3054", .patch = patch_si3054 }, + { .id = 0x10573057, .name = "Si3054", .patch = patch_si3054 }, {} }; -- cgit v0.10.2 From a922625126cc9bf593d801879a965b9f0eae6958 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Sun, 17 Sep 2006 22:05:54 +0200 Subject: [ALSA] hda-codec - Add vendor ids for Motorola and Conexant Added string entries for Motorola and Conexant vendor ids. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index ff29d0f..e69db04 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -51,8 +51,10 @@ struct hda_vendor_id { /* codec vendor labels */ static struct hda_vendor_id hda_vendor_ids[] = { { 0x10ec, "Realtek" }, + { 0x1057, "Motorola" }, { 0x11d4, "Analog Devices" }, { 0x13f6, "C-Media" }, + { 0x14f1, "Conexant" }, { 0x434d, "C-Media" }, { 0x8384, "SigmaTel" }, {} /* terminator */ -- cgit v0.10.2 From 33ef765131bcf82bc5fca3f25d8313fa4df93ce0 Mon Sep 17 00:00:00 2001 From: Nicolas Graziano Date: Tue, 19 Sep 2006 14:23:14 +0200 Subject: [ALSA] hda_intel prefer 24bit instead of 20bit If I understand the hda_intel code, for format > 20bit it only advertise the SNDRV_PCM_FMTBIT_S32_LE format and play it at 32 bit, 20 bit or 24 bit. But if the 20bit and 24bit are available, actually it prefer the 20bit format. This path is to prefer the 24bit format instead of 20bit. Signed-off-by: Nicolas Graziano Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index e69db04..8b2c080 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -1505,10 +1505,10 @@ int snd_hda_query_supported_pcm(struct hda_codec *codec, hda_nid_t nid, formats |= SNDRV_PCM_FMTBIT_S32_LE; if (val & AC_SUPPCM_BITS_32) bps = 32; - else if (val & AC_SUPPCM_BITS_20) - bps = 20; else if (val & AC_SUPPCM_BITS_24) bps = 24; + else if (val & AC_SUPPCM_BITS_20) + bps = 20; } } else if (streams == AC_SUPFMT_FLOAT32) { /* should be exclusive */ -- cgit v0.10.2 From eb06ed8f4c2440558ebf465e8baeac6367d90201 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Wed, 20 Sep 2006 17:10:27 +0200 Subject: [ALSA] hda-codec - Support multiple headphone pins Some machines have multiple headpohne pins (usually on the lpatop and on the docking station) while the current hda-codec driver assumes a single headphone pin. Now it supports multiple hp pins (at least for detection). The sigmatel 92xx code supports this new multiple hp pins. It detects all hp pins for auto-muting, too. Also, the driver checks speaker pins in addition. In some cases, all line-out, speaker and hp-pins coexist. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index 8b2c080..0736099 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -2012,7 +2012,7 @@ static int is_in_nid_list(hda_nid_t nid, hda_nid_t *list) * in the order of front, rear, CLFE, side, ... * * If more extra outputs (speaker and headphone) are found, the pins are - * assisnged to hp_pin and speaker_pins[], respectively. If no line-out jack + * assisnged to hp_pins[] and speaker_pins[], respectively. If no line-out jack * is detected, one of speaker of HP pins is assigned as the primary * output, i.e. to line_out_pins[0]. So, line_outs is always positive * if any analog output exists. @@ -2074,7 +2074,10 @@ int snd_hda_parse_pin_def_config(struct hda_codec *codec, struct auto_pin_cfg *c cfg->speaker_outs++; break; case AC_JACK_HP_OUT: - cfg->hp_pin = nid; + if (cfg->hp_outs >= ARRAY_SIZE(cfg->hp_pins)) + continue; + cfg->hp_pins[cfg->hp_outs] = nid; + cfg->hp_outs++; break; case AC_JACK_MIC_IN: if (loc == AC_JACK_LOC_FRONT) @@ -2147,8 +2150,10 @@ int snd_hda_parse_pin_def_config(struct hda_codec *codec, struct auto_pin_cfg *c cfg->speaker_outs, cfg->speaker_pins[0], cfg->speaker_pins[1], cfg->speaker_pins[2], cfg->speaker_pins[3], cfg->speaker_pins[4]); - snd_printd(" hp=0x%x, dig_out=0x%x, din_in=0x%x\n", - cfg->hp_pin, cfg->dig_out_pin, cfg->dig_in_pin); + snd_printd(" hp_outs=%d (0x%x/0x%x/0x%x/0x%x/0x%x)\n", + cfg->hp_outs, cfg->hp_pins[0], + cfg->hp_pins[1], cfg->hp_pins[2], + cfg->hp_pins[3], cfg->hp_pins[4]); snd_printd(" inputs: mic=0x%x, fmic=0x%x, line=0x%x, fline=0x%x," " cd=0x%x, aux=0x%x\n", cfg->input_pins[AUTO_PIN_MIC], @@ -2169,10 +2174,12 @@ int snd_hda_parse_pin_def_config(struct hda_codec *codec, struct auto_pin_cfg *c sizeof(cfg->speaker_pins)); cfg->speaker_outs = 0; memset(cfg->speaker_pins, 0, sizeof(cfg->speaker_pins)); - } else if (cfg->hp_pin) { - cfg->line_outs = 1; - cfg->line_out_pins[0] = cfg->hp_pin; - cfg->hp_pin = 0; + } else if (cfg->hp_outs) { + cfg->line_outs = cfg->hp_outs; + memcpy(cfg->line_out_pins, cfg->hp_pins, + sizeof(cfg->hp_pins)); + cfg->hp_outs = 0; + memset(cfg->hp_pins, 0, sizeof(cfg->hp_pins)); } } diff --git a/sound/pci/hda/hda_local.h b/sound/pci/hda/hda_local.h index ff24266..f9416c3 100644 --- a/sound/pci/hda/hda_local.h +++ b/sound/pci/hda/hda_local.h @@ -229,7 +229,8 @@ struct auto_pin_cfg { hda_nid_t line_out_pins[5]; /* sorted in the order of Front/Surr/CLFE/Side */ int speaker_outs; hda_nid_t speaker_pins[5]; - hda_nid_t hp_pin; + int hp_outs; + hda_nid_t hp_pins[5]; hda_nid_t input_pins[AUTO_PIN_LAST]; hda_nid_t dig_out_pin; hda_nid_t dig_in_pin; diff --git a/sound/pci/hda/patch_analog.c b/sound/pci/hda/patch_analog.c index 71abc2a..511df07 100644 --- a/sound/pci/hda/patch_analog.c +++ b/sound/pci/hda/patch_analog.c @@ -2471,7 +2471,7 @@ static void ad1988_auto_init_extra_out(struct hda_codec *codec) pin = spec->autocfg.speaker_pins[0]; if (pin) /* connect to front */ ad1988_auto_set_output_and_unmute(codec, pin, PIN_OUT, 0); - pin = spec->autocfg.hp_pin; + pin = spec->autocfg.hp_pins[0]; if (pin) /* connect to front */ ad1988_auto_set_output_and_unmute(codec, pin, PIN_HP, 0); } @@ -2523,7 +2523,7 @@ static int ad1988_parse_auto_config(struct hda_codec *codec) (err = ad1988_auto_create_extra_out(codec, spec->autocfg.speaker_pins[0], "Speaker")) < 0 || - (err = ad1988_auto_create_extra_out(codec, spec->autocfg.hp_pin, + (err = ad1988_auto_create_extra_out(codec, spec->autocfg.hp_pins[0], "Headphone")) < 0 || (err = ad1988_auto_create_analog_input_ctls(spec, &spec->autocfg)) < 0) return err; diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c index ba9e050..d08d2e3 100644 --- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -2753,7 +2753,7 @@ static void alc880_auto_init_extra_out(struct hda_codec *codec) pin = spec->autocfg.speaker_pins[0]; if (pin) /* connect to front */ alc880_auto_set_output_and_unmute(codec, pin, PIN_OUT, 0); - pin = spec->autocfg.hp_pin; + pin = spec->autocfg.hp_pins[0]; if (pin) /* connect to front */ alc880_auto_set_output_and_unmute(codec, pin, PIN_HP, 0); } @@ -2794,7 +2794,7 @@ static int alc880_parse_auto_config(struct hda_codec *codec) (err = alc880_auto_create_extra_out(spec, spec->autocfg.speaker_pins[0], "Speaker")) < 0 || - (err = alc880_auto_create_extra_out(spec, spec->autocfg.hp_pin, + (err = alc880_auto_create_extra_out(spec, spec->autocfg.hp_pins[0], "Headphone")) < 0 || (err = alc880_auto_create_analog_input_ctls(spec, &spec->autocfg)) < 0) return err; @@ -3736,7 +3736,7 @@ static int alc260_auto_create_multi_out_ctls(struct alc_spec *spec, return err; } - nid = cfg->hp_pin; + nid = cfg->hp_pins[0]; if (nid) { err = alc260_add_playback_controls(spec, nid, "Headphone"); if (err < 0) @@ -3806,7 +3806,7 @@ static void alc260_auto_init_multi_out(struct hda_codec *codec) if (nid) alc260_auto_set_output_and_unmute(codec, nid, PIN_OUT, 0); - nid = spec->autocfg.hp_pin; + nid = spec->autocfg.hp_pins[0]; if (nid) alc260_auto_set_output_and_unmute(codec, nid, PIN_OUT, 0); } @@ -4526,7 +4526,7 @@ static void alc882_auto_init_hp_out(struct hda_codec *codec) struct alc_spec *spec = codec->spec; hda_nid_t pin; - pin = spec->autocfg.hp_pin; + pin = spec->autocfg.hp_pins[0]; if (pin) /* connect to front */ alc882_auto_set_output_and_unmute(codec, pin, PIN_HP, 0); /* use dac 0 */ } @@ -5207,7 +5207,7 @@ static void alc883_auto_init_hp_out(struct hda_codec *codec) struct alc_spec *spec = codec->spec; hda_nid_t pin; - pin = spec->autocfg.hp_pin; + pin = spec->autocfg.hp_pins[0]; if (pin) /* connect to front */ /* use dac 0 */ alc883_auto_set_output_and_unmute(codec, pin, PIN_HP, 0); @@ -5630,7 +5630,7 @@ static int alc262_auto_create_multi_out_ctls(struct alc_spec *spec, const struct return err; } } - nid = cfg->hp_pin; + nid = cfg->hp_pins[0]; if (nid) { /* spec->multiout.hp_nid = 2; */ if (nid == 0x16) { @@ -6630,7 +6630,7 @@ static void alc861_auto_init_hp_out(struct hda_codec *codec) struct alc_spec *spec = codec->spec; hda_nid_t pin; - pin = spec->autocfg.hp_pin; + pin = spec->autocfg.hp_pins[0]; if (pin) /* connect to front */ alc861_auto_set_output_and_unmute(codec, pin, PIN_HP, spec->multiout.dac_nids[0]); } @@ -6665,7 +6665,7 @@ static int alc861_parse_auto_config(struct hda_codec *codec) if ((err = alc861_auto_fill_dac_nids(spec, &spec->autocfg)) < 0 || (err = alc861_auto_create_multi_out_ctls(spec, &spec->autocfg)) < 0 || - (err = alc861_auto_create_hp_ctls(spec, spec->autocfg.hp_pin)) < 0 || + (err = alc861_auto_create_hp_ctls(spec, spec->autocfg.hp_pins[0])) < 0 || (err = alc861_auto_create_analog_input_ctls(spec, &spec->autocfg)) < 0) return err; diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index bcbbe11..7cc0642 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -1011,11 +1011,29 @@ static int stac92xx_auto_fill_dac_nids(struct hda_codec *codec, return 0; } +/* create volume control/switch for the given prefx type */ +static int create_controls(struct sigmatel_spec *spec, const char *pfx, hda_nid_t nid, int chs) +{ + char name[32]; + int err; + + sprintf(name, "%s Playback Volume", pfx); + err = stac92xx_add_control(spec, STAC_CTL_WIDGET_VOL, name, + HDA_COMPOSE_AMP_VAL(nid, chs, 0, HDA_OUTPUT)); + if (err < 0) + return err; + sprintf(name, "%s Playback Switch", pfx); + err = stac92xx_add_control(spec, STAC_CTL_WIDGET_MUTE, name, + HDA_COMPOSE_AMP_VAL(nid, chs, 0, HDA_OUTPUT)); + if (err < 0) + return err; + return 0; +} + /* add playback controls from the parsed DAC table */ static int stac92xx_auto_create_multi_out_ctls(struct sigmatel_spec *spec, const struct auto_pin_cfg *cfg) { - char name[32]; static const char *chname[4] = { "Front", "Surround", NULL /*CLFE*/, "Side" }; @@ -1030,26 +1048,15 @@ static int stac92xx_auto_create_multi_out_ctls(struct sigmatel_spec *spec, if (i == 2) { /* Center/LFE */ - if ((err = stac92xx_add_control(spec, STAC_CTL_WIDGET_VOL, "Center Playback Volume", - HDA_COMPOSE_AMP_VAL(nid, 1, 0, HDA_OUTPUT))) < 0) + err = create_controls(spec, "Center", nid, 1); + if (err < 0) return err; - if ((err = stac92xx_add_control(spec, STAC_CTL_WIDGET_VOL, "LFE Playback Volume", - HDA_COMPOSE_AMP_VAL(nid, 2, 0, HDA_OUTPUT))) < 0) - return err; - if ((err = stac92xx_add_control(spec, STAC_CTL_WIDGET_MUTE, "Center Playback Switch", - HDA_COMPOSE_AMP_VAL(nid, 1, 0, HDA_OUTPUT))) < 0) - return err; - if ((err = stac92xx_add_control(spec, STAC_CTL_WIDGET_MUTE, "LFE Playback Switch", - HDA_COMPOSE_AMP_VAL(nid, 2, 0, HDA_OUTPUT))) < 0) + err = create_controls(spec, "LFE", nid, 2); + if (err < 0) return err; } else { - sprintf(name, "%s Playback Volume", chname[i]); - if ((err = stac92xx_add_control(spec, STAC_CTL_WIDGET_VOL, name, - HDA_COMPOSE_AMP_VAL(nid, 3, 0, HDA_OUTPUT))) < 0) - return err; - sprintf(name, "%s Playback Switch", chname[i]); - if ((err = stac92xx_add_control(spec, STAC_CTL_WIDGET_MUTE, name, - HDA_COMPOSE_AMP_VAL(nid, 3, 0, HDA_OUTPUT))) < 0) + err = create_controls(spec, chname[i], nid, 3); + if (err < 0) return err; } } @@ -1065,39 +1072,85 @@ static int stac92xx_auto_create_multi_out_ctls(struct sigmatel_spec *spec, return 0; } -/* add playback controls for HP output */ -static int stac92xx_auto_create_hp_ctls(struct hda_codec *codec, struct auto_pin_cfg *cfg) +static int check_in_dac_nids(struct sigmatel_spec *spec, hda_nid_t nid) { - struct sigmatel_spec *spec = codec->spec; - hda_nid_t pin = cfg->hp_pin; - hda_nid_t nid; - int i, err; - unsigned int wid_caps; + int i; - if (! pin) - return 0; + for (i = 0; i < spec->multiout.num_dacs; i++) { + if (spec->multiout.dac_nids[i] == nid) + return 1; + } + if (spec->multiout.hp_nid == nid) + return 1; + return 0; +} - wid_caps = get_wcaps(codec, pin); - if (wid_caps & AC_WCAP_UNSOL_CAP) - spec->hp_detect = 1; +static int add_spec_dacs(struct sigmatel_spec *spec, hda_nid_t nid) +{ + if (!spec->multiout.hp_nid) + spec->multiout.hp_nid = nid; + else if (spec->multiout.num_dacs > 4) { + printk(KERN_WARNING "stac92xx: No space for DAC 0x%x\n", nid); + return 1; + } else { + spec->multiout.dac_nids[spec->multiout.num_dacs] = nid; + spec->multiout.num_dacs++; + } + return 0; +} - nid = snd_hda_codec_read(codec, pin, 0, AC_VERB_GET_CONNECT_LIST, 0) & 0xff; - for (i = 0; i < cfg->line_outs; i++) { - if (! spec->multiout.dac_nids[i]) +/* add playback controls for Speaker and HP outputs */ +static int stac92xx_auto_create_hp_ctls(struct hda_codec *codec, + struct auto_pin_cfg *cfg) +{ + struct sigmatel_spec *spec = codec->spec; + hda_nid_t nid; + int i, old_num_dacs, err; + + old_num_dacs = spec->multiout.num_dacs; + for (i = 0; i < cfg->hp_outs; i++) { + unsigned int wid_caps = get_wcaps(codec, cfg->hp_pins[i]); + if (wid_caps & AC_WCAP_UNSOL_CAP) + spec->hp_detect = 1; + nid = snd_hda_codec_read(codec, cfg->hp_pins[i], 0, + AC_VERB_GET_CONNECT_LIST, 0) & 0xff; + if (check_in_dac_nids(spec, nid)) + nid = 0; + if (! nid) continue; - if (spec->multiout.dac_nids[i] == nid) - return 0; + add_spec_dacs(spec, nid); + } + for (i = 0; i < cfg->speaker_outs; i++) { + nid = snd_hda_codec_read(codec, cfg->speaker_pins[0], 0, + AC_VERB_GET_CONNECT_LIST, 0) & 0xff; + if (check_in_dac_nids(spec, nid)) + nid = 0; + if (check_in_dac_nids(spec, nid)) + nid = 0; + if (! nid) + continue; + add_spec_dacs(spec, nid); } - spec->multiout.hp_nid = nid; - - /* control HP volume/switch on the output mixer amp */ - if ((err = stac92xx_add_control(spec, STAC_CTL_WIDGET_VOL, "Headphone Playback Volume", - HDA_COMPOSE_AMP_VAL(nid, 3, 0, HDA_OUTPUT))) < 0) - return err; - if ((err = stac92xx_add_control(spec, STAC_CTL_WIDGET_MUTE, "Headphone Playback Switch", - HDA_COMPOSE_AMP_VAL(nid, 3, 0, HDA_OUTPUT))) < 0) - return err; + for (i = old_num_dacs; i < spec->multiout.num_dacs; i++) { + static const char *pfxs[] = { + "Speaker", "External Speaker", "Speaker2", + }; + err = create_controls(spec, pfxs[i - old_num_dacs], + spec->multiout.dac_nids[i], 3); + if (err < 0) + return err; + } + if (spec->multiout.hp_nid) { + const char *pfx; + if (old_num_dacs == spec->multiout.num_dacs) + pfx = "Master"; + else + pfx = "Headphone"; + err = create_controls(spec, pfx, spec->multiout.hp_nid, 3); + if (err < 0) + return err; + } return 0; } @@ -1160,11 +1213,20 @@ static void stac92xx_auto_init_multi_out(struct hda_codec *codec) static void stac92xx_auto_init_hp_out(struct hda_codec *codec) { struct sigmatel_spec *spec = codec->spec; - hda_nid_t pin; + int i; - pin = spec->autocfg.hp_pin; - if (pin) /* connect to front */ - stac92xx_auto_set_pinctl(codec, pin, AC_PINCTL_OUT_EN | AC_PINCTL_HP_EN); + for (i = 0; i < spec->autocfg.hp_outs; i++) { + hda_nid_t pin; + pin = spec->autocfg.hp_pins[i]; + if (pin) /* connect to front */ + stac92xx_auto_set_pinctl(codec, pin, AC_PINCTL_OUT_EN | AC_PINCTL_HP_EN); + } + for (i = 0; i < spec->autocfg.speaker_outs; i++) { + hda_nid_t pin; + pin = spec->autocfg.speaker_pins[i]; + if (pin) /* connect to front */ + stac92xx_auto_set_pinctl(codec, pin, AC_PINCTL_OUT_EN); + } } static int stac92xx_parse_auto_config(struct hda_codec *codec, hda_nid_t dig_out, hda_nid_t dig_in) @@ -1210,7 +1272,7 @@ static int stac9200_auto_create_hp_ctls(struct hda_codec *codec, struct auto_pin_cfg *cfg) { struct sigmatel_spec *spec = codec->spec; - hda_nid_t pin = cfg->hp_pin; + hda_nid_t pin = cfg->hp_pins[0]; unsigned int wid_caps; if (! pin) @@ -1266,16 +1328,7 @@ static int stac9200_auto_create_lfe_ctls(struct hda_codec *codec, } if (lfe_pin) { - err = stac92xx_add_control(spec, STAC_CTL_WIDGET_VOL, - "LFE Playback Volume", - HDA_COMPOSE_AMP_VAL(lfe_pin, 1, 0, - HDA_OUTPUT)); - if (err < 0) - return err; - err = stac92xx_add_control(spec, STAC_CTL_WIDGET_MUTE, - "LFE Playback Switch", - HDA_COMPOSE_AMP_VAL(lfe_pin, 1, 0, - HDA_OUTPUT)); + err = create_controls(spec, "LFE", lfe_pin, 1); if (err < 0) return err; } @@ -1363,9 +1416,11 @@ static int stac92xx_init(struct hda_codec *codec) /* set up pins */ if (spec->hp_detect) { /* Enable unsolicited responses on the HP widget */ - snd_hda_codec_write(codec, cfg->hp_pin, 0, - AC_VERB_SET_UNSOLICITED_ENABLE, - STAC_UNSOL_ENABLE); + for (i = 0; i < cfg->hp_outs; i++) + if (get_wcaps(codec, cfg->hp_pins[i]) & AC_WCAP_UNSOL_CAP) + snd_hda_codec_write(codec, cfg->hp_pins[i], 0, + AC_VERB_SET_UNSOLICITED_ENABLE, + STAC_UNSOL_ENABLE); /* fake event to set up pins */ codec->patch_ops.unsol_event(codec, STAC_HP_EVENT << 26); /* enable the headphones by default. If/when unsol_event detection works, this will be ignored */ @@ -1447,21 +1502,36 @@ static void stac92xx_unsol_event(struct hda_codec *codec, unsigned int res) if ((res >> 26) != STAC_HP_EVENT) return; - presence = snd_hda_codec_read(codec, cfg->hp_pin, 0, - AC_VERB_GET_PIN_SENSE, 0x00) >> 31; + presence = 0; + for (i = 0; i < cfg->hp_outs; i++) { + int p = snd_hda_codec_read(codec, cfg->hp_pins[i], 0, + AC_VERB_GET_PIN_SENSE, 0x00); + if (p & (1 << 31)) + presence++; + } if (presence) { /* disable lineouts, enable hp */ for (i = 0; i < cfg->line_outs; i++) stac92xx_reset_pinctl(codec, cfg->line_out_pins[i], AC_PINCTL_OUT_EN); - stac92xx_set_pinctl(codec, cfg->hp_pin, AC_PINCTL_OUT_EN); + for (i = 0; i < cfg->speaker_outs; i++) + stac92xx_reset_pinctl(codec, cfg->speaker_pins[i], + AC_PINCTL_OUT_EN); + for (i = 0; i < cfg->hp_outs; i++) + stac92xx_set_pinctl(codec, cfg->hp_pins[i], + AC_PINCTL_OUT_EN); } else { /* enable lineouts, disable hp */ for (i = 0; i < cfg->line_outs; i++) stac92xx_set_pinctl(codec, cfg->line_out_pins[i], AC_PINCTL_OUT_EN); - stac92xx_reset_pinctl(codec, cfg->hp_pin, AC_PINCTL_OUT_EN); + for (i = 0; i < cfg->speaker_outs; i++) + stac92xx_set_pinctl(codec, cfg->speaker_pins[i], + AC_PINCTL_OUT_EN); + for (i = 0; i < cfg->hp_outs; i++) + stac92xx_reset_pinctl(codec, cfg->hp_pins[i], + AC_PINCTL_OUT_EN); } } -- cgit v0.10.2 From e6f8f108a19638d7c6535ab393a228ed9d4804a6 Mon Sep 17 00:00:00 2001 From: Josef 'Jeff' Sipek Date: Thu, 21 Sep 2006 11:31:58 +0200 Subject: [ALSA] sound core: Use SEEK_{SET,CUR,END} instead of hardcoded values sound core: Use SEEK_{SET,CUR,END} instead of hardcoded values Signed-off-by: Josef 'Jeff' Sipek Signed-off-by: Andrew Morton Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/core/info.c b/sound/core/info.c index 9663b6b..e43662b 100644 --- a/sound/core/info.c +++ b/sound/core/info.c @@ -175,15 +175,15 @@ static loff_t snd_info_entry_llseek(struct file *file, loff_t offset, int orig) switch (entry->content) { case SNDRV_INFO_CONTENT_TEXT: switch (orig) { - case 0: /* SEEK_SET */ + case SEEK_SET: file->f_pos = offset; ret = file->f_pos; goto out; - case 1: /* SEEK_CUR */ + case SEEK_CUR: file->f_pos += offset; ret = file->f_pos; goto out; - case 2: /* SEEK_END */ + case SEEK_END: default: ret = -EINVAL; goto out; -- cgit v0.10.2 From dd47a33806bfe93c08b071c4d26a2390cbbc9e65 Mon Sep 17 00:00:00 2001 From: Josef 'Jeff' Sipek Date: Thu, 21 Sep 2006 11:32:43 +0200 Subject: [ALSA] opl4: Use SEEK_{SET,CUR,END} instead of hardcoded values opl4: Use SEEK_{SET,CUR,END} instead of hardcoded values Signed-off-by: Josef 'Jeff' Sipek Signed-off-by: Andrew Morton Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/drivers/opl4/opl4_proc.c b/sound/drivers/opl4/opl4_proc.c index 11dd811..1679300 100644 --- a/sound/drivers/opl4/opl4_proc.c +++ b/sound/drivers/opl4/opl4_proc.c @@ -105,13 +105,13 @@ static long long snd_opl4_mem_proc_llseek(struct snd_info_entry *entry, void *fi struct file *file, long long offset, int orig) { switch (orig) { - case 0: /* SEEK_SET */ + case SEEK_SET: file->f_pos = offset; break; - case 1: /* SEEK_CUR */ + case SEEK_CUR: file->f_pos += offset; break; - case 2: /* SEEK_END, offset is negative */ + case SEEK_END: /* offset is negative */ file->f_pos = entry->size + offset; break; default: -- cgit v0.10.2 From d158da81ee9a1fa70d980f58b0f143fa873ca9ed Mon Sep 17 00:00:00 2001 From: Josef 'Jeff' Sipek Date: Thu, 21 Sep 2006 11:33:14 +0200 Subject: [ALSA] gus: Use SEEK_{SET,CUR,END} instead of hardcoded values gus: Use SEEK_{SET,CUR,END} instead of hardcoded values Signed-off-by: Josef 'Jeff' Sipek Signed-off-by: Andrew Morton Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/isa/gus/gus_mem_proc.c b/sound/isa/gus/gus_mem_proc.c index 4080255..80f0a83 100644 --- a/sound/isa/gus/gus_mem_proc.c +++ b/sound/isa/gus/gus_mem_proc.c @@ -61,13 +61,13 @@ static long long snd_gf1_mem_proc_llseek(struct snd_info_entry *entry, struct gus_proc_private *priv = entry->private_data; switch (orig) { - case 0: /* SEEK_SET */ + case SEEK_SET: file->f_pos = offset; break; - case 1: /* SEEK_CUR */ + case SEEK_CUR: file->f_pos += offset; break; - case 2: /* SEEK_END, offset is negative */ + case SEEK_END: /* offset is negative */ file->f_pos = priv->size + offset; break; default: -- cgit v0.10.2 From 7ffffecc7c4df08ad89723ca32d936ff09b5b3ff Mon Sep 17 00:00:00 2001 From: Josef 'Jeff' Sipek Date: Thu, 21 Sep 2006 11:33:42 +0200 Subject: [ALSA] mixart: Use SEEK_{SET,CUR,END} instead of hardcoded values mixart: Use SEEK_{SET,CUR,END} instead of hardcoded values Signed-off-by: Josef 'Jeff' Sipek Signed-off-by: Andrew Morton Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/mixart/mixart.c b/sound/pci/mixart/mixart.c index cc43ecd..216aee5 100644 --- a/sound/pci/mixart/mixart.c +++ b/sound/pci/mixart/mixart.c @@ -1109,13 +1109,13 @@ static long long snd_mixart_BA0_llseek(struct snd_info_entry *entry, offset = offset & ~3; /* 4 bytes aligned */ switch(orig) { - case 0: /* SEEK_SET */ + case SEEK_SET: file->f_pos = offset; break; - case 1: /* SEEK_CUR */ + case SEEK_CUR: file->f_pos += offset; break; - case 2: /* SEEK_END, offset is negative */ + case SEEK_END: /* offset is negative */ file->f_pos = MIXART_BA0_SIZE + offset; break; default: @@ -1135,13 +1135,13 @@ static long long snd_mixart_BA1_llseek(struct snd_info_entry *entry, offset = offset & ~3; /* 4 bytes aligned */ switch(orig) { - case 0: /* SEEK_SET */ + case SEEK_SET: file->f_pos = offset; break; - case 1: /* SEEK_CUR */ + case SEEK_CUR: file->f_pos += offset; break; - case 2: /* SEEK_END, offset is negative */ + case SEEK_END: /* offset is negative */ file->f_pos = MIXART_BA1_SIZE + offset; break; default: -- cgit v0.10.2 From 314634bc81325dcfeb31ed138647d428b1f26cbf Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 21 Sep 2006 11:56:18 +0200 Subject: [ALSA] hda-codec - Fix mic input with STAC92xx codecs Fixed mic input with STAC92xx codecs. The mic pin was sometimes set to OUTPUT by the headphone jack detection. Also, try to assign a secondary mic as front-mic (or vice versa) in the auto-detection if possible. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_codec.c b/sound/pci/hda/hda_codec.c index 0736099..9c3d7ac 100644 --- a/sound/pci/hda/hda_codec.c +++ b/sound/pci/hda/hda_codec.c @@ -2079,12 +2079,21 @@ int snd_hda_parse_pin_def_config(struct hda_codec *codec, struct auto_pin_cfg *c cfg->hp_pins[cfg->hp_outs] = nid; cfg->hp_outs++; break; - case AC_JACK_MIC_IN: - if (loc == AC_JACK_LOC_FRONT) - cfg->input_pins[AUTO_PIN_FRONT_MIC] = nid; - else - cfg->input_pins[AUTO_PIN_MIC] = nid; + case AC_JACK_MIC_IN: { + int preferred, alt; + if (loc == AC_JACK_LOC_FRONT) { + preferred = AUTO_PIN_FRONT_MIC; + alt = AUTO_PIN_MIC; + } else { + preferred = AUTO_PIN_MIC; + alt = AUTO_PIN_FRONT_MIC; + } + if (!cfg->input_pins[preferred]) + cfg->input_pins[preferred] = nid; + else if (!cfg->input_pins[alt]) + cfg->input_pins[alt] = nid; break; + } case AC_JACK_LINE_IN: if (loc == AC_JACK_LOC_FRONT) cfg->input_pins[AUTO_PIN_FRONT_LINE] = nid; diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index 7cc0642..92f48a7 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -36,7 +36,6 @@ #define NUM_CONTROL_ALLOC 32 #define STAC_HP_EVENT 0x37 -#define STAC_UNSOL_ENABLE (AC_USRSP_EN | STAC_HP_EVENT) #define STAC_REF 0 #define STAC_D945GTP3 1 @@ -1164,23 +1163,28 @@ static int stac92xx_auto_create_analog_input_ctls(struct hda_codec *codec, const int i, j, k; for (i = 0; i < AUTO_PIN_LAST; i++) { - int index = -1; - if (cfg->input_pins[i]) { - imux->items[imux->num_items].label = auto_pin_cfg_labels[i]; - - for (j=0; jnum_muxes; j++) { - int num_cons = snd_hda_get_connections(codec, spec->mux_nids[j], con_lst, HDA_MAX_NUM_INPUTS); - for (k=0; kinput_pins[i]) { - index = k; - break; - } - if (index >= 0) - break; - } - imux->items[imux->num_items].index = index; - imux->num_items++; + int index; + + if (!cfg->input_pins[i]) + continue; + index = -1; + for (j = 0; j < spec->num_muxes; j++) { + int num_cons; + num_cons = snd_hda_get_connections(codec, + spec->mux_nids[j], + con_lst, + HDA_MAX_NUM_INPUTS); + for (k = 0; k < num_cons; k++) + if (con_lst[k] == cfg->input_pins[i]) { + index = k; + goto found; + } } + continue; + found: + imux->items[imux->num_items].label = auto_pin_cfg_labels[i]; + imux->items[imux->num_items].index = index; + imux->num_items++; } if (imux->num_items == 1) { @@ -1405,6 +1409,15 @@ static void stac922x_gpio_mute(struct hda_codec *codec, int pin, int muted) AC_VERB_SET_GPIO_DATA, gpiostate); } +static void enable_pin_detect(struct hda_codec *codec, hda_nid_t nid, + unsigned int event) +{ + if (get_wcaps(codec, nid) & AC_WCAP_UNSOL_CAP) + snd_hda_codec_write(codec, nid, 0, + AC_VERB_SET_UNSOLICITED_ENABLE, + (AC_USRSP_EN | event)); +} + static int stac92xx_init(struct hda_codec *codec) { struct sigmatel_spec *spec = codec->spec; @@ -1417,13 +1430,13 @@ static int stac92xx_init(struct hda_codec *codec) if (spec->hp_detect) { /* Enable unsolicited responses on the HP widget */ for (i = 0; i < cfg->hp_outs; i++) - if (get_wcaps(codec, cfg->hp_pins[i]) & AC_WCAP_UNSOL_CAP) - snd_hda_codec_write(codec, cfg->hp_pins[i], 0, - AC_VERB_SET_UNSOLICITED_ENABLE, - STAC_UNSOL_ENABLE); + enable_pin_detect(codec, cfg->hp_pins[i], + STAC_HP_EVENT); /* fake event to set up pins */ codec->patch_ops.unsol_event(codec, STAC_HP_EVENT << 26); - /* enable the headphones by default. If/when unsol_event detection works, this will be ignored */ + /* enable the headphones by default. + * If/when unsol_event detection works, this will be ignored + */ stac92xx_auto_init_hp_out(codec); } else { stac92xx_auto_init_multi_out(codec); @@ -1478,6 +1491,8 @@ static void stac92xx_set_pinctl(struct hda_codec *codec, hda_nid_t nid, { unsigned int pin_ctl = snd_hda_codec_read(codec, nid, 0, AC_VERB_GET_PIN_WIDGET_CONTROL, 0x00); + if (flag == AC_PINCTL_OUT_EN && (pin_ctl & AC_PINCTL_IN_EN)) + return; snd_hda_codec_write(codec, nid, 0, AC_VERB_SET_PIN_WIDGET_CONTROL, pin_ctl | flag); @@ -1493,21 +1508,27 @@ static void stac92xx_reset_pinctl(struct hda_codec *codec, hda_nid_t nid, pin_ctl & ~flag); } -static void stac92xx_unsol_event(struct hda_codec *codec, unsigned int res) +static int get_pin_presence(struct hda_codec *codec, hda_nid_t nid) +{ + if (!nid) + return 0; + if (snd_hda_codec_read(codec, nid, 0, AC_VERB_GET_PIN_SENSE, 0x00) + & (1 << 31)) + return 1; + return 0; +} + +static void stac92xx_hp_detect(struct hda_codec *codec, unsigned int res) { struct sigmatel_spec *spec = codec->spec; struct auto_pin_cfg *cfg = &spec->autocfg; int i, presence; - if ((res >> 26) != STAC_HP_EVENT) - return; - presence = 0; for (i = 0; i < cfg->hp_outs; i++) { - int p = snd_hda_codec_read(codec, cfg->hp_pins[i], 0, - AC_VERB_GET_PIN_SENSE, 0x00); - if (p & (1 << 31)) - presence++; + presence = get_pin_presence(codec, cfg->hp_pins[i]); + if (presence) + break; } if (presence) { @@ -1535,6 +1556,15 @@ static void stac92xx_unsol_event(struct hda_codec *codec, unsigned int res) } } +static void stac92xx_unsol_event(struct hda_codec *codec, unsigned int res) +{ + switch (res >> 26) { + case STAC_HP_EVENT: + stac92xx_hp_detect(codec, res); + break; + } +} + #ifdef CONFIG_PM static int stac92xx_resume(struct hda_codec *codec) { -- cgit v0.10.2 From 5c79b1f887f8edcd399baa164b66a1c08566c994 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 21 Sep 2006 13:34:13 +0200 Subject: [ALSA] hda-intel - A slight cleanup of timeout check in azx_get_response() A slight cleanup of timeout check in azx_get_response() to check jiffies for HZ-independent timeout. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c index 4d2df77..e9d4cb4 100644 --- a/sound/pci/hda/hda_intel.c +++ b/sound/pci/hda/hda_intel.c @@ -520,38 +520,36 @@ static void azx_update_rirb(struct azx *chip) static unsigned int azx_rirb_get_response(struct hda_codec *codec) { struct azx *chip = codec->bus->private_data; - int timeout = 50; + unsigned long timeout; - for (;;) { + again: + timeout = jiffies + msecs_to_jiffies(1000); + do { if (chip->polling_mode) { spin_lock_irq(&chip->reg_lock); azx_update_rirb(chip); spin_unlock_irq(&chip->reg_lock); } if (! chip->rirb.cmds) - break; - if (! --timeout) { - if (! chip->polling_mode) { - snd_printk(KERN_WARNING "hda_intel: " - "azx_get_response timeout, " - "switching to polling mode...\n"); - chip->polling_mode = 1; - timeout = 50; - continue; - } - snd_printk(KERN_ERR - "hda_intel: azx_get_response timeout, " - "switching to single_cmd mode...\n"); - chip->rirb.rp = azx_readb(chip, RIRBWP); - chip->rirb.cmds = 0; - /* switch to single_cmd mode */ - chip->single_cmd = 1; - azx_free_cmd_io(chip); - return -1; - } - msleep(1); + return chip->rirb.res; /* the last value */ + schedule_timeout_interruptible(1); + } while (time_after_eq(timeout, jiffies)); + + if (!chip->polling_mode) { + snd_printk(KERN_WARNING "hda_intel: azx_get_response timeout, " + "switching to polling mode...\n"); + chip->polling_mode = 1; + goto again; } - return chip->rirb.res; /* the last value */ + + snd_printk(KERN_ERR "hda_intel: azx_get_response timeout, " + "switching to single_cmd mode...\n"); + chip->rirb.rp = azx_readb(chip, RIRBWP); + chip->rirb.cmds = 0; + /* switch to single_cmd mode */ + chip->single_cmd = 1; + azx_free_cmd_io(chip); + return -1; } /* -- cgit v0.10.2 From eb995a8c82dba4a8e027c99ac5001fbc287a115c Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Thu, 21 Sep 2006 14:28:21 +0200 Subject: [ALSA] hda-codec - Fix headphone auto-toggle on sigmatel codec Fix/optimize the headphone auto-toggle function on sigmatel codecs. The headphone pins are kept as output. When headhpones are unplugged, you cannot hear anyway ;) Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/hda/patch_sigmatel.c b/sound/pci/hda/patch_sigmatel.c index 92f48a7..731b7b9 100644 --- a/sound/pci/hda/patch_sigmatel.c +++ b/sound/pci/hda/patch_sigmatel.c @@ -1432,12 +1432,9 @@ static int stac92xx_init(struct hda_codec *codec) for (i = 0; i < cfg->hp_outs; i++) enable_pin_detect(codec, cfg->hp_pins[i], STAC_HP_EVENT); + stac92xx_auto_init_hp_out(codec); /* fake event to set up pins */ codec->patch_ops.unsol_event(codec, STAC_HP_EVENT << 26); - /* enable the headphones by default. - * If/when unsol_event detection works, this will be ignored - */ - stac92xx_auto_init_hp_out(codec); } else { stac92xx_auto_init_multi_out(codec); stac92xx_auto_init_hp_out(codec); @@ -1539,9 +1536,6 @@ static void stac92xx_hp_detect(struct hda_codec *codec, unsigned int res) for (i = 0; i < cfg->speaker_outs; i++) stac92xx_reset_pinctl(codec, cfg->speaker_pins[i], AC_PINCTL_OUT_EN); - for (i = 0; i < cfg->hp_outs; i++) - stac92xx_set_pinctl(codec, cfg->hp_pins[i], - AC_PINCTL_OUT_EN); } else { /* enable lineouts, disable hp */ for (i = 0; i < cfg->line_outs; i++) @@ -1550,9 +1544,6 @@ static void stac92xx_hp_detect(struct hda_codec *codec, unsigned int res) for (i = 0; i < cfg->speaker_outs; i++) stac92xx_set_pinctl(codec, cfg->speaker_pins[i], AC_PINCTL_OUT_EN); - for (i = 0; i < cfg->hp_outs; i++) - stac92xx_reset_pinctl(codec, cfg->hp_pins[i], - AC_PINCTL_OUT_EN); } } -- cgit v0.10.2 From 92b9ac78f934616d08c72747607bfb0fa51ee52d Mon Sep 17 00:00:00 2001 From: Clemens Ladisch Date: Fri, 22 Sep 2006 10:57:36 +0200 Subject: [ALSA] usb-audio: increase number of packets per URB To decrease the USB interrupts rate, increase both the default and the maximum number of packets per URB. Signed-off-by: Clemens Ladisch Signed-off-by: Jaroslav Kysela diff --git a/sound/usb/usbaudio.c b/sound/usb/usbaudio.c index 664dd4c..49248fa 100644 --- a/sound/usb/usbaudio.c +++ b/sound/usb/usbaudio.c @@ -68,7 +68,7 @@ static char *id[SNDRV_CARDS] = SNDRV_DEFAULT_STR; /* ID for this card */ static int enable[SNDRV_CARDS] = SNDRV_DEFAULT_ENABLE_PNP; /* Enable this card */ static int vid[SNDRV_CARDS] = { [0 ... (SNDRV_CARDS-1)] = -1 }; /* Vendor ID for this card */ static int pid[SNDRV_CARDS] = { [0 ... (SNDRV_CARDS-1)] = -1 }; /* Product ID for this card */ -static int nrpacks = 4; /* max. number of packets per urb */ +static int nrpacks = 8; /* max. number of packets per urb */ static int async_unlink = 1; static int device_setup[SNDRV_CARDS]; /* device parameter for this card*/ @@ -100,7 +100,7 @@ MODULE_PARM_DESC(device_setup, "Specific device setup (if needed)."); * */ -#define MAX_PACKS 10 +#define MAX_PACKS 20 #define MAX_PACKS_HS (MAX_PACKS * 8) /* in high speed mode */ #define MAX_URBS 8 #define SYNC_URBS 4 /* always four urbs for sync */ -- cgit v0.10.2 From dbf91dd47d90e1d91d5daf37ca30728f4e11c5e3 Mon Sep 17 00:00:00 2001 From: Clemens Ladisch Date: Fri, 22 Sep 2006 10:58:40 +0200 Subject: [ALSA] ES1938: remove duplicate field initialization Remove the duplicate and inconsistent initialization of the kcontrol access field. Signed-off-by: Clemens Ladisch Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/es1938.c b/sound/pci/es1938.c index 3784088..3ce5a4e 100644 --- a/sound/pci/es1938.c +++ b/sound/pci/es1938.c @@ -1359,10 +1359,9 @@ ES1938_DOUBLE("Master Playback Switch", 0, 0x60, 0x62, 6, 6, 1, 1), }, { .iface = SNDRV_CTL_ELEM_IFACE_MIXER, - .access = (SNDRV_CTL_ELEM_ACCESS_READWRITE | + .access = (SNDRV_CTL_ELEM_ACCESS_READ | SNDRV_CTL_ELEM_ACCESS_TLV_READ), .name = "Hardware Master Playback Switch", - .access = SNDRV_CTL_ELEM_ACCESS_READ, .info = snd_es1938_info_hw_switch, .get = snd_es1938_get_hw_switch, .tlv = { .p = db_scale_master }, -- cgit v0.10.2 From fef8a0c03daa1aaf3f83e45da2b14674c073a9f5 Mon Sep 17 00:00:00 2001 From: Clemens Ladisch Date: Fri, 22 Sep 2006 11:00:51 +0200 Subject: [ALSA] usb-audio: add mixer control names for the Aureon 5.1 MkII Add a mixer name map for the TerraTec Aureon 5.1 MkII USB. Signed-off-by: Clemens Ladisch Signed-off-by: Jaroslav Kysela diff --git a/sound/usb/usbmixer_maps.c b/sound/usb/usbmixer_maps.c index 37accb6..7c4dcb3 100644 --- a/sound/usb/usbmixer_maps.c +++ b/sound/usb/usbmixer_maps.c @@ -234,6 +234,26 @@ static struct usbmix_name_map justlink_map[] = { { 0 } /* terminator */ }; +/* TerraTec Aureon 5.1 MkII USB */ +static struct usbmix_name_map aureon_51_2_map[] = { + /* 1: IT USB */ + /* 2: IT Mic */ + /* 3: IT Line */ + /* 4: IT SPDIF */ + /* 5: OT SPDIF */ + /* 6: OT Speaker */ + /* 7: OT USB */ + { 8, "Capture Source" }, /* SU */ + { 9, "Master Playback" }, /* FU */ + { 10, "Mic Capture" }, /* FU */ + { 11, "Line Capture" }, /* FU */ + { 12, "IEC958 In Capture" }, /* FU */ + { 13, "Mic Playback" }, /* FU */ + { 14, "Line Playback" }, /* FU */ + /* 15: MU */ + {} /* terminator */ +}; + /* * Control map entries */ @@ -276,6 +296,10 @@ static struct usbmix_ctl_map usbmix_ctl_maps[] = { .id = USB_ID(0x0c45, 0x1158), .map = justlink_map, }, + { + .id = USB_ID(0x0ccd, 0x0028), + .map = aureon_51_2_map, + }, { 0 } /* terminator */ }; -- cgit v0.10.2 From 8b0c4149e82170ebc44b96e9ed96545f8ebd7c81 Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 22 Sep 2006 15:27:55 +0200 Subject: [ALSA] Move CONFIG_SND_AC97_POWER_SAVE to pci/Kconfig Moved the entry of CONFIG_SND_AC97_POWER_SAVE from drivers/Kconfig to more appropriate place, pci/Kconfig. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/drivers/Kconfig b/sound/drivers/Kconfig index 952c7f1..7971285 100644 --- a/sound/drivers/Kconfig +++ b/sound/drivers/Kconfig @@ -113,17 +113,4 @@ config SND_MPU401 To compile this driver as a module, choose M here: the module will be called snd-mpu401. -config SND_AC97_POWER_SAVE - bool "AC97 Power-Saving Mode" - depends on SND_AC97_CODEC && EXPERIMENTAL - default n - help - Say Y here to enable the aggressive power-saving support of - AC97 codecs. In this mode, the power-mode is dynamically - controlled at each open/close. - - The mode is activated by passing power_save=1 option to - snd-ac97-codec driver. You can toggle it dynamically over - sysfs, too. - endmenu diff --git a/sound/pci/Kconfig b/sound/pci/Kconfig index dffb6be..8a6b180 100644 --- a/sound/pci/Kconfig +++ b/sound/pci/Kconfig @@ -744,4 +744,17 @@ config SND_YMFPCI To compile this driver as a module, choose M here: the module will be called snd-ymfpci. +config SND_AC97_POWER_SAVE + bool "AC97 Power-Saving Mode" + depends on SND_AC97_CODEC && EXPERIMENTAL + default n + help + Say Y here to enable the aggressive power-saving support of + AC97 codecs. In this mode, the power-mode is dynamically + controlled at each open/close. + + The mode is activated by passing power_save=1 option to + snd-ac97-codec driver. You can toggle it dynamically over + sysfs, too. + endmenu -- cgit v0.10.2 From f0063c4489a00ed5395378ef80a7edea4272f20b Mon Sep 17 00:00:00 2001 From: Takashi Iwai Date: Fri, 22 Sep 2006 15:30:42 +0200 Subject: [ALSA] intel8x0m - Free irq in suspend Free the irq handler in suspend and reacquire in resume as well as intel8x0 audio driver does. Some devices may change the irq line dynamically during suspend/resume. Signed-off-by: Takashi Iwai Signed-off-by: Jaroslav Kysela diff --git a/sound/pci/intel8x0m.c b/sound/pci/intel8x0m.c index 9185028..268e2f7 100644 --- a/sound/pci/intel8x0m.c +++ b/sound/pci/intel8x0m.c @@ -1045,6 +1045,8 @@ static int intel8x0m_suspend(struct pci_dev *pci, pm_message_t state) for (i = 0; i < chip->pcm_devs; i++) snd_pcm_suspend_all(chip->pcm[i]); snd_ac97_suspend(chip->ac97); + if (chip->irq >= 0) + free_irq(chip->irq, chip); pci_disable_device(pci); pci_save_state(pci); return 0; @@ -1058,6 +1060,9 @@ static int intel8x0m_resume(struct pci_dev *pci) pci_restore_state(pci); pci_enable_device(pci); pci_set_master(pci); + request_irq(pci->irq, snd_intel8x0_interrupt, IRQF_DISABLED|IRQF_SHARED, + card->shortname, chip); + chip->irq = pci->irq; snd_intel8x0_chip_init(chip, 0); snd_ac97_resume(chip->ac97); -- cgit v0.10.2 From 892e4fba1cb5cdc70f3acc65e024e541c0b2d559 Mon Sep 17 00:00:00 2001 From: David Woodhouse Date: Sat, 23 Sep 2006 10:24:36 +0100 Subject: [MTD] Fix dependencies with CONFIG_MTD=m CMDLINEPARTS shouldn't be selectable, and neither should SSFDC, which can be a tristate anyway. Signed-off-by: David Woodhouse diff --git a/drivers/mtd/Kconfig b/drivers/mtd/Kconfig index 717e904..a03e862 100644 --- a/drivers/mtd/Kconfig +++ b/drivers/mtd/Kconfig @@ -101,7 +101,7 @@ config MTD_REDBOOT_PARTS_READONLY config MTD_CMDLINE_PARTS bool "Command line partition table parsing" - depends on MTD_PARTITIONS = "y" + depends on MTD_PARTITIONS = "y" && MTD = "y" ---help--- Allow generic configuration of the MTD partition tables via the kernel command line. Multiple flash resources are supported for hardware where @@ -264,7 +264,7 @@ config RFD_FTL http://www.gensw.com/pages/prod/bios/rfd.htm config SSFDC - bool "NAND SSFDC (SmartMedia) read only translation layer" + tristate "NAND SSFDC (SmartMedia) read only translation layer" depends on MTD default n help -- cgit v0.10.2 From 9a05eded5d17a425b9d9ed9dd80f518429dde4e8 Mon Sep 17 00:00:00 2001 From: David Woodhouse Date: Sat, 23 Sep 2006 10:56:24 +0100 Subject: [MTD] SSFDC translation layer minor cleanup Don't include . Don't say 'MB' where you mean 'MiB'. Don't allocate 512 bytes on the stack. Signed-off-by: David Woodhouse diff --git a/drivers/mtd/ssfdc.c b/drivers/mtd/ssfdc.c index ddbf015..cf60a5e 100644 --- a/drivers/mtd/ssfdc.c +++ b/drivers/mtd/ssfdc.c @@ -10,7 +10,6 @@ * published by the Free Software Foundation. */ -#include #include #include #include @@ -29,7 +28,7 @@ struct ssfdcr_record { int cis_block; /* block n. containing CIS/IDI */ int erase_size; /* phys_block_size */ unsigned short *logic_block_map; /* all zones (max 8192 phys blocks on - the 128MB) */ + the 128MiB) */ int map_len; /* n. phys_blocks on the card */ }; @@ -43,11 +42,11 @@ struct ssfdcr_record { #define MAX_LOGIC_BLK_PER_ZONE 1000 #define MAX_PHYS_BLK_PER_ZONE 1024 -#define KB(x) ( (x) * 1024L ) -#define MB(x) ( KB(x) * 1024L ) +#define KiB(x) ( (x) * 1024L ) +#define MiB(x) ( KiB(x) * 1024L ) /** CHS Table - 1MB 2MB 4MB 8MB 16MB 32MB 64MB 128MB + 1MiB 2MiB 4MiB 8MiB 16MiB 32MiB 64MiB 128MiB NCylinder 125 125 250 250 500 500 500 500 NHead 4 4 4 4 4 8 8 16 NSector 4 8 8 16 16 16 32 32 @@ -64,14 +63,14 @@ typedef struct { /* Must be ordered by size */ static const chs_entry_t chs_table[] = { - { MB( 1), 125, 4, 4 }, - { MB( 2), 125, 4, 8 }, - { MB( 4), 250, 4, 8 }, - { MB( 8), 250, 4, 16 }, - { MB( 16), 500, 4, 16 }, - { MB( 32), 500, 8, 16 }, - { MB( 64), 500, 8, 32 }, - { MB(128), 500, 16, 32 }, + { MiB( 1), 125, 4, 4 }, + { MiB( 2), 125, 4, 8 }, + { MiB( 4), 250, 4, 8 }, + { MiB( 8), 250, 4, 16 }, + { MiB( 16), 500, 4, 16 }, + { MiB( 32), 500, 8, 16 }, + { MiB( 64), 500, 8, 32 }, + { MiB(128), 500, 16, 32 }, { 0 }, }; @@ -109,14 +108,19 @@ static int get_valid_cis_sector(struct mtd_info *mtd) int ret, k, cis_sector; size_t retlen; loff_t offset; - uint8_t sect_buf[SECTOR_SIZE]; + uint8_t *sect_buf; + + cis_sector = -1; + + sect_buf = kmalloc(SECTOR_SIZE, GFP_KERNEL); + if (!sect_buf) + goto out; /* * Look for CIS/IDI sector on the first GOOD block (give up after 4 bad * blocks). If the first good block doesn't contain CIS number the flash * is not SSFDC formatted */ - cis_sector = -1; for (k = 0, offset = 0; k < 4; k++, offset += mtd->erasesize) { if (!mtd->block_isbad(mtd, offset)) { ret = mtd->read(mtd, offset, SECTOR_SIZE, &retlen, @@ -140,6 +144,8 @@ static int get_valid_cis_sector(struct mtd_info *mtd) } } + kfree(sect_buf); + out: return cis_sector; } -- cgit v0.10.2 From 08d3ad6a518051bfaefd5d6a8005e20c036996c3 Mon Sep 17 00:00:00 2001 From: David Woodhouse Date: Sat, 23 Sep 2006 16:20:48 +0100 Subject: [MTD] Whitespace cleanup in SSFDC driver. Says akpm: ' - search for "( " and " )", fix.' Signed-off-by: David Woodhouse diff --git a/drivers/mtd/ssfdc.c b/drivers/mtd/ssfdc.c index cf60a5e..79d3bb6 100644 --- a/drivers/mtd/ssfdc.c +++ b/drivers/mtd/ssfdc.c @@ -127,11 +127,11 @@ static int get_valid_cis_sector(struct mtd_info *mtd) sect_buf); /* CIS pattern match on the sector buffer */ - if ( ret < 0 || retlen != SECTOR_SIZE ) { + if (ret < 0 || retlen != SECTOR_SIZE) { printk(KERN_WARNING "SSFDC_RO:can't read CIS/IDI sector\n"); - } else if ( !memcmp(sect_buf, cis_numbers, - sizeof(cis_numbers)) ) { + } else if (!memcmp(sect_buf, cis_numbers, + sizeof(cis_numbers))) { /* Found */ cis_sector = (int)(offset >> SECTOR_SHIFT); } else { @@ -233,7 +233,7 @@ static int get_logical_address(uint8_t *oob_buf) } } - if ( !ok ) + if (!ok) block_address = -2; DEBUG(MTD_DEBUG_LEVEL3, "SSFDC_RO: get_logical_address() %d\n", @@ -251,8 +251,8 @@ static int build_logical_block_map(struct ssfdcr_record *ssfdc) struct mtd_info *mtd = ssfdc->mbd.mtd; DEBUG(MTD_DEBUG_LEVEL1, "SSFDC_RO: build_block_map() nblks=%d (%luK)\n", - ssfdc->map_len, (unsigned long)ssfdc->map_len * - ssfdc->erase_size / 1024 ); + ssfdc->map_len, + (unsigned long)ssfdc->map_len * ssfdc->erase_size / 1024); /* Scan every physical block, skip CIS block */ for (phys_block = ssfdc->cis_block + 1; phys_block < ssfdc->map_len; @@ -329,21 +329,21 @@ static void ssfdcr_add_mtd(struct mtd_blktrans_ops *tr, struct mtd_info *mtd) /* Set geometry */ ssfdc->heads = 16; ssfdc->sectors = 32; - get_chs( mtd->size, NULL, &ssfdc->heads, &ssfdc->sectors); + get_chs(mtd->size, NULL, &ssfdc->heads, &ssfdc->sectors); ssfdc->cylinders = (unsigned short)((mtd->size >> SECTOR_SHIFT) / ((long)ssfdc->sectors * (long)ssfdc->heads)); DEBUG(MTD_DEBUG_LEVEL1, "SSFDC_RO: using C:%d H:%d S:%d == %ld sects\n", ssfdc->cylinders, ssfdc->heads , ssfdc->sectors, (long)ssfdc->cylinders * (long)ssfdc->heads * - (long)ssfdc->sectors ); + (long)ssfdc->sectors); ssfdc->mbd.size = (long)ssfdc->heads * (long)ssfdc->cylinders * (long)ssfdc->sectors; /* Allocate logical block map */ - ssfdc->logic_block_map = kmalloc( sizeof(ssfdc->logic_block_map[0]) * - ssfdc->map_len, GFP_KERNEL); + ssfdc->logic_block_map = kmalloc(sizeof(ssfdc->logic_block_map[0]) * + ssfdc->map_len, GFP_KERNEL); if (!ssfdc->logic_block_map) { printk(KERN_WARNING "SSFDC_RO: out of memory for data structures\n"); @@ -414,7 +414,7 @@ static int ssfdcr_readsect(struct mtd_blktrans_dev *dev, "SSFDC_RO: ssfdcr_readsect() phys_sect_no=%lu\n", sect_no); - if (read_physical_sector( ssfdc->mbd.mtd, buf, sect_no ) < 0) + if (read_physical_sector(ssfdc->mbd.mtd, buf, sect_no) < 0) return -EIO; } else { memset(buf, 0xff, SECTOR_SIZE); -- cgit v0.10.2 From 4c8bd7eeee4c8f157fb61fb64b57500990b42e0e Mon Sep 17 00:00:00 2001 From: David Miller Date: Fri, 22 Sep 2006 22:31:36 -0700 Subject: [KERNEL] Do not truncate to 'int' in ALIGN() macro. Signed-off-by: David S. Miller Signed-off-by: Linus Torvalds diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 851aa1b..2b2ae4f 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -31,7 +31,7 @@ extern const char linux_banner[]; #define STACK_MAGIC 0xdeadbeef #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) -#define ALIGN(x,a) (((x)+(a)-1)&~((a)-1)) +#define ALIGN(x,a) (((x)+(a)-1UL)&~((a)-1UL)) #define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f)) #define roundup(x, y) ((((x) + ((y) - 1)) / (y)) * (y)) -- cgit v0.10.2 From 5f77043f0f7851aa6139fb9a8b297497b540b397 Mon Sep 17 00:00:00 2001 From: Herbert Xu Date: Sun, 24 Sep 2006 00:40:41 +1000 Subject: [CRYPTO] hmac: Fix hmac_init update call The crypto_hash_update call in hmac_init gave the number 1 instead of the length of the sg list in bytes. This is a missed conversion from the digest => hash change. As tcrypt only tests crypto_hash_digest it didn't catch this. Signed-off-by: Herbert Xu Signed-off-by: Linus Torvalds diff --git a/crypto/hmac.c b/crypto/hmac.c index f403b69..d52b234 100644 --- a/crypto/hmac.c +++ b/crypto/hmac.c @@ -98,7 +98,7 @@ static int hmac_init(struct hash_desc *pdesc) sg_set_buf(&tmp, ipad, bs); return unlikely(crypto_hash_init(&desc)) ?: - crypto_hash_update(&desc, &tmp, 1); + crypto_hash_update(&desc, &tmp, bs); } static int hmac_update(struct hash_desc *pdesc, -- cgit v0.10.2 From d7b2004528a967f2ba0bf31b1eb0da6a876960e6 Mon Sep 17 00:00:00 2001 From: Al Viro Date: Sat, 23 Sep 2006 16:44:16 +0100 Subject: [PATCH] missing includes from infiniband merge indirect chains of includes are arch-specific and can't be relied upon... (hell, even attempt to build it for itanic would trigger vmalloc.h ones; err.h triggers on e.g. alpha). Signed-off-by: Al Viro Signed-off-by: Linus Torvalds diff --git a/drivers/infiniband/core/mad_priv.h b/drivers/infiniband/core/mad_priv.h index 1da9adb..d06b590 100644 --- a/drivers/infiniband/core/mad_priv.h +++ b/drivers/infiniband/core/mad_priv.h @@ -38,6 +38,7 @@ #define __IB_MAD_PRIV_H__ #include +#include #include #include #include diff --git a/drivers/infiniband/hw/amso1100/c2_provider.c b/drivers/infiniband/hw/amso1100/c2_provider.c index 8fddc8c..dd6af55 100644 --- a/drivers/infiniband/hw/amso1100/c2_provider.c +++ b/drivers/infiniband/hw/amso1100/c2_provider.c @@ -49,6 +49,7 @@ #include #include #include +#include #include #include diff --git a/drivers/infiniband/hw/amso1100/c2_rnic.c b/drivers/infiniband/hw/amso1100/c2_rnic.c index 1c3c9d6..f49a32b 100644 --- a/drivers/infiniband/hw/amso1100/c2_rnic.c +++ b/drivers/infiniband/hw/amso1100/c2_rnic.c @@ -50,6 +50,7 @@ #include #include #include +#include #include diff --git a/drivers/infiniband/hw/ipath/ipath_diag.c b/drivers/infiniband/hw/ipath/ipath_diag.c index 28b6b46..29958b6 100644 --- a/drivers/infiniband/hw/ipath/ipath_diag.c +++ b/drivers/infiniband/hw/ipath/ipath_diag.c @@ -43,6 +43,7 @@ #include #include +#include #include #include "ipath_kernel.h" -- cgit v0.10.2 From 13b5aeccc4350e5069c723e8f9becd7208ee02f2 Mon Sep 17 00:00:00 2001 From: Al Viro Date: Sat, 23 Sep 2006 16:44:58 +0100 Subject: [PATCH] more fallout from get_property returning pointer to const Signed-off-by: Al Viro Signed-off-by: Linus Torvalds diff --git a/arch/powerpc/platforms/powermac/feature.c b/arch/powerpc/platforms/powermac/feature.c index 13fcaf5..e49621b 100644 --- a/arch/powerpc/platforms/powermac/feature.c +++ b/arch/powerpc/platforms/powermac/feature.c @@ -1058,8 +1058,8 @@ core99_reset_cpu(struct device_node *node, long param, long value) if (np == NULL) return -ENODEV; for (np = np->child; np != NULL; np = np->sibling) { - u32 *num = get_property(np, "reg", NULL); - u32 *rst = get_property(np, "soft-reset", NULL); + const u32 *num = get_property(np, "reg", NULL); + const u32 *rst = get_property(np, "soft-reset", NULL); if (num == NULL || rst == NULL) continue; if (param == *num) { diff --git a/arch/powerpc/platforms/powermac/smp.c b/arch/powerpc/platforms/powermac/smp.c index 653eeb6..1949b65 100644 --- a/arch/powerpc/platforms/powermac/smp.c +++ b/arch/powerpc/platforms/powermac/smp.c @@ -702,7 +702,7 @@ static void __init smp_core99_setup(int ncpus) /* GPIO based HW sync on ppc32 Core99 */ if (pmac_tb_freeze == NULL && !machine_is_compatible("MacRISC4")) { struct device_node *cpu; - u32 *tbprop = NULL; + const u32 *tbprop = NULL; core99_tb_gpio = KL_GPIO_TB_ENABLE; /* default value */ cpu = of_find_node_by_type(NULL, "cpu"); diff --git a/drivers/char/briq_panel.c b/drivers/char/briq_panel.c index a0e5eac..caae795 100644 --- a/drivers/char/briq_panel.c +++ b/drivers/char/briq_panel.c @@ -202,7 +202,7 @@ static struct miscdevice briq_panel_miscdev = { static int __init briq_panel_init(void) { struct device_node *root = find_path_device("/"); - char *machine; + const char *machine; int i; machine = get_property(root, "model", NULL); diff --git a/drivers/video/riva/fbdev.c b/drivers/video/riva/fbdev.c index 67d1e1c..61a4665 100644 --- a/drivers/video/riva/fbdev.c +++ b/drivers/video/riva/fbdev.c @@ -1827,7 +1827,7 @@ static int __devinit riva_get_EDID_OF(struct fb_info *info, struct pci_dev *pd) struct riva_par *par = info->par; struct device_node *dp; unsigned char *pedid = NULL; - unsigned char *disptype = NULL; + const unsigned char *disptype = NULL; static char *propnames[] = { "DFP,EDID", "LCD,EDID", "EDID", "EDID1", "EDID,B", "EDID,A", NULL }; int i; -- cgit v0.10.2 From 2efc80cb8ddc341d81de996920e3b2ad8a12b1f7 Mon Sep 17 00:00:00 2001 From: Al Viro Date: Sat, 23 Sep 2006 16:45:55 +0100 Subject: [PATCH] #elif that should've been #elif defined #elif CONFIG_44x in ibm4xx.h should've been #elif defined(CONFIG_44x) Signed-off-by: Al Viro Signed-off-by: Linus Torvalds diff --git a/include/asm-ppc/ibm4xx.h b/include/asm-ppc/ibm4xx.h index cf62b69..499c146 100644 --- a/include/asm-ppc/ibm4xx.h +++ b/include/asm-ppc/ibm4xx.h @@ -86,7 +86,7 @@ void ppc4xx_init(unsigned long r3, unsigned long r4, unsigned long r5, #define PCI_DRAM_OFFSET 0 #endif -#elif CONFIG_44x +#elif defined(CONFIG_44x) #if defined(CONFIG_BAMBOO) #include -- cgit v0.10.2 From 4ac493b1d5bfd332f3dee64baaa620961bab6cdc Mon Sep 17 00:00:00 2001 From: Al Viro Date: Sat, 23 Sep 2006 18:20:56 +0100 Subject: [PATCH] briq_panel: read() and write() get __user pointers, damnit annotated, fixed a roothole in ->write(). Dereferencing user-supplied pointer is a Bad Idea(tm)... Signed-off-by: Al Viro Signed-off-by: Linus Torvalds diff --git a/drivers/char/briq_panel.c b/drivers/char/briq_panel.c index caae795..b8c2225 100644 --- a/drivers/char/briq_panel.c +++ b/drivers/char/briq_panel.c @@ -87,7 +87,7 @@ static int briq_panel_release(struct inode *ino, struct file *filep) return 0; } -static ssize_t briq_panel_read(struct file *file, char *buf, size_t count, +static ssize_t briq_panel_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) { unsigned short c; @@ -135,7 +135,7 @@ static void scroll_vfd( void ) vfd_cursor = 20; } -static ssize_t briq_panel_write(struct file *file, const char *buf, size_t len, +static ssize_t briq_panel_write(struct file *file, const char __user *buf, size_t len, loff_t *ppos) { size_t indx = len; @@ -150,19 +150,22 @@ static ssize_t briq_panel_write(struct file *file, const char *buf, size_t len, return -EBUSY; for (;;) { + char c; if (!indx) break; + if (get_user(c, buf)) + return -EFAULT; if (esc) { - set_led(*buf); + set_led(c); esc = 0; - } else if (*buf == 27) { + } else if (c == 27) { esc = 1; - } else if (*buf == 12) { + } else if (c == 12) { /* do a form feed */ for (i=0; i<40; i++) vfd[i] = ' '; vfd_cursor = 0; - } else if (*buf == 10) { + } else if (c == 10) { if (vfd_cursor < 20) vfd_cursor = 20; else if (vfd_cursor < 40) @@ -175,7 +178,7 @@ static ssize_t briq_panel_write(struct file *file, const char *buf, size_t len, /* just a character */ if (vfd_cursor > 39) scroll_vfd(); - vfd[vfd_cursor++] = *buf; + vfd[vfd_cursor++] = c; } indx--; buf++; -- cgit v0.10.2 From 79da342c31ea839277060c1d2086aaf3b5cd85a4 Mon Sep 17 00:00:00 2001 From: Al Viro Date: Sat, 23 Sep 2006 18:21:35 +0100 Subject: [PATCH] more get_property() fallout Signed-off-by: Al Viro Signed-off-by: Linus Torvalds diff --git a/drivers/video/riva/fbdev.c b/drivers/video/riva/fbdev.c index 61a4665..4acde4f 100644 --- a/drivers/video/riva/fbdev.c +++ b/drivers/video/riva/fbdev.c @@ -1826,7 +1826,7 @@ static int __devinit riva_get_EDID_OF(struct fb_info *info, struct pci_dev *pd) { struct riva_par *par = info->par; struct device_node *dp; - unsigned char *pedid = NULL; + const unsigned char *pedid = NULL; const unsigned char *disptype = NULL; static char *propnames[] = { "DFP,EDID", "LCD,EDID", "EDID", "EDID1", "EDID,B", "EDID,A", NULL }; -- cgit v0.10.2 From 73af07de3e32b9ac328c3d1417258bb98a9b0a9b Mon Sep 17 00:00:00 2001 From: Herbert Xu Date: Sun, 24 Sep 2006 09:30:19 +1000 Subject: [CRYPTO] hmac: Fix error truncation by unlikely() The error return values are truncated by unlikely so we need to save it first. Thanks to Kyle Moffett for spotting this. Signed-off-by: Herbert Xu Signed-off-by: Linus Torvalds diff --git a/crypto/hmac.c b/crypto/hmac.c index d52b234..b521bcd 100644 --- a/crypto/hmac.c +++ b/crypto/hmac.c @@ -92,13 +92,17 @@ static int hmac_init(struct hash_desc *pdesc) struct hmac_ctx *ctx = align_ptr(ipad + bs * 2 + ds, sizeof(void *)); struct hash_desc desc; struct scatterlist tmp; + int err; desc.tfm = ctx->child; desc.flags = pdesc->flags & CRYPTO_TFM_REQ_MAY_SLEEP; sg_set_buf(&tmp, ipad, bs); - return unlikely(crypto_hash_init(&desc)) ?: - crypto_hash_update(&desc, &tmp, bs); + err = crypto_hash_init(&desc); + if (unlikely(err)) + return err; + + return crypto_hash_update(&desc, &tmp, bs); } static int hmac_update(struct hash_desc *pdesc, @@ -123,13 +127,17 @@ static int hmac_final(struct hash_desc *pdesc, u8 *out) struct hmac_ctx *ctx = align_ptr(digest + ds, sizeof(void *)); struct hash_desc desc; struct scatterlist tmp; + int err; desc.tfm = ctx->child; desc.flags = pdesc->flags & CRYPTO_TFM_REQ_MAY_SLEEP; sg_set_buf(&tmp, opad, bs + ds); - return unlikely(crypto_hash_final(&desc, digest)) ?: - crypto_hash_digest(&desc, &tmp, bs + ds, out); + err = crypto_hash_final(&desc, digest); + if (unlikely(err)) + return err; + + return crypto_hash_digest(&desc, &tmp, bs + ds, out); } static int hmac_digest(struct hash_desc *pdesc, struct scatterlist *sg, @@ -145,6 +153,7 @@ static int hmac_digest(struct hash_desc *pdesc, struct scatterlist *sg, struct hash_desc desc; struct scatterlist sg1[2]; struct scatterlist sg2[1]; + int err; desc.tfm = ctx->child; desc.flags = pdesc->flags & CRYPTO_TFM_REQ_MAY_SLEEP; @@ -154,8 +163,11 @@ static int hmac_digest(struct hash_desc *pdesc, struct scatterlist *sg, sg1[1].length = 0; sg_set_buf(sg2, opad, bs + ds); - return unlikely(crypto_hash_digest(&desc, sg1, nbytes + bs, digest)) ?: - crypto_hash_digest(&desc, sg2, bs + ds, out); + err = crypto_hash_digest(&desc, sg1, nbytes + bs, digest); + if (unlikely(err)) + return err; + + return crypto_hash_digest(&desc, sg2, bs + ds, out); } static int hmac_init_tfm(struct crypto_tfm *tfm) -- cgit v0.10.2