summaryrefslogtreecommitdiff
path: root/net/netlink
AgeCommit message (Collapse)Author
2009-07-15net/compat/wext: send different messages to compat tasksJohannes Berg
Wireless extensions have the unfortunate problem that events are multicast netlink messages, and are not independent of pointer size. Thus, currently 32-bit tasks on 64-bit platforms cannot properly receive events and fail with all kinds of strange problems, for instance wpa_supplicant never notices disassociations, due to the way the 64-bit event looks (to a 32-bit process), the fact that the address is all zeroes is lost, it thinks instead it is 00:00:00:00:01:00. The same problem existed with the ioctls, until David Miller fixed those some time ago in an heroic effort. A different problem caused by this is that we cannot send the ASSOCREQIE/ASSOCRESPIE events because sending them causes a 32-bit wpa_supplicant on a 64-bit system to overwrite its internal information, which is worse than it not getting the information at all -- so we currently resort to sending a custom string event that it then parses. This, however, has a severe size limitation we are frequently hitting with modern access points; this limitation would can be lifted after this patch by sending the correct binary, not custom, event. A similar problem apparently happens for some other netlink users on x86_64 with 32-bit tasks due to the alignment for 64-bit quantities. In order to fix these problems, I have implemented a way to send compat messages to tasks. When sending an event, we send the non-compat event data together with a compat event data in skb_shinfo(main_skb)->frag_list. Then, when the event is read from the socket, the netlink code makes sure to pass out only the skb that is compatible with the task. This approach was suggested by David Miller, my original approach required always sending two skbs but that had various small problems. To determine whether compat is needed or not, I have used the MSG_CMSG_COMPAT flag, and adjusted the call path for recv and recvfrom to include it, even if those calls do not have a cmsg parameter. I have not solved one small part of the problem, and I don't think it is necessary to: if a 32-bit application uses read() rather than any form of recvmsg() it will still get the wrong (64-bit) event. However, neither do applications actually do this, nor would it be a regression. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12genetlink: make netns awareJohannes Berg
This makes generic netlink network namespace aware. No generic netlink families except for the controller family are made namespace aware, they need to be checked one by one and then set the family->netnsok member to true. A new function genlmsg_multicast_netns() is introduced to allow sending a multicast message in a given namespace, for example when it applies to an object that lives in that namespace, a new function genlmsg_multicast_allns() to send a message to all network namespaces (for objects that do not have an associated netns). The function genlmsg_multicast() is changed to multicast the message in just init_net, which is currently correct for all generic netlink families since they only work in init_net right now. Some will later want to work in all net namespaces because they do not care about the netns at all -- those will have to be converted to use one of the new functions genlmsg_multicast_allns() or genlmsg_multicast_netns() whenever they are made netns aware in some way. After this patch families can easily decide whether or not they should be available in all net namespaces. Many genl families us it for objects not related to networking and should therefore be available in all namespaces, but that will have to be done on a per family basis. Note that this doesn't touch on the checkpoint/restart problem where network namespaces could be used, genl families and multicast groups are numbered globally and I see no easy way of changing that, especially since it must be possible to multicast to all network namespaces for those families that do not care about netns. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12netlink: use call_rcu for netlink_change_ngroupsJohannes Berg
For the network namespace work in generic netlink I need to be able to call this function under rcu_read_lock(), otherwise the locking becomes a nightmare and more locks would be needed. Instead, just embed a struct rcu_head (actually a struct listeners_rcu_head that also carries the pointer to the memory block) into the listeners memory so we can use call_rcu() instead of synchronising and then freeing. No rcu_barrier() is needed since this code cannot be modular. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-07-12netlink: remove unused exportsJohannes Berg
I added those myself in commits b4ff4f04 and 84659eb5, but I see no reason now why they should be exported, only generic netlink uses them which cannot be modular. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-18net: correct off-by-one write allocations reportsEric Dumazet
commit 2b85a34e911bf483c27cfdd124aeb1605145dc80 (net: No more expensive sock_hold()/sock_put() on each tx) changed initial sk_wmem_alloc value. We need to take into account this offset when reporting sk_wmem_alloc to user, in PROC_FS files or various ioctls (SIOCOUTQ/TIOCOUTQ) Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-05-21genetlink: Introduce genl_register_family_with_ops()Michał Mirosław
This introduces genl_register_family_with_ops() that registers a genetlink family along with operations from a table. This is used to kill copy'n'paste occurrences in following patches. Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-26Merge branch 'master' of /home/davem/src/GIT/linux-2.6/David S. Miller
Conflicts: drivers/net/wimax/i2400m/usb-notif.c
2009-03-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (29 commits) crypto: sha512-s390 - Add missing block size hwrng: timeriomem - Breaks an allyesconfig build on s390: nlattr: Fix build error with NET off crypto: testmgr - add zlib test crypto: zlib - New zlib crypto module, using pcomp crypto: testmgr - Add support for the pcomp interface crypto: compress - Add pcomp interface netlink: Move netlink attribute parsing support to lib crypto: Fix dead links hwrng: timeriomem - New driver crypto: chainiv - Use kcrypto_wq instead of keventd_wq crypto: cryptd - Per-CPU thread implementation based on kcrypto_wq crypto: api - Use dedicated workqueue for crypto subsystem crypto: testmgr - Test skciphers with no IVs crypto: aead - Avoid infinite loop when nivaead fails selftest crypto: skcipher - Avoid infinite loop when cipher fails selftest crypto: api - Fix crypto_alloc_tfm/create_create_tfm return convention crypto: api - crypto_alg_mod_lookup either tested or untested crypto: amcc - Add crypt4xx driver crypto: ansi_cprng - Add maintainer ...
2009-03-24netlink: add NETLINK_NO_ENOBUFS socket flagPablo Neira Ayuso
This patch adds the NETLINK_NO_ENOBUFS socket flag. This flag can be used by unicast and broadcast listeners to avoid receiving ENOBUFS errors. Generally speaking, ENOBUFS errors are useful to notify two things to the listener: a) You may increase the receiver buffer size via setsockopt(). b) You have lost messages, you may be out of sync. In some cases, ignoring ENOBUFS errors can be useful. For example: a) nfnetlink_queue: this subsystem does not have any sort of resync method and you can decide to ignore ENOBUFS once you have set a given buffer size. b) ctnetlink: you can use this together with the socket flag NETLINK_BROADCAST_SEND_ERROR to stop getting ENOBUFS errors as you do not need to resync (packets whose event are not delivered are drop to provide reliable logging and state-synchronization). Moreover, the use of NETLINK_NO_ENOBUFS also reduces a "go up, go down" effect in terms of performance which is due to the netlink congestion control when the listener cannot back off. The effect is the following: 1) throughput rate goes up and netlink messages are inserted in the receiver buffer. 2) Then, netlink buffer fills and overruns (set on nlk->state bit 0). 3) While the listener empties the receiver buffer, netlink keeps dropping messages. Thus, throughput goes dramatically down. 4) Then, once the listener has emptied the buffer (nlk->state bit 0 is set off), goto step 1. This effect is easy to trigger with netlink broadcast under heavy load, and it is more noticeable when using a big receiver buffer. You can find some results in [1] that show this problem. [1] http://1984.lsi.us.es/linux/netlink/ This patch also includes the use of sk_drop to account the number of netlink messages drop due to overrun. This value is shown in /proc/net/netlink. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-24Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6
2009-03-23nefilter: nfnetlink: add nfnetlink_set_err and use it in ctnetlinkPablo Neira Ayuso
This patch adds nfnetlink_set_err() to propagate the error to netlink broadcast listener in case of memory allocation errors in the message building. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Patrick McHardy <kaber@trash.net>
2009-03-05Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/tokenring/tmspci.c drivers/net/ucc_geth_mii.c
2009-03-04netlink: invert error code in netlink_set_err()Pablo Neira Ayuso
The callers of netlink_set_err() currently pass a negative value as parameter for the error code. However, sk->sk_err wants a positive error value. Without this patch, skb_recv_datagram() called by netlink_recvmsg() may return a positive value to report an error. Another choice to fix this is to change callers to pass a positive error value, but this seems a bit inconsistent and error prone to me. Indeed, the callers of netlink_set_err() assumed that the (usual) negative value for error codes was fine before this patch :). This patch also includes some documentation in docbook format for netlink_set_err() to avoid this sort of confusion. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-03-04netlink: Move netlink attribute parsing support to libGeert Uytterhoeven
Netlink attribute parsing may be used even if CONFIG_NET is not set. Move it from net/netlink to lib and control its inclusion based on the new config symbol CONFIG_NLATTR, which is selected by CONFIG_NET. Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2009-02-27netlink: remove some pointless conditionals before kfree_skb()Wei Yongjun
Remove some pointless conditionals before kfree_skb(). Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-25netlink: change nlmsg_notify() return value logicPablo Neira Ayuso
This patch changes the return value of nlmsg_notify() as follows: If NETLINK_BROADCAST_ERROR is set by any of the listeners and an error in the delivery happened, return the broadcast error; else if there are no listeners apart from the socket that requested a change with the echo flag, return the result of the unicast notification. Thus, with this patch, the unicast notification is handled in the same way of a broadcast listener that has set the NETLINK_BROADCAST_ERROR socket flag. This patch is useful in case that the caller of nlmsg_notify() wants to know the result of the delivery of a netlink notification (including the broadcast delivery) and take any action in case that the delivery failed. For example, ctnetlink can drop packets if the event delivery failed to provide reliable logging and state-synchronization at the cost of dropping packets. This patch also modifies the rtnetlink code to ignore the return value of rtnl_notify() in all callers. The function rtnl_notify() (before this patch) returned the error of the unicast notification which makes rtnl_set_sk_err() reports errors to all listeners. This is not of any help since the origin of the change (the socket that requested the echoing) notices the ENOBUFS error if the notification fails and should resync itself. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-20netlink: add NETLINK_BROADCAST_ERROR socket optionPablo Neira Ayuso
This patch adds NETLINK_BROADCAST_ERROR which is a netlink socket option that the listener can set to make netlink_broadcast() return errors in the delivery to the caller. This option is useful if the caller of netlink_broadcast() do something with the result of the message delivery, like in ctnetlink where it drops a network packet if the event delivery failed, this is used to enable reliable logging and state-synchronization. If this socket option is not set, netlink_broadcast() only reports ESRCH errors and silently ignore ENOBUFS errors, which is what most netlink_broadcast() callers should do. This socket option is based on a suggestion from Patrick McHardy. Patrick McHardy can exchange this patch for a beer from me ;). Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-06netlink: change return-value logic of netlink_broadcast()Pablo Neira Ayuso
Currently, netlink_broadcast() reports errors to the caller if no messages at all were delivered: 1) If, at least, one message has been delivered correctly, returns 0. 2) Otherwise, if no messages at all were delivered due to skb_clone() failure, return -ENOBUFS. 3) Otherwise, if there are no listeners, return -ESRCH. With this patch, the caller knows if the delivery of any of the messages to the listeners have failed: 1) If it fails to deliver any message (for whatever reason), return -ENOBUFS. 2) Otherwise, if all messages were delivered OK, returns 0. 3) Otherwise, if no listeners, return -ESRCH. In the current ctnetlink code and in Netfilter in general, we can add reliable logging and connection tracking event delivery by dropping the packets whose events were not successfully delivered over Netlink. Of course, this option would be settable via /proc as this approach reduces performance (in terms of filtered connections per seconds by a stateful firewall) but providing reliable logging and event delivery (for conntrackd) in return. This patch also changes some clients of netlink_broadcast() that may report ENOBUFS errors via printk. This error handling is not of any help. Instead, the userspace daemons that are listening to those netlink messages should resync themselves with the kernel-side if they hit ENOBUFS. BTW, netlink_broadcast() clients include those that call cn_netlink_send(), nlmsg_multicast() and genlmsg_multicast() since they internally call netlink_broadcast() and return its error value. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-01-07genetlink: export genl_unregister_mc_group()Inaky Perez-Gonzalez
Add an EXPORT_SYMBOL() to genl_unregister_mc_group(), to allow unregistering groups on the run. EXPORT_SYMBOL_GPL() is not used as the rest of the functions exported by this module (eg: genl_register_mc_group) are also not _GPL(). Cleanup is currently done when unregistering a family, but there is no way to unregister a single multicast group due to that function not being exported. Seems to be a mistake as it is documented as for external consumption. This is needed by the WiMAX stack to be able to cleanup unused mc groups. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Acked-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2008-11-28netlink: allow empty nested attributesPatrick McHardy
validate_nla() currently doesn't allow empty nested attributes. This makes userspace code unnecessarily complicated when starting and ending the nested attribute is done by generic upper level code and the inner attributes are dumped by a module. Add a special case to accept empty nested attributes. When the nested attribute is non empty, the same checks as before are performed. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24net: Make sure BHs are disabled in sock_prot_inuse_add()Eric Dumazet
There is still a call to sock_prot_inuse_add() in af_netlink while in a preemptable section. Add explicit BH disable around this call. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24net: Make sure BHs are disabled in sock_prot_inuse_add()David S. Miller
The rule of calling sock_prot_inuse_add() is that BHs must be disabled. Some new calls were added where this was not true and this tiggers warnings as reported by Ilpo. Fix this by adding explicit BH disabling around those call sites. Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-23net: af_netlink should update its inuse counterEric Dumazet
In order to have relevant information for NETLINK protocol, in /proc/net/protocols, we should use sock_prot_inuse_add() to update a (percpu and pernamespace) counter of inuse sockets. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-28netlink: constify struct nlattr * arg to parsing functionsPatrick McHardy
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-16net: Remove CONFIG_KMOD from net/ (towards removing CONFIG_KMOD entirely)Johannes Berg
Some code here depends on CONFIG_KMOD to not try to load protocol modules or similar, replace by CONFIG_MODULES where more than just request_module depends on CONFIG_KMOD and and also use try_then_request_module in ebtables. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-10-14net: Rationalise email address: Network Specific PartsAlan Cox
Clean up the various different email addresses of mine listed in the code to a single current and valid address. As Dave says his network merges for 2.6.28 are now done this seems a good point to send them in where they won't risk disrupting real changes. Signed-off-by: Alan Cox <alan@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-26net: convert BUG_TRAP to generic WARN_ONIlpo Järvinen
Removes legacy reinvent-the-wheel type thing. The generic machinery integrates much better to automated debugging aids such as kerneloops.org (and others), and is unambiguous due to better naming. Non-intuively BUG_TRAP() is actually equal to WARN_ON() rather than BUG_ON() though some might actually be promoted to BUG_ON() but I left that to future. I could make at least one BUILD_BUG_ON conversion. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-06Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: Documentation/feature-removal-schedule.txt drivers/net/wan/hdlc_fr.c drivers/net/wireless/iwlwifi/iwl-4965.c drivers/net/wireless/iwlwifi/iwl3945-base.c
2008-07-02netlink: Unneeded local variableWang Chen
We already have a variable, which has the same capability. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-28Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/wireless/iwlwifi/iwl4965-base.c
2008-06-28netlink: Fix some doc comments in net/netlink/attr.cJulius Volz
Fix some doc comments to match function and attribute names in net/netlink/attr.c. Signed-off-by: Julius Volz <juliusv@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-19Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: net/mac80211/tx.c
2008-06-18netlink: genl: fix circular lockingPatrick McHardy
genetlink has a circular locking dependency when dumping the registered families: - dump start: genl_rcv() : take genl_mutex genl_rcv_msg() : call netlink_dump_start() while holding genl_mutex netlink_dump_start(), netlink_dump() : take nlk->cb_mutex ctrl_dumpfamily() : try to detect this case and not take genl_mutex a second time - dump continuance: netlink_rcv() : call netlink_dump netlink_dump : take nlk->cb_mutex ctrl_dumpfamily() : take genl_mutex Register genl_lock as callback mutex with netlink to fix this. This slightly widens an already existing module unload race, the genl ops used during the dump might go away when the module is unloaded. Thomas Graf is working on a seperate fix for this. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-10Merge branch 'master' of ↵David S. Miller
master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 Conflicts: drivers/net/tg3.c drivers/net/wireless/rt2x00/rt2x00dev.c net/mac80211/ieee80211_i.h
2008-06-05netlink: Remove nonblock parameter from netlink_attachskbDenis V. Lunev
Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-06-03netlink: Improve returned error codesThomas Graf
Make nlmsg_trim(), nlmsg_cancel(), genlmsg_cancel(), and nla_nest_cancel() void functions. Return -EMSGSIZE instead of -1 if the provided message buffer is not big enough. Signed-off-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-04-28Audit: collect sessionid in netlink messagesEric Paris
Previously I added sessionid output to all audit messages where it was available but we still didn't know the sessionid of the sender of netlink messages. This patch adds that information to netlink messages so we can audit who sent netlink messages. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-04-19Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6: security: fix up documentation for security_module_enable Security: Introduce security= boot parameter Audit: Final renamings and cleanup SELinux: use new audit hooks, remove redundant exports Audit: internally use the new LSM audit hooks LSM/Audit: Introduce generic Audit LSM hooks SELinux: remove redundant exports Netlink: Use generic LSM hook Audit: use new LSM hooks instead of SELinux exports SELinux: setup new inode/ipc getsecid hooks LSM: Introduce inode_getsecid and ipc_getsecid hooks
2008-04-18Netlink: Use generic LSM hookAhmed S. Darwish
Don't use SELinux exported selinux_get_task_sid symbol. Use the generic LSM equivalent instead. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com> Acked-by: James Morris <jmorris@namei.org> Acked-by: David S. Miller <davem@davemloft.net> Reviewed-by: Paul Moore <paul.moore@hp.com>
2008-03-25[NET] NETNS: Omit namespace comparision without CONFIG_NET_NS.YOSHIFUJI Hideaki
Introduce an inline net_eq() to compare two namespaces. Without CONFIG_NET_NS, since no namespace other than &init_net exists, it is always 1. We do not need to convert 1) inline vs inline and 2) inline vs &init_net comparisons. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25[NET] NETNS: Omit seq_net_private->net without CONFIG_NET_NS.YOSHIFUJI Hideaki
Without CONFIG_NET_NS, no namespace other than &init_net exists, no need to store net in seq_net_private. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-25[NET] NETNS: Omit sock->sk_net without CONFIG_NET_NS.YOSHIFUJI Hideaki
Introduce per-sock inlines: sock_net(), sock_net_set() and per-inet_timewait_sock inlines: twsk_net(), twsk_net_set(). Without CONFIG_NET_NS, no namespace other than &init_net exists. Let's explicitly define them to help compiler optimizations. Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
2008-03-21netlink: make socket filters work on netlinkStephen Hemminger
Make socket filters work for netlink unicast and notifications. This is useful for applications like Zebra that get overrun with messages that are then ignored. Note: netlink messages are in host byte order, but packet filter state machine operations are done as network byte order. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29[NET]: Make netlink_kernel_release publically available as sk_release_kernel.Denis V. Lunev
This staff will be needed for non-netlink kernel sockets, which should also not pin a namespace like tcp_socket and icmp_socket. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-29[NETLINK]: No need for a separate __netlink_release call.Denis V. Lunev
Merge it to netlink_kernel_release. Signed-off-by: Denis V. Lunev <den@openvz.org> Acked-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-13[GENETLINK]: Relax dances with genl_lock.Pavel Emelyanov
The genl_unregister_family() calls the genl_unregister_mc_groups(), which takes and releases the genl_lock and then locks and releases this lock itself. Relax this behavior, all the more so the genl_unregister_mc_groups() is called from genl_unregister_family() only. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-02-01[PATCH] switch audit_get_loginuid() to task_struct *Al Viro
all callers pass something->audit_context Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2008-02-01[NETNS]: Fix race between put_net() and netlink_kernel_create().Pavel Emelyanov
The comment about "race free view of the set of network namespaces" was a bit hasty. Look (there even can be only one CPU, as discovered by Alexey Dobriyan and Denis Lunev): put_net() if (atomic_dec_and_test(&net->refcnt)) /* true */ __put_net(net); queue_work(...); /* * note: the net now has refcnt 0, but still in * the global list of net namespaces */ == re-schedule == register_pernet_subsys(&some_ops); register_pernet_operations(&some_ops); (*some_ops)->init(net); /* * we call netlink_kernel_create() here * in some places */ netlink_kernel_create(); sk_alloc(); get_net(net); /* refcnt = 1 */ /* * now we drop the net refcount not to * block the net namespace exit in the * future (or this can be done on the * error path) */ put_net(sk->sk_net); if (atomic_dec_and_test(&...)) /* * true. BOOOM! The net is * scheduled for release twice */ When thinking on this problem, I decided, that getting and putting the net in init callback is wrong. If some init callback needs to have a refcount-less reference on the struct net, _it_ has to be careful himself, rather than relying on the infrastructure to handle this correctly. In case of netlink_kernel_create(), the problem is that the sk_alloc() gets the given namespace, but passing the info that we don't want to get it inside this call is too heavy. Instead, I propose to crate the socket inside an init_net namespace and then re-attach it to the desired one right after the socket is created. After doing this, we also have to be careful on error paths not to drop the reference on the namespace, we didn't get the one on. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Acked-by: Denis Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETLINK]: Add nla_append()Patrick McHardy
Used to append data to a message without a header or padding. Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2008-01-28[NETNS]: Namespace stop vs 'ip r l' race.Denis V. Lunev
During network namespace stop process kernel side netlink sockets belonging to a namespace should be closed. They should not prevent namespace to stop, so they do not increment namespace usage counter. Though this counter will be put during last sock_put. The raplacement of the correct netns for init_ns solves the problem only partial as socket to be stoped until proper stop is a valid netlink kernel socket and can be looked up by the user processes. This is not a problem until it resides in initial namespace (no processes inside this net), but this is not true for init_net. So, hold the referrence for a socket, remove it from lookup tables and only after that change namespace and perform a last put. Signed-off-by: Denis V. Lunev <den@openvz.org> Tested-by: Alexey Dobriyan <adobriyan@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>