From b838b4aced99e0d31a272396d43d9ca21cb078cb Mon Sep 17 00:00:00 2001 From: Bruno Thomsen Date: Thu, 9 Oct 2014 16:48:14 +0200 Subject: phy/micrel: KSZ8031RNL RMII clock reconfiguration bug Bug: Unable to send and receive Ethernet packets with Micrel PHY. Affected devices: KSZ8031RNL (commercial temp) KSZ8031RNLI (industrial temp) Description: PHY device is correctly detected during probe. PHY power-up default is 25MHz crystal clock input and output 50MHz RMII clock to MAC. Reconfiguration of PHY to input 50MHz RMII clock from MAC causes PHY to become unresponsive if clock source is changed after Operation Mode Strap Override (OMSO) register setup. Cause: Long lead times on parts where clock setup match circuit design forces the usage of similar parts with wrong default setup. Solution: Swapped KSZ8031 register setup and added phy_write return code validation. Tested with Freescale i.MX28 Fast Ethernet Controler (fec). Signed-off-by: Bruno Thomsen Signed-off-by: David S. Miller diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c index 492435f..8c2a29a 100644 --- a/drivers/net/phy/micrel.c +++ b/drivers/net/phy/micrel.c @@ -198,8 +198,10 @@ static int ksz8021_config_init(struct phy_device *phydev) if (rc) dev_err(&phydev->dev, "failed to set led mode\n"); - phy_write(phydev, MII_KSZPHY_OMSO, val); rc = ksz_config_flags(phydev); + if (rc < 0) + return rc; + rc = phy_write(phydev, MII_KSZPHY_OMSO, val); return rc < 0 ? rc : 0; } -- cgit v0.10.2 From 9de7922bc709eee2f609cd01d98aaedc4cf5ea74 Mon Sep 17 00:00:00 2001 From: Daniel Borkmann Date: Thu, 9 Oct 2014 22:55:31 +0200 Subject: net: sctp: fix skb_over_panic when receiving malformed ASCONF chunks Commit 6f4c618ddb0 ("SCTP : Add paramters validity check for ASCONF chunk") added basic verification of ASCONF chunks, however, it is still possible to remotely crash a server by sending a special crafted ASCONF chunk, even up to pre 2.6.12 kernels: skb_over_panic: text:ffffffffa01ea1c3 len:31056 put:30768 head:ffff88011bd81800 data:ffff88011bd81800 tail:0x7950 end:0x440 dev: ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:129! [...] Call Trace: [] skb_put+0x5c/0x70 [] sctp_addto_chunk+0x63/0xd0 [sctp] [] sctp_process_asconf+0x1af/0x540 [sctp] [] ? _read_unlock_bh+0x15/0x20 [] sctp_sf_do_asconf+0x168/0x240 [sctp] [] sctp_do_sm+0x71/0x1210 [sctp] [] ? fib_rules_lookup+0xad/0xf0 [] ? sctp_cmp_addr_exact+0x32/0x40 [sctp] [] sctp_assoc_bh_rcv+0xd3/0x180 [sctp] [] sctp_inq_push+0x56/0x80 [sctp] [] sctp_rcv+0x982/0xa10 [sctp] [] ? ipt_local_in_hook+0x23/0x28 [iptable_filter] [] ? nf_iterate+0x69/0xb0 [] ? ip_local_deliver_finish+0x0/0x2d0 [] ? nf_hook_slow+0x76/0x120 [] ? ip_local_deliver_finish+0x0/0x2d0 [] ip_local_deliver_finish+0xdd/0x2d0 [] ip_local_deliver+0x98/0xa0 [] ip_rcv_finish+0x12d/0x440 [] ip_rcv+0x275/0x350 [] __netif_receive_skb+0x4ab/0x750 [] netif_receive_skb+0x58/0x60 This can be triggered e.g., through a simple scripted nmap connection scan injecting the chunk after the handshake, for example, ... -------------- INIT[ASCONF; ASCONF_ACK] -------------> <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------ -------------------- COOKIE-ECHO --------------------> <-------------------- COOKIE-ACK --------------------- ------------------ ASCONF; UNKNOWN ------------------> ... where ASCONF chunk of length 280 contains 2 parameters ... 1) Add IP address parameter (param length: 16) 2) Add/del IP address parameter (param length: 255) ... followed by an UNKNOWN chunk of e.g. 4 bytes. Here, the Address Parameter in the ASCONF chunk is even missing, too. This is just an example and similarly-crafted ASCONF chunks could be used just as well. The ASCONF chunk passes through sctp_verify_asconf() as all parameters passed sanity checks, and after walking, we ended up successfully at the chunk end boundary, and thus may invoke sctp_process_asconf(). Parameter walking is done with WORD_ROUND() to take padding into account. In sctp_process_asconf()'s TLV processing, we may fail in sctp_process_asconf_param() e.g., due to removal of the IP address that is also the source address of the packet containing the ASCONF chunk, and thus we need to add all TLVs after the failure to our ASCONF response to remote via helper function sctp_add_asconf_response(), which basically invokes a sctp_addto_chunk() adding the error parameters to the given skb. When walking to the next parameter this time, we proceed with ... length = ntohs(asconf_param->param_hdr.length); asconf_param = (void *)asconf_param + length; ... instead of the WORD_ROUND()'ed length, thus resulting here in an off-by-one that leads to reading the follow-up garbage parameter length of 12336, and thus throwing an skb_over_panic for the reply when trying to sctp_addto_chunk() next time, which implicitly calls the skb_put() with that length. Fix it by using sctp_walk_params() [ which is also used in INIT parameter processing ] macro in the verification *and* in ASCONF processing: it will make sure we don't spill over, that we walk parameters WORD_ROUND()'ed. Moreover, we're being more defensive and guard against unknown parameter types and missized addresses. Joint work with Vlad Yasevich. Fixes: b896b82be4ae ("[SCTP] ADDIP: Support for processing incoming ASCONF_ACK chunks.") Signed-off-by: Daniel Borkmann Signed-off-by: Vlad Yasevich Acked-by: Neil Horman Signed-off-by: David S. Miller diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h index 7f4eeb3..72a31db 100644 --- a/include/net/sctp/sm.h +++ b/include/net/sctp/sm.h @@ -248,9 +248,9 @@ struct sctp_chunk *sctp_make_asconf_update_ip(struct sctp_association *, int, __be16); struct sctp_chunk *sctp_make_asconf_set_prim(struct sctp_association *asoc, union sctp_addr *addr); -int sctp_verify_asconf(const struct sctp_association *asoc, - struct sctp_paramhdr *param_hdr, void *chunk_end, - struct sctp_paramhdr **errp); +bool sctp_verify_asconf(const struct sctp_association *asoc, + struct sctp_chunk *chunk, bool addr_param_needed, + struct sctp_paramhdr **errp); struct sctp_chunk *sctp_process_asconf(struct sctp_association *asoc, struct sctp_chunk *asconf); int sctp_process_asconf_ack(struct sctp_association *asoc, diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c index ae0e616..ab734be 100644 --- a/net/sctp/sm_make_chunk.c +++ b/net/sctp/sm_make_chunk.c @@ -3110,50 +3110,63 @@ static __be16 sctp_process_asconf_param(struct sctp_association *asoc, return SCTP_ERROR_NO_ERROR; } -/* Verify the ASCONF packet before we process it. */ -int sctp_verify_asconf(const struct sctp_association *asoc, - struct sctp_paramhdr *param_hdr, void *chunk_end, - struct sctp_paramhdr **errp) { - sctp_addip_param_t *asconf_param; +/* Verify the ASCONF packet before we process it. */ +bool sctp_verify_asconf(const struct sctp_association *asoc, + struct sctp_chunk *chunk, bool addr_param_needed, + struct sctp_paramhdr **errp) +{ + sctp_addip_chunk_t *addip = (sctp_addip_chunk_t *) chunk->chunk_hdr; union sctp_params param; - int length, plen; - - param.v = (sctp_paramhdr_t *) param_hdr; - while (param.v <= chunk_end - sizeof(sctp_paramhdr_t)) { - length = ntohs(param.p->length); - *errp = param.p; + bool addr_param_seen = false; - if (param.v > chunk_end - length || - length < sizeof(sctp_paramhdr_t)) - return 0; + sctp_walk_params(param, addip, addip_hdr.params) { + size_t length = ntohs(param.p->length); + *errp = param.p; switch (param.p->type) { + case SCTP_PARAM_ERR_CAUSE: + break; + case SCTP_PARAM_IPV4_ADDRESS: + if (length != sizeof(sctp_ipv4addr_param_t)) + return false; + addr_param_seen = true; + break; + case SCTP_PARAM_IPV6_ADDRESS: + if (length != sizeof(sctp_ipv6addr_param_t)) + return false; + addr_param_seen = true; + break; case SCTP_PARAM_ADD_IP: case SCTP_PARAM_DEL_IP: case SCTP_PARAM_SET_PRIMARY: - asconf_param = (sctp_addip_param_t *)param.v; - plen = ntohs(asconf_param->param_hdr.length); - if (plen < sizeof(sctp_addip_param_t) + - sizeof(sctp_paramhdr_t)) - return 0; + /* In ASCONF chunks, these need to be first. */ + if (addr_param_needed && !addr_param_seen) + return false; + length = ntohs(param.addip->param_hdr.length); + if (length < sizeof(sctp_addip_param_t) + + sizeof(sctp_paramhdr_t)) + return false; break; case SCTP_PARAM_SUCCESS_REPORT: case SCTP_PARAM_ADAPTATION_LAYER_IND: if (length != sizeof(sctp_addip_param_t)) - return 0; - + return false; break; default: - break; + /* This is unkown to us, reject! */ + return false; } - - param.v += WORD_ROUND(length); } - if (param.v != chunk_end) - return 0; + /* Remaining sanity checks. */ + if (addr_param_needed && !addr_param_seen) + return false; + if (!addr_param_needed && addr_param_seen) + return false; + if (param.v != chunk->chunk_end) + return false; - return 1; + return true; } /* Process an incoming ASCONF chunk with the next expected serial no. and @@ -3162,16 +3175,17 @@ int sctp_verify_asconf(const struct sctp_association *asoc, struct sctp_chunk *sctp_process_asconf(struct sctp_association *asoc, struct sctp_chunk *asconf) { + sctp_addip_chunk_t *addip = (sctp_addip_chunk_t *) asconf->chunk_hdr; + bool all_param_pass = true; + union sctp_params param; sctp_addiphdr_t *hdr; union sctp_addr_param *addr_param; sctp_addip_param_t *asconf_param; struct sctp_chunk *asconf_ack; - __be16 err_code; int length = 0; int chunk_len; __u32 serial; - int all_param_pass = 1; chunk_len = ntohs(asconf->chunk_hdr->length) - sizeof(sctp_chunkhdr_t); hdr = (sctp_addiphdr_t *)asconf->skb->data; @@ -3199,9 +3213,14 @@ struct sctp_chunk *sctp_process_asconf(struct sctp_association *asoc, goto done; /* Process the TLVs contained within the ASCONF chunk. */ - while (chunk_len > 0) { + sctp_walk_params(param, addip, addip_hdr.params) { + /* Skip preceeding address parameters. */ + if (param.p->type == SCTP_PARAM_IPV4_ADDRESS || + param.p->type == SCTP_PARAM_IPV6_ADDRESS) + continue; + err_code = sctp_process_asconf_param(asoc, asconf, - asconf_param); + param.addip); /* ADDIP 4.1 A7) * If an error response is received for a TLV parameter, * all TLVs with no response before the failed TLV are @@ -3209,28 +3228,20 @@ struct sctp_chunk *sctp_process_asconf(struct sctp_association *asoc, * the failed response are considered unsuccessful unless * a specific success indication is present for the parameter. */ - if (SCTP_ERROR_NO_ERROR != err_code) - all_param_pass = 0; - + if (err_code != SCTP_ERROR_NO_ERROR) + all_param_pass = false; if (!all_param_pass) - sctp_add_asconf_response(asconf_ack, - asconf_param->crr_id, err_code, - asconf_param); + sctp_add_asconf_response(asconf_ack, param.addip->crr_id, + err_code, param.addip); /* ADDIP 4.3 D11) When an endpoint receiving an ASCONF to add * an IP address sends an 'Out of Resource' in its response, it * MUST also fail any subsequent add or delete requests bundled * in the ASCONF. */ - if (SCTP_ERROR_RSRC_LOW == err_code) + if (err_code == SCTP_ERROR_RSRC_LOW) goto done; - - /* Move to the next ASCONF param. */ - length = ntohs(asconf_param->param_hdr.length); - asconf_param = (void *)asconf_param + length; - chunk_len -= length; } - done: asoc->peer.addip_serial++; diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c index c8f6063..bdea3df 100644 --- a/net/sctp/sm_statefuns.c +++ b/net/sctp/sm_statefuns.c @@ -3591,9 +3591,7 @@ sctp_disposition_t sctp_sf_do_asconf(struct net *net, struct sctp_chunk *asconf_ack = NULL; struct sctp_paramhdr *err_param = NULL; sctp_addiphdr_t *hdr; - union sctp_addr_param *addr_param; __u32 serial; - int length; if (!sctp_vtag_verify(chunk, asoc)) { sctp_add_cmd_sf(commands, SCTP_CMD_REPORT_BAD_TAG, @@ -3618,17 +3616,8 @@ sctp_disposition_t sctp_sf_do_asconf(struct net *net, hdr = (sctp_addiphdr_t *)chunk->skb->data; serial = ntohl(hdr->serial); - addr_param = (union sctp_addr_param *)hdr->params; - length = ntohs(addr_param->p.length); - if (length < sizeof(sctp_paramhdr_t)) - return sctp_sf_violation_paramlen(net, ep, asoc, type, arg, - (void *)addr_param, commands); - /* Verify the ASCONF chunk before processing it. */ - if (!sctp_verify_asconf(asoc, - (sctp_paramhdr_t *)((void *)addr_param + length), - (void *)chunk->chunk_end, - &err_param)) + if (!sctp_verify_asconf(asoc, chunk, true, &err_param)) return sctp_sf_violation_paramlen(net, ep, asoc, type, arg, (void *)err_param, commands); @@ -3745,10 +3734,7 @@ sctp_disposition_t sctp_sf_do_asconf_ack(struct net *net, rcvd_serial = ntohl(addip_hdr->serial); /* Verify the ASCONF-ACK chunk before processing it. */ - if (!sctp_verify_asconf(asoc, - (sctp_paramhdr_t *)addip_hdr->params, - (void *)asconf_ack->chunk_end, - &err_param)) + if (!sctp_verify_asconf(asoc, asconf_ack, false, &err_param)) return sctp_sf_violation_paramlen(net, ep, asoc, type, arg, (void *)err_param, commands); -- cgit v0.10.2 From b69040d8e39f20d5215a03502a8e8b4c6ab78395 Mon Sep 17 00:00:00 2001 From: Daniel Borkmann Date: Thu, 9 Oct 2014 22:55:32 +0200 Subject: net: sctp: fix panic on duplicate ASCONF chunks When receiving a e.g. semi-good formed connection scan in the form of ... -------------- INIT[ASCONF; ASCONF_ACK] -------------> <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------ -------------------- COOKIE-ECHO --------------------> <-------------------- COOKIE-ACK --------------------- ---------------- ASCONF_a; ASCONF_b -----------------> ... where ASCONF_a equals ASCONF_b chunk (at least both serials need to be equal), we panic an SCTP server! The problem is that good-formed ASCONF chunks that we reply with ASCONF_ACK chunks are cached per serial. Thus, when we receive a same ASCONF chunk twice (e.g. through a lost ASCONF_ACK), we do not need to process them again on the server side (that was the idea, also proposed in the RFC). Instead, we know it was cached and we just resend the cached chunk instead. So far, so good. Where things get nasty is in SCTP's side effect interpreter, that is, sctp_cmd_interpreter(): While incoming ASCONF_a (chunk = event_arg) is being marked !end_of_packet and !singleton, and we have an association context, we do not flush the outqueue the first time after processing the ASCONF_ACK singleton chunk via SCTP_CMD_REPLY. Instead, we keep it queued up, although we set local_cork to 1. Commit 2e3216cd54b1 changed the precedence, so that as long as we get bundled, incoming chunks we try possible bundling on outgoing queue as well. Before this commit, we would just flush the output queue. Now, while ASCONF_a's ASCONF_ACK sits in the corked outq, we continue to process the same ASCONF_b chunk from the packet. As we have cached the previous ASCONF_ACK, we find it, grab it and do another SCTP_CMD_REPLY command on it. So, effectively, we rip the chunk->list pointers and requeue the same ASCONF_ACK chunk another time. Since we process ASCONF_b, it's correctly marked with end_of_packet and we enforce an uncork, and thus flush, thus crashing the kernel. Fix it by testing if the ASCONF_ACK is currently pending and if that is the case, do not requeue it. When flushing the output queue we may relink the chunk for preparing an outgoing packet, but eventually unlink it when it's copied into the skb right before transmission. Joint work with Vlad Yasevich. Fixes: 2e3216cd54b1 ("sctp: Follow security requirement of responding with 1 packet") Signed-off-by: Daniel Borkmann Signed-off-by: Vlad Yasevich Signed-off-by: David S. Miller diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h index 9fbd856..856f01c 100644 --- a/include/net/sctp/sctp.h +++ b/include/net/sctp/sctp.h @@ -426,6 +426,11 @@ static inline void sctp_assoc_pending_pmtu(struct sock *sk, struct sctp_associat asoc->pmtu_pending = 0; } +static inline bool sctp_chunk_pending(const struct sctp_chunk *chunk) +{ + return !list_empty(&chunk->list); +} + /* Walk through a list of TLV parameters. Don't trust the * individual parameter lengths and instead depend on * the chunk length to indicate when to stop. Make sure diff --git a/net/sctp/associola.c b/net/sctp/associola.c index a88b852..f791edd 100644 --- a/net/sctp/associola.c +++ b/net/sctp/associola.c @@ -1668,6 +1668,8 @@ struct sctp_chunk *sctp_assoc_lookup_asconf_ack( * ack chunk whose serial number matches that of the request. */ list_for_each_entry(ack, &asoc->asconf_ack_list, transmitted_list) { + if (sctp_chunk_pending(ack)) + continue; if (ack->subh.addip_hdr->serial == serial) { sctp_chunk_hold(ack); return ack; -- cgit v0.10.2 From 26b87c7881006311828bb0ab271a551a62dcceb4 Mon Sep 17 00:00:00 2001 From: Daniel Borkmann Date: Thu, 9 Oct 2014 22:55:33 +0200 Subject: net: sctp: fix remote memory pressure from excessive queueing This scenario is not limited to ASCONF, just taken as one example triggering the issue. When receiving ASCONF probes in the form of ... -------------- INIT[ASCONF; ASCONF_ACK] -------------> <----------- INIT-ACK[ASCONF; ASCONF_ACK] ------------ -------------------- COOKIE-ECHO --------------------> <-------------------- COOKIE-ACK --------------------- ---- ASCONF_a; [ASCONF_b; ...; ASCONF_n;] JUNK ------> [...] ---- ASCONF_m; [ASCONF_o; ...; ASCONF_z;] JUNK ------> ... where ASCONF_a, ASCONF_b, ..., ASCONF_z are good-formed ASCONFs and have increasing serial numbers, we process such ASCONF chunk(s) marked with !end_of_packet and !singleton, since we have not yet reached the SCTP packet end. SCTP does only do verification on a chunk by chunk basis, as an SCTP packet is nothing more than just a container of a stream of chunks which it eats up one by one. We could run into the case that we receive a packet with a malformed tail, above marked as trailing JUNK. All previous chunks are here goodformed, so the stack will eat up all previous chunks up to this point. In case JUNK does not fit into a chunk header and there are no more other chunks in the input queue, or in case JUNK contains a garbage chunk header, but the encoded chunk length would exceed the skb tail, or we came here from an entirely different scenario and the chunk has pdiscard=1 mark (without having had a flush point), it will happen, that we will excessively queue up the association's output queue (a correct final chunk may then turn it into a response flood when flushing the queue ;)): I ran a simple script with incremental ASCONF serial numbers and could see the server side consuming excessive amount of RAM [before/after: up to 2GB and more]. The issue at heart is that the chunk train basically ends with !end_of_packet and !singleton markers and since commit 2e3216cd54b1 ("sctp: Follow security requirement of responding with 1 packet") therefore preventing an output queue flush point in sctp_do_sm() -> sctp_cmd_interpreter() on the input chunk (chunk = event_arg) even though local_cork is set, but its precedence has changed since then. In the normal case, the last chunk with end_of_packet=1 would trigger the queue flush to accommodate possible outgoing bundling. In the input queue, sctp_inq_pop() seems to do the right thing in terms of discarding invalid chunks. So, above JUNK will not enter the state machine and instead be released and exit the sctp_assoc_bh_rcv() chunk processing loop. It's simply the flush point being missing at loop exit. Adding a try-flush approach on the output queue might not work as the underlying infrastructure might be long gone at this point due to the side-effect interpreter run. One possibility, albeit a bit of a kludge, would be to defer invalid chunk freeing into the state machine in order to possibly trigger packet discards and thus indirectly a queue flush on error. It would surely be better to discard chunks as in the current, perhaps better controlled environment, but going back and forth, it's simply architecturally not possible. I tried various trailing JUNK attack cases and it seems to look good now. Joint work with Vlad Yasevich. Fixes: 2e3216cd54b1 ("sctp: Follow security requirement of responding with 1 packet") Signed-off-by: Daniel Borkmann Signed-off-by: Vlad Yasevich Signed-off-by: David S. Miller diff --git a/net/sctp/inqueue.c b/net/sctp/inqueue.c index 4de12af..7e8a16c 100644 --- a/net/sctp/inqueue.c +++ b/net/sctp/inqueue.c @@ -140,18 +140,9 @@ struct sctp_chunk *sctp_inq_pop(struct sctp_inq *queue) } else { /* Nothing to do. Next chunk in the packet, please. */ ch = (sctp_chunkhdr_t *) chunk->chunk_end; - /* Force chunk->skb->data to chunk->chunk_end. */ - skb_pull(chunk->skb, - chunk->chunk_end - chunk->skb->data); - - /* Verify that we have at least chunk headers - * worth of buffer left. - */ - if (skb_headlen(chunk->skb) < sizeof(sctp_chunkhdr_t)) { - sctp_chunk_free(chunk); - chunk = queue->in_progress = NULL; - } + skb_pull(chunk->skb, chunk->chunk_end - chunk->skb->data); + /* We are guaranteed to pull a SCTP header. */ } } @@ -187,24 +178,14 @@ struct sctp_chunk *sctp_inq_pop(struct sctp_inq *queue) skb_pull(chunk->skb, sizeof(sctp_chunkhdr_t)); chunk->subh.v = NULL; /* Subheader is no longer valid. */ - if (chunk->chunk_end < skb_tail_pointer(chunk->skb)) { + if (chunk->chunk_end + sizeof(sctp_chunkhdr_t) < + skb_tail_pointer(chunk->skb)) { /* This is not a singleton */ chunk->singleton = 0; } else if (chunk->chunk_end > skb_tail_pointer(chunk->skb)) { - /* RFC 2960, Section 6.10 Bundling - * - * Partial chunks MUST NOT be placed in an SCTP packet. - * If the receiver detects a partial chunk, it MUST drop - * the chunk. - * - * Since the end of the chunk is past the end of our buffer - * (which contains the whole packet, we can freely discard - * the whole packet. - */ - sctp_chunk_free(chunk); - chunk = queue->in_progress = NULL; - - return NULL; + /* Discard inside state machine. */ + chunk->pdiscard = 1; + chunk->chunk_end = skb_tail_pointer(chunk->skb); } else { /* We are at the end of the packet, so mark the chunk * in case we need to send a SACK. diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c index bdea3df..3ee27b7 100644 --- a/net/sctp/sm_statefuns.c +++ b/net/sctp/sm_statefuns.c @@ -170,6 +170,9 @@ sctp_chunk_length_valid(struct sctp_chunk *chunk, { __u16 chunk_length = ntohs(chunk->chunk_hdr->length); + /* Previously already marked? */ + if (unlikely(chunk->pdiscard)) + return 0; if (unlikely(chunk_length < required_length)) return 0; -- cgit v0.10.2 From c53fed07a03d8b2a2e3bdaba87768211fa55806c Mon Sep 17 00:00:00 2001 From: Vince Bridgers Date: Thu, 9 Oct 2014 10:08:42 -0500 Subject: MAINTAINERS: Update contact information for Vince Bridgers Signed-off-by: Vince Bridgers Signed-off-by: David S. Miller diff --git a/MAINTAINERS b/MAINTAINERS index 1a85af8..6dc0840 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -564,7 +564,7 @@ L: linux-alpha@vger.kernel.org F: arch/alpha/ ALTERA TRIPLE SPEED ETHERNET DRIVER -M: Vince Bridgers +M: Vince Bridgers L: netdev@vger.kernel.org L: nios2-dev@lists.rocketboards.org (moderated for non-subscribers) S: Maintained -- cgit v0.10.2 From 5bc26726ada73264c0fd7b93ccbe7d9e78b2b2d2 Mon Sep 17 00:00:00 2001 From: Nimrod Andy Date: Mon, 13 Oct 2014 10:53:48 +0800 Subject: net: fec: Fix sparse warnings with different lock contexts for basic block reproduce: make ARCH=arm C=1 2>fec.txt drivers/net/ethernet/freescale/fec_main.o cat fec.txt sparse warnings: drivers/net/ethernet/freescale/fec_main.c:2916:12: warning: context imbalance in 'fec_set_features' - different lock contexts for basic block Christopher Li suggest to change as below: if (need_lock) { lock(); do_something_real(); unlock(); } else { do_something_real(); } Reported-by: Fabio Estevam Suggested-by: Christopher Li Signed-off-by: Fugang Duan Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 87975b5..7a8209e 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -2912,20 +2912,12 @@ static void fec_poll_controller(struct net_device *dev) #endif #define FEATURES_NEED_QUIESCE NETIF_F_RXCSUM - -static int fec_set_features(struct net_device *netdev, +static inline void fec_enet_set_netdev_features(struct net_device *netdev, netdev_features_t features) { struct fec_enet_private *fep = netdev_priv(netdev); netdev_features_t changed = features ^ netdev->features; - /* Quiesce the device if necessary */ - if (netif_running(netdev) && changed & FEATURES_NEED_QUIESCE) { - napi_disable(&fep->napi); - netif_tx_lock_bh(netdev); - fec_stop(netdev); - } - netdev->features = features; /* Receive checksum has been changed */ @@ -2935,13 +2927,25 @@ static int fec_set_features(struct net_device *netdev, else fep->csum_flags &= ~FLAG_RX_CSUM_ENABLED; } +} + +static int fec_set_features(struct net_device *netdev, + netdev_features_t features) +{ + struct fec_enet_private *fep = netdev_priv(netdev); + netdev_features_t changed = features ^ netdev->features; - /* Resume the device after updates */ if (netif_running(netdev) && changed & FEATURES_NEED_QUIESCE) { + napi_disable(&fep->napi); + netif_tx_lock_bh(netdev); + fec_stop(netdev); + fec_enet_set_netdev_features(netdev, features); fec_restart(netdev); netif_tx_wake_all_queues(netdev); netif_tx_unlock_bh(netdev); napi_enable(&fep->napi); + } else { + fec_enet_set_netdev_features(netdev, features); } return 0; -- cgit v0.10.2 From 2c2b2f0cb9388df8aa8b5036cf18060ac77e6d94 Mon Sep 17 00:00:00 2001 From: Alexander Duyck Date: Fri, 10 Oct 2014 14:30:52 -0700 Subject: fm10k: Add skb->xmit_more support This change adds support for skb->xmit_more based on the changes that were made to igb to support the feature. The main changes are moving up the check for maybe_stop_tx so that we can check netif_xmit_stopped to determine if we must write the tail because we can add no further buffers. Acked-by: Matthew Vick Signed-off-by: Alexander Duyck Acked-by: Jeff Kirsher Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_main.c b/drivers/net/ethernet/intel/fm10k/fm10k_main.c index 9d7118a..e645af4 100644 --- a/drivers/net/ethernet/intel/fm10k/fm10k_main.c +++ b/drivers/net/ethernet/intel/fm10k/fm10k_main.c @@ -929,6 +929,30 @@ static bool fm10k_tx_desc_push(struct fm10k_ring *tx_ring, return i == tx_ring->count; } +static int __fm10k_maybe_stop_tx(struct fm10k_ring *tx_ring, u16 size) +{ + netif_stop_subqueue(tx_ring->netdev, tx_ring->queue_index); + + smp_mb(); + + /* We need to check again in a case another CPU has just + * made room available. */ + if (likely(fm10k_desc_unused(tx_ring) < size)) + return -EBUSY; + + /* A reprieve! - use start_queue because it doesn't call schedule */ + netif_start_subqueue(tx_ring->netdev, tx_ring->queue_index); + ++tx_ring->tx_stats.restart_queue; + return 0; +} + +static inline int fm10k_maybe_stop_tx(struct fm10k_ring *tx_ring, u16 size) +{ + if (likely(fm10k_desc_unused(tx_ring) >= size)) + return 0; + return __fm10k_maybe_stop_tx(tx_ring, size); +} + static void fm10k_tx_map(struct fm10k_ring *tx_ring, struct fm10k_tx_buffer *first) { @@ -1022,13 +1046,18 @@ static void fm10k_tx_map(struct fm10k_ring *tx_ring, tx_ring->next_to_use = i; + /* Make sure there is space in the ring for the next send. */ + fm10k_maybe_stop_tx(tx_ring, DESC_NEEDED); + /* notify HW of packet */ - writel(i, tx_ring->tail); + if (netif_xmit_stopped(txring_txq(tx_ring)) || !skb->xmit_more) { + writel(i, tx_ring->tail); - /* we need this if more than one processor can write to our tail - * at a time, it synchronizes IO on IA64/Altix systems - */ - mmiowb(); + /* we need this if more than one processor can write to our tail + * at a time, it synchronizes IO on IA64/Altix systems + */ + mmiowb(); + } return; dma_error: @@ -1048,30 +1077,6 @@ dma_error: tx_ring->next_to_use = i; } -static int __fm10k_maybe_stop_tx(struct fm10k_ring *tx_ring, u16 size) -{ - netif_stop_subqueue(tx_ring->netdev, tx_ring->queue_index); - - smp_mb(); - - /* We need to check again in a case another CPU has just - * made room available. */ - if (likely(fm10k_desc_unused(tx_ring) < size)) - return -EBUSY; - - /* A reprieve! - use start_queue because it doesn't call schedule */ - netif_start_subqueue(tx_ring->netdev, tx_ring->queue_index); - ++tx_ring->tx_stats.restart_queue; - return 0; -} - -static inline int fm10k_maybe_stop_tx(struct fm10k_ring *tx_ring, u16 size) -{ - if (likely(fm10k_desc_unused(tx_ring) >= size)) - return 0; - return __fm10k_maybe_stop_tx(tx_ring, size); -} - netdev_tx_t fm10k_xmit_frame_ring(struct sk_buff *skb, struct fm10k_ring *tx_ring) { @@ -1116,8 +1121,6 @@ netdev_tx_t fm10k_xmit_frame_ring(struct sk_buff *skb, fm10k_tx_map(tx_ring, first); - fm10k_maybe_stop_tx(tx_ring, DESC_NEEDED); - return NETDEV_TX_OK; out_drop: -- cgit v0.10.2 From 31eff81e94472ddb7549509bf4b6e93e1f6f7dc9 Mon Sep 17 00:00:00 2001 From: Alexander Aring Date: Fri, 10 Oct 2014 23:10:47 +0200 Subject: skbuff: fix ftrace handling in skb_unshare If the skb is not dropped afterwards we should run consume_skb instead kfree_skb. Inside of function skb_unshare we do always a kfree_skb, doesn't depend if skb_copy failed or was successful. This patch switch this behaviour like skb_share_check, if allocation of sk_buff failed we use kfree_skb otherwise consume_skb. Signed-off-by: Alexander Aring Signed-off-by: David S. Miller diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 3ab0749..a59d934 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -1203,7 +1203,12 @@ static inline struct sk_buff *skb_unshare(struct sk_buff *skb, might_sleep_if(pri & __GFP_WAIT); if (skb_cloned(skb)) { struct sk_buff *nskb = skb_copy(skb, pri); - kfree_skb(skb); /* Free our shared copy */ + + /* Free our shared copy */ + if (likely(nskb)) + consume_skb(skb); + else + kfree_skb(skb); skb = nskb; } return skb; -- cgit v0.10.2 From b2532eb9abd88384aa586169b54a3e53574f29f8 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Fri, 10 Oct 2014 18:06:35 -0700 Subject: tcp: fix ooo_okay setting vs Small Queues TCP Small Queues (tcp_tsq_handler()) can hold one reference on sk->sk_wmem_alloc, preventing skb->ooo_okay being set. We should relax test done to set skb->ooo_okay to take care of this extra reference. Minimal truesize of skb containing one byte of payload is SKB_TRUESIZE(1) Without this fix, we have more chance locking flows into the wrong transmit queue. Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 8d4eac7..0a5d97c 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -914,9 +914,13 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it, tcp_ca_event(sk, CA_EVENT_TX_START); /* if no packet is in qdisc/device queue, then allow XPS to select - * another queue. + * another queue. We can be called from tcp_tsq_handler() + * which holds one reference to sk_wmem_alloc. + * + * TODO: Ideally, in-flight pure ACK packets should not matter here. + * One way to get this would be to set skb->truesize = 2 on them. */ - skb->ooo_okay = sk_wmem_alloc_get(sk) == 0; + skb->ooo_okay = sk_wmem_alloc_get(sk) < SKB_TRUESIZE(1); skb_push(skb, tcp_header_size); skb_reset_transport_header(skb); -- cgit v0.10.2 From e0ee9c12157dc74e49e4731e0d07512e7d1ceb95 Mon Sep 17 00:00:00 2001 From: Alexei Starovoitov Date: Fri, 10 Oct 2014 20:30:23 -0700 Subject: x86: bpf_jit: fix two bugs in eBPF JIT compiler 1. JIT compiler using multi-pass approach to converge to final image size, since x86 instructions are variable length. It starts with large gaps between instructions (so some jumps may use imm32 instead of imm8) and iterates until total program size is the same as in previous pass. This algorithm works only if program size is strictly decreasing. Programs that use LD_ABS insn need additional code in prologue, but it was not emitted during 1st pass, so there was a chance that 2nd pass would adjust imm32->imm8 jump offsets to the same number of bytes as increase in prologue, which may cause algorithm to erroneously decide that size converged. Fix it by always emitting largest prologue in the first pass which is detected by oldproglen==0 check. Also change error check condition 'proglen != oldproglen' to fail gracefully. 2. while staring at the code realized that 64-byte buffer may not be enough when 1st insn is large, so increase it to 128 to avoid buffer overflow (theoretical maximum size of prologue+div is 109) and add runtime check. Fixes: 622582786c9e ("net: filter: x86: internal BPF JIT") Reported-by: Darrick J. Wong Signed-off-by: Alexei Starovoitov Tested-by: Darrick J. Wong Signed-off-by: David S. Miller diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index d56cd1f..3f62734 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -182,12 +182,17 @@ struct jit_context { bool seen_ld_abs; }; +/* maximum number of bytes emitted while JITing one eBPF insn */ +#define BPF_MAX_INSN_SIZE 128 +#define BPF_INSN_SAFETY 64 + static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, int oldproglen, struct jit_context *ctx) { struct bpf_insn *insn = bpf_prog->insnsi; int insn_cnt = bpf_prog->len; - u8 temp[64]; + bool seen_ld_abs = ctx->seen_ld_abs | (oldproglen == 0); + u8 temp[BPF_MAX_INSN_SIZE + BPF_INSN_SAFETY]; int i; int proglen = 0; u8 *prog = temp; @@ -225,7 +230,7 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image, EMIT2(0x31, 0xc0); /* xor eax, eax */ EMIT3(0x4D, 0x31, 0xED); /* xor r13, r13 */ - if (ctx->seen_ld_abs) { + if (seen_ld_abs) { /* r9d : skb->len - skb->data_len (headlen) * r10 : skb->data */ @@ -685,7 +690,7 @@ xadd: if (is_imm8(insn->off)) case BPF_JMP | BPF_CALL: func = (u8 *) __bpf_call_base + imm32; jmp_offset = func - (image + addrs[i]); - if (ctx->seen_ld_abs) { + if (seen_ld_abs) { EMIT2(0x41, 0x52); /* push %r10 */ EMIT2(0x41, 0x51); /* push %r9 */ /* need to adjust jmp offset, since @@ -699,7 +704,7 @@ xadd: if (is_imm8(insn->off)) return -EINVAL; } EMIT1_off32(0xE8, jmp_offset); - if (ctx->seen_ld_abs) { + if (seen_ld_abs) { EMIT2(0x41, 0x59); /* pop %r9 */ EMIT2(0x41, 0x5A); /* pop %r10 */ } @@ -804,7 +809,8 @@ emit_jmp: goto common_load; case BPF_LD | BPF_ABS | BPF_W: func = CHOOSE_LOAD_FUNC(imm32, sk_load_word); -common_load: ctx->seen_ld_abs = true; +common_load: + ctx->seen_ld_abs = seen_ld_abs = true; jmp_offset = func - (image + addrs[i]); if (!func || !is_simm32(jmp_offset)) { pr_err("unsupported bpf func %d addr %p image %p\n", @@ -878,6 +884,11 @@ common_load: ctx->seen_ld_abs = true; } ilen = prog - temp; + if (ilen > BPF_MAX_INSN_SIZE) { + pr_err("bpf_jit_compile fatal insn size error\n"); + return -EFAULT; + } + if (image) { if (unlikely(proglen + ilen > oldproglen)) { pr_err("bpf_jit_compile fatal error\n"); @@ -934,9 +945,11 @@ void bpf_int_jit_compile(struct bpf_prog *prog) goto out; } if (image) { - if (proglen != oldproglen) + if (proglen != oldproglen) { pr_err("bpf_jit: proglen=%d != oldproglen=%d\n", proglen, oldproglen); + goto out; + } break; } if (proglen == oldproglen) { -- cgit v0.10.2 From 02ea80741a25435123e8a5ca40cac6a0bcf0c9f1 Mon Sep 17 00:00:00 2001 From: Li RongQing Date: Sat, 11 Oct 2014 13:03:34 +0800 Subject: ipv6: remove aca_lock spinlock from struct ifacaddr6 no user uses this lock. Signed-off-by: Li RongQing Signed-off-by: David S. Miller diff --git a/include/net/if_inet6.h b/include/net/if_inet6.h index 55a8d40..98e5f95 100644 --- a/include/net/if_inet6.h +++ b/include/net/if_inet6.h @@ -146,7 +146,6 @@ struct ifacaddr6 { struct ifacaddr6 *aca_next; int aca_users; atomic_t aca_refcnt; - spinlock_t aca_lock; unsigned long aca_cstamp; unsigned long aca_tstamp; }; diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c index f5e319a..baf2742 100644 --- a/net/ipv6/anycast.c +++ b/net/ipv6/anycast.c @@ -235,7 +235,6 @@ static struct ifacaddr6 *aca_alloc(struct rt6_info *rt, /* aca_tstamp should be updated upon changes */ aca->aca_cstamp = aca->aca_tstamp = jiffies; atomic_set(&aca->aca_refcnt, 1); - spin_lock_init(&aca->aca_lock); return aca; } -- cgit v0.10.2 From f28460b229919387b2f97f3a688d0dd86cc819c9 Mon Sep 17 00:00:00 2001 From: Luwei Zhou Date: Fri, 10 Oct 2014 13:15:28 +0800 Subject: net: fec: ptp: Use the 31-bit ptp timer. When ptp switches from software adjustment to hardware ajustment, linux ptp can't converge. It is caused by the IP limit. Hardware adjustment logcial have issue when ptp counter runs over 0x80000000(31 bit counter). The internal IP reference manual already remove 32bit free-running count support. This patch replace the 32-bit PTP timer with 31-bit. Signed-off-by: Luwei Zhou Signed-off-by: Frank Li Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c index cca3617..8016bdd 100644 --- a/drivers/net/ethernet/freescale/fec_ptp.c +++ b/drivers/net/ethernet/freescale/fec_ptp.c @@ -70,6 +70,7 @@ #define FEC_TS_TIMESTAMP 0x418 #define FEC_CC_MULT (1 << 31) +#define FEC_COUNTER_PERIOD (1 << 31) /** * fec_ptp_read - read raw cycle counter (to be used by time counter) * @cc: the cyclecounter structure @@ -113,14 +114,15 @@ void fec_ptp_start_cyclecounter(struct net_device *ndev) /* 1ns counter */ writel(inc << FEC_T_INC_OFFSET, fep->hwp + FEC_ATIME_INC); - /* use free running count */ - writel(0, fep->hwp + FEC_ATIME_EVT_PERIOD); + /* use 31-bit timer counter */ + writel(FEC_COUNTER_PERIOD, fep->hwp + FEC_ATIME_EVT_PERIOD); - writel(FEC_T_CTRL_ENABLE, fep->hwp + FEC_ATIME_CTRL); + writel(FEC_T_CTRL_ENABLE | FEC_T_CTRL_PERIOD_RST, + fep->hwp + FEC_ATIME_CTRL); memset(&fep->cc, 0, sizeof(fep->cc)); fep->cc.read = fec_ptp_read; - fep->cc.mask = CLOCKSOURCE_MASK(32); + fep->cc.mask = CLOCKSOURCE_MASK(31); fep->cc.shift = 31; fep->cc.mult = FEC_CC_MULT; -- cgit v0.10.2 From 89bddcda7e4f4ff2586e736427405115c362bed4 Mon Sep 17 00:00:00 2001 From: Luwei Zhou Date: Fri, 10 Oct 2014 13:15:29 +0800 Subject: net: fec: ptp: Use hardware algorithm to adjust PTP counter. The FEC IP supports hardware adjustment for ptp timer. Refer to the description of ENET_ATCOR and ENET_ATINC registers in the spec about the hardware adjustment. This patch uses hardware support to adjust the ptp offset and frequency on the slave side. Signed-off-by: Luwei Zhou Signed-off-by: Frank Li Signed-off-by: Fugang Duan Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h index 1d5e182..b0e6025 100644 --- a/drivers/net/ethernet/freescale/fec.h +++ b/drivers/net/ethernet/freescale/fec.h @@ -484,6 +484,9 @@ struct fec_enet_private { unsigned int itr_clk_rate; u32 rx_copybreak; + + /* ptp clock period in ns*/ + unsigned int ptp_inc; }; void fec_ptp_init(struct platform_device *pdev); diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c index 8016bdd..f5ee460 100644 --- a/drivers/net/ethernet/freescale/fec_ptp.c +++ b/drivers/net/ethernet/freescale/fec_ptp.c @@ -145,32 +145,59 @@ void fec_ptp_start_cyclecounter(struct net_device *ndev) */ static int fec_ptp_adjfreq(struct ptp_clock_info *ptp, s32 ppb) { - u64 diff; unsigned long flags; int neg_adj = 0; - u32 mult = FEC_CC_MULT; + u32 i, tmp; + u32 corr_inc, corr_period; + u32 corr_ns; + u64 lhs, rhs; struct fec_enet_private *fep = container_of(ptp, struct fec_enet_private, ptp_caps); + if (ppb == 0) + return 0; + if (ppb < 0) { ppb = -ppb; neg_adj = 1; } - diff = mult; - diff *= ppb; - diff = div_u64(diff, 1000000000ULL); + /* In theory, corr_inc/corr_period = ppb/NSEC_PER_SEC; + * Try to find the corr_inc between 1 to fep->ptp_inc to + * meet adjustment requirement. + */ + lhs = NSEC_PER_SEC; + rhs = (u64)ppb * (u64)fep->ptp_inc; + for (i = 1; i <= fep->ptp_inc; i++) { + if (lhs >= rhs) { + corr_inc = i; + corr_period = div_u64(lhs, rhs); + break; + } + lhs += NSEC_PER_SEC; + } + /* Not found? Set it to high value - double speed + * correct in every clock step. + */ + if (i > fep->ptp_inc) { + corr_inc = fep->ptp_inc; + corr_period = 1; + } + + if (neg_adj) + corr_ns = fep->ptp_inc - corr_inc; + else + corr_ns = fep->ptp_inc + corr_inc; spin_lock_irqsave(&fep->tmreg_lock, flags); - /* - * dummy read to set cycle_last in tc to now. - * So use adjusted mult to calculate when next call - * timercounter_read. - */ - timecounter_read(&fep->tc); - fep->cc.mult = neg_adj ? mult - diff : mult + diff; + tmp = readl(fep->hwp + FEC_ATIME_INC) & FEC_T_INC_MASK; + tmp |= corr_ns << FEC_T_INC_CORR_OFFSET; + writel(tmp, fep->hwp + FEC_ATIME_INC); + writel(corr_period, fep->hwp + FEC_ATIME_CORR); + /* dummy read to update the timer. */ + timecounter_read(&fep->tc); spin_unlock_irqrestore(&fep->tmreg_lock, flags); @@ -190,12 +217,19 @@ static int fec_ptp_adjtime(struct ptp_clock_info *ptp, s64 delta) container_of(ptp, struct fec_enet_private, ptp_caps); unsigned long flags; u64 now; + u32 counter; spin_lock_irqsave(&fep->tmreg_lock, flags); now = timecounter_read(&fep->tc); now += delta; + /* Get the timer value based on adjusted timestamp. + * Update the counter with the masked value. + */ + counter = now & fep->cc.mask; + writel(counter, fep->hwp + FEC_ATIME); + /* reset the timecounter */ timecounter_init(&fep->tc, &fep->cc, now); @@ -246,6 +280,7 @@ static int fec_ptp_settime(struct ptp_clock_info *ptp, u64 ns; unsigned long flags; + u32 counter; mutex_lock(&fep->ptp_clk_mutex); /* Check the ptp clock */ @@ -256,8 +291,13 @@ static int fec_ptp_settime(struct ptp_clock_info *ptp, ns = ts->tv_sec * 1000000000ULL; ns += ts->tv_nsec; + /* Get the timer value based on timestamp. + * Update the counter with the masked value. + */ + counter = ns & fep->cc.mask; spin_lock_irqsave(&fep->tmreg_lock, flags); + writel(counter, fep->hwp + FEC_ATIME); timecounter_init(&fep->tc, &fep->cc, ns); spin_unlock_irqrestore(&fep->tmreg_lock, flags); mutex_unlock(&fep->ptp_clk_mutex); @@ -396,6 +436,7 @@ void fec_ptp_init(struct platform_device *pdev) fep->ptp_caps.enable = fec_ptp_enable; fep->cycle_speed = clk_get_rate(fep->clk_ptp); + fep->ptp_inc = NSEC_PER_SEC / fep->cycle_speed; spin_lock_init(&fep->tmreg_lock); -- cgit v0.10.2 From 278d24047891a1bf4a98128eaa8ecafd019e58c2 Mon Sep 17 00:00:00 2001 From: Luwei Zhou Date: Fri, 10 Oct 2014 13:15:30 +0800 Subject: net: fec: ptp: Enable PPS output based on ptp clock FEC ptp timer has 4 channel compare/trigger function. It can be used to enable pps output. The pulse would be ouput high exactly on N second. The pulse ouput high on compare event mode is used to produce pulse per second. The pulse width would be one cycle based on ptp timer clock source.Since 31-bit ptp hardware timer is used, the timer will wrap more than 2 seconds. We need to reload the compare compare event about every 1 second. Signed-off-by: Luwei Zhou Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h index b0e6025..1e65917 100644 --- a/drivers/net/ethernet/freescale/fec.h +++ b/drivers/net/ethernet/freescale/fec.h @@ -487,12 +487,19 @@ struct fec_enet_private { /* ptp clock period in ns*/ unsigned int ptp_inc; + + /* pps */ + int pps_channel; + unsigned int reload_period; + int pps_enable; + unsigned int next_counter; }; void fec_ptp_init(struct platform_device *pdev); void fec_ptp_start_cyclecounter(struct net_device *ndev); int fec_ptp_set(struct net_device *ndev, struct ifreq *ifr); int fec_ptp_get(struct net_device *ndev, struct ifreq *ifr); +uint fec_ptp_check_pps_event(struct fec_enet_private *fep); /****************************************************************************/ #endif /* FEC_H */ diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 7a8209e..e364d1f 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -1622,6 +1622,8 @@ fec_enet_interrupt(int irq, void *dev_id) complete(&fep->mdio_done); } + fec_ptp_check_pps_event(fep); + return ret; } diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c index f5ee460..0fdcdc9 100644 --- a/drivers/net/ethernet/freescale/fec_ptp.c +++ b/drivers/net/ethernet/freescale/fec_ptp.c @@ -61,6 +61,24 @@ #define FEC_T_INC_CORR_MASK 0x00007f00 #define FEC_T_INC_CORR_OFFSET 8 +#define FEC_T_CTRL_PINPER 0x00000080 +#define FEC_T_TF0_MASK 0x00000001 +#define FEC_T_TF0_OFFSET 0 +#define FEC_T_TF1_MASK 0x00000002 +#define FEC_T_TF1_OFFSET 1 +#define FEC_T_TF2_MASK 0x00000004 +#define FEC_T_TF2_OFFSET 2 +#define FEC_T_TF3_MASK 0x00000008 +#define FEC_T_TF3_OFFSET 3 +#define FEC_T_TDRE_MASK 0x00000001 +#define FEC_T_TDRE_OFFSET 0 +#define FEC_T_TMODE_MASK 0x0000003C +#define FEC_T_TMODE_OFFSET 2 +#define FEC_T_TIE_MASK 0x00000040 +#define FEC_T_TIE_OFFSET 6 +#define FEC_T_TF_MASK 0x00000080 +#define FEC_T_TF_OFFSET 7 + #define FEC_ATIME_CTRL 0x400 #define FEC_ATIME 0x404 #define FEC_ATIME_EVT_OFFSET 0x408 @@ -69,8 +87,143 @@ #define FEC_ATIME_INC 0x414 #define FEC_TS_TIMESTAMP 0x418 +#define FEC_TGSR 0x604 +#define FEC_TCSR(n) (0x608 + n * 0x08) +#define FEC_TCCR(n) (0x60C + n * 0x08) +#define MAX_TIMER_CHANNEL 3 +#define FEC_TMODE_TOGGLE 0x05 +#define FEC_HIGH_PULSE 0x0F + #define FEC_CC_MULT (1 << 31) #define FEC_COUNTER_PERIOD (1 << 31) +#define PPS_OUPUT_RELOAD_PERIOD NSEC_PER_SEC +#define FEC_CHANNLE_0 0 +#define DEFAULT_PPS_CHANNEL FEC_CHANNLE_0 + +/** + * fec_ptp_enable_pps + * @fep: the fec_enet_private structure handle + * @enable: enable the channel pps output + * + * This function enble the PPS ouput on the timer channel. + */ +static int fec_ptp_enable_pps(struct fec_enet_private *fep, uint enable) +{ + unsigned long flags; + u32 val, tempval; + int inc; + struct timespec ts; + u64 ns; + u32 remainder; + val = 0; + + if (!(fep->hwts_tx_en || fep->hwts_rx_en)) { + dev_err(&fep->pdev->dev, "No ptp stack is running\n"); + return -EINVAL; + } + + if (fep->pps_enable == enable) + return 0; + + fep->pps_channel = DEFAULT_PPS_CHANNEL; + fep->reload_period = PPS_OUPUT_RELOAD_PERIOD; + inc = fep->ptp_inc; + + spin_lock_irqsave(&fep->tmreg_lock, flags); + + if (enable) { + /* clear capture or output compare interrupt status if have. + */ + writel(FEC_T_TF_MASK, fep->hwp + FEC_TCSR(fep->pps_channel)); + + /* It is recommended to doulbe check the TMODE field in the + * TCSR register to be cleared before the first compare counter + * is written into TCCR register. Just add a double check. + */ + val = readl(fep->hwp + FEC_TCSR(fep->pps_channel)); + do { + val &= ~(FEC_T_TMODE_MASK); + writel(val, fep->hwp + FEC_TCSR(fep->pps_channel)); + val = readl(fep->hwp + FEC_TCSR(fep->pps_channel)); + } while (val & FEC_T_TMODE_MASK); + + /* Dummy read counter to update the counter */ + timecounter_read(&fep->tc); + /* We want to find the first compare event in the next + * second point. So we need to know what the ptp time + * is now and how many nanoseconds is ahead to get next second. + * The remaining nanosecond ahead before the next second would be + * NSEC_PER_SEC - ts.tv_nsec. Add the remaining nanoseconds + * to current timer would be next second. + */ + tempval = readl(fep->hwp + FEC_ATIME_CTRL); + tempval |= FEC_T_CTRL_CAPTURE; + writel(tempval, fep->hwp + FEC_ATIME_CTRL); + + tempval = readl(fep->hwp + FEC_ATIME); + /* Convert the ptp local counter to 1588 timestamp */ + ns = timecounter_cyc2time(&fep->tc, tempval); + ts.tv_sec = div_u64_rem(ns, 1000000000ULL, &remainder); + ts.tv_nsec = remainder; + + /* The tempval is less than 3 seconds, and so val is less than + * 4 seconds. No overflow for 32bit calculation. + */ + val = NSEC_PER_SEC - (u32)ts.tv_nsec + tempval; + + /* Need to consider the situation that the current time is + * very close to the second point, which means NSEC_PER_SEC + * - ts.tv_nsec is close to be zero(For example 20ns); Since the timer + * is still running when we calculate the first compare event, it is + * possible that the remaining nanoseonds run out before the compare + * counter is calculated and written into TCCR register. To avoid + * this possibility, we will set the compare event to be the next + * of next second. The current setting is 31-bit timer and wrap + * around over 2 seconds. So it is okay to set the next of next + * seond for the timer. + */ + val += NSEC_PER_SEC; + + /* We add (2 * NSEC_PER_SEC - (u32)ts.tv_nsec) to current + * ptp counter, which maybe cause 32-bit wrap. Since the + * (NSEC_PER_SEC - (u32)ts.tv_nsec) is less than 2 second. + * We can ensure the wrap will not cause issue. If the offset + * is bigger than fep->cc.mask would be a error. + */ + val &= fep->cc.mask; + writel(val, fep->hwp + FEC_TCCR(fep->pps_channel)); + + /* Calculate the second the compare event timestamp */ + fep->next_counter = (val + fep->reload_period) & fep->cc.mask; + + /* * Enable compare event when overflow */ + val = readl(fep->hwp + FEC_ATIME_CTRL); + val |= FEC_T_CTRL_PINPER; + writel(val, fep->hwp + FEC_ATIME_CTRL); + + /* Compare channel setting. */ + val = readl(fep->hwp + FEC_TCSR(fep->pps_channel)); + val |= (1 << FEC_T_TF_OFFSET | 1 << FEC_T_TIE_OFFSET); + val &= ~(1 << FEC_T_TDRE_OFFSET); + val &= ~(FEC_T_TMODE_MASK); + val |= (FEC_HIGH_PULSE << FEC_T_TMODE_OFFSET); + writel(val, fep->hwp + FEC_TCSR(fep->pps_channel)); + + /* Write the second compare event timestamp and calculate + * the third timestamp. Refer the TCCR register detail in the spec. + */ + writel(fep->next_counter, fep->hwp + FEC_TCCR(fep->pps_channel)); + fep->next_counter = (fep->next_counter + fep->reload_period) & fep->cc.mask; + } else { + writel(0, fep->hwp + FEC_TCSR(fep->pps_channel)); + } + + fep->pps_enable = enable; + spin_unlock_irqrestore(&fep->tmreg_lock, flags); + + return 0; +} + /** * fec_ptp_read - read raw cycle counter (to be used by time counter) * @cc: the cyclecounter structure @@ -314,6 +467,15 @@ static int fec_ptp_settime(struct ptp_clock_info *ptp, static int fec_ptp_enable(struct ptp_clock_info *ptp, struct ptp_clock_request *rq, int on) { + struct fec_enet_private *fep = + container_of(ptp, struct fec_enet_private, ptp_caps); + int ret = 0; + + if (rq->type == PTP_CLK_REQ_PPS) { + ret = fec_ptp_enable_pps(fep, on); + + return ret; + } return -EOPNOTSUPP; } @@ -428,7 +590,7 @@ void fec_ptp_init(struct platform_device *pdev) fep->ptp_caps.n_ext_ts = 0; fep->ptp_caps.n_per_out = 0; fep->ptp_caps.n_pins = 0; - fep->ptp_caps.pps = 0; + fep->ptp_caps.pps = 1; fep->ptp_caps.adjfreq = fec_ptp_adjfreq; fep->ptp_caps.adjtime = fec_ptp_adjtime; fep->ptp_caps.gettime = fec_ptp_gettime; @@ -452,3 +614,36 @@ void fec_ptp_init(struct platform_device *pdev) schedule_delayed_work(&fep->time_keep, HZ); } + +/** + * fec_ptp_check_pps_event + * @fep: the fec_enet_private structure handle + * + * This function check the pps event and reload the timer compare counter. + */ +uint fec_ptp_check_pps_event(struct fec_enet_private *fep) +{ + u32 val; + u8 channel = fep->pps_channel; + struct ptp_clock_event event; + + val = readl(fep->hwp + FEC_TCSR(channel)); + if (val & FEC_T_TF_MASK) { + /* Write the next next compare(not the next according the spec) + * value to the register + */ + writel(fep->next_counter, fep->hwp + FEC_TCCR(channel)); + do { + writel(val, fep->hwp + FEC_TCSR(channel)); + } while (readl(fep->hwp + FEC_TCSR(channel)) & FEC_T_TF_MASK); + + /* Update the counter; */ + fep->next_counter = (fep->next_counter + fep->reload_period) & fep->cc.mask; + + event.type = PTP_CLOCK_PPS; + ptp_clock_event(fep->ptp_clock, &event); + return 1; + } + + return 0; +} -- cgit v0.10.2 From 1bdc07ebabefd19b56d1d36584a401ff6085fa71 Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Sat, 11 Oct 2014 13:46:29 +0200 Subject: isdn/gigaset: missing break in do_facility_req If we take the unsupported supplementary service notification mask path, we end up falling through and overwriting the error code. Insert a break statement to skip the remainder of the switch case and proceed to sending the reply message. Spotted with Coverity. Reported-by: Dave Jones Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller diff --git a/drivers/isdn/gigaset/capi.c b/drivers/isdn/gigaset/capi.c index 3286903..a2eabe9 100644 --- a/drivers/isdn/gigaset/capi.c +++ b/drivers/isdn/gigaset/capi.c @@ -1180,6 +1180,7 @@ static void do_facility_req(struct gigaset_capi_ctr *iif, confparam[3] = 2; /* length */ capimsg_setu16(confparam, 4, CapiSupplementaryServiceNotSupported); + break; } info = CapiSuccess; confparam[3] = 2; /* length */ -- cgit v0.10.2 From ee7ff5fed25711a26da7826071e6ede8af049ad2 Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Sat, 11 Oct 2014 13:46:29 +0200 Subject: isdn/gigaset: make sure controller name is null terminated In gigaset_isdn_regdev, the name field may not have a null terminator if the source string's length is equal to the buffer size. Fix by zero filling the structure and excluding the last byte of the name field from the copy. Spotted with Coverity. Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller diff --git a/drivers/isdn/gigaset/capi.c b/drivers/isdn/gigaset/capi.c index a2eabe9..044392c 100644 --- a/drivers/isdn/gigaset/capi.c +++ b/drivers/isdn/gigaset/capi.c @@ -2358,7 +2358,7 @@ int gigaset_isdn_regdev(struct cardstate *cs, const char *isdnid) struct gigaset_capi_ctr *iif; int rc; - iif = kmalloc(sizeof(*iif), GFP_KERNEL); + iif = kzalloc(sizeof(*iif), GFP_KERNEL); if (!iif) { pr_err("%s: out of memory\n", __func__); return -ENOMEM; @@ -2367,7 +2367,7 @@ int gigaset_isdn_regdev(struct cardstate *cs, const char *isdnid) /* prepare controller structure */ iif->ctr.owner = THIS_MODULE; iif->ctr.driverdata = cs; - strncpy(iif->ctr.name, isdnid, sizeof(iif->ctr.name)); + strncpy(iif->ctr.name, isdnid, sizeof(iif->ctr.name) - 1); iif->ctr.driver_name = "gigaset"; iif->ctr.load_firmware = NULL; iif->ctr.reset_ctr = NULL; -- cgit v0.10.2 From 097933ddcd28ef99c116651b20fd2e06717e0f0d Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Sat, 11 Oct 2014 13:46:29 +0200 Subject: isdn/gigaset: limit raw CAPI message dump length In dump_rawmsg, the length field from a received data package was used unscrutinized, allowing an attacker to control the size of the allocated buffer and the number of times the output loop iterates. Fix by limiting to a reasonable value. Spotted with Coverity. Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller diff --git a/drivers/isdn/gigaset/capi.c b/drivers/isdn/gigaset/capi.c index 044392c..47e2a91 100644 --- a/drivers/isdn/gigaset/capi.c +++ b/drivers/isdn/gigaset/capi.c @@ -250,6 +250,8 @@ static inline void dump_rawmsg(enum debuglevel level, const char *tag, l -= 12; if (l <= 0) return; + if (l > 64) + l = 64; /* arbitrary limit */ dbgline = kmalloc(3 * l, GFP_ATOMIC); if (!dbgline) return; -- cgit v0.10.2 From 846ac30135e7c5e03b487c16c87ccb1ab020a01f Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Sat, 11 Oct 2014 13:46:29 +0200 Subject: isdn/gigaset: fix NULL pointer dereference In do_action, a NULL pointer might be passed to function start_dial which will dereference it. Fix by adding a check for NULL before the call. Spotted with Coverity. Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller diff --git a/drivers/isdn/gigaset/ev-layer.c b/drivers/isdn/gigaset/ev-layer.c index dcae14a..0f699eb 100644 --- a/drivers/isdn/gigaset/ev-layer.c +++ b/drivers/isdn/gigaset/ev-layer.c @@ -1380,6 +1380,11 @@ static void do_action(int action, struct cardstate *cs, /* events from the LL */ case ACT_DIAL: + if (!ev->ptr) { + *p_genresp = 1; + *p_resp_code = RSP_ERROR; + break; + } start_dial(at_state, ev->ptr, ev->parameter); break; case ACT_ACCEPT: -- cgit v0.10.2 From b8324f94202af7dc688576259803a2ef9a98d655 Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Sat, 11 Oct 2014 13:46:30 +0200 Subject: isdn/gigaset: fix non-heap pointer deallocation at_state structures may be allocated individually or as part of a cardstate or bc_state structure. The disconnect() function handled both cases, creating a risk that it might try to deallocate an at_state structure that had not been allocated individually. Fix by splitting disconnect() into two variants handling cases with and without an associated B channel separately, and adding an explicit check. Spotted with Coverity. Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller diff --git a/drivers/isdn/gigaset/ev-layer.c b/drivers/isdn/gigaset/ev-layer.c index 0f699eb..c8ced12 100644 --- a/drivers/isdn/gigaset/ev-layer.c +++ b/drivers/isdn/gigaset/ev-layer.c @@ -604,14 +604,14 @@ void gigaset_handle_modem_response(struct cardstate *cs) } EXPORT_SYMBOL_GPL(gigaset_handle_modem_response); -/* disconnect +/* disconnect_nobc * process closing of connection associated with given AT state structure + * without B channel */ -static void disconnect(struct at_state_t **at_state_p) +static void disconnect_nobc(struct at_state_t **at_state_p, + struct cardstate *cs) { unsigned long flags; - struct bc_state *bcs = (*at_state_p)->bcs; - struct cardstate *cs = (*at_state_p)->cs; spin_lock_irqsave(&cs->lock, flags); ++(*at_state_p)->seq_index; @@ -622,23 +622,44 @@ static void disconnect(struct at_state_t **at_state_p) gig_dbg(DEBUG_EVENT, "Scheduling PC_UMMODE"); cs->commands_pending = 1; } - spin_unlock_irqrestore(&cs->lock, flags); - if (bcs) { - /* B channel assigned: invoke hardware specific handler */ - cs->ops->close_bchannel(bcs); - /* notify LL */ - if (bcs->chstate & (CHS_D_UP | CHS_NOTIFY_LL)) { - bcs->chstate &= ~(CHS_D_UP | CHS_NOTIFY_LL); - gigaset_isdn_hupD(bcs); - } - } else { - /* no B channel assigned: just deallocate */ - spin_lock_irqsave(&cs->lock, flags); + /* check for and deallocate temporary AT state */ + if (!list_empty(&(*at_state_p)->list)) { list_del(&(*at_state_p)->list); kfree(*at_state_p); *at_state_p = NULL; - spin_unlock_irqrestore(&cs->lock, flags); + } + + spin_unlock_irqrestore(&cs->lock, flags); +} + +/* disconnect_bc + * process closing of connection associated with given AT state structure + * and B channel + */ +static void disconnect_bc(struct at_state_t *at_state, + struct cardstate *cs, struct bc_state *bcs) +{ + unsigned long flags; + + spin_lock_irqsave(&cs->lock, flags); + ++at_state->seq_index; + + /* revert to selected idle mode */ + if (!cs->cidmode) { + cs->at_state.pending_commands |= PC_UMMODE; + gig_dbg(DEBUG_EVENT, "Scheduling PC_UMMODE"); + cs->commands_pending = 1; + } + spin_unlock_irqrestore(&cs->lock, flags); + + /* invoke hardware specific handler */ + cs->ops->close_bchannel(bcs); + + /* notify LL */ + if (bcs->chstate & (CHS_D_UP | CHS_NOTIFY_LL)) { + bcs->chstate &= ~(CHS_D_UP | CHS_NOTIFY_LL); + gigaset_isdn_hupD(bcs); } } @@ -646,7 +667,7 @@ static void disconnect(struct at_state_t **at_state_p) * get a free AT state structure: either one of those associated with the * B channels of the Gigaset device, or if none of those is available, * a newly allocated one with bcs=NULL - * The structure should be freed by calling disconnect() after use. + * The structure should be freed by calling disconnect_nobc() after use. */ static inline struct at_state_t *get_free_channel(struct cardstate *cs, int cid) @@ -1057,7 +1078,7 @@ static void do_action(int action, struct cardstate *cs, struct event_t *ev) { struct at_state_t *at_state = *p_at_state; - struct at_state_t *at_state2; + struct bc_state *bcs2; unsigned long flags; int channel; @@ -1156,8 +1177,8 @@ static void do_action(int action, struct cardstate *cs, break; case ACT_RING: /* get fresh AT state structure for new CID */ - at_state2 = get_free_channel(cs, ev->parameter); - if (!at_state2) { + at_state = get_free_channel(cs, ev->parameter); + if (!at_state) { dev_warn(cs->dev, "RING ignored: could not allocate channel structure\n"); break; @@ -1166,16 +1187,16 @@ static void do_action(int action, struct cardstate *cs, /* initialize AT state structure * note that bcs may be NULL if no B channel is free */ - at_state2->ConState = 700; + at_state->ConState = 700; for (i = 0; i < STR_NUM; ++i) { - kfree(at_state2->str_var[i]); - at_state2->str_var[i] = NULL; + kfree(at_state->str_var[i]); + at_state->str_var[i] = NULL; } - at_state2->int_var[VAR_ZCTP] = -1; + at_state->int_var[VAR_ZCTP] = -1; spin_lock_irqsave(&cs->lock, flags); - at_state2->timer_expires = RING_TIMEOUT; - at_state2->timer_active = 1; + at_state->timer_expires = RING_TIMEOUT; + at_state->timer_active = 1; spin_unlock_irqrestore(&cs->lock, flags); break; case ACT_ICALL: @@ -1213,14 +1234,17 @@ static void do_action(int action, struct cardstate *cs, case ACT_DISCONNECT: cs->cur_at_seq = SEQ_NONE; at_state->cid = -1; - if (bcs && cs->onechannel && cs->dle) { + if (!bcs) { + disconnect_nobc(p_at_state, cs); + } else if (cs->onechannel && cs->dle) { /* Check for other open channels not needed: * DLE only used for M10x with one B channel. */ at_state->pending_commands |= PC_DLE0; cs->commands_pending = 1; - } else - disconnect(p_at_state); + } else { + disconnect_bc(at_state, cs, bcs); + } break; case ACT_FAKEDLE0: at_state->int_var[VAR_ZDLE] = 0; @@ -1228,25 +1252,27 @@ static void do_action(int action, struct cardstate *cs, /* fall through */ case ACT_DLE0: cs->cur_at_seq = SEQ_NONE; - at_state2 = &cs->bcs[cs->curchannel].at_state; - disconnect(&at_state2); + bcs2 = cs->bcs + cs->curchannel; + disconnect_bc(&bcs2->at_state, cs, bcs2); break; case ACT_ABORTHUP: cs->cur_at_seq = SEQ_NONE; dev_warn(cs->dev, "Could not hang up.\n"); at_state->cid = -1; - if (bcs && cs->onechannel) + if (!bcs) + disconnect_nobc(p_at_state, cs); + else if (cs->onechannel) at_state->pending_commands |= PC_DLE0; else - disconnect(p_at_state); + disconnect_bc(at_state, cs, bcs); schedule_init(cs, MS_RECOVER); break; case ACT_FAILDLE0: cs->cur_at_seq = SEQ_NONE; dev_warn(cs->dev, "Error leaving DLE mode.\n"); cs->dle = 0; - at_state2 = &cs->bcs[cs->curchannel].at_state; - disconnect(&at_state2); + bcs2 = cs->bcs + cs->curchannel; + disconnect_bc(&bcs2->at_state, cs, bcs2); schedule_init(cs, MS_RECOVER); break; case ACT_FAILDLE1: @@ -1275,14 +1301,14 @@ static void do_action(int action, struct cardstate *cs, if (reinit_and_retry(cs, channel) < 0) { dev_warn(cs->dev, "Could not get a call ID. Cannot dial.\n"); - at_state2 = &cs->bcs[channel].at_state; - disconnect(&at_state2); + bcs2 = cs->bcs + channel; + disconnect_bc(&bcs2->at_state, cs, bcs2); } break; case ACT_ABORTCID: cs->cur_at_seq = SEQ_NONE; - at_state2 = &cs->bcs[cs->curchannel].at_state; - disconnect(&at_state2); + bcs2 = cs->bcs + cs->curchannel; + disconnect_bc(&bcs2->at_state, cs, bcs2); break; case ACT_DIALING: @@ -1291,7 +1317,10 @@ static void do_action(int action, struct cardstate *cs, break; case ACT_ABORTACCEPT: /* hangup/error/timeout during ICALL procssng */ - disconnect(p_at_state); + if (bcs) + disconnect_bc(at_state, cs, bcs); + else + disconnect_nobc(p_at_state, cs); break; case ACT_ABORTDIAL: /* error/timeout during dial preparation */ -- cgit v0.10.2 From 9ea8aa8d5087529210553114b7bc4bf4374ace8f Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Sat, 11 Oct 2014 13:46:30 +0200 Subject: isdn/capi: correct capi20_manufacturer argument type mismatch Function capi20_manufacturer() is declared with unsigned int cmd argument but called with unsigned long. Fix by correcting the function prototype since the actual argument is part of the user visible API. Spotted with Coverity. Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller diff --git a/drivers/isdn/capi/kcapi.c b/drivers/isdn/capi/kcapi.c index c123709..823f698 100644 --- a/drivers/isdn/capi/kcapi.c +++ b/drivers/isdn/capi/kcapi.c @@ -1184,7 +1184,7 @@ static int old_capi_manufacturer(unsigned int cmd, void __user *data) * Return value: CAPI result code */ -int capi20_manufacturer(unsigned int cmd, void __user *data) +int capi20_manufacturer(unsigned long cmd, void __user *data) { struct capi_ctr *ctr; int retval; @@ -1259,7 +1259,7 @@ int capi20_manufacturer(unsigned int cmd, void __user *data) } default: - printk(KERN_ERR "kcapi: manufacturer command %d unknown.\n", + printk(KERN_ERR "kcapi: manufacturer command %lu unknown.\n", cmd); break; diff --git a/include/linux/kernelcapi.h b/include/linux/kernelcapi.h index 9be37da..e985ba6 100644 --- a/include/linux/kernelcapi.h +++ b/include/linux/kernelcapi.h @@ -41,7 +41,7 @@ u16 capi20_get_manufacturer(u32 contr, u8 buf[CAPI_MANUFACTURER_LEN]); u16 capi20_get_version(u32 contr, struct capi_version *verp); u16 capi20_get_serial(u32 contr, u8 serial[CAPI_SERIAL_LEN]); u16 capi20_get_profile(u32 contr, struct capi_profile *profp); -int capi20_manufacturer(unsigned int cmd, void __user *data); +int capi20_manufacturer(unsigned long cmd, void __user *data); #define CAPICTR_UP 0 #define CAPICTR_DOWN 1 -- cgit v0.10.2 From 5362247a42e18ef74e698bb23575c272f8e35375 Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Sat, 11 Oct 2014 13:46:30 +0200 Subject: isdn/capi: prevent index overrun from command_2_index() The result of the function command_2_index() is used to index two arrays mnames[] and cpars[] with max. index 0x4e but in its current form that function can produce results up to 3*(0x9+0x9)+0x7f = 0xb5. Fix by clamping all result values potentially overrunning the arrays to zero which is already handled as an invalid value. Re-spotted with Coverity. Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller diff --git a/drivers/isdn/capi/capiutil.c b/drivers/isdn/capi/capiutil.c index 4073d16..b501d76 100644 --- a/drivers/isdn/capi/capiutil.c +++ b/drivers/isdn/capi/capiutil.c @@ -207,6 +207,8 @@ static unsigned command_2_index(unsigned c, unsigned sc) c = 0x9 + (c & 0x0f); else if (c == 0x41) c = 0x9 + 0x1; + if (c > 0x18) + c = 0x00; return (sc & 3) * (0x9 + 0x9) + c; } -- cgit v0.10.2 From 854d23b77aa25b203c7af11de885c3b8b3834c20 Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Sat, 11 Oct 2014 13:46:30 +0200 Subject: isdn/capi: refactor command/subcommand table accesses Encapsulate accesses to the CAPI 2.0 command/subcommand name and parameter tables in a single place in preparation for redesign. Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller diff --git a/drivers/isdn/capi/capiutil.c b/drivers/isdn/capi/capiutil.c index b501d76..8e401ed 100644 --- a/drivers/isdn/capi/capiutil.c +++ b/drivers/isdn/capi/capiutil.c @@ -212,6 +212,19 @@ static unsigned command_2_index(unsigned c, unsigned sc) return (sc & 3) * (0x9 + 0x9) + c; } +/** + * capi_cmd2par() - find parameter string for CAPI 2.0 command/subcommand + * @cmd: command number + * @subcmd: subcommand number + * + * Return value: static string, NULL if command/subcommand unknown + */ + +static unsigned char *capi_cmd2par(u8 cmd, u8 subcmd) +{ + return cpars[command_2_index(cmd, subcmd)]; +} + /*-------------------------------------------------------*/ #define TYP (cdef[cmsg->par[cmsg->p]].typ) #define OFF (((u8 *)cmsg) + cdef[cmsg->par[cmsg->p]].off) @@ -304,7 +317,7 @@ unsigned capi_cmsg2message(_cmsg *cmsg, u8 *msg) cmsg->m = msg; cmsg->l = 8; cmsg->p = 0; - cmsg->par = cpars[command_2_index(cmsg->Command, cmsg->Subcommand)]; + cmsg->par = capi_cmd2par(cmsg->Command, cmsg->Subcommand); pars_2_message(cmsg); @@ -377,7 +390,7 @@ unsigned capi_message2cmsg(_cmsg *cmsg, u8 *msg) cmsg->p = 0; byteTRcpy(cmsg->m + 4, &cmsg->Command); byteTRcpy(cmsg->m + 5, &cmsg->Subcommand); - cmsg->par = cpars[command_2_index(cmsg->Command, cmsg->Subcommand)]; + cmsg->par = capi_cmd2par(cmsg->Command, cmsg->Subcommand); message_2_pars(cmsg); @@ -761,10 +774,10 @@ _cdebbuf *capi_message2str(u8 *msg) cmsg->p = 0; byteTRcpy(cmsg->m + 4, &cmsg->Command); byteTRcpy(cmsg->m + 5, &cmsg->Subcommand); - cmsg->par = cpars[command_2_index(cmsg->Command, cmsg->Subcommand)]; + cmsg->par = capi_cmd2par(cmsg->Command, cmsg->Subcommand); cdb = bufprint(cdb, "%-26s ID=%03d #0x%04x LEN=%04d\n", - mnames[command_2_index(cmsg->Command, cmsg->Subcommand)], + capi_cmd2str(cmsg->Command, cmsg->Subcommand), ((unsigned short *) msg)[1], ((unsigned short *) msg)[3], ((unsigned short *) msg)[0]); @@ -798,7 +811,7 @@ _cdebbuf *capi_cmsg2str(_cmsg *cmsg) cmsg->l = 8; cmsg->p = 0; cdb = bufprint(cdb, "%s ID=%03d #0x%04x LEN=%04d\n", - mnames[command_2_index(cmsg->Command, cmsg->Subcommand)], + capi_cmd2str(cmsg->Command, cmsg->Subcommand), ((u16 *) cmsg->m)[1], ((u16 *) cmsg->m)[3], ((u16 *) cmsg->m)[0]); -- cgit v0.10.2 From 5510ab18048397193ae073d6b0d4ea78ff0170f5 Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Sat, 11 Oct 2014 13:46:30 +0200 Subject: isdn/capi: prevent NULL pointer dereference on invalid CAPI command An invalid CAPI 2.0 command/subcommand combination may retrieve a NULL pointer from the cpars[] array which will later be dereferenced by the parser routines. Fix by adding NULL pointer checks in strategic places. Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller diff --git a/drivers/isdn/capi/capiutil.c b/drivers/isdn/capi/capiutil.c index 8e401ed..36835ef 100644 --- a/drivers/isdn/capi/capiutil.c +++ b/drivers/isdn/capi/capiutil.c @@ -318,6 +318,8 @@ unsigned capi_cmsg2message(_cmsg *cmsg, u8 *msg) cmsg->l = 8; cmsg->p = 0; cmsg->par = capi_cmd2par(cmsg->Command, cmsg->Subcommand); + if (!cmsg->par) + return 1; /* invalid command/subcommand */ pars_2_message(cmsg); @@ -391,6 +393,8 @@ unsigned capi_message2cmsg(_cmsg *cmsg, u8 *msg) byteTRcpy(cmsg->m + 4, &cmsg->Command); byteTRcpy(cmsg->m + 5, &cmsg->Subcommand); cmsg->par = capi_cmd2par(cmsg->Command, cmsg->Subcommand); + if (!cmsg->par) + return 1; /* invalid command/subcommand */ message_2_pars(cmsg); @@ -640,6 +644,9 @@ static _cdebbuf *printstruct(_cdebbuf *cdb, u8 *m) static _cdebbuf *protocol_message_2_pars(_cdebbuf *cdb, _cmsg *cmsg, int level) { + if (!cmsg->par) + return NULL; /* invalid command/subcommand */ + for (; TYP != _CEND; cmsg->p++) { int slen = 29 + 3 - level; int i; -- cgit v0.10.2 From 2bf3a09ea51f807d78d48d0ebc591b9e1502a743 Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Sat, 11 Oct 2014 13:46:30 +0200 Subject: isdn/capi: handle CAPI 2.0 message parser failures Have callers of capi_cmsg2message and capi_message2cmsg handle non-zero return values indicating failure. Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller diff --git a/drivers/isdn/capi/capidrv.c b/drivers/isdn/capi/capidrv.c index fd6d28f..1cc6ca8 100644 --- a/drivers/isdn/capi/capidrv.c +++ b/drivers/isdn/capi/capidrv.c @@ -506,7 +506,10 @@ static void send_message(capidrv_contr *card, _cmsg *cmsg) struct sk_buff *skb; size_t len; - capi_cmsg2message(cmsg, cmsg->buf); + if (capi_cmsg2message(cmsg, cmsg->buf)) { + printk(KERN_ERR "capidrv::send_message: parser failure\n"); + return; + } len = CAPIMSG_LEN(cmsg->buf); skb = alloc_skb(len, GFP_ATOMIC); if (!skb) { @@ -1578,7 +1581,12 @@ static _cmsg s_cmsg; static void capidrv_recv_message(struct capi20_appl *ap, struct sk_buff *skb) { - capi_message2cmsg(&s_cmsg, skb->data); + if (capi_message2cmsg(&s_cmsg, skb->data)) { + printk(KERN_ERR "capidrv: applid=%d: received invalid message\n", + ap->applid); + kfree_skb(skb); + return; + } if (debugmode > 3) { _cdebbuf *cdb = capi_cmsg2str(&s_cmsg); @@ -1903,7 +1911,11 @@ static int capidrv_command(isdn_ctrl *c, capidrv_contr *card) NULL, /* Useruserdata */ NULL /* Facilitydataarray */ ); - capi_cmsg2message(&cmdcmsg, cmdcmsg.buf); + if (capi_cmsg2message(&cmdcmsg, cmdcmsg.buf)) { + printk(KERN_ERR "capidrv-%d: capidrv_command: parser failure\n", + card->contrnr); + return -EINVAL; + } plci_change_state(card, bchan->plcip, EV_PLCI_CONNECT_RESP); send_message(card, &cmdcmsg); return 0; @@ -2090,7 +2102,11 @@ static int if_sendbuf(int id, int channel, int doack, struct sk_buff *skb) if (capidrv_add_ack(nccip, datahandle, doack ? (int)skb->len : -1) < 0) return 0; - capi_cmsg2message(&sendcmsg, sendcmsg.buf); + if (capi_cmsg2message(&sendcmsg, sendcmsg.buf)) { + printk(KERN_ERR "capidrv-%d: if_sendbuf: parser failure\n", + card->contrnr); + return -EINVAL; + } msglen = CAPIMSG_LEN(sendcmsg.buf); if (skb_headroom(skb) < msglen) { struct sk_buff *nskb = skb_realloc_headroom(skb, msglen); diff --git a/drivers/isdn/gigaset/capi.c b/drivers/isdn/gigaset/capi.c index 47e2a91..ccec777 100644 --- a/drivers/isdn/gigaset/capi.c +++ b/drivers/isdn/gigaset/capi.c @@ -647,7 +647,13 @@ int gigaset_isdn_icall(struct at_state_t *at_state) __func__); break; } - capi_cmsg2message(&iif->hcmsg, __skb_put(skb, msgsize)); + if (capi_cmsg2message(&iif->hcmsg, + __skb_put(skb, msgsize))) { + dev_err(cs->dev, "%s: message parser failure\n", + __func__); + dev_kfree_skb_any(skb); + break; + } dump_cmsg(DEBUG_CMD, __func__, &iif->hcmsg); /* add to listeners on this B channel, update state */ @@ -693,7 +699,12 @@ static void send_disconnect_ind(struct bc_state *bcs, dev_err(cs->dev, "%s: out of memory\n", __func__); return; } - capi_cmsg2message(&iif->hcmsg, __skb_put(skb, CAPI_DISCONNECT_IND_LEN)); + if (capi_cmsg2message(&iif->hcmsg, + __skb_put(skb, CAPI_DISCONNECT_IND_LEN))) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, &iif->hcmsg); capi_ctr_handle_message(&iif->ctr, ap->id, skb); } @@ -723,8 +734,12 @@ static void send_disconnect_b3_ind(struct bc_state *bcs, dev_err(cs->dev, "%s: out of memory\n", __func__); return; } - capi_cmsg2message(&iif->hcmsg, - __skb_put(skb, CAPI_DISCONNECT_B3_IND_BASELEN)); + if (capi_cmsg2message(&iif->hcmsg, + __skb_put(skb, CAPI_DISCONNECT_B3_IND_BASELEN))) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, &iif->hcmsg); capi_ctr_handle_message(&iif->ctr, ap->id, skb); } @@ -789,7 +804,11 @@ void gigaset_isdn_connD(struct bc_state *bcs) dev_err(cs->dev, "%s: out of memory\n", __func__); return; } - capi_cmsg2message(&iif->hcmsg, __skb_put(skb, msgsize)); + if (capi_cmsg2message(&iif->hcmsg, __skb_put(skb, msgsize))) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, &iif->hcmsg); capi_ctr_handle_message(&iif->ctr, ap->id, skb); } @@ -889,7 +908,11 @@ void gigaset_isdn_connB(struct bc_state *bcs) dev_err(cs->dev, "%s: out of memory\n", __func__); return; } - capi_cmsg2message(&iif->hcmsg, __skb_put(skb, msgsize)); + if (capi_cmsg2message(&iif->hcmsg, __skb_put(skb, msgsize))) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, &iif->hcmsg); capi_ctr_handle_message(&iif->ctr, ap->id, skb); } @@ -1096,13 +1119,19 @@ static void send_conf(struct gigaset_capi_ctr *iif, struct sk_buff *skb, u16 info) { + struct cardstate *cs = iif->ctr.driverdata; + /* * _CONF replies always only have NCCI and Info parameters * so they'll fit into the _REQ message skb */ capi_cmsg_answer(&iif->acmsg); iif->acmsg.Info = info; - capi_cmsg2message(&iif->acmsg, skb->data); + if (capi_cmsg2message(&iif->acmsg, skb->data)) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } __skb_trim(skb, CAPI_STDCONF_LEN); dump_cmsg(DEBUG_CMD, __func__, &iif->acmsg); capi_ctr_handle_message(&iif->ctr, ap->id, skb); @@ -1124,7 +1153,11 @@ static void do_facility_req(struct gigaset_capi_ctr *iif, static u8 confparam[10]; /* max. 9 octets + length byte */ /* decode message */ - capi_message2cmsg(cmsg, skb->data); + if (capi_message2cmsg(cmsg, skb->data)) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, cmsg); /* @@ -1223,6 +1256,7 @@ static void do_facility_req(struct gigaset_capi_ctr *iif, } /* send FACILITY_CONF with given Info and confirmation parameter */ + dev_kfree_skb_any(skb); capi_cmsg_answer(cmsg); cmsg->Info = info; cmsg->FacilityConfirmationParameter = confparam; @@ -1232,7 +1266,11 @@ static void do_facility_req(struct gigaset_capi_ctr *iif, dev_err(cs->dev, "%s: out of memory\n", __func__); return; } - capi_cmsg2message(cmsg, __skb_put(cskb, msgsize)); + if (capi_cmsg2message(cmsg, __skb_put(cskb, msgsize))) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(cskb); + return; + } dump_cmsg(DEBUG_CMD, __func__, cmsg); capi_ctr_handle_message(&iif->ctr, ap->id, cskb); } @@ -1246,8 +1284,14 @@ static void do_listen_req(struct gigaset_capi_ctr *iif, struct gigaset_capi_appl *ap, struct sk_buff *skb) { + struct cardstate *cs = iif->ctr.driverdata; + /* decode message */ - capi_message2cmsg(&iif->acmsg, skb->data); + if (capi_message2cmsg(&iif->acmsg, skb->data)) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, &iif->acmsg); /* store listening parameters */ @@ -1264,8 +1308,14 @@ static void do_alert_req(struct gigaset_capi_ctr *iif, struct gigaset_capi_appl *ap, struct sk_buff *skb) { + struct cardstate *cs = iif->ctr.driverdata; + /* decode message */ - capi_message2cmsg(&iif->acmsg, skb->data); + if (capi_message2cmsg(&iif->acmsg, skb->data)) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, &iif->acmsg); send_conf(iif, ap, skb, CapiAlertAlreadySent); } @@ -1290,7 +1340,11 @@ static void do_connect_req(struct gigaset_capi_ctr *iif, u16 info; /* decode message */ - capi_message2cmsg(cmsg, skb->data); + if (capi_message2cmsg(cmsg, skb->data)) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, cmsg); /* get free B channel & construct PLCI */ @@ -1577,7 +1631,11 @@ static void do_connect_resp(struct gigaset_capi_ctr *iif, int channel; /* decode message */ - capi_message2cmsg(cmsg, skb->data); + if (capi_message2cmsg(cmsg, skb->data)) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, cmsg); dev_kfree_skb_any(skb); @@ -1743,7 +1801,11 @@ static void do_connect_b3_req(struct gigaset_capi_ctr *iif, int channel; /* decode message */ - capi_message2cmsg(cmsg, skb->data); + if (capi_message2cmsg(cmsg, skb->data)) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, cmsg); /* extract and check channel number from PLCI */ @@ -1788,7 +1850,11 @@ static void do_connect_b3_resp(struct gigaset_capi_ctr *iif, u8 command; /* decode message */ - capi_message2cmsg(cmsg, skb->data); + if (capi_message2cmsg(cmsg, skb->data)) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, cmsg); /* extract and check channel number and NCCI */ @@ -1828,7 +1894,11 @@ static void do_connect_b3_resp(struct gigaset_capi_ctr *iif, capi_cmsg_header(cmsg, ap->id, command, CAPI_IND, ap->nextMessageNumber++, cmsg->adr.adrNCCI); __skb_trim(skb, msgsize); - capi_cmsg2message(cmsg, skb->data); + if (capi_cmsg2message(cmsg, skb->data)) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, cmsg); capi_ctr_handle_message(&iif->ctr, ap->id, skb); } @@ -1850,7 +1920,11 @@ static void do_disconnect_req(struct gigaset_capi_ctr *iif, int channel; /* decode message */ - capi_message2cmsg(cmsg, skb->data); + if (capi_message2cmsg(cmsg, skb->data)) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, cmsg); /* extract and check channel number from PLCI */ @@ -1906,8 +1980,14 @@ static void do_disconnect_req(struct gigaset_capi_ctr *iif, kfree(b3cmsg); return; } - capi_cmsg2message(b3cmsg, - __skb_put(b3skb, CAPI_DISCONNECT_B3_IND_BASELEN)); + if (capi_cmsg2message(b3cmsg, + __skb_put(b3skb, CAPI_DISCONNECT_B3_IND_BASELEN))) { + dev_err(cs->dev, "%s: message parser failure\n", + __func__); + kfree(b3cmsg); + dev_kfree_skb_any(b3skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, b3cmsg); kfree(b3cmsg); capi_ctr_handle_message(&iif->ctr, ap->id, b3skb); @@ -1938,7 +2018,11 @@ static void do_disconnect_b3_req(struct gigaset_capi_ctr *iif, int channel; /* decode message */ - capi_message2cmsg(cmsg, skb->data); + if (capi_message2cmsg(cmsg, skb->data)) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, cmsg); /* extract and check channel number and NCCI */ @@ -2055,8 +2139,14 @@ static void do_reset_b3_req(struct gigaset_capi_ctr *iif, struct gigaset_capi_appl *ap, struct sk_buff *skb) { + struct cardstate *cs = iif->ctr.driverdata; + /* decode message */ - capi_message2cmsg(&iif->acmsg, skb->data); + if (capi_message2cmsg(&iif->acmsg, skb->data)) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, &iif->acmsg); send_conf(iif, ap, skb, CapiResetProcedureNotSupportedByCurrentProtocol); @@ -2069,8 +2159,14 @@ static void do_unsupported(struct gigaset_capi_ctr *iif, struct gigaset_capi_appl *ap, struct sk_buff *skb) { + struct cardstate *cs = iif->ctr.driverdata; + /* decode message */ - capi_message2cmsg(&iif->acmsg, skb->data); + if (capi_message2cmsg(&iif->acmsg, skb->data)) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, &iif->acmsg); send_conf(iif, ap, skb, CapiMessageNotSupportedInCurrentState); } @@ -2082,8 +2178,14 @@ static void do_nothing(struct gigaset_capi_ctr *iif, struct gigaset_capi_appl *ap, struct sk_buff *skb) { + struct cardstate *cs = iif->ctr.driverdata; + /* decode message */ - capi_message2cmsg(&iif->acmsg, skb->data); + if (capi_message2cmsg(&iif->acmsg, skb->data)) { + dev_err(cs->dev, "%s: message parser failure\n", __func__); + dev_kfree_skb_any(skb); + return; + } dump_cmsg(DEBUG_CMD, __func__, &iif->acmsg); dev_kfree_skb_any(skb); } -- cgit v0.10.2 From 340184b35ac8786bdb574d2c8ce8e4f1269ec4da Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Sat, 11 Oct 2014 13:46:30 +0200 Subject: isdn/capi: don't return NULL from capi_cmd2str() capi_cmd2str() is used in many places to build log messages. None of them is prepared to handle NULL as a result. Change the function to return printable string "INVALID_COMMAND" instead. Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller diff --git a/drivers/isdn/capi/capiutil.c b/drivers/isdn/capi/capiutil.c index 36835ef..36c1b37 100644 --- a/drivers/isdn/capi/capiutil.c +++ b/drivers/isdn/capi/capiutil.c @@ -489,12 +489,17 @@ static char *mnames[] = * @cmd: command number * @subcmd: subcommand number * - * Return value: static string, NULL if command/subcommand unknown + * Return value: static string */ char *capi_cmd2str(u8 cmd, u8 subcmd) { - return mnames[command_2_index(cmd, subcmd)]; + char *result; + + result = mnames[command_2_index(cmd, subcmd)]; + if (result == NULL) + result = "INVALID_COMMAND"; + return result; } -- cgit v0.10.2 From 86f8ef2c4802ac9dbe0c8c1c12670bd915a13013 Mon Sep 17 00:00:00 2001 From: Tilman Schmidt Date: Sat, 11 Oct 2014 13:46:30 +0200 Subject: isdn/gigaset: fix usb_gigaset write_cmd result race In usb_gigaset function gigaset_write_cmd(), the length field of the command buffer structure could be cleared by the transmit tasklet before it was used for the function's return value. Fix by copying to a local variable before scheduling the tasklet. Signed-off-by: Tilman Schmidt Signed-off-by: David S. Miller diff --git a/drivers/isdn/gigaset/usb-gigaset.c b/drivers/isdn/gigaset/usb-gigaset.c index 82e91ba..a8e652d 100644 --- a/drivers/isdn/gigaset/usb-gigaset.c +++ b/drivers/isdn/gigaset/usb-gigaset.c @@ -497,6 +497,7 @@ static int send_cb(struct cardstate *cs, struct cmdbuf_t *cb) static int gigaset_write_cmd(struct cardstate *cs, struct cmdbuf_t *cb) { unsigned long flags; + int len; gigaset_dbg_buffer(cs->mstate != MS_LOCKED ? DEBUG_TRANSCMD : DEBUG_LOCKCMD, @@ -515,10 +516,11 @@ static int gigaset_write_cmd(struct cardstate *cs, struct cmdbuf_t *cb) spin_unlock_irqrestore(&cs->cmdlock, flags); spin_lock_irqsave(&cs->lock, flags); + len = cb->len; if (cs->connected) tasklet_schedule(&cs->write_tasklet); spin_unlock_irqrestore(&cs->lock, flags); - return cb->len; + return len; } static int gigaset_write_room(struct cardstate *cs) -- cgit v0.10.2 From ad971f616aa98ea2503f1a1064637bfb4ef7b21e Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Sat, 11 Oct 2014 15:17:29 -0700 Subject: tcp: fix tcp_ack() performance problem We worked hard to improve tcp_ack() performance, by not accessing skb_shinfo() in fast path (cd7d8498c9a5 tcp: change tcp_skb_pcount() location) We still have one spurious access because of ACK timestamping, added in commit e1c8a607b281 ("net-timestamp: ACK timestamp for bytestreams") By checking if sk_tsflags has SOF_TIMESTAMPING_TX_ACK set, we can avoid two cache line misses for the common case. While we are at it, add two prefetchw() : One in tcp_ack() to bring skb at the head of write queue. One in tcp_clean_rtx_queue() loop to bring following skb, as we will delete skb from the write queue and dirty skb->next->prev. Add a couple of [un]likely() clauses. After this patch, tcp_ack() is no longer the most consuming function in tcp stack. Signed-off-by: Eric Dumazet Cc: Willem de Bruijn Cc: Neal Cardwell Cc: Yuchung Cheng Cc: Van Jacobson Signed-off-by: David S. Miller diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 00a4149..a12b455 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -68,6 +68,7 @@ #include #include #include +#include #include #include #include @@ -3029,6 +3030,21 @@ static u32 tcp_tso_acked(struct sock *sk, struct sk_buff *skb) return packets_acked; } +static void tcp_ack_tstamp(struct sock *sk, struct sk_buff *skb, + u32 prior_snd_una) +{ + const struct skb_shared_info *shinfo; + + /* Avoid cache line misses to get skb_shinfo() and shinfo->tx_flags */ + if (likely(!(sk->sk_tsflags & SOF_TIMESTAMPING_TX_ACK))) + return; + + shinfo = skb_shinfo(skb); + if ((shinfo->tx_flags & SKBTX_ACK_TSTAMP) && + between(shinfo->tskey, prior_snd_una, tcp_sk(sk)->snd_una - 1)) + __skb_tstamp_tx(skb, NULL, sk, SCM_TSTAMP_ACK); +} + /* Remove acknowledged frames from the retransmission queue. If our packet * is before the ack sequence we can discard it as it's confirmed to have * arrived at the other end. @@ -3052,14 +3068,11 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets, first_ackt.v64 = 0; while ((skb = tcp_write_queue_head(sk)) && skb != tcp_send_head(sk)) { - struct skb_shared_info *shinfo = skb_shinfo(skb); struct tcp_skb_cb *scb = TCP_SKB_CB(skb); u8 sacked = scb->sacked; u32 acked_pcount; - if (unlikely(shinfo->tx_flags & SKBTX_ACK_TSTAMP) && - between(shinfo->tskey, prior_snd_una, tp->snd_una - 1)) - __skb_tstamp_tx(skb, NULL, sk, SCM_TSTAMP_ACK); + tcp_ack_tstamp(sk, skb, prior_snd_una); /* Determine how many packets and what bytes were acked, tso and else */ if (after(scb->end_seq, tp->snd_una)) { @@ -3073,10 +3086,12 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets, fully_acked = false; } else { + /* Speedup tcp_unlink_write_queue() and next loop */ + prefetchw(skb->next); acked_pcount = tcp_skb_pcount(skb); } - if (sacked & TCPCB_RETRANS) { + if (unlikely(sacked & TCPCB_RETRANS)) { if (sacked & TCPCB_SACKED_RETRANS) tp->retrans_out -= acked_pcount; flag |= FLAG_RETRANS_DATA_ACKED; @@ -3107,7 +3122,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets, * connection startup slow start one packet too * quickly. This is severely frowned upon behavior. */ - if (!(scb->tcp_flags & TCPHDR_SYN)) { + if (likely(!(scb->tcp_flags & TCPHDR_SYN))) { flag |= FLAG_DATA_ACKED; } else { flag |= FLAG_SYN_ACKED; @@ -3119,9 +3134,9 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets, tcp_unlink_write_queue(skb, sk); sk_wmem_free_skb(sk, skb); - if (skb == tp->retransmit_skb_hint) + if (unlikely(skb == tp->retransmit_skb_hint)) tp->retransmit_skb_hint = NULL; - if (skb == tp->lost_skb_hint) + if (unlikely(skb == tp->lost_skb_hint)) tp->lost_skb_hint = NULL; } @@ -3132,7 +3147,7 @@ static int tcp_clean_rtx_queue(struct sock *sk, int prior_fackets, flag |= FLAG_SACK_RENEGING; skb_mstamp_get(&now); - if (first_ackt.v64) { + if (likely(first_ackt.v64)) { seq_rtt_us = skb_mstamp_us_delta(&now, &first_ackt); ca_seq_rtt_us = skb_mstamp_us_delta(&now, &last_ackt); } @@ -3394,6 +3409,9 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag) int acked = 0; /* Number of packets newly acked */ long sack_rtt_us = -1L; + /* We very likely will need to access write queue head. */ + prefetchw(sk->sk_write_queue.next); + /* If the ack is older than previous acks * then we can probably ignore it. */ -- cgit v0.10.2 From f76936d07c4eeb36d8dbb64ebd30ab46ff85d9f7 Mon Sep 17 00:00:00 2001 From: Jiri Pirko Date: Mon, 13 Oct 2014 16:34:10 +0200 Subject: ipv4: fix nexthop attlen check in fib_nh_match fib_nh_match does not match nexthops correctly. Example: ip route add 172.16.10/24 nexthop via 192.168.122.12 dev eth0 \ nexthop via 192.168.122.13 dev eth0 ip route del 172.16.10/24 nexthop via 192.168.122.14 dev eth0 \ nexthop via 192.168.122.15 dev eth0 Del command is successful and route is removed. After this patch applied, the route is correctly matched and result is: RTNETLINK answers: No such process Please consider this for stable trees as well. Fixes: 4e902c57417c4 ("[IPv4]: FIB configuration using struct fib_config") Signed-off-by: Jiri Pirko Acked-by: Eric Dumazet Signed-off-by: David S. Miller diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index 5b6efb3..f99f41b 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -537,7 +537,7 @@ int fib_nh_match(struct fib_config *cfg, struct fib_info *fi) return 1; attrlen = rtnh_attrlen(rtnh); - if (attrlen < 0) { + if (attrlen > 0) { struct nlattr *nla, *attrs = rtnh_attrs(rtnh); nla = nla_find(attrs, attrlen, RTA_GATEWAY); -- cgit v0.10.2 From 2c7c9ea429ba30fe506747b7da110e2212d8fefa Mon Sep 17 00:00:00 2001 From: Prashant Sreedharan Date: Mon, 13 Oct 2014 09:21:42 -0700 Subject: tg3: Add skb->xmit_more support Ring TX doorbell only if xmit_more is not set or the queue is stopped. Suggested-by: Daniel Borkmann Signed-off-by: Prashant Sreedharan Signed-off-by: Michael Chan Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c index ba49948..dbb41c1 100644 --- a/drivers/net/ethernet/broadcom/tg3.c +++ b/drivers/net/ethernet/broadcom/tg3.c @@ -8099,9 +8099,6 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev) /* Sync BD data before updating mailbox */ wmb(); - /* Packets are ready, update Tx producer idx local and on card. */ - tw32_tx_mbox(tnapi->prodmbox, entry); - tnapi->tx_prod = entry; if (unlikely(tg3_tx_avail(tnapi) <= (MAX_SKB_FRAGS + 1))) { netif_tx_stop_queue(txq); @@ -8116,7 +8113,12 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev) netif_tx_wake_queue(txq); } - mmiowb(); + if (!skb->xmit_more || netif_xmit_stopped(txq)) { + /* Packets are ready, update Tx producer idx on card. */ + tw32_tx_mbox(tnapi->prodmbox, entry); + mmiowb(); + } + return NETDEV_TX_OK; dma_error: -- cgit v0.10.2 From ff9538b1fce3a3af66578c072259dba7f7b4fe7a Mon Sep 17 00:00:00 2001 From: Mugunthan V N Date: Mon, 13 Oct 2014 22:21:05 +0530 Subject: drivers: net: davinci_cpdma: remove kfree on objects allocated with devm_* apis memories allocated with devm_* apis must not be freed with kfree apis, so removing the kfree calls Fixes: e194312854ed ('drivers: net: davinci_cpdma: Convert kzalloc() to devm_kzalloc().') Signed-off-by: Mugunthan V N Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c b/drivers/net/ethernet/ti/davinci_cpdma.c index 4a000f6..32dc289 100644 --- a/drivers/net/ethernet/ti/davinci_cpdma.c +++ b/drivers/net/ethernet/ti/davinci_cpdma.c @@ -561,7 +561,6 @@ int cpdma_chan_destroy(struct cpdma_chan *chan) cpdma_chan_stop(chan); ctlr->channels[chan->chan_num] = NULL; spin_unlock_irqrestore(&ctlr->lock, flags); - kfree(chan); return 0; } EXPORT_SYMBOL_GPL(cpdma_chan_destroy); -- cgit v0.10.2 From fc7a99fb71b83f811e2c013ab55e507048153f23 Mon Sep 17 00:00:00 2001 From: Mugunthan V N Date: Mon, 13 Oct 2014 22:21:06 +0530 Subject: drivers: net: davinci_cpdma: remove spinlock as SOFTIRQ-unsafe lock order detected remove spinlock in cpdma_desc_pool_destroy() as there is no active cpdma channel and iounmap should be called without auquiring lock. root@dra7xx-evm:~# modprobe -r ti_cpsw [ 50.539743] [ 50.541312] ====================================================== [ 50.547796] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ] [ 50.554826] 3.14.19-02124-g95c5b7b #308 Not tainted [ 50.559939] ------------------------------------------------------ [ 50.566416] modprobe/1921 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: [ 50.573347] (vmap_area_lock){+.+...}, at: [] find_vmap_area+0x10/0x6c [ 50.581132] [ 50.581132] and this task is already holding: [ 50.587249] (&(&pool->lock)->rlock#2){..-...}, at: [] cpdma_ctlr_destroy+0x5c/0x114 [davinci_cpdma] [ 50.597766] which would create a new lock dependency: [ 50.603048] (&(&pool->lock)->rlock#2){..-...} -> (vmap_area_lock){+.+...} [ 50.610296] [ 50.610296] but this new dependency connects a SOFTIRQ-irq-safe lock: [ 50.618601] (&(&pool->lock)->rlock#2){..-...} ... which became SOFTIRQ-irq-safe at: [ 50.626829] [] _raw_spin_lock_irqsave+0x38/0x4c [ 50.632677] [] cpdma_desc_free.constprop.7+0x28/0x58 [davinci_cpdma] [ 50.640437] [] __cpdma_chan_free+0x7c/0xa8 [davinci_cpdma] [ 50.647289] [] __cpdma_chan_process+0xf4/0x134 [davinci_cpdma] [ 50.654512] [] cpdma_chan_process+0x3c/0x54 [davinci_cpdma] [ 50.661455] [] cpsw_poll+0x14/0xa8 [ti_cpsw] [ 50.667038] [] net_rx_action+0xc0/0x1e8 [ 50.672150] [] __do_softirq+0xcc/0x304 [ 50.677183] [] irq_exit+0xa8/0xfc [ 50.681751] [] handle_IRQ+0x50/0xb0 [ 50.686513] [] gic_handle_irq+0x28/0x5c [ 50.691628] [] __irq_svc+0x44/0x5c [ 50.696289] [] _raw_spin_unlock_irqrestore+0x34/0x44 [ 50.702591] [] do_page_fault.part.9+0x144/0x3c4 [ 50.708433] [] do_page_fault+0x74/0x84 [ 50.713453] [] do_DataAbort+0x34/0x98 [ 50.718391] [] __dabt_usr+0x3c/0x40 [ 50.723148] [ 50.723148] to a SOFTIRQ-irq-unsafe lock: [ 50.728893] (vmap_area_lock){+.+...} ... which became SOFTIRQ-irq-unsafe at: [ 50.736476] ... [] _raw_spin_lock+0x28/0x38 [ 50.741876] [] alloc_vmap_area.isra.28+0xb8/0x300 [ 50.747908] [] __get_vm_area_node.isra.29+0x90/0x134 [ 50.754210] [] get_vm_area_caller+0x3c/0x48 [ 50.759692] [] vmap+0x40/0x78 [ 50.763900] [] check_writebuffer_bugs+0x54/0x1a0 [ 50.769835] [] start_kernel+0x320/0x388 [ 50.774952] [<80008074>] 0x80008074 [ 50.778793] [ 50.778793] other info that might help us debug this: [ 50.778793] [ 50.787181] Possible interrupt unsafe locking scenario: [ 50.787181] [ 50.794295] CPU0 CPU1 [ 50.799042] ---- ---- [ 50.803785] lock(vmap_area_lock); [ 50.807446] local_irq_disable(); [ 50.813652] lock(&(&pool->lock)->rlock#2); [ 50.820782] lock(vmap_area_lock); [ 50.827086] [ 50.829823] lock(&(&pool->lock)->rlock#2); [ 50.834490] [ 50.834490] *** DEADLOCK *** [ 50.834490] [ 50.840695] 4 locks held by modprobe/1921: [ 50.844981] #0: (&__lockdep_no_validate__){......}, at: [] driver_detach+0x44/0xb8 [ 50.854038] #1: (&__lockdep_no_validate__){......}, at: [] driver_detach+0x50/0xb8 [ 50.863102] #2: (&(&ctlr->lock)->rlock){......}, at: [] cpdma_ctlr_destroy+0x1c/0x114 [davinci_cpdma] [ 50.873890] #3: (&(&pool->lock)->rlock#2){..-...}, at: [] cpdma_ctlr_destroy+0x5c/0x114 [davinci_cpdma] [ 50.884871] the dependencies between SOFTIRQ-irq-safe lock and the holding lock: [ 50.892827] -> (&(&pool->lock)->rlock#2){..-...} ops: 167 { [ 50.898703] IN-SOFTIRQ-W at: [ 50.901995] [] _raw_spin_lock_irqsave+0x38/0x4c [ 50.909476] [] cpdma_desc_free.constprop.7+0x28/0x58 [davinci_cpdma] [ 50.918878] [] __cpdma_chan_free+0x7c/0xa8 [davinci_cpdma] [ 50.927366] [] __cpdma_chan_process+0xf4/0x134 [davinci_cpdma] [ 50.936218] [] cpdma_chan_process+0x3c/0x54 [davinci_cpdma] [ 50.944794] [] cpsw_poll+0x14/0xa8 [ti_cpsw] [ 50.952009] [] net_rx_action+0xc0/0x1e8 [ 50.958765] [] __do_softirq+0xcc/0x304 [ 50.965432] [] irq_exit+0xa8/0xfc [ 50.971635] [] handle_IRQ+0x50/0xb0 [ 50.978035] [] gic_handle_irq+0x28/0x5c [ 50.984788] [] __irq_svc+0x44/0x5c [ 50.991085] [] _raw_spin_unlock_irqrestore+0x34/0x44 [ 50.999023] [] do_page_fault.part.9+0x144/0x3c4 [ 51.006510] [] do_page_fault+0x74/0x84 [ 51.013171] [] do_DataAbort+0x34/0x98 [ 51.019738] [] __dabt_usr+0x3c/0x40 [ 51.026129] INITIAL USE at: [ 51.029335] [] _raw_spin_lock_irqsave+0x38/0x4c [ 51.036729] [] cpdma_chan_submit+0x4c/0x2f0 [davinci_cpdma] [ 51.045225] [] cpsw_ndo_open+0x378/0x6bc [ti_cpsw] [ 51.052897] [] __dev_open+0x9c/0x104 [ 51.059287] [] __dev_change_flags+0x88/0x160 [ 51.066420] [] dev_change_flags+0x18/0x48 [ 51.073270] [] devinet_ioctl+0x61c/0x6e0 [ 51.080029] [] sock_ioctl+0x5c/0x298 [ 51.086418] [] do_vfs_ioctl+0x78/0x61c [ 51.092993] [] SyS_ioctl+0x64/0x74 [ 51.099200] [] ret_fast_syscall+0x0/0x48 [ 51.105956] } [ 51.107696] ... key at: [] __key.21312+0x0/0xfffff650 [davinci_cpdma] [ 51.115912] ... acquired at: [ 51.119019] [] lock_acquire+0x9c/0x104 [ 51.124138] [] _raw_spin_lock+0x28/0x38 [ 51.129341] [] find_vmap_area+0x10/0x6c [ 51.134547] [] remove_vm_area+0x8/0x6c [ 51.139659] [] __vunmap+0x20/0xf8 [ 51.144318] [] __arm_iounmap+0x10/0x18 [ 51.149440] [] cpdma_ctlr_destroy+0xf0/0x114 [davinci_cpdma] [ 51.156560] [] cpsw_remove+0x48/0x8c [ti_cpsw] [ 51.162407] [] platform_drv_remove+0x18/0x1c [ 51.168063] [] __device_release_driver+0x70/0xc8 [ 51.174094] [] driver_detach+0xb4/0xb8 [ 51.179212] [] bus_remove_driver+0x4c/0x90 [ 51.184693] [] SyS_delete_module+0x10c/0x198 [ 51.190355] [] ret_fast_syscall+0x0/0x48 [ 51.195661] [ 51.197217] the dependencies between the lock to be acquired and SOFTIRQ-irq-unsafe lock: [ 51.205986] -> (vmap_area_lock){+.+...} ops: 520 { [ 51.211032] HARDIRQ-ON-W at: [ 51.214321] [] _raw_spin_lock+0x28/0x38 [ 51.221090] [] alloc_vmap_area.isra.28+0xb8/0x300 [ 51.228750] [] __get_vm_area_node.isra.29+0x90/0x134 [ 51.236690] [] get_vm_area_caller+0x3c/0x48 [ 51.243811] [] vmap+0x40/0x78 [ 51.249654] [] check_writebuffer_bugs+0x54/0x1a0 [ 51.257239] [] start_kernel+0x320/0x388 [ 51.263994] [<80008074>] 0x80008074 [ 51.269474] SOFTIRQ-ON-W at: [ 51.272769] [] _raw_spin_lock+0x28/0x38 [ 51.279525] [] alloc_vmap_area.isra.28+0xb8/0x300 [ 51.287190] [] __get_vm_area_node.isra.29+0x90/0x134 [ 51.295126] [] get_vm_area_caller+0x3c/0x48 [ 51.302245] [] vmap+0x40/0x78 [ 51.308094] [] check_writebuffer_bugs+0x54/0x1a0 [ 51.315669] [] start_kernel+0x320/0x388 [ 51.322423] [<80008074>] 0x80008074 [ 51.327906] INITIAL USE at: [ 51.331112] [] _raw_spin_lock+0x28/0x38 [ 51.337775] [] alloc_vmap_area.isra.28+0xb8/0x300 [ 51.345352] [] __get_vm_area_node.isra.29+0x90/0x134 [ 51.353197] [] get_vm_area_caller+0x3c/0x48 [ 51.360224] [] vmap+0x40/0x78 [ 51.365977] [] check_writebuffer_bugs+0x54/0x1a0 [ 51.373464] [] start_kernel+0x320/0x388 [ 51.380131] [<80008074>] 0x80008074 [ 51.385517] } [ 51.387260] ... key at: [] vmap_area_lock+0x10/0x20 [ 51.393841] ... acquired at: [ 51.396945] [] lock_acquire+0x9c/0x104 [ 51.402060] [] _raw_spin_lock+0x28/0x38 [ 51.407266] [] find_vmap_area+0x10/0x6c [ 51.412478] [] remove_vm_area+0x8/0x6c [ 51.417592] [] __vunmap+0x20/0xf8 [ 51.422252] [] __arm_iounmap+0x10/0x18 [ 51.427369] [] cpdma_ctlr_destroy+0xf0/0x114 [davinci_cpdma] [ 51.434487] [] cpsw_remove+0x48/0x8c [ti_cpsw] [ 51.440336] [] platform_drv_remove+0x18/0x1c [ 51.446000] [] __device_release_driver+0x70/0xc8 [ 51.452031] [] driver_detach+0xb4/0xb8 [ 51.457147] [] bus_remove_driver+0x4c/0x90 [ 51.462628] [] SyS_delete_module+0x10c/0x198 [ 51.468289] [] ret_fast_syscall+0x0/0x48 [ 51.473584] [ 51.475140] [ 51.475140] stack backtrace: [ 51.479703] CPU: 0 PID: 1921 Comm: modprobe Not tainted 3.14.19-02124-g95c5b7b #308 [ 51.487744] [] (unwind_backtrace) from [] (show_stack+0x10/0x14) [ 51.495865] [] (show_stack) from [] (dump_stack+0x78/0x94) [ 51.503444] [] (dump_stack) from [] (check_usage+0x408/0x594) [ 51.511293] [] (check_usage) from [] (check_irq_usage+0x54/0xb0) [ 51.519416] [] (check_irq_usage) from [] (__lock_acquire+0xe54/0x1b90) [ 51.528077] [] (__lock_acquire) from [] (lock_acquire+0x9c/0x104) [ 51.536291] [] (lock_acquire) from [] (_raw_spin_lock+0x28/0x38) [ 51.544417] [] (_raw_spin_lock) from [] (find_vmap_area+0x10/0x6c) [ 51.552726] [] (find_vmap_area) from [] (remove_vm_area+0x8/0x6c) [ 51.560935] [] (remove_vm_area) from [] (__vunmap+0x20/0xf8) [ 51.568693] [] (__vunmap) from [] (__arm_iounmap+0x10/0x18) [ 51.576362] [] (__arm_iounmap) from [] (cpdma_ctlr_destroy+0xf0/0x114 [davinci_cpdma]) [ 51.586494] [] (cpdma_ctlr_destroy [davinci_cpdma]) from [] (cpsw_remove+0x48/0x8c [ti_cpsw]) [ 51.597261] [] (cpsw_remove [ti_cpsw]) from [] (platform_drv_remove+0x18/0x1c) [ 51.606659] [] (platform_drv_remove) from [] (__device_release_driver+0x70/0xc8) [ 51.616237] [] (__device_release_driver) from [] (driver_detach+0xb4/0xb8) [ 51.625264] [] (driver_detach) from [] (bus_remove_driver+0x4c/0x90) [ 51.633749] [] (bus_remove_driver) from [] (SyS_delete_module+0x10c/0x198) [ 51.642781] [] (SyS_delete_module) from [] (ret_fast_syscall+0x0/0x48) Signed-off-by: Mugunthan V N Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/ti/davinci_cpdma.c b/drivers/net/ethernet/ti/davinci_cpdma.c index 32dc289..657b65b 100644 --- a/drivers/net/ethernet/ti/davinci_cpdma.c +++ b/drivers/net/ethernet/ti/davinci_cpdma.c @@ -193,12 +193,9 @@ fail: static void cpdma_desc_pool_destroy(struct cpdma_desc_pool *pool) { - unsigned long flags; - if (!pool) return; - spin_lock_irqsave(&pool->lock, flags); WARN_ON(pool->used_desc); if (pool->cpumap) { dma_free_coherent(pool->dev, pool->mem_size, pool->cpumap, @@ -206,7 +203,6 @@ static void cpdma_desc_pool_destroy(struct cpdma_desc_pool *pool) } else { iounmap(pool->iomap); } - spin_unlock_irqrestore(&pool->lock, flags); } static inline dma_addr_t desc_phys(struct cpdma_desc_pool *pool, -- cgit v0.10.2 From 030b16a0e37ff2a870dd57c5da89c1741c683684 Mon Sep 17 00:00:00 2001 From: Mugunthan V N Date: Mon, 13 Oct 2014 22:21:07 +0530 Subject: drivers: net: cpsw: remove child devices while driver detach remove all the child devices from the system to make sure that re-insert of cpsw module doesn't fail on child device populated by of_platform_populate(). Signed-off-by: Mugunthan V N Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/ti/cpsw.c b/drivers/net/ethernet/ti/cpsw.c index ab167dc..952e1e4 100644 --- a/drivers/net/ethernet/ti/cpsw.c +++ b/drivers/net/ethernet/ti/cpsw.c @@ -2392,6 +2392,15 @@ clean_ndev_ret: return ret; } +static int cpsw_remove_child_device(struct device *dev, void *c) +{ + struct platform_device *pdev = to_platform_device(dev); + + of_device_unregister(pdev); + + return 0; +} + static int cpsw_remove(struct platform_device *pdev) { struct net_device *ndev = platform_get_drvdata(pdev); @@ -2406,6 +2415,7 @@ static int cpsw_remove(struct platform_device *pdev) cpdma_chan_destroy(priv->rxch); cpdma_ctlr_destroy(priv->dma); pm_runtime_disable(&pdev->dev); + device_for_each_child(&pdev->dev, NULL, cpsw_remove_child_device); if (priv->data.dual_emac) free_netdev(cpsw_get_slave_ndev(priv, 1)); free_netdev(ndev); -- cgit v0.10.2 From 6ff1e1e3c81426515e1782f2f13b7237211a43df Mon Sep 17 00:00:00 2001 From: Fabian Frederick Date: Mon, 13 Oct 2014 22:21:46 +0200 Subject: caif: replace kmalloc/memset 0 by kzalloc Also add blank line after declaration Signed-off-by: Fabian Frederick Signed-off-by: David S. Miller diff --git a/net/caif/cfmuxl.c b/net/caif/cfmuxl.c index 8c5d638..510aa5a 100644 --- a/net/caif/cfmuxl.c +++ b/net/caif/cfmuxl.c @@ -47,10 +47,10 @@ static struct cflayer *get_up(struct cfmuxl *muxl, u16 id); struct cflayer *cfmuxl_create(void) { - struct cfmuxl *this = kmalloc(sizeof(struct cfmuxl), GFP_ATOMIC); + struct cfmuxl *this = kzalloc(sizeof(struct cfmuxl), GFP_ATOMIC); + if (!this) return NULL; - memset(this, 0, sizeof(*this)); this->layer.receive = cfmuxl_receive; this->layer.transmit = cfmuxl_transmit; this->layer.ctrlcmd = cfmuxl_ctrlcmd; -- cgit v0.10.2 From 7970f1918ff685e64063b54474a9c1ac087aee4d Mon Sep 17 00:00:00 2001 From: Fabian Frederick Date: Tue, 14 Oct 2014 19:00:55 +0200 Subject: caif_usb: remove redundant memory message Let MM subsystem display out of memory messages. Signed-off-by: Fabian Frederick Signed-off-by: David S. Miller diff --git a/net/caif/caif_usb.c b/net/caif/caif_usb.c index ba02db0..0e487b0 100644 --- a/net/caif/caif_usb.c +++ b/net/caif/caif_usb.c @@ -87,10 +87,9 @@ static struct cflayer *cfusbl_create(int phyid, u8 ethaddr[ETH_ALEN], { struct cfusbl *this = kmalloc(sizeof(struct cfusbl), GFP_ATOMIC); - if (!this) { - pr_warn("Out of memory\n"); + if (!this) return NULL; - } + caif_assert(offsetof(struct cfusbl, layer) == 0); memset(this, 0, sizeof(struct cflayer)); -- cgit v0.10.2 From 91c4467e3c76b6d40ecc29ed71d3aa1e0285ab80 Mon Sep 17 00:00:00 2001 From: Fabian Frederick Date: Tue, 14 Oct 2014 19:01:14 +0200 Subject: caif_usb: use target structure member in memset parent cfusbl was used instead of first structure member 'layer' Suggested-by: Joe Perches Signed-off-by: Fabian Frederick Signed-off-by: David S. Miller diff --git a/net/caif/caif_usb.c b/net/caif/caif_usb.c index 0e487b0..5cd44f0 100644 --- a/net/caif/caif_usb.c +++ b/net/caif/caif_usb.c @@ -92,7 +92,7 @@ static struct cflayer *cfusbl_create(int phyid, u8 ethaddr[ETH_ALEN], caif_assert(offsetof(struct cfusbl, layer) == 0); - memset(this, 0, sizeof(struct cflayer)); + memset(&this->layer, 0, sizeof(this->layer)); this->layer.receive = cfusbl_receive; this->layer.transmit = cfusbl_transmit; this->layer.ctrlcmd = cfusbl_ctrlcmd; -- cgit v0.10.2 From c15952dc18d8a293d976ac6c06d44d9d98023b45 Mon Sep 17 00:00:00 2001 From: Alexei Starovoitov Date: Tue, 14 Oct 2014 02:08:54 -0700 Subject: net: filter: move common defines into bpf_common.h userspace programs that use eBPF instruction macros need to include two files: uapi/linux/filter.h and uapi/linux/bpf.h Move common macro definitions that are shared between classic BPF and eBPF into uapi/linux/bpf_common.h, so that user app can include only one bpf.h file Cc: Daniel Borkmann Signed-off-by: Alexei Starovoitov Signed-off-by: David S. Miller diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild index 70e150e..1115d68 100644 --- a/include/uapi/linux/Kbuild +++ b/include/uapi/linux/Kbuild @@ -68,6 +68,7 @@ header-y += binfmts.h header-y += blkpg.h header-y += blktrace_api.h header-y += bpf.h +header-y += bpf_common.h header-y += bpqether.h header-y += bsg.h header-y += btrfs.h diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 31b0ac2..d18316f 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -8,6 +8,7 @@ #define _UAPI__LINUX_BPF_H__ #include +#include /* Extended instruction set based on top of classic BPF */ diff --git a/include/uapi/linux/bpf_common.h b/include/uapi/linux/bpf_common.h new file mode 100644 index 0000000..a5c220e --- /dev/null +++ b/include/uapi/linux/bpf_common.h @@ -0,0 +1,55 @@ +#ifndef _UAPI__LINUX_BPF_COMMON_H__ +#define _UAPI__LINUX_BPF_COMMON_H__ + +/* Instruction classes */ +#define BPF_CLASS(code) ((code) & 0x07) +#define BPF_LD 0x00 +#define BPF_LDX 0x01 +#define BPF_ST 0x02 +#define BPF_STX 0x03 +#define BPF_ALU 0x04 +#define BPF_JMP 0x05 +#define BPF_RET 0x06 +#define BPF_MISC 0x07 + +/* ld/ldx fields */ +#define BPF_SIZE(code) ((code) & 0x18) +#define BPF_W 0x00 +#define BPF_H 0x08 +#define BPF_B 0x10 +#define BPF_MODE(code) ((code) & 0xe0) +#define BPF_IMM 0x00 +#define BPF_ABS 0x20 +#define BPF_IND 0x40 +#define BPF_MEM 0x60 +#define BPF_LEN 0x80 +#define BPF_MSH 0xa0 + +/* alu/jmp fields */ +#define BPF_OP(code) ((code) & 0xf0) +#define BPF_ADD 0x00 +#define BPF_SUB 0x10 +#define BPF_MUL 0x20 +#define BPF_DIV 0x30 +#define BPF_OR 0x40 +#define BPF_AND 0x50 +#define BPF_LSH 0x60 +#define BPF_RSH 0x70 +#define BPF_NEG 0x80 +#define BPF_MOD 0x90 +#define BPF_XOR 0xa0 + +#define BPF_JA 0x00 +#define BPF_JEQ 0x10 +#define BPF_JGT 0x20 +#define BPF_JGE 0x30 +#define BPF_JSET 0x40 +#define BPF_SRC(code) ((code) & 0x08) +#define BPF_K 0x00 +#define BPF_X 0x08 + +#ifndef BPF_MAXINSNS +#define BPF_MAXINSNS 4096 +#endif + +#endif /* _UAPI__LINUX_BPF_COMMON_H__ */ diff --git a/include/uapi/linux/filter.h b/include/uapi/linux/filter.h index 253b4d4..47785d5 100644 --- a/include/uapi/linux/filter.h +++ b/include/uapi/linux/filter.h @@ -7,7 +7,7 @@ #include #include - +#include /* * Current version of the filter code architecture. @@ -32,56 +32,6 @@ struct sock_fprog { /* Required for SO_ATTACH_FILTER. */ struct sock_filter __user *filter; }; -/* - * Instruction classes - */ - -#define BPF_CLASS(code) ((code) & 0x07) -#define BPF_LD 0x00 -#define BPF_LDX 0x01 -#define BPF_ST 0x02 -#define BPF_STX 0x03 -#define BPF_ALU 0x04 -#define BPF_JMP 0x05 -#define BPF_RET 0x06 -#define BPF_MISC 0x07 - -/* ld/ldx fields */ -#define BPF_SIZE(code) ((code) & 0x18) -#define BPF_W 0x00 -#define BPF_H 0x08 -#define BPF_B 0x10 -#define BPF_MODE(code) ((code) & 0xe0) -#define BPF_IMM 0x00 -#define BPF_ABS 0x20 -#define BPF_IND 0x40 -#define BPF_MEM 0x60 -#define BPF_LEN 0x80 -#define BPF_MSH 0xa0 - -/* alu/jmp fields */ -#define BPF_OP(code) ((code) & 0xf0) -#define BPF_ADD 0x00 -#define BPF_SUB 0x10 -#define BPF_MUL 0x20 -#define BPF_DIV 0x30 -#define BPF_OR 0x40 -#define BPF_AND 0x50 -#define BPF_LSH 0x60 -#define BPF_RSH 0x70 -#define BPF_NEG 0x80 -#define BPF_MOD 0x90 -#define BPF_XOR 0xa0 - -#define BPF_JA 0x00 -#define BPF_JEQ 0x10 -#define BPF_JGT 0x20 -#define BPF_JGE 0x30 -#define BPF_JSET 0x40 -#define BPF_SRC(code) ((code) & 0x08) -#define BPF_K 0x00 -#define BPF_X 0x08 - /* ret - BPF_K and BPF_X also apply */ #define BPF_RVAL(code) ((code) & 0x18) #define BPF_A 0x10 @@ -91,10 +41,6 @@ struct sock_fprog { /* Required for SO_ATTACH_FILTER. */ #define BPF_TAX 0x00 #define BPF_TXA 0x80 -#ifndef BPF_MAXINSNS -#define BPF_MAXINSNS 4096 -#endif - /* * Macros for filter block array initializers. */ -- cgit v0.10.2 From 4c2e7f0954dcd9fbb47d065c654d44608dad38e0 Mon Sep 17 00:00:00 2001 From: Iyappan Subramanian Date: Mon, 13 Oct 2014 17:05:32 -0700 Subject: dtb: Add SGMII based 1GbE node to APM X-Gene SoC device tree Signed-off-by: Iyappan Subramanian Signed-off-by: Keyur Chudgar Signed-off-by: David S. Miller diff --git a/arch/arm64/boot/dts/apm-mustang.dts b/arch/arm64/boot/dts/apm-mustang.dts index 2ae782b..71a1489 100644 --- a/arch/arm64/boot/dts/apm-mustang.dts +++ b/arch/arm64/boot/dts/apm-mustang.dts @@ -33,6 +33,10 @@ status = "ok"; }; +&sgenet0 { + status = "ok"; +}; + &xgenet { status = "ok"; }; diff --git a/arch/arm64/boot/dts/apm-storm.dtsi b/arch/arm64/boot/dts/apm-storm.dtsi index d16cc03..f45bbfe 100644 --- a/arch/arm64/boot/dts/apm-storm.dtsi +++ b/arch/arm64/boot/dts/apm-storm.dtsi @@ -176,6 +176,16 @@ clock-output-names = "menetclk"; }; + sge0clk: sge0clk@1f21c000 { + compatible = "apm,xgene-device-clock"; + #clock-cells = <1>; + clocks = <&socplldiv2 0>; + reg = <0x0 0x1f21c000 0x0 0x1000>; + reg-names = "csr-reg"; + csr-mask = <0x3>; + clock-output-names = "sge0clk"; + }; + xge0clk: xge0clk@1f61c000 { compatible = "apm,xgene-device-clock"; #clock-cells = <1>; @@ -446,6 +456,20 @@ }; }; + sgenet0: ethernet@1f210000 { + compatible = "apm,xgene-enet"; + status = "disabled"; + reg = <0x0 0x1f210000 0x0 0x10000>, + <0x0 0x1f200000 0x0 0X10000>, + <0x0 0x1B000000 0x0 0X20000>; + reg-names = "enet_csr", "ring_csr", "ring_cmd"; + interrupts = <0x0 0xA0 0x4>; + dma-coherent; + clocks = <&sge0clk 0>; + local-mac-address = [00 00 00 00 00 00]; + phy-connection-type = "sgmii"; + }; + xgenet: ethernet@1f610000 { compatible = "apm,xgene-enet"; status = "disabled"; -- cgit v0.10.2 From dc8385f0c0f46ca18c1c8ab59c9f565dc7cfa6bf Mon Sep 17 00:00:00 2001 From: Iyappan Subramanian Date: Mon, 13 Oct 2014 17:05:33 -0700 Subject: drivers: net: xgene: Preparing for adding SGMII based 1GbE - Added link_state function pointer to the xgene__mac_ops structure - Moved ring manager (pdata->rm) assignment to xgene_enet_setup_ops - Removed unused variable (pdata->phy_addr) and macro (FULL_DUPLEX) Signed-off-by: Iyappan Subramanian Signed-off-by: Keyur Chudgar Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c index c8f3824..63ea194 100644 --- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.c @@ -410,7 +410,6 @@ static void xgene_gmac_set_mac_addr(struct xgene_enet_pdata *pdata) addr0 = (dev_addr[3] << 24) | (dev_addr[2] << 16) | (dev_addr[1] << 8) | dev_addr[0]; addr1 = (dev_addr[5] << 24) | (dev_addr[4] << 16); - addr1 |= pdata->phy_addr & 0xFFFF; xgene_enet_wr_mcx_mac(pdata, STATION_ADDR0_ADDR, addr0); xgene_enet_wr_mcx_mac(pdata, STATION_ADDR1_ADDR, addr1); diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h index 15ec426..2efc4d9 100644 --- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h @@ -179,7 +179,6 @@ enum xgene_enet_rm { #define TUND_ADDR 0x4a #define TSO_IPPROTO_TCP 1 -#define FULL_DUPLEX 2 #define USERINFO_POS 0 #define USERINFO_LEN 32 diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c index 9b85239..9e251ec 100644 --- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c @@ -833,11 +833,9 @@ static int xgene_enet_get_resources(struct xgene_enet_pdata *pdata) if (pdata->phy_mode == PHY_INTERFACE_MODE_RGMII) { pdata->mcx_mac_addr = base_addr + BLOCK_ETH_MAC_OFFSET; pdata->mcx_mac_csr_addr = base_addr + BLOCK_ETH_MAC_CSR_OFFSET; - pdata->rm = RM3; } else { pdata->mcx_mac_addr = base_addr + BLOCK_AXG_MAC_OFFSET; pdata->mcx_mac_csr_addr = base_addr + BLOCK_AXG_MAC_CSR_OFFSET; - pdata->rm = RM0; } pdata->rx_buff_cnt = NUM_PKT_BUF; @@ -881,10 +879,12 @@ static void xgene_enet_setup_ops(struct xgene_enet_pdata *pdata) case PHY_INTERFACE_MODE_RGMII: pdata->mac_ops = &xgene_gmac_ops; pdata->port_ops = &xgene_gport_ops; + pdata->rm = RM3; break; default: pdata->mac_ops = &xgene_xgmac_ops; pdata->port_ops = &xgene_xgport_ops; + pdata->rm = RM0; break; } } @@ -895,6 +895,7 @@ static int xgene_enet_probe(struct platform_device *pdev) struct xgene_enet_pdata *pdata; struct device *dev = &pdev->dev; struct napi_struct *napi; + struct xgene_mac_ops *mac_ops; int ret; ndev = alloc_etherdev(sizeof(struct xgene_enet_pdata)); @@ -937,10 +938,11 @@ static int xgene_enet_probe(struct platform_device *pdev) napi = &pdata->rx_ring->napi; netif_napi_add(ndev, napi, xgene_enet_napi, NAPI_POLL_WEIGHT); + mac_ops = pdata->mac_ops; if (pdata->phy_mode == PHY_INTERFACE_MODE_RGMII) ret = xgene_enet_mdio_config(pdata); else - INIT_DELAYED_WORK(&pdata->link_work, xgene_enet_link_state); + INIT_DELAYED_WORK(&pdata->link_work, mac_ops->link_state); return ret; err: diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h index 86cf68b..10b03a1 100644 --- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h @@ -76,6 +76,7 @@ struct xgene_mac_ops { void (*tx_disable)(struct xgene_enet_pdata *pdata); void (*rx_disable)(struct xgene_enet_pdata *pdata); void (*set_mac_addr)(struct xgene_enet_pdata *pdata); + void (*link_state)(struct work_struct *work); }; struct xgene_port_ops { @@ -109,7 +110,6 @@ struct xgene_enet_pdata { void __iomem *base_addr; void __iomem *ring_csr_addr; void __iomem *ring_cmd_addr; - u32 phy_addr; int phy_mode; enum xgene_enet_rm rm; struct rtnl_link_stats64 stats; diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c b/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c index cd64b9f..67d0720 100644 --- a/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.c @@ -284,7 +284,7 @@ static void xgene_enet_shutdown(struct xgene_enet_pdata *pdata) clk_disable_unprepare(pdata->clk); } -void xgene_enet_link_state(struct work_struct *work) +static void xgene_enet_link_state(struct work_struct *work) { struct xgene_enet_pdata *pdata = container_of(to_delayed_work(work), struct xgene_enet_pdata, link_work); @@ -322,6 +322,7 @@ struct xgene_mac_ops xgene_xgmac_ops = { .rx_disable = xgene_xgmac_rx_disable, .tx_disable = xgene_xgmac_tx_disable, .set_mac_addr = xgene_xgmac_set_mac_addr, + .link_state = xgene_enet_link_state }; struct xgene_port_ops xgene_xgport_ops = { diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.h b/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.h index d2d59e7..dcb2087 100644 --- a/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.h +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.h @@ -50,7 +50,6 @@ #define PHY_POLL_LINK_ON (10 * HZ) #define PHY_POLL_LINK_OFF (PHY_POLL_LINK_ON / 5) -void xgene_enet_link_state(struct work_struct *work); extern struct xgene_mac_ops xgene_xgmac_ops; extern struct xgene_port_ops xgene_xgport_ops; -- cgit v0.10.2 From 32f784b50e14c653ad0f010fbd5921a5f8caf846 Mon Sep 17 00:00:00 2001 From: Iyappan Subramanian Date: Mon, 13 Oct 2014 17:05:34 -0700 Subject: drivers: net: xgene: Add SGMII based 1GbE support Signed-off-by: Iyappan Subramanian Signed-off-by: Keyur Chudgar Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/apm/xgene/Makefile b/drivers/net/ethernet/apm/xgene/Makefile index 589b352..68be5655 100644 --- a/drivers/net/ethernet/apm/xgene/Makefile +++ b/drivers/net/ethernet/apm/xgene/Makefile @@ -2,6 +2,6 @@ # Makefile for APM X-Gene Ethernet Driver. # -xgene-enet-objs := xgene_enet_hw.o xgene_enet_xgmac.o \ +xgene-enet-objs := xgene_enet_hw.o xgene_enet_sgmac.o xgene_enet_xgmac.o \ xgene_enet_main.o xgene_enet_ethtool.o obj-$(CONFIG_NET_XGENE) += xgene-enet.o diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h index 2efc4d9..3855858 100644 --- a/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_hw.h @@ -44,6 +44,7 @@ static inline u32 xgene_get_bits(u32 val, u32 start, u32 end) enum xgene_enet_rm { RM0, + RM1, RM3 = 3 }; @@ -143,6 +144,8 @@ enum xgene_enet_rm { #define CFG_CLE_FPSEL0_SET(dst, val) xgene_set_bits(dst, val, 16, 4) #define CFG_MACMODE_SET(dst, val) xgene_set_bits(dst, val, 18, 2) #define CFG_WAITASYNCRD_SET(dst, val) xgene_set_bits(dst, val, 0, 16) +#define CFG_CLE_DSTQID0(val) (val & GENMASK(11, 0)) +#define CFG_CLE_FPSEL0(val) ((val << 16) & GENMASK(19, 16)) #define ICM_CONFIG0_REG_0_ADDR 0x0400 #define ICM_CONFIG2_REG_0_ADDR 0x0410 #define RX_DV_GATE_REG_0_ADDR 0x05fc diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c index 9e251ec..3c208cc 100644 --- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c @@ -21,6 +21,7 @@ #include "xgene_enet_main.h" #include "xgene_enet_hw.h" +#include "xgene_enet_sgmac.h" #include "xgene_enet_xgmac.h" static void xgene_enet_init_bufpool(struct xgene_enet_desc_ring *buf_pool) @@ -813,6 +814,7 @@ static int xgene_enet_get_resources(struct xgene_enet_pdata *pdata) return pdata->phy_mode; } if (pdata->phy_mode != PHY_INTERFACE_MODE_RGMII && + pdata->phy_mode != PHY_INTERFACE_MODE_SGMII && pdata->phy_mode != PHY_INTERFACE_MODE_XGMII) { dev_err(dev, "Incorrect phy-connection-type specified\n"); return -ENODEV; @@ -830,7 +832,8 @@ static int xgene_enet_get_resources(struct xgene_enet_pdata *pdata) pdata->eth_csr_addr = base_addr + BLOCK_ETH_CSR_OFFSET; pdata->eth_ring_if_addr = base_addr + BLOCK_ETH_RING_IF_OFFSET; pdata->eth_diag_csr_addr = base_addr + BLOCK_ETH_DIAG_CSR_OFFSET; - if (pdata->phy_mode == PHY_INTERFACE_MODE_RGMII) { + if (pdata->phy_mode == PHY_INTERFACE_MODE_RGMII || + pdata->phy_mode == PHY_INTERFACE_MODE_SGMII) { pdata->mcx_mac_addr = base_addr + BLOCK_ETH_MAC_OFFSET; pdata->mcx_mac_csr_addr = base_addr + BLOCK_ETH_MAC_CSR_OFFSET; } else { @@ -881,6 +884,11 @@ static void xgene_enet_setup_ops(struct xgene_enet_pdata *pdata) pdata->port_ops = &xgene_gport_ops; pdata->rm = RM3; break; + case PHY_INTERFACE_MODE_SGMII: + pdata->mac_ops = &xgene_sgmac_ops; + pdata->port_ops = &xgene_sgport_ops; + pdata->rm = RM1; + break; default: pdata->mac_ops = &xgene_xgmac_ops; pdata->port_ops = &xgene_xgport_ops; diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h index 10b03a1..874e5a0 100644 --- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.h +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.h @@ -39,6 +39,9 @@ #define NUM_PKT_BUF 64 #define NUM_BUFPOOL 32 +#define PHY_POLL_LINK_ON (10 * HZ) +#define PHY_POLL_LINK_OFF (PHY_POLL_LINK_ON / 5) + /* software context of a descriptor ring */ struct xgene_enet_desc_ring { struct net_device *ndev; @@ -118,6 +121,13 @@ struct xgene_enet_pdata { struct delayed_work link_work; }; +struct xgene_indirect_ctl { + void __iomem *addr; + void __iomem *ctl; + void __iomem *cmd; + void __iomem *cmd_done; +}; + /* Set the specified value into a bit-field defined by its starting position * and length within a single u64. */ diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c b/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c new file mode 100644 index 0000000..e6d24c2 --- /dev/null +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.c @@ -0,0 +1,389 @@ +/* Applied Micro X-Gene SoC Ethernet Driver + * + * Copyright (c) 2014, Applied Micro Circuits Corporation + * Authors: Iyappan Subramanian + * Keyur Chudgar + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see . + */ + +#include "xgene_enet_main.h" +#include "xgene_enet_hw.h" +#include "xgene_enet_sgmac.h" + +static void xgene_enet_wr_csr(struct xgene_enet_pdata *p, u32 offset, u32 val) +{ + iowrite32(val, p->eth_csr_addr + offset); +} + +static void xgene_enet_wr_ring_if(struct xgene_enet_pdata *p, + u32 offset, u32 val) +{ + iowrite32(val, p->eth_ring_if_addr + offset); +} + +static void xgene_enet_wr_diag_csr(struct xgene_enet_pdata *p, + u32 offset, u32 val) +{ + iowrite32(val, p->eth_diag_csr_addr + offset); +} + +static bool xgene_enet_wr_indirect(struct xgene_indirect_ctl *ctl, + u32 wr_addr, u32 wr_data) +{ + int i; + + iowrite32(wr_addr, ctl->addr); + iowrite32(wr_data, ctl->ctl); + iowrite32(XGENE_ENET_WR_CMD, ctl->cmd); + + /* wait for write command to complete */ + for (i = 0; i < 10; i++) { + if (ioread32(ctl->cmd_done)) { + iowrite32(0, ctl->cmd); + return true; + } + udelay(1); + } + + return false; +} + +static void xgene_enet_wr_mac(struct xgene_enet_pdata *p, + u32 wr_addr, u32 wr_data) +{ + struct xgene_indirect_ctl ctl = { + .addr = p->mcx_mac_addr + MAC_ADDR_REG_OFFSET, + .ctl = p->mcx_mac_addr + MAC_WRITE_REG_OFFSET, + .cmd = p->mcx_mac_addr + MAC_COMMAND_REG_OFFSET, + .cmd_done = p->mcx_mac_addr + MAC_COMMAND_DONE_REG_OFFSET + }; + + if (!xgene_enet_wr_indirect(&ctl, wr_addr, wr_data)) + netdev_err(p->ndev, "mac write failed, addr: %04x\n", wr_addr); +} + +static u32 xgene_enet_rd_csr(struct xgene_enet_pdata *p, u32 offset) +{ + return ioread32(p->eth_csr_addr + offset); +} + +static u32 xgene_enet_rd_diag_csr(struct xgene_enet_pdata *p, u32 offset) +{ + return ioread32(p->eth_diag_csr_addr + offset); +} + +static u32 xgene_enet_rd_indirect(struct xgene_indirect_ctl *ctl, u32 rd_addr) +{ + u32 rd_data; + int i; + + iowrite32(rd_addr, ctl->addr); + iowrite32(XGENE_ENET_RD_CMD, ctl->cmd); + + /* wait for read command to complete */ + for (i = 0; i < 10; i++) { + if (ioread32(ctl->cmd_done)) { + rd_data = ioread32(ctl->ctl); + iowrite32(0, ctl->cmd); + + return rd_data; + } + udelay(1); + } + + pr_err("%s: mac read failed, addr: %04x\n", __func__, rd_addr); + + return 0; +} + +static u32 xgene_enet_rd_mac(struct xgene_enet_pdata *p, u32 rd_addr) +{ + struct xgene_indirect_ctl ctl = { + .addr = p->mcx_mac_addr + MAC_ADDR_REG_OFFSET, + .ctl = p->mcx_mac_addr + MAC_READ_REG_OFFSET, + .cmd = p->mcx_mac_addr + MAC_COMMAND_REG_OFFSET, + .cmd_done = p->mcx_mac_addr + MAC_COMMAND_DONE_REG_OFFSET + }; + + return xgene_enet_rd_indirect(&ctl, rd_addr); +} + +static int xgene_enet_ecc_init(struct xgene_enet_pdata *p) +{ + struct net_device *ndev = p->ndev; + u32 data; + int i; + + xgene_enet_wr_diag_csr(p, ENET_CFG_MEM_RAM_SHUTDOWN_ADDR, 0); + for (i = 0; i < 10 && data != ~0U ; i++) { + usleep_range(100, 110); + data = xgene_enet_rd_diag_csr(p, ENET_BLOCK_MEM_RDY_ADDR); + } + + if (data != ~0U) { + netdev_err(ndev, "Failed to release memory from shutdown\n"); + return -ENODEV; + } + + return 0; +} + +static void xgene_enet_config_ring_if_assoc(struct xgene_enet_pdata *p) +{ + u32 val = 0xffffffff; + + xgene_enet_wr_ring_if(p, ENET_CFGSSQMIWQASSOC_ADDR, val); + xgene_enet_wr_ring_if(p, ENET_CFGSSQMIFPQASSOC_ADDR, val); +} + +static void xgene_mii_phy_write(struct xgene_enet_pdata *p, u8 phy_id, + u32 reg, u16 data) +{ + u32 addr, wr_data, done; + int i; + + addr = PHY_ADDR(phy_id) | REG_ADDR(reg); + xgene_enet_wr_mac(p, MII_MGMT_ADDRESS_ADDR, addr); + + wr_data = PHY_CONTROL(data); + xgene_enet_wr_mac(p, MII_MGMT_CONTROL_ADDR, wr_data); + + for (i = 0; i < 10; i++) { + done = xgene_enet_rd_mac(p, MII_MGMT_INDICATORS_ADDR); + if (!(done & BUSY_MASK)) + return; + usleep_range(10, 20); + } + + netdev_err(p->ndev, "MII_MGMT write failed\n"); +} + +static u32 xgene_mii_phy_read(struct xgene_enet_pdata *p, u8 phy_id, u32 reg) +{ + u32 addr, data, done; + int i; + + addr = PHY_ADDR(phy_id) | REG_ADDR(reg); + xgene_enet_wr_mac(p, MII_MGMT_ADDRESS_ADDR, addr); + xgene_enet_wr_mac(p, MII_MGMT_COMMAND_ADDR, READ_CYCLE_MASK); + + for (i = 0; i < 10; i++) { + done = xgene_enet_rd_mac(p, MII_MGMT_INDICATORS_ADDR); + if (!(done & BUSY_MASK)) { + data = xgene_enet_rd_mac(p, MII_MGMT_STATUS_ADDR); + xgene_enet_wr_mac(p, MII_MGMT_COMMAND_ADDR, 0); + + return data; + } + usleep_range(10, 20); + } + + netdev_err(p->ndev, "MII_MGMT read failed\n"); + + return 0; +} + +static void xgene_sgmac_reset(struct xgene_enet_pdata *p) +{ + xgene_enet_wr_mac(p, MAC_CONFIG_1_ADDR, SOFT_RESET1); + xgene_enet_wr_mac(p, MAC_CONFIG_1_ADDR, 0); +} + +static void xgene_sgmac_set_mac_addr(struct xgene_enet_pdata *p) +{ + u32 addr0, addr1; + u8 *dev_addr = p->ndev->dev_addr; + + addr0 = (dev_addr[3] << 24) | (dev_addr[2] << 16) | + (dev_addr[1] << 8) | dev_addr[0]; + xgene_enet_wr_mac(p, STATION_ADDR0_ADDR, addr0); + + addr1 = xgene_enet_rd_mac(p, STATION_ADDR1_ADDR); + addr1 |= (dev_addr[5] << 24) | (dev_addr[4] << 16); + xgene_enet_wr_mac(p, STATION_ADDR1_ADDR, addr1); +} + +static u32 xgene_enet_link_status(struct xgene_enet_pdata *p) +{ + u32 data; + + data = xgene_mii_phy_read(p, INT_PHY_ADDR, + SGMII_BASE_PAGE_ABILITY_ADDR >> 2); + + return data & LINK_UP; +} + +static void xgene_sgmac_init(struct xgene_enet_pdata *p) +{ + u32 data, loop = 10; + + xgene_sgmac_reset(p); + + /* Enable auto-negotiation */ + xgene_mii_phy_write(p, INT_PHY_ADDR, SGMII_CONTROL_ADDR >> 2, 0x1000); + xgene_mii_phy_write(p, INT_PHY_ADDR, SGMII_TBI_CONTROL_ADDR >> 2, 0); + + while (loop--) { + data = xgene_mii_phy_read(p, INT_PHY_ADDR, + SGMII_STATUS_ADDR >> 2); + if ((data & AUTO_NEG_COMPLETE) && (data & LINK_STATUS)) + break; + usleep_range(10, 20); + } + if (!(data & AUTO_NEG_COMPLETE) || !(data & LINK_STATUS)) + netdev_err(p->ndev, "Auto-negotiation failed\n"); + + data = xgene_enet_rd_mac(p, MAC_CONFIG_2_ADDR); + ENET_INTERFACE_MODE2_SET(&data, 2); + xgene_enet_wr_mac(p, MAC_CONFIG_2_ADDR, data | FULL_DUPLEX2); + xgene_enet_wr_mac(p, INTERFACE_CONTROL_ADDR, ENET_GHD_MODE); + + data = xgene_enet_rd_csr(p, ENET_SPARE_CFG_REG_ADDR); + data |= MPA_IDLE_WITH_QMI_EMPTY; + xgene_enet_wr_csr(p, ENET_SPARE_CFG_REG_ADDR, data); + + xgene_sgmac_set_mac_addr(p); + + data = xgene_enet_rd_csr(p, DEBUG_REG_ADDR); + data |= CFG_BYPASS_UNISEC_TX | CFG_BYPASS_UNISEC_RX; + xgene_enet_wr_csr(p, DEBUG_REG_ADDR, data); + + /* Adjust MDC clock frequency */ + data = xgene_enet_rd_mac(p, MII_MGMT_CONFIG_ADDR); + MGMT_CLOCK_SEL_SET(&data, 7); + xgene_enet_wr_mac(p, MII_MGMT_CONFIG_ADDR, data); + + /* Enable drop if bufpool not available */ + data = xgene_enet_rd_csr(p, RSIF_CONFIG_REG_ADDR); + data |= CFG_RSIF_FPBUFF_TIMEOUT_EN; + xgene_enet_wr_csr(p, RSIF_CONFIG_REG_ADDR, data); + + /* Rtype should be copied from FP */ + xgene_enet_wr_csr(p, RSIF_RAM_DBG_REG0_ADDR, 0); + + /* Bypass traffic gating */ + xgene_enet_wr_csr(p, CFG_LINK_AGGR_RESUME_0_ADDR, TX_PORT0); + xgene_enet_wr_csr(p, CFG_BYPASS_ADDR, RESUME_TX); + xgene_enet_wr_csr(p, SG_RX_DV_GATE_REG_0_ADDR, RESUME_RX0); +} + +static void xgene_sgmac_rxtx(struct xgene_enet_pdata *p, u32 bits, bool set) +{ + u32 data; + + data = xgene_enet_rd_mac(p, MAC_CONFIG_1_ADDR); + + if (set) + data |= bits; + else + data &= ~bits; + + xgene_enet_wr_mac(p, MAC_CONFIG_1_ADDR, data); +} + +static void xgene_sgmac_rx_enable(struct xgene_enet_pdata *p) +{ + xgene_sgmac_rxtx(p, RX_EN, true); +} + +static void xgene_sgmac_tx_enable(struct xgene_enet_pdata *p) +{ + xgene_sgmac_rxtx(p, TX_EN, true); +} + +static void xgene_sgmac_rx_disable(struct xgene_enet_pdata *p) +{ + xgene_sgmac_rxtx(p, RX_EN, false); +} + +static void xgene_sgmac_tx_disable(struct xgene_enet_pdata *p) +{ + xgene_sgmac_rxtx(p, TX_EN, false); +} + +static void xgene_enet_reset(struct xgene_enet_pdata *p) +{ + clk_prepare_enable(p->clk); + clk_disable_unprepare(p->clk); + clk_prepare_enable(p->clk); + + xgene_enet_ecc_init(p); + xgene_enet_config_ring_if_assoc(p); +} + +static void xgene_enet_cle_bypass(struct xgene_enet_pdata *p, + u32 dst_ring_num, u16 bufpool_id) +{ + u32 data, fpsel; + + data = CFG_CLE_BYPASS_EN0; + xgene_enet_wr_csr(p, CLE_BYPASS_REG0_0_ADDR, data); + + fpsel = xgene_enet_ring_bufnum(bufpool_id) - 0x20; + data = CFG_CLE_DSTQID0(dst_ring_num) | CFG_CLE_FPSEL0(fpsel); + xgene_enet_wr_csr(p, CLE_BYPASS_REG1_0_ADDR, data); +} + +static void xgene_enet_shutdown(struct xgene_enet_pdata *p) +{ + clk_disable_unprepare(p->clk); +} + +static void xgene_enet_link_state(struct work_struct *work) +{ + struct xgene_enet_pdata *p = container_of(to_delayed_work(work), + struct xgene_enet_pdata, link_work); + struct net_device *ndev = p->ndev; + u32 link, poll_interval; + + link = xgene_enet_link_status(p); + if (link) { + if (!netif_carrier_ok(ndev)) { + netif_carrier_on(ndev); + xgene_sgmac_init(p); + xgene_sgmac_rx_enable(p); + xgene_sgmac_tx_enable(p); + netdev_info(ndev, "Link is Up - 1Gbps\n"); + } + poll_interval = PHY_POLL_LINK_ON; + } else { + if (netif_carrier_ok(ndev)) { + xgene_sgmac_rx_disable(p); + xgene_sgmac_tx_disable(p); + netif_carrier_off(ndev); + netdev_info(ndev, "Link is Down\n"); + } + poll_interval = PHY_POLL_LINK_OFF; + } + + schedule_delayed_work(&p->link_work, poll_interval); +} + +struct xgene_mac_ops xgene_sgmac_ops = { + .init = xgene_sgmac_init, + .reset = xgene_sgmac_reset, + .rx_enable = xgene_sgmac_rx_enable, + .tx_enable = xgene_sgmac_tx_enable, + .rx_disable = xgene_sgmac_rx_disable, + .tx_disable = xgene_sgmac_tx_disable, + .set_mac_addr = xgene_sgmac_set_mac_addr, + .link_state = xgene_enet_link_state +}; + +struct xgene_port_ops xgene_sgport_ops = { + .reset = xgene_enet_reset, + .cle_bypass = xgene_enet_cle_bypass, + .shutdown = xgene_enet_shutdown +}; diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.h b/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.h new file mode 100644 index 0000000..de43246 --- /dev/null +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_sgmac.h @@ -0,0 +1,41 @@ +/* Applied Micro X-Gene SoC Ethernet Driver + * + * Copyright (c) 2014, Applied Micro Circuits Corporation + * Authors: Iyappan Subramanian + * Keyur Chudgar + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the + * Free Software Foundation; either version 2 of the License, or (at your + * option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program. If not, see . + */ + +#ifndef __XGENE_ENET_SGMAC_H__ +#define __XGENE_ENET_SGMAC_H__ + +#define PHY_ADDR(src) (((src)<<8) & GENMASK(12, 8)) +#define REG_ADDR(src) ((src) & GENMASK(4, 0)) +#define PHY_CONTROL(src) ((src) & GENMASK(15, 0)) +#define INT_PHY_ADDR 0x1e +#define SGMII_TBI_CONTROL_ADDR 0x44 +#define SGMII_CONTROL_ADDR 0x00 +#define SGMII_STATUS_ADDR 0x04 +#define SGMII_BASE_PAGE_ABILITY_ADDR 0x14 +#define AUTO_NEG_COMPLETE BIT(5) +#define LINK_STATUS BIT(2) +#define LINK_UP BIT(15) +#define MPA_IDLE_WITH_QMI_EMPTY BIT(12) +#define SG_RX_DV_GATE_REG_0_ADDR 0x0dfc + +extern struct xgene_mac_ops xgene_sgmac_ops; +extern struct xgene_port_ops xgene_sgport_ops; + +#endif /* __XGENE_ENET_SGMAC_H__ */ diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.h b/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.h index dcb2087..5a5296a 100644 --- a/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.h +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_xgmac.h @@ -47,9 +47,6 @@ #define XG_ENET_SPARE_CFG_REG_1_ADDR 0x0410 #define XGENET_RX_DV_GATE_REG_0_ADDR 0x0804 -#define PHY_POLL_LINK_ON (10 * HZ) -#define PHY_POLL_LINK_OFF (PHY_POLL_LINK_ON / 5) - extern struct xgene_mac_ops xgene_xgmac_ops; extern struct xgene_port_ops xgene_xgport_ops; -- cgit v0.10.2 From 5e6a024bebea5bad6b787cf2c0ee28116b4147f0 Mon Sep 17 00:00:00 2001 From: Iyappan Subramanian Date: Mon, 13 Oct 2014 17:05:35 -0700 Subject: drivers: net: xgene: Add SGMII based 1GbE ethtool support Signed-off-by: Iyappan Subramanian Signed-off-by: Keyur Chudgar Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c b/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c index c1c997b..416d6eb 100644 --- a/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c +++ b/drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c @@ -64,16 +64,25 @@ static int xgene_get_settings(struct net_device *ndev, struct ethtool_cmd *cmd) return -ENODEV; return phy_ethtool_gset(phydev, cmd); + } else if (pdata->phy_mode == PHY_INTERFACE_MODE_SGMII) { + cmd->supported = SUPPORTED_1000baseT_Full | + SUPPORTED_Autoneg | SUPPORTED_MII; + cmd->advertising = cmd->supported; + ethtool_cmd_speed_set(cmd, SPEED_1000); + cmd->duplex = DUPLEX_FULL; + cmd->port = PORT_MII; + cmd->transceiver = XCVR_INTERNAL; + cmd->autoneg = AUTONEG_ENABLE; + } else { + cmd->supported = SUPPORTED_10000baseT_Full | SUPPORTED_FIBRE; + cmd->advertising = cmd->supported; + ethtool_cmd_speed_set(cmd, SPEED_10000); + cmd->duplex = DUPLEX_FULL; + cmd->port = PORT_FIBRE; + cmd->transceiver = XCVR_INTERNAL; + cmd->autoneg = AUTONEG_DISABLE; } - cmd->supported = SUPPORTED_10000baseT_Full | SUPPORTED_FIBRE; - cmd->advertising = cmd->supported; - ethtool_cmd_speed_set(cmd, SPEED_10000); - cmd->duplex = DUPLEX_FULL; - cmd->port = PORT_FIBRE; - cmd->transceiver = XCVR_EXTERNAL; - cmd->autoneg = AUTONEG_DISABLE; - return 0; } -- cgit v0.10.2 From 77b3a4dcde4f770a0f3edbe16dd423b3d0717318 Mon Sep 17 00:00:00 2001 From: Guenter Roeck Date: Tue, 14 Oct 2014 11:21:04 -0700 Subject: dsa: mv88e6171: Fix tag_protocol check tag_protocol is now an enum, so drivers have to check against it. Cc: Andrew Lunn Signed-off-by: Guenter Roeck Acked-by: Florian Fainelli Acked-by: Andrew Lunn Signed-off-by: David S. Miller diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c index 6365e30..1020a7a 100644 --- a/drivers/net/dsa/mv88e6171.c +++ b/drivers/net/dsa/mv88e6171.c @@ -206,7 +206,7 @@ static int mv88e6171_setup_port(struct dsa_switch *ds, int p) */ val = 0x0433; if (dsa_is_cpu_port(ds, p)) { - if (ds->dst->tag_protocol == htons(ETH_P_EDSA)) + if (ds->dst->tag_protocol == DSA_TAG_PROTO_EDSA) val |= 0x3300; else val |= 0x0100; -- cgit v0.10.2 From 8c2a7a5d2c6ec6c2a95fe22a6d3af1db07840da8 Mon Sep 17 00:00:00 2001 From: Giuseppe CAVALLARO Date: Tue, 14 Oct 2014 08:11:54 +0200 Subject: stmmac: platform: fix FIXED_PHY support. On several STi platforms: e.g. stihxxx-b2120 an Ethernet switch is embedded and connected to the stmmac via RGMII mode. So this is managed by using the FIXED_PHY. In that case, the support in the platform needs to be fixed to allow the stmmac to dialog with the switch via fixed-link by using phy_bus_name property. Signed-off-by: Giuseppe Cavallaro Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c index 6521717..4894500 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c @@ -160,11 +160,16 @@ static int stmmac_probe_config_dt(struct platform_device *pdev, if (of_property_read_u32(np, "snps,phy-addr", &plat->phy_addr) == 0) dev_warn(&pdev->dev, "snps,phy-addr property is deprecated\n"); - plat->mdio_bus_data = devm_kzalloc(&pdev->dev, - sizeof(struct stmmac_mdio_bus_data), - GFP_KERNEL); - - plat->force_sf_dma_mode = of_property_read_bool(np, "snps,force_sf_dma_mode"); + if (plat->phy_bus_name) + plat->mdio_bus_data = NULL; + else + plat->mdio_bus_data = + devm_kzalloc(&pdev->dev, + sizeof(struct stmmac_mdio_bus_data), + GFP_KERNEL); + + plat->force_sf_dma_mode = + of_property_read_bool(np, "snps,force_sf_dma_mode"); /* Set the maxmtu to a default of JUMBO_LEN in case the * parameter is not present in the device tree. -- cgit v0.10.2 From 160e1fd10a287bb805745ea4e5b8bb383b686b7f Mon Sep 17 00:00:00 2001 From: Giuseppe CAVALLARO Date: Tue, 14 Oct 2014 08:12:55 +0200 Subject: stmmac: make the STi Layer compatible to STiH407 This adds the missing compatibility to the STiH407 SoC. Signed-off-by: Giuseppe Cavallaro Signed-off-by: David S. Miller diff --git a/Documentation/devicetree/bindings/net/sti-dwmac.txt b/Documentation/devicetree/bindings/net/sti-dwmac.txt index 3dd3d0b..8c84d9a 100644 --- a/Documentation/devicetree/bindings/net/sti-dwmac.txt +++ b/Documentation/devicetree/bindings/net/sti-dwmac.txt @@ -3,8 +3,8 @@ STMicroelectronics SoC DWMAC glue layer controller The device node has following properties. Required properties: - - compatible : Can be "st,stih415-dwmac", "st,stih416-dwmac" or - "st,stid127-dwmac". + - compatible : Can be "st,stih415-dwmac", "st,stih416-dwmac", + "st,stid127-dwmac", "st,stih407-dwmac". - reg : Offset of the glue configuration register map in system configuration regmap pointed by st,syscon property and size. diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c index 4894500..0d6b9ad 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c @@ -40,6 +40,7 @@ static const struct of_device_id stmmac_dt_ids[] = { { .compatible = "st,stih415-dwmac", .data = &sti_gmac_data}, { .compatible = "st,stih416-dwmac", .data = &sti_gmac_data}, { .compatible = "st,stid127-dwmac", .data = &sti_gmac_data}, + { .compatible = "st,stih407-dwmac", .data = &sti_gmac_data}, #endif #ifdef CONFIG_DWMAC_SOCFPGA { .compatible = "altr,socfpga-stmmac", .data = &socfpga_gmac_data }, -- cgit v0.10.2 From 53b26b9bc9a547bf10135a8079e5ae88f354b9f6 Mon Sep 17 00:00:00 2001 From: Giuseppe CAVALLARO Date: Tue, 14 Oct 2014 08:12:56 +0200 Subject: stmmac: dwmac-sti: review the glue-logic for STi4xx and STiD127 SoCs This patch is to review the whole glue logic adopted on STi SoCs that was bugged. In the old glue-logic there was a lot of confusion when setup the retiming especially for STiD127 where, for example, the bits 6 and 7 (in the GMAC control register) have a different meaning of what is used for STiH4xx SoCs. So we cannot adopt the same glue for all these SoCs. Moreover, GiGa on STiD127 didn't work and, for all the SoCs, the RGMII couldn't run when the speed was 10Mbps (because the clock was not properly managed). Note that the phy clock needs to be provided by the platform as well as documented in the related binding file (updated as consequence). The old code supported too many configurations never adopted and validated. This made the code very complex to maintain and debug in case of issues. The patch simplifies all the configurations as commented in the tables inside the file and obviously it has been tested on all the boards based on the SoCs mentioned. With this patch, the dwmac-sti is also ready to support new configurations that will be available on next SoC generations. Signed-off-by: Giuseppe Cavallaro Cc: Srinivas Kandagatla Signed-off-by: David S. Miller diff --git a/Documentation/devicetree/bindings/net/sti-dwmac.txt b/Documentation/devicetree/bindings/net/sti-dwmac.txt index 8c84d9a..6762a6b 100644 --- a/Documentation/devicetree/bindings/net/sti-dwmac.txt +++ b/Documentation/devicetree/bindings/net/sti-dwmac.txt @@ -1,58 +1,65 @@ STMicroelectronics SoC DWMAC glue layer controller +This file documents differences between the core properties in +Documentation/devicetree/bindings/net/stmmac.txt +and what is needed on STi platforms to program the stmmac glue logic. + The device node has following properties. Required properties: - compatible : Can be "st,stih415-dwmac", "st,stih416-dwmac", - "st,stid127-dwmac", "st,stih407-dwmac". - - reg : Offset of the glue configuration register map in system + "st,stih407-dwmac", "st,stid127-dwmac". + - reg : Offset of the glue configuration register map in system configuration regmap pointed by st,syscon property and size. - - - reg-names : Should be "sti-ethconf". - - - st,syscon : Should be phandle to system configuration node which + - st,syscon : Should be phandle to system configuration node which encompases this glue registers. + - st,gmac_en: this is to enable the gmac into a dedicated sysctl control + register available on STiH407 SoC. + - sti-ethconf: this is the gmac glue logic register to enable the GMAC, + select among the different modes and program the clk retiming. + - pinctrl-0: pin-control for all the MII mode supported. - - st,tx-retime-src: On STi Parts for Giga bit speeds, 125Mhz clocks can be - wired up in from different sources. One via TXCLK pin and other via CLK_125 - pin. This wiring is totally board dependent. However the retiming glue - logic should be configured accordingly. Possible values for this property - - "txclk" - if 125Mhz clock is wired up via txclk line. - "clk_125" - if 125Mhz clock is wired up via clk_125 line. - - This property is only valid for Giga bit setup( GMII, RGMII), and it is - un-used for non-giga bit (MII and RMII) setups. Also note that internal - clockgen can not generate stable 125Mhz clock. - - - st,ext-phyclk: This boolean property indicates who is generating the clock - for tx and rx. This property is only valid for RMII case where the clock can - be generated from the MAC or PHY. - - - clock-names: should be "sti-ethclk". - - clocks: Should point to ethernet clockgen which can generate phyclk. - +Optional properties: + - resets : phandle pointing to the system reset controller with correct + reset line index for ethernet reset. + - st,ext-phyclk: valid only for RMII where PHY can generate 50MHz clock or + MAC can generate it. + - st,tx-retime-src: This specifies which clk is wired up to the mac for + retimeing tx lines. This is totally board dependent and can take one of the + posssible values from "txclk", "clk_125" or "clkgen". + If not passed, the internal clock will be used by default. + - sti-ethclk: this is the phy clock. + - sti-clkconf: this is an extra sysconfig register, available in new SoCs, + to program the clk retiming. + - st,gmac_en: to enable the GMAC, this only is present in some SoCs; e.g. + STiH407. Example: -ethernet0: dwmac@fe810000 { - device_type = "network"; - compatible = "st,stih416-dwmac", "snps,dwmac", "snps,dwmac-3.710"; - reg = <0xfe810000 0x8000>, <0x8bc 0x4>; - reg-names = "stmmaceth", "sti-ethconf"; - interrupts = <0 133 0>, <0 134 0>, <0 135 0>; - interrupt-names = "macirq", "eth_wake_irq", "eth_lpi"; - phy-mode = "mii"; +ethernet0: dwmac@9630000 { + device_type = "network"; + status = "disabled"; + compatible = "st,stih407-dwmac", "snps,dwmac", "snps,dwmac-3.710"; + reg = <0x9630000 0x8000>, <0x80 0x4>; + reg-names = "stmmaceth", "sti-ethconf"; - st,syscon = <&syscfg_rear>; + st,syscon = <&syscfg_sbc_reg>; + st,gmac_en; + resets = <&softreset STIH407_ETH1_SOFTRESET>; + reset-names = "stmmaceth"; - snps,pbl = <32>; + interrupts = , + , + ; + interrupt-names = "macirq", "eth_wake_irq", "eth_lpi"; + + snps,pbl = <32>; snps,mixed-burst; - resets = <&softreset STIH416_ETH0_SOFTRESET>; - reset-names = "stmmaceth"; - pinctrl-0 = <&pinctrl_mii0>; - pinctrl-names = "default"; - clocks = <&CLK_S_GMAC0_PHY>; - clock-names = "stmmaceth"; + pinctrl-names = "default"; + pinctrl-0 = <&pinctrl_rgmii1>; + + clock-names = "stmmaceth", "sti-ethclk"; + clocks = <&CLK_S_C0_FLEXGEN CLK_EXT2F_A9>, + <&CLK_S_C0_FLEXGEN CLK_ETH_PHY>; }; diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c index 552bbc1..ccfe7e5 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sti.c @@ -3,7 +3,7 @@ * * Copyright (C) 2003-2014 STMicroelectronics (R&D) Limited * Author: Srinivas Kandagatla - * + * Contributors: Giuseppe Cavallaro * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by @@ -22,45 +22,22 @@ #include #include +#define DWMAC_125MHZ 125000000 +#define DWMAC_50MHZ 50000000 +#define DWMAC_25MHZ 25000000 +#define DWMAC_2_5MHZ 2500000 + +#define IS_PHY_IF_MODE_RGMII(iface) (iface == PHY_INTERFACE_MODE_RGMII || \ + iface == PHY_INTERFACE_MODE_RGMII_ID || \ + iface == PHY_INTERFACE_MODE_RGMII_RXID || \ + iface == PHY_INTERFACE_MODE_RGMII_TXID) + +#define IS_PHY_IF_MODE_GBIT(iface) (IS_PHY_IF_MODE_RGMII(iface) || \ + iface == PHY_INTERFACE_MODE_GMII) + +/* STiH4xx register definitions (STiH415/STiH416/STiH407/STiH410 families) */ + /** - * STi GMAC glue logic. - * -------------------- - * - * _ - * | \ - * --------|0 \ ETH_SEL_INTERNAL_NOTEXT_PHYCLK - * phyclk | |___________________________________________ - * | | | (phyclk-in) - * --------|1 / | - * int-clk |_ / | - * | _ - * | | \ - * |_______|1 \ ETH_SEL_TX_RETIME_CLK - * | |___________________________ - * | | (tx-retime-clk) - * _______|0 / - * | |_ / - * _ | - * | \ | - * --------|0 \ | - * clk_125 | |__| - * | | ETH_SEL_TXCLK_NOT_CLK125 - * --------|1 / - * txclk |_ / - * - * - * ETH_SEL_INTERNAL_NOTEXT_PHYCLK is valid only for RMII where PHY can - * generate 50MHz clock or MAC can generate it. - * This bit is configured by "st,ext-phyclk" property. - * - * ETH_SEL_TXCLK_NOT_CLK125 is only valid for gigabit modes, where the 125Mhz - * clock either comes from clk-125 pin or txclk pin. This configuration is - * totally driven by the board wiring. This bit is configured by - * "st,tx-retime-src" property. - * - * TXCLK configuration is different for different phy interface modes - * and changes according to link speed in modes like RGMII. - * * Below table summarizes the clock requirement and clock sources for * supported phy interface modes with link speeds. * ________________________________________________ @@ -74,44 +51,58 @@ * ------------------------------------------------ *| RGMII | 125Mhz | 25Mhz | *| | clk-125/txclk | clkgen | + *| | clkgen | | * ------------------------------------------------ *| RMII | n/a | 25Mhz | *| | |clkgen/phyclk-in | * ------------------------------------------------ * - * TX lines are always retimed with a clk, which can vary depending - * on the board configuration. Below is the table of these bits - * in eth configuration register depending on source of retime clk. - * - *--------------------------------------------------------------- - * src | tx_rt_clk | int_not_ext_phyclk | txclk_n_clk125| - *--------------------------------------------------------------- - * txclk | 0 | n/a | 1 | - *--------------------------------------------------------------- - * ck_125| 0 | n/a | 0 | - *--------------------------------------------------------------- - * phyclk| 1 | 0 | n/a | - *--------------------------------------------------------------- - * clkgen| 1 | 1 | n/a | - *--------------------------------------------------------------- + * Register Configuration + *------------------------------- + * src |BIT(8)| BIT(7)| BIT(6)| + *------------------------------- + * txclk | 0 | n/a | 1 | + *------------------------------- + * ck_125| 0 | n/a | 0 | + *------------------------------- + * phyclk| 1 | 0 | n/a | + *------------------------------- + * clkgen| 1 | 1 | n/a | + *------------------------------- */ - /* Register definition */ +#define STIH4XX_RETIME_SRC_MASK GENMASK(8, 6) +#define STIH4XX_ETH_SEL_TX_RETIME_CLK BIT(8) +#define STIH4XX_ETH_SEL_INTERNAL_NOTEXT_PHYCLK BIT(7) +#define STIH4XX_ETH_SEL_TXCLK_NOT_CLK125 BIT(6) + +/* STiD127 register definitions */ - /* 3 bits [8:6] - * [6:6] ETH_SEL_TXCLK_NOT_CLK125 - * [7:7] ETH_SEL_INTERNAL_NOTEXT_PHYCLK - * [8:8] ETH_SEL_TX_RETIME_CLK - * - */ +/** + *----------------------- + * src |BIT(6)| BIT(7)| + *----------------------- + * MII | 1 | n/a | + *----------------------- + * RMII | n/a | 1 | + * clkgen| | | + *----------------------- + * RMII | n/a | 0 | + * phyclk| | | + *----------------------- + * RGMII | 1 | n/a | + * clkgen| | | + *----------------------- + */ -#define TX_RETIME_SRC_MASK GENMASK(8, 6) -#define ETH_SEL_TX_RETIME_CLK BIT(8) -#define ETH_SEL_INTERNAL_NOTEXT_PHYCLK BIT(7) -#define ETH_SEL_TXCLK_NOT_CLK125 BIT(6) +#define STID127_RETIME_SRC_MASK GENMASK(7, 6) +#define STID127_ETH_SEL_INTERNAL_NOTEXT_PHYCLK BIT(7) +#define STID127_ETH_SEL_INTERNAL_NOTEXT_TXCLK BIT(6) -#define ENMII_MASK GENMASK(5, 5) -#define ENMII BIT(5) +#define ENMII_MASK GENMASK(5, 5) +#define ENMII BIT(5) +#define EN_MASK GENMASK(1, 1) +#define EN BIT(1) /** * 3 bits [4:2] @@ -120,29 +111,23 @@ * 010-SGMII * 100-RMII */ -#define MII_PHY_SEL_MASK GENMASK(4, 2) -#define ETH_PHY_SEL_RMII BIT(4) -#define ETH_PHY_SEL_SGMII BIT(3) -#define ETH_PHY_SEL_RGMII BIT(2) -#define ETH_PHY_SEL_GMII 0x0 -#define ETH_PHY_SEL_MII 0x0 - -#define IS_PHY_IF_MODE_RGMII(iface) (iface == PHY_INTERFACE_MODE_RGMII || \ - iface == PHY_INTERFACE_MODE_RGMII_ID || \ - iface == PHY_INTERFACE_MODE_RGMII_RXID || \ - iface == PHY_INTERFACE_MODE_RGMII_TXID) - -#define IS_PHY_IF_MODE_GBIT(iface) (IS_PHY_IF_MODE_RGMII(iface) || \ - iface == PHY_INTERFACE_MODE_GMII) +#define MII_PHY_SEL_MASK GENMASK(4, 2) +#define ETH_PHY_SEL_RMII BIT(4) +#define ETH_PHY_SEL_SGMII BIT(3) +#define ETH_PHY_SEL_RGMII BIT(2) +#define ETH_PHY_SEL_GMII 0x0 +#define ETH_PHY_SEL_MII 0x0 struct sti_dwmac { - int interface; - bool ext_phyclk; - bool is_tx_retime_src_clk_125; - struct clk *clk; - int reg; + int interface; /* MII interface */ + bool ext_phyclk; /* Clock from external PHY */ + u32 tx_retime_src; /* TXCLK Retiming*/ + struct clk *clk; /* PHY clock */ + int ctrl_reg; /* GMAC glue-logic control register */ + int clk_sel_reg; /* GMAC ext clk selection register */ struct device *dev; struct regmap *regmap; + u32 speed; }; static u32 phy_intf_sels[] = { @@ -162,74 +147,133 @@ enum { TX_RETIME_SRC_CLKGEN, }; -static const char *const tx_retime_srcs[] = { - [TX_RETIME_SRC_NA] = "", - [TX_RETIME_SRC_TXCLK] = "txclk", - [TX_RETIME_SRC_CLK_125] = "clk_125", - [TX_RETIME_SRC_PHYCLK] = "phyclk", - [TX_RETIME_SRC_CLKGEN] = "clkgen", -}; - -static u32 tx_retime_val[] = { - [TX_RETIME_SRC_TXCLK] = ETH_SEL_TXCLK_NOT_CLK125, +static u32 stih4xx_tx_retime_val[] = { + [TX_RETIME_SRC_TXCLK] = STIH4XX_ETH_SEL_TXCLK_NOT_CLK125, [TX_RETIME_SRC_CLK_125] = 0x0, - [TX_RETIME_SRC_PHYCLK] = ETH_SEL_TX_RETIME_CLK, - [TX_RETIME_SRC_CLKGEN] = ETH_SEL_TX_RETIME_CLK | - ETH_SEL_INTERNAL_NOTEXT_PHYCLK, + [TX_RETIME_SRC_PHYCLK] = STIH4XX_ETH_SEL_TX_RETIME_CLK, + [TX_RETIME_SRC_CLKGEN] = STIH4XX_ETH_SEL_TX_RETIME_CLK + | STIH4XX_ETH_SEL_INTERNAL_NOTEXT_PHYCLK, }; -static void setup_retime_src(struct sti_dwmac *dwmac, u32 spd) +static void stih4xx_fix_retime_src(void *priv, u32 spd) { - u32 src = 0, freq = 0; - - if (spd == SPEED_100) { - if (dwmac->interface == PHY_INTERFACE_MODE_MII || - dwmac->interface == PHY_INTERFACE_MODE_GMII) { - src = TX_RETIME_SRC_TXCLK; - } else if (dwmac->interface == PHY_INTERFACE_MODE_RMII) { - if (dwmac->ext_phyclk) { - src = TX_RETIME_SRC_PHYCLK; - } else { - src = TX_RETIME_SRC_CLKGEN; - freq = 50000000; - } - - } else if (IS_PHY_IF_MODE_RGMII(dwmac->interface)) { + struct sti_dwmac *dwmac = priv; + u32 src = dwmac->tx_retime_src; + u32 reg = dwmac->ctrl_reg; + u32 freq = 0; + + if (dwmac->interface == PHY_INTERFACE_MODE_MII) { + src = TX_RETIME_SRC_TXCLK; + } else if (dwmac->interface == PHY_INTERFACE_MODE_RMII) { + if (dwmac->ext_phyclk) { + src = TX_RETIME_SRC_PHYCLK; + } else { src = TX_RETIME_SRC_CLKGEN; - freq = 25000000; + freq = DWMAC_50MHZ; } + } else if (IS_PHY_IF_MODE_RGMII(dwmac->interface)) { + /* On GiGa clk source can be either ext or from clkgen */ + if (spd == SPEED_1000) { + freq = DWMAC_125MHZ; + } else { + /* Switch to clkgen for these speeds */ + src = TX_RETIME_SRC_CLKGEN; + if (spd == SPEED_100) + freq = DWMAC_25MHZ; + else if (spd == SPEED_10) + freq = DWMAC_2_5MHZ; + } + } - if (src == TX_RETIME_SRC_CLKGEN && dwmac->clk) - clk_set_rate(dwmac->clk, freq); + if (src == TX_RETIME_SRC_CLKGEN && dwmac->clk && freq) + clk_set_rate(dwmac->clk, freq); - } else if (spd == SPEED_1000) { - if (dwmac->is_tx_retime_src_clk_125) - src = TX_RETIME_SRC_CLK_125; - else - src = TX_RETIME_SRC_TXCLK; + regmap_update_bits(dwmac->regmap, reg, STIH4XX_RETIME_SRC_MASK, + stih4xx_tx_retime_val[src]); +} + +static void stid127_fix_retime_src(void *priv, u32 spd) +{ + struct sti_dwmac *dwmac = priv; + u32 reg = dwmac->ctrl_reg; + u32 freq = 0; + u32 val = 0; + + if (dwmac->interface == PHY_INTERFACE_MODE_MII) { + val = STID127_ETH_SEL_INTERNAL_NOTEXT_TXCLK; + } else if (dwmac->interface == PHY_INTERFACE_MODE_RMII) { + if (!dwmac->ext_phyclk) { + val = STID127_ETH_SEL_INTERNAL_NOTEXT_PHYCLK; + freq = DWMAC_50MHZ; + } + } else if (IS_PHY_IF_MODE_RGMII(dwmac->interface)) { + val = STID127_ETH_SEL_INTERNAL_NOTEXT_TXCLK; + if (spd == SPEED_1000) + freq = DWMAC_125MHZ; + else if (spd == SPEED_100) + freq = DWMAC_25MHZ; + else if (spd == SPEED_10) + freq = DWMAC_2_5MHZ; } - regmap_update_bits(dwmac->regmap, dwmac->reg, - TX_RETIME_SRC_MASK, tx_retime_val[src]); + if (dwmac->clk && freq) + clk_set_rate(dwmac->clk, freq); + + regmap_update_bits(dwmac->regmap, reg, STID127_RETIME_SRC_MASK, val); } -static void sti_dwmac_exit(struct platform_device *pdev, void *priv) +static void sti_dwmac_ctrl_init(struct sti_dwmac *dwmac) { - struct sti_dwmac *dwmac = priv; + struct regmap *regmap = dwmac->regmap; + int iface = dwmac->interface; + struct device *dev = dwmac->dev; + struct device_node *np = dev->of_node; + u32 reg = dwmac->ctrl_reg; + u32 val; if (dwmac->clk) - clk_disable_unprepare(dwmac->clk); + clk_prepare_enable(dwmac->clk); + + if (of_property_read_bool(np, "st,gmac_en")) + regmap_update_bits(regmap, reg, EN_MASK, EN); + + regmap_update_bits(regmap, reg, MII_PHY_SEL_MASK, phy_intf_sels[iface]); + + val = (iface == PHY_INTERFACE_MODE_REVMII) ? 0 : ENMII; + regmap_update_bits(regmap, reg, ENMII_MASK, val); +} + +static int stix4xx_init(struct platform_device *pdev, void *priv) +{ + struct sti_dwmac *dwmac = priv; + u32 spd = dwmac->speed; + + sti_dwmac_ctrl_init(dwmac); + + stih4xx_fix_retime_src(priv, spd); + + return 0; } -static void sti_fix_mac_speed(void *priv, unsigned int spd) +static int stid127_init(struct platform_device *pdev, void *priv) { struct sti_dwmac *dwmac = priv; + u32 spd = dwmac->speed; - setup_retime_src(dwmac, spd); + sti_dwmac_ctrl_init(dwmac); - return; + stid127_fix_retime_src(priv, spd); + + return 0; } +static void sti_dwmac_exit(struct platform_device *pdev, void *priv) +{ + struct sti_dwmac *dwmac = priv; + + if (dwmac->clk) + clk_disable_unprepare(dwmac->clk); +} static int sti_dwmac_parse_data(struct sti_dwmac *dwmac, struct platform_device *pdev) { @@ -245,6 +289,13 @@ static int sti_dwmac_parse_data(struct sti_dwmac *dwmac, res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "sti-ethconf"); if (!res) return -ENODATA; + dwmac->ctrl_reg = res->start; + + /* clk selection from extra syscfg register */ + dwmac->clk_sel_reg = -ENXIO; + res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "sti-clkconf"); + if (res) + dwmac->clk_sel_reg = res->start; regmap = syscon_regmap_lookup_by_phandle(np, "st,syscon"); if (IS_ERR(regmap)) @@ -253,53 +304,31 @@ static int sti_dwmac_parse_data(struct sti_dwmac *dwmac, dwmac->dev = dev; dwmac->interface = of_get_phy_mode(np); dwmac->regmap = regmap; - dwmac->reg = res->start; dwmac->ext_phyclk = of_property_read_bool(np, "st,ext-phyclk"); - dwmac->is_tx_retime_src_clk_125 = false; + dwmac->tx_retime_src = TX_RETIME_SRC_NA; + dwmac->speed = SPEED_100; if (IS_PHY_IF_MODE_GBIT(dwmac->interface)) { const char *rs; + dwmac->tx_retime_src = TX_RETIME_SRC_CLKGEN; err = of_property_read_string(np, "st,tx-retime-src", &rs); - if (err < 0) { - dev_err(dev, "st,tx-retime-src not specified\n"); - return err; - } + if (err < 0) + dev_warn(dev, "Use internal clock source\n"); if (!strcasecmp(rs, "clk_125")) - dwmac->is_tx_retime_src_clk_125 = true; + dwmac->tx_retime_src = TX_RETIME_SRC_CLK_125; + else if (!strcasecmp(rs, "txclk")) + dwmac->tx_retime_src = TX_RETIME_SRC_TXCLK; + + dwmac->speed = SPEED_1000; } dwmac->clk = devm_clk_get(dev, "sti-ethclk"); - - if (IS_ERR(dwmac->clk)) + if (IS_ERR(dwmac->clk)) { + dev_warn(dev, "No phy clock provided...\n"); dwmac->clk = NULL; - - return 0; -} - -static int sti_dwmac_init(struct platform_device *pdev, void *priv) -{ - struct sti_dwmac *dwmac = priv; - struct regmap *regmap = dwmac->regmap; - int iface = dwmac->interface; - u32 reg = dwmac->reg; - u32 val, spd; - - if (dwmac->clk) - clk_prepare_enable(dwmac->clk); - - regmap_update_bits(regmap, reg, MII_PHY_SEL_MASK, phy_intf_sels[iface]); - - val = (iface == PHY_INTERFACE_MODE_REVMII) ? 0 : ENMII; - regmap_update_bits(regmap, reg, ENMII_MASK, val); - - if (IS_PHY_IF_MODE_GBIT(iface)) - spd = SPEED_1000; - else - spd = SPEED_100; - - setup_retime_src(dwmac, spd); + } return 0; } @@ -322,9 +351,16 @@ static void *sti_dwmac_setup(struct platform_device *pdev) return dwmac; } -const struct stmmac_of_data sti_gmac_data = { - .fix_mac_speed = sti_fix_mac_speed, +const struct stmmac_of_data stih4xx_dwmac_data = { + .fix_mac_speed = stih4xx_fix_retime_src, + .setup = sti_dwmac_setup, + .init = stix4xx_init, + .exit = sti_dwmac_exit, +}; + +const struct stmmac_of_data stid127_dwmac_data = { + .fix_mac_speed = stid127_fix_retime_src, .setup = sti_dwmac_setup, - .init = sti_dwmac_init, + .init = stid127_init, .exit = sti_dwmac_exit, }; -- cgit v0.10.2 From 22c0b963d7400971f4c5a1a67b083e3742996640 Mon Sep 17 00:00:00 2001 From: Hariprasad Shenai Date: Wed, 15 Oct 2014 01:54:14 +0530 Subject: cxgb4: Fix FW flash logic using ethtool Use t4_fw_upgrade instead of t4_load_fw to write firmware into FLASH, since t4_load_fw doesn't co-ordinate with the firmware and the adapter can get hosed enough to require a power cycle of the system. Based on original work by Casey Leedom Signed-off-by: Hariprasad Shenai Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h index 410ed58..3c481b2 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h @@ -986,6 +986,8 @@ static inline int t4_memory_write(struct adapter *adap, int mtype, u32 addr, int t4_seeprom_wp(struct adapter *adapter, bool enable); int get_vpd_params(struct adapter *adapter, struct vpd_params *p); int t4_load_fw(struct adapter *adapter, const u8 *fw_data, unsigned int size); +int t4_fw_upgrade(struct adapter *adap, unsigned int mbox, + const u8 *fw_data, unsigned int size, int force); unsigned int t4_flash_cfg_addr(struct adapter *adapter); int t4_get_fw_version(struct adapter *adapter, u32 *vers); int t4_get_tp_version(struct adapter *adapter, u32 *vers); diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c index 5b38e95..fed8f26 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c @@ -2929,16 +2929,26 @@ static int set_flash(struct net_device *netdev, struct ethtool_flash *ef) int ret; const struct firmware *fw; struct adapter *adap = netdev2adap(netdev); + unsigned int mbox = FW_PCIE_FW_MASTER_MASK + 1; ef->data[sizeof(ef->data) - 1] = '\0'; ret = request_firmware(&fw, ef->data, adap->pdev_dev); if (ret < 0) return ret; - ret = t4_load_fw(adap, fw->data, fw->size); + /* If the adapter has been fully initialized then we'll go ahead and + * try to get the firmware's cooperation in upgrading to the new + * firmware image otherwise we'll try to do the entire job from the + * host ... and we always "force" the operation in this path. + */ + if (adap->flags & FULL_INIT_DONE) + mbox = adap->mbox; + + ret = t4_fw_upgrade(adap, mbox, fw->data, fw->size, 1); release_firmware(fw); if (!ret) - dev_info(adap->pdev_dev, "loaded firmware %s\n", ef->data); + dev_info(adap->pdev_dev, "loaded firmware %s," + " reload cxgb4 driver\n", ef->data); return ret; } diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c index 1fff149..a9d9d74 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c +++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c @@ -37,8 +37,6 @@ #include "t4_regs.h" #include "t4fw_api.h" -static int t4_fw_upgrade(struct adapter *adap, unsigned int mbox, - const u8 *fw_data, unsigned int size, int force); /** * t4_wait_op_done_val - wait until an operation is completed * @adapter: the adapter performing the operation @@ -3076,8 +3074,8 @@ static int t4_fw_restart(struct adapter *adap, unsigned int mbox, int reset) * positive errno indicates that the adapter is ~probably~ intact, a * negative errno indicates that things are looking bad ... */ -static int t4_fw_upgrade(struct adapter *adap, unsigned int mbox, - const u8 *fw_data, unsigned int size, int force) +int t4_fw_upgrade(struct adapter *adap, unsigned int mbox, + const u8 *fw_data, unsigned int size, int force) { const struct fw_hdr *fw_hdr = (const struct fw_hdr *)fw_data; int reset, ret; -- cgit v0.10.2 From dee49f203a7feef5d00c416b7dc7e34a7caba8e1 Mon Sep 17 00:00:00 2001 From: Cong Wang Date: Tue, 14 Oct 2014 12:35:08 -0700 Subject: rds: avoid calling sock_kfree_s() on allocation failure It is okay to free a NULL pointer but not okay to mischarge the socket optmem accounting. Compile test only. Reported-by: rucsoftsec@gmail.com Cc: Chien Yen Cc: Stephen Hemminger Signed-off-by: Cong Wang Signed-off-by: Cong Wang Signed-off-by: David S. Miller diff --git a/net/rds/rdma.c b/net/rds/rdma.c index 4e37c1c..40084d8 100644 --- a/net/rds/rdma.c +++ b/net/rds/rdma.c @@ -564,12 +564,12 @@ int rds_cmsg_rdma_args(struct rds_sock *rs, struct rds_message *rm, if (rs->rs_bound_addr == 0) { ret = -ENOTCONN; /* XXX not a great errno */ - goto out; + goto out_ret; } if (args->nr_local > UIO_MAXIOV) { ret = -EMSGSIZE; - goto out; + goto out_ret; } /* Check whether to allocate the iovec area */ @@ -578,7 +578,7 @@ int rds_cmsg_rdma_args(struct rds_sock *rs, struct rds_message *rm, iovs = sock_kmalloc(rds_rs_to_sk(rs), iov_size, GFP_KERNEL); if (!iovs) { ret = -ENOMEM; - goto out; + goto out_ret; } } @@ -696,6 +696,7 @@ out: if (iovs != iovstack) sock_kfree_s(rds_rs_to_sk(rs), iovs, iov_size); kfree(pages); +out_ret: if (ret) rds_rdma_free_op(op); else -- cgit v0.10.2 From e53da5fbfc02586fe4506ed583069b8205f3e38d Mon Sep 17 00:00:00 2001 From: "David S. Miller" Date: Tue, 14 Oct 2014 17:02:37 -0400 Subject: net: Trap attempts to call sock_kfree_s() with a NULL pointer. Unlike normal kfree() it is never right to call sock_kfree_s() with a NULL pointer, because sock_kfree_s() also has the side effect of discharging the memory from the sockets quota. Signed-off-by: David S. Miller diff --git a/net/core/sock.c b/net/core/sock.c index b4f3ea2..15e0c67 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1718,6 +1718,8 @@ EXPORT_SYMBOL(sock_kmalloc); */ void sock_kfree_s(struct sock *sk, void *mem, int size) { + if (WARN_ON_ONCE(!mem)) + return; kfree(mem); atomic_sub(size, &sk->sk_omem_alloc); } -- cgit v0.10.2 From db404b13617fe0cdb415da55762203d456837912 Mon Sep 17 00:00:00 2001 From: Mark Rustad Date: Tue, 14 Oct 2014 06:28:38 -0700 Subject: genl_magic: Resolve logical-op warnings Resolve "logical 'and' applied to non-boolean constant" warnings" that appear in W=2 builds by adding !! to a bit test. Signed-off-by: Mark Rustad Signed-off-by: Jeff Kirsher Signed-off-by: David S. Miller diff --git a/include/linux/genl_magic_func.h b/include/linux/genl_magic_func.h index c0894dd..667c311 100644 --- a/include/linux/genl_magic_func.h +++ b/include/linux/genl_magic_func.h @@ -178,12 +178,12 @@ static int s_name ## _from_attrs_for_change(struct s_name *s, \ #define __assign(attr_nr, attr_flag, name, nla_type, type, assignment...) \ nla = ntb[attr_nr]; \ if (nla) { \ - if (exclude_invariants && ((attr_flag) & DRBD_F_INVARIANT)) { \ + if (exclude_invariants && !!((attr_flag) & DRBD_F_INVARIANT)) { \ pr_info("<< must not change invariant attr: %s\n", #name); \ return -EEXIST; \ } \ assignment; \ - } else if (exclude_invariants && ((attr_flag) & DRBD_F_INVARIANT)) { \ + } else if (exclude_invariants && !!((attr_flag) & DRBD_F_INVARIANT)) { \ /* attribute missing from payload, */ \ /* which was expected */ \ } else if ((attr_flag) & DRBD_F_REQUIRED) { \ -- cgit v0.10.2 From 2a1ef4b5a72614c72fce0e21f44e996ee8f0ef23 Mon Sep 17 00:00:00 2001 From: Rajesh Borundia Date: Tue, 14 Oct 2014 07:41:45 -0400 Subject: qlcnic: Fix programming number of arguments in a command. o Initially we were programming maximum number of arguments. Instead we should program number of arguments required in a command. o Maximum number of arguments for 82xx adapter is four. Fix it for GET_ESWITCH_STATS command. Signed-off-by: Rajesh Borundia Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c index ffbae29..243752f 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c @@ -32,7 +32,7 @@ static const struct qlcnic_mailbox_metadata qlcnic_mbx_tbl[] = { {QLCNIC_CMD_CONFIGURE_ESWITCH, 4, 1}, {QLCNIC_CMD_GET_MAC_STATS, 4, 1}, {QLCNIC_CMD_GET_ESWITCH_PORT_CONFIG, 4, 3}, - {QLCNIC_CMD_GET_ESWITCH_STATS, 5, 1}, + {QLCNIC_CMD_GET_ESWITCH_STATS, 4, 1}, {QLCNIC_CMD_CONFIG_PORT, 4, 1}, {QLCNIC_CMD_TEMP_SIZE, 4, 4}, {QLCNIC_CMD_GET_TEMP_HDR, 4, 1}, @@ -129,7 +129,7 @@ int qlcnic_82xx_issue_cmd(struct qlcnic_adapter *adapter, } QLCWR32(adapter, QLCNIC_SIGN_CRB_OFFSET, signature); - for (i = 1; i < QLCNIC_CDRP_MAX_ARGS; i++) + for (i = 1; i < cmd->req.num; i++) QLCWR32(adapter, QLCNIC_CDRP_ARG(i), cmd->req.arg[i]); QLCWR32(adapter, QLCNIC_CDRP_CRB_OFFSET, QLCNIC_CDRP_FORM_CMD(cmd->req.arg[0])); -- cgit v0.10.2 From d47d2fdd29cf41543a0c5a522c4cc9463f9627b2 Mon Sep 17 00:00:00 2001 From: Rajesh Borundia Date: Tue, 14 Oct 2014 07:41:46 -0400 Subject: qlcnic: Fix number of arguments in destroy tx context command o Number of arguments taken by destroy tx command is three instead of two. Signed-off-by: Rajesh Borundia Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c index 243752f..6e6f18f 100644 --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c @@ -11,7 +11,7 @@ static const struct qlcnic_mailbox_metadata qlcnic_mbx_tbl[] = { {QLCNIC_CMD_CREATE_RX_CTX, 4, 1}, {QLCNIC_CMD_DESTROY_RX_CTX, 2, 1}, {QLCNIC_CMD_CREATE_TX_CTX, 4, 1}, - {QLCNIC_CMD_DESTROY_TX_CTX, 2, 1}, + {QLCNIC_CMD_DESTROY_TX_CTX, 3, 1}, {QLCNIC_CMD_INTRPT_TEST, 4, 1}, {QLCNIC_CMD_SET_MTU, 4, 1}, {QLCNIC_CMD_READ_PHY, 4, 2}, -- cgit v0.10.2 From 9b462d02d6dd671a9ebdc45caed6fe98a53c0ebe Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Mon, 13 Oct 2014 06:27:47 -0700 Subject: tcp: TCP Small Queues and strange attractors TCP Small queues tries to keep number of packets in qdisc as small as possible, and depends on a tasklet to feed following packets at TX completion time. Choice of tasklet was driven by latencies requirements. Then, TCP stack tries to avoid reorders, by locking flows with outstanding packets in qdisc in a given TX queue. What can happen is that many flows get attracted by a low performing TX queue, and cpu servicing TX completion has to feed packets for all of them, making this cpu 100% busy in softirq mode. This became particularly visible with latest skb->xmit_more support Strategy adopted in this patch is to detect when tcp_wfree() is called from ksoftirqd and let the outstanding queue for this flow being drained before feeding additional packets, so that skb->ooo_okay can be set to allow select_queue() to select the optimal queue : Incoming ACKS are normally handled by different cpus, so this patch gives more chance for these cpus to take over the burden of feeding qdisc with future packets. Tested: lpaa23:~# ./super_netperf 1400 --google-pacing-rate 3028000 -H lpaa24 -l 3600 & lpaa23:~# sar -n DEV 1 10 | grep eth1 06:16:18 AM eth1 595448.00 1190564.00 38381.09 1760253.12 0.00 0.00 1.00 06:16:19 AM eth1 594858.00 1189686.00 38340.76 1758952.72 0.00 0.00 0.00 06:16:20 AM eth1 597017.00 1194019.00 38480.79 1765370.29 0.00 0.00 1.00 06:16:21 AM eth1 595450.00 1190936.00 38380.19 1760805.05 0.00 0.00 0.00 06:16:22 AM eth1 596385.00 1193096.00 38442.56 1763976.29 0.00 0.00 1.00 06:16:23 AM eth1 598155.00 1195978.00 38552.97 1768264.60 0.00 0.00 0.00 06:16:24 AM eth1 594405.00 1188643.00 38312.57 1757414.89 0.00 0.00 1.00 06:16:25 AM eth1 593366.00 1187154.00 38252.16 1755195.83 0.00 0.00 0.00 06:16:26 AM eth1 593188.00 1186118.00 38232.88 1753682.57 0.00 0.00 1.00 06:16:27 AM eth1 596301.00 1192241.00 38440.94 1762733.09 0.00 0.00 0.00 Average: eth1 595457.30 1190843.50 38381.69 1760664.84 0.00 0.00 0.50 lpaa23:~# ./tc -s -d qd sh dev eth1 | grep backlog backlog 7606336b 2513p requeues 167982 backlog 224072b 74p requeues 566 backlog 581376b 192p requeues 5598 backlog 181680b 60p requeues 1070 backlog 5305056b 1753p requeues 110166 // Here, this TX queue is attracting flows backlog 157456b 52p requeues 1758 backlog 672216b 222p requeues 3025 backlog 60560b 20p requeues 24541 backlog 448144b 148p requeues 21258 lpaa23:~# echo 1 >/proc/sys/net/ipv4/tcp_tsq_enable_tcp_wfree_ksoftirqd_detect Immediate jump to full bandwidth, and traffic is properly shard on all tx queues. lpaa23:~# sar -n DEV 1 10 | grep eth1 06:16:46 AM eth1 1397632.00 2795397.00 90081.87 4133031.26 0.00 0.00 1.00 06:16:47 AM eth1 1396874.00 2793614.00 90032.99 4130385.46 0.00 0.00 0.00 06:16:48 AM eth1 1395842.00 2791600.00 89966.46 4127409.67 0.00 0.00 1.00 06:16:49 AM eth1 1395528.00 2791017.00 89946.17 4126551.24 0.00 0.00 0.00 06:16:50 AM eth1 1397891.00 2795716.00 90098.74 4133497.39 0.00 0.00 1.00 06:16:51 AM eth1 1394951.00 2789984.00 89908.96 4125022.51 0.00 0.00 0.00 06:16:52 AM eth1 1394608.00 2789190.00 89886.90 4123851.36 0.00 0.00 1.00 06:16:53 AM eth1 1395314.00 2790653.00 89934.33 4125983.09 0.00 0.00 0.00 06:16:54 AM eth1 1396115.00 2792276.00 89984.25 4128411.21 0.00 0.00 1.00 06:16:55 AM eth1 1396829.00 2793523.00 90030.19 4130250.28 0.00 0.00 0.00 Average: eth1 1396158.40 2792297.00 89987.09 4128439.35 0.00 0.00 0.50 lpaa23:~# tc -s -d qd sh dev eth1 | grep backlog backlog 7900052b 2609p requeues 173287 backlog 878120b 290p requeues 589 backlog 1068884b 354p requeues 5621 backlog 996212b 329p requeues 1088 backlog 984100b 325p requeues 115316 backlog 956848b 316p requeues 1781 backlog 1080996b 357p requeues 3047 backlog 975016b 322p requeues 24571 backlog 990156b 327p requeues 21274 (All 8 TX queues get a fair share of the traffic) Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 0a5d97c..e13d778 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -839,26 +839,38 @@ void tcp_wfree(struct sk_buff *skb) { struct sock *sk = skb->sk; struct tcp_sock *tp = tcp_sk(sk); + int wmem; + + /* Keep one reference on sk_wmem_alloc. + * Will be released by sk_free() from here or tcp_tasklet_func() + */ + wmem = atomic_sub_return(skb->truesize - 1, &sk->sk_wmem_alloc); + + /* If this softirq is serviced by ksoftirqd, we are likely under stress. + * Wait until our queues (qdisc + devices) are drained. + * This gives : + * - less callbacks to tcp_write_xmit(), reducing stress (batches) + * - chance for incoming ACK (processed by another cpu maybe) + * to migrate this flow (skb->ooo_okay will be eventually set) + */ + if (wmem >= SKB_TRUESIZE(1) && this_cpu_ksoftirqd() == current) + goto out; if (test_and_clear_bit(TSQ_THROTTLED, &tp->tsq_flags) && !test_and_set_bit(TSQ_QUEUED, &tp->tsq_flags)) { unsigned long flags; struct tsq_tasklet *tsq; - /* Keep a ref on socket. - * This last ref will be released in tcp_tasklet_func() - */ - atomic_sub(skb->truesize - 1, &sk->sk_wmem_alloc); - /* queue this socket to tasklet queue */ local_irq_save(flags); tsq = &__get_cpu_var(tsq_tasklet); list_add(&tp->tsq_node, &tsq->head); tasklet_schedule(&tsq->tasklet); local_irq_restore(flags); - } else { - sock_wfree(skb); + return; } +out: + sk_free(sk); } /* This routine actually transmits TCP packets queued in by -- cgit v0.10.2 From 587ddfe2d212019de7c921d9c010789828893f86 Mon Sep 17 00:00:00 2001 From: Anish Bhatt Date: Tue, 14 Oct 2014 20:07:21 -0700 Subject: cxgb4i : Remove duplicated CLIP handling code cxgb4 already handles CLIP updates from a previous changeset for iw_cxgb4, there is no need to have this functionality in cxgb4i. Remove duplicated code Signed-off-by: Anish Bhatt Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c index fed8f26..411acf0 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c @@ -4490,6 +4490,13 @@ static int update_root_dev_clip(struct net_device *dev) return ret; /* Parse all bond and vlan devices layered on top of the physical dev */ + root_dev = netdev_master_upper_dev_get_rcu(dev); + if (root_dev) { + ret = update_dev_clip(root_dev, dev); + if (ret) + return ret; + } + for (i = 0; i < VLAN_N_VID; i++) { root_dev = __vlan_find_dev_deep_rcu(dev, htons(ETH_P_8021Q), i); if (!root_dev) diff --git a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c index 02e69e7..18d0d1c 100644 --- a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c +++ b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c @@ -1635,129 +1635,6 @@ static int cxgb4i_ddp_init(struct cxgbi_device *cdev) return 0; } -#if IS_ENABLED(CONFIG_IPV6) -static int cxgbi_inet6addr_handler(struct notifier_block *this, - unsigned long event, void *data) -{ - struct inet6_ifaddr *ifa = data; - struct net_device *event_dev = ifa->idev->dev; - struct cxgbi_device *cdev; - int ret = NOTIFY_DONE; - - if (event_dev->priv_flags & IFF_802_1Q_VLAN) - event_dev = vlan_dev_real_dev(event_dev); - - cdev = cxgbi_device_find_by_netdev_rcu(event_dev, NULL); - - if (!cdev) - return ret; - - switch (event) { - case NETDEV_UP: - ret = cxgb4_clip_get(event_dev, - (const struct in6_addr *) - ((ifa)->addr.s6_addr)); - if (ret < 0) - return ret; - - ret = NOTIFY_OK; - break; - - case NETDEV_DOWN: - cxgb4_clip_release(event_dev, - (const struct in6_addr *) - ((ifa)->addr.s6_addr)); - ret = NOTIFY_OK; - break; - - default: - break; - } - - return ret; -} - -static struct notifier_block cxgbi_inet6addr_notifier = { - .notifier_call = cxgbi_inet6addr_handler -}; - -/* Retrieve IPv6 addresses from a root device (bond, vlan) associated with - * a physical device. - * The physical device reference is needed to send the actual CLIP command. - */ -static int update_dev_clip(struct net_device *root_dev, struct net_device *dev) -{ - struct inet6_dev *idev = NULL; - struct inet6_ifaddr *ifa; - int ret = 0; - - idev = __in6_dev_get(root_dev); - if (!idev) - return ret; - - read_lock_bh(&idev->lock); - list_for_each_entry(ifa, &idev->addr_list, if_list) { - pr_info("updating the clip for addr %pI6\n", - ifa->addr.s6_addr); - ret = cxgb4_clip_get(dev, (const struct in6_addr *) - ifa->addr.s6_addr); - if (ret < 0) - break; - } - - read_unlock_bh(&idev->lock); - return ret; -} - -static int update_root_dev_clip(struct net_device *dev) -{ - struct net_device *root_dev = NULL; - int i, ret = 0; - - /* First populate the real net device's IPv6 address */ - ret = update_dev_clip(dev, dev); - if (ret) - return ret; - - /* Parse all bond and vlan devices layered on top of the physical dev */ - root_dev = netdev_master_upper_dev_get(dev); - if (root_dev) { - ret = update_dev_clip(root_dev, dev); - if (ret) - return ret; - } - - for (i = 0; i < VLAN_N_VID; i++) { - root_dev = __vlan_find_dev_deep_rcu(dev, htons(ETH_P_8021Q), i); - if (!root_dev) - continue; - - ret = update_dev_clip(root_dev, dev); - if (ret) - break; - } - return ret; -} - -static void cxgbi_update_clip(struct cxgbi_device *cdev) -{ - int i; - - rcu_read_lock(); - - for (i = 0; i < cdev->nports; i++) { - struct net_device *dev = cdev->ports[i]; - int ret = 0; - - if (dev) - ret = update_root_dev_clip(dev); - if (ret < 0) - break; - } - rcu_read_unlock(); -} -#endif /* IS_ENABLED(CONFIG_IPV6) */ - static void *t4_uld_add(const struct cxgb4_lld_info *lldi) { struct cxgbi_device *cdev; @@ -1876,10 +1753,6 @@ static int t4_uld_state_change(void *handle, enum cxgb4_state state) switch (state) { case CXGB4_STATE_UP: pr_info("cdev 0x%p, UP.\n", cdev); -#if IS_ENABLED(CONFIG_IPV6) - cxgbi_update_clip(cdev); -#endif - /* re-initialize */ break; case CXGB4_STATE_START_RECOVERY: pr_info("cdev 0x%p, RECOVERY.\n", cdev); @@ -1910,17 +1783,11 @@ static int __init cxgb4i_init_module(void) return rc; cxgb4_register_uld(CXGB4_ULD_ISCSI, &cxgb4i_uld_info); -#if IS_ENABLED(CONFIG_IPV6) - register_inet6addr_notifier(&cxgbi_inet6addr_notifier); -#endif return 0; } static void __exit cxgb4i_exit_module(void) { -#if IS_ENABLED(CONFIG_IPV6) - unregister_inet6addr_notifier(&cxgbi_inet6addr_notifier); -#endif cxgb4_unregister_uld(CXGB4_ULD_ISCSI); cxgbi_device_unregister_all(CXGBI_FLAG_DEV_T4); cxgbi_iscsi_cleanup(&cxgb4i_iscsi_transport, &cxgb4i_stt); -- cgit v0.10.2 From 1bb60376cda108306818365b186450f154ede5f2 Mon Sep 17 00:00:00 2001 From: Anish Bhatt Date: Tue, 14 Oct 2014 20:07:22 -0700 Subject: cxgb4 : Fix build failure in cxgb4 when ipv6 is disabled/not in-built cxgb4 ipv6 does not guard against ipv6 being disabled, or the standard ipv6 module vs inbuilt tri-state issue. This was fixed for cxgb4i & iw_cxgb4 but missed for cxgb4. Signed-off-by: Anish Bhatt Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/chelsio/Kconfig b/drivers/net/ethernet/chelsio/Kconfig index c3ce9df..ac6473f 100644 --- a/drivers/net/ethernet/chelsio/Kconfig +++ b/drivers/net/ethernet/chelsio/Kconfig @@ -68,7 +68,7 @@ config CHELSIO_T3 config CHELSIO_T4 tristate "Chelsio Communications T4/T5 Ethernet support" - depends on PCI + depends on PCI && (IPV6 || IPV6=n) select FW_LOADER select MDIO ---help--- diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c index 411acf0..3f60070 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c @@ -4369,6 +4369,7 @@ EXPORT_SYMBOL(cxgb4_unregister_uld); * success (true) if it belongs otherwise failure (false). * Called with rcu_read_lock() held. */ +#if IS_ENABLED(CONFIG_IPV6) static bool cxgb4_netdev(const struct net_device *netdev) { struct adapter *adap; @@ -4529,6 +4530,7 @@ static void update_clip(const struct adapter *adap) } rcu_read_unlock(); } +#endif /* IS_ENABLED(CONFIG_IPV6) */ /** * cxgb_up - enable the adapter @@ -4575,7 +4577,9 @@ static int cxgb_up(struct adapter *adap) t4_intr_enable(adap); adap->flags |= FULL_INIT_DONE; notify_ulds(adap, CXGB4_STATE_UP); +#if IS_ENABLED(CONFIG_IPV6) update_clip(adap); +#endif out: return err; irq_err: @@ -6869,14 +6873,18 @@ static int __init cxgb4_init_module(void) if (ret < 0) debugfs_remove(cxgb4_debugfs_root); +#if IS_ENABLED(CONFIG_IPV6) register_inet6addr_notifier(&cxgb4_inet6addr_notifier); +#endif return ret; } static void __exit cxgb4_cleanup_module(void) { +#if IS_ENABLED(CONFIG_IPV6) unregister_inet6addr_notifier(&cxgb4_inet6addr_notifier); +#endif pci_unregister_driver(&cxgb4_driver); debugfs_remove(cxgb4_debugfs_root); /* NULL ok */ } -- cgit v0.10.2 From f42bb57c61fd21fb7e30a2b99dbeb1671666bc47 Mon Sep 17 00:00:00 2001 From: Anish Bhatt Date: Tue, 14 Oct 2014 20:07:23 -0700 Subject: cxgb4i : Fix -Wunused-function warning A bunch of ipv6 related code is left on by default. While this causes no compilation issues, there is no need to have this enabled by default. Guard with an ipv6 check, which also takes care of a -Wunused-function warning. Signed-off-by: Anish Bhatt Signed-off-by: David S. Miller diff --git a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c index 18d0d1c..df176f0 100644 --- a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c +++ b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c @@ -259,6 +259,7 @@ static void send_act_open_req(struct cxgbi_sock *csk, struct sk_buff *skb, cxgb4_l2t_send(csk->cdev->ports[csk->port_id], skb, csk->l2t); } +#if IS_ENABLED(CONFIG_IPV6) static void send_act_open_req6(struct cxgbi_sock *csk, struct sk_buff *skb, struct l2t_entry *e) { @@ -344,6 +345,7 @@ static void send_act_open_req6(struct cxgbi_sock *csk, struct sk_buff *skb, cxgb4_l2t_send(csk->cdev->ports[csk->port_id], skb, csk->l2t); } +#endif static void send_close_req(struct cxgbi_sock *csk) { @@ -781,9 +783,11 @@ static void csk_act_open_retry_timer(unsigned long data) if (csk->csk_family == AF_INET) { send_act_open_func = send_act_open_req; skb = alloc_wr(size, 0, GFP_ATOMIC); +#if IS_ENABLED(CONFIG_IPV6) } else { send_act_open_func = send_act_open_req6; skb = alloc_wr(size6, 0, GFP_ATOMIC); +#endif } if (!skb) @@ -1335,8 +1339,10 @@ static int init_act_open(struct cxgbi_sock *csk) if (csk->csk_family == AF_INET) skb = alloc_wr(size, 0, GFP_NOIO); +#if IS_ENABLED(CONFIG_IPV6) else skb = alloc_wr(size6, 0, GFP_NOIO); +#endif if (!skb) goto rel_resource; @@ -1370,8 +1376,10 @@ static int init_act_open(struct cxgbi_sock *csk) cxgbi_sock_set_state(csk, CTP_ACTIVE_OPEN); if (csk->csk_family == AF_INET) send_act_open_req(csk, skb, csk->l2t); +#if IS_ENABLED(CONFIG_IPV6) else send_act_open_req6(csk, skb, csk->l2t); +#endif neigh_release(n); return 0; diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c index 6a2001d..54fa6e0 100644 --- a/drivers/scsi/cxgbi/libcxgbi.c +++ b/drivers/scsi/cxgbi/libcxgbi.c @@ -275,6 +275,7 @@ struct cxgbi_device *cxgbi_device_find_by_netdev_rcu(struct net_device *ndev, } EXPORT_SYMBOL_GPL(cxgbi_device_find_by_netdev_rcu); +#if IS_ENABLED(CONFIG_IPV6) static struct cxgbi_device *cxgbi_device_find_by_mac(struct net_device *ndev, int *port) { @@ -307,6 +308,7 @@ static struct cxgbi_device *cxgbi_device_find_by_mac(struct net_device *ndev, ndev, ndev->name); return NULL; } +#endif void cxgbi_hbas_remove(struct cxgbi_device *cdev) { -- cgit v0.10.2 From c5bbcb5822b25c9f738db98e6d6ad2506cab8136 Mon Sep 17 00:00:00 2001 From: Anish Bhatt Date: Tue, 14 Oct 2014 20:07:24 -0700 Subject: cxgb4i: Remove duplicate call to dst_neigh_lookup() There is an extra call to dst_neigh_lookup() leftover in cxgb4i that can cause an unreleased refcnt issue. Remove extraneous call. Signed-off-by: Anish Bhatt Fixes : 759a0cc5a3e1b ('cxgb4i: Add ipv6 code to driver, call into libcxgbi ipv6 api') Signed-off-by: David S. Miller diff --git a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c index df176f0..8c3003b 100644 --- a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c +++ b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c @@ -1317,11 +1317,6 @@ static int init_act_open(struct cxgbi_sock *csk) cxgbi_sock_set_flag(csk, CTPF_HAS_ATID); cxgbi_sock_get(csk); - n = dst_neigh_lookup(csk->dst, &csk->daddr.sin_addr.s_addr); - if (!n) { - pr_err("%s, can't get neighbour of csk->dst.\n", ndev->name); - goto rel_resource; - } csk->l2t = cxgb4_l2t_get(lldi->l2t, n, ndev, 0); if (!csk->l2t) { pr_err("%s, cannot alloc l2t.\n", ndev->name); -- cgit v0.10.2 From 71ae8f5271b31da1172751059deb8bfc32b2b759 Mon Sep 17 00:00:00 2001 From: Giuseppe CAVALLARO Date: Wed, 15 Oct 2014 07:30:41 +0200 Subject: stmmac: fix sti compatibililies this patch is to fix the stmmac data compatibilities for all the SoCs inside the platform file. Reported-by: Stephen Rothwell Signed-off-by: Giuseppe Cavallaro Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h index 4452889..c3c4065 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h @@ -144,7 +144,8 @@ extern const struct stmmac_of_data meson6_dwmac_data; extern const struct stmmac_of_data sun7i_gmac_data; #endif #ifdef CONFIG_DWMAC_STI -extern const struct stmmac_of_data sti_gmac_data; +extern const struct stmmac_of_data stih4xx_dwmac_data; +extern const struct stmmac_of_data stid127_dwmac_data; #endif #ifdef CONFIG_DWMAC_SOCFPGA extern const struct stmmac_of_data socfpga_gmac_data; diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c index 0d6b9ad..db56fa7 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c @@ -37,10 +37,10 @@ static const struct of_device_id stmmac_dt_ids[] = { { .compatible = "allwinner,sun7i-a20-gmac", .data = &sun7i_gmac_data}, #endif #ifdef CONFIG_DWMAC_STI - { .compatible = "st,stih415-dwmac", .data = &sti_gmac_data}, - { .compatible = "st,stih416-dwmac", .data = &sti_gmac_data}, - { .compatible = "st,stid127-dwmac", .data = &sti_gmac_data}, - { .compatible = "st,stih407-dwmac", .data = &sti_gmac_data}, + { .compatible = "st,stih415-dwmac", .data = &stih4xx_dwmac_data}, + { .compatible = "st,stih416-dwmac", .data = &stih4xx_dwmac_data}, + { .compatible = "st,stid127-dwmac", .data = &stid127_dwmac_data}, + { .compatible = "st,stih407-dwmac", .data = &stih4xx_dwmac_data}, #endif #ifdef CONFIG_DWMAC_SOCFPGA { .compatible = "altr,socfpga-stmmac", .data = &socfpga_gmac_data }, -- cgit v0.10.2 From 04ffcb255f22a2a988ce7393e6e72f6eb3fcb7aa Mon Sep 17 00:00:00 2001 From: Tom Herbert Date: Tue, 14 Oct 2014 15:19:06 -0700 Subject: net: Add ndo_gso_check Add ndo_gso_check which a device can define to indicate whether is is capable of doing GSO on a packet. This funciton would be called from the stack to determine whether software GSO is needed to be done. A driver should populate this function if it advertises GSO types for which there are combinations that it wouldn't be able to handle. For instance a device that performs UDP tunneling might only implement support for transparent Ethernet bridging type of inner packets or might have limitations on lengths of inner headers. Signed-off-by: Tom Herbert Signed-off-by: David S. Miller diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c index 0c6adaa..65e2892 100644 --- a/drivers/net/macvtap.c +++ b/drivers/net/macvtap.c @@ -298,7 +298,7 @@ static rx_handler_result_t macvtap_handle_frame(struct sk_buff **pskb) */ if (q->flags & IFF_VNET_HDR) features |= vlan->tap_features; - if (netif_needs_gso(skb, features)) { + if (netif_needs_gso(dev, skb, features)) { struct sk_buff *segs = __skb_gso_segment(skb, features, false); if (IS_ERR(segs)) diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index ca82f54..3c0b375 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -638,7 +638,7 @@ static int xennet_start_xmit(struct sk_buff *skb, struct net_device *dev) if (unlikely(!netif_carrier_ok(dev) || (slots > 1 && !xennet_can_sg(dev)) || - netif_needs_gso(skb, netif_skb_features(skb)))) { + netif_needs_gso(dev, skb, netif_skb_features(skb)))) { spin_unlock_irqrestore(&queue->tx_lock, flags); goto drop; } diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 838407a..74fd5d3 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -998,6 +998,12 @@ typedef u16 (*select_queue_fallback_t)(struct net_device *dev, * Callback to use for xmit over the accelerated station. This * is used in place of ndo_start_xmit on accelerated net * devices. + * bool (*ndo_gso_check) (struct sk_buff *skb, + * struct net_device *dev); + * Called by core transmit path to determine if device is capable of + * performing GSO on a packet. The device returns true if it is + * able to GSO the packet, false otherwise. If the return value is + * false the stack will do software GSO. */ struct net_device_ops { int (*ndo_init)(struct net_device *dev); @@ -1147,6 +1153,8 @@ struct net_device_ops { struct net_device *dev, void *priv); int (*ndo_get_lock_subclass)(struct net_device *dev); + bool (*ndo_gso_check) (struct sk_buff *skb, + struct net_device *dev); }; /** @@ -3572,10 +3580,12 @@ static inline bool skb_gso_ok(struct sk_buff *skb, netdev_features_t features) (!skb_has_frag_list(skb) || (features & NETIF_F_FRAGLIST)); } -static inline bool netif_needs_gso(struct sk_buff *skb, +static inline bool netif_needs_gso(struct net_device *dev, struct sk_buff *skb, netdev_features_t features) { return skb_is_gso(skb) && (!skb_gso_ok(skb, features) || + (dev->netdev_ops->ndo_gso_check && + !dev->netdev_ops->ndo_gso_check(skb, dev)) || unlikely((skb->ip_summed != CHECKSUM_PARTIAL) && (skb->ip_summed != CHECKSUM_UNNECESSARY))); } diff --git a/net/core/dev.c b/net/core/dev.c index 4699dcf..9f77a78 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -2675,7 +2675,7 @@ static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device if (skb->encapsulation) features &= dev->hw_enc_features; - if (netif_needs_gso(skb, features)) { + if (netif_needs_gso(dev, skb, features)) { struct sk_buff *segs; segs = skb_gso_segment(skb, features); -- cgit v0.10.2 From 001586a737ee8c11a1198c352c5635f19fd090ed Mon Sep 17 00:00:00 2001 From: Anish Bhatt Date: Wed, 15 Oct 2014 00:26:47 -0700 Subject: cxgb4i : Fix -Wmaybe-uninitialized warning. Identified by kbuild test robot. csk family is always set to be AF_INET or AF_INET6, so skb will always be initialized to some value but there is no harm in silencing the warning anyways. Signed-off-by: Anish Bhatt Fixes : f42bb57c61fd ('cxgb4i : Fix -Wunused-function warning') Signed-off-by: David S. Miller diff --git a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c index 8c3003b..3e0a0d3 100644 --- a/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c +++ b/drivers/scsi/cxgbi/cxgb4i/cxgb4i.c @@ -758,7 +758,7 @@ static int act_open_rpl_status_to_errno(int status) static void csk_act_open_retry_timer(unsigned long data) { - struct sk_buff *skb; + struct sk_buff *skb = NULL; struct cxgbi_sock *csk = (struct cxgbi_sock *)data; struct cxgb4_lld_info *lldi = cxgbi_cdev_priv(csk->cdev); void (*send_act_open_func)(struct cxgbi_sock *, struct sk_buff *, -- cgit v0.10.2 From 28b5f058cf1d268d965894ce42a614d13f853dd6 Mon Sep 17 00:00:00 2001 From: Nimrod Andy Date: Wed, 15 Oct 2014 17:30:12 +0800 Subject: net: fec: ptp: fix convergence issue to support LinuxPTP stack iMX6SX IEEE 1588 module has one hw issue in capturing the ATVR register. The current SW flow is: ENET0->ATCR |= ENET_ATCR_CAPTURE_MASK; ts_counter_ns = ENET0->ATVR; The ATVR value is not expected value that cause LinuxPTP stack cannot be convergent. ENET Block Guide/ Chapter for the iMX6SX (PELE) address the issue: After set ENET_ATCR[Capture], there need some time cycles before the counter value is capture in the register clock domain. The wait-time-cycles is at least 6 clock cycles of the slower clock between the register clock and the 1588 clock. So need something like: ENET0->ATCR |= ENET_ATCR_CAPTURE_MASK; wait(); ts_counter_ns = ENET0->ATVR; For iMX6SX, the 1588 ts_clk is fixed to 25Mhz, register clock is 66Mhz, so the wait-time-cycles must be greater than 240ns (40ns * 6). The patch add 1us delay before cpu read ATVR register. Changes V2: Modify the commit/comments log to describe the issue clearly. Signed-off-by: Fugang Duan Acked-by: Richard Cochran Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/freescale/fec.h b/drivers/net/ethernet/freescale/fec.h index 1e65917..9af296a 100644 --- a/drivers/net/ethernet/freescale/fec.h +++ b/drivers/net/ethernet/freescale/fec.h @@ -367,6 +367,56 @@ struct bufdesc_ex { #define FEC_VLAN_TAG_LEN 0x04 #define FEC_ETHTYPE_LEN 0x02 +/* Controller is ENET-MAC */ +#define FEC_QUIRK_ENET_MAC (1 << 0) +/* Controller needs driver to swap frame */ +#define FEC_QUIRK_SWAP_FRAME (1 << 1) +/* Controller uses gasket */ +#define FEC_QUIRK_USE_GASKET (1 << 2) +/* Controller has GBIT support */ +#define FEC_QUIRK_HAS_GBIT (1 << 3) +/* Controller has extend desc buffer */ +#define FEC_QUIRK_HAS_BUFDESC_EX (1 << 4) +/* Controller has hardware checksum support */ +#define FEC_QUIRK_HAS_CSUM (1 << 5) +/* Controller has hardware vlan support */ +#define FEC_QUIRK_HAS_VLAN (1 << 6) +/* ENET IP errata ERR006358 + * + * If the ready bit in the transmit buffer descriptor (TxBD[R]) is previously + * detected as not set during a prior frame transmission, then the + * ENET_TDAR[TDAR] bit is cleared at a later time, even if additional TxBDs + * were added to the ring and the ENET_TDAR[TDAR] bit is set. This results in + * frames not being transmitted until there is a 0-to-1 transition on + * ENET_TDAR[TDAR]. + */ +#define FEC_QUIRK_ERR006358 (1 << 7) +/* ENET IP hw AVB + * + * i.MX6SX ENET IP add Audio Video Bridging (AVB) feature support. + * - Two class indicators on receive with configurable priority + * - Two class indicators and line speed timer on transmit allowing + * implementation class credit based shapers externally + * - Additional DMA registers provisioned to allow managing up to 3 + * independent rings + */ +#define FEC_QUIRK_HAS_AVB (1 << 8) +/* There is a TDAR race condition for mutliQ when the software sets TDAR + * and the UDMA clears TDAR simultaneously or in a small window (2-4 cycles). + * This will cause the udma_tx and udma_tx_arbiter state machines to hang. + * The issue exist at i.MX6SX enet IP. + */ +#define FEC_QUIRK_ERR007885 (1 << 9) +/* ENET Block Guide/ Chapter for the iMX6SX (PELE) address one issue: + * After set ENET_ATCR[Capture], there need some time cycles before the counter + * value is capture in the register clock domain. + * The wait-time-cycles is at least 6 clock cycles of the slower clock between + * the register clock and the 1588 clock. The 1588 ts_clk is fixed to 25Mhz, + * register clock is 66Mhz, so the wait-time-cycles must be greater than 240ns + * (40ns * 6). + */ +#define FEC_QUIRK_BUG_CAPTURE (1 << 10) + struct fec_enet_priv_tx_q { int index; unsigned char *tx_bounce[TX_RING_SIZE]; diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index e364d1f..81b96cf 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -78,47 +78,6 @@ static void fec_enet_itr_coal_init(struct net_device *ndev); #define FEC_ENET_RAFL_V 0x8 #define FEC_ENET_OPD_V 0xFFF0 -/* Controller is ENET-MAC */ -#define FEC_QUIRK_ENET_MAC (1 << 0) -/* Controller needs driver to swap frame */ -#define FEC_QUIRK_SWAP_FRAME (1 << 1) -/* Controller uses gasket */ -#define FEC_QUIRK_USE_GASKET (1 << 2) -/* Controller has GBIT support */ -#define FEC_QUIRK_HAS_GBIT (1 << 3) -/* Controller has extend desc buffer */ -#define FEC_QUIRK_HAS_BUFDESC_EX (1 << 4) -/* Controller has hardware checksum support */ -#define FEC_QUIRK_HAS_CSUM (1 << 5) -/* Controller has hardware vlan support */ -#define FEC_QUIRK_HAS_VLAN (1 << 6) -/* ENET IP errata ERR006358 - * - * If the ready bit in the transmit buffer descriptor (TxBD[R]) is previously - * detected as not set during a prior frame transmission, then the - * ENET_TDAR[TDAR] bit is cleared at a later time, even if additional TxBDs - * were added to the ring and the ENET_TDAR[TDAR] bit is set. This results in - * frames not being transmitted until there is a 0-to-1 transition on - * ENET_TDAR[TDAR]. - */ -#define FEC_QUIRK_ERR006358 (1 << 7) -/* ENET IP hw AVB - * - * i.MX6SX ENET IP add Audio Video Bridging (AVB) feature support. - * - Two class indicators on receive with configurable priority - * - Two class indicators and line speed timer on transmit allowing - * implementation class credit based shapers externally - * - Additional DMA registers provisioned to allow managing up to 3 - * independent rings - */ -#define FEC_QUIRK_HAS_AVB (1 << 8) -/* There is a TDAR race condition for mutliQ when the software sets TDAR - * and the UDMA clears TDAR simultaneously or in a small window (2-4 cycles). - * This will cause the udma_tx and udma_tx_arbiter state machines to hang. - * The issue exist at i.MX6SX enet IP. - */ -#define FEC_QUIRK_ERR007885 (1 << 9) - static struct platform_device_id fec_devtype[] = { { /* keep it for coldfire */ @@ -146,7 +105,7 @@ static struct platform_device_id fec_devtype[] = { .driver_data = FEC_QUIRK_ENET_MAC | FEC_QUIRK_HAS_GBIT | FEC_QUIRK_HAS_BUFDESC_EX | FEC_QUIRK_HAS_CSUM | FEC_QUIRK_HAS_VLAN | FEC_QUIRK_HAS_AVB | - FEC_QUIRK_ERR007885, + FEC_QUIRK_ERR007885 | FEC_QUIRK_BUG_CAPTURE, }, { /* sentinel */ } diff --git a/drivers/net/ethernet/freescale/fec_ptp.c b/drivers/net/ethernet/freescale/fec_ptp.c index 0fdcdc9..992c8c3 100644 --- a/drivers/net/ethernet/freescale/fec_ptp.c +++ b/drivers/net/ethernet/freescale/fec_ptp.c @@ -236,12 +236,17 @@ static cycle_t fec_ptp_read(const struct cyclecounter *cc) { struct fec_enet_private *fep = container_of(cc, struct fec_enet_private, cc); + const struct platform_device_id *id_entry = + platform_get_device_id(fep->pdev); u32 tempval; tempval = readl(fep->hwp + FEC_ATIME_CTRL); tempval |= FEC_T_CTRL_CAPTURE; writel(tempval, fep->hwp + FEC_ATIME_CTRL); + if (id_entry->driver_data & FEC_QUIRK_BUG_CAPTURE) + udelay(1); + return readl(fep->hwp + FEC_ATIME); } -- cgit v0.10.2 From 4b7fd2e688d51f8ed7380758047fcaa4d4693d47 Mon Sep 17 00:00:00 2001 From: "Michael S. Tsirkin" Date: Wed, 15 Oct 2014 16:23:28 +0300 Subject: virtio_net: fix use after free commit 0b725a2ca61bedc33a2a63d0451d528b268cf975 net: Remove ndo_xmit_flush netdev operation, use signalling instead. added code that looks at skb->xmit_more after the skb has been put in TX VQ. Since some paths process the ring and free the skb immediately, this can cause use after free. Fix by storing xmit_more in a local variable. Cc: David S. Miller Signed-off-by: Michael S. Tsirkin Signed-off-by: David S. Miller diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 3d0ce44..13d0a8b 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -920,6 +920,8 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) int qnum = skb_get_queue_mapping(skb); struct send_queue *sq = &vi->sq[qnum]; int err; + struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum); + bool kick = !skb->xmit_more; /* Free up any pending old buffers before queueing new ones. */ free_old_xmit_skbs(sq); @@ -956,7 +958,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev) } } - if (__netif_subqueue_stopped(dev, qnum) || !skb->xmit_more) + if (kick || netif_xmit_stopped(txq)) virtqueue_kick(sq->vq); return NETDEV_TX_OK; -- cgit v0.10.2 From f5b720b85944413491df106f05a784cd1714436e Mon Sep 17 00:00:00 2001 From: Claudiu Manoil Date: Wed, 15 Oct 2014 19:11:46 +0300 Subject: gianfar: Add FCS to rx buffer size (fix) For each Rx frame the eTSEC writes its FCS (Frame Check Sequence) to the Rx buffer. The eTSEC h/w manual states in the "Receive Buffer Descriptor Field Descriptions" table: "Data length is the number of octets written by the eTSEC into this BD's data buffer if L is cleared (the value is equal to MRBLR), or, if L is set, the length of the frame including *CRC*, FCB (if RCTRL[PRSDEP > 00), preamble (if MACCFG2[PreAmRxEn]=1), time stamp (if RCTRL[TS] = 1) and any padding (RCTRL[PAL])." Though the FCS bytes are removed by the driver before passing the skb to the net stack, the Rx buffer size computation does not currently take into account the FCS bytes (4 bytes). Because the Rx buffer size is multiple of 512 bytes, leaving out the FCS is not a problem for the default MTU of 1500, as the Rx buffer size is 1536 in this case. However, for custom MTUs, where the difference between the MTU size and the Rx buffer size is less, this can be a problem as the computed Rx buffer size won't be enough to accomodate the FCS for a received frame that is big enough (close to MTU size). In such case the received frame is considered to be incomplete (L flag not set in the RxBD status) and silently dropped. Note that the driver does not currently support S/G on Rx, so it has to compute its Rx buffer size based on the MTU of the device. Reported-by: Kristian Otnes Signed-off-by: Claudiu Manoil Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c index 379b1a5..4fdf0aa 100644 --- a/drivers/net/ethernet/freescale/gianfar.c +++ b/drivers/net/ethernet/freescale/gianfar.c @@ -338,7 +338,7 @@ static void gfar_init_tx_rx_base(struct gfar_private *priv) static void gfar_rx_buff_size_config(struct gfar_private *priv) { - int frame_size = priv->ndev->mtu + ETH_HLEN; + int frame_size = priv->ndev->mtu + ETH_HLEN + ETH_FCS_LEN; /* set this when rx hw offload (TOE) functions are being used */ priv->uses_rxfcb = 0; -- cgit v0.10.2 From 7e78cc46b7ec0c80257de8d09f0097081754e206 Mon Sep 17 00:00:00 2001 From: Fabian Frederick Date: Wed, 15 Oct 2014 21:03:18 +0200 Subject: openvswitch: kerneldoc warning fix s/sock/gs Signed-off-by: Fabian Frederick Acked-by: Pravin B Shelar Signed-off-by: David S. Miller diff --git a/net/openvswitch/vport-geneve.c b/net/openvswitch/vport-geneve.c index 910b3ef..106a9d8 100644 --- a/net/openvswitch/vport-geneve.c +++ b/net/openvswitch/vport-geneve.c @@ -30,7 +30,7 @@ /** * struct geneve_port - Keeps track of open UDP ports - * @sock: The socket created for this port number. + * @gs: The socket created for this port number. * @name: vport name. */ struct geneve_port { -- cgit v0.10.2 From 4e8febd0a76333875636859e0092a14c1fba49e4 Mon Sep 17 00:00:00 2001 From: Fabian Frederick Date: Wed, 15 Oct 2014 21:03:41 +0200 Subject: openvswitch: use vport instead of p All functions used struct vport *vport except ovs_vport_find_upcall_portid. This fixes 1 kerneldoc warning Signed-off-by: Fabian Frederick Acked-by: Pravin B Shelar Signed-off-by: David S. Miller diff --git a/net/openvswitch/vport.c b/net/openvswitch/vport.c index 53001b0..6015802 100644 --- a/net/openvswitch/vport.c +++ b/net/openvswitch/vport.c @@ -408,13 +408,13 @@ int ovs_vport_get_upcall_portids(const struct vport *vport, * * Returns the portid of the target socket. Must be called with rcu_read_lock. */ -u32 ovs_vport_find_upcall_portid(const struct vport *p, struct sk_buff *skb) +u32 ovs_vport_find_upcall_portid(const struct vport *vport, struct sk_buff *skb) { struct vport_portids *ids; u32 ids_index; u32 hash; - ids = rcu_dereference(p->upcall_portids); + ids = rcu_dereference(vport->upcall_portids); if (ids->n_ids == 1 && ids->ids[0] == 0) return 0; -- cgit v0.10.2 From ce6502a8f9572179f044a4d62667c4645256d6e4 Mon Sep 17 00:00:00 2001 From: Li RongQing Date: Thu, 16 Oct 2014 08:49:41 +0800 Subject: vxlan: fix a use after free in vxlan_encap_bypass when netif_rx() is done, the netif_rx handled skb maybe be freed, and should not be used. Signed-off-by: Li RongQing Signed-off-by: David S. Miller diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index 2a51e6e..faf1bd1 100644 --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -1668,6 +1668,8 @@ static void vxlan_encap_bypass(struct sk_buff *skb, struct vxlan_dev *src_vxlan, struct pcpu_sw_netstats *tx_stats, *rx_stats; union vxlan_addr loopback; union vxlan_addr *remote_ip = &dst_vxlan->default_dst.remote_ip; + struct net_device *dev = skb->dev; + int len = skb->len; tx_stats = this_cpu_ptr(src_vxlan->dev->tstats); rx_stats = this_cpu_ptr(dst_vxlan->dev->tstats); @@ -1691,16 +1693,16 @@ static void vxlan_encap_bypass(struct sk_buff *skb, struct vxlan_dev *src_vxlan, u64_stats_update_begin(&tx_stats->syncp); tx_stats->tx_packets++; - tx_stats->tx_bytes += skb->len; + tx_stats->tx_bytes += len; u64_stats_update_end(&tx_stats->syncp); if (netif_rx(skb) == NET_RX_SUCCESS) { u64_stats_update_begin(&rx_stats->syncp); rx_stats->rx_packets++; - rx_stats->rx_bytes += skb->len; + rx_stats->rx_bytes += len; u64_stats_update_end(&rx_stats->syncp); } else { - skb->dev->stats.rx_dropped++; + dev->stats.rx_dropped++; } } -- cgit v0.10.2 From 91269e390d062b526432f2ef1352b8df82e0e0bc Mon Sep 17 00:00:00 2001 From: Li RongQing Date: Thu, 16 Oct 2014 09:17:18 +0800 Subject: vxlan: using pskb_may_pull as early as possible pskb_may_pull should be used to check if skb->data has enough space, skb->len can not ensure that. Cc: Cong Wang Signed-off-by: Li RongQing Signed-off-by: David S. Miller diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index faf1bd1..77ab844 100644 --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -1437,9 +1437,6 @@ static int neigh_reduce(struct net_device *dev, struct sk_buff *skb) if (!in6_dev) goto out; - if (!pskb_may_pull(skb, skb->len)) - goto out; - iphdr = ipv6_hdr(skb); saddr = &iphdr->saddr; daddr = &iphdr->daddr; @@ -1880,7 +1877,8 @@ static netdev_tx_t vxlan_xmit(struct sk_buff *skb, struct net_device *dev) return arp_reduce(dev, skb); #if IS_ENABLED(CONFIG_IPV6) else if (ntohs(eth->h_proto) == ETH_P_IPV6 && - skb->len >= sizeof(struct ipv6hdr) + sizeof(struct nd_msg) && + pskb_may_pull(skb, sizeof(struct ipv6hdr) + + sizeof(struct nd_msg)) && ipv6_hdr(skb)->nexthdr == IPPROTO_ICMPV6) { struct nd_msg *msg; -- cgit v0.10.2 From 4d4191566fdd0e8990b2e8ab5ae819227c92892f Mon Sep 17 00:00:00 2001 From: Matthew Vick Date: Thu, 2 Oct 2014 05:10:18 +0000 Subject: fm10k: Check the host state when bringing the interface up Set the flag to fetch the host state before kicking off the service task that reads the host state when bringing the interface back up. Signed-off-by: Matthew Vick Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c index e02036c..a0cb74a 100644 --- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c +++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c @@ -1489,6 +1489,7 @@ void fm10k_up(struct fm10k_intfc *interface) netif_tx_start_all_queues(interface->netdev); /* kick off the service timer */ + hw->mac.get_host_state = 1; mod_timer(&interface->service_timer, jiffies); } -- cgit v0.10.2 From 13cb2dad45cc8c8e350abc84de38449b89629c3c Mon Sep 17 00:00:00 2001 From: Matthew Vick Date: Fri, 3 Oct 2014 00:43:35 +0000 Subject: fm10k: Unlock mailbox on VLAN addition failures After grabbing the mailbox lock and detecting an error, the lock must be released before the error code can be returned. Signed-off-by: Matthew Vick Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c index bf44a8f..b57ea1c 100644 --- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c +++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c @@ -785,14 +785,14 @@ static int fm10k_update_vid(struct net_device *netdev, u16 vid, bool set) if (!(netdev->flags & IFF_PROMISC)) { err = hw->mac.ops.update_vlan(hw, vid, 0, set); if (err) - return err; + goto err_out; } /* update our base MAC address */ err = hw->mac.ops.update_uc_addr(hw, interface->glort, hw->mac.addr, vid, set, 0); if (err) - return err; + goto err_out; /* set vid prior to syncing/unsyncing the VLAN */ interface->vid = vid + (set ? VLAN_N_VID : 0); @@ -801,9 +801,10 @@ static int fm10k_update_vid(struct net_device *netdev, u16 vid, bool set) __dev_uc_unsync(netdev, fm10k_uc_vlan_unsync); __dev_mc_unsync(netdev, fm10k_mc_vlan_unsync); +err_out: fm10k_mbx_unlock(interface); - return 0; + return err; } static int fm10k_vlan_rx_add_vid(struct net_device *netdev, -- cgit v0.10.2 From f6b03c10a1b3f2c98ed23813997cdebef8aabeba Mon Sep 17 00:00:00 2001 From: Andy Zhou Date: Sat, 4 Oct 2014 06:19:11 +0000 Subject: fm10k: Add CONFIG_FM10K_VXLAN configuration option Compiling with CONFIG_FM10K=y and VXLAN=m resulting in linking error: drivers/built-in.o: In function `fm10k_open': (.text+0x1f9d7a): undefined reference to `vxlan_get_rx_port' make: *** [vmlinux] Error 1 The fix follows the same strategy as I40E. Signed-off-by: Andy Zhou Acked-by: Alexander Duyck Tested-by: Aaron Brown Signed-off-by: Jeff Kirsher diff --git a/drivers/net/ethernet/intel/Kconfig b/drivers/net/ethernet/intel/Kconfig index 6919adb..5b8300a 100644 --- a/drivers/net/ethernet/intel/Kconfig +++ b/drivers/net/ethernet/intel/Kconfig @@ -320,4 +320,15 @@ config FM10K To compile this driver as a module, choose M here. The module will be called fm10k. MSI-X interrupt support is required +config FM10K_VXLAN + bool "Virtual eXtensible Local Area Network Support" + default n + depends on FM10K && VXLAN && !(FM10K=y && VXLAN=m) + ---help--- + This allows one to create VXLAN virtual interfaces that provide + Layer 2 Networks over Layer 3 Networks. VXLAN is often used + to tunnel virtual network infrastructure in virtualized environments. + Say Y here if you want to use Virtual eXtensible Local Area Network + (VXLAN) in the driver. + endif # NET_VENDOR_INTEL diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c index b57ea1c..8811364 100644 --- a/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c +++ b/drivers/net/ethernet/intel/fm10k/fm10k_netdev.c @@ -20,9 +20,9 @@ #include "fm10k.h" #include -#if IS_ENABLED(CONFIG_VXLAN) +#if IS_ENABLED(CONFIG_FM10K_VXLAN) #include -#endif /* CONFIG_VXLAN */ +#endif /* CONFIG_FM10K_VXLAN */ /** * fm10k_setup_tx_resources - allocate Tx resources (Descriptors) @@ -556,7 +556,7 @@ int fm10k_open(struct net_device *netdev) if (err) goto err_set_queues; -#if IS_ENABLED(CONFIG_VXLAN) +#if IS_ENABLED(CONFIG_FM10K_VXLAN) /* update VXLAN port configuration */ vxlan_get_rx_port(netdev); -- cgit v0.10.2 From 600a507ddcb99096731e1d96a3ebf43e20fc7f80 Mon Sep 17 00:00:00 2001 From: Emil Tantilov Date: Thu, 16 Oct 2014 15:49:02 +0000 Subject: ixgbe: check for vfs outside of sriov_num_vfs before dereference The check for vfinfo is not sufficient because it does not protect against specifying vf that is outside of sriov_num_vfs range. All of the ndo functions have a check for it except for ixgbevf_ndo_set_spoofcheck(). The following patch is all we need to protect against this panic: ip link set p96p1 vf 0 spoofchk off BUG: unable to handle kernel NULL pointer dereference at 0000000000000052 IP: [] ixgbe_ndo_set_vf_spoofchk+0x51/0x150 [ixgbe] Reported-by: Thierry Herbelot Signed-off-by: Emil Tantilov Acked-by: Thierry Herbelot Signed-off-by: Jeff Kirsher diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c index 706fc69..97c85b8 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_sriov.c @@ -1261,6 +1261,9 @@ int ixgbe_ndo_set_vf_spoofchk(struct net_device *netdev, int vf, bool setting) struct ixgbe_hw *hw = &adapter->hw; u32 regval; + if (vf >= adapter->num_vfs) + return -EINVAL; + adapter->vfinfo[vf].spoofchk_enabled = setting; regval = IXGBE_READ_REG(hw, IXGBE_PFVFSPOOF(vf_target_reg)); -- cgit v0.10.2 From 2c6ba4b15b5ef38213b6c42ce09e9398f78cef9f Mon Sep 17 00:00:00 2001 From: Nicolas Dichtel Date: Thu, 16 Oct 2014 15:47:51 +0200 Subject: netlink: fix description of portid Avoid confusion between pid and portid. Signed-off-by: Nicolas Dichtel Signed-off-by: David S. Miller diff --git a/include/net/netlink.h b/include/net/netlink.h index 6c10762..7b903e1 100644 --- a/include/net/netlink.h +++ b/include/net/netlink.h @@ -431,7 +431,7 @@ static inline int nlmsg_report(const struct nlmsghdr *nlh) /** * nlmsg_put - Add a new netlink message to an skb * @skb: socket buffer to store message in - * @portid: netlink process id + * @portid: netlink PORTID of requesting application * @seq: sequence number of message * @type: message type * @payload: length of message payload -- cgit v0.10.2 From b7983e3f89dd960b2a6d156fd2200548c3300428 Mon Sep 17 00:00:00 2001 From: Michael Opdenacker Date: Wed, 15 Oct 2014 09:45:50 +0200 Subject: atm: simplify lanai.c by using module_pci_driver This simplifies the lanai.c driver by using the module_pci_driver() macro, at the expense of losing only debugging messages. Signed-off-by: Michael Opdenacker Signed-off-by: David S. Miller diff --git a/drivers/atm/lanai.c b/drivers/atm/lanai.c index fa7d701..93eaf8d 100644 --- a/drivers/atm/lanai.c +++ b/drivers/atm/lanai.c @@ -2614,27 +2614,7 @@ static struct pci_driver lanai_driver = { .probe = lanai_init_one, }; -static int __init lanai_module_init(void) -{ - int x; - - x = pci_register_driver(&lanai_driver); - if (x != 0) - printk(KERN_ERR DEV_LABEL ": no adapter found\n"); - return x; -} - -static void __exit lanai_module_exit(void) -{ - /* We'll only get called when all the interfaces are already - * gone, so there isn't much to do - */ - DPRINTK("cleanup_module()\n"); - pci_unregister_driver(&lanai_driver); -} - -module_init(lanai_module_init); -module_exit(lanai_module_exit); +module_pci_driver(lanai_driver); MODULE_AUTHOR("Mitchell Blank Jr "); MODULE_DESCRIPTION("Efficient Networks Speedstream 3010 driver"); -- cgit v0.10.2 From 2077eebf7d8bf20b36524de45851e28111a60c52 Mon Sep 17 00:00:00 2001 From: Cong Wang Date: Wed, 15 Oct 2014 14:33:20 -0700 Subject: ipv4: call __ip_options_echo() in cookie_v4_check() commit 971f10eca186cab238c49da ("tcp: better TCP_SKB_CB layout to reduce cache line misses") missed that cookie_v4_check() still calls ip_options_echo() which uses IPCB(). It should use TCPCB() at TCP layer, so call __ip_options_echo() instead. Fixes: commit 971f10eca186cab238c49da ("tcp: better TCP_SKB_CB layout to reduce cache line misses") Cc: Krzysztof Kolasa Cc: Eric Dumazet Reported-by: Krzysztof Kolasa Tested-by: Krzysztof Kolasa Signed-off-by: Cong Wang Signed-off-by: Cong Wang Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index 0431a8f..7e7401c 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -321,7 +321,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb, int opt_size = sizeof(struct ip_options_rcu) + opt->optlen; ireq->opt = kmalloc(opt_size, GFP_ATOMIC); - if (ireq->opt != NULL && ip_options_echo(&ireq->opt->opt, skb)) { + if (ireq->opt != NULL && __ip_options_echo(&ireq->opt->opt, skb, opt)) { kfree(ireq->opt); ireq->opt = NULL; } -- cgit v0.10.2 From e25f866fbc8a4bf387b5dbe8e25aa5b07e55c74f Mon Sep 17 00:00:00 2001 From: Cong Wang Date: Wed, 15 Oct 2014 14:33:21 -0700 Subject: ipv4: share tcp_v4_save_options() with cookie_v4_check() cookie_v4_check() allocates ip_options_rcu in the same way with tcp_v4_save_options(), we can just make it a helper function. Cc: Krzysztof Kolasa Cc: Eric Dumazet Signed-off-by: Cong Wang Signed-off-by: Cong Wang Acked-by: Eric Dumazet Signed-off-by: David S. Miller diff --git a/include/net/tcp.h b/include/net/tcp.h index 74efeda..869637a 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1666,4 +1666,24 @@ int tcpv4_offload_init(void); void tcp_v4_init(void); void tcp_init(void); +/* + * Save and compile IPv4 options, return a pointer to it + */ +static inline struct ip_options_rcu *tcp_v4_save_options(struct sk_buff *skb) +{ + const struct ip_options *opt = &TCP_SKB_CB(skb)->header.h4.opt; + struct ip_options_rcu *dopt = NULL; + + if (opt && opt->optlen) { + int opt_size = sizeof(*dopt) + opt->optlen; + + dopt = kmalloc(opt_size, GFP_ATOMIC); + if (dopt && __ip_options_echo(&dopt->opt, skb, opt)) { + kfree(dopt); + dopt = NULL; + } + } + return dopt; +} + #endif /* _TCP_H */ diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index 7e7401c..c68d0a1 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -317,15 +317,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb, /* We throwed the options of the initial SYN away, so we hope * the ACK carries the same options again (see RFC1122 4.2.3.8) */ - if (opt && opt->optlen) { - int opt_size = sizeof(struct ip_options_rcu) + opt->optlen; - - ireq->opt = kmalloc(opt_size, GFP_ATOMIC); - if (ireq->opt != NULL && __ip_options_echo(&ireq->opt->opt, skb, opt)) { - kfree(ireq->opt); - ireq->opt = NULL; - } - } + ireq->opt = tcp_v4_save_options(skb); if (security_inet_conn_request(sk, skb, req)) { reqsk_free(req); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 552e87e..6a2a7d6 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -880,26 +880,6 @@ bool tcp_syn_flood_action(struct sock *sk, } EXPORT_SYMBOL(tcp_syn_flood_action); -/* - * Save and compile IPv4 options into the request_sock if needed. - */ -static struct ip_options_rcu *tcp_v4_save_options(struct sk_buff *skb) -{ - const struct ip_options *opt = &TCP_SKB_CB(skb)->header.h4.opt; - struct ip_options_rcu *dopt = NULL; - - if (opt && opt->optlen) { - int opt_size = sizeof(*dopt) + opt->optlen; - - dopt = kmalloc(opt_size, GFP_ATOMIC); - if (dopt && __ip_options_echo(&dopt->opt, skb, opt)) { - kfree(dopt); - dopt = NULL; - } - } - return dopt; -} - #ifdef CONFIG_TCP_MD5SIG /* * RFC2385 MD5 checksumming requires a mapping of -- cgit v0.10.2 From 461b74c391c4ec9c766794e158508c357d8952e6 Mon Sep 17 00:00:00 2001 From: Cong Wang Date: Wed, 15 Oct 2014 14:33:22 -0700 Subject: ipv4: clean up cookie_v4_check() We can retrieve opt from skb, no need to pass it as a parameter. And opt should always be non-NULL, no need to check. Cc: Krzysztof Kolasa Cc: Eric Dumazet Tested-by: Krzysztof Kolasa Signed-off-by: Cong Wang Signed-off-by: Cong Wang Acked-by: Eric Dumazet Signed-off-by: David S. Miller diff --git a/include/net/tcp.h b/include/net/tcp.h index 869637a..3a4bbbf 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -468,8 +468,7 @@ void inet_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb); /* From syncookies.c */ int __cookie_v4_check(const struct iphdr *iph, const struct tcphdr *th, u32 cookie); -struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb, - struct ip_options *opt); +struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb); #ifdef CONFIG_SYN_COOKIES /* Syncookies use a monotonic timer which increments every 60 seconds. @@ -1674,7 +1673,7 @@ static inline struct ip_options_rcu *tcp_v4_save_options(struct sk_buff *skb) const struct ip_options *opt = &TCP_SKB_CB(skb)->header.h4.opt; struct ip_options_rcu *dopt = NULL; - if (opt && opt->optlen) { + if (opt->optlen) { int opt_size = sizeof(*dopt) + opt->optlen; dopt = kmalloc(opt_size, GFP_ATOMIC); diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c index c68d0a1..d346303 100644 --- a/net/ipv4/syncookies.c +++ b/net/ipv4/syncookies.c @@ -255,9 +255,9 @@ bool cookie_check_timestamp(struct tcp_options_received *tcp_opt, } EXPORT_SYMBOL(cookie_check_timestamp); -struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb, - struct ip_options *opt) +struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb) { + struct ip_options *opt = &TCP_SKB_CB(skb)->header.h4.opt; struct tcp_options_received tcp_opt; struct inet_request_sock *ireq; struct tcp_request_sock *treq; @@ -336,7 +336,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb, flowi4_init_output(&fl4, sk->sk_bound_dev_if, ireq->ir_mark, RT_CONN_FLAGS(sk), RT_SCOPE_UNIVERSE, IPPROTO_TCP, inet_sk_flowi_flags(sk), - (opt && opt->srr) ? opt->faddr : ireq->ir_rmt_addr, + opt->srr ? opt->faddr : ireq->ir_rmt_addr, ireq->ir_loc_addr, th->source, th->dest); security_req_classify_flow(req, flowi4_to_flowi(&fl4)); rt = ip_route_output_key(sock_net(sk), &fl4); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 6a2a7d6..94d1a77 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1408,7 +1408,7 @@ static struct sock *tcp_v4_hnd_req(struct sock *sk, struct sk_buff *skb) #ifdef CONFIG_SYN_COOKIES if (!th->syn) - sk = cookie_v4_check(sk, skb, &TCP_SKB_CB(skb)->header.h4.opt); + sk = cookie_v4_check(sk, skb); #endif return sk; } -- cgit v0.10.2 From 4062090e3e5caaf55bed4523a69f26c3265cc1d2 Mon Sep 17 00:00:00 2001 From: Vasily Averin Date: Wed, 15 Oct 2014 16:24:02 +0400 Subject: ipv4: dst_entry leak in ip_send_unicast_reply() ip_setup_cork() called inside ip_append_data() steals dst entry from rt to cork and in case errors in __ip_append_data() nobody frees stolen dst entry Fixes: 2e77d89b2fa8 ("net: avoid a pair of dst_hold()/dst_release() in ip_append_data()") Signed-off-by: Vasily Averin Acked-by: Eric Dumazet Signed-off-by: David S. Miller diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c index e35b712..88e5ef2 100644 --- a/net/ipv4/ip_output.c +++ b/net/ipv4/ip_output.c @@ -1535,6 +1535,7 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, struct sk_buff *nskb; struct sock *sk; struct inet_sock *inet; + int err; if (__ip_options_echo(&replyopts.opt.opt, skb, sopt)) return; @@ -1574,8 +1575,13 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, sock_net_set(sk, net); __skb_queue_head_init(&sk->sk_write_queue); sk->sk_sndbuf = sysctl_wmem_default; - ip_append_data(sk, &fl4, ip_reply_glue_bits, arg->iov->iov_base, len, 0, - &ipc, &rt, MSG_DONTWAIT); + err = ip_append_data(sk, &fl4, ip_reply_glue_bits, arg->iov->iov_base, + len, 0, &ipc, &rt, MSG_DONTWAIT); + if (unlikely(err)) { + ip_flush_pending_frames(sk); + goto out; + } + nskb = skb_peek(&sk->sk_write_queue); if (nskb) { if (arg->csumoffset >= 0) @@ -1587,7 +1593,7 @@ void ip_send_unicast_reply(struct net *net, struct sk_buff *skb, skb_set_queue_mapping(nskb, skb_get_queue_mapping(skb)); ip_push_pending_frames(sk, &fl4); } - +out: put_cpu_var(unicast_sock); ip_rt_put(rt); -- cgit v0.10.2 From 389f48947a5a37ea283de520abb742d42174edb0 Mon Sep 17 00:00:00 2001 From: Li RongQing Date: Fri, 17 Oct 2014 14:03:08 +0800 Subject: openvswitch: fix a use after free pskb_may_pull() called by arphdr_ok can change skb->data, so put the arp setting after arphdr_ok to avoid the use the freed memory Fixes: 0714812134d7d ("openvswitch: Eliminate memset() from flow_extract.") Cc: Jesse Gross Cc: Eric Dumazet Signed-off-by: Li RongQing Acked-by: Jesse Gross Signed-off-by: David S. Miller diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c index 62db02b..c5cfc72 100644 --- a/net/openvswitch/flow.c +++ b/net/openvswitch/flow.c @@ -557,10 +557,11 @@ static int key_extract(struct sk_buff *skb, struct sw_flow_key *key) } else if (key->eth.type == htons(ETH_P_ARP) || key->eth.type == htons(ETH_P_RARP)) { struct arp_eth_header *arp; + bool arp_available = arphdr_ok(skb); arp = (struct arp_eth_header *)skb_network_header(skb); - if (arphdr_ok(skb) && + if (arp_available && arp->ar_hrd == htons(ARPHRD_ETHER) && arp->ar_pro == htons(ETH_P_IP) && arp->ar_hln == ETH_ALEN && -- cgit v0.10.2 From 7a9f526fc3ee49b6034af2f243676ee0a27dcaa8 Mon Sep 17 00:00:00 2001 From: Li RongQing Date: Fri, 17 Oct 2014 14:06:16 +0800 Subject: vxlan: fix a free after use pskb_may_pull maybe change skb->data and make eth pointer oboslete, so eth needs to reload Fixes: 91269e390d062 ("vxlan: using pskb_may_pull as early as possible") Cc: Eric Dumazet Signed-off-by: Li RongQing Signed-off-by: David S. Miller diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index 77ab844..ca30982 100644 --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -1887,6 +1887,7 @@ static netdev_tx_t vxlan_xmit(struct sk_buff *skb, struct net_device *dev) msg->icmph.icmp6_type == NDISC_NEIGHBOUR_SOLICITATION) return neigh_reduce(dev, skb); } + eth = eth_hdr(skb); #endif } -- cgit v0.10.2 From f47de068f68db91b89e0d3335230d07e02da8727 Mon Sep 17 00:00:00 2001 From: Pravin B Shelar Date: Thu, 16 Oct 2014 21:55:45 -0700 Subject: openvswitch: Create right mask with disabled megaflows If megaflows are disabled, the userspace does not send the netlink attribute OVS_FLOW_ATTR_MASK, and the kernel must create an exact match mask. sw_flow_mask_set() sets every bytes (in 'range') of the mask to 0xff, even the bytes that represent padding for struct sw_flow, or the bytes that represent fields that may not be set during ovs_flow_extract(). This is a problem, because when we extract a flow from a packet, we do not memset() anymore the struct sw_flow to 0. This commit gets rid of sw_flow_mask_set() and introduces mask_set_nlattr(), which operates on the netlink attributes rather than on the mask key. Using this approach we are sure that only the bytes that the user provided in the flow are matched. Also, if the parse_flow_mask_nlattrs() for the mask ENCAP attribute fails, we now return with an error. This bug is introduced by commit 0714812134d7dcadeb7ecfbfeb18788aa7e1eaac ("openvswitch: Eliminate memset() from flow_extract"). Reported-by: Alex Wang Signed-off-by: Daniele Di Proietto Signed-off-by: Andy Zhou Signed-off-by: Pravin B Shelar Signed-off-by: David S. Miller diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c index 368f233..939bcb3 100644 --- a/net/openvswitch/flow_netlink.c +++ b/net/openvswitch/flow_netlink.c @@ -103,10 +103,19 @@ static void update_range__(struct sw_flow_match *match, SW_FLOW_KEY_MEMCPY_OFFSET(match, offsetof(struct sw_flow_key, field), \ value_p, len, is_mask) -static u16 range_n_bytes(const struct sw_flow_key_range *range) -{ - return range->end - range->start; -} +#define SW_FLOW_KEY_MEMSET_FIELD(match, field, value, is_mask) \ + do { \ + update_range__(match, offsetof(struct sw_flow_key, field), \ + sizeof((match)->key->field), is_mask); \ + if (is_mask) { \ + if ((match)->mask) \ + memset((u8 *)&(match)->mask->key.field, value,\ + sizeof((match)->mask->key.field)); \ + } else { \ + memset((u8 *)&(match)->key->field, value, \ + sizeof((match)->key->field)); \ + } \ + } while (0) static bool match_validate(const struct sw_flow_match *match, u64 key_attrs, u64 mask_attrs) @@ -809,13 +818,26 @@ static int ovs_key_from_nlattrs(struct sw_flow_match *match, u64 attrs, return 0; } -static void sw_flow_mask_set(struct sw_flow_mask *mask, - struct sw_flow_key_range *range, u8 val) +static void nlattr_set(struct nlattr *attr, u8 val, bool is_attr_mask_key) { - u8 *m = (u8 *)&mask->key + range->start; + struct nlattr *nla; + int rem; + + /* The nlattr stream should already have been validated */ + nla_for_each_nested(nla, attr, rem) { + /* We assume that ovs_key_lens[type] == -1 means that type is a + * nested attribute + */ + if (is_attr_mask_key && ovs_key_lens[nla_type(nla)] == -1) + nlattr_set(nla, val, false); + else + memset(nla_data(nla), val, nla_len(nla)); + } +} - mask->range = *range; - memset(m, val, range_n_bytes(range)); +static void mask_set_nlattr(struct nlattr *attr, u8 val) +{ + nlattr_set(attr, val, true); } /** @@ -836,6 +858,7 @@ int ovs_nla_get_match(struct sw_flow_match *match, { const struct nlattr *a[OVS_KEY_ATTR_MAX + 1]; const struct nlattr *encap; + struct nlattr *newmask = NULL; u64 key_attrs = 0; u64 mask_attrs = 0; bool encap_valid = false; @@ -882,18 +905,44 @@ int ovs_nla_get_match(struct sw_flow_match *match, if (err) return err; + if (match->mask && !mask) { + /* Create an exact match mask. We need to set to 0xff all the + * 'match->mask' fields that have been touched in 'match->key'. + * We cannot simply memset 'match->mask', because padding bytes + * and fields not specified in 'match->key' should be left to 0. + * Instead, we use a stream of netlink attributes, copied from + * 'key' and set to 0xff: ovs_key_from_nlattrs() will take care + * of filling 'match->mask' appropriately. + */ + newmask = kmemdup(key, nla_total_size(nla_len(key)), + GFP_KERNEL); + if (!newmask) + return -ENOMEM; + + mask_set_nlattr(newmask, 0xff); + + /* The userspace does not send tunnel attributes that are 0, + * but we should not wildcard them nonetheless. + */ + if (match->key->tun_key.ipv4_dst) + SW_FLOW_KEY_MEMSET_FIELD(match, tun_key, 0xff, true); + + mask = newmask; + } + if (mask) { err = parse_flow_mask_nlattrs(mask, a, &mask_attrs); if (err) - return err; + goto free_newmask; - if (mask_attrs & 1 << OVS_KEY_ATTR_ENCAP) { + if (mask_attrs & 1 << OVS_KEY_ATTR_ENCAP) { __be16 eth_type = 0; __be16 tci = 0; if (!encap_valid) { OVS_NLERR("Encap mask attribute is set for non-VLAN frame.\n"); - return -EINVAL; + err = -EINVAL; + goto free_newmask; } mask_attrs &= ~(1 << OVS_KEY_ATTR_ENCAP); @@ -904,10 +953,13 @@ int ovs_nla_get_match(struct sw_flow_match *match, mask_attrs &= ~(1 << OVS_KEY_ATTR_ETHERTYPE); encap = a[OVS_KEY_ATTR_ENCAP]; err = parse_flow_mask_nlattrs(encap, a, &mask_attrs); + if (err) + goto free_newmask; } else { OVS_NLERR("VLAN frames must have an exact match on the TPID (mask=%x).\n", ntohs(eth_type)); - return -EINVAL; + err = -EINVAL; + goto free_newmask; } if (a[OVS_KEY_ATTR_VLAN]) @@ -915,23 +967,22 @@ int ovs_nla_get_match(struct sw_flow_match *match, if (!(tci & htons(VLAN_TAG_PRESENT))) { OVS_NLERR("VLAN tag present bit must have an exact match (tci_mask=%x).\n", ntohs(tci)); - return -EINVAL; + err = -EINVAL; + goto free_newmask; } } err = ovs_key_from_nlattrs(match, mask_attrs, a, true); if (err) - return err; - } else { - /* Populate exact match flow's key mask. */ - if (match->mask) - sw_flow_mask_set(match->mask, &match->range, 0xff); + goto free_newmask; } if (!match_validate(match, key_attrs, mask_attrs)) - return -EINVAL; + err = -EINVAL; - return 0; +free_newmask: + kfree(newmask); + return err; } /** -- cgit v0.10.2 From f88e67149f97d73c704d6fe6f492edde97463025 Mon Sep 17 00:00:00 2001 From: Haiyang Zhang Date: Thu, 16 Oct 2014 14:47:58 -0700 Subject: hyperv: Add handling of IP header with option field in netvsc_set_hash() In case that the IP header has optional field at the end, this patch will get the port numbers after that field, and compute the hash. The general parser skb_flow_dissect() is used here. Signed-off-by: Haiyang Zhang Reviewed-by: K. Y. Srinivasan Signed-off-by: David S. Miller diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c index 0fcb5e7..9e17d1a 100644 --- a/drivers/net/hyperv/netvsc_drv.c +++ b/drivers/net/hyperv/netvsc_drv.c @@ -162,7 +162,7 @@ union sub_key { * data: network byte order * return: host byte order */ -static u32 comp_hash(u8 *key, int klen, u8 *data, int dlen) +static u32 comp_hash(u8 *key, int klen, void *data, int dlen) { union sub_key subk; int k_next = 4; @@ -176,7 +176,7 @@ static u32 comp_hash(u8 *key, int klen, u8 *data, int dlen) for (i = 0; i < dlen; i++) { subk.kb = key[k_next]; k_next = (k_next + 1) % klen; - dt = data[i]; + dt = ((u8 *)data)[i]; for (j = 0; j < 8; j++) { if (dt & 0x80) ret ^= subk.ka; @@ -190,26 +190,20 @@ static u32 comp_hash(u8 *key, int klen, u8 *data, int dlen) static bool netvsc_set_hash(u32 *hash, struct sk_buff *skb) { - struct iphdr *iphdr; + struct flow_keys flow; int data_len; - bool ret = false; - if (eth_hdr(skb)->h_proto != htons(ETH_P_IP)) + if (!skb_flow_dissect(skb, &flow) || flow.n_proto != htons(ETH_P_IP)) return false; - iphdr = ip_hdr(skb); + if (flow.ip_proto == IPPROTO_TCP) + data_len = 12; + else + data_len = 8; - if (iphdr->version == 4) { - if (iphdr->protocol == IPPROTO_TCP) - data_len = 12; - else - data_len = 8; - *hash = comp_hash(netvsc_hash_key, HASH_KEYLEN, - (u8 *)&iphdr->saddr, data_len); - ret = true; - } + *hash = comp_hash(netvsc_hash_key, HASH_KEYLEN, &flow, data_len); - return ret; + return true; } static u16 netvsc_select_queue(struct net_device *ndev, struct sk_buff *skb, -- cgit v0.10.2 From 1245dfc8cadb258386fcd27df38215a0eccb1f17 Mon Sep 17 00:00:00 2001 From: Li RongQing Date: Fri, 17 Oct 2014 16:53:23 +0800 Subject: ipv4: fix a potential use after free in ip_tunnel_core.c pskb_may_pull() maybe change skb->data and make eth pointer oboslete, so set eth after pskb_may_pull() Fixes:3d7b46cd("ip_tunnel: push generic protocol handling to ip_tunnel module") Cc: Pravin B Shelar Signed-off-by: Li RongQing Acked-by: Pravin B Shelar Signed-off-by: David S. Miller diff --git a/net/ipv4/ip_tunnel_core.c b/net/ipv4/ip_tunnel_core.c index f4c987b..88c386c 100644 --- a/net/ipv4/ip_tunnel_core.c +++ b/net/ipv4/ip_tunnel_core.c @@ -91,11 +91,12 @@ int iptunnel_pull_header(struct sk_buff *skb, int hdr_len, __be16 inner_proto) skb_pull_rcsum(skb, hdr_len); if (inner_proto == htons(ETH_P_TEB)) { - struct ethhdr *eh = (struct ethhdr *)skb->data; + struct ethhdr *eh; if (unlikely(!pskb_may_pull(skb, ETH_HLEN))) return -ENOMEM; + eh = (struct ethhdr *)skb->data; if (likely(ntohs(eh->h_proto) >= ETH_P_802_3_MIN)) skb->protocol = eh->h_proto; else -- cgit v0.10.2 From d8f00d27105a1553a13d4a96c3eb4544f70ca908 Mon Sep 17 00:00:00 2001 From: Li RongQing Date: Fri, 17 Oct 2014 16:53:47 +0800 Subject: ipv4: fix a potential use after free in fou.c pskb_may_pull() maybe change skb->data and make uh pointer oboslete, so reload uh and guehdr Fixes: 37dd0247 ("gue: Receive side for Generic UDP Encapsulation") Cc: Tom Herbert Signed-off-by: Li RongQing Signed-off-by: David S. Miller diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c index efa70ad..32e7892 100644 --- a/net/ipv4/fou.c +++ b/net/ipv4/fou.c @@ -87,6 +87,9 @@ static int gue_udp_recv(struct sock *sk, struct sk_buff *skb) if (!pskb_may_pull(skb, len)) goto drop; + uh = udp_hdr(skb); + guehdr = (struct guehdr *)&uh[1]; + if (guehdr->version != 0) goto drop; -- cgit v0.10.2 From 6cc69f2a404dea8641d6cf97c0fbe8d24579e259 Mon Sep 17 00:00:00 2001 From: hayeswang Date: Fri, 17 Oct 2014 16:55:08 +0800 Subject: r8152: return -EBUSY for runtime suspend Remove calling cancel_delayed_work_sync() for runtime suspend, because it would cause dead lock. Instead, return -EBUSY to avoid the device enters suspending if the net is running and the delayed work is pending or running. The delayed work would try to wake up the device later, so the suspending is not necessary. Signed-off-by: Hayes Wang Signed-off-by: David S. Miller diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index 864159e..e3d84c3 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -3189,31 +3189,39 @@ static void r8153_init(struct r8152 *tp) static int rtl8152_suspend(struct usb_interface *intf, pm_message_t message) { struct r8152 *tp = usb_get_intfdata(intf); + struct net_device *netdev = tp->netdev; + int ret = 0; mutex_lock(&tp->control); - if (PMSG_IS_AUTO(message)) + if (PMSG_IS_AUTO(message)) { + if (netif_running(netdev) && work_busy(&tp->schedule.work)) { + ret = -EBUSY; + goto out1; + } + set_bit(SELECTIVE_SUSPEND, &tp->flags); - else - netif_device_detach(tp->netdev); + } else { + netif_device_detach(netdev); + } - if (netif_running(tp->netdev)) { + if (netif_running(netdev)) { clear_bit(WORK_ENABLE, &tp->flags); usb_kill_urb(tp->intr_urb); - cancel_delayed_work_sync(&tp->schedule); tasklet_disable(&tp->tl); if (test_bit(SELECTIVE_SUSPEND, &tp->flags)) { rtl_stop_rx(tp); rtl_runtime_suspend_enable(tp, true); } else { + cancel_delayed_work_sync(&tp->schedule); tp->rtl_ops.down(tp); } tasklet_enable(&tp->tl); } - +out1: mutex_unlock(&tp->control); - return 0; + return ret; } static int rtl8152_resume(struct usb_interface *intf) -- cgit v0.10.2 From 70b33fb0ddec827cbbd14cdc664fc27b2ef4a6b6 Mon Sep 17 00:00:00 2001 From: Edward Cree Date: Fri, 17 Oct 2014 15:32:25 +0100 Subject: sfc: add support for skb->xmit_more Don't ring the doorbell, and don't do PIO. This will also prevent TX Push, because there will be more than one buffer waiting when the doorbell is rung. Signed-off-by: Edward Cree Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/sfc/nic.h b/drivers/net/ethernet/sfc/nic.h index 60f8514..f77cce0 100644 --- a/drivers/net/ethernet/sfc/nic.h +++ b/drivers/net/ethernet/sfc/nic.h @@ -71,9 +71,17 @@ efx_tx_desc(struct efx_tx_queue *tx_queue, unsigned int index) return ((efx_qword_t *) (tx_queue->txd.buf.addr)) + index; } -/* Report whether the NIC considers this TX queue empty, given the - * write_count used for the last doorbell push. May return false - * negative. +/* Get partner of a TX queue, seen as part of the same net core queue */ +static struct efx_tx_queue *efx_tx_queue_partner(struct efx_tx_queue *tx_queue) +{ + if (tx_queue->queue & EFX_TXQ_TYPE_OFFLOAD) + return tx_queue - EFX_TXQ_TYPE_OFFLOAD; + else + return tx_queue + EFX_TXQ_TYPE_OFFLOAD; +} + +/* Report whether this TX queue would be empty for the given write_count. + * May return false negative. */ static inline bool __efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue, unsigned int write_count) @@ -86,9 +94,18 @@ static inline bool __efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue, return ((empty_read_count ^ write_count) & ~EFX_EMPTY_COUNT_VALID) == 0; } -static inline bool efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue) +/* Decide whether we can use TX PIO, ie. write packet data directly into + * a buffer on the device. This can reduce latency at the expense of + * throughput, so we only do this if both hardware and software TX rings + * are empty. This also ensures that only one packet at a time can be + * using the PIO buffer. + */ +static inline bool efx_nic_may_tx_pio(struct efx_tx_queue *tx_queue) { - return __efx_nic_tx_is_empty(tx_queue, tx_queue->write_count); + struct efx_tx_queue *partner = efx_tx_queue_partner(tx_queue); + return tx_queue->piobuf && + __efx_nic_tx_is_empty(tx_queue, tx_queue->insert_count) && + __efx_nic_tx_is_empty(partner, partner->insert_count); } /* Decide whether to push a TX descriptor to the NIC vs merely writing @@ -96,6 +113,8 @@ static inline bool efx_nic_tx_is_empty(struct efx_tx_queue *tx_queue) * descriptor to an empty queue, but is otherwise pointless. Further, * Falcon and Siena have hardware bugs (SF bug 33851) that may be * triggered if we don't check this. + * We use the write_count used for the last doorbell push, to get the + * NIC's view of the tx queue. */ static inline bool efx_nic_may_push_tx_desc(struct efx_tx_queue *tx_queue, unsigned int write_count) diff --git a/drivers/net/ethernet/sfc/tx.c b/drivers/net/ethernet/sfc/tx.c index 3206098..ee84a90 100644 --- a/drivers/net/ethernet/sfc/tx.c +++ b/drivers/net/ethernet/sfc/tx.c @@ -132,15 +132,6 @@ unsigned int efx_tx_max_skb_descs(struct efx_nic *efx) return max_descs; } -/* Get partner of a TX queue, seen as part of the same net core queue */ -static struct efx_tx_queue *efx_tx_queue_partner(struct efx_tx_queue *tx_queue) -{ - if (tx_queue->queue & EFX_TXQ_TYPE_OFFLOAD) - return tx_queue - EFX_TXQ_TYPE_OFFLOAD; - else - return tx_queue + EFX_TXQ_TYPE_OFFLOAD; -} - static void efx_tx_maybe_stop_queue(struct efx_tx_queue *txq1) { /* We need to consider both queues that the net core sees as one */ @@ -344,6 +335,7 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb) struct efx_nic *efx = tx_queue->efx; struct device *dma_dev = &efx->pci_dev->dev; struct efx_tx_buffer *buffer; + unsigned int old_insert_count = tx_queue->insert_count; skb_frag_t *fragment; unsigned int len, unmap_len = 0; dma_addr_t dma_addr, unmap_addr = 0; @@ -351,7 +343,7 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb) unsigned short dma_flags; int i = 0; - EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count); + EFX_BUG_ON_PARANOID(tx_queue->write_count > tx_queue->insert_count); if (skb_shinfo(skb)->gso_size) return efx_enqueue_skb_tso(tx_queue, skb); @@ -369,9 +361,8 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb) /* Consider using PIO for short packets */ #ifdef EFX_USE_PIO - if (skb->len <= efx_piobuf_size && tx_queue->piobuf && - efx_nic_tx_is_empty(tx_queue) && - efx_nic_tx_is_empty(efx_tx_queue_partner(tx_queue))) { + if (skb->len <= efx_piobuf_size && !skb->xmit_more && + efx_nic_may_tx_pio(tx_queue)) { buffer = efx_enqueue_skb_pio(tx_queue, skb); dma_flags = EFX_TX_BUF_OPTION; goto finish_packet; @@ -439,13 +430,14 @@ finish_packet: netdev_tx_sent_queue(tx_queue->core_txq, skb->len); + efx_tx_maybe_stop_queue(tx_queue); + /* Pass off to hardware */ - efx_nic_push_buffers(tx_queue); + if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq)) + efx_nic_push_buffers(tx_queue); tx_queue->tx_packets++; - efx_tx_maybe_stop_queue(tx_queue); - return NETDEV_TX_OK; dma_err: @@ -458,7 +450,7 @@ finish_packet: dev_kfree_skb_any(skb); /* Work backwards until we hit the original insert pointer value */ - while (tx_queue->insert_count != tx_queue->write_count) { + while (tx_queue->insert_count != old_insert_count) { unsigned int pkts_compl = 0, bytes_compl = 0; --tx_queue->insert_count; buffer = __efx_tx_queue_get_insert_buffer(tx_queue); @@ -989,12 +981,13 @@ static int efx_tso_put_header(struct efx_tx_queue *tx_queue, /* Remove buffers put into a tx_queue. None of the buffers must have * an skb attached. */ -static void efx_enqueue_unwind(struct efx_tx_queue *tx_queue) +static void efx_enqueue_unwind(struct efx_tx_queue *tx_queue, + unsigned int insert_count) { struct efx_tx_buffer *buffer; /* Work backwards until we hit the original insert pointer value */ - while (tx_queue->insert_count != tx_queue->write_count) { + while (tx_queue->insert_count != insert_count) { --tx_queue->insert_count; buffer = __efx_tx_queue_get_insert_buffer(tx_queue); efx_dequeue_buffer(tx_queue, buffer, NULL, NULL); @@ -1258,13 +1251,14 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue, struct sk_buff *skb) { struct efx_nic *efx = tx_queue->efx; + unsigned int old_insert_count = tx_queue->insert_count; int frag_i, rc; struct tso_state state; /* Find the packet protocol and sanity-check it */ state.protocol = efx_tso_check_protocol(skb); - EFX_BUG_ON_PARANOID(tx_queue->write_count != tx_queue->insert_count); + EFX_BUG_ON_PARANOID(tx_queue->write_count > tx_queue->insert_count); rc = tso_start(&state, efx, skb); if (rc) @@ -1308,11 +1302,12 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue, netdev_tx_sent_queue(tx_queue->core_txq, skb->len); - /* Pass off to hardware */ - efx_nic_push_buffers(tx_queue); - efx_tx_maybe_stop_queue(tx_queue); + /* Pass off to hardware */ + if (!skb->xmit_more || netif_xmit_stopped(tx_queue->core_txq)) + efx_nic_push_buffers(tx_queue); + tx_queue->tso_bursts++; return NETDEV_TX_OK; @@ -1336,6 +1331,6 @@ static int efx_enqueue_skb_tso(struct efx_tx_queue *tx_queue, dma_unmap_single(&efx->pci_dev->dev, state.header_dma_addr, state.header_unmap_len, DMA_TO_DEVICE); - efx_enqueue_unwind(tx_queue); + efx_enqueue_unwind(tx_queue, old_insert_count); return NETDEV_TX_OK; } -- cgit v0.10.2 From 870c3151382c980590d4d609babf3b0243e7db93 Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Fri, 17 Oct 2014 09:17:20 -0700 Subject: ipv6: introduce tcp_v6_iif() Commit 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses") added a regression for SO_BINDTODEVICE on IPv6. This is because we still use inet6_iif() which expects that IP6 control block is still at the beginning of skb->cb[] This patch adds tcp_v6_iif() helper and uses it where necessary. Because __inet6_lookup_skb() is used by TCP and DCCP, we add an iif parameter to it. Signed-off-by: Eric Dumazet Fixes: 971f10eca186 ("tcp: better TCP_SKB_CB layout to reduce cache line misses") Acked-by: Cong Wang Signed-off-by: David S. Miller diff --git a/include/net/inet6_hashtables.h b/include/net/inet6_hashtables.h index ae06135..d1d2728 100644 --- a/include/net/inet6_hashtables.h +++ b/include/net/inet6_hashtables.h @@ -80,7 +80,8 @@ static inline struct sock *__inet6_lookup(struct net *net, static inline struct sock *__inet6_lookup_skb(struct inet_hashinfo *hashinfo, struct sk_buff *skb, const __be16 sport, - const __be16 dport) + const __be16 dport, + int iif) { struct sock *sk = skb_steal_sock(skb); @@ -90,7 +91,7 @@ static inline struct sock *__inet6_lookup_skb(struct inet_hashinfo *hashinfo, return __inet6_lookup(dev_net(skb_dst(skb)->dev), hashinfo, &ipv6_hdr(skb)->saddr, sport, &ipv6_hdr(skb)->daddr, ntohs(dport), - inet6_iif(skb)); + iif); } struct sock *inet6_lookup(struct net *net, struct inet_hashinfo *hashinfo, diff --git a/include/net/tcp.h b/include/net/tcp.h index 3a4bbbf..c9766f8 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -729,6 +729,15 @@ struct tcp_skb_cb { #define TCP_SKB_CB(__skb) ((struct tcp_skb_cb *)&((__skb)->cb[0])) + +/* This is the variant of inet6_iif() that must be used by TCP, + * as TCP moves IP6CB into a different location in skb->cb[] + */ +static inline int tcp_v6_iif(const struct sk_buff *skb) +{ + return TCP_SKB_CB(skb)->header.h6.iif; +} + /* Due to TSO, an SKB can be composed of multiple actual * packets. To keep these tracked properly, we use this. */ diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c index ad2acfe..6bcaa33 100644 --- a/net/dccp/ipv6.c +++ b/net/dccp/ipv6.c @@ -757,7 +757,8 @@ static int dccp_v6_rcv(struct sk_buff *skb) /* Step 2: * Look up flow ID in table and get corresponding socket */ sk = __inet6_lookup_skb(&dccp_hashinfo, skb, - dh->dccph_sport, dh->dccph_dport); + dh->dccph_sport, dh->dccph_dport, + inet6_iif(skb)); /* * Step 2: * If no socket ... diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c index 9a2838e..2a86a0f 100644 --- a/net/ipv6/syncookies.c +++ b/net/ipv6/syncookies.c @@ -214,7 +214,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb) /* So that link locals have meaning */ if (!sk->sk_bound_dev_if && ipv6_addr_type(&ireq->ir_v6_rmt_addr) & IPV6_ADDR_LINKLOCAL) - ireq->ir_iif = inet6_iif(skb); + ireq->ir_iif = tcp_v6_iif(skb); ireq->ir_mark = inet_request_mark(sk, skb); diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index cf2e45a..8314955 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -424,6 +424,7 @@ static void tcp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt, if (sock_owned_by_user(sk)) goto out; + /* Note : We use inet6_iif() here, not tcp_v6_iif() */ req = inet6_csk_search_req(sk, &prev, th->dest, &hdr->daddr, &hdr->saddr, inet6_iif(skb)); if (!req) @@ -738,7 +739,7 @@ static void tcp_v6_init_req(struct request_sock *req, struct sock *sk, /* So that link locals have meaning */ if (!sk->sk_bound_dev_if && ipv6_addr_type(&ireq->ir_v6_rmt_addr) & IPV6_ADDR_LINKLOCAL) - ireq->ir_iif = inet6_iif(skb); + ireq->ir_iif = tcp_v6_iif(skb); if (!TCP_SKB_CB(skb)->tcp_tw_isn && (ipv6_opt_accepted(sk, skb, &TCP_SKB_CB(skb)->header.h6) || @@ -860,7 +861,7 @@ static void tcp_v6_send_response(struct sk_buff *skb, u32 seq, u32 ack, u32 win, fl6.flowi6_proto = IPPROTO_TCP; if (rt6_need_strict(&fl6.daddr) && !oif) - fl6.flowi6_oif = inet6_iif(skb); + fl6.flowi6_oif = tcp_v6_iif(skb); else fl6.flowi6_oif = oif; fl6.flowi6_mark = IP6_REPLY_MARK(net, skb->mark); @@ -918,7 +919,7 @@ static void tcp_v6_send_reset(struct sock *sk, struct sk_buff *skb) sk1 = inet6_lookup_listener(dev_net(skb_dst(skb)->dev), &tcp_hashinfo, &ipv6h->saddr, th->source, &ipv6h->daddr, - ntohs(th->source), inet6_iif(skb)); + ntohs(th->source), tcp_v6_iif(skb)); if (!sk1) return; @@ -1000,13 +1001,14 @@ static struct sock *tcp_v6_hnd_req(struct sock *sk, struct sk_buff *skb) /* Find possible connection requests. */ req = inet6_csk_search_req(sk, &prev, th->source, &ipv6_hdr(skb)->saddr, - &ipv6_hdr(skb)->daddr, inet6_iif(skb)); + &ipv6_hdr(skb)->daddr, tcp_v6_iif(skb)); if (req) return tcp_check_req(sk, skb, req, prev, false); nsk = __inet6_lookup_established(sock_net(sk), &tcp_hashinfo, - &ipv6_hdr(skb)->saddr, th->source, - &ipv6_hdr(skb)->daddr, ntohs(th->dest), inet6_iif(skb)); + &ipv6_hdr(skb)->saddr, th->source, + &ipv6_hdr(skb)->daddr, ntohs(th->dest), + tcp_v6_iif(skb)); if (nsk) { if (nsk->sk_state != TCP_TIME_WAIT) { @@ -1090,7 +1092,7 @@ static struct sock *tcp_v6_syn_recv_sock(struct sock *sk, struct sk_buff *skb, newnp->ipv6_fl_list = NULL; newnp->pktoptions = NULL; newnp->opt = NULL; - newnp->mcast_oif = inet6_iif(skb); + newnp->mcast_oif = tcp_v6_iif(skb); newnp->mcast_hops = ipv6_hdr(skb)->hop_limit; newnp->rcv_flowinfo = ip6_flowinfo(ipv6_hdr(skb)); if (np->repflow) @@ -1174,7 +1176,7 @@ static struct sock *tcp_v6_syn_recv_sock(struct sock *sk, struct sk_buff *skb, skb_set_owner_r(newnp->pktoptions, newsk); } newnp->opt = NULL; - newnp->mcast_oif = inet6_iif(skb); + newnp->mcast_oif = tcp_v6_iif(skb); newnp->mcast_hops = ipv6_hdr(skb)->hop_limit; newnp->rcv_flowinfo = ip6_flowinfo(ipv6_hdr(skb)); if (np->repflow) @@ -1360,7 +1362,7 @@ ipv6_pktoptions: if (TCP_SKB_CB(opt_skb)->end_seq == tp->rcv_nxt && !((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN))) { if (np->rxopt.bits.rxinfo || np->rxopt.bits.rxoinfo) - np->mcast_oif = inet6_iif(opt_skb); + np->mcast_oif = tcp_v6_iif(opt_skb); if (np->rxopt.bits.rxhlim || np->rxopt.bits.rxohlim) np->mcast_hops = ipv6_hdr(opt_skb)->hop_limit; if (np->rxopt.bits.rxflow || np->rxopt.bits.rxtclass) @@ -1427,7 +1429,8 @@ static int tcp_v6_rcv(struct sk_buff *skb) TCP_SKB_CB(skb)->ip_dsfield = ipv6_get_dsfield(hdr); TCP_SKB_CB(skb)->sacked = 0; - sk = __inet6_lookup_skb(&tcp_hashinfo, skb, th->source, th->dest); + sk = __inet6_lookup_skb(&tcp_hashinfo, skb, th->source, th->dest, + tcp_v6_iif(skb)); if (!sk) goto no_tcp_socket; @@ -1514,7 +1517,7 @@ do_time_wait: sk2 = inet6_lookup_listener(dev_net(skb->dev), &tcp_hashinfo, &ipv6_hdr(skb)->saddr, th->source, &ipv6_hdr(skb)->daddr, - ntohs(th->dest), inet6_iif(skb)); + ntohs(th->dest), tcp_v6_iif(skb)); if (sk2 != NULL) { struct inet_timewait_sock *tw = inet_twsk(sk); inet_twsk_deschedule(tw, &tcp_death_row); @@ -1553,6 +1556,7 @@ static void tcp_v6_early_demux(struct sk_buff *skb) if (th->doff < sizeof(struct tcphdr) / 4) return; + /* Note : We use inet6_iif() here, not tcp_v6_iif() */ sk = __inet6_lookup_established(dev_net(skb->dev), &tcp_hashinfo, &hdr->saddr, th->source, &hdr->daddr, ntohs(th->dest), -- cgit v0.10.2 From 643566d4b47e2956110e79c0e6f65db9b9ea42c6 Mon Sep 17 00:00:00 2001 From: Jon Paul Maloy Date: Fri, 17 Oct 2014 15:25:28 -0400 Subject: tipc: fix bug in bundled buffer reception In commit ec8a2e5621db2da24badb3969eda7fd359e1869f ("tipc: same receive code path for connection protocol and data messages") we omitted the the possiblilty that an arriving message extracted from a bundle buffer may be a multicast message. Such messages need to be to be delivered to the socket via a separate function, tipc_sk_mcast_rcv(). As a result, small multicast messages arriving as members of a bundle buffer will be silently dropped. This commit corrects the error by considering this case in the function tipc_link_bundle_rcv(). Signed-off-by: Jon Maloy Signed-off-by: David S. Miller diff --git a/net/tipc/link.c b/net/tipc/link.c index 65410e1..1db162a 100644 --- a/net/tipc/link.c +++ b/net/tipc/link.c @@ -1924,7 +1924,12 @@ void tipc_link_bundle_rcv(struct sk_buff *buf) } omsg = buf_msg(obuf); pos += align(msg_size(omsg)); - if (msg_isdata(omsg) || (msg_user(omsg) == CONN_MANAGER)) { + if (msg_isdata(omsg)) { + if (unlikely(msg_type(omsg) == TIPC_MCAST_MSG)) + tipc_sk_mcast_rcv(obuf); + else + tipc_sk_rcv(obuf); + } else if (msg_user(omsg) == CONN_MANAGER) { tipc_sk_rcv(obuf); } else if (msg_user(omsg) == NAME_DISTRIBUTOR) { tipc_named_rcv(obuf); -- cgit v0.10.2 From b184e497f7fe2895b2175859e0cb21ae5d531555 Mon Sep 17 00:00:00 2001 From: Guenter Roeck Date: Fri, 17 Oct 2014 12:30:58 -0700 Subject: dsa: Fix conversion from host device to mii bus Commit b4d2394d01bc ("dsa: Replace mii_bus with a generic host device") replaces mii_bus with a generic host_dev, and introduces dsa_host_dev_to_mii_bus() to support conversion from host_dev to mii_bus. However, in some cases it uses to_mii_bus to perform that conversion. Since host_dev is not the phy bus device but typically a platform device, this fails and results in a crash with the affected drivers. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [] __mutex_lock_slowpath+0x75/0x100 PGD 406783067 PUD 406784067 PMD 0 Oops: 0002 [#1] SMP ... Call Trace: [] ? pick_next_task_fair+0x61b/0x880 [] mutex_lock+0x23/0x37 [] mdiobus_read+0x34/0x60 [] __mv88e6xxx_reg_read+0x8a/0xa0 [] mv88e6xxx_reg_read+0x4c/0xa0 Fixes: b4d2394d01bc ("dsa: Replace mii_bus with a generic host device") Cc: Alexander Duyck Signed-off-by: Guenter Roeck Acked-by: Alexander Duyck Acked-by: Florian Fainelli Signed-off-by: David S. Miller diff --git a/drivers/net/dsa/mv88e6060.c b/drivers/net/dsa/mv88e6060.c index 776e965..05b0ca3 100644 --- a/drivers/net/dsa/mv88e6060.c +++ b/drivers/net/dsa/mv88e6060.c @@ -21,8 +21,12 @@ static int reg_read(struct dsa_switch *ds, int addr, int reg) { - return mdiobus_read(to_mii_bus(ds->master_dev), - ds->pd->sw_addr + addr, reg); + struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev); + + if (bus == NULL) + return -EINVAL; + + return mdiobus_read(bus, ds->pd->sw_addr + addr, reg); } #define REG_READ(addr, reg) \ @@ -38,8 +42,12 @@ static int reg_read(struct dsa_switch *ds, int addr, int reg) static int reg_write(struct dsa_switch *ds, int addr, int reg, u16 val) { - return mdiobus_write(to_mii_bus(ds->master_dev), - ds->pd->sw_addr + addr, reg, val); + struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev); + + if (bus == NULL) + return -EINVAL; + + return mdiobus_write(bus, ds->pd->sw_addr + addr, reg, val); } #define REG_WRITE(addr, reg, val) \ diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c index d6f6428..a6c90cf 100644 --- a/drivers/net/dsa/mv88e6xxx.c +++ b/drivers/net/dsa/mv88e6xxx.c @@ -75,11 +75,14 @@ int __mv88e6xxx_reg_read(struct mii_bus *bus, int sw_addr, int addr, int reg) int mv88e6xxx_reg_read(struct dsa_switch *ds, int addr, int reg) { struct mv88e6xxx_priv_state *ps = ds_to_priv(ds); + struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev); int ret; + if (bus == NULL) + return -EINVAL; + mutex_lock(&ps->smi_mutex); - ret = __mv88e6xxx_reg_read(to_mii_bus(ds->master_dev), - ds->pd->sw_addr, addr, reg); + ret = __mv88e6xxx_reg_read(bus, ds->pd->sw_addr, addr, reg); mutex_unlock(&ps->smi_mutex); return ret; @@ -119,11 +122,14 @@ int __mv88e6xxx_reg_write(struct mii_bus *bus, int sw_addr, int addr, int mv88e6xxx_reg_write(struct dsa_switch *ds, int addr, int reg, u16 val) { struct mv88e6xxx_priv_state *ps = ds_to_priv(ds); + struct mii_bus *bus = dsa_host_dev_to_mii_bus(ds->master_dev); int ret; + if (bus == NULL) + return -EINVAL; + mutex_lock(&ps->smi_mutex); - ret = __mv88e6xxx_reg_write(to_mii_bus(ds->master_dev), - ds->pd->sw_addr, addr, reg, val); + ret = __mv88e6xxx_reg_write(bus, ds->pd->sw_addr, addr, reg, val); mutex_unlock(&ps->smi_mutex); return ret; -- cgit v0.10.2 From dc8e54165f1dc8ee946c953512a877676f8bbe3f Mon Sep 17 00:00:00 2001 From: Fabian Frederick Date: Fri, 17 Oct 2014 22:00:22 +0200 Subject: netrom: use linux/uaccess.h replace asm/uaccess.h by linux/uaccess.h Signed-off-by: Fabian Frederick Signed-off-by: David S. Miller diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c index 71cf1bf..1b06a1f 100644 --- a/net/netrom/af_netrom.c +++ b/net/netrom/af_netrom.c @@ -30,7 +30,7 @@ #include #include #include -#include +#include #include #include /* For TIOCINQ/OUTQ */ #include diff --git a/net/netrom/nr_dev.c b/net/netrom/nr_dev.c index 743262b..6ae063c 100644 --- a/net/netrom/nr_dev.c +++ b/net/netrom/nr_dev.c @@ -20,8 +20,8 @@ #include #include /* For the statistics structure. */ #include +#include -#include #include #include diff --git a/net/netrom/nr_in.c b/net/netrom/nr_in.c index c3073a2..80dbd0b 100644 --- a/net/netrom/nr_in.c +++ b/net/netrom/nr_in.c @@ -23,7 +23,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/net/netrom/nr_out.c b/net/netrom/nr_out.c index 0b4bcb2..00fbf14 100644 --- a/net/netrom/nr_out.c +++ b/net/netrom/nr_out.c @@ -22,7 +22,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/net/netrom/nr_route.c b/net/netrom/nr_route.c index b976d5e..96b64d2 100644 --- a/net/netrom/nr_route.c +++ b/net/netrom/nr_route.c @@ -25,7 +25,7 @@ #include #include #include -#include +#include #include #include /* For TIOCINQ/OUTQ */ #include diff --git a/net/netrom/nr_subr.c b/net/netrom/nr_subr.c index ca40e22..029c8bb 100644 --- a/net/netrom/nr_subr.c +++ b/net/netrom/nr_subr.c @@ -22,7 +22,7 @@ #include #include #include -#include +#include #include #include #include diff --git a/net/netrom/nr_timer.c b/net/netrom/nr_timer.c index ff2c1b1..94d05806 100644 --- a/net/netrom/nr_timer.c +++ b/net/netrom/nr_timer.c @@ -23,7 +23,7 @@ #include #include #include -#include +#include #include #include #include -- cgit v0.10.2 From 25ef1328a03c72a7285883d5b337c4b602476ecd Mon Sep 17 00:00:00 2001 From: Pravin B Shelar Date: Fri, 17 Oct 2014 13:56:31 -0700 Subject: openvswitch: Set flow-key members. This patch adds missing memset which are required to initialize flow key member. For example for IP flow we need to initialize ip.frag for all cases. Found by inspection. This bug is introduced by commit 0714812134d7dcadeb7ecfbfeb18788aa7e1eaac ("openvswitch: Eliminate memset() from flow_extract"). Signed-off-by: Pravin B Shelar Signed-off-by: David S. Miller diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c index c5cfc72..2b78789 100644 --- a/net/openvswitch/flow.c +++ b/net/openvswitch/flow.c @@ -274,6 +274,8 @@ static int parse_ipv6hdr(struct sk_buff *skb, struct sw_flow_key *key) key->ip.frag = OVS_FRAG_TYPE_LATER; else key->ip.frag = OVS_FRAG_TYPE_FIRST; + } else { + key->ip.frag = OVS_FRAG_TYPE_NONE; } nh_len = payload_ofs - nh_ofs; @@ -358,6 +360,7 @@ static int parse_icmpv6(struct sk_buff *skb, struct sw_flow_key *key, */ key->tp.src = htons(icmp->icmp6_type); key->tp.dst = htons(icmp->icmp6_code); + memset(&key->ipv6.nd, 0, sizeof(key->ipv6.nd)); if (icmp->icmp6_code == 0 && (icmp->icmp6_type == NDISC_NEIGHBOUR_SOLICITATION || @@ -674,9 +677,6 @@ int ovs_flow_key_extract(struct ovs_tunnel_info *tun_info, key->ovs_flow_hash = 0; key->recirc_id = 0; - /* Flags are always used as part of stats */ - key->tp.flags = 0; - return key_extract(skb, key); } -- cgit v0.10.2 From a28205437b41a2c1333c1599ce1e8f09af7b00d6 Mon Sep 17 00:00:00 2001 From: Florian Fainelli Date: Fri, 17 Oct 2014 16:02:13 -0700 Subject: net: dsa: add includes for ethtool and phy_fixed definitions net/dsa/slave.c uses functions and structures declared in phy_fixed.h but does not explicitely include it, while dsa.h needs structure declarations for 'struct ethtool_wolinfo' and 'struct ethtool_eee', fix those by including the correct header files. Fixes: ec9436baedb6 ("net: dsa: allow drivers to do link adjustment") Fixes: ce31b31c68e7 ("net: dsa: allow updating fixed PHY link information") Signed-off-by: Florian Fainelli Signed-off-by: David S. Miller diff --git a/include/net/dsa.h b/include/net/dsa.h index 58ad8c6..b765592 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -18,6 +18,7 @@ #include #include #include +#include enum dsa_tag_protocol { DSA_TAG_PROTO_NONE = 0, diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 8030489..a851e9f 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include "dsa_priv.h" -- cgit v0.10.2 From f2d9da1a8375cbe53df5b415d059429013a3a79f Mon Sep 17 00:00:00 2001 From: Eric Dumazet Date: Fri, 17 Oct 2014 12:45:55 -0700 Subject: bna: fix skb->truesize underestimation skb->truesize is not meant to be tracking amount of used bytes in an skb, but amount of reserved/consumed bytes in memory. For instance, if we use a single byte in last page fragment, we have to account the full size of the fragment. skb->truesize can be very different from skb->len, that has a very specific safety purpose. Signed-off-by: Eric Dumazet Cc: Rasesh Mody Signed-off-by: David S. Miller diff --git a/drivers/net/ethernet/brocade/bna/bnad.c b/drivers/net/ethernet/brocade/bna/bnad.c index 153cafa..c3861de 100644 --- a/drivers/net/ethernet/brocade/bna/bnad.c +++ b/drivers/net/ethernet/brocade/bna/bnad.c @@ -552,6 +552,7 @@ bnad_cq_setup_skb_frags(struct bna_rcb *rcb, struct sk_buff *skb, len = (vec == nvecs) ? last_fraglen : unmap->vector.len; + skb->truesize += unmap->vector.len; totlen += len; skb_fill_page_desc(skb, skb_shinfo(skb)->nr_frags, @@ -563,7 +564,6 @@ bnad_cq_setup_skb_frags(struct bna_rcb *rcb, struct sk_buff *skb, skb->len += totlen; skb->data_len += totlen; - skb->truesize += totlen; } static inline void -- cgit v0.10.2