From 4f27c00bf80f122513d3a5be16ed851573164534 Mon Sep 17 00:00:00 2001
From: Alan Cox
Date: Sun, 15 Jul 2007 23:40:55 -0700
Subject: Improve behaviour of spurious IRQ detect

Currently we handle spurious IRQ activity based upon seeing a lot of
invalid interrupts, and we clear things back on the basis of lots of valid
interrupts.

Unfortunately in some cases you get legitimate invalid interrupts caused by
timing asynchronicity between the PCI bus and the APIC bus when disabling
interrupts and pulling other tricks.  In this case, although the spurious
IRQs are not a problem, our unhandled counters didn't clear and they act as
a slow running timebomb.  (This is effectively what showed up as the serial
port/tty problem, which was fixed by clearing counters when registering a
handler.)

It's easy enough to add a second parameter - time.  This means that if we
see a regular stream of harmless spurious interrupts which are not harming
processing we don't go off and do something stupid like disable the IRQ
after a month of running.
OTOH lockups and performance killers show up a lot more than 10/second.

[akpm@linux-foundation.org: cleanup]
Signed-off-by: Alan Cox
Cc: Ingo Molnar
Cc: Thomas Gleixner
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 1695054..4465719 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -161,6 +161,7 @@ struct irq_desc {
 	unsigned int		wake_depth;	/* nested wake enables */
 	unsigned int		irq_count;	/* For detecting broken IRQs */
 	unsigned int		irqs_unhandled;
+	unsigned long		last_unhandled;	/* Aging timer for unhandled count */
 	spinlock_t		lock;
 #ifdef CONFIG_SMP
 	cpumask_t		affinity;
diff --git a/kernel/irq/spurious.c b/kernel/irq/spurious.c
index bd9e272..32b1619 100644
--- a/kernel/irq/spurious.c
+++ b/kernel/irq/spurious.c
@@ -172,7 +172,17 @@ void note_interrupt(unsigned int irq, struct irq_desc *desc,
 		    irqreturn_t action_ret)
 {
 	if (unlikely(action_ret != IRQ_HANDLED)) {
-		desc->irqs_unhandled++;
+		/*
+		 * If we are seeing only the odd spurious IRQ caused by
+		 * bus asynchronicity then don't eventually trigger an error,
+		 * otherwise the counter becomes a doomsday timer for otherwise
+		 * working systems
+		 */
+		if (jiffies - desc->last_unhandled > HZ/10)
+			desc->irqs_unhandled = 1;
+		else
+			desc->irqs_unhandled++;
+		desc->last_unhandled = jiffies;
 		if (unlikely(action_ret != IRQ_NONE))
 			report_bad_irq(irq, desc, action_ret);
 	}
-- 
cgit v0.10.2