Article 14262 of comp.os.ms-windows.programmer.nt.kernel-mode:

This thread worked its way to Dave Cutler. Someone asked me to post his
answer as I am somewhat active in the newsgroup. Here is his verbatim
response:

From: David Cutler
Sent: Friday, January 16, 1998 10:52 AM
To: Darryl Havens
Subject: RE: From Newsgroup: Bug in KeInsertQueueDpc - Our investigation
seems to confirm this

I answered this before and told them - by design.

We are aware of this and we are not going to change the design. The reason
it is the way it is is to avoid a single lock and to have distributed dpcs
that can be targeted. What this mail proposes is a single lock that
everyone would have to get behind or another lock associated with the dpc
object itself.

We don't expect that someone will try to queue exactly the same dpc on
different processors simultaneously. We have never seen this problem
before. If someone tried to complete the same i/o packet simultaneously on
two different processors it will fail with horrible consequences as well.

We do not intend to do anything about this. They have found a solution in
their driver as described below by acquiring another spinlock before
attempting to insert their dpc.

d

Jim McCollum wrote in message <69lhm3$q2r$1@decius.ultra.net>...
>I've been getting IRQL_NOT_LESS_OR_EQUAL bugchecks out of the NT kernel
>that, after lots of debugging, appears to me to be a bug in
>KeInsertQueueDpc.
>
>Analysis of the crash shows that the DPC queue, the header of which is
>located in the processor control region (PCR), is corrupted. The nature of
>the corruption is that a DPC has a forward link on the DPC queue in one
>processor's PCR while the backward link is linked to the PCR DPC queue of
>another processor.
>
>I spent a great deal of time tracking this down, including unassembling the
>code in KeInsertQueueDpc, and I've managed to convince myself that
>KeInsertQueueDpc is not SMP safe. Here's what happens.
>Two threads running on separate processors make simultaneous calls to
>KeInsertQueueDpc specifying the same DPC object. Because the DPC is not
>targeted to a specific processor, each thread attempts to queue the object
>to the local processor's DPC queue. KeInsertQueueDpc performs the following
>steps (this is slightly simplified):
>
>1) Raises IRQL to HIGH_LEVEL (31).
>2) Acquires a spinlock which is located in the PCR (hence, each thread is
>acquiring a *different* spinlock).
>3) Manipulates the flink/blink in the DPC object to place it on the local
>processor's DPC queue.
>4) Requests a software interrupt (to force processing of the DPC queues).
>5) Releases the spinlock acquired in (2).
>6) Restores IRQL.
>7) Returns to the caller.
>
>Because the spinlock acquired in step (2) is located in the PCR, each
>thread acquires a different spinlock. They then proceed to step 3,
>simultaneously attempting to manipulate the list entry in the DPC object,
>corrupting its links. Both threads then release their respective spinlocks,
>lower IRQL and exit KeInsertQueueDpc, leaving behind corrupted processor
>DPC queues. When NT later attempts to retire the DPC and remove it from the
>queue, it stumbles over the corrupted links and crashes. My driver is
>nowhere near the stack, but its DPC object is always implicated in the
>resulting corrupted DPC queues.
>
>After staring at this code, it became clear that while the spinlock in step
>(2) above protects the DPC queue in the PCR, the DPC object itself is not
>protected. I am able to work around the crash by associating a spinlock
>with my DPC object and acquiring it before calling KeInsertQueueDpc. This
>prevents multiple threads from going through KeInsertQueueDpc
>simultaneously for the same DPC object, and the crash disappeared.
>
>I distilled the code from my driver and wrote a small driver with only a few
>routines which will crash an SMP system in a matter of minutes, if not
>seconds.
>All this driver does is start up a bunch of threads, which do nothing but
>sit in a loop and periodically call KeInsertQueueDpc, specifying a dummy
>DPC routine. I then applied the above workaround and sure enough the
>crashes went away.
>
>Has anyone else seen these crashes? I'm running 4.0 with service pack 3.
>While I am able to prevent these crashes out of my driver with the
>workaround, I'm concerned that other NT drivers may unknowingly be
>susceptible to this problem.
>
>I'll include the source code for the driver which will crash an SMP system
>in a reply to this entry.
>
>Thanks,
>Jim McCollum
>Marathon Technologies Corporation
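
For readers who want to see the failure mode without an SMP box, the
following is a minimal user-mode sketch in plain C, not kernel code and not
the actual KeInsertQueueDpc implementation. It replays one possible
interleaving of step (3) from the quoted analysis on a single thread, then
shows the effect of serializing inserts per DPC object. Every name in it
(list_entry, racy_double_insert, guarded_dpc, guarded_insert,
list_is_consistent, run_demo) is hypothetical. The `queued` flag is an
assumption standing in for two things at once: the per-object spinlock of
the workaround, and the kernel rejecting a DPC that is already queued.

```c
#include <assert.h>
#include <stdbool.h>

/* Doubly linked list node, shaped like NT's LIST_ENTRY (stand-in only). */
typedef struct list_entry {
    struct list_entry *flink;   /* forward link  */
    struct list_entry *blink;   /* backward link */
} list_entry;

/* Hypothetical DPC stand-in: list linkage plus an "already queued" flag. */
typedef struct guarded_dpc {
    list_entry link;
    bool queued;                /* a real driver guards this with a spinlock */
} guarded_dpc;

static void list_init(list_entry *head)
{
    head->flink = head;
    head->blink = head;
}

/* Walks forward from head and checks that every flink has a matching blink. */
static bool list_is_consistent(const list_entry *head)
{
    const list_entry *node = head;
    for (int hops = 0; hops < 16; hops++) {
        if (node->flink->blink != node)
            return false;
        node = node->flink;
        if (node == head)
            return true;
    }
    return false;               /* never returned to head: treat as corrupt */
}

/* Replays one bad interleaving: two CPUs tail-insert the SAME entry, each
 * holding only its own queue lock, so nothing orders the writes to dpc. */
static void racy_double_insert(list_entry *q0, list_entry *q1, list_entry *dpc)
{
    list_entry *tail0 = q0->blink;  /* CPU 0 reads its queue tail   */
    list_entry *tail1 = q1->blink;  /* CPU 1 reads its queue tail   */
    dpc->flink = q0;                /* CPU 0 writes the forward link */
    dpc->flink = q1;                /* CPU 1 overwrites it           */
    dpc->blink = tail1;             /* CPU 1 writes the back link    */
    dpc->blink = tail0;             /* CPU 0 overwrites it           */
    tail0->flink = dpc;             /* CPU 0 finishes its insert     */
    q0->blink = dpc;
    tail1->flink = dpc;             /* CPU 1 finishes its insert     */
    q1->blink = dpc;
}

/* Serialized insert: the second attempt sees the flag and backs off. */
static bool guarded_insert(list_entry *queue, guarded_dpc *dpc)
{
    /* A real driver would acquire its per-DPC spinlock here. */
    if (dpc->queued)
        return false;
    list_entry *tail = queue->blink;
    dpc->link.flink = queue;
    dpc->link.blink = tail;
    tail->flink = &dpc->link;
    queue->blink = &dpc->link;
    dpc->queued = true;
    /* ...and release the spinlock here. */
    return true;
}

/* Runs both scenarios; returns 0 if they behave as described. */
int run_demo(void)
{
    list_entry q0, q1, dpc;
    list_init(&q0);
    list_init(&q1);
    racy_double_insert(&q0, &q1, &dpc);
    /* Forward link sits on one queue, backward link on the other. */
    assert(dpc.flink == &q1 && dpc.blink == &q0);
    assert(!list_is_consistent(&q0) && !list_is_consistent(&q1));

    guarded_dpc g = { .queued = false };
    list_init(&q0);
    list_init(&q1);
    assert(guarded_insert(&q0, &g));    /* first insert wins        */
    assert(!guarded_insert(&q1, &g));   /* second one is turned away */
    assert(list_is_consistent(&q0) && list_is_consistent(&q1));
    return 0;
}
```

Calling run_demo() from a main() exercises both paths. The racy replay
leaves the entry with its forward link on one queue and its backward link
on the other, which is the corruption pattern described in the quoted post;
the guarded path leaves both queues well formed.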