Four milliseconds is how long a thread can occupy a processor in MS SQL Server. It will be there for a shorter time if it completes its operation or needs a resource. Completion is trivial: the thread goes away, or rather the results are sent someplace <grin>. If the thread needs something, like data, its state changes from running to suspended and the thread is moved to the waiter list. When it is signaled that the resource is available, the thread is moved from waiting to runnable and is queued by the scheduler (the “logical” CPU to SQLOS). At the right time and place (time, since the place is preordained…), the thread goes back to the processor and runs: to completion, until another resource is needed, or until the quantum (4 ms) is exhausted. In the case of exhaustion, it is moved off the processor by the scheduler, which is run by SQLOS, in a process called “non-preemptive” or “cooperative” scheduling.
If the quantum runs out before the task is completed but the data it needs is available, the thread can move directly to runnable instead of waiting, since it is not waiting for a resource and only needs more time on the processor.
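The state changes described above can be sketched as a toy simulation. This is not how SQLOS is implemented; it is a minimal Python model of one scheduler, and the names `Worker`, `simulate`, `states_of`, and the rule that a resource shows up one turn after it is requested are all assumptions made up for the illustration. Only the 4 ms quantum and the running/runnable/suspended states come from the text.

```python
from collections import deque
from dataclasses import dataclass

QUANTUM_MS = 4  # the quantum quoted in the text


@dataclass
class Worker:
    name: str
    # Hypothetical workload: CPU ms needed before each resource wait.
    # Must contain at least one entry.
    cpu_bursts: list


def simulate(workers):
    """Return a trace of (worker name, new state) transitions on one scheduler."""
    remaining = {w.name: list(w.cpu_bursts) for w in workers}
    runnable = deque(w.name for w in workers)  # runnable queue, first in first out
    waiters = []                               # waiter list (suspended workers)
    trace = []

    while runnable or waiters:
        # Assumption: a resource becomes available one turn after it was requested.
        signaled, waiters = waiters, []
        if runnable:
            name = runnable.popleft()
            trace.append((name, "RUNNING"))
            bursts = remaining[name]
            bursts[0] -= min(bursts[0], QUANTUM_MS)
            if bursts[0] > 0:
                # Quantum exhausted, no resource needed: straight back to runnable.
                trace.append((name, "RUNNABLE"))
                runnable.append(name)
            else:
                bursts.pop(0)
                if bursts:
                    # More work remains, but a resource is needed first.
                    trace.append((name, "SUSPENDED"))
                    waiters.append(name)
                else:
                    trace.append((name, "DONE"))
        # Signaled waiters rejoin the back of the runnable queue.
        for name in signaled:
            trace.append((name, "RUNNABLE"))
            runnable.append(name)
    return trace


def states_of(trace, name):
    """Pick out the sequence of states one worker went through."""
    return [state for who, state in trace if who == name]


if __name__ == "__main__":
    # A needs 6 ms of CPU and no resource; B needs 2 ms, a resource, then 3 ms.
    trace = simulate([Worker("A", [6]), Worker("B", [2, 3])])
    for who in ("A", "B"):
        print(who, states_of(trace, who))
```

In this sketch worker A never touches the waiter list: it uses up its quantum, goes back to runnable, and finishes on its second turn. Worker B gives up the processor early because it needs a resource, so it passes through suspended before it is runnable again.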
If a T-SQL task is parallelized, it will consist of multiple threads, which are drawn from a thread pool. The task's threads may not all complete at the same time even though they are running on different processors (we did say parallel, right?), and the time a thread that finishes early spends waiting for the last thread shows up as a CXPACKET wait. There may be n-1 CXPACKET waits for a task assuming a MAXDOP of n (all waiting for the last/slowest thread to complete).
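The n-1 arithmetic can be shown with a few lines of Python. This is a simplified model only: it assumes each parallel thread just idles until the slowest one finishes, and the function names and the sample durations are invented for the example. The real wait accounting inside SQL Server is more involved.

```python
def cxpacket_waits(thread_ms):
    """Idle time (ms) of each parallel thread until the slowest one finishes."""
    slowest = max(thread_ms)
    return [slowest - ms for ms in thread_ms]


def waiting_threads(thread_ms):
    """Number of threads that had to wait at all (at most n-1 for n threads)."""
    return sum(1 for wait in cxpacket_waits(thread_ms) if wait > 0)


if __name__ == "__main__":
    durations = [5, 8, 6, 12]  # hypothetical run times for a MAXDOP of 4
    print(cxpacket_waits(durations))
    print(waiting_threads(durations))
```

With four threads and one clear straggler, three threads end up waiting, which is the n-1 case from the paragraph above. If two threads tie for slowest, fewer than n-1 wait.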
Running (on processor\cpu\scheduler, single or parallel)
Runnable (runnable queue)
Waiting (waiting list)
worker pool with workers
Running worker (one per scheduler)