D. Kalinsky Associates
Technical Paper :

"Queueing Theory  for Embedded Systems Designers"
Very often, an embedded systems or software engineer will need to figure out the maximum number of messages that will queue up at a given message queue in a system that's being designed. Many RTOSs need this number in order to create the queue. Other designers may be concerned with queueing delays, especially in "hard" real-time systems. If I'm asked about these issues, I tend to answer, “Oh, that’s just queuing theory”. And invariably I’ll hear an uncomfortable, nervous sort of laughter from the embedded engineers. They're thinking: “Right! I’m gonna stop my project in order to take a course on queueing theory, just for this program’s message queue. What’s this guy been smoking?”

Similar questions come up about the capacity of linked lists and ring buffers, or the number of buffers in a buffer pool,
or even the assignment of dual-port RAM to buffer descriptors inside of modern communications processors. These
questions impact directly the amount of memory that must be allocated to the resource at issue, and the capacity of an
embedded system to handle the varying ebb and flow of data.

Sometimes engineers use empirical methods to try to estimate the required capacity.    In one method, they divvy up
all available RAM in their development target board, assigning it to their stacks, their queues, their linked lists, and
buffer pools.  Hopefully the hardware team has supplied sufficient RAM that there’s more than enough for all of these
structures.  Before running the application software, test software “paints” these structures with a standard but
improbable content; for example patterns of
0xAAAAAAAA , 0x55555555 or 0xDEADDEAD.  Then the application is run
for a while; and later the structures are examined to see how much of the standard “paint” is left.  The area where
“paint” remains presumably has not been used by the application, and so most of it can probably be eliminated.

In another empirical technique, many RTOS-aware debuggers, profilers and ‘object browsers’ can measure the maximum number of messages in a queue at any time during an experimental run of application software. Some will even show you graphs of queue filling as a function of time. Similar graphs can be obtained for stacks and buffer pools. The high point of the graph will tell you how much of the capacity your software actually used. But is this number a good measure of the required capacity for a given message queue or stack or buffer pool? Perhaps not: seconds or minutes after the end of the experimental run, the software's memory needs might have increased dramatically had it been allowed to continue to run. And this information would not be captured by any empirical method.

So empirical methods cannot always be relied upon, even if ‘fudge factors’, like adding 30% extra capacity, are applied.

A third approach, based on mathematics rather than empirical tests, may give us a better handle on the answers to our capacity questions. This paper will present a simplified peek into queuing calculations: perhaps just enough to answer some of our software design questions, and to let us know whether more queuing theory is the way to get more questions answered.

Software message queues will be used as our example here, although similar considerations could apply to linked
lists, buffer pools, ring buffers and stacks.


A message queue is a set of memory locations that provide temporary storage for data that are being passed from
task to task (or to/from interrupt service software).  Each set of data is called a ‘message’.   When a new message is
stored but there are already messages in the queue, the old messages remain and the new message is added to
them in the queue.  When a message is taken from the queue, no memory of that message remains in the queue.  
Computer science gurus call this “Non-Destructive Write / Destructive Read”.
Figure 1: Message Queue Analogy - Olive Oil Bottling
Let’s use a simplified model of a queue that uses the analogy of an olive oil bottling plant. A funnel is used to pour
olive oil into bottles.  The funnel has a wide opening at its top, so that it can hold lots of olive oil (or software
messages by analogy).  The funnel has a narrow opening at its bottom, so that the rate of flow of olive oil out of the
funnel is limited (by analogy to a software process that can receive and process messages slowly).  See Figure 1.

The rate at which olive oil is poured into the funnel (or the rate at which messages are sent into the message queue) is named Pr(t), the Production Rate. The rate at which olive oil comes out of the bottom of the funnel (or the rate at which messages are consumed from the message queue) is named Cr(t), the Consumption Rate. Pr(t) and Cr(t) are functions of the time t; in other words, the flows in and out can vary over time.


If the Production Rate is always less than the Consumption Rate, Pr(t) < Cr(t), then everything put into the
funnel or queue always flows out immediately.  No backup of olive oil (or queue of messages) ever develops.  Thus,
an olive oil funnel or message queue is unnecessary in this case.

On the other hand, if the Production Rate is always larger than the Consumption Rate, Pr(t) > Cr(t), then there’s always more being put in than can be taken out. If you wait a while, olive oil will fill up the entire funnel and begin to spill out of the top. By analogy, messages will fill up the entire message queue and will have no additional place to be stored in the message queue.
stored in the message queue.  No matter how big the capacity of the funnel (or the queue), eventually all of its capacity
will be filled up.  In this situation, a message queue will be of no use since it will eventually overflow and messages
will be lost, even if it has huge capacity.   See Figure 2.
Figure 2:  Funnel overflows if consistently filled
at a greater rate than it can empty.
So in fact it sounds as if olive oil funnels and message queues are not awfully useful. And in fact both are useful in only one limited situation: an olive oil funnel or a message queue makes sense only if the Production Rate is greater than the Consumption Rate for a short time T, that is, Pr(t) > Cr(t) only for times t within T. This short time T may be called a ‘burst’. The idea is that the funnel or queue can absorb only a temporary excess of production over consumption during an occasional burst. And so a funnel or a queue is only useful if the Production Rate is greater than the Consumption Rate for a limited burst time only. These bursts should be spread out in time.

Many software designers forget this limitation of message queues and other data buffering schemes, and so they
sometimes tend to over-use them.  Please ask yourself if your queues or ring buffers or buffer pools will be used to
store information that is arriving in bursts.  Because otherwise they will fail (by explosion) if asked to store information
that will relentlessly be arriving at a faster pace than it can be handled.


We can begin with the simplification of making the Consumption Rate ‘Cr’ and the Production Rate ‘Pr’ be constants. Of course, the constant Production Rate must be limited to burst time T. For now, let’s keep it at zero at other times.

At the beginning of a burst, the message queue should be empty.  During the burst, messages are produced and put
into the queue; and at the same time some messages are taken out of the queue and consumed.  Since everything is
linear in this simple situation, the number of messages in the queue gradually grows as the burst proceeds. So, by
the end of the burst:
Pr*T messages have been produced and put into the queue, and Cr*T messages have been taken and consumed from the queue. So the number of messages remaining in the queue at the end of the burst is L = (Pr - Cr)*T.

And since the queue gets longer and longer as time advances through the burst, the length of the queue at the end of the burst is the maximum length of the queue during the burst. So L = (Pr - Cr)*T is the maximum length of the queue. In other words, it is the required size of the queue.

But there’s a "gotcha" here. We’re not done with our constant-pace situation yet! If another burst begins shortly after the end of the first burst, the funnel will overflow, or the message queue will not have capacity for the second burst. In fact the formula above is only valid if a new burst begins after the funnel/queue has been totally emptied of content from the previous burst. This will take some time; we’ll call it the Emptying time D, or required Dead time. We can calculate it as follows:

At the end of the burst, the Production Rate ‘Pr’ goes to zero, but the Consumption Rate ‘Cr’ continues at its constant value. There are L = (Pr - Cr)*T messages in the queue at that time. So the emptying time D = L/Cr, or D = (Pr/Cr - 1)*T.

The emptying time D could actually be much longer than the burst time, if the Pr/Cr ratio is big. It is a ‘quiet time’ that should be given to the message queue to allow it to recover and get ready for a new burst.
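As a sanity check on these two formulas, here is a small Python sketch (not part of the original article; the rates and burst length are arbitrary made-up numbers) that compares the closed-form results with a brute-force, time-slice count of the same burst:

```python
# Compare the closed-form burst formulas L = (Pr - Cr)*T and D = (Pr/Cr - 1)*T
# with a step-by-step simulation.  Pr, Cr and T below are invented values.

def closed_form(pr, cr, t):
    peak = (pr - cr) * t          # queue length at the end of the burst
    dead = (pr / cr - 1.0) * t    # time needed to drain the queue afterward
    return peak, dead

def simulate(pr, cr, t, dt=0.0005):
    q = peak = now = 0.0
    while now < t or q > 0.0:
        inflow = pr * dt if now < t else 0.0
        q = max(0.0, q + inflow - cr * dt)
        peak = max(peak, q)
        now += dt
    return peak, now - t          # (peak length, drain time after burst end)

print(closed_form(10.0, 4.0, 2.0))   # (12.0, 3.0)
print(simulate(10.0, 4.0, 2.0))      # very close to (12.0, 3.0)
```

The step-by-step count lands on the same peak length and the same drain time as the algebra, which is all the "linear" case claims.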

If you think about it for a moment, you'll quickly realize that this emptying time D is also the longest time that any message will sit in the queue, waiting to be taken from the queue. It's the time it will take for the very last message that was put into the queue at the very end of the burst, to wait for all the other messages that are in the queue ahead of it to be fetched and processed. Hence D is also the maximum queueing delay in our simple "linear" queue. [The variable name "D" can be thought to signify maximum Delay time.] This is a very important value for designers of time-critical systems.


The application in this example is a device that will detect changes in a person’s “brain waves” in response to colors and images that are flashed before the subject’s eyes. “Brain waves”, also called EEG (electroencephalographic) signals, are weak electrical signals that can be detected by an array of electrodes placed in a “halo” around a person’s head.

Whenever a new color or image is flashed before the subject’s eyes, hardware begins to sample the EEG signals at a rate of one sample every 2 milliseconds. Sampling then continues for 1.5 seconds. In software, each sample consists of an array of bytes, one byte per electrode. The entire array for a sample is put into a single message that is then put into a message queue. See Figure 3.

Further processing of the EEG signals takes 5 milliseconds per sample.

When the EEG signal messages are stored in a message queue, with one message per data sample, the required
length of the message queue is calculated as follows:
Pr = 500 samples per second, during a burst
Cr = 200 samples per second, during and after a burst
T = 1.5 seconds, length of a burst
Resulting in ... L = (500 - 200)*1.5 = 450 messages need to be queued up.

The required length of the message queue is then precisely 450 messages.
Figure 3:  EEG signal data samples are stored
as 1 message per 2 millisecond sample.

The calculation of the length of the queue that we’ve done is valid only if the queue is completely emptied before
another burst begins.  In the brain wave device example, this means that:

D = L/Cr   = 450 messages / 200 messages per second = 2.25 seconds.

In other words, a new burst must not begin until at least 2.25 seconds after the end of the previous burst.

This sort of limitation can impose significant constraints on a system.  In our brain wave device, it means that a new
color or image must not be flashed before the subject’s eyes more frequently than once every 3.75 seconds.   [Burst
Time + Recovery Time]  Let’s hope that the subjects of the brain wave experiments have the patience for this!

What is the longest delay that any one of the messages will encounter in the queue of this example? It is the value we calculated for D. So a message may be delayed in this queue by as long as 2.25 seconds. And in fact the last message in each burst will be held in the queue for this amount of time, before it gets processed.
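The arithmetic of this brain-wave example can be spelled out in a few lines of Python (purely a restatement of the numbers above, not new results):

```python
# The EEG example's numbers, spelled out.
Pr = 500        # samples (messages) per second during a burst: one per 2 ms
Cr = 200        # messages consumed per second: 5 ms of processing each
T  = 1.5        # seconds per burst

L = (Pr - Cr) * T     # required queue length, in messages
D = L / Cr            # required dead time after a burst, in seconds

print(L)       # 450.0 messages
print(D)       # 2.25 seconds
print(T + D)   # 3.75 seconds minimum between flashed images
```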


Queuing formulas get much more complicated when message production and consumption rates are no longer
constant.  In preparation for working with a much more complex equation, some new symbols need to be defined:

The burst period T begins at time t1 and ends at time t2. A time inside of this burst interval will be called Tx.

The rate of production of messages for the queue is now changeable over time, and will be called Pr(t). The rate of consumption of messages is now also a function of time, called Cr(t).

Since this more general case is non-linear, we can no longer rely on the assumption that the queue will be longest at
the end of a burst.  So we need to proceed in two steps:  In the first step, we need to figure out the length of the queue
at any time during a burst.  And then in the second step, we need to see at what point during the burst was the queue
longest.  This would then be the length of the queue we need to prepare in our software.  In the general case, the point
at which the queue is longest might well be in the middle of the burst or 2/3 of the way through a burst.

For the first step, here is an equation that tells us the actual length of the queue of messages L(Tx) at any time Tx during burst T:

L(Tx) = ∫ [ Pr(t) - Cr(t) ] dt        , integrated from the start of the burst t1 up to time Tx.

By looking at L(Tx), we can see how the length of the message queue varies over time during the burst of message production. [Caution: This formula assumes that the length of the queue L(Tx) does not go negative at any time during the burst. If yours does, then you may need to sign up for a full Queuing Theory course right now!]

Now for the second step: The issue of greatest interest is the worst-case length of the message queue. We can’t assume that this worst-case length occurs at the end of the burst. It might occur at any time during the burst, if Pr(t) and Cr(t) fluctuate in wild enough ways. So in fact all that can be said is that the worst-case length of the message queue is:

Lqueue = MAXIMUM[ L(Tx) ]        , for any Tx in T

This is the overall queue length that is needed for the message queue.


The formulas above are pretty frightening for those of us who have forgotten our days in calculus class (or who’d like
to forget them).  So here’s a way to avoid the calculus:  Work with graphs of the functions, and calculate the areas
under the curves instead of integrating the equation.

In Figure 4 there are graphs of message Production Rate and Consumption Rate as functions of time. To figure out the length of the queue at time Tx, you just need to get the area under the Pr(t) curve up to time Tx and the area under the Cr(t) curve up to time Tx. The difference between these two areas is L(Tx), the length of the queue at time Tx. [Kids can do this by putting a ruler vertically on the 2 graphs at time Tx, and then figuring out the areas to the left of the ruler and under each of the 2 curves.]

By doing this for a few different values of the time Tx, you can create a plot of L(Tx) as a function of time Tx. That’s the bottom curve in Figure 4. This curve shows you how the number of messages in the queue varies as a function of time. You can continue the curve after the end of the message production burst, and thus see the queue empty out. [In fact, in the variable-pace general case, the limitation to well-defined “bursts” can be relaxed quite a bit.]

Then all that remains is to find the high point of the curve of L(Tx). That high point represents the longest length of the message queue at any time in the burst. That’s the overall queue length that’s needed for your message queue.
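The area-under-the-curves procedure can also be done numerically: accumulate [Pr(t) - Cr(t)]*dt slice by slice and keep the running maximum. Here is a small Python sketch of that idea; the triangular Pr(t) and constant Cr(t) below are invented purely for illustration, not taken from the article's Figure 4:

```python
# Numerical version of the area-under-the-curves method: accumulate
# [Pr(t) - Cr(t)]*dt over the burst and track the running maximum.

def peak_queue_length(pr, cr, t1, t2, steps=200000):
    dt = (t2 - t1) / steps
    q = peak = 0.0
    for k in range(steps):
        t = t1 + (k + 0.5) * dt                 # midpoint of the time slice
        q = max(0.0, q + (pr(t) - cr(t)) * dt)  # queue length can't go negative
        peak = max(peak, q)
    return peak

pr = lambda t: 400.0 * t if t < 1.0 else 400.0 * (2.0 - t)  # msgs/sec, ramps up then down
cr = lambda t: 150.0                                         # msgs/sec, constant

print(peak_queue_length(pr, cr, 0.0, 2.0))  # about 156.25 messages
```

For these made-up curves the peak occurs inside the burst (where the falling Pr(t) crosses back below Cr(t)), not at its end, which is exactly the point of the two-step procedure.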
Figure 4:  Variable Pace Queue Example.

So far, we've seen a basic overview of queuing theory for embedded systems developers. We've seen some very
simple algebraic formulas that rely on simplifying assumptions such as constant or known message arrival and
departure rates.  But in many cases, such assumptions are gross over-simplifications.  In many cases, embedded
designers may object to what we've seen, saying  
“Hey, there’s no way I can get a precise algebraic formula
describing the rate at which my messages will be sent, or the precise rate at which they’ll be processed.  So I
can't use the formulas we've got so far.  Is there anything else I can do to figure out my queue lengths and my
queuing delays in the real world??”

To answer these questions, we need to go a couple of steps deeper into queuing theory. In particular, we need to wade into “classical” queuing theory. Whereas up until now we focused on deterministic “hard” real-time systems, now we’ll focus on systems with timing behavior that can be described in a statistical way. Some people refer to them as “soft” real-time systems.

We’ll be working with something called an M/M/1 queue, which is the mathematical name for the simplest kind of statistical queuing model. The first ‘M’ means ‘Memoryless arrival time probability function’: in other words, the arrival times of subsequent messages into the queue don’t depend on when previous messages arrived. The second ‘M’ means ‘Memoryless service time probability function’: the processing times of subsequent messages don’t depend on the processing times of previous messages. And the ‘1’ means we’re focusing on a single queue feeding its messages to a single processor of messages.


A lot of people think that ‘Memoryless’ arrival times mean that you just can’t say anything about when the next message will arrive. They think that if the last 25 messages arrived at a queue around 10 milliseconds apart, then the next message could just as easily arrive 10,000 years from now as it could 10 milliseconds from now. But that’d be total chaos and unrealistic, rather than just independent arrival times. So we’ll be talking about a much more orderly kind of random behavior, in which inter-arrival times are randomly distributed but there’s still a known average number of arrivals per unit of time.

This is known as a Poisson distribution (named in honor of a French mathematician, not a French fish). The
Poisson distribution is widely used to describe discrete but rare events, for which the time between successive
arrivals is randomly distributed.  For instance, it can describe traffic accident rates, cancer rates in large populations,
radioactive decay, or embedded system interrupt arrival rates.  Things will become quite orderly if you look at them in a
very specific way: You need to ask how many of these random events you expect will occur in a given time interval, if
you know the average number of such events per time interval.  Sometimes there will be fewer than the average
number of events; and sometimes there will be more than the average.  The Poisson distribution expresses this
mathematically. If ‘E’ is the average number of events per time interval, then ‘P(N)’ will be the probability of ‘N’ events in a time interval, where:

P(N) = e^(-E) * E^N / N!        , for N = 0, 1, 2, 3, …

Don’t worry … we won’t be deriving this formula, nor even using it directly.  We just need to understand what it means;
and we can most easily do that by studying a graphic example in Figure 5.
In this example E=1. In other words, on average we’re having one ‘event’ (or message arrival at our queue) per time interval. As we can see in the graph, this actually means that there’s about a 1/3 probability that there will be zero ‘events’ in any given time interval. There’s also about a 1/3 probability that there will be exactly one ‘event’ in any given time interval. And there will be smaller probabilities of larger numbers of ‘events’ in a given time interval. Similar graphs can be obtained for other values of E, which typically fall off on either side of a peak near the value E. The falloff is gradual for N>E as N, the number of ‘events’ per time interval, gets larger; but it is steep for N<E since N cannot go negative.
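The Figure 5 values can be reproduced directly from the formula; here is a short Python check (not from the article) for the E=1 case:

```python
# The Poisson probabilities P(N) = e^(-E) * E^N / N! for E = 1,
# reproducing the values discussed for Figure 5.
from math import exp, factorial

def poisson(n, e):
    return exp(-e) * (e ** n) / factorial(n)

E = 1.0
for n in range(5):
    print(n, round(poisson(n, E), 3))
# P(0) = P(1) = e^(-1), about 0.368: close to the "1/3" quoted in the text,
# with smaller probabilities for larger N (0.184, 0.061, 0.015, ...).
```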

We will assume that messages arrive into our queues according to such Poisson distributions. Going forward in this paper, we’ll work with the inverse of ‘E’, which we’ll call ‘M’, the Mean Inter-Arrival Time: M = 1/E.


Once a message has arrived in a queue, it stays in the queue until all messages ahead of it have been taken off the
queue and individually processed.  It’s not hard to model message processing time as also randomly distributed, but
with a known average processing time.  So that’s what we’ll do.

In fact, these are the same assumptions as for a Poisson distribution, so the math that was developed by Mr. Poisson
holds for message processing times as well.  But when we talk about message processing times, it’s more
convenient to express the formulas in a different way: If A is the average message processing time, then the probability P(T) that any given message will consume actual processing time T is:

P(T) = e^(-T/A) / A        , for processing times T ranging from 0 to A and higher.

This view of random timing is called an Exponential distribution. Don’t worry … we won’t be deriving this formula
either. We just need to understand this one too.  Its graph can be seen in Figure 6.
For above-average message processing times (to the right of A), the curve extends out to infinity, saying that some of our messages might have excruciatingly long processing times, although there’s a very low probability that this will happen to any given message. But for below-average message processing times (to the left of A), the curve is limited to positive times, so that a large number of messages will cluster in this small range, with the number of messages increasing as the processing load they impose goes down toward zero.

This pattern of lots of easy-to-process messages, with a few occasional hard-to-process messages, very often describes the reality within our embedded systems. So we will use Exponential distributions to describe message processing times. The average message processing time A will appear again and again as we continue.
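This "lots of cheap, a few expensive" shape can be seen by drawing a large sample from an Exponential distribution; the sketch below (mine, not the article's, with A = 20 ms chosen arbitrarily) checks that the sample mean comes out near A and that most messages cost less than the average:

```python
# Exponential processing times, density P(T) = e^(-T/A) / A: draw a large
# sample and check that cheap messages dominate while the mean is still A.
# A = 20 ms is an arbitrary value for illustration.
import random

A = 0.020
random.seed(42)
samples = [random.expovariate(1.0 / A) for _ in range(200000)]

mean = sum(samples) / len(samples)
below_avg = sum(1 for s in samples if s < A) / len(samples)

print(round(mean, 4))       # close to 0.020: the mean is A
print(round(below_avg, 2))  # about 0.63: most messages are below-average cost
```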


If message arrival times and message processing times are both random, then how could a queue of significant length ever develop? Well, if the arrival times were correlated with the processing times, then a queue might not develop; or it might be very short in the worst case. But in fact, in queuing theory as well as in embedded systems and software, message arrival times and message processing times are not normally correlated with one another in any way. In other words, a cluster of messages with very short inter-arrival times might well contain a lot of messages with very long processing times. And that’s when a queue of significant length would develop.

In fact, if the mean inter-arrival time ‘M’ is shorter than the average processing time ‘A’ (that is, M < A), then the queue will grow without bounds, overflow, and become useless. Hence, in this random statistical world of ours, we must have:

M > A                , for a queue to be useful.

Another way of expressing this is to introduce a new parameter R, defined as:

R = A / M                , or equivalently R = A*E.

R is the ratio of average processing time to mean inter-arrival time. R must be less than 1 for a queue not to overflow uselessly. In other words, for a queue to be useful, messages must arrive on average more slowly than they can be processed on average.


It can be shown (although we won’t go through mathematical proofs here) that the average length of a queue will be:
Lavg = R / (1-R)        , or Lavg = A / (M-A).

And the average time that a message spends in the queue will be:

Tavg = A / (1-R)        , or Tavg = A*M / (M-A).

And if you divide the Tavg formula by the Lavg formula, you’ll get Tavg / Lavg = M, or:

Lavg = Tavg / M        , or Lavg = E * Tavg.

This is called “Little’s Theorem”. It’s very much analogous to the famous “Rate * Time = Distance” formula of high school physics, but applied to queues. It says that when averages are used for all values, the length of a queue is the product of the rate of arrival of messages multiplied by the time they stay in the queue.

Why doesn’t Little’s Theorem involve A, the amount of time it takes to process a message on average? Wouldn’t that be more intuitive? The answer is: in fact A is very much involved in Little’s Theorem, because A is a major ingredient in determining Tavg, the average amount of time a message spends in the queue.

A final useful formula: The probability P that there will be at least K messages in the queue will be:

P[ ≥ K enqueued ] = R^K

In other words, the probability of a queue becoming longer goes down geometrically with its length (since R is smaller than 1).

The remainder of this paper will show how these 4 formulas can be used to obtain useful results for designing your
embedded systems and software.
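For the examples that follow, the 4 formulas can be collected into a few helper functions; this sketch and its function names are mine, not the article's, but the formulas themselves are exactly the ones above:

```python
# The four M/M/1 formulas, as small helper functions.
# A = average processing time, M = mean inter-arrival time, R = A/M must be < 1.

def lavg(a, m):
    r = a / m
    return r / (1.0 - r)          # average queue length

def tavg(a, m):
    r = a / m
    return a / (1.0 - r)          # average time a message spends in the queue

def p_at_least(k, a, m):
    return (a / m) ** k           # probability of at least k messages enqueued

def littles_check(a, m):
    return tavg(a, m) / m         # Little's Theorem: Lavg = Tavg / M

print(lavg(0.020, 0.025), tavg(0.020, 0.025), littles_check(0.020, 0.025))
```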


Say we’ve got a task that produces messages and puts them into a queue with mean inter-arrival time of M = 25 milliseconds. And we’ve got another task that fetches the messages from the queue, and processes them with mean processing time of A = 20 milliseconds per message. See Figure 7. What will be the average delay of a message in this queue?
Figure 7:  A Queue may Delay Messages
We can use the second “useful formula” for this. The average time that a message spends in a queue will be: Tavg = A / (1-R).
Crunching the numbers results in: Tavg = 20 / (1 - 20/25) = 100 milliseconds.
That is to say, a message will on average be delayed a tenth of a second while it is in this queue.

We can also try out our first “useful formula” on this example. The average length of a queue will be: Lavg = R / (1-R).
Crunching the numbers results in: Lavg = (20/25) / (1 - 20/25) = 20/5 = 4 messages.
So the average number of messages in this queue will be 4.

And we can also check Little’s Theorem on this example, by recalculating the average queue length using the formula: Lavg = Tavg / M.
Crunching the numbers results in: Lavg = 100 / 25 = 4 messages. And so we’re consistent with our previous results. Little’s Theorem checks out.
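These statistical predictions can also be checked by brute force. The Python sketch below (mine, not from the article; the random seed and sample count are arbitrary) simulates Poisson arrivals and Exponential processing for this example and measures the average time each message spends waiting plus being processed; it should land near the predicted 100 milliseconds:

```python
# A rough Monte-Carlo check of Example #1: Poisson arrivals (M = 25 ms),
# exponential processing (A = 20 ms), one consumer.  Tavg = A/(1-R) = 100 ms.
import random

random.seed(7)
M, A, N = 0.025, 0.020, 200000

wait = 0.0      # Lindley recursion: how long the current message must wait
total = 0.0     # accumulated (wait + processing) time over all messages
for _ in range(N):
    service = random.expovariate(1.0 / A)
    total += wait + service
    gap = random.expovariate(1.0 / M)      # time until the next arrival
    wait = max(0.0, wait + service - gap)

print(round(total / N * 1000.0, 1), "ms")  # lands near the predicted 100 ms
```

Being a random simulation, the measured average wanders a little around 100 ms from seed to seed; that scatter is exactly the "soft" statistical behavior the formulas summarize.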


Say we’ve got a software task that’s producing messages and depositing them into a queue at random times, but with mean inter-arrival time of M = 30 milliseconds. Another software task is responsible for fetching and processing messages from this queue. The queue is limited to a maximum of 5 messages; otherwise it will overflow. See Figure 8. What is the highest average message processing time that will limit queue overflow to less than one ten-thousandth of a percent?
Figure 8:  Queue of Length 5 Messages, showing Queue Overflow
We’ll use the formula for the probability that there will be at least K messages in the queue: P[ ≥ K enqueued ] = R^K, and we’ll require P less than 0.0001% (that is, 0.000001) for K of 6 or more messages, since a sixth message would overflow the queue. That gives us:

P[ ≥ 6 enqueued ] = R^6 < 0.000001

And since R = A / M and M = 0.030 seconds, we get R^6 = A^6 / (0.030)^6 < 1*10^(-6), resulting in:

A < (1/10)*0.030 = 0.003 seconds.
Thus, the average processing time for the task that processes messages must be less than 3 milliseconds if we are
to limit overflow losses in our 5-message capacity queue to less than 0.0001%.

Now, what would be the average queue length under these conditions? We’ll use our trusty formula: Lavg = R / (1-R).
Crunching the numbers, we get R = 0.003 / 0.030 = 0.1, and then Lavg = 0.1 / 0.9 = 0.11 messages.
So there’s a lot less than one message in the queue, on average!  Most of the 5-message capacity of the queue is
hardly ever used.  But it’s got to be there in order to limit queue overflow to less than 0.0001%.
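The arithmetic of this sizing example fits in a few lines of Python (a restatement of the numbers above, nothing new):

```python
# Example #2 arithmetic: M = 30 ms, overflow when a 6th message shows up,
# and we require P[>= 6 enqueued] = R^6 < 1e-6.
M = 0.030

r_max = 1e-6 ** (1.0 / 6.0)   # R^6 < 1e-6  ->  R < 0.1
a_max = r_max * M             # R = A/M     ->  A < 3 ms

r = 0.1
l_avg = r / (1.0 - r)         # average queue length at this operating point

print(round(a_max * 1000.0, 3))  # 3.0 ms
print(round(l_avg, 2))           # 0.11 messages
```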


Say we’ve got a hardware device that delivers interrupt requests to a processor at random times, but with a mean time between interrupts of M = 25 milliseconds. The interrupt service software needs average processing time A to handle each interrupt. Interrupts cannot be queued up before they’re processed; hence they’re lost if the processor is overloaded with interrupts. What is the value of A that will limit the interrupt overloading losses to less than 1%?

Here we really don’t have a queue.  Instead, interrupt-related information is stored in hardware until either interrupt
service software fetches it, or another interrupt arrives.  This is like a hardware-based queue with capacity for exactly 1
message, and no more.  In other words, the arrival of a second message while an earlier one is still in this queue,
would be considered an overloading loss.

We’ll once again use the formula for the probability that there will be at least K messages in the queue: P[ ≥ K enqueued ] = R^K, and we’ll require P less than 1% for K of 2 or more messages, since a second pending interrupt means overflow. That gives us:

P[ ≥ 2 enqueued ] = R^2 < 0.01

And since R = A / M and M = 0.025 seconds, we get R^2 = A^2 / ((0.025)*(0.025)) < 0.01, resulting in:

A < sqrt(0.01)*0.025 = 0.0025 seconds.
Thus, the interrupt service software average processing time must be less than 2.5 milliseconds if we are to limit
interrupt overloading losses to less than 1%.


Say that interrupts from a hardware device are arriving sporadically, with mean inter-arrival time M = 400
microseconds.  For these interrupts, the mean data processing time is A = 300 microseconds.   What will be the
mean response time for the processing of these interrupts?

The average time that a message spends in a queue is given by the “useful formula”: Tavg = A / (1-R).
Hence, the average time will be: Tavg = 300 / (1 - 300/400) = 1200 microseconds.
This is the average time until software begins to process its responses to these interrupts.

A result such as this brings up a number of system and software design questions:
*       Can our application live with a response time that’s so much longer than the average rate of interrupts, and so
much longer than the interrupt’s data processing time?
*       Where are the interrupt-related data going to be stored while they’re waiting to be processed?
*       Will this storage be implemented in software or hardware?
*        …

Let’s try to answer the second question above by using the very first “useful formula” again. If interrupts were to queue up, then the average length of the queue would be given by our trusty formula: Lavg = R / (1-R).
In this example, the result is: Lavg = (300/400) / (1 - 300/400) = 3 messages. (You can also get this result using Little’s Theorem.) So, … aha!! …, we’re going to need some queuing of interrupts, or interrupt-related data, for handling this hardware device.
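The numbers of this example check out directly (again, just the arithmetic from above, written as Python):

```python
# Example #4 arithmetic: M = 400 us between interrupts, A = 300 us each.
M, A = 400.0, 300.0     # microseconds
R = A / M               # 0.75 -- close to the stability limit of 1

t_avg = A / (1.0 - R)   # mean response time
l_avg = R / (1.0 - R)   # average backlog if the interrupt data were queued

print(t_avg)  # 1200.0 microseconds
print(l_avg)  # 3.0 messages
```

Note how sharply the delay grows as R approaches 1: processing takes only 300 microseconds, yet the average response stretches to 1200.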

And so the results of our average queue length calculation bring us back to the third question above: Will this queue be implemented in software or in hardware? Some processors nowadays, such as network communication processors, have message queuing capability for on-chip hardware interfaces built right into their silicon.

But if we choose to implement this queue in software, we’re going to have to redesign the software that handles this particular hardware device, to avoid the kind of interrupt overloading losses we saw earlier in Example #3. Rather than doing all of the data processing (A = 300 microseconds) right inside the interrupt service software, we’re going to need to break the software apart into a very brief interrupt service routine and a separate software task that will handle the bulk of the interrupt-related data processing. The interrupt service routine will pass messages (one per interrupt) to the software task through a message queue, as seen in Figure 9.

So here we’ve seen how a quick “back of a postage stamp” queueing calculation can bring us to a major rethinking of an embedded system software design decision.


Despite its name, we've seen that Queuing “Theory” has lots of non-theoretical uses in practical, down-to-earth embedded system and software design. There are lots of ways to delve even deeper into Queuing Theory. After all, Queuing Theory is a well-developed branch of classical mathematics.

Earlier versions of this material were published as a sequence of 2 articles in the April 2001 and May 2003 issues of Embedded Systems Programming magazine.

© Copyright 2016, D. Kalinsky Associates, All Rights Reserved.
This page Updated March 25, 2016
Figure 5: Poisson Distribution of Message Arrival Rates for E=1
Figure 4 (annotations): Message Production Rate Pr(t) and Message Consumption Rate Cr(t), in msgs./sec., versus time (sec.); Queue Length Over Time (in messages), computed as L(Tx) = ∫ [ Pr(t) - Cr(t) ] dt; the overall queue length limit needed in this example is 35 messages.
Figure 6: Exponential Distribution of Message Processing Times
Figure 9: Redesigned Software for handling Hardware Device Interrupts