The graph above shows the relationship between
the message delay and the number of message cycles that were dropped.
What we see is that, without throttling the send, a very high
percentage of messages do not get through. From examination of the raw
data it appears that most messages are dropped at the re-Mote end
(commands are not received). This could be due to the different
antennas being used on basestation and re-Mote.
The graph above shows the overall number of
successful messages that can be exchanged versus the percentage that
fail to get through. It shows that around 10 exchanges per second can
be expected with a less than 3% failure rate. This particular value
used a 60ms delay between command sends. If higher failure rates are
acceptable one could get up to 13 or so msg/sec but with a LOT of
dropped messages. This would be using a much shorter inter-command
delay time.
Remember that "messages" here refers to "message cycles" of one command
and one reply, so it appears that 20 individual messages per second is
a good ballpark speed estimate. This assumes that there is very little
chance of transmissions overlapping or having to back off using CSMA.
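The cycle-to-message arithmetic above can be written out as a small sketch. The function name is illustrative only and is not taken from the test code; the input is the roughly 10 cycles per second quoted in the text.

```python
def individual_message_rate(cycles_per_second, messages_per_cycle=2):
    """Convert message cycles per second into individual messages per second.

    One cycle is one command plus one reply, so messages_per_cycle defaults to 2.
    """
    return cycles_per_second * messages_per_cycle


# About 10 command/reply cycles per second, the figure quoted in the text.
print(individual_message_rate(10))  # 20
```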
Broadcast Message Cycle Using ACKs
In this set of experiments commands are sent as Broadcast messages
(using the destination ID -1) and are sent only once per cycle. There
are five re-Motes and all are expected to receive and reply to each
command, hopefully before another command cycle begins. Replies are
still point-to-point, in that they have the basestation address as a
destination ID. Message ACKs are enabled on both ends but the broadcast
commands (with destination -1) are never ACKed. However, the replies
received at the basestation are ACKed, and this seems to lead to some
unpleasant results.
The useful thing about using Broadcast commands is that they trigger all
the re-Motes to reply at the same time, which shows how effective the
CSMA mechanism is in interleaving multiple media accesses.
The glossy overview graphs:
The graph above again shows the relationship between
the message delay and the number of message cycles that were dropped.
And again, what we see is that, without throttling the send, a very high
percentage of messages do not get through. But we also see that the
best success rate is still horrible, with 10-30% of the message cycles
failing. From the raw data it appears that somewhere between 10% and 30% of
those failures are due to the command not being received and the rest
are replies not getting back through.
The extremely high failure rate was troubling, so I tried the next set
of experiments...
Broadcast Message Cycle NOT Using ACKs
In this set of experiments, as in the previous set, commands are sent
as Broadcast messages (using the destination ID -1) and are sent only
once per cycle. There
are five re-Motes and all are expected to receive and reply to each
command, hopefully before another command cycle begins. However, in this
set ACKs are disabled at both ends, basestation and re-Mote. These runs
were done with an inter-command delay of 350ms, which is more than
enough to allow for all re-Motes to reply. (The worst case message
cycle time was about 210ms -- from beginning of transmit to end of
receive). I did not do a set of varying delay runs, but it would seem
that about 200ms between messages (150ms delay) would be optimum with
five re-Motes.
The glossy overview graphs:
This graph shows the rate of message
dropping versus the number of devices receiving the broadcast command.
I bypassed some of the more detailed experiments in order to get a
feel for how the system scales with the number of re-Motes. In general
this experiment showed that without ACKs the broadcast mechanism works
much more reliably.
Where we had 20% or worse loss with the ACKs turned on in the
basestation, now it is about 1/3 that with the same number of devices
competing for airwaves. Why this should be so is left as an exercise
for the reader...
The pink "Xmit loss" trace is a scaled indication of how many failures
are due to commands not being received. A value of 1 means 100% of
failures are due to the command not getting through. The ratio of
'transmission' failures decreases as devices are added, but the
relative number per device remains about the same under all
circumstances.
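As a rough illustration of what the "Xmit loss" value expresses, the sketch below computes the fraction of failed cycles in which the command never arrived. How the trace was actually computed and scaled is not stated here, so the function name and the counts are hypothetical.

```python
def xmit_loss_fraction(command_failures, total_failures):
    """Fraction of failed cycles caused by the command not being received.

    1.0 means every failure was a lost command; 0.0 means every failure
    was a lost reply.
    """
    if total_failures == 0:
        return 0.0  # no failures at all, so nothing to attribute
    return command_failures / total_failures


# Hypothetical counts, not taken from the experiment data.
print(xmit_loss_fraction(2, 10))  # 0.2
```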
The interesting thing here is that the message loss rate seems to
scale linearly with the number of devices, assuming that enough time is
allowed for all devices to respond. Also, the difference between a 1%
drop with one device and 6.5% with five shows that the CSMA mechanism
is not foolproof. This is a worst-case scenario because all of the
re-Motes are trying to access the airwaves at the same time, so perhaps
6 or 7% loss is not so bad. The raw data shows that message replies are
generally interleaved in 20-30ms steps. No further analysis to
categorize failures was done.
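To make the linear-scaling observation concrete, here is a minimal sketch that draws a straight line through the two quoted end points (1% with one device, 6.5% with five). The values in between are interpolated, not measured, and nothing here says the line holds beyond five devices.

```python
def estimated_drop_percent(num_remotes, one_remote=1.0, five_remotes=6.5):
    """Linear estimate of the drop percentage for a given number of re-Motes.

    Assumes a straight line through the quoted end points:
    1% with one device and 6.5% with five.
    """
    slope = (five_remotes - one_remote) / (5 - 1)  # percent per added device
    return one_remote + slope * (num_remotes - 1)


for n in range(1, 6):
    print(n, estimated_drop_percent(n))
```

With these end points the slope works out to about 1.4 percentage points per added re-Mote.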
RoundRobin Message Cycle NOT Using ACKs
Just for completeness I did a set of runs using roundrobin command
addressing, but with all ACKs disabled. This is not as rigorous as the
first experiment set above, but shows similar behavior. It was done
with five re-Motes and a varying command delay time. As one might
expect, due to the absence of the ACK send and receive wait, a somewhat
faster message cycle time is possible.
The glossy overview graphs:
This is the same type of graph as in the first experiment series. It
shows the dependence of successful message cycles on the command cycle
delay time. It also shows that a much shorter delay can be used when
there are no ACKs.
Again, this is the same type of graph as in the first experiment
series. It shows the number of successful messages one can expect versus
how many fail. It shows that a few more message cycles are possible for
about the same error rate when there are no ACKs. The best rate with
fewest failures was using a 25ms
delay (or about 75ms total message cycle time) and gave close to 13
messages per second with less than 2% failure, of which about 20% were
lost command transmissions. Again each message cycle is a set of two
mica2 messages, so the actual message rate is twice what is shown on
the graph.
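The rate figures follow directly from the quoted cycle time. A quick check, using only the 75ms total cycle time and the two-messages-per-cycle rule from the text (the function name is illustrative):

```python
def cycles_per_second(cycle_time_ms):
    """Number of complete command/reply cycles that fit in one second."""
    return 1000.0 / cycle_time_ms


rate = cycles_per_second(75)   # 25ms delay, about 75ms total cycle time
print(round(rate, 1))          # 13.3 cycles per second
print(round(2 * rate, 1))      # 26.7 individual messages per second
```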
Almost Raw Results and Test Code
The spreadsheet with the accumulated data used to generate these
graphs is here:
The raw-raw data, each message transaction timing and such, is
available, as is the TOS and Java code used to run the tests. I'm just
too lazy to get it up right now...
Conclusions
Using a message request/reply system where a basestation requests a
message from each re-Mote, by either individual point-to-point request
and reply or broadcast request with a point-to-point reply, I have
arrived at the following conclusions. These are under generally
best-case conditions where five re-Mote mica2s and one basestation are
within 2 meters of each other:
Point-to-point request messages should be throttled to less than about
10 per second or many replies will be dropped.
Broadcast request messages should be run without message ACKs
enabled and should be throttled such that there is about 20-30ms per
re-Mote to allow for replies. If the request timing is too short many
replies will be dropped.
With message ACKs enabled in a point-to-point roundrobin mode where
there is little chance of transmit overlap and CSMA backoff, about 20
successful individual messages per second can be sent with less than 3%
failure rate. However, about 20% of the message ACKs are never
received.
When ACKs are enabled in a semi-broadcast mode where there is a good
chance of CSMA backoff, message failures increase drastically, up to a
30% failure rate in some cases. See Experiments for details.
Without ACKs, in a roundrobin mode where there is little chance of
transmit overlap and CSMA backoff, successful messages are around 26
per second with less than 2% failure rate.
Without ACKs in a broadcast mode where there is maximum chance
of transmit overlap and CSMA backoff, successful messages are around 25
per second but the failure rate varies linearly from 1% with one
re-Mote to 6.5% with five re-Motes.
I don't know how much the failure rate linearity can be extrapolated.
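The broadcast throttling guideline above (about 20-30ms per re-Mote) can be turned into a small sizing sketch. The function name is hypothetical, and the simple multiplication is only a reading of that guideline, not a rule taken from the test code.

```python
def broadcast_reply_window_ms(num_remotes, per_remote_ms):
    """Time to leave after a broadcast command so every re-Mote can reply."""
    return num_remotes * per_remote_ms


# Five re-Motes at the low and high ends of the 20-30ms guideline.
print(broadcast_reply_window_ms(5, 20))  # 100
print(broadcast_reply_window_ms(5, 30))  # 150
```

The upper figure matches the 150ms delay suggested for five re-Motes in the broadcast experiment without ACKs.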