Interpreting and Graphing Aruba ARM Counters

Guest post by Mike Albano

The topic of "do you trust RRM" is often discussed. The most typical answer is: "Yes, if I understand it." I know I've personally spent numerous hours blaming RRM for a questionable Dynamic Channel Assignment (DCA), and I'm usually wrong.

For the purpose of this post, RRM = Radio Resource Management, be it ARM (Aruba), RRM (Cisco), ACSP (Aerohive), SmartRF (Extreme), etc.

This post isn't about the topic of "trust", or whether to use RRM. Here's a good post by @wirednot on that topic. (Read the comments!)

This is more about:

  • Finding a way to interpret and use the data available to identify if/when an AP will change channels (DCA).
  • Analyzing the state of the channel, from the AP's perspective, before & after a channel change.
  • Showing an example of tools I use regularly in troubleshooting (Python and AirRecorder).

The system in question is an Aruba Instant AP (Instant OS version 6.3.1.8-4.0.0.9).

Data Gathering

Typically, I use Pexpect for screen-scraping CLI output, but Aruba has written a handy utility that does this for you. It's called AirRecorder, and it's multi-platform (Java). It runs on Linux and OSX (I use it on both), and probably Windows.

To get AirRecorder: log in to support.arubanetworks.com, go to Tools & Resources, and click the Air Recorder folder.

Some of what I've done with the Python script below for parsing the output can be done directly inside AirRecorder. There's no need to make this a tutorial on that tool (check the references at the bottom of the post for those), but I'll detail what I've done for this use case.

 

1.  Create a file (commands.txt) containing the command I'd like to run on the IAP and how often to run it: every 2 minutes and 5 seconds (125 seconds).
Contents of "commands.txt":

125,show ap arm rf-summary

2.  Set AirRecorder to SSH to an IAP (with a 20-second timeout), using the aforementioned commands.txt file. From a terminal:

java -jar AirRecorder-1.2.16-release.jar --instant --protocol ssh -u <username> -t 20 -m -c commands.txt 10.1.1.6

3.  Let it run for as long as you'd like to gather data. You will end up with a file in the current directory, named something like "air-recorder-10.0.1.6-20150502.log".

How long you let it run depends on exactly what you're trying to do. It might be days, if you've statically assigned a channel and want to view it over an extended amount of time. You can also just run a command once, if you're interested in obtaining certain output from numerous APs quickly.
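If you end up launching AirRecorder from your own tooling, steps 1 and 2 can be sketched in Python. This is only a convenience wrapper around the exact file format and flags shown above; the function names are made up for illustration, and nothing here is part of AirRecorder itself.

```python
from pathlib import Path


def write_commands_file(path, interval_s, command):
    """Write one 'interval,command' line, the format used in commands.txt above."""
    Path(path).write_text(f"{interval_s},{command}\n")


def airrecorder_argv(jar, username, host, commands_file, timeout_s=20):
    """Build the argument list for the AirRecorder invocation shown in step 2.

    Only the flags that appear in this post are used; their meaning is taken
    from the post, not re-verified against the AirRecorder documentation.
    """
    return [
        "java", "-jar", jar,
        "--instant", "--protocol", "ssh",
        "-u", username,
        "-t", str(timeout_s),
        "-m",
        "-c", str(commands_file),
        host,
    ]


# Example (hypothetical username; subprocess.run(argv) would start the capture):
# write_commands_file("commands.txt", 125, "show ap arm rf-summary")
# argv = airrecorder_argv("AirRecorder-1.2.16-release.jar", "admin",
#                         "10.1.1.6", "commands.txt")
```

Passing the arguments as a list (rather than one shell string) avoids quoting problems if a username or path contains spaces.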

Interpreting

The output from "show ap arm rf-summary" is incredibly useful, but not well documented, especially for the Instant architecture. I'll detail what I've gathered. Please leave comments if you spot misinterpretation, or have additional info.

Here is some sample output, from an IAP operating on Channel 11.

Channel quality history:wifi1
 1:Q:   1   1   1   0   0   0   0   0   0   0   0   0   0   0   0   0   0...
  :c:  31  31   1   1   1   1   1   1   1   1   1   1   1   1   1   1   1...
  :N:  56  56  56  37  39  39  36  37  36  38  36  37  37  37  36  38  37...
  :s:  90  90  90 100 100 100 100 100 100 100 100 100 100 100 100 100 100...
  :U:  99  99  99  99  99  99  99  99  99  99  99  99  99  99  99  99  99...
 6:Q:   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   0...
  :c: 120 120 122 124 126 128 130 132 134 136 138 140 142 144 146 116 118...
  :N:  38  38  38  36  36  37  37  36  35  35  36  37  37  40  36  57  39...
  :s: 100 100 100 100 100 100 100 100 100 100 100 100 100 100 100  87 100...
  :U:  99  99  99  99  99  99  99  99  99  99  99  99  99  99  99  99  99...
11:Q:  97  97  98  98  98  98  97  97  97  97  98  98  98  98  97  98  98...
  :c: 118 118 118 118 118 118 118 118 118 118 118 118 118 118 118 118 118...
  :N: *94 *94 *94 *94 *94 *94 *94 *94 *94 *94 *94 *94 *94 *94 *94 *94 *94...
  :s:   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0...
  :U:   3   3   2   2   2   2   3   3   3   3   2   2   2   2   3   2   2...
  :R:   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0...

Note that the only channel with an ":R:" row is 11 (the AP's current channel).

Here's similar output (formatted for clarity), while an interferer is introduced:

Each column represents a 5-second sample interval. There are 24 columns, so one run of the command covers 24 x 5 = 120 seconds of history; each time you run the command, the columns have shifted to the right. I have AirRecorder run the command every 125 seconds, just past that 120-second window, to be sure I avoid duplicate data.
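To work with these columns programmatically, the text has to be turned into lists of numbers first. Here is a minimal parsing sketch, assuming the layout matches the sample output above (an optional channel number, a one-letter row label between colons, then the values, with "*" prefixes and a trailing "..." ignored). The function name is made up, and this is not necessarily how the script linked in the references does it.

```python
import re

# Optional channel number, then ":<row letter>:", then the sample columns.
ROW_RE = re.compile(r"^\s*(\d+)?:([A-Za-z]):\s*(.*)$")


def parse_rf_summary(text):
    """Return {channel: {row_letter: [int, ...]}} from rf-summary text."""
    channels = {}
    current = None
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if not match:
            continue  # headers such as "Channel quality history:wifi1"
        chan, row, rest = match.groups()
        if chan is not None:
            current = int(chan)  # a new channel block starts on its Q row
        if current is None:
            continue
        # Drop the "*" marker and any trailing "..." and keep the digits.
        values = [int(tok) for tok in re.findall(r"\*?(\d+)", rest)]
        channels.setdefault(current, {})[row] = values
    return channels


SAMPLE = """Channel quality history:wifi1
 1:Q:   1   1   1...
  :U:  99  99  99...
11:Q:  97  97  98...
  :N: *94 *94 *94...
  :U:   3   3   2...
  :R:   0   0   0...
"""

parsed = parse_rf_summary(SAMPLE)
print(parsed[11]["Q"], parsed[11]["R"])
```

With 24 columns at 5 seconds each, every parsed row covers 120 seconds, which is why polling every 125 seconds yields non-overlapping samples.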

The Row definitions, as far as I can tell, are as follows:

  • :Q = Channel Quality (0-100, higher = better). This is "Aruba Proprietary" and plays a major role in ARM's DCA algorithm. Even though it's proprietary, we can analyze some of what affects it. I've found that generally (but not always) Q = 100 - U - R. Unfortunately, proprietary means it's difficult to be scientific about analyzing a channel change.
  • :c = Quality check counter. I'm not really sure what this means.
  • :N = Noise Floor. Self-explanatory.
  • :s = Noise scale (0-100). This appears to bear an inverse relationship to Noise Floor, though I haven't precisely figured it out.
  • :U = Non-802.11 Channel Utilization (0-100%). Percentage of time the IAP detects RF energy over its CCA-Energy Detect threshold but cannot demodulate it as 802.11.
  • :R = 802.11 Retry percentage (0-100%). Percentage of Rx frames with the Retry bit set.
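The "Q is roughly 100 - U - R" observation is easy to check against captured data. The sketch below is only that rule of thumb, not Aruba's actual (proprietary) calculation, and the function names are made up. The inputs are the first four columns of channels 11 and 1 from the sample output above.

```python
def estimate_quality(non_wifi_util, retry_pct=0):
    """Rule-of-thumb estimate: Q is roughly 100 - U - R, clamped to 0..100."""
    return max(0, min(100, 100 - non_wifi_util - retry_pct))


def compare_quality(q_row, u_row, r_row=None):
    """Return (reported Q, estimated Q, reported - estimated) per 5-second column.

    r_row may be None because only the AP's current channel has an :R: row.
    """
    if r_row is None:
        r_row = [0] * len(q_row)
    return [
        (q, estimate_quality(u, r), q - estimate_quality(u, r))
        for q, u, r in zip(q_row, u_row, r_row)
    ]


# Channel 11 (current channel): the estimate matches the reported Q.
print(compare_quality([97, 97, 98, 98], [3, 3, 2, 2], [0, 0, 0, 0]))
# Channel 1: the fourth column reports Q=0 where the estimate gives 1.
print(compare_quality([1, 1, 1, 0], [99, 99, 99, 99]))
```

A non-zero difference in the third tuple element flags the columns where the rule of thumb does not hold, which is a reasonable place to start looking at the other rows.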

How this data translates into DCA decisions depends on your configuration, especially in AOS; IAP is less configurable. The flow-chart below is based on defaults:

Graphing

The main goal of this script is to take the output of "show ap arm rf-summary" and graph the useful bits, for easier consumption. For example, this can help in analyzing the channel state before & after a channel change. This is, of course, only the AP's perspective and totally dependent on the accuracy of its counters.

Some of this is built in to the IAP web interface, but that doesn't suit my needs because:

A) it does not show data on all the counters (i.e. Retry %)

B) it only keeps ~6 minutes of history

C) graphing CLI output over time is something I occasionally have a use for. I also like seeing all the data represented on one graph, for correlation.

At present, this will graph the "Q, U & R" counters from above. Each point on the graph represents 2 minutes worth of data (averaged).
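The averaging step can be sketched as follows. This is a minimal illustration rather than the actual script: it assumes each poll has already been parsed into a dict of rows for the AP's current channel, the function names are made up, and matplotlib is imported inside the plotting function so the averaging part runs without it.

```python
from statistics import mean

POLL_INTERVAL_S = 125  # polling interval from commands.txt


def average_polls(polls, rows=("Q", "U", "R")):
    """Collapse each poll's 5-second samples into one averaged point per row.

    polls: list of {row_letter: [samples]} dicts, one per AirRecorder poll.
    Returns (minutes_since_start, {row_letter: [average per poll]}).
    """
    minutes = [i * POLL_INTERVAL_S / 60 for i in range(len(polls))]
    series = {row: [mean(poll[row]) for poll in polls] for row in rows}
    return minutes, series


def plot_series(minutes, series, out_png=None):
    """Plot every row on one set of axes; save to out_png if given, else show."""
    import matplotlib
    if out_png:
        matplotlib.use("Agg")  # no display needed when only saving a file
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for row, values in series.items():
        ax.plot(minutes, values, label=row)
    ax.set_xlabel("Minutes since start")
    ax.set_ylabel("Value (0-100)")
    ax.set_ylim(0, 100)
    ax.legend()
    if out_png:
        fig.savefig(out_png)
    else:
        plt.show()


# Two made-up polls, shortened to two samples each for readability.
example_polls = [
    {"Q": [98, 96], "U": [2, 4], "R": [0, 0]},
    {"Q": [50, 50], "U": [40, 60], "R": [10, 0]},
]
minutes, series = average_polls(example_polls)
print(minutes, series)
```

Putting Q, U and R on the same axes works because all three share a 0-100 range, which is what makes the correlation between them easy to see.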

For example, the following graph was generated from 10 hours' worth of IAP output (i.e. I let AirRecorder run for 10 hours):

Note: using the toolbar you can zoom, scroll, save as PNG etc.

You can see that "Quality" decreases as either 802.11 Retries or Non-802.11 Channel Utilization (CU) increases. At ~300 minutes (5 hours) in, there was a large increase in Retries and a corresponding decrease in Quality. Interestingly, there was little to no increase in Non-802.11 CU at that time. It's likely that something other than RF interference was causing that >60% Retry % at ~4:45:10am.

Since I've set the output to be collected roughly every 2 minutes (every 125 seconds; see Data Gathering), each point on the graph equals the average of about 2 minutes of data.

A second example, over 1 hour, introducing my 'noise generator' at ~25 minutes:

This graph was saved as a PNG (disk icon on the bottom toolbar).

You can see the Non-802.11 CU increase, while the Quality decreases accordingly. 802.11 Retries are not present (or, more specifically, did not increase) in this example, since I placed the noise generator a foot from the IAP. The IAP can't count 802.11 frames with the retry bit set when it can't demodulate any 802.11 frames at all.

# show ap debug radio-stats 1 | i "Ch Busy perct"

Ch Busy perct @ beacon intvl        99 99 99 99 99 99 99 99 0 99 100 99 0 99 99 99 0 99 0 99 99 99 0 99 100 99 0 99 0 99 
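That line is easier to judge as a couple of summary numbers. A small sketch, assuming the label text and space-separated values look exactly like the output above; the function names are made up, and I don't know what the 0 entries represent, so they are only counted, not interpreted.

```python
BUSY_LABEL = "Ch Busy perct @ beacon intvl"


def parse_busy_line(line, label=BUSY_LABEL):
    """Return the per-beacon-interval busy percentages as a list of ints."""
    return [int(tok) for tok in line.split(label, 1)[1].split()]


def summarize_busy(values, threshold=99):
    """Count how many intervals were at or above the threshold, and how many were 0."""
    saturated = sum(1 for v in values if v >= threshold)
    return {
        "samples": len(values),
        "saturated": saturated,
        "zeros": values.count(0),
        "saturated_pct": 100 * saturated / len(values),
    }


LINE = ("Ch Busy perct @ beacon intvl        99 99 99 99 99 99 99 99 0 99 100 "
        "99 0 99 99 99 0 99 0 99 99 99 0 99 100 99 0 99 0 99")

print(summarize_busy(parse_busy_line(LINE)))
```

For the output above this reports 30 samples, 23 of them at 99% or higher and 7 at 0.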

And here's a pcap, filtered by AP transmitter address, showing a 30-second duration where the IAP cannot beacon due to CCA-ED:

Usage

The script will work for either radio (2.4 or 5GHz). I've left it fairly commented, so you should be able to change the axes, time interval and even "what" is graphed with relative ease.

When you run it, either by making it executable (chmod +x) or by invoking python directly, it will print simple usage instructions, but basically it's:

2.4GHz graph:
python graph_output.py -f <air_recorder_file>

5GHz graph:
python graph_output.py -5 -f <air_recorder_file>
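For anyone writing their own variant, a command line with the same two flags can be sketched with argparse. This is an illustration, not the actual source of graph_output.py; the dest names are made up, and the wifi0 = 5GHz mapping is an assumption (the sample output above only confirms that wifi1 was the 2.4GHz radio on this IAP).

```python
import argparse


def build_parser():
    """Parser mirroring the -f and -5 flags from the usage lines above."""
    parser = argparse.ArgumentParser(
        description="Graph Q, U and R from an AirRecorder log file")
    parser.add_argument("-f", dest="log_file", required=True,
                        help="AirRecorder output file to read")
    # "-5" looks like a negative number, so an explicit dest is required.
    parser.add_argument("-5", dest="five_ghz", action="store_true",
                        help="graph the 5GHz radio instead of 2.4GHz")
    return parser


def radio_name(five_ghz):
    """Assumed radio naming: wifi1 is 2.4GHz (as in the sample), wifi0 is 5GHz."""
    return "wifi0" if five_ghz else "wifi1"


args = build_parser().parse_args(["-5", "-f", "air-recorder.log"])
print(args.log_file, radio_name(args.five_ghz))
```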

The only non-standard dependency is the matplotlib library:

Linux: sudo apt-get install python-matplotlib

OSX: use pip, macports or "sudo easy_install matplotlib"

Windows: meh. untested.

Noise Generator

While it's relatively easy to find consumer electronics that interfere with WiFi, I wanted one that:

A) Is easy to turn on & off

B) Completely consumes the channel(s) (continuous Tx/100% duty cycle)

C) Is cheap

This quarter-sized chip for a video transmitter, at $7.90, fit all those requirements.

According to the specs, the transmitter requires:

  • Power Supply: 5V
  • Current: 90mA

USB power seemed to fit the bill, so I used a battery charger I received from Live last year. Small enough and conveniently has an on/off button, though any USB port should do:

Just leave the two data wires disconnected, and connect/solder the hot (red) and ground (black) wires to the appropriate leads, according to the schematic (kind of... my chip wasn't 100% represented by the manufacturer's diagram, but close enough):

Disclaimer: if you burn down your house, or kill yourself building this, I'm not responsible.
Disclaimer2: I performed my testing in a Faraday cage. Intentionally interfering with WiFi is illegal in the US.

Immediately after turning it on, the spectrum analyzer indicates a center frequency of 2.468 GHz, a 100% duty cycle, and an effect on WiFi channels 6-14. This is with the transmitter placed a foot away from the spectrum analyzer.

This is with no jumpers soldered, and appears to match up with the manufacturer's spec:

That's closest to WiFi Channel 11 (US) or 12 (Elsewhere), though depending on where it is in relation to the client and/or AP, it will affect WiFi channels 6-14.
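The "closest channel" claim follows from the standard 2.4GHz channel plan: channels 1-13 are centered at 2407 + 5 x channel MHz, and channel 14 sits at 2484 MHz. A quick sketch of that arithmetic (the function names are made up):

```python
def channel_centers_mhz():
    """Center frequencies of the 2.4GHz WiFi channels, in MHz."""
    centers = {ch: 2407 + 5 * ch for ch in range(1, 14)}  # channels 1-13
    centers[14] = 2484  # channel 14 does not follow the 5 MHz spacing
    return centers


def nearest_channel(freq_mhz, allowed=None):
    """Return the channel whose center is closest to freq_mhz.

    allowed optionally restricts the search, e.g. range(1, 12) for the US.
    """
    centers = channel_centers_mhz()
    candidates = list(allowed) if allowed is not None else list(centers)
    return min(candidates, key=lambda ch: abs(centers[ch] - freq_mhz))


print(nearest_channel(2468))                       # all channels
print(nearest_channel(2468, allowed=range(1, 12))) # US channels 1-11
```

The transmitter's 2468 MHz center is 1 MHz from channel 12 (2467 MHz) and 6 MHz from channel 11 (2462 MHz), which is where the "11 (US) or 12 (Elsewhere)" answer comes from.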

Along with saturating a channel, you can use this to create a less detrimental "Noisy" environment by placing the transmitter at varying distances from the Client and/or AP.

References
Python script on GitHub -- The usual disclaimers apply, i.e. this script is probably full of errors. Feedback welcome.
AirMagnet Spectrum XT with interferer
AirRecorder Tutorial Part-1
AirRecorder Tutorial Part-2
ARM Details -- This is NOT accurate for Instant APs. I have not found an IAP version of this doc.
Configuring ARM profiles (changing metrics) -- This does NOT apply to IAP. IAP is less configurable.
Interference Immunity Definitions -- No examples or specifics on metric changes.