Cisco CleanAir Review

Cisco's CleanAir system integrates spectrum analysis into wireless access points to provide real-time, always-on visibility into external non-Wi-Fi sources of interference present in the environment.

My organization was involved in the beta evaluation of the CleanAir product, and the product has been released for several months now to all customers. However, I have refrained from posting on this product until now in order to be able to provide conclusions on the product from live "real-world" deployments.

Having recently deployed the CleanAir product in two production facilities, I would like to now share some of our results and findings.

Brief Overview of CleanAir
Released last spring, this system is available to customers as of the 7.0 version of code and requires the newer 3500 series wireless access points. The 3500 AP series hardware has been augmented with a dedicated spectrum analysis chipset to detect and report sources of interference. The AP reports findings up to the wireless controller, where the information can be integrated into the Radio Resource Management (RRM) feature set to automatically optimize the network channel and power settings to avoid severe and/or persistent sources of interference.

A funny anecdote about the "CleanAir" product name - many non-technical individuals being briefed on the technology originally thought it referred to either a.) a "green IT" initiative or b.) removing foul smells from the air.

The SAgE (spectrum analysis engine) chipset is architected in parallel with the Wi-Fi chipset and does not impede wireless performance. If the incoming energy is recognized as a Wi-Fi signal (specifically the Wi-Fi preamble), it is sent to the Wi-Fi chipset in the AP. If not, it is passed to the SAgE chipset for spectrum analysis.

Many administrators (and even some engineers) confuse the meaning of "interference" to include medium contention from other nearby Wi-Fi networks. This is not correct. Strictly speaking, "interference" is non-Wi-Fi energy. The CleanAir system only attempts to measure, identify, and classify sources of non-Wi-Fi interference. This is evident in the basic CleanAir chipset architecture, essentially splitting incoming signals to either the Wi-Fi or SAgE chipset, but never both.

Air quality index (AQI) is an inverse measure of how much interference is in the environment. Air quality is at 100% when no interference is present, and is reduced based on the energy strength and duty cycle (airtime occupied) of interference sources. The AP takes SAgE samples every second, calculates AQI every 15 seconds, and summarizes it into 30-second intervals, which are then reported up to the controller every 15 minutes (by default). The exception is when an administrator is actively monitoring an AP radio interface from the WCS or WLC: the AP is then automatically instructed to switch into a rapid update mode, which shortens the reporting period to 30 seconds to provide more real-time information.
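To put those intervals in perspective, here is a small sketch of the arithmetic. The interval values are the ones described above; the constant and function names are my own and do not come from any Cisco software.

```python
# Reporting intervals described above, in seconds (names are illustrative only).
SAMPLE_INTERVAL = 1                  # SAgE sample taken by the AP
AQI_INTERVAL = 15                    # AQI calculation
SUMMARY_INTERVAL = 30                # AQI summary
DEFAULT_REPORT_INTERVAL = 15 * 60    # report to the controller (default)
RAPID_REPORT_INTERVAL = 30           # rapid update mode


def counts(report_interval):
    """Return (samples, AQI calculations, summaries) covered by one report."""
    return (
        report_interval // SAMPLE_INTERVAL,
        report_interval // AQI_INTERVAL,
        report_interval // SUMMARY_INTERVAL,
    )


print(counts(DEFAULT_REPORT_INTERVAL))  # (900, 60, 30)
print(counts(RAPID_REPORT_INTERVAL))    # (30, 2, 1)
```

In other words, a default report to the controller covers 900 raw samples rolled up into 30 summaries, while rapid update mode delivers each 30-second summary as soon as it is ready.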

A new RRM component, called Event-Driven RRM (EDRRM), allows the controller to take immediate action to mitigate severe interference issues rather than waiting for the RRM configured interval to take action. The sensitivity threshold determines the AQI value for an individual AP radio that is required in order for EDRRM to kick into effect and make an adjustment in order to avoid the source of interference. Three threshold settings are available to control what AQI value triggers RRM events: High Sensitivity requires AQI to fall below 60, Medium Sensitivity requires AQI below 50, and Low Sensitivity requires AQI below 35. Additionally, air quality SNMP trap alarms are sent when the AQI drops below a value of 35 (by default).
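The threshold logic can be sketched in a few lines. This is only an illustration of the decision described above, not the controller's actual implementation; the dictionary keys and function names are assumptions of mine.

```python
# AQI thresholds per EDRRM sensitivity setting, as listed above.
EDRRM_THRESHOLDS = {"high": 60, "medium": 50, "low": 35}
AQ_TRAP_THRESHOLD = 35  # default air quality SNMP trap threshold


def edrrm_triggers(aqi, sensitivity):
    """True if the radio's AQI has fallen below the configured threshold."""
    return aqi < EDRRM_THRESHOLDS[sensitivity]


def trap_sent(aqi, threshold=AQ_TRAP_THRESHOLD):
    """True if the AQI is low enough to raise an air quality trap."""
    return aqi < threshold


# A radio reporting an AQI of 55 only triggers EDRRM at High Sensitivity.
for level in ("high", "medium", "low"):
    print(level, edrrm_triggers(55, level))
```

Note that "High Sensitivity" means the highest AQI threshold, so it reacts to the mildest interference.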

Pseudo-MACs (PMAC) are used to correlate interference sources being detected by multiple APs and merge report information on the device, which is likely to be the case in most enterprise deployments. CleanAir has to detect the interferer for a long enough period of time (classification requires 5-60 seconds of activity) in order to correlate an interferer as the same device being detected by multiple APs. Since interference sources do not have MAC addresses, a pseudo-MAC is created to uniquely identify each interference source. "Clustering" is used to represent a merged record for an interference source from multiple APs. Currently, cluster information is discarded once the detected energy source stops, and is not persistent for any length of time after the interference stops or is removed from the environment.
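Conceptually, the merge step looks something like the sketch below. It assumes each AP report already carries a pseudo-MAC; how CleanAir actually derives that value is not shown here, and the field names and PMAC strings are hypothetical.

```python
from collections import defaultdict


def cluster_reports(reports):
    """Group per-AP interferer reports by pseudo-MAC into one merged record each."""
    clusters = defaultdict(lambda: {"device_type": None, "aps": {}})
    for report in reports:
        cluster = clusters[report["pmac"]]
        cluster["device_type"] = report["device_type"]
        # Keep the strongest signal level seen by each detecting AP.
        previous = cluster["aps"].get(report["ap"], report["rssi"])
        cluster["aps"][report["ap"]] = max(previous, report["rssi"])
    return dict(clusters)


# Hypothetical reports: one DECT phone heard by two APs, plus a microwave oven.
reports = [
    {"pmac": "pmac-01", "device_type": "DECT phone", "ap": "AP1", "rssi": -62},
    {"pmac": "pmac-01", "device_type": "DECT phone", "ap": "AP2", "rssi": -71},
    {"pmac": "pmac-02", "device_type": "Microwave oven", "ap": "AP2", "rssi": -55},
]

clusters = cluster_reports(reports)
for pmac, cluster in clusters.items():
    print(pmac, cluster["device_type"], sorted(cluster["aps"]))
```

Three AP reports collapse into two clusters, which is the behavior an administrator wants to see in the device list.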

Persistent device avoidance allows the CleanAir system to recognize devices that are fixed in position and unlikely to move, and to avoid recurring interference issues in the areas affected by such devices. The interference sources may be continuous or periodic in nature, but either way are likely to impact the same physical area over and over again. Examples include microwave ovens and mounted video cameras. CleanAir recognizes these persistent devices and instructs nearby APs to operate on alternate channels even if the persistent device is no longer observed. Only after a persistent device has been absent for more than 7 days does the CleanAir system allow APs in the affected areas to re-use those channels.
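The 7-day rule amounts to a simple age check, sketched below. The function name and the timestamps are hypothetical; only the 7-day window comes from the behavior described above.

```python
from datetime import datetime, timedelta

PERSISTENCE_WINDOW = timedelta(days=7)  # absence period described above


def channel_reusable(last_seen, now):
    """True once the persistent device has been absent for more than 7 days."""
    return now - last_seen > PERSISTENCE_WINDOW


# Hypothetical example: a breakroom microwave last observed at lunchtime.
last_seen = datetime(2011, 1, 3, 12, 30)

print(channel_reusable(last_seen, datetime(2011, 1, 4, 9, 0)))   # False
print(channel_reusable(last_seen, datetime(2011, 1, 11, 9, 0)))  # True
```

The next morning the channel is still avoided even though the microwave is idle; it only becomes available again after more than a week without a sighting.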

Reporting is tied into Cisco's WCS management platform and location tracking is performed through the Mobility Services Engine (MSE) context-aware service. WCS provides a central dashboard for administrative staff to monitor network performance, view historical interference trending data, and identify the location of the offending interferer when coupled with the MSE appliance for easy removal of the offending device.

For more information on the CleanAir feature set, see these excellent sources of information:
Cisco CleanAir Design Guide (The Definitive Resource from Cisco on CleanAir)

Deployment and Setup
Deployment and configuration of CleanAir are intuitive and straight-forward. The following steps are involved when implementing the product:

1. Upgrade WLC code to version 7.0 or later

2. Install CleanAir capable access points (3500 series). Note - Cisco does not recommend a "salt-and-pepper" approach that mixes CleanAir APs with other APs. This is because EDRRM can only take action with CleanAir capable APs and does not currently affect the broader RRM ecosystem. Therefore, other APs would not benefit from spectrum data reported by nearby CleanAir APs.

3. Configure CleanAir Settings for each Network Band

4. Configure EDRRM in the RRM > DCA Section for each Network Band

To view the configuration of CleanAir on the system, issue the show 802.11b cleanair config command (substitute '802.11a' to see the config for the 5GHz band).

5. Optionally, Configure the MSE to Track Location for Interference Sources

6. Monitor Interference Activity 

From the WLC (Monitor > Cisco CleanAir):

From the WCS Dashboard:

From WCS Maps (if using the MSE to locate interference sources) (Monitor > Maps):

7. Monitor EDRRM Activity from WCS (Monitor > RRM)

8. Create Interference Reports from WCS (Reports > Report Launch Pad)

Real-World Findings

- Interference Detection - CleanAir has been adept at accurately finding and reporting on multiple sources of interference in our deployments. In one environment, which consists of carpeted office space, it has discovered numerous DECT devices (likely wireless headsets for desk phones) as well as microwave ovens. It was amazing to see just how many floors and areas of the building have leaky microwaves! A real eye-opener. Another environment, which is warehouse space, has exhibited microwave ovens in breakrooms and Bluetooth devices that are likely owned by employees and active in their pockets while they work (most likely cell phones).

Overall, validation of CleanAir findings with laptop-based spectrum analysis has confirmed the devices and severity levels being reported in our production environments.

- Numerous Similar Interference Entries - Some interference sources are reported multiple times because the PMAC cluster and merge process does not always seem to work. We have only experienced this for a few types of interferers, most notably DECT phones. It is annoying to see a list of 5 DECT phones when in reality only one exists but is being detected by 5 APs. More information on this in the "Opportunities for Improvement" section below.

- Good Coverage - Reception of Wi-Fi signals and interference energy exceeds the range for APs and clients to communicate reliably. So, network designs for data and voice should have no problem providing adequate coverage and visibility for CleanAir spectrum analysis.

As Cisco states in their CleanAir Design Guide, "The technology has been designed to compliment the current best practices in Wi-Fi deployment. This includes the deployment models of other widely used technologies such as Adaptive wIPS, Voice, and location deployments." 

- Well-Tuned EDRRM - We have found that EDRRM events are rare, and really only occur when interference is severe enough to warrant a channel change to improve client performance. Fears of channel changes with reckless abandon are unfounded, and network operation has been stable.

Benefits of CleanAir
The benefits of integrated spectrum analysis, and the CleanAir system in particular, include the following.

Event-Driven RRM - allows the controller to take immediate action to avoid severe sources of interference, translating into reduced network downtime, improved performance for clients, and faster time to resolution for client-impacting incidents. The Air Quality Index (AQI) drives Event-Driven RRM to make on-demand changes. An example would be a wireless video camera with strong narrowband interference that effectively kills network operation on the channel.

Persistent Interferer Avoidance - allows the controller to recognize sources of interference that may be lower-severity, yet occurring in a repeated fashion and degrading network performance. By tracking these repeating interference events, the network can pro-actively avoid such problems. An example would be a microwave oven that only gets turned on during lunch breaks but still needs to be avoided all the time so channel changes don't occur every day over the lunch hour.

Enhanced Network Visibility - Monitoring air quality through the WLC and WCS is easy and intuitive. The WCS dashboard provides quick snapshot information for administrators checking in on network operation. Air quality reports allow scheduled review of all activity in the environment. Should severe interference sources be found, administrators now have the visibility within their toolbag to positively confirm or deny the presence of interference, rather than trying to diagnose issues from client-reported symptoms. This is tremendously beneficial for removing uncertainty and speculation around Wi-Fi performance issues, as well as for removing offending devices from the environment to prevent future issues.

Accurate Device Classification - With a dedicated spectrum analysis chipset, the network gains precise and accurate information on non-Wi-Fi sources of interference. Other vendor solutions aiming to provide spectrum analysis capability rely on the Wi-Fi chipset itself to report on non-Wi-Fi energy. The problem with that approach is that Wi-Fi chipsets are designed primarily to modulate and de-modulate Wi-Fi signals, not to identify other sources of energy. Spectral resolution is also vastly superior with a chipset dedicated to spectrum analysis, which allows CleanAir to accurately identify spectral signatures to classify devices and report accurately on energy strength and duty cycle. Solutions based on Wi-Fi chipsets can take a guess, at best. This is especially true of narrowband interference sources or frequency-hopping wireless systems, where more granular spectrum resolution bandwidth can identify individual hopping patterns, as can be experienced with Bluetooth type devices for example.

Cisco's CleanAir spectral resolution is documented at 78kHz (on a 20MHz channel dwell) and 156kHz (on a 40MHz channel dwell), versus a standard Wi-Fi chipset at 312kHz. In addition, even other spectrum cards such as the MetaGeek Wi-Spy 2.4i at 373kHz resolution and AirMagnet Spectrum XT at 156.3kHz do not offer resolution as fine as CleanAir's. Also of note, the rated spectral resolution bandwidth of the Cisco Spectrum Expert laptop card is a minimum of 10kHz, so CleanAir is not quite as accurate as the laptop card and engineers may notice slight display differences between the two products.

Update: The newer Wi-Spy DBx product has much better resolution bandwidth, rated at 24kHz. The Wi-Spy 2.4i resolution bandwidth is 373kHz, not 328kHz as originally posted. I also added links to the AirMagnet Spectrum XT, Cisco CleanAir, and Cisco Spectrum Expert datasheets as requested by some readers.

Update 2: It appears that the resolution bandwidth listed in the CleanAir Design Guide - Glossary is inaccurate, swapping the values for 20MHz versus 40MHz dwell times. The information above has been updated to reflect the correct values.
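To make the resolution figures more tangible, the sketch below estimates how many frequency bins each resolution bandwidth yields across the dwell width. This is simple back-of-the-envelope division using the numbers quoted above, not a vendor formula, and the function name is my own.

```python
def resolvable_bins(span_hz, rbw_hz):
    """Approximate number of frequency bins across a span at a given resolution."""
    return round(span_hz / rbw_hz)


# (description, span in Hz, resolution bandwidth in Hz) using the figures above.
figures = [
    ("CleanAir, 20MHz dwell", 20e6, 78e3),
    ("CleanAir, 40MHz dwell", 40e6, 156e3),
    ("Standard Wi-Fi chipset, 20MHz", 20e6, 312e3),
]

for name, span, rbw in figures:
    print(f"{name}: ~{resolvable_bins(span, rbw)} bins")
```

By this rough measure CleanAir sees about 256 bins per dwell versus about 64 for a standard Wi-Fi chipset, roughly four times the detail when trying to pick out a narrowband or frequency-hopping signature.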

Remote Troubleshooting - Placing the AP into SE-Connect mode allows a Wi-Fi engineer to remotely connect to the AP and view real-time spectrum analysis information through their workstation with Cisco Spectrum Expert software installed. This reduces the need for expensive on-site travel by an engineer, decreases time to resolution of incidents, and improves the ability of central IT staff to troubleshoot remote branch offices.

Opportunities for Improvement
For a first-generation product, Cisco seems to have nailed CleanAir. However, there are a few features that could improve the solution as it stands today.

Unclassified Interferer Reporting - Currently, CleanAir only reports on interference sources that it can classify. This is also reflected in the AQI value for each AP radio. Any sources of interference which cannot be classified are not reported in the device list and do not affect AQI. They are visible, however, in the WLC via the Air Quality Graphs for individual radios. This behavior is on purpose because CleanAir is specifically architected to classify specific non-Wi-Fi interference sources (currently 20+) and not to speculate on unknown energy.

I agree with this approach when it comes to EDRRM change activity, but disagree when it comes to reporting and alarms. CleanAir should be enhanced to give network administrators more visibility into unknown energy sources through automated AQI reporting and alerting, raising a red flag for a human to investigate the source. Perhaps differentiating AQI between classified and unclassified device severity would still allow EDRRM to be based only on the sources which have been classified, yet allow administrators visibility into air quality that takes into account all energy being detected.

In addition, 802.11b DSSS/CCK modulation poses problems with detection because adjacent channel activity is hard to classify as either Wi-Fi or interference due to the spread spectrum modulation where most of the signal is around the center channel frequency, causing problems detecting side-lobe activity. This problem may be feeding energy to the SAgE chipset rather than the Wi-Fi chipset and resulting in some amount of detected energy remaining unclassified by CleanAir.

Off-Channel Interference Scanning - The current RRM channel scanning process uses short dwell times and is already being used for neighbor discovery, rogue scanning and aWIPS. Spectrum information is collected during these times, but the dwell does not provide enough time to reliably classify devices; therefore data collected during off-channel scanning is suppressed by the system. One option is to deploy monitor mode APs, which spend significantly longer dwell times on channels and so give CleanAir adequate time to detect and classify interference sources. Another option is to use on-demand directed off-channel scanning. This feature would allow an AP to detect interference sources on off-channels during RRM scanning, or receive reports of interference sources on other channels from nearby APs through the controller, and queue up other channels to scan when client traffic activity is low.

Duplicate Interference Source Entries - PMAC does not always work as expected, and "bouncing" may occur when the interferer comes and goes faster than CleanAir can classify the source, which results in multiple PMACs being created for the same device or detecting APs being listed as "unknown" (location may still be fairly accurate, but reported information may be incomplete). An enhancement should be made to retain detected energy "clusters" for a period of time to better correlate a single interferer that is intermittent into a single PMAC.

Local Mode CleanAir - An enhancement to the CleanAir product should be made to allow remote troubleshooting to be performed without changing the mode of the AP, allowing it to continuously serve clients while remote spectrum analysis is performed. The SE-Connect mode should be discarded in favor of real-time spectrum analysis in Local mode.

A Note on Beamforming
One competitor has claimed that integrated spectrum analysis is great and all, but most performance issues come from the Wi-Fi network stomping on itself through medium contention among co-channel access points, and that dynamic beamforming can avoid non-Wi-Fi sources of interference.

Using beamforming to reduce Wi-Fi medium contention and to reduce received signal strength from interference sources helps limit the negative impact on network performance to some extent, but it cannot eliminate the impact completely. Beamforming therefore does not obviate the need for integrated spectrum analysis. In an ideal world, a product would have both.

Sure, changing channels disrupts the client session and should be avoided if possible, and beamforming may provide a bit more SNR and wiggle room to avoid a channel change. But there will inevitably be cases where the strength of the interfering device is so overwhelming (narrowband video cameras, for example) that it completely wipes out the channel, even with beamforming on the APs. In those instances, would you rather have 1.) beamforming without integrated spectrum intelligence, which just sits there confused and inoperable, or 2.) visibility into the issue and the ability to take corrective action to change channels and get the clients working again? Yeah, option number 2 is my choice too - get clients working again without manual intervention!

End result: integrated spectrum analysis is still a beneficial feature and cannot be discounted by vendors with dynamic beamforming capability.

The current CleanAir feature is still in its infancy, yet it already provides a great foundation for wireless network administrators to gain visibility into external sources of network problems. This feature is of great value to administrators today, and will only become more important as Wi-Fi networks become more pervasive and mission-critical to business operations.

Here are a few reasons why integrated spectrum analysis will be required to support mission-critical wireless networks now and into the future:

- Unlicensed spectrum use is growing. More devices, more uses, more potential for interference!

- Voice over Wi-Fi performance requires a well-tuned wireless network. Part of that is having some semblance of control over factors outside the realm of network control. Having visibility into these factors allows administrators better control over their environment to eliminate outside influences where required.

I have found Cisco's CleanAir to be a great initial product offering for integrated spectrum analysis. With the emergence of Wi-Fi networks as mission-critical to business operations, the maturation of voice over Wi-Fi requiring a highly stable and well-performing network, and growing use of unlicensed spectrum by devices of all types, integrated spectrum analysis will give organizations much-needed visibility into external sources of problems. This will allow organizations to solve performance problems rather than "speculate" as to the root-cause.


PS - Thank you Joel, Darrin, and Pete for support during the beta evaluation and production deployments of this product! This post is specifically dedicated to your teams for this (and other) reasons previously discussed ;)