Enterprise Wireless Local Area Networks (WLANs) consist of multiple Access Points (APs) covering a given area. In these networks, interference is mitigated by allocating different channels to neighboring APs. In addition, stations are allowed to associate with any AP in the network, selecting by default the one from which they receive the highest power, even if it is not the best option in terms of network performance.
Finding a network configuration that maximizes the performance of enterprise WLANs is a challenging task, given the complex dependencies between APs and stations. Recently, reinforcement learning techniques have emerged in wireless networking as an effective way to explore the impact of different network configurations on system performance and to identify those that perform best.
In this paper, we study whether Multi-Armed Bandits (MABs) offer a feasible solution to the decentralized channel allocation and AP selection problems in enterprise WLAN scenarios. To do so, we equip APs and stations with agents that, by implementing the Thompson sampling algorithm, explore and learn which channel is best to use and which AP is best to associate with, respectively. Our evaluation is performed over randomly generated scenarios that cover different network topologies and traffic loads. The results show that the proposed adaptive framework using MABs outperforms the static approach (i.e., always using the initial default configuration, usually random) regardless of the network density and the traffic requirements. Moreover, we show that the proposed framework reduces the performance variability between different scenarios. The results also show that we achieve the same performance as static strategies, or better, with fewer APs for the same number of stations. Finally, special attention is paid to how the agents interact: even though the agents operate in a completely independent manner, their decisions have interrelated effects, as they act on the same set of channel resources.
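To illustrate the kind of learning loop an agent may run, the following minimal sketch shows a Beta-Bernoulli Thompson sampling agent choosing among a set of arms (channels for an AP, or candidate APs for a station). The class name `ThompsonSamplingAgent`, the helper `run_demo`, the binary reward model, and the per-arm satisfaction probabilities are all assumptions made for illustration; the reward definition and the exact Thompson sampling variant used in this work are not stated in this section.

```python
import random


class ThompsonSamplingAgent:
    """Hypothetical Beta-Bernoulli Thompson sampling agent (illustrative only)."""

    def __init__(self, n_arms, seed=None):
        self.rng = random.Random(seed)
        # Beta(1, 1) prior on each arm, i.e. uniform over [0, 1].
        self.successes = [1.0] * n_arms
        self.failures = [1.0] * n_arms

    def select_arm(self):
        # Draw one sample from each arm's posterior and play the largest.
        samples = [
            self.rng.betavariate(a, b)
            for a, b in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        # Assumes reward lies in [0, 1] (e.g. 1.0 if the traffic demand was met).
        self.successes[arm] += reward
        self.failures[arm] += 1.0 - reward


def run_demo(true_means, steps=2000, seed=0):
    """Simulate one agent against fixed, assumed per-arm success probabilities."""
    agent = ThompsonSamplingAgent(len(true_means), seed=seed)
    env_rng = random.Random(seed + 1)
    pulls = [0] * len(true_means)
    for _ in range(steps):
        arm = agent.select_arm()
        reward = 1.0 if env_rng.random() < true_means[arm] else 0.0
        agent.update(arm, reward)
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    # Three hypothetical channels with assumed satisfaction probabilities.
    print(run_demo([0.2, 0.5, 0.8]))
```

In this sketch the environment is stationary and there is a single agent. In the decentralized setting described above, several such agents would act on the same channels, so the reward observed by one agent would also depend on the choices of the others.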