Access to wireless local area networks (WLANs) is now regarded as a basic service. Despite this importance, very few WLANs operate at their maximum efficiency. Current deployments often contain a high density of access points (APs), which can have a major negative impact on performance because of the listen-before-talk property of 802.11. The recent amendment to the 802.11 standard (802.11ax, or Wi-Fi 6) could be a game-changer, as it enables WLANs to dynamically adjust both the transmission power of APs and their clear channel assessment (CCA) threshold. In this work, we frame the tuning of these two parameters as a Multi-Armed Bandit problem, which allows us to derive an efficient and robust data-driven solution based on Thompson sampling, an original sampling of WLAN configurations, and a tailor-made reward function that assesses their quality.
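
To make the bandit framing concrete, the following is a minimal sketch of generic Gaussian Thompson sampling over a small grid of (transmission power, CCA threshold) pairs. It is not the method developed in this work: the grid values, the `toy_reward` function, the unit-variance noise model, and all names are hypothetical and chosen only for illustration. In particular, the configuration sampling and the reward function proposed here are not reproduced.

```python
import random
from dataclasses import dataclass
from itertools import product


@dataclass
class ArmPosterior:
    """Gaussian posterior over one arm's mean reward.

    Assumes a N(0, 1) prior and unit-variance reward noise (illustrative choice).
    """
    mean: float = 0.0
    count: int = 0

    def sample(self, rng):
        # Posterior standard deviation is 1 / sqrt(count + 1) under the assumptions above.
        return rng.gauss(self.mean, 1.0 / (self.count + 1) ** 0.5)

    def update(self, reward):
        # Conjugate update: the new posterior mean is (sum of rewards) / (count + 2).
        self.mean = (self.mean * (self.count + 1) + reward) / (self.count + 2)
        self.count += 1


def thompson_sampling(arms, reward_fn, rounds, seed=0):
    """Run Thompson sampling and return the per-arm posteriors."""
    rng = random.Random(seed)
    posteriors = {arm: ArmPosterior() for arm in arms}
    for _ in range(rounds):
        # Draw one plausible mean reward per arm and play the arm with the largest draw.
        chosen = max(arms, key=lambda a: posteriors[a].sample(rng))
        posteriors[chosen].update(reward_fn(chosen, rng))
    return posteriors


def toy_reward(arm, rng):
    """Hypothetical noisy reward that peaks at 15 dBm power and -72 dBm CCA threshold."""
    power_dbm, cca_dbm = arm
    quality = 1.0 - abs(power_dbm - 15) / 15 - abs(cca_dbm + 72) / 20
    return quality + rng.gauss(0.0, 0.1)


# Illustrative configuration grid: (transmission power in dBm, CCA threshold in dBm).
ARMS = list(product([5, 10, 15, 20], [-82, -72, -62]))

if __name__ == "__main__":
    result = thompson_sampling(ARMS, toy_reward, rounds=2000)
    most_played = max(result, key=lambda a: result[a].count)
    print("most played configuration:", most_played)
```

In this sketch each arm is one fixed network-wide configuration. A real WLAN has one such pair per AP, so the space of joint configurations grows quickly with the number of APs and cannot simply be enumerated as a small grid.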