--- title: "multithreading a gui" date: "28/05/2025" --- # the problem On the [Cavalier Autonomous Racing](https://autonomousracing.dev) team at my school, we have a pretty ~~useless~~ cool basestation acting as a real-time telemetry visualization tool. We want to expand this GUI significantly, including tabulated windows for each sub-team. Leveraging concurrency is vital for keeping things efficient and data up-to-date but this is easier said than done. Consider the original design: # original architecture Originally, the GUI followed the traditional QtC++ ROS single-threaded pattern. The GUI event loop ran on one thread, the main one: ```cpp void spin_node(basestation::Gui* gui, ...) { ... rclcpp::executors::SingleThreadedExecutor executor; rclcpp::Node::SharedPtr node = std::make_shared(init_gui); executor.add_node(node); while (init_gui->is_running) { executor.spin_once(timeout); } } ``` All of 30+ topic callbacks were registered on this thread: ```cpp GuiNode::GuiNode(basestation::Gui* init_gui) : Node("uva_gui") { this->gui = init_gui; rclcpp::QoS best_effort_qos = rclcpp::QoS(rclcpp::QoSInitialization(RMW_QOS_POLICY_HISTORY_KEEP_LAST, 1)); best_effort_qos.best_effort(); this->pubDesVel_ = create_publisher("vehicle/set_desired_velocity", 1); ... this->subAutonomy_ = create_subscription("/telemetry/autonomy", 1, std::bind(&GuiNode::receiveAutonomy, this, std::placeholders::_1)); } ``` The Qt Framework is quite complex and beyond the scope of this post. Big picture, the GUI controls a bunch of data-independent visualizations that are handled on this GUI thread. How do we optimize this? It is necessary to take a step back and observe the structure of the entire application: ![single-threaded-design](/public/posts/multithreading-a-gui/single-threaded-design.webp) Many flaws are now clear: - Single, high-frequency topics can flood the GUI, causing stale data for other Widgets - Data races in which multiple callbacks modify shared data are especially difficult to handle - Poor separation of responsibility: the `MainWindow` is responsible for too much—it should not need first-hand knowledge of every Widget's API Luckily, we're far from the first (and the last) to encounter a problem such as this. Enter ROS's exhaustive suite of concurrency tools. # multi-threaded architecture First, let's pin down what exactly can be parallelized. 1. Subscriptions: group subscription callbacks[^1] according to the displayed widgets. For the CAR, this was three `Vehicle`, `Trajectory`, and `Perception` groups. Now, widgets can be updated as dependent data is received. 2. Visualizations: now that data is handled independently, GUI widgets can be updated (with care) as well. > Just kidding. Because of ROS, all GUI widgets can only be updated on the main GUI thread. This lead me to the follow structure: ![multi-threaded-design](/public/posts/multithreading-a-gui/multi-threaded-implementation.webp) - Three callback groups are triggered at differing intervals according to their urgency on the GUI node - A thread-safe queue[^2] processes all ingested data for each callback group - Every 10ms, the GUI is updated, highest to lowest urgency messages first - The `MainWindow` houses the visualization widgets as before—however, the GUI thread actually performs the update logic - GUI Widgets were re-implemented to be thread-safe with basic locking, a small amount of overhead for safe memory access ## \*actual\* multi-threaded implementation Sounds good, right? Well, I should've done my research first. The Qt framework has *already internalized* the logic for this entire paradigm of multithreaded code. Turns out all I need are: - [Signals/slots](https://doc.qt.io/qt-6/signalsandslots.html) and a `Qt::QueuedConnection` - Running the GUI with `MultithreadedExecutor` As it turns out, signals and slots *automatically* leverage ROS's internal thread-safe message queue, ensuring deserialization one at a time. The following (final) design employs two threads: 1. Main GUI Thread (Qt Event Loop): handles UI rendering + forward signals/slots to executor 2. Executor Thread: runs callbacks and publishes messages The executor is simply spawned in the main thread: ```cpp std::thread ros_thread([this]() { executor.spin(); }); ``` Data flows from a called subscription → queued signal → signal connected to a slot → slot runs the GUI]widget when scheduled. ```cpp // 1. Subscribe to a topic this->subAutonomy_ = create_subscription<...>("/telemetry/autonomy", 1, std::bind(&GuiNode::receiveAutonomy, ...)); // 2. Queue a signal to the emitter void GuiNode::receiveAutonomy(deep_orange_msgs::msg::Autonomy::SharedPtr msg) { this->des_vel = msg->desired_velocity_readout; signal_emitter->autonomyReceived(msg); } // 3. Signal connects to slot (registered in initialization) QObject::connect(signal_emitter, &GuiSignalEmitter::autonomyReceived, window, &MainWindow::receiveAutonomy, Qt::QueuedConnection); // 4. Slot-performed logic when ROS runs thread void MainWindow::receiveAutonomy( const deep_orange_msgs::msg::Autonomy::SharedPtr msg) { ... } ``` Elegantly, registering a signal/slot with `Qt::QueuedConnection` does *all of the hard work*: - Queueing messages in the target thread's loop - Actual slot execution happens in the GUI thread - Prevents cross-thread memory access/critical sections - Qt event-level synchronization # retrospective Looking back, this GUI should've been implemented with a modern web framework such as [React](https://react.dev/) with [react-ros](https://github.com/flynneva/react-ros?tab=readme-ov-file). CAR needs high-speed, reactive data, and a QtC++ front-end is simply not meant for this level of complexity. I made it a lot harder than it needed to be with my lack of due diligence, but the single-threaded GUI event loop in ROS is more harm than help. [^1]: See [the ROS documentation](https://docs.ros.org/en/foxy/How-To-Guides/Using-callback-groups.onhtml) to learn more. The CAR publishes various topic-related data at set rates, so I'm looking to run various groups of mutually exclusive callbacks at a set interval (i.e. `MutuallyExclusive`) [^2]: The simplest implementation did the job: ```cpp ... template class ThreadSafeQueue { public: void push(const T& item) { std::lock_guard lock(mutex_); queue_.push(item); condition_.notify_one(); } bool pop(T& item, std::chrono::milliseconds timeout = std::chrono::milliseconds(0)) { std::unique_lock lock(mutex_); if (timeout.count() > 0) { if (!condition_.wait_for(lock, timeout, [this] { return !queue_.empty(); })) { return false; } } else if (queue_.empty()) { return false; } item = queue_.front(); queue_.pop(); return true; } bool empty() const { std::lock_guard lock(mutex_); return queue_.empty(); } size_t size() const { std::lock_guard lock(mutex_); return queue_.size(); } void clear() { std::lock_guard lock(mutex_); std::queue empty; std::swap(queue_, empty); } ... };