LearnIoTVR: An End-to-End Virtual Reality Environment Providing Authentic Learning Experiences for Internet of Things (CHI 2023🏅)
LearnIoTVR is an end-to-end VR learning environment designed to teach Internet of Things (IoT) concepts through immersive, hands-on experiences. Students place virtual IoT components within realistic environments (e.g., a smart home) and program their behaviors using a custom 3D block-based language directly inside VR. This setup situates learning in authentic contexts while providing immediate feedback on programming outcomes. In our user study, participants demonstrated significant improvements in programming skills and IoT understanding, and they reported a better user experience than with traditional desktop-based environments.
The Problem
Creating an AR tutorial is a complex and time-intensive process that demands both programming skills and domain expertise. This high barrier to entry limits the widespread adoption and scalability of AR tutorials, despite their strong potential for improving real-world training and instruction.
What if we could generate an AR tutorial automatically while operating the device?
Formative Study
We began by identifying the most common physical interfaces found on everyday devices: buttons, knobs, sliders, switches, and touch screens.
Through a questionnaire study, we analyzed how users naturally interact with each type and summarized the most frequent manipulation patterns. These findings guided our design of the operation-recognition algorithm.
Hardware Design
The wearable prototype consists of three main components: a wristband and two finger caps. This minimal setup preserves the natural appearance of the hand, ensuring compatibility with existing hand-tracking algorithms.
The finger caps, equipped with thin-film pressure sensors, are worn on the thumb and index finger. The pressure readings inform the system of the exact moment when the user interacts with the interface. The wristband houses the microprocessor and a haptic feedback module that provides immediate tactile feedback during tutorial playback.
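As a rough illustration of how the pressure readings pinpoint the moment of contact, here is a minimal polling loop. The threshold, debounce window, and helper names are assumptions for illustration, not the actual firmware:

```python
import time

PRESSURE_THRESHOLD = 0.2   # normalized reading treated as "in contact" (assumed value)
DEBOUNCE_S = 0.03          # ignore flickers shorter than 30 ms (assumed value)


def detect_contact(read_sensor, on_contact):
    """Poll a thin-film pressure sensor and report the moment contact begins.

    `read_sensor` returns a normalized 0..1 reading; `on_contact` receives the
    timestamp at which the finger cap first touches the interface.
    """
    contact_since = None
    reported = False
    while True:
        pressed = read_sensor() > PRESSURE_THRESHOLD
        now = time.monotonic()
        if pressed:
            contact_since = contact_since or now
            if not reported and now - contact_since >= DEBOUNCE_S:
                on_contact(contact_since)   # tell the recognizer contact started here
                reported = True
        else:
            contact_since, reported = None, False
        time.sleep(0.005)                   # ~200 Hz polling
```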
Operation Recognition: The Decision Tree
By combining hand-tracking data with pressure sensor input, we developed a decision tree to classify common interaction types. The system first distinguishes gestures based on which fingers are active (thumb, index, or both) and then filters them through pose and transformation layers.
This hierarchy enables InstruMentAR to accurately recognize operations, including button presses, switch toggles, knob rotations, slider movements, and screen touches.
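Here is a minimal sketch of what such a decision tree could look like. The input fields, thresholds, and class names are illustrative assumptions, not the actual recognizer:

```python
from dataclasses import dataclass
from enum import Enum, auto


class Operation(Enum):
    BUTTON_PRESS = auto()
    SWITCH_TOGGLE = auto()
    KNOB_ROTATION = auto()
    SLIDER_MOVE = auto()
    SCREEN_TOUCH = auto()
    UNKNOWN = auto()


@dataclass
class Frame:
    thumb_pressed: bool         # thin-film sensor on the thumb cap
    index_pressed: bool         # thin-film sensor on the index cap
    wrist_rotation_deg: float   # change in hand orientation since contact
    fingertip_travel_mm: float  # translation of the index tip since contact
    pinch: bool                 # thumb and index tips close together


def classify(frame: Frame) -> Operation:
    """Branch first on which fingers are active, then on pose/transformation."""
    if frame.thumb_pressed and frame.index_pressed and frame.pinch:
        # Both caps report contact while pinching: the hand is grasping a small control.
        if abs(frame.wrist_rotation_deg) > 15:
            return Operation.KNOB_ROTATION
        if frame.fingertip_travel_mm > 10:
            return Operation.SLIDER_MOVE
        return Operation.SWITCH_TOGGLE
    if frame.index_pressed and not frame.thumb_pressed:
        # Index-only contact: a press or a touch, disambiguated by fingertip travel.
        if frame.fingertip_travel_mm > 10:
            return Operation.SCREEN_TOUCH
        return Operation.BUTTON_PRESS
    return Operation.UNKNOWN
```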
Authoring Mode
Using InstruMentAR, authoring a step is as simple as demonstrating the operation and pinching to convert your voice into text instructions.
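To make the flow concrete, here is a schematic of a single authoring step. The helper functions (recognize_operation, wait_for_pinch, transcribe_voice) are hypothetical stand-ins for the hand-tracking, gesture, and speech components, not the actual system API:

```python
def author_step(recognize_operation, wait_for_pinch, transcribe_voice):
    """Capture one tutorial step: demonstrate, pinch, dictate."""
    operation = recognize_operation()    # e.g. ("KNOB_ROTATION", target_pose)
    wait_for_pinch()                     # the pinch gesture starts voice capture
    instruction = transcribe_voice()     # spoken guidance converted to a text label
    return {"operation": operation, "instruction": instruction}


def author_tutorial(*helpers, num_steps):
    # Repeating the demonstrate-and-pinch loop yields the full tutorial.
    return [author_step(*helpers) for _ in range(num_steps)]
```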
Accessing Mode
Automatic Forward
Once an operation is performed correctly, the tutorial automatically advances to the next step.
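A rough sketch of that playback loop follows; the helpers and the step format carry over from the authoring sketch above and are assumptions rather than the real implementation:

```python
def play_tutorial(steps, recognize_operation, show_overlay, haptic_pulse):
    """Advance only when the recognized operation matches the authored step."""
    for step in steps:
        show_overlay(step["instruction"])        # AR guidance for the current step
        while True:
            performed = recognize_operation()    # classified from hand + pressure data
            if performed == step["operation"]:
                haptic_pulse()                   # confirm success, then auto-forward
                break
```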
User Study
Since InstruMentAR tracks hand operations, it can issue preemptive warnings to prevent incorrect operations before they happen. The haptic module delivers these warnings as tactile feedback as well.
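A simplified sketch of how such a warning could be triggered, assuming a hypothetical predict_target() that maps the approaching hand pose to the control it is about to touch:

```python
def monitor_step(expected_control, stream_hand_poses, predict_target, haptic_warn):
    """Warn before the wrong control is touched."""
    for pose in stream_hand_poses():          # per-frame hand-tracking data
        target = predict_target(pose)         # which control the hand is approaching
        if target is not None and target != expected_control:
            haptic_warn()                     # buzz before the incorrect operation lands
```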
Personal Thought
Just like many other engineering students, I suffered a lot in the ECE labs learning to operate the oscilloscope. I often found myself lost in messy lab manuals, trying to locate the current step and comparing it with the figures to check if I pressed the right button.
When I began researching in XR, I immediately saw how AR tutorials could transform this experience by overlaying guidance directly on instruments. However, I soon realized that creating AR tutorials was far from simple—it often took days to build even a single session in Unity.
That’s why I designed InstruMentAR: to make AR authoring as natural as performing the task itself. Instructors can simply demonstrate the operation without dealing with complex authoring interfaces. The user study confirmed its effectiveness—the authoring process was more than twice as fast as conventional immersive methods.
Looking ahead, the authoring process could become even easier. With the advancement of LLMs, maybe all we'll need is to feed the system a device manual and simply prompt it to create the desired AR tutorial. Then the creator wouldn't even need to demonstrate the procedure: just spend a few minutes talking, and the AR lab manual is ready.