
Human-Robot Collaboration in Construction


The construction industry, valued at $13 trillion in 2021 and projected to exceed $23 trillion by 2026 at a compound annual growth rate of 9.8%, is a vital component of the global economy. However, the industry faces escalating costs, labor shortages, and low productivity: 98% of projects encounter cost overruns and 77% experience schedule delays. Modular construction is gaining popularity because modules are easy to install and relocate and cost less than traditional builds. Manual installation of modules remains labor- and time-intensive, yet robotic automation has its own limitation: robots need extensive training before they can handle varied tasks, which can delay construction. More convenient, time-effective, and cost-efficient robotic installation techniques are therefore needed to fully realize the promise of modular construction.


Human &

Solution & Techniques

  • 1. Human Intention and Instruction:

Worker Visual Focus of Attention Tracking: Helps robots identify what a worker is looking at so they can provide immediate assistance.

Worker Body Movement Detection: Helps robots learn workers' movements and respond accordingly.

LLM-based Instruction Training: Uses Language-to-Logic Mapping (LLM) to translate worker instructions into robot actions, which substantially reduces robot training time.

  • 2. Robot Control:

Interlocking Block Assembly: The robot receives the worker's instructions and inferred intentions, then carries out the corresponding assembly work.
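
As a rough illustration of the visual focus of attention idea, the sketch below picks the candidate object that lies closest to a worker's gaze ray. It is a purely geometric simplification and not the project's detection model. The function name `focus_target`, the 15-degree threshold, and the block names and coordinates are all assumptions made for this example.

```python
import math

def focus_target(eye, gaze_dir, targets, max_angle_deg=15.0):
    """Return the name of the target nearest to the gaze ray, or None.

    eye, gaze_dir: 3D tuples (position and direction of the gaze).
    targets: dict mapping a hypothetical object name to its 3D position.
    max_angle_deg: assumed threshold; targets farther off-axis are ignored.
    """
    g_norm = math.sqrt(sum(c * c for c in gaze_dir))
    if g_norm == 0:
        raise ValueError("gaze direction must be non-zero")

    best_name, best_angle = None, max_angle_deg
    for name, pos in targets.items():
        # Vector from the eye to the candidate target.
        v = tuple(p - e for p, e in zip(pos, eye))
        v_norm = math.sqrt(sum(c * c for c in v))
        if v_norm == 0:
            continue  # target coincides with the eye position; skip it
        cos_a = sum(g * c for g, c in zip(gaze_dir, v)) / (g_norm * v_norm)
        cos_a = max(-1.0, min(1.0, cos_a))  # guard against rounding error
        angle = math.degrees(math.acos(cos_a))
        if angle <= best_angle:
            best_name, best_angle = name, angle
    return best_name

# Hypothetical scene: two blocks, worker looking along the +x axis.
blocks = {"block_a": (2.0, 0.1, 1.6), "block_b": (0.0, 2.0, 1.6)}
print(focus_target((0.0, 0.0, 1.6), (1.0, 0.0, 0.0), blocks))
```

A robot controller could use the returned name to decide which block to fetch or hold next. A learned model would replace this geometric rule in practice.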

The Proposed Worker Focus of Intention Detection Model


The Workflow of Human Intention Prediction in Human-Robot Collaborative Assembly Tasks


The LLM-based Robot Task Performing Route



Demonstration of Predicted Body Movements Over Six Prediction Steps


Blue lines indicate ground truth poses, and red lines indicate predicted poses.
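
One common way to quantify the gap between ground-truth and predicted poses at each prediction step is the mean per-joint position error. The sketch below assumes each pose is a list of 3D joint coordinates. The page does not state which metric the project uses, so the metric choice, the function names, and the sample data are assumptions.

```python
import math

def mean_joint_error(gt_pose, pred_pose):
    """Average Euclidean distance between matching joints of two poses."""
    if len(gt_pose) != len(pred_pose) or not gt_pose:
        raise ValueError("poses must be non-empty and have the same joint count")
    return sum(math.dist(g, p) for g, p in zip(gt_pose, pred_pose)) / len(gt_pose)

def per_step_errors(gt_seq, pred_seq):
    """Return one error value per prediction step."""
    if len(gt_seq) != len(pred_seq):
        raise ValueError("sequences must cover the same number of steps")
    return [mean_joint_error(g, p) for g, p in zip(gt_seq, pred_seq)]

# Hypothetical two-joint skeleton over two prediction steps.
ground_truth = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]]
predicted = [[(0.0, 0.0, 0.0), (1.0, 0.0, 1.0)],
             [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]]
print(per_step_errors(ground_truth, predicted))
```

Plotting these values against the step index shows how quickly prediction accuracy degrades as the horizon grows.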

Assembly Task Performing Based on Human Instructions



  • Augmented Human Intentions:

By detecting and predicting workers' body language and intentions, better instructions can be generated for robot control.

  • Streamlined Robot Training:

Through Language-to-Logic Mapping (LLM), robots interpret worker directives directly. Because worker instructions are translated into robot actions immediately, the need for lengthy training is greatly reduced, which allows faster robot deployment and higher operational efficiency.
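
To make the instruction-to-action step more concrete, the sketch below validates a structured plan of the kind a language model might be prompted to return, then converts it into a list of robot actions. The model call itself is omitted. The JSON schema, the `pick` and `place` action names, and the `parse_plan` function are hypothetical and are not taken from the project.

```python
import json

# Hypothetical action vocabulary the robot controller is assumed to accept.
ALLOWED_ACTIONS = {"pick", "place"}

def parse_plan(model_output):
    """Turn a JSON plan string into a validated list of (action, block) tuples."""
    steps = json.loads(model_output)
    if not isinstance(steps, list):
        raise ValueError("plan must be a JSON list of steps")

    plan = []
    for index, step in enumerate(steps):
        if not isinstance(step, dict):
            raise ValueError(f"step {index} is not an object")
        action = step.get("action")
        block = step.get("block")
        # Reject anything outside the known vocabulary before it reaches the robot.
        if action not in ALLOWED_ACTIONS:
            raise ValueError(f"step {index} has unknown action: {action!r}")
        if not isinstance(block, str) or not block:
            raise ValueError(f"step {index} has no block name")
        plan.append((action, block))
    return plan

# Example output a model might produce for "put block A in place".
sample_output = '[{"action": "pick", "block": "A"}, {"action": "place", "block": "A"}]'
print(parse_plan(sample_output))
```

Validating the plan against a fixed vocabulary keeps malformed or unexpected model output from being executed as robot motion.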
