Sim-to-real robot training with synthetic data is one of the most powerful techniques in embodied AI — and one of the most misunderstood. The gap between what simulation promises and what it delivers in production is real, measurable, and manageable if you understand its structure.
The appeal of sim-to-real synthetic data strategies
Simulation offers something irresistible to robotics ML teams: unlimited training data at near-zero marginal cost. You can generate millions of manipulation demonstrations in a physics simulator overnight. You can randomise object poses, lighting, surface textures, and robot configurations across the full distribution you want your policy to handle. You never have to worry about operator fatigue, hardware downtime, or facility access.
The appeal is real. So is the problem.
What the sim-to-real gap actually is
The sim-to-real gap is the performance degradation that occurs when a policy trained in simulation is deployed on physical hardware. It is not a single failure mode — it is a family of mismatches between the simulated environment and the real world.
Physics fidelity gap: Real-world contact dynamics — friction, deformation, surface compliance — are difficult to simulate accurately. A policy trained on simulated grasping with idealized contact models will encounter different forces and reactions on physical hardware. The more contact-dependent the task, the larger this gap.
Visual gap: Rendered images in simulation do not match real camera images. Lighting, shadows, texture detail, sensor noise, motion blur, and lens distortion all differ. Policies that rely on visual input for state estimation can fail when the visual distribution shifts at deployment.
Action noise gap: Real actuators have backlash, compliance, and control noise that simulators either do not model or model imperfectly. A policy trained on clean simulated joint control will encounter different dynamics on physical hardware.
Latency gap: Real-world control loops have latency from sensor readout, network hops in distributed systems, and actuator response delay. Simulators typically run faster than real time and apply each action the instant it is computed, unless delay is modelled explicitly. Policies that are not trained with realistic control latency learn action timing that is inappropriate for deployment.
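One common way to expose a policy to control latency during sim training is to delay each action by a fixed number of control ticks before it reaches the simulator. The sketch below is a minimal illustration under that assumption; `DelayedActionBuffer`, `delay_steps`, and `initial_action` are hypothetical names, and a real pipeline would likely randomise the delay per episode rather than fix it.

```python
from collections import deque


class DelayedActionBuffer:
    """Applies each action `delay_steps` control ticks after the policy emits it."""

    def __init__(self, delay_steps, initial_action):
        # Pre-fill the queue so the first `delay_steps` ticks replay a safe default.
        self._queue = deque([initial_action] * delay_steps)

    def push(self, action):
        """Queue the newest action and return the one that should execute now."""
        self._queue.append(action)
        return self._queue.popleft()


# Hypothetical usage: a two-tick delay on a scalar action stream.
buffer = DelayedActionBuffer(delay_steps=2, initial_action=0.0)
executed = [buffer.push(a) for a in [1.0, 2.0, 3.0, 4.0]]
```

With `delay_steps=0` the buffer passes actions straight through, so the same wrapper can cover both the idealised and the delayed case.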
What domain randomisation does and does not solve
Domain randomisation — training across wide distributions of simulated visual and physical parameters — is the primary technique for narrowing the sim-to-real gap. By training on a distribution of simulated environments that is broad enough to include the real world, you encourage the policy to learn representations that are robust to the specific parameters that differ at deployment.
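As a rough sketch of what this looks like in practice, each training episode draws its own visual and physical parameters from predefined ranges. The parameter names and ranges below are illustrative assumptions, not values from any particular simulator, and independent uniform sampling is only the simplest possible choice.

```python
import random
from dataclasses import dataclass


@dataclass
class EpisodeParams:
    light_intensity: float  # relative to nominal rendering brightness
    friction: float         # friction coefficient of the object surface
    object_mass_kg: float
    camera_yaw_deg: float   # offset from the nominal camera pose


# Hypothetical ranges; real ones should come from measurements of the target setup.
RANGES = {
    "light_intensity": (0.5, 1.5),
    "friction": (0.4, 1.0),
    "object_mass_kg": (0.05, 0.5),
    "camera_yaw_deg": (-5.0, 5.0),
}


def sample_episode_params(rng):
    """Draw one independent uniform sample per parameter."""
    return EpisodeParams(
        **{name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()}
    )


rng = random.Random(0)  # seeded so runs are reproducible
episodes = [sample_episode_params(rng) for _ in range(1000)]
```

Each `EpisodeParams` instance would then be applied to the simulator before the episode is reset; how that is done depends on the simulator's own API.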
Domain randomisation works well for visual robustness to lighting and texture variation. It works less well for contact dynamics, where the structure of the physics gap is not well-modeled by randomising contact parameters within a single simulation framework.
The deeper problem with domain randomisation is that it can only randomise over parameters that are represented in the simulator. The sim-to-real gap often includes failure modes that are not modeled in simulation at all — specific surface interactions, cable routing effects, thermal compliance changes, or sensor failure modes. No amount of randomisation covers what the simulator does not model.
The role real-world data must play
Real-world data in a sim-first pipeline plays two roles. The first is fine-tuning: after sim training, a relatively small number of real-world demonstrations — often in the hundreds to low thousands — can bridge the remaining gap. The sim-trained policy provides a good initialisation; real-world fine-tuning adjusts the policy for the specific dynamics of the physical hardware and environment.
The second role is distribution anchoring: real-world demonstrations define what “real” looks like, which gives visual domain randomisation a target. Without real-world reference data, domain randomisation is often miscalibrated: it broadens the simulation distribution in directions that do not help while missing the specific visual characteristics of the actual deployment environment.
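One simple way to anchor a randomisation range is to measure a statistic on real data and set the simulated range to cover it with a small margin. The sketch below assumes a per-frame scalar such as mean image brightness; `anchored_range`, the `margin` parameter, and the sample values are hypothetical and only illustrate the idea.

```python
def anchored_range(real_values, margin=0.1):
    """Return (low, high) spanning the real samples, widened by `margin` of their spread."""
    lo, hi = min(real_values), max(real_values)
    pad = margin * (hi - lo)
    return lo - pad, hi + pad


# Hypothetical mean-brightness measurements from real camera frames (0 to 1 scale).
real_brightness = [0.42, 0.48, 0.51, 0.55, 0.60]
sim_low, sim_high = anchored_range(real_brightness, margin=0.1)
```

The resulting bounds can replace a hand-picked range for the matching simulator parameter. Using min and max is sensitive to outliers, so percentiles may be a better choice for larger real datasets.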
A practical sim-plus-real data strategy
The most effective strategies treat simulation and real-world collection as complementary, not competing. A reasonable starting allocation:
- Use simulation for broad coverage of tasks and object configurations, especially the scenarios that are expensive to physically set up at scale
- Use real-world teleop for high-quality seed demonstrations that anchor your fine-tuning distribution and calibrate your domain randomisation targets
- Use sim for initial policy training to the point where the robot can complete the task under ideal conditions
- Use real-world data for fine-tuning, validation, and handling the failure modes that simulation did not capture
The ratio of sim to real data depends heavily on the task. For locomotion on structured terrain, sim-heavy strategies work well. For precise manipulation with contact, expect to lean more heavily on real-world data regardless of how good your simulator is.
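One way to make the sim-to-real ratio an explicit, tunable knob is to fix the share of real-world examples in every training batch. The sketch below assumes both datasets fit in memory as lists; `mixed_batch` and `real_fraction` are hypothetical names, and the 25% share is an arbitrary illustration, not a recommendation.

```python
import random


def mixed_batch(sim_data, real_data, batch_size, real_fraction, rng):
    """Sample a batch with a fixed share of real-world examples (with replacement)."""
    n_real = round(batch_size * real_fraction)
    n_sim = batch_size - n_real
    batch = rng.choices(real_data, k=n_real) + rng.choices(sim_data, k=n_sim)
    rng.shuffle(batch)  # avoid a fixed real-then-sim ordering within the batch
    return batch


# Hypothetical datasets: many simulated episodes, few real demonstrations.
sim = [("sim", i) for i in range(1000)]
real = [("real", i) for i in range(50)]
batch = mixed_batch(sim, real, batch_size=32, real_fraction=0.25, rng=random.Random(0))
```

Because sampling is with replacement, a small real dataset is oversampled relative to its size, which is the intended effect when real demonstrations are scarce.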
If you’re working on sim-to-real and want to scope the real-data component, see how we run sim-to-real programs or send us the task and the transfer gap.





