Home / How we collect / Hybrid

Delivery model 03

Hybrid

Dedicated core for the spine of your dataset. Crowdsourced edge for diversity and long-tail. One unified pipeline.

80 / 20
Typical core / edge split
14 days
SOW to first batch
1
SOW. 1 dataset. 1 dashboard.
DEDICATED CORE • 80% Core task operators (on-prem) Consistent rigs + calibration IP-controlled pipeline High-complexity tasks SPINE OF YOUR DATASET CROWDSOURCE EDGE • 20% Scene diversity (12 countries) Long-tail environments Elastic scale-up Simpler task variants DIVERSITY + SCALE 1 SOW • 1 DATASET • 1 DASHBOARD Core data 80% Edge data 20% Combined BEST OF BOTH IP control from core + diversity from crowd 80/20Typical core/edge splitAdjustable per program 14dSOW to first batchBoth streams producing 1Unified pipelineOne SOW, one dashboard

What is hybrid collection?

Hybrid combines a dedicated core team with crowdsource edge collection under a single SOW. Your core handles high-complexity, IP-sensitive tasks on controlled rigs. The crowd fills in scene diversity, geographic spread, and long-tail variants from real homes and workspaces across 12 countries.

When to choose hybrid

  • Generalization + quality — core data sets the bar, crowd data widens the distribution
  • Phased programs — start dedicated, add crowd as you need diversity
  • Mixed task complexity — hard tasks go to core, simpler variants go to crowd
  • Budget flexibility — predictable core cost + elastic crowd spend

What you get

  • One SOW — single contract, single point of contact
  • One dataset — unified format, merged QA pipeline
  • One dashboard — core and crowd metrics side by side

Best for

Teams that need lab-quality core data and real-world diversity in the same training set — without managing two vendors.

80/20 typical core/edge split.

14 days SOW to first batch.

1 pipeline unified delivery.

Built for

When neither model alone is enough

Where production teams land after trying pure dedicated or pure crowdsource and finding both lacking.

Production foundation models

A high-quality spine for the bulk of training, plus diversity capture for robustness and long-tail generalization.

Mid-volume training pipelines

Where pure crowdsource is too noisy for the spine and pure dedicated is too slow to scale the edge.

Model release cycles

Dedicated for held-out benchmarks. Crowdsourced for fine-tune sprints between releases.

How it works

Two layers, one program

Parallel ramp on both sides. Same SOW, same SLAs, same accountability.

0Week 0

Scope both layers

Define what’s spine (high-quality, controlled) and what’s edge (diversity, scale). One SOW covers both.

1Week 1

Spine starts

Dedicated pod begins recruitment, training, and pilot. Default 80% of unit volume.

2Week 2

Edge spins up

Crowdsource network onboards for diversity, geographic spread, and long-tail capture.

4Week 4

Unified operation

Both layers operating. Weekly merge into your dataset. Unified QA and dashboard across both.

Structure

What hybrid actually looks like

A dedicated spine and a crowdsourced edge, both reporting into one dataset and one dashboard.

SPINE 80%
EDGE 20%

Default ratio. Tunable per program.

THE SPINE / DEDICATED

80% of unit volume

THE EDGE / CROWDSOURCE

20% of unit volume

Quality across both layers

How we keep spine and edge comparable

A hybrid dataset is only as good as the layer joins. These four pillars hold the seam tight.

PILLAR 01

Unified performance bar

Both layers tested against the same acceptance criteria. The spine sets the bar. The edge is measured against it.

PILLAR 02

Cross-layer calibration

Dedicated operators occasionally take crowd tasks. Crowd seniors take spine tasks. Drift detection runs both directions.

PILLAR 03

Provenance tagging

Every unit tagged with layer, operator tier, geography, and hardware. Filter your dataset by source at training time.

PILLAR 04

One throughput dashboard

A single view across both layers. Same accuracy metrics, same throughput counts, same SLAs.

Compare delivery models

When to choose hybrid

Pick hybrid when you need a dedicated spine for quality and a crowdsourced edge for diversity.

CriteriaDedicatedCrowdsourceHybridRECOMMENDED FOR PRODUCTION
IP controlFull exclusivity. Operators NDA’d.Per-batch NDA.Mixed — dedicated core, crowdsourced edge.
Operator qualityTrained on your spec to a bar you set.Tier-qualified. Variable. Aggregated.Best for spine + scale for diversity.
Ramp time4 weeks SOW to production.3 days spec to first units.2–3 weeks. Dedicated first, crowd added.
HardwareYour rig in our studio, or our matching rig.Operator-provided or our standard rigs.Mix of both per task type.
Best forSurgical, AV, defense, industrial.Long-tail, scene diversity, geo spread.Most production foundation models.
PricingPer FTE-month.Per task or per hour.Custom SOW.
What our partners say
We tried pure dedicated for eighteen months and pure crowd for six. Hybrid is the only one that produced a dataset robust enough to ship the model.
Anika Reddy
Director of AI Training, Locus AI

FAQ

Questions about hybrid delivery programs

When you need a consistent high-quality core dataset plus volume diversity on top of it. A common pattern: dedicated team captures the canonical task demonstrations, crowdsource generates variation around them.
The dedicated team establishes the quality baseline. Crowdsourced submissions are validated against that baseline — any submission outside the defined envelope is rejected and re-queued.
The dedicated component is priced per FTE-month; the crowdsourced component per validated submission. You get a blended rate in the SOW that reflects the mix you need.

Further reading

From the blog

How to Scale Teleop Data Collection Without Losing Quality

The hybrid model for burst scale without sacrificing core quality.

From the blog

Build vs. Buy: The Real Cost

When hybrid outsourcing beats both extremes.

Build your hybrid program

Tell us your spine-vs-edge ratio. We’ll scope both layers in a single SOW. Fourteen days to first batch.