About us

We build the data supply chain that physical AI trains on.

StreamGenie is the creator marketplace. Diffraction, Inc. is the company behind it — and what we're really building is the certified, instrumented data pipeline that the next generation of AI needs most.


The real-world data gap

Language models learned from the internet's text. Image models learned from its photos. But a robot that has to pick up a mug, fold a towel, or cross a kitchen can't learn any of that from the web — and neither can a model that needs to understand a livestream. They need consented, real-world video of people doing real things, captured with depth, motion and provenance. That footage doesn't exist on the internet, and it can't be faked in simulation without a crippling reality gap.

StreamGenie closes that gap from two directions at once: a marketplace that pays real people for real footage, and an in-house facility that captures measured, certified data ourselves.

The Co-founders (Thom & Joseph)

The creator network

StreamGenie — consented footage at scale

Every campaign on StreamGenie is backed by a real buyer budget. Creators record tasks, license their streams, or clip campaigns — and get paid a rate stated upfront. No jargon. No exploitative "exposure" economy.

The creator network is one half of our supply chain: consented, licensed footage from real people doing real things. That provenance is the thing scrapers structurally cannot replicate.

  • Opt-in licensing — you decide what's included
  • Revocable — remove footage from future use at any time
  • Transparent — a plain-language "what this trains" on every campaign
  • Direct pay — within 3–5 business days, no intermediary
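As an illustration of how the opt-in and revocable terms above could be checked in software, here is a minimal sketch. The `License` class, its field names, and the dates are hypothetical, not StreamGenie's actual data model. The key point is that revocation blocks future use only.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class License:
    """Hypothetical record of a creator's licensing choice for one piece of footage."""
    opted_in: bool
    revoked_at: Optional[datetime] = None  # None means the license has not been revoked

    def allows_use(self, at: datetime) -> bool:
        """Return True if the footage may be used at time `at`."""
        if not self.opted_in:
            return False  # never included without an explicit opt-in
        # Revocation only affects uses at or after the revocation time.
        return self.revoked_at is None or at < self.revoked_at


# Example: a creator opts in, then revokes on 1 June 2025 (made-up dates).
lic = License(opted_in=True)
lic.revoked_at = datetime(2025, 6, 1, tzinfo=timezone.utc)

before = lic.allows_use(datetime(2025, 5, 1, tzinfo=timezone.utc))
after = lic.allows_use(datetime(2025, 7, 1, tzinfo=timezone.utc))
never = License(opted_in=False).allows_use(datetime(2025, 5, 1, tzinfo=timezone.utc))
```

In this sketch `before` is true, while `after` and `never` are false.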
$32/hr: example rate for kitchen manipulation recordings

Rates are set by the buyer and shown before you commit. No surprises, no unpublished tiers.


What we stand on
  • 150,000 hours of licensed real-world video, and growing
  • 500+ creators in our TikTok LIVE US network
  • 2 supply sources: the marketplace and our in-house POD
  • 4 verticals: AI training, clipping, UGC, live highlights
  • 100% creator ownership: opt-in, time-boxed, revocable licensing
  • 3–5 days typical review-to-payout, via PayPal or bank, never crypto

Real platform figures, not invented ones. Project-level proof lives in the case studies.


The thesis

Ego + Exo: understanding more than just the task

A phone video captures a task. A calibrated, multi-view capture environment understands one. That difference — between estimated signals and measured ones — is the moat we're building.

Egocentric capture

The operator wears an ego device — iPhone Pro with ARKit, or Project Aria glasses. This gives us gravity-aligned, metric-scale trajectories: a measured pose that every other signal anchors to.

Exocentric cameras

Static exo cameras surround the workspace. Their extrinsics are solved into the ego frame via joint structure-from-motion. Four viewpoints, one calibrated world: depth, occlusion, and spatial context that a single camera can never provide.
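To make the ego-frame idea concrete, here is a minimal sketch of what known extrinsics allow. Given a camera's rotation R and translation t relative to the ego frame, a point seen by that camera maps into the shared frame as p_ego = R · p_cam + t. The function names and numeric values are hypothetical, and the structure-from-motion step that would estimate R and t is not shown.

```python
import math


def rotation_z(theta):
    """3x3 rotation by theta radians about the z-axis (taken here as vertical)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]


def cam_to_ego(point_cam, R, t):
    """Apply the rigid transform p_ego = R * p_cam + t to one 3D point."""
    return [sum(R[i][j] * point_cam[j] for j in range(3)) + t[i]
            for i in range(3)]


# Hypothetical extrinsics for one exo camera: rotated 90 degrees about the
# vertical axis and offset from the ego origin (units: metres).
R_exo = rotation_z(math.pi / 2)
t_exo = [1.0, 0.0, 0.5]

# A point 2 m along the exo camera's x-axis, expressed in the ego frame.
p_cam = [2.0, 0.0, 0.0]
p_ego = cam_to_ego(p_cam, R_exo, t_exo)
```

Once every camera's points are mapped this way, observations from all viewpoints live in one metric coordinate system.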

Measured vs estimated

Most data in the wild is estimated: labels applied after the fact, scale ambiguous, spatially inconsistent. Our instrumented captures produce measured channels — ARKit poses, metric depth, calibrated intrinsics — that training runs can rely on rather than compensate for.

Certified supply chain

Every asset has a provenance certificate: who captured it, on what device, under what consent terms, through what pipeline version. The certificate is a signed manifest that travels with the data to the buyer: a cryptographically verifiable record, not a spreadsheet of claims.
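A rough sketch of how a signed manifest of this kind can be built and checked follows. It is not a description of our certification layer. The field names and values are hypothetical, and HMAC-SHA256 with a shared key stands in for the signature purely to keep the example within the Python standard library. A real deployment would more likely use an asymmetric scheme, so that buyers can verify without holding the signing key.

```python
import hashlib
import hmac
import json


def canonical_bytes(manifest):
    """Serialize deterministically so the same manifest always signs the same way."""
    return json.dumps(manifest, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_manifest(manifest, key):
    """Return a hex HMAC-SHA256 tag over the canonical manifest (stand-in signature)."""
    return hmac.new(key, canonical_bytes(manifest), hashlib.sha256).hexdigest()


def verify_manifest(manifest, signature, key):
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(sign_manifest(manifest, key), signature)


# Hypothetical manifest: every field name and value is illustrative only.
asset_bytes = b"example video bytes"
manifest = {
    "asset_sha256": hashlib.sha256(asset_bytes).hexdigest(),  # binds manifest to the file
    "captured_by": "operator-001",
    "device": "iPhone Pro (ARKit)",
    "consent_terms": "opt-in, revocable",
    "pipeline_version": "0.0.0-example",
}
key = b"demo-key-not-for-production"
signature = sign_manifest(manifest, key)

# Any edit to the manifest after signing invalidates the signature.
tampered = dict(manifest, device="unknown")
ok = verify_manifest(manifest, signature, key)
ok_tampered = verify_manifest(tampered, signature, key)
```

Here `ok` is true and `ok_tampered` is false. Hashing the asset into the manifest is what ties the certificate to one specific file.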


The company

Diffraction, Inc.

We are Diffraction, Inc. — a small team building the infrastructure layer between the people who create content and the AI systems that learn from it.

The name comes from wave optics: diffraction is what happens when light passes through a small aperture and spreads into structure. We think about data the same way — raw footage is the input; structured, calibrated, certified signal is what comes out the other side.

StreamGenie is our creator-facing surface. The POD capture environments, the video-processing pipeline, and the data certification layer are the infrastructure underneath. Together they form a supply chain that produces the kind of data physical-AI and frontier-model teams actually need — not scraped, not estimated, not unverifiable.

Learn more at diffraction.agency
"Everyone else sells found data. We sell manufactured, certified, instrumented data."
— Internal product principle

Explore our work

Start earning with us

Browse live campaigns, check the rate before committing, and get paid for your time.