Why we run AR and AI on the device, not the cloud

Round-trips kill real-time experiences and pile up cloud bills. Here is how Nosmai keeps frames on the phone, and why that is better for latency, privacy and your margins.

Maya Chen
Co-founder & CTO
Jun 4, 2026
7 min read

Every camera-first feature lives or dies by latency. The moment you send a frame to a server, have it classified, and wait for the response, you have already lost the real-time feel, and you have signed up for a bill that scales with every user you add.

The round-trip tax

A cloud call from a mobile device rarely completes in under 200ms once you account for the network, queueing and inference. A 60fps camera leaves a budget of about 16.7ms per frame (1000ms / 60), so a single round-trip spans roughly a dozen frames. There is simply no room for a server in the loop.

  • Latency that breaks the live feel of filters and effects
  • Per-call costs that grow linearly with usage
  • A pipeline that can fall over the moment the network does
  • User content leaving the device, a compliance surface you now own
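
The frame-budget arithmetic behind this section can be sketched in a few lines. The 200ms and 60fps figures come from the text above; the function names are hypothetical, chosen only for illustration.

```python
import math


def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def frames_elapsed(latency_ms: float, fps: float) -> int:
    """Whole frame intervals that pass while one operation is in flight."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return math.floor(latency_ms * fps / 1000.0)


budget = frame_budget_ms(60)            # about 16.67 ms per frame
cloud_frames = frames_elapsed(200, 60)  # a 200ms round-trip spans 12 frames
local_frames = frames_elapsed(8, 60)    # 8ms of work fits inside one frame (0)
```

Under these assumptions, a result that arrives after a 200ms round-trip describes a frame the user stopped seeing twelve frames ago.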

On-device by default

Nosmai Effects and Moderation run entirely on the device GPU. Frames never leave the phone, so there is nothing to upload, nothing to store, and nothing to bill per call. You get sub-8ms processing and a feature that keeps working with no network at all, even on a plane.

Once you stop shipping pixels to a server, latency and privacy stop being trade-offs. They become defaults. — Maya Chen

When the edge makes sense

Some workloads genuinely need a server. For those, keep the footprint thin: a secure proxy that holds your keys and handles routing, caching and failover, without your secrets ever shipping in the binary. The rule of thumb: keep perception on-device, and only leave the device when you truly have to.
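
As a rough illustration of such a thin proxy, here is a minimal sketch, not Nosmai's implementation. It assumes upstream services can be modeled as plain callables. The `ThinProxy` class, the `handle` method and the example upstreams are all hypothetical names. The key lives only in the server-side object, responses are cached for a short TTL, and a failed upstream falls through to the next one.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical upstream signature: (api_key, request) -> response
Upstream = Callable[[str, str], str]


class ThinProxy:
    """Holds the secret server-side; adds caching and failover."""

    def __init__(self, api_key: str, upstreams: List[Upstream],
                 ttl_s: float = 60.0) -> None:
        self._api_key = api_key          # never shipped in the app binary
        self._upstreams = upstreams      # tried in order
        self._ttl_s = ttl_s
        self._cache: Dict[str, Tuple[float, str]] = {}

    def handle(self, request: str) -> str:
        now = time.monotonic()
        hit = self._cache.get(request)
        if hit is not None and now - hit[0] < self._ttl_s:
            return hit[1]                # fresh cached response
        last_error: Optional[Exception] = None
        for upstream in self._upstreams:
            try:
                response = upstream(self._api_key, request)
            except Exception as exc:     # failover: try the next upstream
                last_error = exc
                continue
            self._cache[request] = (now, response)
            return response
        raise RuntimeError("all upstreams failed") from last_error


# Illustrative upstreams: the primary is down, the secondary answers.
calls: List[str] = []


def primary(api_key: str, request: str) -> str:
    raise ConnectionError("primary unavailable")


def secondary(api_key: str, request: str) -> str:
    calls.append(request)
    return f"ok:{request}"


proxy = ThinProxy("server-side-secret", [primary, secondary])
first = proxy.handle("classify:abc")   # served by the secondary upstream
second = proxy.handle("classify:abc")  # served from the cache
```

The client only ever talks to the proxy, so rotating the key or swapping an upstream does not require shipping a new app build.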

That single decision, on-device first, edge only when required, is what lets a small team ship camera AI that feels instant and stays affordable as it grows.
