Machine learning, in your browser

Models that run on your laptop, not ours.

Three demos. An image classifier, an object detector, and a sentiment analyser. Each one downloads its model the first time you press a button, then runs every inference locally in WebAssembly or WebGPU. No data leaves your device.

Privacy by construction. Models are quantised ONNX, served from a public CDN. The first use of each model downloads roughly 65 to 170 megabytes, which your browser then caches. Subsequent runs take from a few hundred milliseconds to a few seconds, depending on the model.

Open the demos

Choose a model.

Three tabs, three small models, three jobs. Pick one and follow the inline instructions.

Image classification

Xenova/vit-base-patch16-224 · about 85 MB · ImageNet 1000 classes

First use of this tab will download about 85 MB to your browser cache. After that, classification takes well under a second per image.

Upload an image to classify. JPG, PNG, or WEBP. Maximum 8 MB.
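The upload limits above could be checked before any inference runs. This is a minimal sketch, not the page's actual code: the function name `validateUpload`, the error strings, and the reading of "8 MB" as 8 × 1024 × 1024 bytes are all assumptions. The MIME types are the standard ones for JPG, PNG, and WEBP.

```typescript
// Standard MIME types for the three accepted formats.
const ALLOWED_TYPES: string[] = ["image/jpeg", "image/png", "image/webp"];

// Assumption: "8 MB" is read as 8 MiB. The real page may use 8,000,000 bytes.
const MAX_BYTES = 8 * 1024 * 1024;

// Hypothetical helper. Accepts anything shaped like a browser File
// (only `type` and `size` are read). Returns null when the file is
// acceptable, or a short message explaining why it is not.
function validateUpload(file: { type: string; size: number }): string | null {
  if (ALLOWED_TYPES.indexOf(file.type) === -1) {
    return "Unsupported format. Use JPG, PNG, or WEBP.";
  }
  if (file.size > MAX_BYTES) {
    return "File is larger than 8 MB.";
  }
  return null;
}
```

Running the check on the client keeps a rejected file from ever being decoded or passed to the model.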

Try a sample:

Your image preview will appear here.

Ready. Pick an image to begin.

Object detection

Xenova/detr-resnet-50 · about 167 MB · 91 COCO classes

First use of this tab will download a 167 MB model to your browser cache. The download takes a minute or two on a typical home connection. After that, detection takes 1 to 3 seconds.

Upload an image to detect objects in. JPG, PNG, or WEBP. Maximum 8 MB.

Try a sample:

Your image preview will appear here.

Ready. Pick an image to begin.

Sentiment analysis

Xenova/distilbert-base-uncased-finetuned-sst-2-english · about 67 MB · positive or negative

First use of this tab will download a 67 MB model to your browser cache. After that, sentences are scored in well under a second each.
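A two-class model like this one typically ends in two raw scores (logits) that a softmax turns into a label and a confidence. The sketch below illustrates that last step only. The function name `scoreSentiment` is hypothetical, and the logit order `[negative, positive]` is an assumption about this model's label mapping, so check the model's config before relying on it.

```typescript
type Sentiment = { label: "NEGATIVE" | "POSITIVE"; score: number };

// Hypothetical helper. Assumption: logits arrive as [negative, positive].
function scoreSentiment(logits: [number, number]): Sentiment {
  // Subtract the larger logit first so Math.exp never overflows.
  const max = Math.max(logits[0], logits[1]);
  const expNeg = Math.exp(logits[0] - max);
  const expPos = Math.exp(logits[1] - max);

  // Softmax probability of the positive class.
  const pPos = expPos / (expNeg + expPos);

  // Report the winning class with its probability. An exact tie goes to POSITIVE.
  return pPos >= 0.5
    ? { label: "POSITIVE", score: pPos }
    : { label: "NEGATIVE", score: 1 - pPos };
}
```

With only two classes the reported score is always at least 0.5, because it is the probability of whichever class won.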

Enter a sentence or paragraph

Ready. Press Analyse sentiment to score the text.

How this works.

The model

Each tab uses a quantised ONNX model from the Xenova catalogue. ONNX is an open standard format for exchanging neural networks between frameworks and runtimes. Quantisation stores the weights as 8-bit integers instead of 32-bit floats, which makes the models small enough to ship over a CDN and fast enough to run on a laptop.
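To make the size saving concrete, here is a minimal sketch of symmetric 8-bit quantisation with one scale per tensor. It illustrates the general idea only: the scheme actually used to produce these ONNX files may differ (for example, per-channel scales or unsigned integers with a zero point), and the names `quantise` and `dequantise` are hypothetical.

```typescript
interface QuantisedTensor {
  values: Int8Array; // one byte per weight instead of four
  scale: number;     // multiply an integer by this to recover an approximate float
}

// Map floats onto the integer range [-127, 127] using a single scale factor.
function quantise(weights: Float32Array): QuantisedTensor {
  let maxAbs = 0;
  for (let i = 0; i < weights.length; i++) {
    maxAbs = Math.max(maxAbs, Math.abs(weights[i]));
  }
  // An all-zero tensor gets scale 1 to avoid dividing by zero.
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;

  const values = new Int8Array(weights.length);
  for (let i = 0; i < weights.length; i++) {
    const q = Math.round(weights[i] / scale);
    values[i] = Math.max(-127, Math.min(127, q));
  }
  return { values, scale };
}

// Recover approximate floats. Each one is within half a scale step of the original.
function dequantise(t: QuantisedTensor): Float32Array {
  const out = new Float32Array(t.values.length);
  for (let i = 0; i < t.values.length; i++) {
    out[i] = t.values[i] * t.scale;
  }
  return out;
}
```

The quantised array occupies a quarter of the bytes of the float array. The price is a rounding error of at most half a scale step per weight.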

The runtime

The transformers.js library runs the model in WebAssembly by default, or WebGPU if your browser supports it. The first inference of a session is slower because it warms up the runtime. Subsequent inferences are roughly an order of magnitude faster.
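The choice between the two backends comes down to feature detection. The sketch below shows one way to express it, not how transformers.js decides internally. The name `pickBackend` is hypothetical, and `nav` stands in for the browser's `navigator` object so the function can also be tried outside a browser.

```typescript
type Backend = "webgpu" | "wasm";

// Hypothetical helper. Browsers that implement WebGPU expose `navigator.gpu`;
// everywhere else the property is absent, so we fall back to WebAssembly.
function pickBackend(nav: { gpu?: unknown }): Backend {
  return nav.gpu !== undefined && nav.gpu !== null ? "webgpu" : "wasm";
}
```

The presence of `navigator.gpu` is only a first check. A browser can expose it and still return no adapter from `requestAdapter()`, so a real page would keep the WebAssembly path available as a fallback.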

The privacy story

Your image or text never leaves your machine. The model weights travel one way, from the public CDN to your browser cache. After the first use of a tab, the model can run offline. No telemetry, no analytics, no server logs of your input.
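The offline behaviour follows from a cache-first lookup: the network is contacted only when the cache has no copy of the weights. The sketch below shows that pattern under stated assumptions. `CacheLike` is a hypothetical interface that mirrors the `match` and `put` methods of the browser Cache API, and `loadCacheFirst` and `memoryCache` are illustrative names, not functions from transformers.js.

```typescript
// Minimal shape shared by a browser cache and the in-memory stand-in below.
interface CacheLike<T> {
  match(key: string): Promise<T | undefined>;
  put(key: string, value: T): Promise<void>;
}

// Cache-first lookup: call `download` only when the cache has no entry.
async function loadCacheFirst<T>(
  url: string,
  cache: CacheLike<T>,
  download: (url: string) => Promise<T>,
): Promise<{ value: T; fromNetwork: boolean }> {
  const hit = await cache.match(url);
  if (hit !== undefined) {
    return { value: hit, fromNetwork: false };
  }
  const value = await download(url);
  await cache.put(url, value);
  return { value, fromNetwork: true };
}

// In-memory stand-in for a browser cache, for trying the pattern outside a browser.
function memoryCache<T>(): CacheLike<T> {
  const store = new Map<string, T>();
  return {
    match: async (key: string) => store.get(key),
    put: async (key: string, value: T) => {
      store.set(key, value);
    },
  };
}
```

Only the weights ever go through `download`. The image or text you supply is not part of this path, which is why inference keeps working once the cache is populated.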
