How We Build This#

1. Why This Is Hard#

Getting a clean cutout from a photo is still genuinely hard. Hair, fuzzy edges, translucent objects, and poor lighting trip up even the best models.

Research models often look great on benchmark images but fall apart on real-world photos. Commercial tools work well but are expensive and still miss fine details.

We try to do something different: combine strong research with solid engineering. We don't just reimplement papers; we run a continuous training loop on a dataset that keeps improving. That loop powers our models:

  • withoutBG Snap: Our first model, now legacy.
  • withoutBG Focus: Our newest open model, with much better edge quality and stability.
  • withoutBG Pro: Our strongest model, available through the API.

Revenue from the API pays for better data and better models, which helps everyone, whether you pay or use the open-source code. It also lets us invest in two big things: making synthetic data look real with GANs, and running inference cheaply on AWS Inferentia.

2. Two Ways to Use It#

We built this so you can choose what matters more to you: privacy and control, or convenience.

Open Source means you stay in control. It runs offline, you decide what it costs to run, and your data never leaves your machine.
Pro API is for when you just want results fast and don't want to manage servers.

Our open models aren't "lite" versions. They are full-strength models that often outperform paid tools.

Here is what we offer:

GitHub: https://github.com/withoutbg/withoutbg
Docker: https://hub.docker.com/r/withoutbg/app

Switching is easy. Use Focus if you run it yourself. Use Pro if you want the best quality through the API. Use Snap only if you're tied to the legacy model.

When to use what:

Scenario                          Local ONNX   Cloud API
Privacy matters                   ✔️
No internet needed                ✔️
Need to process tons of images    ✔️
Don't want to set up servers                   ✔️
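
If you go the local route, inference is a standard ONNX Runtime session. Here is a minimal sketch; the model filename, input resolution, normalization, and output shape are assumptions for illustration, so check the repository README for the actual preprocessing.

```python
# Minimal local-inference sketch using ONNX Runtime.
# The model path, input size, scaling, and output shape below are assumptions.
import numpy as np
import onnxruntime as ort
from PIL import Image

session = ort.InferenceSession("focus.onnx", providers=["CPUExecutionProvider"])

image = Image.open("photo.jpg").convert("RGB")
resized = image.resize((1024, 1024))                 # assumed input size
x = np.asarray(resized, dtype=np.float32) / 255.0    # assumed [0, 1] scaling
x = np.transpose(x, (2, 0, 1))[None, ...]            # HWC -> NCHW

input_name = session.get_inputs()[0].name
alpha = session.run(None, {input_name: x})[0]        # assumed output shape (1, 1, H, W)

# Resize the matte back to the original size and save it as an 8-bit mask.
matte = Image.fromarray((alpha[0, 0] * 255).clip(0, 255).astype(np.uint8))
matte.resize(image.size).save("alpha.png")
```

Swap in another ONNX Runtime execution provider (for example a GPU provider) if you have the hardware for it.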

3. How It Works#

The Focus model is about 320 MB. That size is a deliberate trade-off: large enough to capture fine detail, small enough to run fast.

We rely on synthetic data, but naively pasting a subject onto a new background looks fake, and the model learns to key on those compositing artifacts instead of real edges. So we pass each composite through a GAN that harmonizes lighting and color until it looks like a genuine photo. This makes the model generalize far better to real images.
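
To make the role of harmonization concrete, here is the raw compositing step that the GAN then has to clean up; the arrays and value ranges are illustrative.

```python
# Sketch of the basic compositing step that a harmonization GAN later refines.
# fg and bg are float32 RGB arrays in [0, 1]; alpha is the ground-truth matte.
import numpy as np

def composite(fg: np.ndarray, bg: np.ndarray, alpha: np.ndarray) -> np.ndarray:
    """Standard alpha compositing: C = alpha * F + (1 - alpha) * B."""
    a = alpha[..., None]              # broadcast the matte over the RGB channels
    return a * fg + (1.0 - a) * bg

# A raw composite like this still has mismatched lighting and color between
# foreground and background; that mismatch is what harmonization removes.
```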

We also use 3D rendering: scenes built in Blender with randomized lighting and textures. That gives us far more variety than simply flipping or rotating photos.
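
As a rough illustration of what "randomized lighting" means in practice, here is a stripped-down bpy sketch; the object names, value ranges, and output path are hypothetical, and the real scenes are far more elaborate.

```python
# Simplified domain-randomization sketch for Blender's Python API (run with
# `blender --background --python render_random.py`). Object and light names
# are hypothetical placeholders.
import random
import bpy

scene = bpy.context.scene

# Randomize the key light's strength and tint.
light = bpy.data.objects["KeyLight"].data
light.energy = random.uniform(200.0, 2000.0)
light.color = (random.uniform(0.8, 1.0),
               random.uniform(0.8, 1.0),
               random.uniform(0.8, 1.0))

# Jitter the camera angle a little for extra variety.
camera = bpy.data.objects["Camera"]
camera.rotation_euler[2] += random.uniform(-0.2, 0.2)

# Render one sample of the randomized scene.
scene.render.filepath = "/tmp/render_0001.png"
bpy.ops.render.render(write_still=True)
```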

We also shoot real photos in a studio where the background is precisely known, and we share them so you can see for yourself.

We track everything. More than 100 experiments, varying how the model trains and what data it sees, are all logged. We don't guess; we measure.

We also randomize backgrounds, and to keep composites looking natural we add drop shadows under objects.
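
A drop shadow can be synthesized directly from the ground-truth matte: shift and blur the matte, then darken the background under it. The sketch below shows one common way to do this; the offsets, blur radius, and opacity are illustrative, not our production values.

```python
# Sketch of synthesizing a soft drop shadow from the alpha matte before
# compositing. Offset, blur, and opacity values are illustrative.
import numpy as np
from scipy.ndimage import gaussian_filter, shift

def add_drop_shadow(bg: np.ndarray, alpha: np.ndarray,
                    offset=(15, 10), blur=8.0, opacity=0.4) -> np.ndarray:
    """Darken the background where a shifted, blurred copy of the matte falls."""
    shadow = shift(alpha, offset, order=1, mode="constant", cval=0.0)
    shadow = gaussian_filter(shadow, sigma=blur)       # soften the shadow edge
    darken = 1.0 - opacity * shadow[..., None]         # per-pixel darkening factor
    return bg * darken

# The foreground is then composited on top of the shadowed background.
```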

4. Training#

We prototype on our own machines, then scale training up on AWS. Every run is tracked with Weights & Biases.
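
For reference, a tracked run with Weights & Biases looks roughly like this; the project name, config values, and metric names are placeholders rather than our real setup.

```python
# Sketch of a Weights & Biases tracked run. Project name, config, and metrics
# are placeholders; the training step is a dummy stand-in.
import random
import wandb

def train_one_step() -> float:
    """Placeholder standing in for a real training step."""
    return random.random()

run = wandb.init(project="matting-experiments",
                 config={"lr": 1e-4, "crop_size": 512, "batch_size": 8})

for step in range(1000):
    wandb.log({"train/loss": train_one_step()}, step=step)

run.finish()
```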

We train in stages: the model starts on small, simple images, then moves on to larger, harder ones. This curriculum helps it converge faster.
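
A minimal sketch of that staged schedule is below, using a toy model and random tensors in place of the real network and dataset; the crop sizes and epoch counts are illustrative.

```python
# Sketch of staged (curriculum-style) training: start small and simple, then
# increase resolution. The model, data, and schedule here are toy stand-ins.
import torch
import torch.nn as nn

model = nn.Conv2d(3, 1, kernel_size=3, padding=1)    # toy stand-in for the matting net
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

stages = [(256, 5), (512, 5), (768, 10)]             # (crop size, epochs), illustrative

for crop_size, epochs in stages:
    for epoch in range(epochs):
        # In the real pipeline these come from the dataset at this stage's
        # resolution and difficulty; here they are random tensors.
        images = torch.rand(4, 3, crop_size, crop_size)
        target_alpha = torch.rand(4, 1, crop_size, crop_size)

        pred = torch.sigmoid(model(images))
        loss = nn.functional.l1_loss(pred, target_alpha)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```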

For large images, we train on crops. We sample the crops around edges and other important regions, so we capture the detail that matters without running out of memory.
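
One simple way to bias crops toward edges is to build a band around the matte boundary and sample crop centers from it. The sketch below does that; the band width and crop size are illustrative, and it assumes the image is at least one crop wide and tall.

```python
# Sketch of sampling training crops around the matte's edges, where detail
# matters most. Band width and crop size are illustrative choices.
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def sample_edge_crop(image: np.ndarray, alpha: np.ndarray, crop: int = 512):
    """Return an (image, alpha) crop centered on a random boundary pixel."""
    fg = alpha > 0.5
    edge_band = binary_dilation(fg, iterations=10) & ~binary_erosion(fg, iterations=10)
    ys, xs = np.nonzero(edge_band)

    if len(ys) == 0:                      # no boundary found: fall back to the center
        cy, cx = alpha.shape[0] // 2, alpha.shape[1] // 2
    else:
        i = np.random.randint(len(ys))
        cy, cx = ys[i], xs[i]

    half = crop // 2
    y0 = np.clip(cy - half, 0, alpha.shape[0] - crop)
    x0 = np.clip(cx - half, 0, alpha.shape[1] - crop)
    return image[y0:y0 + crop, x0:x0 + crop], alpha[y0:y0 + crop, x0:x0 + crop]
```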

5. Running It#

For the API, we run on AWS Inferentia, purpose-built inference hardware that keeps our serving costs, and therefore your price, low.
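
Deploying on Inferentia means compiling the model with the AWS Neuron SDK. A minimal sketch of that flow with torch-neuronx (the Inferentia2 path) is below, using a toy model; the actual input shape and compiler settings depend on the deployed model.

```python
# Sketch of compiling a PyTorch model for AWS Inferentia2 with torch-neuronx.
# The model here is a toy stand-in; the real input shape and settings differ.
import torch
import torch.nn as nn
import torch_neuronx

model = nn.Conv2d(3, 1, kernel_size=3, padding=1).eval()   # toy stand-in
example = torch.rand(1, 3, 1024, 1024)                     # assumed input shape

# Trace/compile the model into a Neuron-optimized graph, then save it.
neuron_model = torch_neuronx.trace(model, example)
torch.jit.save(neuron_model, "model_neuron.pt")
```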

For local use, we export the models to ONNX, which runs on almost anything: CPUs, GPUs, you name it. We benchmark the exports to make sure they stay fast.
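
The export itself is a standard torch.onnx.export call plus a structural check. The sketch below uses a toy model; the opset version and dynamic axes are illustrative choices.

```python
# Sketch of exporting a PyTorch model to ONNX and sanity-checking the file.
# The toy model, opset version, and dynamic axes are illustrative.
import torch
import torch.nn as nn
import onnx

model = nn.Conv2d(3, 1, kernel_size=3, padding=1).eval()   # toy stand-in
dummy = torch.rand(1, 3, 1024, 1024)

torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["image"], output_names=["alpha"],
    dynamic_axes={"image": {2: "height", 3: "width"},       # allow variable image sizes
                  "alpha": {2: "height", 3: "width"}},
    opset_version=17,
)

onnx.checker.check_model(onnx.load("model.onnx"))           # structural sanity check
```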

6. Being Open#

We show our work. We publish the hard cases, the failures, and the weird lighting. We compare our models honestly.

Coming soon: withoutBG Zoom for super high detail, and plugins for Blender and Photoshop.

7. Help Us#

Good models come from a loop: train one, find where it fails, fix the data, repeat. Open source makes that loop faster.

Get involved:

We love pull requests, ideas, and new data.