# Matryoshka Dimensions
## What are Matryoshka embeddings?
dish-embed is trained at 384 dimensions but can be served at any lower dimension (128 or 256) without retraining. The model is trained so that the first N dimensions of the full vector carry the most important information, like a Russian nesting doll (Matryoshka) where each smaller layer is self-contained.
This means you can choose your dimension at query time based on your requirements.
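Choosing a dimension at query time amounts to truncation. As a minimal sketch (using NumPy, with a random stand-in for a real /embed vector): keep the first N components, then re-normalize so cosine similarity still behaves correctly.

```python
import numpy as np

def truncate(vec, dim):
    """Keep the first `dim` components and re-normalize (for cosine similarity)."""
    head = np.asarray(vec, dtype=float)[:dim]
    return head / np.linalg.norm(head)

# Stand-in for a full 384-d embedding from /embed (random, for illustration only).
full = np.random.default_rng(0).normal(size=384)

v128 = truncate(full, 128)
print(v128.shape, round(float(np.linalg.norm(v128)), 3))  # (128,) 1.0
```

The re-normalization step matters: after slicing, the prefix is no longer unit-length, so skipping it would skew cosine scores.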
## Choosing a dimension
| Dimension | Storage per item | Use case |
|---|---|---|
| 128 | 0.5 KB | Cost-sensitive dedup, large catalogs, fast retrieval |
| 256 | 1.0 KB | Balanced quality and cost |
| 384 | 1.5 KB | Highest quality, fine-grained distinction |
### 128 dimensions
Good enough for most dedup and search tasks. Catches obvious duplicates ("Chiken Biryani" vs "Chicken Biryani") and handles broad search queries well. Use this when you have millions of items and storage or latency matters.
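A toy illustration of dedup at 128d, assuming you already have 128-d vectors from /embed (synthetic vectors and a hypothetical 0.92 threshold stand in here; tune the threshold on your own data):

```python
import numpy as np

THRESHOLD = 0.92  # assumption for illustration; not an official recommendation

def dedup_pairs(vectors, threshold=THRESHOLD):
    """Return index pairs whose cosine similarity meets the threshold."""
    v = np.asarray(vectors, dtype=float)
    v /= np.linalg.norm(v, axis=1, keepdims=True)  # unit vectors
    sims = v @ v.T                                 # pairwise cosine similarities
    n = len(v)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if sims[i, j] >= threshold]

# Synthetic stand-ins: a base vector, a near-duplicate, and an unrelated item.
base = np.random.default_rng(1).normal(size=128)
near_dup = base + 0.05 * np.random.default_rng(2).normal(size=128)
distinct = np.random.default_rng(3).normal(size=128)

print(dedup_pairs([base, near_dup, distinct]))  # → [(0, 1)]
```

The all-pairs matrix is fine for small batches; at catalog scale you would use your vector database's nearest-neighbor search instead.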
### 256 dimensions
Middle ground. Slightly better than 128d at distinguishing similar items, with no meaningful cost increase for most applications.
### 384 dimensions
Best accuracy. Use this when precision matters, for example distinguishing "Latte" from "Mocha" or "Butter Chicken" from "Chicken Butter Masala". All dish-embed endpoints default to 384d internally.
## Storage math
For a catalog of 1 million items:
- 128d: ~512 MB
- 256d: ~1 GB
- 384d: ~1.5 GB
These are raw vector sizes. Your vector database adds overhead for indexing (typically 20-50% more).
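The figures above follow from float32 storage: each dimension costs 4 bytes, so per-item size is dims × 4 bytes. A quick back-of-envelope check:

```python
# Raw float32 vector storage: dims * 4 bytes per item.
ITEMS = 1_000_000

for dims in (128, 256, 384):
    per_item_kb = dims * 4 / 1024        # 0.5, 1.0, 1.5 KB per item
    total_mb = dims * 4 * ITEMS / 1e6    # 512, 1024, 1536 MB for 1M items
    print(f"{dims}d: {per_item_kb} KB/item, ~{total_mb:.0f} MB total")
```

Remember to add your vector database's indexing overhead (typically 20-50%) on top of these raw sizes.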
## How to specify dimension
Pass the dimension parameter when calling /embed:

```python
import requests

# BASE and headers are assumed to be defined as in the earlier setup sections.

# Compact embeddings for large-scale dedup
resp = requests.post(f"{BASE}/embed", headers=headers,
                     json={"items": menu_items, "dimension": 128})

# High-quality embeddings for precise matching
resp = requests.post(f"{BASE}/embed", headers=headers,
                     json={"items": menu_items, "dimension": 384})
```
For /search, /match, /dedup, and other endpoints, the model uses 384d internally. The dimension parameter only affects /embed output when you're storing vectors yourself.
## Quality comparison
Accuracy loss from reducing dimensions is small but measurable:
- 384d to 256d: ~1-2% drop on fine-grained benchmarks
- 384d to 128d: ~3-5% drop on fine-grained benchmarks
- All dimensions perform equally well on obvious duplicates
If your items are sufficiently distinct (pizza vs sushi vs biryani), 128d is more than enough. If you need to distinguish closely related items (Flat White vs Cappuccino vs Latte), use 384d.
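One way to make this call empirically is to score the same pair at several widths and see where the scores diverge. A sketch (with synthetic random vectors standing in for real /embed output, since actual similarity values depend on the model):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Synthetic stand-ins for two 384-d embeddings of closely related items.
rng = np.random.default_rng(7)
latte, mocha = rng.normal(size=384), rng.normal(size=384)

# Compare the same pair at each candidate dimension.
for dim in (128, 256, 384):
    print(dim, round(cosine(latte[:dim], mocha[:dim]), 3))
```

If the 128d and 384d scores rank your known hard pairs the same way, the cheaper dimension is probably safe for your catalog.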