POST /dedup

Send a full menu and get back clusters of items that refer to the same dish, plus a list of singletons (items with no duplicates). Matching tolerates spelling errors, transliterations, promotional noise, and serving-size variations.

Request

Parameter	Type	Required	Description
`items`	string[]	Yes	Menu item texts to deduplicate. 1-2000 items, max 500 chars each.
`cosine_threshold`	float	No	Cosine similarity threshold for grouping items. Lower values group more aggressively. Default: 0.85, which works well.

{
  "items": [
    "Chicken Biryani",
    "Murgh Biryani Serves 2",
    "Paneer Tikka",
    "Panner Tika",
    "Masala Dosa",
    "**NEW** Chiken Biryani (Half)"
  ]
}
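
The limits in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_dedup_payload` is a hypothetical helper name, and it assumes that "500 chars" means Python string length (code points), which the reference does not specify.

```python
from typing import Optional

MAX_ITEMS = 2000       # documented upper bound on items per request
MAX_ITEM_CHARS = 500   # documented upper bound on characters per item


def build_dedup_payload(items: list[str],
                        cosine_threshold: Optional[float] = None) -> dict:
    """Check a menu against the documented limits and build the JSON body."""
    if not 1 <= len(items) <= MAX_ITEMS:
        raise ValueError(f"expected 1-{MAX_ITEMS} items, got {len(items)}")
    for index, text in enumerate(items):
        # Assumption: the limit counts characters as len() does.
        if len(text) > MAX_ITEM_CHARS:
            raise ValueError(
                f"item {index} has {len(text)} chars, limit is {MAX_ITEM_CHARS}"
            )
    payload = {"items": list(items)}
    # Only send the threshold when overriding the server default of 0.85.
    if cosine_threshold is not None:
        payload["cosine_threshold"] = cosine_threshold
    return payload
```

Passing the returned dict as the `json=` argument of the request keeps oversized menus from consuming an API call that would be rejected anyway.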

Response

Field	Type	Description
`clusters`	object[]	Groups of duplicate items
`singletons`	string[]	Items with no duplicates found
`total_items`	int	Total items submitted
`duplicate_items`	int	Number of excess duplicates (items in clusters minus number of clusters)
`processing_time_ms`	float	Processing time in milliseconds

Each cluster:

Field	Type	Description
`cluster_id`	int	Unique cluster identifier (1-based)
`canonical`	string	Shortest member, recommended as the canonical name
`members`	string[]	All items in this duplicate group (original text)
`pairwise_scores`	object[]	Similarity scores between each pair of members (capped for very large clusters)
`pairwise_truncated`	bool	True if `pairwise_scores` was capped for this cluster. Cluster membership is always complete.

{
  "clusters": [
    {
      "cluster_id": 1,
      "canonical": "Chicken Biryani",
      "members": ["Chicken Biryani", "Murgh Biryani Serves 2", "**NEW** Chiken Biryani (Half)"],
      "pairwise_scores": [
        {"text_a": "Chicken Biryani", "text_b": "Murgh Biryani Serves 2", "score": 0.932841},
        {"text_a": "Chicken Biryani", "text_b": "**NEW** Chiken Biryani (Half)", "score": 0.951203},
        {"text_a": "Murgh Biryani Serves 2", "text_b": "**NEW** Chiken Biryani (Half)", "score": 0.910547}
      ]
    },
    {
      "cluster_id": 2,
      "canonical": "Panner Tika",
      "members": ["Paneer Tikka", "Panner Tika"],
      "pairwise_scores": [
        {"text_a": "Paneer Tikka", "text_b": "Panner Tika", "score": 0.961482}
      ]
    }
  ],
  "singletons": ["Masala Dosa"],
  "total_items": 6,
  "duplicate_items": 3,
  "processing_time_ms": 187.4
}
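
A response like the one above is usually consumed in two ways: mapping every submitted item to the name that represents it, and counting how many items would disappear after merging. The sketch below shows one possible approach. The function names are hypothetical, and `SAMPLE` is a hand-trimmed response containing only the first cluster and the fields the code reads.

```python
# Trimmed response: first cluster only, scores and timing omitted.
SAMPLE = {
    "clusters": [
        {
            "cluster_id": 1,
            "canonical": "Chicken Biryani",
            "members": [
                "Chicken Biryani",
                "Murgh Biryani Serves 2",
                "**NEW** Chiken Biryani (Half)",
            ],
        }
    ],
    "singletons": ["Masala Dosa"],
}


def canonical_lookup(response: dict) -> dict[str, str]:
    """Map each submitted item to its cluster's canonical name."""
    lookup = {}
    for cluster in response["clusters"]:
        for member in cluster["members"]:
            lookup[member] = cluster["canonical"]
    # Singletons have no duplicates, so they represent themselves.
    for item in response["singletons"]:
        lookup[item] = item
    return lookup


def excess_duplicates(response: dict) -> int:
    """Recompute duplicate_items: each cluster keeps one item, the rest are excess."""
    return sum(len(cluster["members"]) - 1 for cluster in response["clusters"])


if __name__ == "__main__":
    print(canonical_lookup(SAMPLE)["Murgh Biryani Serves 2"])  # Chicken Biryani
    print(excess_duplicates(SAMPLE))  # 2
```

On a full response, `excess_duplicates` should agree with the `duplicate_items` field, which makes it a cheap sanity check after parsing.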

Example

import requests

menu = [
    "Chicken Biryani",
    "Murgh Biryani Serves 2",
    "Paneer Tikka",
    "Panner Tika",
    "Masala Dosa",
    "**NEW** Chiken Biryani (Half)"
]

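response = requests.post(
    "https://dish-embed.latimal.com/dedup",
    headers={"X-API-Key": "YOUR_KEY", "Content-Type": "application/json"},
    json={"items": menu},
    timeout=30,  # seconds; an arbitrary choice so the call cannot hang forever
)
response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body

data = response.json()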
print(f"Found {len(data['clusters'])} duplicate groups, {data['duplicate_items']} excess items")

for cluster in data["clusters"]:
    print(f"\n  Canonical: {cluster['canonical']}")
    print(f"  Duplicates: {cluster['members']}")
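
The example above prints the API's canonical suggestion, which is simply the shortest member. If a different choice is preferred, one option is to rank members client-side. The following is a sketch of one such heuristic, not the API's own logic: `pick_canonical` and the `NOISE` pattern are hypothetical, and the ranking (fewest noise characters, then highest mean pairwise score, then shortest) is an assumption about what makes a good display name. It cannot detect misspellings on its own.

```python
import re
from collections import defaultdict

# Hypothetical noise pattern: asterisks, digits, and brackets, which tend to
# come from promotions, prices, and serving sizes.
NOISE = re.compile(r"[*\d()\[\]]")


def pick_canonical(cluster: dict) -> str:
    """Prefer clean text, then the most central member, then the shortest."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    # pairwise_scores may be capped (pairwise_truncated), so average over
    # whichever pairs are present rather than assuming all of them are.
    for pair in cluster.get("pairwise_scores", []):
        for key in ("text_a", "text_b"):
            totals[pair[key]] += pair["score"]
            counts[pair[key]] += 1

    def rank(member: str) -> tuple:
        noisy = bool(NOISE.search(member))
        mean = totals[member] / counts[member] if counts[member] else 0.0
        # Tuples sort element by element: clean before noisy, higher mean
        # first (hence the minus sign), shorter text as the last tie-breaker.
        return (noisy, -mean, len(member))

    return min(cluster["members"], key=rank)
```

Calling `pick_canonical(cluster)` inside the loop over `data["clusters"]` yields a replacement for `cluster["canonical"]`.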

Availability

Available on the Scale plan. Each request counts as one API call against your monthly quota, regardless of how many items you send.

Try it live: test this endpoint in the interactive playground.

For a complete integration walkthrough, see the Menu Deduplication guide.

Notes

`duplicate_items` counts excess items (total items in clusters minus number of clusters), not total clustered items. A cluster of 3 items contributes 2 excess duplicates.
The canonical suggestion picks the shortest member. You may want to apply your own logic (e.g. prefer items without noise or misspellings).
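For menus over 2000 items, split into batches by category or restaurant and deduplicate each batch. Items in different batches are never compared with each other, so keep likely duplicates in the same batch.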
Promotional text (prices, "NEW", serving sizes) is stripped before comparison, so "Chicken Biryani" and "BEST SELLER Chicken Biryani Rs. 299" will match.
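
The batching advice for menus over 2000 items could be implemented along these lines. This is a minimal sketch under stated assumptions: `batches_by_group` is a hypothetical helper, and the input is assumed to be `(group, text)` pairs where the group is a category or restaurant name. A group larger than the limit is cut into consecutive slices.

```python
from collections import defaultdict
from typing import Iterable, Iterator

MAX_ITEMS = 2000  # documented per-request item limit


def batches_by_group(
    rows: Iterable[tuple[str, str]],
    max_items: int = MAX_ITEMS,
) -> Iterator[tuple[str, list[str]]]:
    """Yield (group, items) batches that each fit in one /dedup request."""
    grouped = defaultdict(list)
    for group, text in rows:
        grouped[group].append(text)
    for group, items in grouped.items():
        # Slice oversized groups; duplicates split across slices go unnoticed.
        for start in range(0, len(items), max_items):
            yield group, items[start:start + max_items]
```

Each yielded batch is sent as its own request, so it also counts as its own API call against the monthly quota.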