Reliable JSON from LLMs: structured output in practice

01 Why "just return JSON" breaks

The first version is always the same: write "return only JSON" in the prompt and pass the answer straight to JSON.parse. It works in the demo. In production, one day the model prefixes the answer with "Sure, here's the output:"; the next day it wraps the JSON in a markdown fence; over the weekend it produces a value outside your enum ("waiting" instead of "pending") or invents a field your schema never defined.

Each one looks small in isolation. You strip the fence with a regex, slice between the first { and the last }, drop unknown fields. But every patch breeds a new edge case, and before long the pipeline is a collection of parsers tuned to however the model happened to behave that day. At ten thousand requests a day, a one percent failure rate is a hundred broken records, every day.
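To make the patch-on-patch problem concrete, here is a minimal sketch of such a parser in Python, where json.loads plays the role of JSON.parse. The `status` field, the allowed values and the sample replies are made up for illustration; only the failure modes come from the text above.

```python
import json
import re

FENCE = "`" * 3                        # markdown code-fence marker
ALLOWED_STATUS = {"pending", "done"}   # hypothetical enum, for illustration only

def naive_parse(reply: str) -> dict:
    """Patch 1: drop fence markers. Patch 2: slice from the first { to the last }."""
    text = re.sub(re.escape(FENCE) + r"(?:json)?", "", reply)
    start, end = text.find("{"), text.rfind("}")
    return json.loads(text[start:end + 1])

def verdict(reply: str) -> str:
    """Classify what the patched parser does with one model reply."""
    try:
        parsed = naive_parse(reply)
    except json.JSONDecodeError:
        return "parse error"
    return "ok" if parsed.get("status") in ALLOWED_STATUS else "parsed, wrong value"

replies = [
    '{"status": "pending"}',                                       # clean
    'Sure, here\'s the output:\n{"status": "pending"}',            # chatty prefix
    FENCE + 'json\n{"status": "pending"}\n' + FENCE,               # markdown fence
    '{"status": "waiting"}',                                       # valid JSON, value outside the enum
    'Shape is {"status": ...}. Result: {"status": "pending"}',     # two brace groups defeat the slice
]

for reply in replies:
    print(f"{verdict(reply):20} <- {reply[:32]!r}")

# The arithmetic from the text: 1% of 10,000 requests per day.
failures_per_day = 10_000 * 1 // 100
print("broken records per day:", failures_per_day)
```

The first three replies survive the patches. The fourth parses cleanly and is still wrong, and the fifth breaks the brace-slicing patch itself, which is the kind of edge case each new heuristic tends to create.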

Parsing freeform text isn't a contract. It's hope.

02 Constrained decoding and schemas that help the model

The major providers' native structured output modes close this class of failure at the root: you send the schema with the request, and tokens that would violate it are masked during decoding. The model cannot emit schema-violating JSON: not "usually complies", but literally cannot select the violating token. (A response cut off by the output token limit can still arrive incomplete, so the guarantee covers what the model may emit, not whether it finishes.) That is a syntax guarantee, not a semantic one: schema-valid wrong content is still possible. The second half is schema design. A schema that makes the model's job easier also reduces content errors:

Flat, not deep. Use flat fields instead of objects nested three levels down. Every extra level is another place for the model to fumble while carrying the structure.
Enums, not free strings. Leave a category as free text and the model gets creative; give it a closed list so downstream code lives with one value set.
Explicit optionals. Whether each field is required or nullable belongs in the schema; a legitimate null gives the model an exit that isn't fabrication.
Descriptions are instructions. Field descriptions are part of the prompt the model reads. Rules like "one sentence, in the customer's language" belong in the description, not buried at the bottom of the prompt.
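The four guidelines above can be sketched in a single schema. The following is an illustrative JSON Schema written as a Python dict. The field names (`customer_language`, `order_id`) and the description texts are assumptions made up for this example, and individual providers may support only a subset of JSON Schema keywords, so treat it as a shape to adapt rather than a drop-in definition.

```python
import json

# Hypothetical schema for a support-ticket extractor; all field names are illustrative.
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        # Enum, not free string: downstream code sees one closed value set.
        "category": {
            "type": "string",
            "enum": ["billing", "bug", "other"],
            "description": "Best-fitting category; use 'other' when none applies.",
        },
        # Flat field instead of something like customer.contact.language.
        "customer_language": {
            "type": "string",
            "description": "ISO 639-1 code of the language the customer wrote in.",
        },
        # Explicit optional: the key is required, but null is a legitimate answer.
        "order_id": {
            "type": ["string", "null"],
            "description": "Order number if the message mentions one, otherwise null.",
        },
        # Description as instruction: the rule sits on the field it governs.
        "summary": {
            "type": "string",
            "description": "One sentence, in the customer's language.",
        },
    },
    "required": ["category", "customer_language", "order_id", "summary"],
    "additionalProperties": False,
}

def max_depth(schema: dict) -> int:
    """Count levels of nested objects, as a quick flatness check."""
    props = schema.get("properties", {})
    nested = (max_depth(p) for p in props.values() if p.get("type") == "object")
    return 1 + max(nested, default=0)

print(json.dumps(TICKET_SCHEMA, indent=2))
print("object nesting depth:", max_depth(TICKET_SCHEMA))
```

Making `order_id` required but nullable is one way to express an explicit optional: the model has to address the field, and null is a legitimate answer.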

03 The validation layer, regardless

Even with constrained decoding, validation in code stays, because business rules, cross-field consistency and boundary values live outside the schema, and not every call goes through structured mode. The pattern is simple: validate the output against the same schema; on failure, retry with the validation error fed back verbatim: not "that was invalid, try again" but "priority must be an integer, got a string". Cap retries at two or three; if it still fails, drop to a deterministic fallback: queue the record, fall back to a safe default, or hand it to a human. Blind retries burn tokens; a chain without a fallback wakes whoever is on call.
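As an illustration of checks that sit outside the schema, here is a small sketch that runs after schema validation has passed. Both rules (the 200-character cap and the restriction on priority 5) are hypothetical and invented for this example; real rules would come from your own domain.

```python
def business_rule_errors(ticket: dict) -> list[str]:
    """Checks beyond the schema. Assumes the ticket already passed schema validation.

    Both rules below are hypothetical examples, not taken from any real system.
    """
    errors = []

    # Boundary value: a summary that is present but blank or overlong.
    summary = ticket["summary"].strip()
    if not summary:
        errors.append("summary must not be blank")
    elif len(summary) > 200:
        errors.append(f"summary must be at most 200 characters, got {len(summary)}")

    # Cross-field consistency: the top priority needs a specific category.
    if ticket["category"] == "other" and ticket["priority"] == 5:
        errors.append("priority 5 requires a specific category, got 'other'")

    return errors

print(business_rule_errors({"category": "other", "priority": 5, "summary": "  "}))
print(business_rule_errors({"category": "billing", "priority": 4, "summary": "Card was charged twice."}))
```

Returning the errors as specific sentences, rather than a boolean, means the same messages can be fed back to the model on retry.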

ticket-triage: schema + retry

// schema (v3), lives next to the code
{ "type": "object",
  "properties": {
    "category": { "enum": ["billing", "bug", "other"] },
    "priority": { "type": "integer", "minimum": 1, "maximum": 5 },
    "summary":  { "type": "string" } },
  "required": ["category", "priority", "summary"],
  "additionalProperties": false }

// ✕ attempt 1: rejected
{ "category": "payment", "priority": "high", "urgent": true }
// → enum violation + type error + unknown field + missing required summary;
//   the errors go back to the model

// ✓ attempt 2: passed
{ "category": "billing", "priority": 4, "summary": "Card was charged twice." }
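The exchange above can be reproduced with a loop along these lines. It is a minimal sketch with a hand-rolled validator so that it runs on the standard library alone; in practice a JSON Schema validation library would likely replace `validate`. The names `triage`, `call_model` and `fake_model`, and the `needs_human` fallback shape, are assumptions for illustration, not part of any provider's API.

```python
import json

FIELDS = ("category", "priority", "summary")
ALLOWED_CATEGORIES = ["billing", "bug", "other"]

def validate(ticket: object) -> list[str]:
    """Mirror of the ticket schema; returns specific, model-readable errors."""
    if not isinstance(ticket, dict):
        return ["output must be a JSON object"]
    errors = [f"{f} is required but missing" for f in FIELDS if f not in ticket]
    errors += [f"{f} is not a defined field; remove it" for f in ticket if f not in FIELDS]
    if "category" in ticket and ticket["category"] not in ALLOWED_CATEGORIES:
        errors.append(f"category must be one of {ALLOWED_CATEGORIES}, got {ticket['category']!r}")
    if "priority" in ticket:
        p = ticket["priority"]
        if isinstance(p, bool) or not isinstance(p, int):
            errors.append(f"priority must be an integer, got {type(p).__name__}")
        elif not 1 <= p <= 5:
            errors.append(f"priority must be between 1 and 5, got {p}")
    if "summary" in ticket and not isinstance(ticket["summary"], str):
        errors.append("summary must be a string")
    return errors

def triage(message, call_model, max_attempts=3):
    """Validate, retry with the errors fed back verbatim, then fall back deterministically."""
    feedback = []
    for _ in range(max_attempts):
        raw = call_model(message, feedback)
        try:
            ticket = json.loads(raw)
            errors = validate(ticket)
        except json.JSONDecodeError as exc:
            errors = [f"output was not valid JSON: {exc.msg}"]
        if not errors:
            return {"status": "ok", "ticket": ticket}
        feedback = errors          # the next attempt sees exactly what was wrong
    # Deterministic fallback: hand the record to a human queue instead of guessing.
    return {"status": "needs_human", "message": message, "last_errors": feedback}

def fake_model(message, feedback):
    """Stand-in for a real model call: wrong at first, correct once it sees the errors."""
    if not feedback:
        return '{"category": "payment", "priority": "high", "urgent": true}'
    return '{"category": "billing", "priority": 4, "summary": "Card was charged twice."}'

print(triage("I was charged twice", fake_model))
print(triage("I was charged twice", lambda m, f: "not json", max_attempts=2))
```

The cap on attempts and the explicit fallback branch are what keep the loop bounded: a model that never produces valid output costs at most `max_attempts` calls and ends in a known state.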

04 Tool calling or JSON mode, and two habits

They are two faces of the same mechanism. If the output is an action (the model will call a function with arguments, and there is more than one possible action), tool calling is the right fit: the model picks the tool, and the arguments bind to a schema. If the output is the answer (extraction, classification, a structured report), JSON mode or direct structured output is simpler; there is no need to invent a fake single-use "tool".
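The distinction can be sketched in a few lines. The tool names, argument types and the `{"name": ..., "arguments": {...}}` shape below are all hypothetical, chosen for illustration; each provider defines its own wire format for tool calls.

```python
import json

# Hypothetical tools: when the output is an action, the model picks one by name.
def refund_order(order_id: str, amount_cents: int) -> str:
    return f"refunded {amount_cents} cents on order {order_id}"

def escalate_ticket(ticket_id: str) -> str:
    return f"escalated {ticket_id}"

# Each tool is registered with the argument types its schema would declare.
TOOLS = {
    "refund_order": (refund_order, {"order_id": str, "amount_cents": int}),
    "escalate_ticket": (escalate_ticket, {"ticket_id": str}),
}

def dispatch(tool_call_json: str) -> str:
    """Check a tool call against its registered argument types, then run it."""
    call = json.loads(tool_call_json)
    if call["name"] not in TOOLS:
        raise ValueError(f"unknown tool: {call['name']}")
    func, arg_types = TOOLS[call["name"]]
    args = call["arguments"]
    if set(args) != set(arg_types):
        raise ValueError(f"expected arguments {sorted(arg_types)}, got {sorted(args)}")
    for name, expected in arg_types.items():
        if not isinstance(args[name], expected):
            raise ValueError(f"{name} must be {expected.__name__}")
    return func(**args)

# Output is an action: the model chose a tool and supplied its arguments.
action = '{"name": "refund_order", "arguments": {"order_id": "A-17", "amount_cents": 1250}}'
print(dispatch(action))

# Output is the answer: no tool involved, the structured object is the result itself.
classification = json.loads('{"category": "billing", "priority": 4}')
print(classification["category"])
```

In the second case a tool registry would add nothing: there is one possible output shape and nothing to execute, so a single schema is enough.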

Two habits make it stick. Version the schema next to the code: a schema change is a code change, so it should show up in the PR diff, and a breaking change should be caught in review. And keep an eval set sampled from real traffic: even at a hundred percent schema pass rate, content accuracy is a separate metric, and it is only measurable against real examples.
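A minimal sketch of keeping the two metrics apart follows. The samples are invented placeholders standing in for traffic-sampled pairs, not real measurements. Comparing only `category` is a simplifying assumption, since a free-text summary would need its own scoring.

```python
SCHEMA_VERSION = "v3"  # reported with the metrics so results stay tied to one schema revision

def is_schema_valid(output: dict) -> bool:
    """Compact stand-in for full schema validation of a ticket."""
    priority = output.get("priority")
    return (
        output.get("category") in ("billing", "bug", "other")
        and isinstance(priority, int)
        and not isinstance(priority, bool)
        and isinstance(output.get("summary"), str)
    )

def evaluate(samples):
    """samples: list of (model_output, expected_output) pairs."""
    valid = [(out, exp) for out, exp in samples if is_schema_valid(out)]
    correct = [out for out, exp in valid if out["category"] == exp["category"]]
    return {
        "schema_version": SCHEMA_VERSION,
        "schema_pass_rate": len(valid) / len(samples),
        "content_accuracy": len(correct) / len(samples),
    }

# Placeholder pairs: every output is schema-valid, one has the wrong category.
samples = [
    ({"category": "billing", "priority": 4, "summary": "s"}, {"category": "billing"}),
    ({"category": "bug", "priority": 2, "summary": "s"}, {"category": "bug"}),
    ({"category": "other", "priority": 1, "summary": "s"}, {"category": "other"}),
    ({"category": "bug", "priority": 3, "summary": "s"}, {"category": "billing"}),
]

print(evaluate(samples))
```

With these placeholders the schema pass rate is 1.0 while content accuracy is 0.75, which is the gap the eval set exists to expose.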

Short list. Before wiring to production: constrained decoding / structured mode ✓ · flat schema + enums + descriptions ✓ · schema validation in code ✓ · retry with the error fed back (max 2–3) ✓ · deterministic fallback ✓ · schema versioned with the code ✓. Without this chain, the first parse failure detonates on a customer's screen.

