JSON vs YAML vs XML: Which Data Format to Use
Side-by-side comparison of JSON, YAML, and XML with the same payload, data-type support, and concrete recommendations for when to pick each.
JSON, YAML, and XML solve the same broad problem — encode structured data as text — but their syntax, ecosystems, and trade-offs differ enough that picking the wrong one will hurt. This guide compares them head-to-head and gives a clear rule for when to reach for each.
The three formats at a glance
| Property | JSON | YAML | XML |
|---|---|---|---|
| Primary use today | APIs, wire format | Config, IaC, CI pipelines | Documents, legacy systems |
| Comments | No | Yes (#) | Yes (<!-- -->) |
| Schema | JSON Schema (add-on) | JSON Schema (via converters) | XSD, RELAX NG, DTD |
| Anchors / references | No | Yes (&, *) | Limited (via DTD) |
| Attributes vs content | No distinction | No distinction | Yes (attributes + text) |
| Parsing complexity | Trivial | High (whitespace, types) | High (namespaces, DTD) |
| Human writability | OK | Best | Worst |
| File size (typical) | Small | Smallest | Largest |
The same payload in three formats
A user record, in JSON:
{
  "id": 42,
  "name": "Ada Lovelace",
  "active": true,
  "tags": ["admin", "founder"]
}
Same record, in YAML:
id: 42
name: Ada Lovelace
active: true
tags:
- admin
- founder
Same record, in XML:
<user id="42" active="true">
  <name>Ada Lovelace</name>
  <tags>
    <tag>admin</tag>
    <tag>founder</tag>
  </tags>
</user>
Notice that XML forces a structural choice JSON and YAML do not: id and active can be attributes of the element or child elements. Every project answers that question differently, which is why XML mappings are bespoke per project.
Data types
- JSON has six types: string, number, boolean, null, object, array. No date, no binary.
- YAML has the JSON set plus timestamps, plus implicit typing: 1.0, 1, yes, no, on, off, null, and ~ are all magic. This is the famous Norway problem: under YAML 1.1 rules, which many parsers still apply, the country code NO becomes the boolean false unless quoted.
- XML has no types in the base spec; everything is text. XSD adds a type system, but it is a separate document and rarely used outside enterprise.
For schema-less exchange where types still matter, JSON wins on simplicity. If you need to model rich types, lean on a schema (JSON Schema, OpenAPI, or XSD).
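The "no date" gap is easy to see in practice. The sketch below uses Python's standard json module; the event record and the encode_extra helper are hypothetical, and ISO 8601 strings are just one common convention for carrying dates in JSON.

```python
import json
from datetime import date

# The six JSON types round-trip without any help.
record = {"id": 42, "name": "Ada Lovelace", "active": True,
          "tags": ["admin", "founder"], "manager": None}
round_tripped = json.loads(json.dumps(record))

# A date is not one of the six types, so the encoder refuses it.
event = {"name": "launch", "when": date(2024, 1, 15)}
try:
    json.dumps(event)
    date_rejected = False
except TypeError:
    date_rejected = True


def encode_extra(value):
    """Hypothetical fallback: write dates as ISO 8601 strings."""
    if isinstance(value, date):
        return value.isoformat()
    raise TypeError(f"Cannot encode {type(value).__name__}")


encoded = json.dumps(event, default=encode_extra)
decoded = json.loads(encoded)
# The date comes back as a plain string; only a schema or a
# documented convention tells the reader it was ever a date.
print(decoded)
```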
Comments, anchors, and namespaces
JSON has none of these. YAML has comments and anchors:
defaults: &defaults
  region: us-east-1
  timeout: 30
prod:
  <<: *defaults
  bucket: prod-bucket
staging:
  <<: *defaults
  bucket: staging-bucket
Comments and anchors are the killer features for hand-edited config; they help explain why YAML is the dominant choice for Kubernetes manifests, GitHub Actions workflows, and Ansible playbooks.
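To see what the anchors buy you, it helps to write out the data a loader hands back once the merges are resolved. The Python sketch below models that result by hand rather than calling a YAML library, and it assumes a loader that supports the `<<` merge key (a YAML 1.1 convention that not every parser implements).

```python
import json

# What "&defaults" names: a mapping defined once.
defaults = {"region": "us-east-1", "timeout": 30}

# "<<: *defaults" merges that mapping into each environment;
# keys written locally (bucket) sit alongside the merged ones.
prod = {**defaults, "bucket": "prod-bucket"}
staging = {**defaults, "bucket": "staging-bucket"}

config = {"defaults": defaults, "prod": prod, "staging": staging}

# JSON has no anchors, so the shared keys are spelled out three times.
as_json = json.dumps(config, indent=2)
print(as_json)
```

Change the region in the YAML and you edit one line; change it in the JSON output and you edit three.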
XML's superpower is namespaces and a document/mixed-content model — useful for things like SOAP, SVG, and word-processing documents where text and markup interleave, but overkill for plain records.
Size, parsing speed, and security
JSON parsers are fast: the grammar is tiny, and engines such as V8 ship a heavily optimised native implementation of JSON.parse. YAML parsers must implement a much larger grammar and are correspondingly slower; pathological documents have caused real CVEs (the "billion laughs" expansion attack works in YAML and XML, but not in JSON, which has no references or entities to expand).
XML's external entities have been a persistent security issue (XXE). If you parse XML from untrusted sources, disable external entity resolution explicitly.
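As one illustration, Python's standard SAX reader exposes a feature flag for external general entities. The sketch below assumes Python 3 and the stdlib xml.sax module; parse_untrusted and TextCollector are hypothetical names. Recent Python versions already leave this feature off by default, so setting it mainly documents intent and guards against older runtimes. For genuinely hostile input, a hardened parser such as the third-party defusedxml package is the more common recommendation.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class TextCollector(ContentHandler):
    """Collects character data so the caller can inspect the parsed text."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_untrusted(xml_bytes):
    """Parse XML with external general entity resolution switched off."""
    parser = xml.sax.make_parser()
    # Explicitly refuse to fetch external general entities (the XXE vector).
    parser.setFeature(feature_external_ges, False)
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.chunks)


text = parse_untrusted(b"<name>Ada Lovelace</name>")
print(text)
```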
For pure size, well-written YAML is usually smallest, JSON middle, XML largest — sometimes by a factor of two on the same data.
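You can check the ordering on the user record from earlier with a few lines of Python. This is only a sketch on one tiny payload: the YAML and XML strings are typed in by hand (whitespace stripped from the XML), and the ranking on your own data may differ.

```python
import json

record = {"id": 42, "name": "Ada Lovelace", "active": True,
          "tags": ["admin", "founder"]}

# JSON with and without whitespace.
json_compact = json.dumps(record, separators=(",", ":"))
json_pretty = json.dumps(record, indent=2)

# Hand-written equivalents of the same record.
yaml_text = "id: 42\nname: Ada Lovelace\nactive: true\ntags:\n- admin\n- founder\n"
xml_text = (
    '<user id="42" active="true"><name>Ada Lovelace</name>'
    "<tags><tag>admin</tag><tag>founder</tag></tags></user>"
)

candidates = {
    "yaml": yaml_text,
    "json_compact": json_compact,
    "json_pretty": json_pretty,
    "xml": xml_text,
}
sizes = {label: len(text.encode("utf-8")) for label, text in candidates.items()}

# Print smallest first.
for label, size in sorted(sizes.items(), key=lambda item: item[1]):
    print(f"{label:12} {size} bytes")
```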
When to pick which
- HTTP APIs and wire formats: JSON. Universal support, fast parsers, no surprises.
- Configuration that humans hand-edit: YAML. Comments and anchors pay for themselves.
- Build/CI pipeline definitions: YAML, because the ecosystem (GitHub Actions, GitLab CI, CircleCI) standardised on it.
- Document-oriented data with mixed content: XML. Think DOCX, SVG, RSS, SOAP. JSON cannot represent Hello <b>world</b>! cleanly.
- Greenfield enterprise integration: JSON with OpenAPI. XML/SOAP only if a partner forces it.
- Anything machine-to-machine, high volume: JSON, or move to a binary format like Protocol Buffers or MessagePack.
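The mixed-content point is worth a quick demonstration. The sketch below parses the Hello <b>world</b>! fragment with Python's xml.etree.ElementTree; the wrapping <p> element is an assumption added so the fragment has a single root, and the JSON shape at the end is one ad hoc encoding among many, not a standard.

```python
import json
import xml.etree.ElementTree as ET

paragraph = ET.fromstring("<p>Hello <b>world</b>!</p>")
bold = paragraph[0]

# XML's document model keeps the interleaved pieces in order:
leading = paragraph.text   # text before the <b> element
emphasis = bold.text       # text inside <b>
trailing = bold.tail       # text after </b>, still inside <p>

# One ad hoc JSON encoding: an array that alternates strings and objects.
# Nothing in JSON itself says what the object or the ordering means.
as_json = json.dumps([leading, {"b": emphasis}, trailing])
print(as_json)
```

The XML carries the structure natively; the JSON version only works if both sides agree on the convention in advance.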
Converting between them
In practice you will need to move data between formats — pulling a YAML config into a JSON API request, or accepting XML from a legacy partner and re-emitting JSON. The conversions are mostly mechanical when types are simple:
- JSON ↔ YAML
- JSON ↔ CSV — for tabular subsets, see also JSON to CSV: flattening nested data
- JSON ↔ XML
The lossy edges are: YAML's anchors flatten on conversion, XML's attribute/element distinction collapses, and arrays-vs-single-element are ambiguous in XML. Document your conversion conventions.
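The array ambiguity is the edge that bites most often, so here is a minimal sketch of it in Python. The function children_to_dict and its list_tags parameter are hypothetical; they stand in for whatever convention your converter documents.

```python
import xml.etree.ElementTree as ET


def children_to_dict(xml_text, list_tags=()):
    """Convert child elements to a dict.

    Without list_tags, a tag becomes a list only when it repeats.
    Tags named in list_tags are always emitted as lists.
    """
    root = ET.fromstring(xml_text)
    result = {}
    for child in root:
        if child.tag in list_tags or child.tag in result:
            bucket = result.setdefault(child.tag, [])
            if not isinstance(bucket, list):
                # Second occurrence: promote the earlier scalar to a list.
                bucket = result[child.tag] = [bucket]
            bucket.append(child.text)
        else:
            result[child.tag] = child.text
    return result


two_tags = "<tags><tag>admin</tag><tag>founder</tag></tags>"
one_tag = "<tags><tag>admin</tag></tags>"

# Naive conversion: the shape of the output depends on the data.
naive_two = children_to_dict(two_tags)
naive_one = children_to_dict(one_tag)

# Documented convention: "tag" is always a list, whatever the count.
stable_one = children_to_dict(one_tag, list_tags=("tag",))
print(naive_two, naive_one, stable_one)
```

With the naive rule, a consumer that iterates over the tag value gets a list for two tags and a bare string for one, which is exactly the kind of surprise a written convention prevents.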
Next steps
- What is JSON? — the format we keep coming back to.
- Minify vs prettify JSON — once you've picked JSON, size and readability are dials you can turn.
- /json/formatter — pretty-print or compact JSON in one click.