JSON to CSV: Flattening Nested Data

How to convert nested JSON into flat CSV rows — dotted keys, array handling, escaping, encoding. Real examples plus the lossy edges to watch for.

By ShiangYu Huang · Published 2026-05-13 · Last reviewed 2026-05-13

CSV is the universal currency of data analysis tools — Excel, Google Sheets, every database import, every BI tool. JSON, by contrast, is tree-shaped. Going from one to the other isn't a 1-to-1 mapping; flattening makes choices that lose information. This guide covers the decisions you need to make explicitly, so the conversion doesn't surprise you downstream.

Why CSV needs flat rows

CSV is a 2D format: a header row plus N data rows, all with the same number of columns. JSON is N-dimensional — objects nest, arrays nest inside objects, objects nest inside arrays. Any conversion has to project the tree onto a grid.

Three operations do this projection:

Pick a "row entity" — usually the elements of a top-level array.
Flatten nested objects into dot-pathed columns.
Decide what to do with nested arrays — explode, join, or index.

Mapping an array of objects to rows

The easy case. The JSON is a flat array of flat objects:

[
  { "id": 1, "name": "Ada", "active": true },
  { "id": 2, "name": "Alan", "active": false }
]

Every key becomes a column; every element becomes a row:

id,name,active
1,Ada,true
2,Alan,false

The header row is the union of keys across all records. If a key is missing from a row, the cell is empty.

Flattening nested objects (dotted keys)

Nest one level, dot the path:

[
  {
    "id": 1,
    "name": "Ada",
    "address": { "city": "London", "zip": "SW1A 2AA" }
  }
]

id,name,address.city,address.zip
1,Ada,London,SW1A 2AA

For multiple levels, just keep dotting: address.country.code. The notation is the same one JSONPath uses, so the column names tell you exactly which field they came from. Some tools use address_city (underscore) or address__city (double underscore) instead — pick a convention and document it.

Arrays inside records

This is where conversion gets opinionated. There are three reasonable strategies, each lossy in a different way.

Strategy A: explode. One row per array element. Best when array elements are the real records:

[
  { "id": 1, "tags": ["admin", "founder"] }
]

id,tag
1,admin
1,founder

This is what jq does with [.id, .tags[]] | @csv and what SQL would do with UNNEST. Two issues — you lose the natural "one row per user" shape, and if a record has multiple arrays you have to explode the cross-product or pick one.

Strategy B: join. Encode the array as a single delimited string:

id,tags
1,"admin;founder"

Best when the array is small and consumers can split. Choose a delimiter that doesn't appear in the values — semicolon or pipe usually.

Strategy C: index. One column per array position:

id,tags.0,tags.1
1,admin,founder

Best when the array has a fixed small length and position is meaningful (e.g. RGB triples). Falls apart for variable-length arrays — you end up with tags.0 through tags.99 mostly empty.

There's no universally right answer. Pick based on what the consumer does with the output.

Headers, delimiters, quoting, encoding

The four CSV variables that bite in production:

Delimiter. RFC 4180 says comma. European Excel says semicolon. TSV uses tab. Always set the delimiter explicitly in your converter and document it.
Quoting. Values containing the delimiter, a newline, or a double quote must be quoted. The quoting char is ", and a literal " inside a quoted value is doubled: ". So she said "hi" becomes "she said ""hi""".
Headers. Optional in RFC 4180 but you should always emit them. Without headers, the conversion is opaque to the consumer.
Encoding. UTF-8 is the only safe answer. Excel for Windows still defaults to a regional 8-bit encoding when double-clicking a .csv file — adding a UTF-8 BOM (EF BB BF) at the start of the file makes Excel auto-detect UTF-8. Yes, this is the same BOM the JSON syntax errors guide tells you to avoid in JSON. CSV is friendlier to it.

Lossy conversions and how to mitigate

Conversions that lose information:

Type loss — CSV has no types. "true", true, and 1 are all the same string true on the wire. Pin a convention: numbers as decimals, booleans as true/false or 1/0, dates as ISO 8601.
Null vs empty — there is no null in CSV. Most tools use the empty string. If you need to distinguish "missing" from "empty string", encode them differently (e.g. \N for null, à la PostgreSQL COPY).
Nested array order — when you explode, the original within-record order is preserved per row, but consumers can't always tell what the parent record was unless you carry an id column.
Object identity inside arrays — exploding arrays of objects multiplies columns: orders[*].id, orders[*].total need either an index suffix or a row explode.

The mitigation is the same in every case: write down your conversion conventions and treat them as part of the data contract.

Working with big inputs

If the JSON is large, don't load it all into memory before converting. See working with large JSON files for streaming patterns — a streaming JSON parser feeding a streaming CSV writer is the right shape, and processes arbitrarily large inputs in constant memory.

Convert yours

Paste JSON into /json/convert/csv — it offers strategy A, B, and C for nested arrays, a delimiter picker, and a UTF-8-BOM toggle for Excel compatibility. Everything runs locally; no data leaves your browser.

Next steps

Working with large JSON files — streaming JSON-to-CSV for huge inputs.
JSON vs YAML vs XML — choosing the right format upstream of conversion.