JSON vs YAML vs XML

JSON, YAML, and XML are three widely used text formats for representing structured data, and although they can often carry the same information, they differ sharply in syntax, readability, and intended use. JSON dominates web APIs for its simplicity, YAML is favoured for human-edited configuration, and XML remains common in documents and enterprise systems where its richer features matter. Each makes different trade-offs between being easy for machines to parse and easy for people to read and write. This guide compares their data models, weighs their strengths and weaknesses, and highlights the pitfalls of converting between them, type coercion chief among them.

  1. JSON syntax and data model

    JSON (JavaScript Object Notation) represents data as objects of key/value pairs in braces and ordered arrays in brackets, as in `{"name": "Ada", "tags": ["a", "b"]}`. Its value types are deliberately minimal: string, number, boolean, null, object, and array, with keys and strings always in double quotes. This small, strict grammar makes JSON simple to parse consistently across languages, which is a large part of why it became the default for web APIs. The trade-offs are that it has no comments and no native date or other extended types; those are conventions layered on top of strings.
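
    As a minimal sketch of this data model, the snippet below parses the example with Python's standard `json` module. The `event` payload and the `is_valid_json` helper are illustrative assumptions, not part of any API discussed in this guide.

```python
import json

# The example object from the text: objects become dicts, arrays become lists.
text = '{"name": "Ada", "tags": ["a", "b"]}'
data = json.loads(text)

# A made-up payload: the timestamp is only a string by convention,
# while number, boolean, and null map onto int, bool, and None.
event = json.loads(
    '{"created": "2024-01-31T12:00:00Z", "count": 3, "ok": true, "note": null}'
)


def is_valid_json(candidate: str) -> bool:
    """Return True if the text parses as JSON (hypothetical helper)."""
    try:
        json.loads(candidate)
    except json.JSONDecodeError:
        return False
    return True


# Comments are not part of the grammar, so the second check fails.
print(is_valid_json(text))
print(is_valid_json('{"name": "Ada"} // a comment'))
```

    Turning `event["created"]` into a real date is left to the application, which is what "conventions layered on top of strings" means in practice.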

  2. YAML syntax and data model

    YAML is designed as a superset of JSON (formally so since YAML 1.2) and maps to the same core model of mappings, sequences, and scalars, but it usually expresses structure through indentation rather than braces and brackets. The equivalent of the previous example reads as `name: Ada` followed by a `tags:` list with `- a` and `- b` on indented lines, which many find easier to read. YAML adds comments (with `#`), anchors for reusing chunks of data, and multi-line string blocks. The cost of this expressiveness is sensitivity to indentation and a notably more complex specification with subtle parsing rules.

  3. XML syntax and data model

    XML (eXtensible Markup Language) wraps data in nested element tags, as in `<user><name>Ada</name></user>`, and each element can also carry attributes such as `<user id="1">`. Unlike the others, its model is a tree of nodes rather than plain maps and lists, and it draws a distinction between element content and attributes that has no direct equivalent in JSON or YAML. XML supports namespaces to avoid name clashes and schema languages (such as DTD or XSD) to validate structure formally. This makes it powerful for mixed text-and-markup documents, at the price of being the most verbose of the three.
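
    The element/attribute split and mixed content can be seen with Python's standard `xml.etree.ElementTree` module. This is a small sketch; the `<p>` fragment is an invented sample, not taken from any real document format.

```python
import xml.etree.ElementTree as ET

# The user example from the text, combining an attribute and a child element.
root = ET.fromstring('<user id="1"><name>Ada</name></user>')

user_id = root.get("id")          # attribute value, always a string
name = root.find("name").text     # text content of the <name> child element

# Mixed content: text and markup interleaved inside one element.
para = ET.fromstring("<p>Hello <b>Ada</b>!</p>")
before = para.text                # text before the first child element
bold = para[0].text               # text inside <b>
after = para[0].tail              # text following </b>, stored on the child

print(user_id, name)
print(repr(before), repr(bold), repr(after))
```

    Note that `user_id` comes back as the string `"1"`, not a number: XML itself carries no type information unless a schema or the application supplies it.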

  4. Strengths and weaknesses

    JSON’s strength is simplicity and universal language support, but it lacks comments and is tedious to write and edit by hand for large configs. YAML is the most human-friendly to author and supports comments, yet its indentation sensitivity and surprising implicit typing make it error-prone, and its parsers are more complex and typically slower than JSON parsers. XML is the most feature-rich, with namespaces, schemas, and rich document structure, but it is also the most verbose and cumbersome for simple data. The right choice depends on whether a human or a machine is the primary author and reader.

  5. Where each is typically used

    JSON is the lingua franca of web APIs and data interchange between services, where compact, fast, machine-parsed payloads matter most. YAML dominates configuration that people edit by hand — CI pipelines, Kubernetes manifests, and many framework config files — because its readability and comments outweigh its quirks there. XML persists in document formats (such as office files and RSS feeds), SOAP web services, and older enterprise integrations where schemas and namespaces are valued. None is strictly better; each settled into the niche that fits its trade-offs.

  6. Conversion caveats and type coercion

    These formats do not map onto each other perfectly, so converting between them can lose or change information. A frequent trap is YAML’s implicit type coercion: bare values like `yes`, `no`, `on`, and `off` are read as booleans by parsers that follow YAML 1.1 rules (the YAML 1.2 core schema treats only forms of `true` and `false` as booleans), and an unquoted version string such as `1.10` is read as the number `1.1`, dropping the trailing zero. Quoting the value prevents both surprises. XML has no native arrays or distinct number and boolean types, so round-tripping to JSON forces decisions about how to represent repeated elements and how to interpret text as typed values. The safe habit is to validate after any conversion and to quote ambiguous scalars in YAML so they survive as the type you intended.
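
    To make the coercion trap concrete, here is a deliberately simplified imitation of YAML 1.1-style implicit typing, written in plain Python. It is not a real YAML parser: `resolve_scalar` and its rules are hypothetical, cover only booleans, integers, and simple decimals, and accept more letter-case variants than real parsers do.

```python
import re

# Hypothetical, simplified stand-in for YAML 1.1-style scalar resolution.
# Real parsers apply many more rules (null, octal, dates, exponents, ...).
TRUE_WORDS = {"yes", "on", "true"}
FALSE_WORDS = {"no", "off", "false"}
INT_RE = re.compile(r"[-+]?[0-9]+")
FLOAT_RE = re.compile(r"[-+]?[0-9]+\.[0-9]*")


def resolve_scalar(text: str):
    """Guess a type for one scalar, roughly as a YAML 1.1-style resolver might."""
    # A quoted scalar is always a string; only the quotes are stripped.
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "'\"":
        return text[1:-1]
    lowered = text.lower()  # looser than the real case rules
    if lowered in TRUE_WORDS:
        return True
    if lowered in FALSE_WORDS:
        return False
    if INT_RE.fullmatch(text):
        return int(text)
    if FLOAT_RE.fullmatch(text):
        return float(text)  # "1.10" and "1.1" become the same number
    return text


print(resolve_scalar("yes"))            # True
print(resolve_scalar("1.10"))           # 1.1
print(repr(resolve_scalar('"1.10"')))   # '1.10'
```

    The last two calls show why quoting matters: the unquoted version string collapses to a number, while the quoted one survives as the text `1.10`.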
