this post was submitted on 24 Jan 2026
45 points (94.1% liked)


There exists a peculiar amnesia in software engineering regarding XML. Mention it in most circles and you will receive knowing smiles, dismissive waves, the sort of patronizing acknowledgment reserved for technologies deemed passé. "Oh, XML," they say, as if the very syllables carry the weight of obsolescence. "We use JSON now. Much cleaner."

[–] phoenixz@lemmy.ca 1 points 3 hours ago

I'm sure XML has its uses

I'm also sure that for 99% of the applications out there, XML is overkill and over complicated, making things slower and more error prone

Use JSON, and you'll be fine. If you really really need XML then you probably already know why

[–] erebion@news.erebion.eu 2 points 8 hours ago

XMPP shows pretty well that XML can do things that cannot be done easily without it. XMPP wouldn't work nearly as well with JSON. Namespaces are a super power.
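
To make the namespace point concrete, here is a minimal Python sketch using only the standard library. The stanza is a simplified illustration, not a real XMPP capture, and the extension namespace `urn:example:mood` with its `mood` element is made up for the example.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical stanza: the core message lives in "jabber:client",
# while an extension adds its own element in a made-up namespace.
STANZA = """
<message xmlns="jabber:client" to="alice@example.org">
  <body>hello</body>
  <mood xmlns="urn:example:mood">happy</mood>
</message>
"""

NS = {"core": "jabber:client", "ext": "urn:example:mood"}

root = ET.fromstring(STANZA)

# Every element carries its namespace in its tag, so a client that does not
# know "urn:example:mood" can skip it without misreading the core fields.
body = root.find("core:body", NS).text
mood = root.find("ext:mood", NS).text
known = [child.tag for child in root if child.tag.startswith("{jabber:client}")]

print(body, mood, known)
```

Two extensions can both define an element called `mood` without clashing, because the namespace is part of the element's name. Plain JSON keys have no built-in equivalent, so you would need a naming convention instead.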

[–] AnitaAmandaHuginskis@lemmy.world 11 points 13 hours ago* (last edited 13 hours ago) (2 children)

I love XML, when it is properly utilized. Which, in most cases, it is not, unfortunately.

JSON > CSV though, I fucking hate CSV. I do not get the appeal. "It's easy to handle" -- NO, it is not. It's the "fuck you" of file "formats".

JSON is a reasonable middle ground, I'll give you that

[–] unique_hemp@discuss.tchncs.de 6 points 8 hours ago (1 children)

CSV >>> JSON when dealing with large tabular data:

  1. Can be parsed row by row
  2. Does not repeat column names on every row; JSON does, which makes it more complicated (so slower) to parse

1 can be solved with JSONL, but 2 is unavoidable.
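
Both points can be sketched in a few lines of Python. The sample data is made up, and the in-memory strings stand in for large files. Both readers yield one record at a time, but the JSONL text repeats the keys on every line.

```python
import csv
import io
import json

# Tiny in-memory stand-ins for large files (made-up sample data).
CSV_TEXT = "name,age\nAlice,30\nBob,25\n"
JSONL_TEXT = '{"name": "Alice", "age": 30}\n{"name": "Bob", "age": 25}\n'


def csv_rows(stream):
    """Yield one dict per CSV row; the header is read only once."""
    yield from csv.DictReader(stream)


def jsonl_rows(stream):
    """Yield one dict per JSON Lines record; keys repeat on every line."""
    for line in stream:
        if line.strip():
            yield json.loads(line)


# Neither loop needs the whole input in memory at once.
csv_total = sum(int(row["age"]) for row in csv_rows(io.StringIO(CSV_TEXT)))
jsonl_total = sum(row["age"] for row in jsonl_rows(io.StringIO(JSONL_TEXT)))

print(csv_total, jsonl_total, len(CSV_TEXT), len(JSONL_TEXT))
```

Note the `int(...)` on the CSV side: every CSV value arrives as a string, which is the price of the more compact format.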

[–] abruptly8951@lemmy.world 1 points 3 hours ago

Yes..but compression

And with CSV you just gotta pray that your parser parses the same as their writer..and that their writer was correctly implemented..and they set the settings correctly
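
A small sketch of that failure mode with Python's standard `csv` module. The semicolon delimiter and the sample row are assumptions picked for illustration. The writer and the first reader disagree on one setting, and nothing raises an error.

```python
import csv
import io

# Writer side: semicolon-delimited output (an assumed setting for this demo).
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=";")
writer.writerow(["name", "note"])
writer.writerow(["Alice", "likes XML, sometimes"])
data = buffer.getvalue()

# Reader side, wrong assumption: the default comma delimiter.
wrong = list(csv.reader(io.StringIO(data)))

# Reader side, matching the writer's settings.
right = list(csv.reader(io.StringIO(data), delimiter=";"))

print(wrong)  # header is one column, data row is split at the comma
print(right)  # two columns in every row
```

The mismatched reader silently returns rows of different widths, so the problem only shows up later, when something downstream indexes a column that is not there.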

[–] thingsiplay@lemmy.ml 3 points 9 hours ago

Biggest problem is, CSV is not a standardized format like JSON. For very simple cases it could be used as a database like format. But it depends on the parser and that's not ideal.

[–] thingsiplay@lemmy.ml 13 points 14 hours ago (1 children)

JSON is easier to parse, smaller and lighter on resources. And that is important on the web. And if you take into account all the features XML has, plus the entities, it gets big, slow and complicated. Most data does not need to be a self-descriptive document when transferred over the web. Fundamentally, these are two different kinds of languages: XML is a general markup language for writing documents, while JSON is a generalized data structure with support for various data types supported by programming languages.

[–] Kissaki@programming.dev 1 points 5 hours ago

while JSON is a generalized data structure with support for various data types supported by programming languages

Honestly, I find it surprising that you say “support for various data types supported by programming languages”. Data types are particularly weak in JSON when you go beyond JavaScript. Only number for numbers, no integer types, no date, no time, etc.
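
A quick way to see the missing date type, sketched with Python's standard `json` module. The record and its field names are made up for the example.

```python
import json
from datetime import date

# Hypothetical record with a date field.
record = {"count": 3, "ratio": 3.0, "when": date(2026, 1, 24)}

# JSON has no date type, so the standard encoder rejects the value.
try:
    json.dumps(record)
    date_supported = True
except TypeError:
    date_supported = False

# Common workaround: store the date as an ISO 8601 string by convention.
text = json.dumps({**record, "when": record["when"].isoformat()})
back = json.loads(text)

# The date comes back as a plain string; the reader has to know to convert it.
print(date_supported, type(back["when"]).__name__, back)
```

Python's parser happens to map `3` to `int` and `3.0` to `float`, but that is a choice made by this particular parser. The format itself only has "number".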

Regarding use, I see, at least to some degree, JSON outside of use for network transfer. For example, used for configuration files.

[–] calliope@retrolemmy.com 22 points 16 hours ago* (last edited 16 hours ago) (2 children)

There exists a peculiar amnesia in software engineering regarding XML

That’s for sure. But not in the way the author means.

There exists a pattern in software development where people who weren’t around when the debate was actually happening write another theory-based article rehashing old debates like they’re saying something new. Every ten years or so!

The amnesia is coming from inside the article.

[XML] was abandoned because JavaScript won. The browser won.

This comes across as remarkably naive to me. JavaScript and the browser didn’t “win” in this case.

JSON is just vastly simpler to read and reason about for every purpose other than configuration files that are being parsed by someone else. YAML is even more human-readable and easier to parse for most configuration uses… which is why people writing the configuration parser would rather use it than XML.

Libraries to parse XML were/are extremely complex, by definition. Schemas work great as long as you’re not constantly changing them! Which, unfortunately, happens a lot in projects that are earlier in development.

Switching to JSON for data reduced frustration during development by a massive amount. Since most development isn’t building on defined schemas, the supposed massive benefits of XML were nonexistent in practice.

Even for configuration, the amount of “boilerplate” in XML is atrocious and there are (slightly) better things to use. Twenty years ago, everyone used XML for configuration in Java, which was one of the popular backend languages (this author foolishly complains about Java too). I still dread the massive XML configuration files of past Java. YAML is confusing in other ways, but XML is awful to work on and parse with any regularity.

I used XML extensively back when everyone writing asynchronous web requests was debating between using the two (in “AJAX”, the X stands for XML).

Once people started using JSON for data, they never went back to XML.

Syntax highlighting only works in your editor, and even then it doesn’t help that much if you have a lot of data (like configuration files for large applications). Browsers could even display JSON with syntax highlighting in the browser, for obvious reasons — JSON is vastly simpler and easier to parse.

[–] tyler@programming.dev 2 points 6 hours ago

God, fucking camel and hibernate xml were the worst. And I was working with that not even 15 years ago!

[–] Kissaki@programming.dev 6 points 16 hours ago* (last edited 16 hours ago) (1 children)

Making XML schemas work was often a hassle. You have a schema ID, and sometimes you can open or load the schema through that URL. Other times, it serves only as an identifier and your tooling/IDE must support ID to local xsd file mappings that you configure.

Every time it didn't immediately work, you'd think: Man, why don't they publish the schema under that public URL.

[–] calliope@retrolemmy.com 3 points 16 hours ago* (last edited 16 hours ago)

This seriously sounds like a nightmare.

It’s giving me Eclipse IDE flashbacks where it seemed so complicated to configure I just hoped it didn’t break. There were a lot of those, actually.

[–] Ephera@lemmy.ml 22 points 18 hours ago (9 children)

IMHO one of the fundamental problems with XML for data serialization is illustrated in the article:

(person (name "Alice") (age 30))
[is serialized as]

<person>
  <name>Alice</name>
  <age>30</age>
</person>

Or with attributes:
<person name="Alice" age="30" />

The same data can be portrayed in two different ways. Whenever you serialize or deserialize data, you need to decide whether to read/write values from/to child nodes or attributes.

That's because XML is a markup language. It's great for typing up documents, e.g. to describe a user interface. It was not designed for taking programmatic data and serializing that out.
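
To illustrate with the two forms above, here is a minimal Python sketch using the standard library. The helper names are made up for the example. A reader written for one layout quietly finds nothing in the other.

```python
import xml.etree.ElementTree as ET

AS_CHILDREN = "<person><name>Alice</name><age>30</age></person>"
AS_ATTRIBUTES = '<person name="Alice" age="30" />'


def read_from_children(xml_text):
    """Read the fields from child elements."""
    root = ET.fromstring(xml_text)
    return {"name": root.findtext("name"), "age": root.findtext("age")}


def read_from_attributes(xml_text):
    """Read the fields from attributes on the root element."""
    root = ET.fromstring(xml_text)
    return {"name": root.get("name"), "age": root.get("age")}


print(read_from_children(AS_CHILDREN))
print(read_from_attributes(AS_ATTRIBUTES))
# Mismatch: the child-element reader sees no fields in the attribute form.
print(read_from_children(AS_ATTRIBUTES))
```

The mismatched call returns `None` for both fields instead of failing, so the writer and the reader have to agree on the layout somewhere outside the data itself.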

[–] Feyd@programming.dev 6 points 17 hours ago (4 children)

JSON also has arrays. In XML the practice to approximate arrays is to put the index as an attribute. It's incredibly gross.

[–] aivoton@sopuli.xyz 4 points 17 hours ago* (last edited 17 hours ago) (1 children)

The same data can be portrayed in two different ways.

And why is that an issue? The specification decides which one you use and what you need. Some things you treat as attributes and others as child elements.

JSON doesn't even have attributes.

[–] Ephera@lemmy.ml 8 points 16 hours ago (1 children)

Alright, I haven't really looked into XML specifications so far. But I also have to say that needing a specification to consistently serialize and deserialize data isn't great either.

And yes, JSON not having attributes is what I'm saying is a good thing, at least for most data serialization use-cases, since programming languages do not typically have such attributes on their data type fields either.

[–] aivoton@sopuli.xyz 2 points 14 hours ago (1 children)

I worded my answer a bit wrongly.

In XML <person><name>Alice</name><age>30</age></person> is different from <person name="Alice" age="30" /> and they will never (de)serialize to each other. The original example by the article's author with the person is somewhat misguided.

They do contain the same bits of data, but represent different things and when designing your dtd / xsd you have to decide when to use attributes and when to use child elements.

[–] Ephera@lemmy.ml 7 points 11 hours ago

Ah, well, as far as XML is concerned, yeah, these are very different things, but that's where the problem stems from. In your programming language, you don't have two variants. You just have (person (name "Alice") (age 30)). But then, because XML makes a difference between metadata and data, you have to decide whether "name" and "age" are one or the other.

And the point I wanted to make, which perhaps didn't come across as well, is that you have to write down that decision somewhere, so that when you deserialize in the future, you know whether to read these fields from attributes or from child nodes.
And that just makes your XML serialization code so much more complex than it is for JSON, generally speaking. As in, I can slap down JSON serialization in 2 lines of code and it generally does what I expect, in Rust in this case.
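
The comment is about Rust. As a rough analogue, here is the same contrast sketched in Python with the standard library only. The `Person` class and the `LAYOUT` table are made up for illustration. `LAYOUT` stands in for the "written down" decision about attributes versus child nodes.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass


@dataclass
class Person:
    name: str
    age: int


alice = Person("Alice", 30)

# JSON: the record maps directly onto an object, no extra decisions needed.
as_json = json.dumps(asdict(alice))

# XML: a hypothetical per-field table records the attribute-or-child choice
# so that a reader can later make the same choice.
LAYOUT = {"name": "attribute", "age": "child"}


def to_xml(person):
    root = ET.Element("person")
    for field, value in asdict(person).items():
        if LAYOUT[field] == "attribute":
            root.set(field, str(value))
        else:
            ET.SubElement(root, field).text = str(value)
    return ET.tostring(root, encoding="unicode")


print(as_json)
print(to_xml(alice))
```

Real XML binding libraries usually express this table as annotations or a schema instead of a dict, but the extra piece of information has to live somewhere either way.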

Granted, Rust kind of lends itself to being serialized as JSON, but well, I'm just not aware of languages that lend themselves to being serialized as XML. The language with the best XML support that I'm aware of, is Scala, where you can actually get XML literals into the language (these days with a library, but it used to be built-in until Scala 3, I believe): https://javadoc.io/doc/org.scala-lang.modules/scala-xml_2.13/latest/scala/xml/index.html
But even in Scala, you don't use a case class for XML, which is what you normally use for data records in the language, but rather you would take the values out of your case class and stick them into such an XML literal. Or I guess, you would use e.g. the Jackson XML serializer from Java. And yeah, the attribute vs. child node divide is the main reason why this intermediate step is necessary. Meanwhile, JSON has comparatively little logic built into the language/libraries and it's still a lot easier to write out: https://docs.scala-lang.org/toolkit/json-serialize.html

[–] Feyd@programming.dev 14 points 17 hours ago* (last edited 17 hours ago)

Honestly, anyone pining for all the features of XML probably didn't live through the time when XML was used for everything. It was actually a fucking nightmare to account for the existence of all those features because the fact they existed meant someone could use them and feed them into your system. They were also the source of a lot of security flaws.
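
One small, harmless illustration of such a feature is internal entities, sketched here with Python's standard library. The document and the entity name are made up. The point is that the sender defines the entity inside the document, and the receiving parser expands it without being asked.

```python
import xml.etree.ElementTree as ET

# The sender, not the receiver, defines the entity inside the document itself.
DOC = """<!DOCTYPE note [
  <!ENTITY greeting "hello ">
]>
<note>&greeting;&greeting;&greeting;</note>"""

root = ET.fromstring(DOC)

# The parser has already replaced every reference with the entity's text.
expanded = root.text
print(repr(expanded), len(DOC), len(expanded))
```

Here the expansion is tiny. Entities that reference other entities can multiply the output many times over, which is the idea behind the well-known "billion laughs" denial-of-service attack, and Python's own XML documentation warns against parsing untrusted input with these modules for that kind of reason.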

This article looks like it was written by someone who wasn't there, and they're calling the people telling them the truth liars because they think features they found on W3Schools look cool.

[–] epyon22@sh.itjust.works 16 points 19 hours ago (4 children)

The fact that JSON serializes easily to basic data structures simplifies code so much. Most use cases don't need fully semantic data storage, and you have to write the same amount of documentation about the data structures anyway. I'll give XML one thing though: schemas are nice and easy there, but have a high barrier to entry in JSON.

[–] Diplomjodler3@lemmy.world 16 points 20 hours ago (1 children)

It's true, though, that JSON is just better for most applications.

[–] MonkderVierte@lemmy.zip 7 points 18 hours ago (17 children)

Except config files. Please don't do config files in json.

[–] Colloidal@programming.dev 1 points 11 hours ago

ASN.1 crying in the corner.
