It is time to finally retire HTML

Building a new web — Part 1

Bence Meszaros
12 min read · May 8, 2021

Misunderstood semantics

Probably the most annoying thing about HTML and the web is the massively overinflated importance of semantics. Die-hard developers are waging an endless holy war against the ignorant lowlifes who reject the concept, in a quest to convert everyone to their cause. They believe that semantics is nothing short of a miracle, and they go to great lengths to convince the world that it is the best and only way to add structure, provide accessibility and improve ranking in search results, maybe even cure cancer. They will tell you that a website is not just for humans. In fact, most of the “visitors” on your site will not be humans but search engines, web crawlers, screen readers and many other user agents that do not care how your site looks; they only want to understand the meaning of your content so they can process it for their own ends.

Little do they know that this is the exact definition of data labeling, the tedious and inconvenient task that machine learning models need in order to learn from raw data. Flip it however you like, but the reality is that semantics is simply a glorified data labeling process that might or might not be helpful to actual end users. You will never know, because you have no insight into, and zero control over, how proprietary search engines, screen readers and other machine learning models actually work. You are essentially doing the most valuable and ugliest legwork for big corporations for free, in the hope that they might throw you a bone in exchange. And by mindlessly advocating the use of HTML in “the correct way” you are even promoting the standardization of this free labor, providing even more value to these greedy intermediaries.

HTML is a tool to label text.

We keep telling the lie over and over again that perfect semantics is paramount, when in fact it is just a handy tool for shifting responsibility and workload from the presenter to the developer. How on earth could I make sure, for example, that my website is responsive enough or accessible to as many people as possible if everything I build is fully deconstructed en route to the user by an intermediary and reconstructed in a way that I have no control over? Either it is my responsibility to make sure my site is accessible, in which case the user agents have no right to alter it in any way (just like an image, audio or video file), or the user agents can twist and turn my design however they see fit, in which case it is their responsibility and their job to interpret my data and make it accessible, not mine.

Perfect semantics does not guarantee anything, and obsessing over it shifts the focus away from human-to-human interaction in favor of feeding machines with annotated raw data. Make no mistake, labeling is a useful addition and an important piece of the puzzle, but to think that it is the most important one, or that HTML is the best way to do it, is ludicrous.

HTML layout without HTML

Semantics is also misunderstood when it comes to building layouts. The second tenet of the holy web is that anything related to presentation should be purged from any HTML document. This would seem reasonable if, in turn, anything related to semantics could be purged from the design too. After all, we keep telling ourselves that the two, semantics and design, are mutually exclusive, when in fact it is impossible to attach even a single CSS property to a piece of text, an image or basically any chunk of raw data without wrapping it in an HTML tag first. Exactly how are we supposed to strip design from semantics when there is no mechanism whatsoever for interacting with the data for design purposes without involving semantics?

Because we cannot style anything in our document without HTML, we also cannot build layouts without HTML. The problem is, we can only annotate something that already exists, not the other way around. Once we have our layout we can label its parts but building a layout from annotation labels doesn’t make any sense.

Adding HTML to a layout makes sense, adding layout to HTML does not.

Flawed flow

Then there are the box model and the document flow, two of the biggest design flaws of the front end. When HTML was created around 1990, it was designed to display textual research data on screen, so the logic was seemingly obvious: letters fill the screen from left to right and lines from top to bottom. The idea was to categorize everything on a website as either a letter-like or a line-like box and make it behave accordingly. The only problem with this approach is that even characters barely fit into this system, not to mention anything else. Is an image letter-like or line-like? Or a button? And what about tables? Or table rows and cells? Or columns? The simple concept suddenly became insanely complicated, and stupid names like inline box and block box didn’t help much either.
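
To make these questions concrete, here is a small sketch listing the default display value that browser default stylesheets commonly give to a handful of elements. The lookup is written by hand for illustration and is not read from any browser; the names defaultDisplay and displayKinds are made up for this example.

```javascript
// Hand-written summary of typical default display values (illustration only).
const defaultDisplay = {
  span: "inline",
  p: "block",
  img: "inline",
  button: "inline-block",
  table: "table",
  tr: "table-row",
  td: "table-cell",
  col: "table-column",
  li: "list-item",
};

// Collect the distinct kinds of boxes these nine elements produce.
const displayKinds = [...new Set(Object.values(defaultDisplay))];

console.log(displayKinds.length);     // 8
console.log(displayKinds.join(", "));
```

Nine everyday elements already need eight different kinds of boxes, which is a long way from the original two.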

Over the years, every iteration of the front-end languages, every new feature, every framework and every layout trick has been about breaking out of this restraint. This was the very reason people at some point started to build websites entirely from tables. We laugh at them now, but is it really better now that we have incomprehensible concepts like floats, pseudo-elements or hybrids like inline-block that go against common sense and the very foundation of the language itself? Is it really a good thing that we can now change an inline box to a block box or vice versa with unforeseeable side effects, while it is still ambiguous which element is which by default and why? And if everything we actually want breaks the document flow, then why do we still keep it?

We had only one rule, but every single real use case begs for an exception. When the WHATWG and the W3C developed the HTML5 specification, they had the chance to fundamentally change the web for the better, but instead they decided to keep the old logic and just sprinkle some shiny new stuff on top of it. HTML5 was a big step, I don’t deny that, but the web is still a complete mess thanks to the inherent problems of HTML.

Content vs Container

Humans intuitively organize stuff. We put documents into folders or books on shelves to categorize them. The documents or the books are the content, and the folders or shelves are the containers. This is simple. But if the amount of stuff keeps growing, we go one step further and begin to organize our containers into bigger containers. This step seems innocent, but unfortunately this is where the organizing logic breaks. If we put folders inside folders, the distinction between content and container becomes ambiguous. If we open a folder now, we might find documents, folders or both. Folders can lead to further folders or documents, and even if we forbid mixing folders and documents inside any folder, the hierarchy itself causes the whole system to become incomprehensible (you can see this on any personal computer or corporate file server).

And this makes HTML fail as well. In HTML everything is a box and, as in the previous example, it is perfectly valid to nest HTML boxes inside one another. This gets even worse with ambiguous names like “node” or “element” (a node can be content as well as container, while an element can only be a container), but HTML goes one step further in the wrong direction. In HTML there are boxes that have only “one side”, commonly called self-closing elements (void elements in the specification). The problem with this is that it is absolutely counterintuitive. A regular HTML element has an opening tag and a closing tag, so it is obvious that the content goes between the two. But how can you put anything inside an image element, for example, which only has an opening tag?

The kinda-sorta content

In IT there is a wonderful thing called metadata. Metadata is data that we pretend is not data. It sounds stupid and it is. But if we forget about this distinction and treat every piece of metadata as regular data, we can understand what’s going on with self-closing HTML elements. The trick is that there are actually two ways to add data to any HTML container. We can put it between the opening and closing tags, or we can put it inside the opening tag itself. It is kind of like writing your name on the cover of a notebook instead of on the pages inside. We just call everything on the cover metadata and everything inside regular data. Similarly, anything you put inside the opening tag is considered an attribute, and everything that goes between the two tags is considered a node. This distinction doesn’t make any sense in the digital world, especially in the case of self-closing HTML tags, but hey, nobody has ever complained.

This mess becomes even more obvious if you deconstruct any HTML element using simple key-value pairs. This is common when you need to parse an HTML document into the DOM or convert it into a JSON file. An image element could look like this:

type: "img",
src: "images/image.jpeg",
width: 200,
height: 100

And a paragraph element like this:

type: "p",
font-family: "Helvetica",
color: "black",
content: "This is a paragraph."

If you look at these examples you can immediately see how stupid HTML is to overcomplicate something this simple. There is no need for metadata, attributes, properties, opening and closing tags or whatever fancy words or concepts we can come up with, we just need simple objects with key-value information. The funny thing is that CSS and JavaScript are already using this form to store everything but not HTML. HTML is like an old boss who enforces a different rule for every single situation without paying much attention to the big picture and constantly refuses to retire.
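
To show how directly such key-value objects map onto markup, here is a minimal sketch that turns one into an HTML string. The helper names toHtml and escapeHtml are invented for this illustration, and the paragraph's font and color are folded into a single style attribute, which is an assumption made so that the output is valid HTML.

```javascript
// Void elements per the HTML standard: no closing tag, no content.
const VOID_ELEMENTS = new Set([
  "area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "source", "track", "wbr",
]);

// Escape characters that would otherwise be read as markup.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: "type" becomes the tag name, "content" becomes the
// text between the tags, and every other key becomes an attribute.
function toHtml({ type, content = "", ...attributes }) {
  const attrs = Object.entries(attributes)
    .map(([key, value]) => ` ${key}="${escapeHtml(value)}"`)
    .join("");
  if (VOID_ELEMENTS.has(type)) {
    return `<${type}${attrs}>`; // nothing can go "inside" a void element
  }
  return `<${type}${attrs}>${escapeHtml(content)}</${type}>`;
}

const image = { type: "img", src: "images/image.jpeg", width: 200, height: 100 };
const paragraph = {
  type: "p",
  style: "font-family: Helvetica; color: black",
  content: "This is a paragraph.",
};

console.log(toHtml(image));
// <img src="images/image.jpeg" width="200" height="100">
console.log(toHtml(paragraph));
// <p style="font-family: Helvetica; color: black">This is a paragraph.</p>
```

Note that the only special cases in the helper exist to reproduce HTML's own distinctions: attribute versus content, and void versus regular element.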

A gift that keeps on giving

HTML is also full of inconsistencies. Even if you just scratch the surface, you will find different concepts clashing with each other, ambiguous situations without a simple explanation and undeveloped ideas begging for someone to clean them up. My favorite one is the way HTML handles external files. Common sense dictates that linking files to an HTML document should be handled in a similar way regardless of their types, but lo and behold, there is not a single concept that covers more than one file type. Linking scripts, stylesheets, images or other files all require developers to memorize their specific quirks, otherwise they won’t work. Some of them are self-closing, others are not; some of them use “href” as a locator, others use “src” or “data”; some of them can be put inside the head, others only in the body, and still others can be found in both. To see this mess, take a look at this simple table with the most common types.

Table: Inconsistent attributes in HTML
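
The inconsistency is easy to see once the rules are written down as data. The sketch below is a hand-made summary of how a few common external files are attached in standard HTML. It is an illustration rather than a complete or authoritative list, the placement column reflects usual practice only, and the names externalFiles and locators are invented for this example.

```javascript
// Hand-written summary: element, locator attribute, whether the element is
// void (no closing tag), and where it usually appears in the document.
const externalFiles = [
  { file: "script",          element: "script", locator: "src",  isVoid: false, usualPlace: "head or body" },
  { file: "stylesheet",      element: "link",   locator: "href", isVoid: true,  usualPlace: "head" },
  { file: "image",           element: "img",    locator: "src",  isVoid: true,  usualPlace: "body" },
  { file: "embedded object", element: "object", locator: "data", isVoid: false, usualPlace: "body" },
  { file: "nested page",     element: "iframe", locator: "src",  isVoid: false, usualPlace: "body" },
];

// Different attribute names that all do the same job of pointing at a file.
const locators = [...new Set(externalFiles.map((entry) => entry.locator))];

console.log(locators.join(", ")); // src, href, data
```

Five file types, three locator names, and no way to predict from the file type alone whether a closing tag is required.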

But the most absurd part of it all is that fonts cannot even be loaded directly from HTML; there is no mechanism for that at all. You need to rely on the @font-face rule in CSS, another wildly different concept, to achieve that. I would say that HTML deliberately penalizes everything beyond text, but fonts are used exclusively for text. What is wrong with having legible text and support for special characters? It is one thing to keep hating on anything non-text related, but it is another to deliberately obstruct accessibility.

From one dimension to four

Text is always one-dimensional, just like a stream of bits. This is why we can read at all. Characters have to follow each other in a predetermined order, one after the other, in a single direction. But using text purely in its one-dimensional form is pretty impractical, and thus we tend to wrap it inside a container (a sheet of paper or a software window). This wrapping makes the text look two-dimensional, but it is conceptually still one-dimensional.

And this is a big problem with HTML. Every concept in HTML is, to its very core, designed in one dimension. Sure, the source file is always a text file, but thinking that a rendered website is also a one-dimensional creature wrapped inside the browser window is pretty limiting, to put it mildly.

Think of this process like threading a needle. You have to push a single thread through a tiny hole in a way that makes the thread rearrange itself into a two-dimensional image on the other side. Sounds kind of impossible, doesn’t it?

And it gets even worse with 3D documents. And I am not talking about building CAD models with HTML, I am simply talking about overlapping elements, layered compositions, an intuitive freedom in the third dimension, something that even the overhyped CSS grid is unable to achieve.

But there is even more. Animation introduces a fourth dimension: time. The problem here is not that it is impossible to achieve animations with HTML and CSS, it is the fact that now we are trying to describe complex spacetime components using only plain text in a text editor. This is the visual equivalent of writing machine code instead of JavaScript.

We have to realize that no matter how hard we try, no matter what exotic concepts we come up with in CSS, CSS will never be able to patch this tremendous hole in HTML. It’s just mathematically impossible.

Whitespace

HTML is already ambiguous when it comes to content versus container, but it is also ambiguous when it comes to source text versus display text. Even though any high-level source file is a human readable text file, this text has two distinct meanings: it contains instructions for the machine to execute and it contains strings to display on screen. This can quickly become confusing and that is why we need to clearly separate the two in any coding language.

JavaScript, for example, uses special characters to separate instructions from display text:

let text = "Hello \nWorld!";\n

In this example, everything outside the double quotes is machine instruction and everything inside the double quotes is a string intended for humans to read. Simple and easy to understand.
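
A quick way to check this claim is to write the same statement twice with different formatting. This is only a small sketch, and the variable names compact and spreadOut are made up for the example.

```javascript
// Same value twice: once compact, once with extra formatting whitespace
// outside the quotes. Only the characters inside the quotes end up in the string.
const compact = "Hello \nWorld!";
const spreadOut =
        "Hello \nWorld!"     ;

console.log(compact === spreadOut);       // true
console.log(compact.split("\n").length);  // 2
```

The whitespace outside the quotes is free formatting, while the line break inside the quotes is always part of the text.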

HTML (and markup languages in general), however, deliberately chose not to use separators, which leads to severe ambiguity:

<body>\n
    <div>Hello \n
World\n
    </div>\n
</body>

In this example we added a line break after each line, four spaces before the two div tags and one space between “Hello” and “World”. Now, adding whitespace to format our source file wouldn’t be a problem on its own, but because we don’t know where an instruction ends and a display part begins, the machine interpreting this text file doesn’t know what to do with any of our whitespace characters.
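
What browsers do by default is, roughly, to squash all of this whitespace together. The sketch below imitates that behavior for this one snippet. It is a deliberately naive approximation of the default whitespace handling, not the actual algorithm browsers implement, and the function name approximateRenderedText is invented for this example.

```javascript
// The source text, with indentation and line breaks used as formatting.
const source = "<body>\n    <div>Hello \nWorld\n    </div>\n</body>";

// Rough approximation for a single block of text: drop the tags, collapse
// every run of spaces, tabs and line breaks into one space, trim the ends.
function approximateRenderedText(html) {
  return html
    .replace(/<[^>]*>/g, "")        // remove tags (naive, not a real parser)
    .replace(/[ \t\r\n\f]+/g, " ")  // collapse whitespace runs
    .trim();
}

console.log(JSON.stringify(approximateRenderedText(source))); // "Hello World"
```

The line break the author typed between “Hello” and “World” is gone, and so is any way to tell which whitespace was meant as formatting and which as content.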

HTML is not a programming language, but that doesn’t mean it is not a coding language. As such, it has to adhere to the same logical rules as other coding languages. One way or another it has to separate its instructions from the visible text, but because HTML refuses to use simple separator characters, it has to use a far more complicated process that creates numerous new issues as a ripple effect. This separation process is so overcomplicated that it deserves its own article.

Conclusion

HTML, and in general any markup language, is arguably a rudimentary choice for describing anything beyond a block of text. As its name suggests, marking up text is more of a makeshift process than a powerful and reasonable way to build multidimensional and interactive documents, not to mention standalone apps. On top of this, HTML is inconsistent and overcomplicated, it handles whitespace and even its own hierarchy terribly, and it obstructs CSS and JavaScript as well.

Even though HTML is widely popular and considered the king of the web, its status is about to change and for the better. With the ever-growing power and popularity of JavaScript and WebAssembly, with millions of iOS and Android apps and with billions of mobile users, it is only a matter of time before someone finds a way to replace it everywhere or, if all of our browsers are being held hostage to this language, just beat it into submission until then.

Either way, HTML will be retired and I am about to speed up this process.
