
Lots of people use HTML, but not many really understand it. If you don’t know the difference between <br> and <br/>, this article is for you.
What came before the World Wide Web?
SGML (Standard Generalized Markup Language) is a format for writing text documents with additional information (tags). In 1986 it became an ISO standard. An SGML document looks something like this:
<tag1>
Some Text
<tag2 attribute otherattribute="value">
</tag1>
The standard defines the syntax and how to parse it. An SGML parser knows that there is a tag named tag1 and that it has 2 child nodes: a piece of text and another tag.
But it doesn’t define what tag1 means (semantics). It’s a general-purpose language that serves as a base for more specific applications such as HTML.
HTML
Published in 1991, HTML defines, for example, that <a> is a link and that its href attribute contains a URL.
Apart from HTML, though, SGML was never very popular, which is why few people even remember the name. Still, I think it is important to know the difference between syntax and semantics.
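To make the syntax/semantics split a bit more concrete, here is a minimal sketch. The `tree` object stands for what a generic parser could hand you (names, attributes, children), and the `semantics` table stands for what a specific application says those names mean. Both are hypothetical structures made up for this illustration, not the output of any real parser.

```javascript
// Syntax: what a generic parser can tell us, a tree of names, attributes and text.
const tree = {
  name: 'a',
  attributes: { href: 'https://example.com/' },
  children: ['Example'],
};

// Semantics: what one specific application (a made-up mini-HTML) says the names mean.
const semantics = {
  a: (node) => `link to ${node.attributes.href}`,
};

// Looks up the meaning of a node; names the application does not define stay unknown.
function describe(node) {
  const meaning = semantics[node.name];
  return meaning ? meaning(node) : `unknown element <${node.name}>`;
}

console.log(describe(tree)); // -> "link to https://example.com/"
console.log(describe({ name: 'tag1', attributes: {}, children: [] })); // -> "unknown element <tag1>"
```

The parser side never needs the `semantics` table, and that is exactly the part SGML leaves to applications like HTML.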
XML and XHTML
In 1998 a new standard was completed: Extensible Markup Language (XML). It is a simplified successor of SGML: like SGML, it cares only about the syntax of a document.
In 2000 the W3C (the organization that makes web standards) published XHTML 1.0. It was HTML 4.01 with its syntax changed to XML: an XHTML document is a 100% valid XML document.
What is the difference between HTML 4.01 and XHTML 1.0?
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<title>XHTML 1.0 Example</title>
</head>
<body>
<p>
This is an example
<br/>
of XHTML
</p>
<div id="empty"/>
</body>
</html>
The same document in HTML 4.01:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html lang="en">
<head>
<title>HTML 4.01 Example</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<p>
This is an example
<br>
of HTML
</p>
<div id="empty"></div>
</body>
</html>
They look similar, which is why many people don’t see the differences at first sight.
The first big change is that XML doesn’t tolerate syntax errors. When it hits an error, an HTML parser in a browser tries to recover and parse something anyway, but an XML parser should report an error.
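As a rough illustration of that strictness, here is a sketch of a checker that only verifies that tags are balanced and throws on the first problem. `checkWellFormed` is a hypothetical helper written for this article, not a real XML parser: it ignores comments, CDATA, entities, and attribute values that contain `>`.

```javascript
// Throws on the first unbalanced tag, the way an XML parser refuses a broken document.
function checkWellFormed(source) {
  const open = [];
  // Captures: optional leading "/", tag name, optional trailing "/" (as in <br/>).
  const tagPattern = /<(\/?)([A-Za-z][\w:-]*)[^>]*?(\/?)>/g;
  for (const [, closing, name, selfClosing] of source.matchAll(tagPattern)) {
    if (closing) {
      const expected = open.pop();
      if (expected === undefined) {
        throw new Error(`Unexpected </${name}>`);
      }
      if (expected !== name) {
        throw new Error(`Expected </${expected}> but found </${name}>`);
      }
    } else if (!selfClosing) {
      open.push(name);
    }
  }
  if (open.length > 0) {
    throw new Error(`Unclosed <${open.pop()}>`);
  }
  return true;
}

console.log(checkWellFormed('<p>one<br/>two</p>')); // -> true
try {
  checkWellFormed('<p>one<br>two</p>');
} catch (error) {
  console.log(error.message); // -> "Expected </br> but found </p>"
}
```

The second input is perfectly fine HTML, but under XML rules the unclosed <br> makes the whole document an error.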
Every XHTML document should have an XML declaration, which specifies the XML version and the encoding.
XML has namespaces, so you can combine many different formats in one document. For example, you can write XHTML inside SVG or RSS files.
Self-closing tags
HTML provides a mechanism for closing tags automatically. For example, <br> closes itself: it cannot have any content. Another example: a <p> cannot contain another <p>, so when you write
<p>first
<p>second
<p>last</p>
the parser knows that opening a new <p> should automatically close the previous one. But as you can see, this requires the parser to know the individual behaviour of each tag. In XHTML you must close all tags manually, so every XML parser can parse a document correctly without any knowledge of these tags. If something isn’t closed, you will get an error.
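The per-tag knowledge mentioned above can be sketched like this. The two sets are short, incomplete lists picked for the illustration, and `outlineHtml` is a hypothetical helper: it only prints the nesting of elements and is far simpler than the real HTML parsing algorithm (which, for instance, also closes a <p> that is not on top of the stack).

```javascript
// Elements that never have content or an end tag in HTML (partial list).
const VOID_ELEMENTS = new Set(['br', 'hr', 'img', 'input', 'meta', 'link']);
// Start tags that implicitly close an open <p> (partial list, simplified rule).
const CLOSES_P = new Set(['p', 'div', 'ul', 'ol', 'table', 'h1', 'h2']);

// Returns one line per element, indented by its nesting depth.
function outlineHtml(source) {
  const stack = [];
  const lines = [];
  const tagPattern = /<(\/?)([a-z][a-z0-9]*)[^>]*>/gi;
  for (const [, closing, rawName] of source.matchAll(tagPattern)) {
    const name = rawName.toLowerCase();
    if (closing) {
      // Pop back to the matching open element; ignore stray end tags.
      const index = stack.lastIndexOf(name);
      if (index !== -1) stack.length = index;
      continue;
    }
    // A new <p> (or <div>, ...) closes the <p> that is currently open.
    if (CLOSES_P.has(name) && stack[stack.length - 1] === 'p') stack.pop();
    lines.push('  '.repeat(stack.length) + name);
    if (!VOID_ELEMENTS.has(name)) stack.push(name);
  }
  return lines;
}

console.log(outlineHtml('<p>first\n<p>second\n<p>last</p>'));
// -> [ 'p', 'p', 'p' ]
```

All three paragraphs end up as siblings, even though only the last one has an explicit end tag. Without the two lookup tables the function could not know that.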
To keep the code readable, XML has the short syntax <br/>, which is equivalent to <br></br>. But it is important to know that in HTML this / is ignored (as I said before, HTML tries to ignore syntax errors) and doesn’t do anything.
If you write:
<body>
<div/>
<p>Some text</p>
</body>
inside an XHTML document, the browser will parse it as
<body>
<div></div>
<p>Some text</p>
</body>
but inside an HTML document it will be
<body>
<div>
<p>Some text</p>
</div>
</body>
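The two results above can be reproduced with a small sketch that treats the trailing / differently depending on the mode. `outlineByMode` and its `mode` parameter are hypothetical names for this illustration. In a real browser the choice is made by the parser the document is handed to, not by a flag, and the real algorithms handle many more cases.

```javascript
// Elements that never have content or an end tag in HTML (partial list).
const VOID_TAGS = new Set(['br', 'hr', 'img', 'input', 'meta', 'link']);

// Returns one line per element, indented by its nesting depth.
function outlineByMode(source, mode) {
  const stack = [];
  const lines = [];
  // Captures: optional leading "/", tag name, optional trailing "/".
  const tagPattern = /<(\/?)([a-z][a-z0-9]*)[^>]*?(\/?)>/gi;
  for (const [, closing, rawName, slash] of source.matchAll(tagPattern)) {
    const name = rawName.toLowerCase();
    if (closing) {
      const index = stack.lastIndexOf(name);
      if (index !== -1) stack.length = index;
      continue;
    }
    lines.push('  '.repeat(stack.length) + name);
    // XML: the trailing "/" decides. HTML: only the built-in list of void elements does.
    const closesItself = mode === 'xml' ? slash === '/' : VOID_TAGS.has(name);
    if (!closesItself) stack.push(name);
  }
  return lines;
}

const source = '<body><div/><p>Some text</p></body>';
console.log(outlineByMode(source, 'xml'));  // -> [ 'body', '  div', '  p' ]
console.log(outlineByMode(source, 'html')); // -> [ 'body', '  div', '    p' ]
```

In 'html' mode the / after div is simply skipped, so the <p> ends up nested inside the <div>.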
Abandoned XHTML 2.0
At the beginning, XHTML 1.0 and 1.1 had 2 big problems. First: XHTML wasn’t fully supported by Internet Explorer until version 9 (released in 2011, 11 years after XHTML was standardized). And remember that even then users didn’t update their browsers too quickly. So if you wanted to write XHTML, you still needed to serve it as HTML, which meant the browser parsed it as HTML and you got none of the advantages of using XHTML.
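Whether a browser uses its XML parser or its HTML parser depends on the content type the server sends: application/xhtml+xml or text/html. A common workaround was to look at the request’s Accept header and fall back to text/html. The sketch below is a simplified version of that idea: `pickContentType` is a hypothetical helper, and it ignores the q-values a real implementation should honour.

```javascript
// Chooses the content type for an XHTML page based on what the browser says it accepts.
function pickContentType(acceptHeader) {
  const acceptsXhtml = /application\/xhtml\+xml/i.test(acceptHeader || '');
  return acceptsXhtml ? 'application/xhtml+xml' : 'text/html';
}

// A browser that advertises XHTML support gets the XML parser...
console.log(pickContentType('text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'));
// -> "application/xhtml+xml"

// ...while one that does not mention it gets plain HTML.
console.log(pickContentType('image/gif, image/jpeg, */*'));
// -> "text/html"
```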
The second problem was that programmers didn’t like the idea that any syntax error would make the whole web page or application stop working and show the user a big error message.

The W3C proposed XHTML 2.0, which was designed to break backwards compatibility. The idea didn’t win the approval of web developers, and the project was abandoned (the W3C let the working group’s charter expire at the end of 2009). But that isn’t the end of XHTML.
What we have now: HTML5
Instead of XHTML 2.0, the W3C moved to a new standard: HTML5. One of the main assumptions was that HTML and XHTML would be developed together as part of the same standard. You can choose your syntax, but everything else is identical: once the browser has parsed the code, it is treated identically.
The W3C also decided that HTML would no longer be fully compatible with the SGML specification, because no one really cared. HTML5 also introduced the use of MathML (at the time of writing still not supported by Chrome 😡) and SVG tags inside an HTML file.
It’s important to know that even if you don’t write namespaces in HTML, the browser will add them automatically while parsing. You can test this with some simple JS code:
var div = document.createElement('div');
// The HTML parser assigns a namespace to every element it creates.
div.innerHTML = '<p></p><svg></svg>';
console.log(div.children[0].namespaceURI); // -> "http://www.w3.org/1999/xhtml"
console.log(div.children[1].namespaceURI); // -> "http://www.w3.org/2000/svg"
It is also important to know that if you want to add SVG elements to HTML from JS, you need to specify the namespace manually. Otherwise the browser will not treat the element as SVG, but as an unknown HTML element.
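// In an HTML document, createElement always uses the XHTML namespace.
var badSvg = document.createElement("svg");
console.log(badSvg.namespaceURI); // -> "http://www.w3.org/1999/xhtml"
// createElementNS lets you pass the SVG namespace explicitly.
var goodSvg = document.createElementNS("http://www.w3.org/2000/svg", "svg");
console.log(goodSvg.namespaceURI); // -> "http://www.w3.org/2000/svg"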