HTML and XHTML Tutorial

What is an HTML File?

HTML stands for HyperText Markup Language
An HTML file is a text file containing small markup tags
The markup tags tell the Web browser how to display the page
An HTML file must have an htm or html file extension
An HTML file can be created using a simple text editor

What Is XHTML?

XHTML stands for EXtensible HyperText Markup Language
XHTML is intended to replace HTML
XHTML is almost identical to HTML 4.01
XHTML is a stricter and cleaner version of HTML
XHTML is HTML defined as an XML application
XHTML is a W3C Recommendation

XHTML - Why?

XHTML is a combination of HTML and XML (EXtensible Markup Language).

XHTML consists of all the elements in HTML 4.01 combined with the syntax of XML.

The following HTML code will work fine if you view it in a browser, even if it does not follow the HTML rules:

<html>

<head>

<title>This is bad HTML</title>

<body>

<h1>Bad HTML

</body>

XML is a markup language where everything has to be marked up correctly, which results in "well-formed" documents.
XML was designed to describe data and HTML was designed to display data.
Today's market includes many browser technologies: some browsers run on desktop computers, and others run on mobile phones and handheld devices. The latter often lack the resources or processing power to interpret "bad" markup.
Therefore, combining the strengths of HTML and XML gave us a markup language that is useful now and in the future: XHTML.
XHTML pages can be read by all XML-enabled devices. While waiting for the rest of the world to upgrade to XML-capable browsers, XHTML lets you write "well-formed" documents now that work in all browsers and are backward compatible.
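
To see what "well-formed" means in practice, the bad HTML example above can be fed to an XML parser. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree module; the helper name is_well_formed and the FIXED sample are illustrative assumptions, not part of the original tutorial.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# The "bad HTML" sample: </head>, </h1> and </html> are missing.
BAD_HTML = """<html>
<head>
<title>This is bad HTML</title>
<body>
<h1>Bad HTML
</body>
"""

# A hypothetical repaired version with every element closed.
FIXED = """<html>
<head>
<title>This is good XHTML</title>
</head>
<body>
<h1>Good XHTML</h1>
</body>
</html>
"""

print(is_well_formed(BAD_HTML))  # False
print(is_well_formed(FIXED))     # True
```

A browser will usually render both versions, but an XML parser stops at the first error in the first one.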

How To Get Ready For XHTML

XHTML is not very different from the HTML 4.01 standard.
So, bringing your code up to the 4.01 standard is a good start. Our complete HTML 4.01 reference can help you with that.
In addition, you should start NOW to write your HTML code in lowercase letters, and NEVER skip ending tags (like </p>).
Happy coding!

The Most Important Differences:

XHTML elements must be properly nested
XHTML elements must always be closed
XHTML elements must be in lowercase
XHTML documents must have one root element

XHTML Elements Must Be Properly Nested

In HTML, some elements can be improperly nested within each other, like this:

<b><i>This text is bold and italic</b></i>

In XHTML, all elements must be properly nested within each other, like this:

<b><i>This text is bold and italic</i></b>

Note: A common mistake with nested lists is to forget that the inner list must be within <li> and </li> tags.
This is wrong:

<ul>

  <li>Coffee</li>

  <li>Tea

    <ul>

      <li>Black tea</li>

      <li>Green tea</li>

    </ul>

  <li>Milk</li>

</ul>

This is correct:

<ul>

  <li>Coffee</li>

  <li>Tea

    <ul>

      <li>Black tea</li>

      <li>Green tea</li>

    </ul>

  </li>

  <li>Milk</li>

</ul>

Notice that we have inserted a </li> tag after the </ul> tag in the "correct" code example.
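
The two list examples above can also be compared with an XML parser. This is a rough sketch in Python using xml.etree.ElementTree; the function name parent_tags_of_nested_lists is a made-up helper for illustration. It reports the parent element of every nested <ul>, and it fails outright on the "wrong" list because the unclosed <li> makes that markup not well-formed.

```python
import xml.etree.ElementTree as ET

WRONG_LIST = """<ul>
  <li>Coffee</li>
  <li>Tea
    <ul>
      <li>Black tea</li>
      <li>Green tea</li>
    </ul>
  <li>Milk</li>
</ul>"""

CORRECT_LIST = """<ul>
  <li>Coffee</li>
  <li>Tea
    <ul>
      <li>Black tea</li>
      <li>Green tea</li>
    </ul>
  </li>
  <li>Milk</li>
</ul>"""


def parent_tags_of_nested_lists(markup: str) -> list:
    """Parse the markup and return the parent tag name of each nested <ul>."""
    root = ET.fromstring(markup)  # raises ET.ParseError if not well-formed
    return [
        parent.tag
        for parent in root.iter()
        for child in parent
        if child.tag == "ul"
    ]


print(parent_tags_of_nested_lists(CORRECT_LIST))  # ['li']

try:
    parent_tags_of_nested_lists(WRONG_LIST)
except ET.ParseError as error:
    print("not well-formed:", error)
```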

XHTML Elements Must Always Be Closed

Non-empty elements must have an end tag.

This is wrong:

<p>This is a paragraph

<p>This is another paragraph

This is correct:

<p>This is a paragraph</p>

<p>This is another paragraph</p>

Empty Elements Must Also Be Closed

Empty elements must either have an end tag or the start tag must end with />.
This is wrong:

A break: <br>

A horizontal rule: <hr>

An image: <img src="happy.gif" alt="Happy face">

This is correct:

A break: <br />

A horizontal rule: <hr />

An image: <img src="happy.gif" alt="Happy face" />
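
The rewrite from the "wrong" to the "correct" form can be automated for simple pages. The sketch below is a regular-expression approach in Python, not a full HTML parser; the name close_empty_elements and the short EMPTY_ELEMENTS tuple are assumptions for illustration, and attribute values that contain a > character are not handled.

```python
import re

# Assumption: only these three empty elements are handled; extend as needed.
EMPTY_ELEMENTS = ("br", "hr", "img")

# Matches e.g. <br>, <hr>, <img ...> and the already-closed forms <br />, <br/>.
_EMPTY_TAG = re.compile(r"<(%s)\b([^<>]*?)\s*/?>" % "|".join(EMPTY_ELEMENTS))


def close_empty_elements(html: str) -> str:
    """Rewrite empty elements so that each start tag ends with ' />'."""
    return _EMPTY_TAG.sub(
        lambda match: "<%s%s />" % (match.group(1), match.group(2)), html
    )


print(close_empty_elements("A break: <br>"))
# A break: <br />
print(close_empty_elements('An image: <img src="happy.gif" alt="Happy face">'))
# An image: <img src="happy.gif" alt="Happy face" />
```

Running the function on markup that is already closed leaves it unchanged, so it is safe to apply more than once.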

XHTML Elements Must Be In Lower Case

The XHTML specification requires tag names and attribute names to be in lower case.
This is wrong:

<BODY>

<P>This is a paragraph</P>

</BODY>

This is correct:

<body>

<p>This is a paragraph</p>

</body>

XHTML Documents Must Have One Root Element

All XHTML elements must be nested within the <html> root element. All other elements can have child elements (sub-elements). Sub-elements must be in pairs and correctly nested within their parent element. The basic document structure is:

<html>

<head> ... </head>

<body> ... </body>

</html>

Mandatory XHTML Elements

All XHTML documents must have a DOCTYPE declaration. The html, head and body elements must be present, and the title must be present inside the head element.
This is a minimum XHTML document template:

<!DOCTYPE Doctype goes here>

<html xmlns="http://www.w3.org/1999/xhtml">

<head>

<title>Title goes here</title>

</head>

<body>

</body>

</html>

Note: The DOCTYPE declaration is not a part of the XHTML document itself. It is not an XHTML element, and it should not have a closing tag.
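
The list of mandatory parts can be turned into a small check. The following Python sketch uses xml.etree.ElementTree and a regular expression; the function name missing_mandatory_parts is hypothetical, and the sample fills the DOCTYPE placeholder with the XHTML 1.0 Strict declaration so that the parser accepts it. The check only looks for the presence of each part; it does not validate against the DTD.

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

TEMPLATE = """<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Title goes here</title>
</head>
<body>
</body>
</html>"""


def missing_mandatory_parts(document: str) -> list:
    """Return the names of mandatory parts that the document lacks."""
    missing = []
    if not re.match(r"\s*<!DOCTYPE\s", document):
        missing.append("DOCTYPE")
    root = ET.fromstring(document)
    ns = {"x": XHTML_NS}  # prefix used only inside the search paths below
    if root.tag != "{%s}html" % XHTML_NS:
        missing.append("html")
    if root.find("x:head", ns) is None:
        missing.append("head")
    if root.find("x:head/x:title", ns) is None:
        missing.append("title")
    if root.find("x:body", ns) is None:
        missing.append("body")
    return missing


print(missing_mandatory_parts(TEMPLATE))  # []
no_title = TEMPLATE.replace("<title>Title goes here</title>", "")
print(missing_mandatory_parts(no_title))  # ['title']
```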

The XHTML standard defines three Document Type Definitions.

The most common is the XHTML Transitional.

<!DOCTYPE> Is Mandatory

An XHTML document consists of three main parts:

the DOCTYPE
the Head
the Body

The basic document structure is:

<!DOCTYPE ...>

<html>

<head>

<title>... </title>

</head>

<body> ... </body>

</html>

The DOCTYPE declaration should always be the first line in an XHTML document.

An XHTML Example

This is a simple (minimal) XHTML document:

<!DOCTYPE html

PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html>

<head>

<title>simple document</title>

</head>

<body>

<p>a simple paragraph</p>

</body>

</html>

The DOCTYPE declaration defines the document type:

<!DOCTYPE html

PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

The rest of the document looks like HTML:

<html>

<head>

<title>simple document</title>

</head>

<body>

<p>a simple paragraph</p>

</body>

</html>

A public identifier is a document processing construct in SGML and XML.
In HTML and XML, a public identifier is meant to be universally unique within its application scope. It typically occurs in a Document Type Declaration.
A public identifier is meant to identify a document type that may span more than one application. A system identifier is meant for a document type that is used exclusively in one application.
In the following Document Type Declaration, the public identifier is -//W3C//DTD XHTML 1.0 Transitional//EN:

<!DOCTYPE
html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

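To make the two identifiers concrete, the declaration above can be split into its parts. This is a small Python sketch with a regular expression; parse_doctype is an illustrative helper name, and the pattern only covers the PUBLIC form shown in this tutorial.

```python
import re

_DOCTYPE = re.compile(
    r'<!DOCTYPE\s+(?P<root>\S+)\s+PUBLIC\s+'
    r'"(?P<public>[^"]*)"\s+"(?P<system>[^"]*)"\s*>'
)


def parse_doctype(declaration: str):
    """Return the root name, public identifier and system identifier, or None."""
    match = _DOCTYPE.search(declaration)
    return match.groupdict() if match else None


DECLARATION = """<!DOCTYPE
html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">"""

parts = parse_doctype(DECLARATION)
print(parts["root"])    # html
print(parts["public"])  # -//W3C//DTD XHTML 1.0 Transitional//EN
print(parts["system"])  # http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd
```

The first quoted string is the public identifier, and the second is the system identifier, here a URL pointing at the DTD file.
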
The 3 Document Type Definitions

A DTD specifies the syntax of a markup language.
DTDs are used by SGML applications, such as HTML, and by XML applications, such as XHTML, to specify rules that apply to the markup of documents of a particular type, including a set of element and entity declarations.
XHTML is specified in an XML document type definition, or 'DTD'.
An XHTML DTD describes, in precise, computer-readable language, the allowed syntax and grammar of XHTML markup.

There are currently 3 XHTML document types:

STRICT
TRANSITIONAL
FRAMESET

XHTML 1.0 specifies three XML document types that correspond to three DTDs: Strict, Transitional, and Frameset.

XHTML 1.0 Strict

<!DOCTYPE html

PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

Use this when you want really clean markup, free of presentational clutter. Use this together with Cascading Style Sheets.

XHTML 1.0 Transitional

<!DOCTYPE html

PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Use this when you need to take advantage of HTML's presentational features and when you want to support browsers that don't understand Cascading Style Sheets.

XHTML 1.0 Frameset

<!DOCTYPE html

PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

Use this when you want to use HTML Frames to partition the browser window into two or more frames.
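
The guidance for the three document types can be summarised as a lookup table plus a simple rule. The Python sketch below is only an illustration; the names DOCTYPES, choose_dtd and doctype_declaration are invented here, and the two boolean parameters are a simplification of the "use this when" advice above.

```python
# Public and system identifiers of the three XHTML 1.0 DTDs.
DOCTYPES = {
    "strict": (
        "-//W3C//DTD XHTML 1.0 Strict//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd",
    ),
    "transitional": (
        "-//W3C//DTD XHTML 1.0 Transitional//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd",
    ),
    "frameset": (
        "-//W3C//DTD XHTML 1.0 Frameset//EN",
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd",
    ),
}


def choose_dtd(uses_frames: bool, uses_presentational_markup: bool) -> str:
    """Pick a document type name following the advice in this section."""
    if uses_frames:
        return "frameset"
    if uses_presentational_markup:
        return "transitional"
    return "strict"


def doctype_declaration(name: str) -> str:
    """Build the DOCTYPE declaration for 'strict', 'transitional' or 'frameset'."""
    public_id, system_id = DOCTYPES[name]
    return '<!DOCTYPE html\nPUBLIC "%s"\n"%s">' % (public_id, system_id)


print(doctype_declaration(choose_dtd(uses_frames=False,
                                     uses_presentational_markup=True)))
```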

How W3Schools Was Converted To XHTML

W3Schools was converted from HTML to XHTML over the weekend of 18 and 19 December 1999, by Hege Refsnes and Ståle Refsnes.
To convert a Web site from HTML to XHTML, you should be familiar with the XHTML syntax rules of the previous chapters. The following steps were executed (in the order listed below):

A DOCTYPE Definition Was Added

The following DOCTYPE declaration was added as the first line of every page:

<!DOCTYPE html PUBLIC

"-//W3C//DTD XHTML 1.0 Transitional//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Note that we used the transitional DTD. We could have chosen the strict DTD, but found it a little too "strict", and a little too hard to conform to.

A Note About The DOCTYPE

Your pages must have a DOCTYPE declaration if you want them to validate as correct XHTML.
Be aware, however, that newer browsers (like Internet Explorer 6) might treat your document differently depending on the <!DOCTYPE> declaration. If the browser reads a document with a DOCTYPE, it might treat the document as "correct" and render it more strictly. Malformed XHTML might then break or display differently than it would without a DOCTYPE.

Lower Case Tag And Attribute Names

Since XHTML is case sensitive and only accepts lower case tag and attribute names, a general search and replace was run to replace all upper case tags with lower case tags. The same was done for attribute names. We have always tried to use lower case names on our web site, so the replace did not produce many real substitutions.
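
A search and replace of this kind might look like the following Python sketch. It is not the tool that was actually used for the conversion; lowercase_tag_names is a hypothetical helper that lowercases tag names only and leaves attribute names, text and the DOCTYPE untouched.

```python
import re

# A tag name directly after "<" or "</"; "<!DOCTYPE" is skipped because
# "!" is not a letter.
_TAG_NAME = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)")


def lowercase_tag_names(html: str) -> str:
    """Return the markup with every start and end tag name in lower case."""
    return _TAG_NAME.sub(
        lambda match: "<" + match.group(1) + match.group(2).lower(), html
    )


print(lowercase_tag_names("<BODY>\n<P>This is a paragraph</P>\n</BODY>"))
# <body>
# <p>This is a paragraph</p>
# </body>
```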

All Attributes Were Quoted

Since the W3C XHTML 1.0 Recommendation states that all attribute values must be quoted, every page on the site was checked to see that attribute values were properly quoted. This was a time-consuming job, and we will surely never again forget to put quotes around our attribute values.
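
A check like this can be partly automated. The sketch below is a rough regular-expression scan in Python, written for illustration only; the function name unquoted_attributes and the sample table tag are assumptions, and tags whose quoted values contain a > character are not handled.

```python
import re

_TAG = re.compile(r"<[A-Za-z][^<>]*>")         # a start tag with its attributes
_QUOTED = re.compile(r"\"[^\"]*\"|'[^']*'")    # a quoted attribute value
# An attribute name and "=" not followed by a quote.
_UNQUOTED = re.compile(r"([A-Za-z_:][\w:.-]*)\s*=\s*(?![\"'\s])")


def unquoted_attributes(html: str) -> list:
    """Return (tag, attribute name) pairs whose value is not quoted."""
    problems = []
    for tag in _TAG.findall(html):
        # Blank out quoted values so their contents cannot be mistaken
        # for attributes.
        stripped = _QUOTED.sub('""', tag)
        for match in _UNQUOTED.finditer(stripped):
            problems.append((tag, match.group(1)))
    return problems


print(unquoted_attributes("<table width=100%>"))
# [('<table width=100%>', 'width')]
print(unquoted_attributes('<table width="100%">'))
# []
```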

Empty Tags: <hr> , <br> and <img>

Unclosed empty tags are not allowed in XHTML. The <hr> and <br> tags should be replaced with <hr /> and <br />.
This produced a problem with Netscape, which misinterpreted the <br/> tag. We don't know why, but changing it to <br /> worked fine. After that discovery, a general search and replace was run to swap the tags.
A few other tags (like the <img> tag) were suffering from the same problem as above. We decided not to close the <img> tags with </img>, but with /> at the end of the tag. This was done manually.

The Web Site Was Validated

After that, all pages were validated against the official W3C DTD using the W3C XHTML Validator. A few more errors were found and edited manually. The most common error was missing </li> tags in lists.
Should we have used a converting tool? Well, we could have used TIDY.
Dave Raggett's HTML TIDY is a free utility for cleaning up HTML code. It also works great on the hard-to-read markup generated by specialized HTML editors and conversion tools, and it can help you identify where you need to pay further attention on making your pages more accessible to people with disabilities.
Why didn't we use Tidy? We knew about XHTML when we started writing this web site. We knew that we had to use lowercase tag names and that we had to quote our attributes. So when the time came to do the conversion, we simply had to test our pages against the W3C XHTML validator and correct the few mistakes. And we have learned a lot about writing "tidy" HTML code.

Validate XHTML With A DTD

An XHTML document is validated against a Document Type Definition (DTD). Before an XHTML file can be properly validated, a DOCTYPE declaration that references the correct DTD must be added as the first line of the file.
The Strict DTD includes elements and attributes that have not been deprecated or do not appear in framesets:

<!DOCTYPE html PUBLIC

"-//W3C//DTD XHTML 1.0 Strict//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

The Transitional DTD includes everything in the strict DTD plus deprecated elements and attributes:

<!DOCTYPE html PUBLIC

"-//W3C//DTD XHTML 1.0 Transitional//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

The Frameset DTD includes everything in the transitional DTD plus frames:

<!DOCTYPE html PUBLIC

"-//W3C//DTD XHTML 1.0 Frameset//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

This is a simple XHTML document:

<!DOCTYPE html

PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"

"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html>

<head>

<title>simple document</title>

</head>

<body>

<p>a simple paragraph</p>

</body>

</html>

Hungry Web Developer

12.03.2013

Learn Difference Between HTML & XHTML
