Thursday, March 12, 2015

Introduction to HTML


What is HTML?

HTML is a markup language. The word markup was used by editors who marked up manuscripts (usually with a blue pencil) when giving instructions for revisions. The term remains in use, though with a slightly different meaning. As it relates to browsers, a markup language is a language with a specific syntax that gives instructions to a web browser about how to display a page. HTML separates "content" (words, images, audio, video, and so on) from "presentation" (the definition of the type of content and the instructions for how that type of content should be displayed).
HTML uses a pre-defined set of elements to identify content types. Elements contain one or more "tags" that contain or express content. Tags are surrounded by angle brackets, and the "closing" tag (the one that indicates the end of the content) is prefixed by a forward slash.
The paragraph element consists of the start tag "<p>" and the closing tag "</p>". The following example shows a paragraph contained within the HTML paragraph element. Remember that the browser does not preserve runs of white space: several consecutive spaces or line breaks are displayed as a single space.
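The original example did not survive on this page, so the following is only a minimal sketch. It assumes the same sentence that is used for the emphasis example later in this section, and the extra spaces and the line break are added here purely to illustrate how white space is collapsed.

```html
<!-- A paragraph element: start tag, content, closing tag. -->
<!-- The run of spaces and the line break are each displayed as one space. -->
<p>You are    beginning to
   learn HTML.</p>
```

A browser displays the sentence as ordinary running text, with single spaces between the words.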


When this content is displayed in a web browser, only the text appears; the tags themselves are not shown.
The browser uses the tags as an indicator of how to display the content between them.
Elements that contain content can usually also contain other elements. For example, the emphasis element ("<em>") can be embedded within a paragraph element, to add emphasis to a word or phrase:
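The example that belongs here is also missing. A minimal sketch, assuming the emphasized word is "beginning" (the page only preserves the rendered sentence, not which word carried the emphasis):

```html
<!-- The <em> element is nested inside the <p> element. -->
<!-- It must be closed before the paragraph that contains it is closed. -->
<p>You are <em>beginning</em> to learn HTML.</p>
```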
 
When displayed, this looks like:
You are beginning to learn HTML. (Browsers typically show the emphasized text in italics.)
 

Elements — the basic building blocks

HTML consists of a set of elements. Elements define the semantic meaning of their content. Elements include everything between two matching element tags, including the tags themselves. For example, the "<p>" element indicates a paragraph; the "<img>" element indicates an image. See the HTML Elements page for a complete list.
Some elements have very precise meaning, as in "this is an image", "this is a heading", or "this is an ordered list." Others are less specific, such as "this is a section on the page" or "this is part of the text." Yet others are used for technical reasons, such as "this is identifying information for the page that should not be displayed." Regardless, in one way or another all HTML elements have a semantic value.
Most elements may contain other elements, forming a hierarchical structure. A very simple but complete web page looks like this:
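The page itself is missing from this copy. The sketch below is an assumption based on the elements named in the description that follows (<html>, <body>, and <p>); the paragraph text is made up for illustration, and the doctype and head section are left out to keep the nesting easy to see.

```html
<!-- <html> is the root; <body> is nested in it; <p> is nested in <body>. -->
<html>
  <body>
    <p>You are beginning to learn HTML.</p>
  </body>
</html>
```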
 
 
As you can see, the <html> element surrounds the rest of the document, and the <body> element surrounds the page content. This structure is often thought of as a tree with branches (in this case, the <body> and <p> elements) growing from the trunk (<html>). The browser represents this hierarchical structure as the DOM: the Document Object Model.


Tags

HTML documents are written in plain text. They can be written in any text editor that allows content to be saved as plain text, such as Notepad, Notepad++, or Sublime Text, but most HTML authors prefer to use a specialized editor that highlights syntax and shows the DOM. Tag names may be written in either uppercase or lowercase. However, the W3C (the global consortium that maintains the HTML standard) recommends using lowercase (and XHTML requires lowercase).
HTML attaches special meaning to anything that starts with the less-than sign ("<") and ends with the greater-than sign (">"). Such markup is called a tag. Make sure to close your tags: a few elements allow the end tag to be left out, but for others a forgotten end tag can produce unexpected results.
Here is a simple example:
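The example is missing from this copy of the page. Because the Attributes section says that the start tag in this example carries additional information, the sketch below assumes a paragraph with a lang attribute; the attribute and the text are illustrative choices, not the original author's.

```html
<!-- Start tag (with one attribute), content, then the matching end tag. -->
<p lang="en">This is text within a paragraph.</p>
```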


Attributes
The start tag may contain additional information, as in the preceding example. Such information is called an attribute. Attributes usually consist of two parts:
  • An attribute name
  • An attribute value
A few attributes can only have one value. They are Boolean attributes and may be shortened by specifying only the attribute name or by leaving the attribute value empty. Thus, the following three examples have the same meaning:
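The three examples are missing here. A minimal sketch, assuming the Boolean attribute required on an input element (any other Boolean attribute, such as disabled or checked, would work the same way):

```html
<!-- Full form: the value repeats the attribute name. -->
<input required="required">

<!-- Empty value. -->
<input required="">

<!-- Attribute name only. -->
<input required>
```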

Named character references
Named character references (often casually called entities) are used to print characters that have a special meaning in HTML. For example, HTML interprets the less-than and greater-than symbols as tag delimiters. When you want to display a greater-than symbol in the text, you can use a named character reference. There are four common named character references one must know:
  • &gt; denotes the greater-than sign (>)
  • &lt; denotes the less-than sign (<)
  • &amp; denotes the ampersand (&)
  • &quot; denotes the double quote (")
There are many more entities, but these four are the most important because they represent characters that have a special meaning in HTML.
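The four references above could be used as in the following sketch. The sentences and the title text are made up for illustration.

```html
<!-- Displayed as: 5 < 7 and 7 > 5 -->
<p>5 &lt; 7 and 7 &gt; 5</p>

<!-- Displayed as: Tom & Jerry -->
<p>Tom &amp; Jerry</p>

<!-- &quot; lets a double quote appear inside a double-quoted attribute value. -->
<p title="She said &quot;hello&quot;">Hover over this text to see the quoted tooltip.</p>
```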

A complete but small document

Putting this together, here is a tiny example of an HTML document. You can copy this code to a text editor, save it as myfirstdoc.html, and load it in a browser. Make sure you save it with the UTF-8 character encoding. Since this document uses no styling, it will look very plain, but it is only a small start.
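The document itself is missing from this copy of the page, so what follows is a sketch rather than the author's original. The title, heading, and paragraph text are assumptions; the structure uses only features covered above (nested elements, attributes, and a named character reference).

```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <!-- Tells the browser the file is saved as UTF-8. -->
    <meta charset="utf-8">
    <title>A tiny document</title>
  </head>
  <body>
    <h1>Main heading in my document</h1>
    <!-- An element nested in a paragraph, with an attribute on its start tag. -->
    <p>I am learning <abbr title="HyperText Markup Language">HTML</abbr> &amp; enjoying it.</p>
  </body>
</html>
```

The doctype line at the top asks the browser to use standards mode, and the head section holds information about the page that is not displayed in the page body.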

        

