What is the World Wide Web?
The World Wide Web (the web, for short) is a network of computers able to exchange text, graphics, and multimedia information via the internet. By sitting at a computer that is attached to the web, using either a dial-up phone line or a much faster broadband connection (Ethernet, cable, or DSL), you can visit web-connected computers next door, at a nearby university, or halfway around the world. And you can take full advantage of the resources these computers make available, including text, graphics, videos, sounds, and animation. Think of the web as the multimedia version of the internet, and you’ll be right on the mark.
The World Wide Web has come a long way from its humble beginnings. Most internet historians recognize Gopher as the precursor to the web. Gopher was a revolutionary search tool that allowed the user to search hierarchical archives of textual documents. It enabled internet users to easily search, retrieve, and share information. Today’s World Wide Web is capable of delivering information via any number of media: text, audio, video. The content can be dynamic and even interactive. However, the web is not a panacea. The standards that underlie the web, HTML in particular, are implemented in different ways by different browsers. What works on one platform may not work the same, if at all, on the next. Newly web-enabled devices (PDAs, cell phones, appliances, and so on) are still searching for a suitable form of HTML to standardize on.
How Does the Web Work?
The computers that make all these web pages available are called web servers. On any computer that’s connected to the web, you can run an application called a web browser. Technically, a web browser is called a web client, that is, a program that’s able to contact a web server and request information.
When the web server receives the request, it looks for the requested information within its file system and sends it out via the internet. Web clients and web servers all speak a common “language,” called Hypertext Transfer Protocol (HTTP). (HTTP isn’t really a language like the ones people speak. It’s a set of rules or procedures, called a protocol, that enables computers to exchange information over the web.) Regardless of where these computers reside (China, Norway, or Austin, Texas), they can communicate with each other through HTTP.
The following steps illustrate how HTTP works (see figure):
- Most web pages contain hyperlinks, which are specially formatted words or phrases that enable you to access another page on the web. Although the hyperlink usually doesn’t make the address of this page visible, it contains all the information needed for your computer to request a web page from another computer.
- When you click the hyperlink, your computer sends a message called an HTTP request. The message says, in effect, “Please send me the web page that I want.”
- The web server receives the request and looks within its stored files for the web page you requested. When it finds the web page, it sends it to your computer, and your web browser displays it. If the page isn’t found, you see an error message, which probably includes the HTTP code for this error: 404 “Not Found.”
Figure: The client requests the page. Then the server evaluates the request and serves the page or an error message.
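The request and response steps above can be sketched in a few lines of Python. This is only an illustration, not how a real server is written: the STORED_PAGES dictionary, the host name www.example.com, and the function names are all assumptions made for the example, and a real HTTP response also carries headers (such as the content type) that are left out here.

```python
from http import HTTPStatus

# Hypothetical stand-in for the files stored on a web server.
STORED_PAGES = {"/index.html": "<html><body>Welcome</body></html>"}


def build_request(path, host="www.example.com"):
    """Return the text of a minimal HTTP/1.1 GET request for one page."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def handle_request(request_text, pages=STORED_PAGES):
    """Play the server's role: find the requested page and build a response."""
    request_line = request_text.split("\r\n", 1)[0]
    _method, path, _version = request_line.split(" ")
    if path in pages:
        status, body = HTTPStatus.OK, pages[path]
    else:
        status, body = HTTPStatus.NOT_FOUND, "Page not found"
    # The status line carries the numeric code and its standard phrase.
    status_line = f"HTTP/1.1 {status.value} {status.phrase}"
    return f"{status_line}\r\n\r\n{body}"


if __name__ == "__main__":
    for path in ("/index.html", "/missing.html"):
        response = handle_request(build_request(path))
        print(path, "->", response.splitlines()[0])
```

Requesting /index.html yields a status line with code 200, while the page that is not in the stored files yields the 404 “Not Found” status described above.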
What Is Hypertext?
You probably noticed the word “hypertext” in the spelled-out version of HTTP, Hypertext Transfer Protocol. Originated by computing pioneer Theodor Nelson, the term “hypertext” doesn’t mean “text that can’t sit still,” although some web authors do use a much-despised HTML code that makes the text blink on-screen. Instead, the term is an analogy to a time-honored (but physically impossible) science fiction concept, the hyperspace jump, which enables a starship to go immediately from one star system to another. Hypertext is a type of text that contains hyperlinks (or just links for short), which enable the reader to jump from one hypertext page to another.
You may also hear the word hypermedia. A hypermedia system works just like hypertext, except that it includes graphics, sounds, videos, and animation as well as text. In contrast to ordinary text, hypertext gives readers the ability to choose their own path through the material that interests them. A book is designed to be read in sequence; page 2 follows page 1, and so on. Sure, you can skip around, but books don’t provide much help, beyond including an index. Computer-based hypertexts let readers jump around all they want. The computer part is important because it’s hard to build a hypertext system out of physical media, such as index cards or pieces of paper.
The web is a giant computer-based hypermedia system, and you’ve probably already done lots of jumping around from one page to another on the web; it’s called surfing. If one web page doesn’t seem all that interesting once you visit, you can click another link that seems more related to your needs (and so on). The web makes surfing so easy that you’ll need to give some thought to keeping people on your site (keeping them engaged and interested) so they won’t surf away!
Where Does HTML Fit In?
Hypertext Markup Language (HTML) enables you to mark up text so that it can function as hypertext on the web. The term markup comes from printing; editors mark up manuscript pages with funny-looking symbols that tell the printer how to print the page. HTML consists of its own set of funny-looking symbols that tell web browsers how to display the page. These symbols, called elements, include the ones needed to create hyperlinks.
HTML is a simple markup language that describes the structure and behavior of web documents. It is the standard language that all web browsers are designed to understand and interpret. HTML is defined using another, larger markup language: the Standard Generalized Markup Language, usually known by its acronym, SGML.
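To see how an HTML element can carry the address behind a hyperlink, here is a small Python sketch that reads a piece of marked-up text and lists each link’s visible words together with its hidden address. It uses the standard library’s html.parser module; the class name LinkCollector, the function find_links, and the example.com address are made up for this illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (visible text, address) pairs from <a href="..."> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # address of the link currently being read
        self._text = []     # pieces of visible text inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def find_links(html_text):
    """Return every hyperlink found in html_text."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


if __name__ == "__main__":
    sample = 'Read the <a href="http://www.example.com/history.html">history of the web</a>.'
    print(find_links(sample))
```

The reader sees only the words “history of the web”; the address stays inside the element until the link is clicked.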
The invention of HTML
HTML and HTTP were both invented by Tim Berners-Lee, who was then working as a computer and networking specialist at a Swiss research institute. He wanted to give the institute’s researchers a simple markup language that would enable them to share their research papers via the internet. Berners-Lee based HTML on Standard Generalized Markup Language (SGML), an international standard for marking up text for presentation on a variety of physical devices. The basic idea of SGML is that the document’s structure should be separated from its presentation:
- Structure refers to the various components or parts of a document that authors create, such as titles, paragraphs, headings, and lists. For example, you’re reading an item in an unordered list, as it is termed in SGML (most people use the more familiar term, bulleted list). In SGML, you mark up this item as part of a bulleted list, but you don’t say anything about how it’s supposed to look. That’s left up to whatever device displays or prints the marked-up file.
- Presentation refers to the way these various components are actually displayed by a given media device, such as a computer or a printer. For example, this book displays this bulleted list item with an indentation and other special formatting.
What’s so great about separating structure from presentation? There are several very important advantages:
- Authors usually aren’t very good designers. It’s wise, especially in large organizations, to let writers compose their documents and let designers worry about how the documents are supposed to look. That’s particularly true when an organization has a corporate look or style, such as Apple Computer’s standard typeface, which you’ll see in all of its documents. The designers make sure that every document produced within the organization conforms to that style. So SGML doesn’t contain any features that control presentation.
- If markup consists of structure alone, the document’s appearance can be changed quickly. All that’s necessary is to change the presentation settings on whatever device is displaying the document.
- Documents containing only structural markup are much easier and cheaper to maintain. When presentation markup is included along with structural markup, the document becomes an unmanageable mess, and maintenance costs skyrocket.
- If a document contains only structural markup, it is more accessible to people with limited vision or other physical limitations. For example, a document marked up structurally might be presented by a Braille printer, or read aloud by a text reader, for those with limited vision.
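The idea of keeping structure and presentation apart can be sketched in Python. The document below records only what each component is (a title, a paragraph, a list item), and two interchangeable sets of presentation settings decide how it looks. All of the names here (DOCUMENT, SCREEN_STYLE, SPEECH_STYLE, render) are hypothetical and chosen only for the example; this is not how SGML or a browser is implemented.

```python
# Structure only: each entry says what a component is, not how it looks.
DOCUMENT = [
    ("title", "Web Basics"),
    ("paragraph", "HTML marks up structure."),
    ("list_item", "Structure"),
    ("list_item", "Presentation"),
]

# Presentation settings for an on-screen display (hypothetical).
SCREEN_STYLE = {
    "title": lambda text: text.upper(),
    "paragraph": lambda text: text,
    "list_item": lambda text: "  * " + text,
}

# Presentation settings for a device that reads the text aloud (hypothetical).
SPEECH_STYLE = {
    "title": lambda text: "Title: " + text,
    "paragraph": lambda text: text,
    "list_item": lambda text: "List item: " + text,
}


def render(document, style):
    """Apply one set of presentation settings to a structural document."""
    return [style[component](text) for component, text in document]


if __name__ == "__main__":
    for line in render(DOCUMENT, SCREEN_STYLE):
        print(line)
    for line in render(DOCUMENT, SPEECH_STYLE):
        print(line)
```

Changing the look of every document means editing one style, not every document, which is the maintenance advantage described in the list above.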
Sounds great, right? Still, from the beginning, HTML didn’t make the structure versus presentation distinction as clearly as SGML purists would have liked. And as HTML developed and the internet became a commercial network, web authors demanded more tools to make their documents look attractive on-screen. The companies that make web browsers responded by introducing new, nonstandardized HTML elements that contained presentation information. By 1996, many web experts were worried that HTML standards were spiraling out of control. The newly founded World Wide Web Consortium (W3C), hoping to keep at least some kind of standard in place, tried to standardize existing practices, including the mixing of presentation and structural markup. The result was the W3C’s HTML 3.2 standard, which is still widely used. But organizations found that HTML 3.2 exposed them to excessive maintenance costs. The SGML purists were right: structure and presentation should have been kept separate.
A Short History of HTML
To date, HTML has gone through four major standards, including the latest 4.01. In addition to the HTML standards, cascading style sheets and XML have also provided valuable contributions to Web standards.
The following sections provide a brief overview of the various versions and technologies.
HTML 1.0
HTML 1.0 was never formally specified by the W3C because the W3C came along too late. It was the original specification Mosaic 1.0 used, and it supported few elements. What you couldn’t do on a page is more interesting than what you could do. You couldn’t set the background color or background image of the page. There were no tables or frames. You couldn’t dictate the font. All inline images had to be GIFs; JPEGs were used for out-of-line images. And there were no forms.
Every page looked pretty much the same: gray background and Times Roman font. Links were indicated in blue until you’d visited them, and then they turned red. Because scanners and image-manipulation software weren’t as available then, the image limitation wasn’t a huge problem. HTML 1.0 was only implemented in Mosaic and Lynx (a text-only browser that runs under UNIX).
HTML 2.0
Huge strides forward were made between HTML 1.0 and HTML 2.0. An HTML 1.1 actually did exist, created by Netscape to support what its first browser could do. Because only Netscape and Mosaic were available at the time (both written under the leadership of Marc Andreessen), browser makers were in the habit of adding their own new features and creating names for HTML elements to use those features.
Between HTML 1.0 and HTML 2.0, the W3C also came into being, under the leadership of Tim Berners-Lee, inventor of the web. HTML 2.0 was a huge improvement over HTML 1.0. Background colors and images could be set. Forms became available with a limited set of fields, but for the first time, visitors to a web page could submit information. Tables also became possible.
HTML 3.2
Why no 3.0? The W3C couldn’t get a specification out in time for agreement by the members. HTML 3.2 was vastly richer than HTML 2.0. It included support for style sheets (CSS level 1). Even though CSS was supported in the 3.2 specification, the browser manufacturers didn’t support CSS well enough for a designer to make use of it. HTML 3.2 expanded the number of attributes that enabled designers to customize the look of a page (exactly the opposite of HTML 4). HTML 3.2 didn’t include support for frames, but the browser makers implemented them anyway.
NOTE A page with two frames is actually processed like three separate pages within your browser. The outer page is the frameset. The frameset indicates to the browser which pages go where in the browser window. A common use for frames when implementing a web site is navigation in the left pane and content in the right.
HTML 4.0
What does HTML 4.0 add? Not so much new elements (although those do exist) as a rethinking of the direction HTML is taking. Up until now, HTML has encouraged interjecting presentation information into the page. HTML 4.0 now clearly deprecates any uses of HTML that relate to forcing a browser to format an element a certain way. All formatting has been moved into the style sheets. With formatting information strewn throughout the page, HTML 3.2 had reached a point where maintenance was expensive and difficult. This movement of presentation out of the document, once and for all, should facilitate the continued rapid growth of the web.
Tip Use the W3C Markup Validation Service, available at http://validator.w3.org/, to check your HTML against most of the versions mentioned in this chapter.
XML 1.0
Extensible Markup Language (XML) was originally designed to meet the needs of large-scale electronic publishing. As such, it was designed to help separate structure from presentation and provide enough power and flexibility to be applicable in a variety of publishing applications. In fact, many modern word processors contain XML components or even export their documents in XML-compliant formats.
CSS 1.0 and 2.0
Cascading Style Sheets (CSS) were designed to help move formatting out of the HTML specification. Much like styles in a word processing program, CSS provides a mechanism to easily specify and change formatting without changing the underlying code. The “cascade” in the name comes from the fact that the specification allows for multiple style sheets to interact, allowing individual web documents to be formatted slightly differently from their kin (following department document guidelines but still adhering to the company standards, for example). The second version of CSS (2.0) builds on the capabilities of the first version, adding more attributes and properties for a web designer to draw upon.
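The way several style sheets interact can be pictured with a short Python sketch. It is a deliberate simplification: the real CSS cascade also weighs where a rule comes from and how specific its selector is, while this version only lets later sheets override earlier ones, property by property. The sheet contents and the names cascade, COMPANY_SHEET, and DEPARTMENT_SHEET are invented for the example.

```python
def cascade(*style_sheets):
    """Merge style sheets in order; later sheets win, one property at a time."""
    resolved = {}
    for sheet in style_sheets:
        for selector, declarations in sheet.items():
            # Copy into a fresh dictionary so the input sheets are not modified.
            resolved.setdefault(selector, {}).update(declarations)
    return resolved


# Company-wide standards (hypothetical values).
COMPANY_SHEET = {
    "h1": {"font-family": "Helvetica", "color": "black"},
    "p": {"font-size": "12pt"},
}

# A department's guidelines, changing only the heading color (hypothetical).
DEPARTMENT_SHEET = {
    "h1": {"color": "navy"},
}


if __name__ == "__main__":
    print(cascade(COMPANY_SHEET, DEPARTMENT_SHEET))
```

The department’s headings come out navy but keep the company typeface, so the document differs slightly from its kin while still following the company standard.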
HTML 4.01
HTML 4.01 is a minor revision of the HTML 4.0 standard. In addition to fixing errors identified since the inception of 4.0, HTML 4.01 also provides the basis for the meaning of XHTML elements and attributes, reducing the size of the XHTML 1.0 specification.
XHTML 1.0
Extensible Hypertext Markup Language (XHTML) is the first specification for the HTML and XML cross-breed. XHTML was created to be the next generation of markup language, infusing the standard of HTML with the extensibility of XML. It was designed to be used in XML-compliant environments, yet remain compatible with standard HTML 4.01 user agents.
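One practical consequence of XML compliance is that every element must be properly closed, something HTML does not always require. The sketch below uses Python’s standard XML parser to check two fragments; the helper name is_well_formed and the sample fragments are assumptions made for this illustration, and the check covers only XML well-formedness, not full XHTML validity.

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup, False otherwise."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# HTML style: the line-break element is opened but never closed.
HTML_STYLE = "<p>First line<br>Second line</p>"

# XHTML style: the line-break element closes itself.
XHTML_STYLE = "<p>First line<br />Second line</p>"


if __name__ == "__main__":
    print("HTML style accepted by XML parser:", is_well_formed(HTML_STYLE))
    print("XHTML style accepted by XML parser:", is_well_formed(XHTML_STYLE))
```

A browser would display both fragments, but only the second can be handled by XML tools.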
What is an HTML File?
- HTML stands for Hyper Text Markup Language
- An HTML file is a text file containing small markup tags
- The markup tags tell the Web browser how to display the page
- An HTML file must have an htm or html file extension
- An HTML file can be created using a simple text editor
Creating an HTML Document
If you are running Windows, start Notepad.
If you are on a Mac, start SimpleText.
In OS X, start TextEdit and change the following preferences: Open the “Format” menu and select “Plain text” instead of “Rich text”. Then open the “Preferences” window under the “TextEdit” menu and select “Ignore rich text commands in HTML files”. Your HTML code will probably not work if you do not change the preferences above!
Type in the following text:
<html>
<head>
<title>Title of page</title>
</head>
<body>
This is my first homepage. <b>This text is bold</b>
</body>
</html>
Save the file as “mypage.htm”.
Start your internet browser. Select “Open” (or “Open Page”) in the File menu of your browser. A dialog box will appear. Select “Browse” (or “Choose File”) and locate the HTML file you just created, “mypage.htm”, select it, and click “Open”. Now you should see an address in the dialog box, for example “C:\My Documents\mypage.htm”. Click OK, and the browser will display the page.
The first tag in your HTML document is <html>. This tag tells your browser that this is the start of an HTML document. The last tag in your document is </html>. This tag tells your browser that this is the end of the HTML document. The text between the <head> tag and the </head> tag is header information. Header information is not displayed in the browser window.
The text between the <title> tags is the title of your document. The title is displayed in your browser’s caption. The text between the <body> tags is the text that will be displayed in your browser. The text between the <b> and </b> tags will be displayed in a bold font.
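To check the explanation above, the following Python sketch reads the example page the way a very simple browser might and reports which text belongs to the title, the body, and the bold element. It relies on the standard library’s html.parser module; the names PAGE, PageOutline, and outline are hypothetical, and a real browser does a great deal more than this.

```python
from html.parser import HTMLParser

# The example page from this section.
PAGE = """<html>
<head>
<title>Title of page</title>
</head>
<body>
This is my first homepage. <b>This text is bold</b>
</body>
</html>"""


class PageOutline(HTMLParser):
    """Record which text falls inside <title>, <body>, and <b>."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # tags that have been opened but not yet closed
        self.title = ""
        self.body = ""
        self.bold = ""

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in self.open_tags:
            # Close this tag along with anything still open inside it.
            while self.open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        if "title" in self.open_tags:
            self.title += data
        if "body" in self.open_tags:
            self.body += data
        if "b" in self.open_tags:
            self.bold += data


def outline(page):
    """Return the title, body text, and bold text of an HTML page."""
    parser = PageOutline()
    parser.feed(page)
    parser.close()
    return {
        "title": parser.title.strip(),
        "body": parser.body.strip(),
        "bold": parser.bold.strip(),
    }


if __name__ == "__main__":
    print(outline(PAGE))
```

The title text never appears in the body text, which matches the point that header information is not displayed in the browser window.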
HTM or HTML Extension?
When you save an HTML file, you can use either the .htm or the .html extension. We have used .htm in our examples. It might be a bad habit inherited from the past, when some of the commonly used software only allowed three-letter extensions. With newer software, we think it will be perfectly safe to use .html.