
Lecture 8 Notes

These are the notes for lecture 8...


Math 481/581 Lecture 8: HTML

© 1998 by Mark Hays <hays@math.arizona.edu>. All rights reserved.

In this lecture we will cover the basics of HTML.


HTML is a markup language. An HTML document consists of plain ASCII text in conjunction with various markup tags (formatting codes) that tell the browser to display special things like lists and tables.

HTML does not specify how these elements are to be displayed -- that part is up to the browser. For example, HTML does not specify what fonts are to be used in a numbered list. This feature greatly enhances HTML's portability by divorcing a document's content from the presentation of that content.

Although it is possible to, for example, specify a 10 point Times Roman italic font in a document, you should not do so. For one thing, the results are unpredictable if the browser that is formatting the document does not have access to the specified font.

For another thing, someone with a temporary or permanent visual impairment may not be able to read tiny 10 point text. If you let the users choose the fonts they are most comfortable with, they can shrink or enlarge the text to suit their taste.

You can spend hours aligning pictures and text in your HTML. And you can destroy this hard-won alignment with a simple font change in the browser. This is so annoying that many people have chosen to write the font size in stone. However, consider this: alignment can be just as easily destroyed with a simple resize of the browser window -- the author of a document has no control over this.

Your best bet is to write things in such a way that painstaking alignment is unnecessary. Or at least get over worrying about it -- because there isn't a whole lot you can do about it. HTML wasn't designed to transmit a particular presentation of a document.

You should also be aware that each browser has its own set of bugs. For example, older versions of Microsoft Internet Explorer basically cannot do tables. The text-mode browser Lynx cannot do tables either. Some browsers have lots of trouble with frames. You should consider your audience before relying on buggy features like these.

The HTML standard dictates that, if a browser does not support a particular feature, formatting information for that feature is simply and silently ignored. This is in contrast to the situation in the previous paragraph, in which a feature is implemented but the implementation is buggy.

For maximum portability, you should stick to simple and well-known HTML constructs. Follow the rules of HTML and things should work fine. You can bend the rules quite a bit on most browsers, but not all.

Be aware that only three image formats are widely supported by browsers: GIF, X-bitmap, and JPEG. Some browsers support additional formats, but you should not make use of them in published pages because they'll show up as "broken images" in the client's browser (and this reflects poorly on you).

Finally, some browsers do not support any image formats. The best-known example is the Lynx browser. Lynx operates in text mode in a terminal window and is widely used in situations where a graphical environment is not available. We'll see how to take Lynx into account in a while (it's easy, and you should do it when writing documents).

HTML Primer

Most of today's lecture will be spent touching on the highlights of the HTML primer from NCSA. This document is available online at:


Other HTML Resources

Sample homepages with links to other HTML tutorials are available online at the Math Department SWIG homepage.

Getting Your Page Online

Getting your home page online is pretty easy. On most systems (including the u.arizona.edu cluster), your entire home page lives under $HOME/public_html. Your top level page should be called index.html and exist in this directory. You can create subdirectories in ~/public_html and put pages in there for organizational purposes.
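
If you want to try this from the shell, here is a minimal sketch of setting up the directory. The function name make_site and the contents of the placeholder page are made up for illustration; the only things taken from the text above are the public_html directory and the index.html file name. Permissions are a separate matter, covered next.

```shell
# Create a site directory containing a placeholder index.html.
# The directory defaults to $HOME/public_html; pass another path to experiment.
# (make_site and the page contents are hypothetical, for illustration only.)
make_site() {
    site_dir="${1:-$HOME/public_html}"
    mkdir -p "$site_dir" || return 1
    # Do not overwrite a page that already exists.
    if [ ! -e "$site_dir/index.html" ]; then
        cat > "$site_dir/index.html" <<'EOF'
<HTML>
<HEAD><TITLE>My Home Page</TITLE></HEAD>
<BODY>
<H1>My Home Page</H1>
<P>Under construction.</P>
</BODY>
</HTML>
EOF
    fi
}
```

Calling make_site with no argument uses $HOME/public_html. Passing a scratch path such as /tmp/test_site lets you experiment without touching your real pages.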

The most important thing to remember is that your file and directory permissions need to be correct. Your home directory itself needs to be at least mode 711, as does your public_html directory. This is because the web server software usually executes as the pseudouser "nobody" and needs to be able to access your account so that it can read your pages.

All subdirectories of public_html also need to be at least mode 711 for the same reason.

All of your HTML files, image files, etc. should be mode 644.

If your umask is 022 and your home directory is mode 711, you're probably all set. If your umask is 077, you'll probably need to run chmod by hand every time you create a new file, or else change your umask to 022 when you do web work. Or you can use the commands shown below to set things up for you.
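
To see why the umask matters, here is a small sketch that creates a scratch file under a given umask and prints its mode the way ls -l shows it. The function name mode_under_umask is made up for illustration, and the sketch assumes your system has the mktemp command (most do, but it is not guaranteed everywhere).

```shell
# Print the symbolic mode (as shown by ls -l) of a file created under the given umask.
# (mode_under_umask is a hypothetical helper; mktemp is assumed to be installed.)
mode_under_umask() {
    (
        # The subshell keeps the umask change from leaking into your session.
        umask "$1"
        scratch=$(mktemp -d) || exit 1
        : > "$scratch/page.html"    # new files start from mode 666, minus the umask bits
        ls -l "$scratch/page.html" | cut -c1-10
        rm -rf "$scratch"
    )
}

mode_under_umask 022    # -rw-r--r--  (644: the web server can read it)
mode_under_umask 077    # -rw-------  (600: needs a chmod before it can be served)
```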

If the permissions are incorrect, you will see "broken images" if an image cannot be accessed, and you will get a server error if a document cannot be accessed. It is important to exercise all links on your page to ensure that none of them are broken.

Here are a couple of commands that will set things up for you (read the find(1) manpage for details):

  > cd $HOME/public_html
  > find . -type d -exec chmod 711 {} \;
  > find . -type f -exec chmod 644 {} \;
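
Once the commands above have run, you may want to double check the result. The following is a rough sketch, not part of the original recipe: it lists anything under a site directory that is missing the permission bits described earlier. The function name check_site_perms is made up for illustration. Note that it looks only inside the directory you give it, so your home directory itself still has to be checked separately.

```shell
# List anything under a site directory that lacks the permissions described above.
# Prints nothing when every directory has at least mode 711 and every file at least 644.
# (check_site_perms is a hypothetical helper name.)
check_site_perms() {
    site_dir="${1:-$HOME/public_html}"
    # "-perm -711" is true when all of the 711 bits are set; "!" negates the test.
    find "$site_dir" -type d ! -perm -711 -print
    find "$site_dir" -type f ! -perm -644 -print
}
```

If check_site_perms prints any paths, those are the ones to fix with chmod.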

Copyright Issues

One good way to learn HTML is to find a document that you like and study the HTML source code. Most browsers have a "View Document Source" feature that lets you view the code in a separate window. You can also save the HTML file to disk so that you can mess around with it.

It is also possible to "steal" images from a document. In Netscape, for example, you can hold the Shift key and click the left mouse button on an image to save it to disk. This is a great way to get your hands on cool images.

You should be aware, however, that many documents and images are copyrighted by their owners. Generally, there is nothing wrong with using web content for your personal use or amusement, as long as you do not republish such content.

However, you can find yourself in court facing stiff penalties for republishing someone else's stuff without permission.

Many pages specifically state copyright information in the fine print; however, you should not count on this. Your best bet is to contact the page's author or webmaster and request permission to use images, etc.

For You DOS Fans

Here's one I saw the other day:
And the corresponding code: