COP 2344 (Shell Scripting) Project #6
XHTML Encoding

 

Due: by the start of class on the date shown on the syllabus

Description:

A common problem when putting content on a web server is that text files often contain characters that have special meaning to a web browser.  These include the ampersand (&), the less-than symbol (<), the greater-than symbol (>), and others.

In addition, valid HTML or XHTML documents require some information at the beginning (a document prolog) and some more at the end (the document epilog).  (XHTML is a stricter, XML-based version of HTML; today's web browsers understand both formats.)

In this project you will write either a Perl or Python3 script that transforms a plain text file into a valid XHTML file.

Requirements:

Create a Python3 or Perl script that reads text from a file whose name is provided on the command line and writes a valid XHTML document to standard output.  The title of the document should be the name of the file.

For example, if a text file named hello contains the following text:

Hello, World & Class!
<Good-Bye!>

Then the XHTML encoded output should look like this:

 1.  <?xml version="1.0" encoding="UTF-8"?>
 2.  <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 3.      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
 4.  <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
 5.    <head>
 6.      <title>hello</title>
 7.    </head>
 8.    <body>
 9.      <pre>
10.        Hello, World &amp; Class!
11.        &lt;Good-Bye!&gt;
12.      </pre>
13.    </body>
14.  </html>

The indentation of lines 1 to 9 (the required XHTML document prolog) and lines 12 to 14 (the required document epilog) is for readability only and is not required.
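
As a rough illustration of how the prolog and epilog might be attached in Python, here is a minimal sketch.  The names PROLOG, EPILOG, and wrap_xhtml are made up for this example and are not required by the project; the sketch assumes the body text has already had its special characters encoded.

```python
import html

# Nine-line prolog and three-line epilog, copied from the example output.
# {title} is a placeholder that is filled in for each document.
PROLOG = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
  <head>
    <title>{title}</title>
  </head>
  <body>
    <pre>
"""

EPILOG = """    </pre>
  </body>
</html>
"""


def wrap_xhtml(title, encoded_body):
    """Return a complete XHTML document (hypothetical helper).

    encoded_body is assumed to be already encoded (&, <, > replaced).
    The title is encoded here because a file name may itself contain
    special characters; that is an assumption beyond the stated requirements.
    """
    safe_title = html.escape(title, quote=False)
    # Make sure the closing </pre> tag starts on its own line.
    if encoded_body and not encoded_body.endswith("\n"):
        encoded_body += "\n"
    return PROLOG.format(title=safe_title) + encoded_body + EPILOG
```

Calling wrap_xhtml("hello", body) with the two encoded lines from the example should give a 14-line document like the one shown, except that the body lines are not indented.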

Your script must make the following changes to the input:

  1. Change all occurrences of & to &amp;.
  2. Change all occurrences of < to &lt;.
  3. Change all occurrences of > to &gt;.
  4. Add the correct XHTML document header (nine lines), including a correct title with the document name.
  5. Add the correct XHTML document footer (three lines).
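
Changes 1 to 3 above could be sketched in Python as follows.  This is only one possible approach, and the function names encode_text and encode_file are invented for the example.  Note the order: the ampersand must be replaced first, otherwise the & in the newly added &lt; and &gt; would itself be changed to &amp;.

```python
def encode_text(text):
    """Replace the three special characters with their XHTML entities."""
    # The ampersand goes first so the entities added below are not re-encoded.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    return text


def encode_file(path):
    """Read the named file and return its encoded contents.

    Assumes the file is UTF-8 text (an assumption, chosen to match the
    encoding declared in the document prolog).
    """
    with open(path, encoding="utf-8") as infile:
        return encode_text(infile.read())
```

Python's standard library function html.escape(text, quote=False) performs the same three replacements, if you prefer not to write them by hand.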

Additional Notes:

To be turned in:

A copy of your Python or Perl script.  A sample text file you can use for testing your script is available on YborStudent.hccfl.edu at ~wpollock/mycat.c.

You can type or send as email to .  Please see your syllabus for more information about submitting projects.