About Hal’s Quotes & Notes: Page Data Management

Fair warning: this page gives just a bit of techie background, on the off chance anyone is interested. If you aren’t, you’re probably in the majority.

Managing these pages, especially when I tackle massive format changes, requires some systems and tools. Fortunately, the Perl* language makes it easy to extract and organize the data. It got even easier once I realized I could report the data in Web pages: because each report entry links to its page, I can jump straight from a report to any page that needs work.

* Note: “Perl” is a retronym (a name turned into initials after the fact) for “Practical Extraction and Report Language” and “Pathologically Eclectic Rubbish Lister.” Take your pick. My source is Learning Perl by Randal L. Schwartz and Tom Phoenix.

Samples are linked here to illustrate the method. They aren’t kept current, though I will try to update them now and then. They reflect the state of the working copies on my own computer; they’re not run against published files on the Web server. Some of them exceed my customary size limits.

Status codes

I embed a set of status codes, written in a simple format, in most of the pages. Separately, I maintain a description of the format and a dictionary of the codes I’m using (in HTML format, of course); it’s automatically appended to each status report, with column headings on the report linked to the appropriate portions of the dictionary.

The full status report is only one possible style. The generating Perl program can also

Aside from matching against its selection parameters, the programming is independent of the set of attributes and values; it simply accumulates a full report of all codes found.
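For readers who want to see the idea in code, here is a minimal sketch of that kind of accumulation. It is written in Python, not the Perl the site's tools use, and it is not the actual program. The page does not describe the embedded format, so the `<!-- status: attr=value; attr=value -->` comment syntax, the function names, and the `wanted` selection parameter are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical embedding (an assumption, not the site's real format):
#   <!-- status: attr=value; attr=value -->
STATUS_RE = re.compile(r"<!--\s*status:(.*?)-->", re.DOTALL)


def parse_status(page_text):
    """Return {attribute: value} for every status comment in one page."""
    codes = {}
    for body in STATUS_RE.findall(page_text):
        for pair in body.split(";"):
            if "=" in pair:
                attr, value = pair.split("=", 1)
                codes[attr.strip()] = value.strip()
    return codes


def accumulate(pages, wanted=None):
    """Build {attribute: {value: [page names]}} from {page name: page text}.

    `wanted` is an optional {attribute: value} selection; a page is skipped
    unless it matches every entry. No attribute or value names are
    hard-coded, so new codes show up in the report without code changes.
    """
    report = defaultdict(lambda: defaultdict(list))
    for name, text in sorted(pages.items()):
        codes = parse_status(text)
        if wanted and any(codes.get(a) != v for a, v in wanted.items()):
            continue
        for attr, value in codes.items():
            report[attr][value].append(name)
    return report
```

Calling `accumulate(pages)` with a dictionary of page names and page text gives the full report; passing something like `wanted={"proofread": "no"}` (a made-up attribute) narrows it to the matching pages.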

Page titles

This report lists each page’s title and size. I actually use a version with extracted <meta> tag content; I’m displaying the more limited version to keep the file size down.

The last-change date appears but is rarely useful, since it doesn’t reflect when the content was captured. Pages are updated to extend and reorganize topical-link chains, cross-reference other new material, improve graphics, add notes and comments, correct typos, test the dates in the report, etc.
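A report of this kind can be sketched in a few lines. The version below is in Python, not the author's Perl, and is only an illustration: the function names, the `*.html` file pattern, and the table layout are assumptions. It reads each page's `<title>`, its size in bytes, and the file's modification date, then writes one linked table row per page.

```python
import html
import re
from datetime import date
from pathlib import Path

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def page_row(path):
    """Return (file name, title, size in bytes, last-change date) for one page."""
    path = Path(path)
    text = path.read_text(encoding="utf-8", errors="replace")
    match = TITLE_RE.search(text)
    if match:
        # Collapse whitespace and decode entities such as &amp;
        title = html.unescape(" ".join(match.group(1).split()))
    else:
        title = "(no title)"
    stat = path.stat()
    return path.name, title, stat.st_size, date.fromtimestamp(stat.st_mtime)


def title_report(directory):
    """Render an HTML table with one linked row per *.html file in `directory`."""
    rows = []
    for path in sorted(Path(directory).glob("*.html")):
        name, title, size, changed = page_row(path)
        rows.append(
            f'<tr><td><a href="{html.escape(name)}">{html.escape(title)}</a></td>'
            f"<td>{size}</td><td>{changed.isoformat()}</td></tr>"
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"
```

The date here is simply the file system's modification time, which is why it says nothing about when the content itself was captured.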

Graphics

This report checks the graphics directory against the page references. Any missing or unused GIFs would appear at the top, so I’ve been able to clean those up.
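The check amounts to comparing two sets of file names. The following Python sketch shows the comparison under some assumptions: it is not the author's Perl program, the function names are invented, and it only looks for GIFs named in `src` or `background` attributes, so a graphic referenced some other way (from a style sheet, say) would be reported as unused.

```python
import re

# Matches src="...gif" or background='...gif' (case-insensitive).
GIF_REF_RE = re.compile(
    r"""(?:src|background)\s*=\s*["']([^"']+\.gif)["']""", re.IGNORECASE
)


def referenced_gifs(page_text):
    """Return the set of GIF file names (directories stripped) used by one page."""
    return {ref.rsplit("/", 1)[-1] for ref in GIF_REF_RE.findall(page_text)}


def check_graphics(gif_names, pages):
    """Compare GIFs in the graphics directory with GIFs the pages reference.

    gif_names: iterable of file names found in the graphics directory.
    pages: {page name: page text}.
    Returns (missing, unused) as sorted lists: files referenced but absent,
    and files present but never referenced.
    """
    on_disk = set(gif_names)
    used = set()
    for text in pages.values():
        used |= referenced_gifs(text)
    return sorted(used - on_disk), sorted(on_disk - used)
```

Listing the two result sets first is what puts any missing or unused GIFs at the top of the report.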

The report links both to the graphics and to sample pages that use them. The direct links to the graphics don’t show them to best advantage. The two main cases in which they turn invisible are:

  • Light colors on white: try a “select all” to improve the contrast.
  • Block of dark colors: try viewing the sample page instead, if one is available.


Background graphic copyright © 2005 by Hal Keen