An Illusory Intertwingling of Reason and Response

Tech: Yes, I’m a geek. I admit it. At least I’m not a nerd!


Friday, February 08, 2008

Call me ishmael.infiniteplane.com

At long last, I've decided it's time to start serving OpenID. There are enough sites permitting me to log in, comment, and otherwise use and contribute with the protocol that it'd be rather foolish of me not to have an OpenID URI. It's always seemed silly to me, though, to use OpenID — a protocol designed for decentralization — in a centralized manner, procuring an OpenID from some Supreme Arbiter of Identity, when the possibility exists for me to keep my login credentials under my own lock and key.

Enter ishmael.infiniteplane.com (what more fitting name for a server used to establish identity than "Ishmael"?) and phpMyID. phpMyID, despite its rather unfortunate cliché of a "phpMyXYZ" name, is one of the nicest little single-user OpenID scripts available. Since serving one's own OpenID generally involves either rolling one's own with a ponderous array of libraries or installing a large-scale multi-user, database-backed authentication system, I didn't begrudge it the couple of hours spent in the code tweaking.

By default, phpMyID has a couple of — while not actually flaws — less-than-ideal design decisions.

First, it uses (and recommends against changing) a hardcoded HTTP authentication realm. While in a vacuum, that's not too major a difficulty, on a web with a major OpenID component, compromising OpenID servers will become a more lucrative proposition. In the same way one ought to remove easily-queried identifiers such as META Generator tags from CMS software to reduce the likelihood of unpatched vulnerabilities being exploited due to automated identification, a realm of "phpMyID" sent in every authentication header seems to invite malicious vulnerability profiling. Luckily, while the README is ambivalent about the value of such a change, it permits it easily through a configuration variable.

As well, I've made numerous other little changes, such as:

  1. defining an optional CSS file (no reason it has to look ugly)
  2. making it work properly when called as different hostnames, so that a single installation can, via a switch on HTTP_HOST, serve multiple users each with their own identity provider (IdP) hostname.
  3. enabling selection between use of META refreshes and HTTP Refresh headers
  4. optional suppression of filename requirements in IdP URIs
  5. suppression of unnecessary information at the "front gate" for security

Tuesday, February 06, 2007

Finally, a use for the pre-increment operator!

Okay, so if you hang out for long enough in programming newsgroups and the like, you eventually realize that there are two schools of thought, "both alike in dignity,"[1] regarding increment operators: the pre-incrementers and the post-incrementers.

The philosophical and stylistic argument for pre-increment runs something like, "It reads more intuitively to say 'increment i' rather than 'i increment'." The argument for post-increment is generally one of appearance: "i++ looks neater than ++i".

I'm firmly on the i++ side when it doesn't matter. However, there is a valuable semantic difference between the two, which until this moment I never had cause to appreciate: it saved me an extra case in a well-travelled switch. Sold as I have my soul to hand-optimization, that extra case irked me.

I'm using a simple format-string parser, and of course I'm using a while loop to iterate over the format string. (I dislike the fact that for loops can usually be outperformed by a more efficient while.) To save one conditional per iteration, I cast it in do . . . while form. However, the loop was conditional on *fmt++, which yields the value of the current character and then advances the pointer. The loop will break at the \0 at the end of the format string.

More precisely, it will break after processing the \0 through the state machine within the loop. When I noticed these nils getting through, and tracked them down, I was so used to post-incrementing that my first reaction was to put a case '\0': break; line in the switch of the state machine.

The value returned by post-increment is the original value: increment takes place after reading the value of the variable. Pre-increment increments the variable and returns the new value.

So, now, with a slightly less-elegant while(*++fmt) (which really looks like it shouldn't work!), it hums along nicely. And since the format function is called every time a data record is recovered, the time savings over thousands of entries should be noticeable if not substantial, even given the fact that format strings will generally be under a dozen characters.
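For anyone who wants to watch the difference without a C compiler handy, here is a rough analogue in bash, whose arithmetic borrows C's ++ semantics. It is only a sketch under some loud assumptions: walk_post and walk_pre are hypothetical names, an index stands in for the fmt pointer, an empty substring stands in for the terminating \0, and a while true . . . break loop stands in for do . . . while, which bash lacks. Each function reports how many times the loop body would run.

```shell
#!/usr/bin/env bash
# Bash analogue of the two C loops; names and structure are illustrative only.

# Analogue of: do { body(*fmt); } while (*fmt++);
walk_post() {
    local fmt=$1 i=0 runs=0 cur
    while true; do
        runs=$(( runs + 1 ))        # the body runs on fmt[i]
        cur=$(( i++ ))              # post-increment: cur gets the OLD index
        [ -n "${fmt:cur:1}" ] || break
    done
    echo "$runs"
}

# Analogue of: do { body(*fmt); } while (*++fmt);
walk_pre() {
    local fmt=$1 i=0 runs=0 cur
    while true; do
        runs=$(( runs + 1 ))        # the body runs on fmt[i]
        cur=$(( ++i ))              # pre-increment: cur gets the NEW index
        [ -n "${fmt:cur:1}" ] || break
    done
    echo "$runs"
}

walk_post "%d"   # body runs once more than there are characters
walk_pre  "%d"   # body runs once per character
```

With the post-increment test, the condition looks at the character the body has just handled, so the body gets one extra pass on the terminator before the loop notices it. With the pre-increment test, the condition looks at the next character, and the terminator never reaches the body.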

Oh, what's it for? Unsatisfied with Analog (or any other log analysis program I've yet tried -- though I'm going to give Sawmill a spin, if only for the sake of the great beta-testing program they have), I'm writing my own log analyzer. Perl can bite me: I'll practically extract all the reports I want in a compiled language!


Two households, both alike in dignity,
In fair Verona, where we lay our scene,
From ancient grudge break to new mutiny,
Where civil blood makes civil hands unclean.
From forth the fatal loins of these two foes
A pair of star-cross'd lovers take their life;
Whose misadventured piteous overthrows
Do with their death bury their parents' strife.
The fearful passage of their death-mark'd love,
And the continuance of their parents' rage,
Which, but their children's end, nought could remove,
Is now the two hours' traffic of our stage;
The which if you with patient ears attend,
What here shall miss, our toil shall strive to mend.

— from Romeo and Juliet, Prologue, by William Shakespeare

Monday, January 08, 2007

Customizing the OS X Tiger Slideshow Framework

The Slideshow framework in Mac OS X 10.4 ("Tiger") is quite useful, and quite difficult to customize. For one thing, if you want to change anything you must edit an obscure text file (outside the purview of Apple's standard plist/defaults system) found at /System/Library/PrivateFrameworks/Slideshow.framework. (The file is called SlideshowConfig.data, and is found in the Resources directory of the framework.) Though the domain (aka "bundle identifier" or "bundle ID") of the Slideshow framework is com.apple.slideshow, creating such a domain containing settings which should override those found in SlideshowConfig.data has no effect.

You can try it with:

$> defaults write com.apple.slideshow this that

which should create com.apple.slideshow.plist in your user library, containing a junk key-value pair you can delete using Property List Editor. Now you can copy settings from SlideshowConfig.data into the new plist and alter them to your heart's content: nothing at all will happen.

If anyone knows how to make Slideshow respect per-user defaults, do please let me know. It doesn't seem to be dealt with by Apple (since it is, after all, a private framework), or anyone else, for that matter.

Wednesday, November 01, 2006

Porting blosxom to C

Okay, it had to come to this eventually . . . I'm porting blosxom to C.

I've started various PHP ports in the past (since even phposxom was too icchhk for me), but none of them came to much. When it comes down to it, the filesystem is a fantastically inefficient database, since you have disk lookups (msec range) rather than memory lookups (nsec range). When you pile an interpreted language with all its overhead atop that, you have abysmal page-generation times. Dynamic sites done in interpreted languages really need optimized database backends.

But give up? On blosxom? Never! This is a holy war: depend on something that may or may not change, or depend on something that has remained substantially unchanged since the seventies and eighties: pick one.

And so begins cosxom (I would have said "closxom", but that has shades of Clostridium to me . . . that's what microbiology will do to you!), the first (that I know of) C port of blosxom. So far it just pulls in a header and footer around an alphabetical-by-path collection of entries (sorting methods will come later). I know that the benchmark is far from fair at the moment, but a minimal blosxom install builds a page based on a highly-nested data directory of forty-one entries in 0.894s of user time and 0.159s of system time, while the current cosxom 0.01pre-a builds a page in 0.007s user time and 0.019s system time. You can't tell me that two orders of magnitude is going to disappear with a sort and some filtering.

Oh, and if you'd like to see the only PHP blosxom-like CMS I'm going to keep pursuing, check out tumble on SourceForge.

Thursday, September 21, 2006

Recursive Make Considered Harmful

Every once in a while, I'll stumble across some real gems in my online researches. Today it happened with make.

I've been trying for some time to really get a handle on Makefiles, and to come up with a "One True Makefile" I could use for all my projects. See, I have a small (but growing) set of utility scripts with which I create my programming environment, including one that sets up a basic project for me. (I don't believe in topheavy IDEs . . . a text editor and a terminal are "integrated" enough for me.)

Of course, one of the things it does is place a Makefile set up for the project at hand in the project directory. In fact, if you make a bare project like that, you get a beautiful little implementation of true . . . it'll build and return 0 at you all day long if you let it.

Well, one of the problems I had with my second-edition Makefile was that I had to do clean builds a lot, since it wasn't tracking dependencies fabulously well (especially with regards to headers).

Well, my new Makefile is better-designed, and is going to integrate with my utility environment: namely dependencies are managed by depend.sh. I have to manually add header files as dependencies, but it saves the overhead of makedepend's zillions of dependencies. (Though I may write a makedepend parser to pull out only local dependencies, since they're the only ones likely to change during a development cycle.)
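depend.sh itself isn't reproduced here, but the idea of tracking only local headers can be sketched in a few lines of bash. This is a hypothetical stand-in, not the actual script: local_deps is an invented name, and it assumes that local headers are the ones included with double quotes (system headers use angle brackets) and that they sit beside the source file.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: emit a make rule listing only the quoted (local)
# headers that a C source file includes. Not the real depend.sh.

local_deps() {
    local src=$1 rule header
    rule="${src%.c}.o: $src"
    # sed prints just the file name from each  #include "name.h"  line;
    # #include <name.h> lines do not match and are skipped.
    while IFS= read -r header; do
        rule+=" $header"
    done < <(sed -n 's/^[[:space:]]*#[[:space:]]*include[[:space:]]*"\([^"]*\)".*/\1/p' "$src")
    printf '%s\n' "$rule"
}

# usage: local_deps main.c >> Makefile.deps
```

The output is an ordinary rule line that a Makefile could pull in, which keeps the dependency list down to the files likely to change during a development cycle.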

Anyway, I just wanted to point you to some of the better resources I found in my ongoing quest for Makefile goodness.

Monday, September 11, 2006

The Kaffe JVM

I hate Java.

Let me be more general: I hate virtual-machine languages.

However, Java particularly being a necessary evil in this day and age, there's no reason we have to be tied down to proprietary software.

Enter Kaffe.

Exit me. (I always leave the room when talk moves to Java.)

Friday, August 04, 2006

URL Escaping and GET Requests

Just discovered something: the vital difference between escape() and encodeURI() in JavaScript.

Now, up until recently, I'd always used escape() to finagle text into a form suitable for returning via GET requests. However, I was trying to return some Unicode and Apache kept throwing "406 Not Acceptable"s at me.

A little digging around the Apache dev lists and bug reports, and it turns out that the "%u####" escape methodology used by most Unicode-aware JavaScript implementations is actually not in accordance with the W3C standards on the subject; hence Apache's rejection of it is considered proper behaviour, and will not be changed.

Turns out the W3C-kosher way to do it is by encoding each Unicode character as UTF-8 and writing every resulting byte as a normal "%##" escape within the normal ASCII URI escaping scheme (three escapes for many characters, though anything outside ASCII can take from two to four). Fortunately, there exists encodeURI(), which does nearly the same thing as escape(), except it follows the W3C recommendations on encoding Unicode in URIs.

So remember, if there's even the merest smear of a chance of Unicode characters getting into your GET string, use encodeURI() (or encodeURIComponent() for individual parameter values, since it also escapes "&" and "=") and you'll save yourself many a headache.
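To see what those escapes look like at the byte level, here is a small sketch in bash (bash rather than JavaScript, since it only illustrates the encoding itself). percent_encode is a hypothetical helper, not part of any browser or Apache API; it assumes the input is already UTF-8, and it escapes every byte outside the letters, digits, and "-", ".", "_", "~", which is stricter than encodeURI().

```shell
#!/usr/bin/env bash
# Hypothetical helper: percent-encode a string one byte at a time.

percent_encode() {
    local hex out="" byte
    # od dumps the raw bytes as two-digit lowercase hex, e.g. " e2 82 ac"
    hex=$(printf '%s' "$1" | od -An -v -tx1 | tr -d ' \n')
    while [ -n "$hex" ]; do
        byte=${hex:0:2}
        hex=${hex:2}
        # 0-9, A-Z, a-z and - . _ ~ pass through; everything else becomes %XX
        case "$byte" in
            3[0-9]|4[1-9a-f]|5[0-9a]|6[1-9a-f]|7[0-9a]|2d|2e|5f|7e)
                out+=$(printf "\\x$byte") ;;
            *)
                out+="%$(printf '%s' "$byte" | tr 'a-f' 'A-F')" ;;
        esac
    done
    printf '%s\n' "$out"
}

percent_encode 'a b'                # prints a%20b
percent_encode $'\xe2\x82\xac'      # the euro sign; prints %E2%82%AC
```

The euro sign is a single character, but it occupies three bytes in UTF-8, so it comes out as three ordinary escapes rather than one "%u20AC".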

And that, ladies and gentlemen, is a Good Thing.

Wednesday, June 14, 2006

Google Cache Hack

Okay, how many times have you tried to use Google's cache to retrieve something not at Archive.org, only to be told "Your search - cache:example.com/?id=135 - did not match any documents"? Particularly annoying when it's a page you know for a solemn fact is in their cache. Turns out that you need to search for some key words. Well, any key words. Search for "fhqwhgads", for all Google cares. (related, though I discovered this independently)

When you do that, suddenly the page will appear from the cache, and "These terms only appear in links pointing to this page: fhqwhgads" will be at the top of the page. Thanks to this, the Google cache just became that much more useful (since there are many sites, especially PHP sites with "?"s and "&"s in their URLs, that get skipped by the Wayback Machine). Hooray for Google!*

* Though I still stand firmly against gMail.

Thursday, May 04, 2006

Parsing a List of Lines in Bash

Parsing newline-delimited data records in bash is simple, if you have this odd redirect up your sleeve.

Working on my current shell-script project, a scheduling utility driven by the BSD calendar, I found myself needing to parse some input files linewise. See, I had been reading in the event data files (one for each record), translating newlines to tildes, and cutting the resultant data string on tildes (since cut doesn't like cutting on newlines, it would seem) to obtain my data fields. However, this added up to almost a half a second of runtime per record. I mean, I didn't expect bash to be the world's fastest string parser, but sometimes enough is quite simply enough.

Okay, let me put in the code here so people don't lose themselves in the article, and I'll explain in a moment.

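# This shell script echoes individual lines from the file specified
# usage: . <scriptname> [file to parse]

# read pulls one line at a time from the loop's STDIN
while read line; do
	echo "$line"
done < "$1"    # redirect the whole loop's STDIN to the file

The magic here is in that last line: done < "$1"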
Because of the odd mechanics of shell substitution and token parsing, for line in $(cat $1); do . . . ; done won't work. You'd end up executing the loop whenever you hit whitespace, whether it be space, tab, or newline. What we need is some way to ensure that each line is passed as a distinct entity through the loop.

That's what read is here for. read is a shell built-in (in bash, anyway . . . I can't speak for other shells) that takes a single line of STDIN and sets it to the variable named as its argument, like so:

usage: read varname

But in a complex script, it can be difficult to track down where the interpreter believes STDIN, STDOUT, and STDERR are in the code path. In this case, if you try piping the file in, like so:

cat $1 | while read line; do . . . ; done


while cat $1 read line; do . . . ; done

or even using a standard shell redirect, as:

while read line < $1; do . . . ; done

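you'll be in for some surprises. The piped version runs the loop in a subshell, so any variables you set inside it are gone once the loop ends, and the last version reopens the file on every pass, so read keeps returning the first line forever. It turns out that STDIN for read can be supplied after the loop it controls, simply by redirecting the STDIN of the entire loop to the desired file.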

No, please don't ask me why! I don't think anyone knows why anything is the way it is in bash. There are fundamental programmatic reasons why it is necessary to sacrifice a goat at midnight to get your script to run properly.

Oh, by the way. Skipping all the utility invocations I had been using before cut my parser runtime by nearly two thirds . . . and I only had to hack at it for two hours!
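For what it's worth, here is a slightly more defensive sketch of the same loop. The function names count_records and count_words, and the idea of counting, are made up purely for illustration; IFS= and -r are standard read options that keep leading whitespace and backslashes from being altered. The word-splitting for loop is included alongside to show it firing once per token rather than once per line.

```shell
#!/usr/bin/env bash
# Illustrative only: count_records and count_words are hypothetical helpers.

# Read a file line by line; the redirect after `done` feeds the whole loop.
count_records() {
    local line n=0
    while IFS= read -r line; do
        n=$(( n + 1 ))
    done < "$1"
    echo "$n"        # n is still set here: the loop ran in the current shell
}

# The word-splitting approach: one iteration per whitespace-separated token.
count_words() {
    local word n=0
    for word in $(cat "$1"); do
        n=$(( n + 1 ))
    done
    echo "$n"
}

records=$(mktemp)
printf 'Dentist 10:30\nLunch with Ann\n' > "$records"
count_records "$records"    # one iteration per line
count_words "$records"      # one iteration per word
rm -f "$records"
```

One caveat with this form: a final line that lacks a trailing newline is skipped by the plain while read test, so data files should end with a newline.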

Tuesday, July 12, 2005

Fun with the Canon IR8500 in Service Mode

Revised 11/09/2005 09:45:30

The Canon ImageRunner 8500 (from here on out called the Canon IR8500) is a simple joy of a machine to run. It rarely jams, hardly ever malfunctions, and even when it does, the service techs are actually enjoyable to work with (which is more than I can say for some other vendors, such as R*COH, eh?). However, there’s only so much legitimate fun to be had from any machine. To really get the full enjoyment of the IR8500 (it being no different from any other machine in this respect), you have to get under the hood. Of course, Canon doesn’t want you to do this, for entirely-understandable liability reasons: in fact, service techs can lose their jobs for giving out the code to put the machine into service mode. (For the record, the code can be found online, so there’s no reason to go shooting my service tech, okay Canon?)

First, you have to go into service mode. I mean, come on, this is where all the fun is, right? Service mode in the Canon IR8500 is entered by pressing the “*” (Additional Functions) button, then immediately pressing “2” and “8” together, followed by the “*” again. This is known affectionately as the “star-twenty-eight-star” among Canon service techs.

If you’re not greeted by a beautiful white screen with a few boring grey buttons on it, try again . . . and again . . . and again if necessary. It’s a skill which takes a while to get the hang of, and even then, experienced service techs sometimes have to do it several times before it finally “takes.” (To exit service mode, press “Reset” several times until the machine returns to normal, end-user menus.)

Before you go pushing any service-mode buttons on your dear old IR8500, let me give you a friendly little WARNING: by entering service mode, you automatically incur the disfavour of the entire multinational Canon corporation, your friendly neighborhood service tech, and your boss (if you mess anything up). That said, let’s dig in.

Viewing the Analog Sensor Data (temperature, etc.)

Menu: COPIER > Display > ANALOG
Just hit the “Copier” button (the top one) upon entering service mode. The first panel/tab it drops you in ought to be the “Display” tab. From there, hit the “Analog” option. There’s not much to do here, but it’s actually where I spend most of my service-mode time. When the copier’s warming up, there’s not a whole lot to do; so you might as well watch the fuser temperature rise, right? Here are your readouts: pick your pleasure:

  1. TEMP (ambient temperature, °C)
  2. HUM (ambient relative humidity, %)
  3. ABS-HUM (ambient absolute humidity, g/m³)
  4. OPTICS (optics temperature, °C)
  5. FIX-C (fixer center temperature, °C)
  6. FIX-E (fixer end temperature, °C)

The ambient temperature (TEMP) is, of course, the temperature of the room around you, as measured from somewhere in the copier itself (and as such, it has a tendency to be a tad high: for example, it usually gives me a reading of 23°C on a 21°C day). The optics temperature (OPTICS) is measured in the optical chassis, in the top of the machine. The optics are kept heated, mostly to preclude condensation, to around 75°C. The higher the relative humidity, the higher this temperature should be, I assume.

The relative humidity (HUM) is the most reliable measurement of humidity (and of course, will be a tad low, since the temperature it is measured with is a tad high), since it is independent of pressure and temperature effects on air volume.[1] The absolute humidity (ABS-HUM) gives you the current mass of water per cubic meter of air, and as such, is affected by barometric pressure (because barometric pressure affects the volume of a given mass of air or other gas).

The fixer temperatures (FIX-C and FIX-E) are the temperatures of the center and ends of the top fixer (or, fuser) roller. I had originally thought that they were top versus bottom rollers: FIX-E is usually about 10°C cooler than FIX-C (~187°C as opposed to ~197°C), so I figured that FIX-C was the bottom roller, the extra heat compensating for its indirect contact with the toner, since only the top roller directly contacts the unfixed/unfused toner. A friendly Belgian Canon engineer read this post and corrected me.

It’s mildly amusing to watch the fixer/fuser temperature rise as the copier warms up, but other than that, there’s not a whole lot of action here. Now if you want to actually feel like you’re doing something . . .

Cleaning the Corona Wires

Now, in recent versions of the IR8500 copy machine firmware, Canon has provided a user option to clean the corona wires. However, this procedure is probably a pale and insipid version of the actual cleaning process, since they don’t want ordinary people doing anything which could potentially harm the machine. Reasonable. Do I let that stop me? No. Here’s how an honest-to-goodness Canon service technician gets the black lines out of your copies. (Yup. If you have streaked copies and you’ve already wiped off the fuser/fixer rollers, but to no avail, this may do the trick without a visit from your Canon Man.)

This will take a few minutes, but it’s almost guaranteed to get rid of streaks and lines in your copies if they were even remotely caused by dirty corona wires.

See, the corona wires have to provide an even electrical charge (static electricity) to the sheet of paper so that it picks up toner powder evenly. If there is something on the corona wire causing a stronger charge at one point, that may result in a line across the paper where the sheet passes across that part of the corona (electrical field). See How Stuff Works for a detailed explanation of how the xerographic process (the process used in photocopiers and laser printers) works.

Viewing the Maintenance Counters

Menu: COPIER > Counter
The ImageRunner 8500 keeps track of the wear and tear of nearly every replaceable part. A counter is incremented for every copy (or, “click”) in which a given part is involved. For some items, such as the cleaning web, this means that the counter is incremented with every copy. Other parts, such as paper takeup rollers, are used only intermittently (i.e. only one paper storage drawer is drawn from per copy), and therefore have counts far lower. Counters are reset when a part is replaced (ideally, that is: I have some parts which show a 1,000% overuse simply because their counters have never or rarely been reset when they’ve been replaced).

Oh, and that’s the other cool thing about the counters. They give you the percentage lifespan used for each counted part. Has your tech been doing his job? Now you know.

Other Miscellaneous Schtuff

  • Errors and Jams: these items give you the time an error code was set and the time it was cleared, along with the code itself and the size of paper involved (third, fourth, sixth, and ninth columns, respectively).
  • Function > MISC-P > P-PRINT: a seven-page printout of your current machine parameters and software package versions, along with your maintenance counter readings: the “Copy Service Report”
  • A whole lot of other things. Just don’t break anything, okay?

Hope you’ve enjoyed this little exposition (or, expoundification, perhaps?). Remember, I am in no way liable for any damage you may do, which may result in your service tech laughing at you and charging you double. If you want someone to blame, you can find the service code online sometimes. Anyway, I didn’t make you do anything, eh? (But speaking of making you do stuff, there are some more service-mode tips and tricks for the ImageRunner series at the Imaging Systems Group “Canon Black and White Products” pages.)

Humidity is the amount of water saturated in air (or another gas). Absolute humidity gives the mass of water vapor per unit volume of air in the atmosphere, while relative humidity is the ratio of the partial pressure of water which is actually present to the vapor pressure of water at that temperature, expressed in percent.

Wolfram Research, “Humidity”

Absolute humidity is the mass of water vapor per unit volume of air in the atmosphere. The mass of water at any given temperature is limited by the vapor pressure, which can be found in tables of water properties. A more useful notion is that of relative humidity.

Wolfram Research, “Absolute Humidity”

According to Dalton’s law, the pressure of a gas mixture is equal to the sum of the partial pressures of the gases of which it is composed. In the atmosphere, water vapor always contributes a partial pressure to the total pressure, usually expressed in terms millimeters of mercury (or inches of mercury). At any given temperature the partial pressure of water can never exceed the vapor pressure for that temperature, otherwise it would condense into liquid water.

The relative humidity is the ratio of the partial pressure of water which is actually present to the vapor pressure of water at that temperature, expressed in percent. For example, if the local atmosphere has a partial pressure from water vapor of 5 mm of mercury and the temperature is 20°C, the relative humidity would be R.H. = (5 / 17.5) × 100% = 28.6% (since the vapor pressure of water at 20°C is 17.5 mmHg).

Wolfram Research, “Relative Humidity”

How Stuff Works
What if you had to resort to making carbon copies of important documents, as many people did before copiers came along? Or worse, imagine how tedious it would be if you had to recopy everything by hand! Most of us don't think about what's going on inside a copier while we wait for copies to shoot neatly out into the paper tray, but it's pretty amazing to think that, in mere seconds, you can produce an exact replica of what's on a sheet of paper! In this article, we will explore what happens after you press "Start" on a photocopier.

More at How Stuff Works, “How Photocopiers Work”
