burnunit: (crankiest. evar.)
burnunit wrote 2007-09-14 10:12 am

semantics semantics

Okay, so here's some work. You might think this, my second post of the day, means I'm not working, but it's the result of some actual work I needed to do! Suppose I want to describe some foreign text in a structural way in an HTML or XHTML document. I could use the <i></i> or <em></em> tags, of course, but those would only get me foreign words and phrases displayed in the accustomed manner: italicized (the emphasis tag usually renders as italic too, natch). That's what people are used to seeing in books and academic texts when words from other languages are quoted. But that's not semantic, right? Right. <i> just controls the display of those words, and <em> claims they're emphasized, which they aren't.

I am persuaded by the argument for using <cite></cite> tags, for three reasons: 1) it gets the job done, since most browsers render the enclosed text noticeably differently from the surrounding text; 2) it is (indirectly) descriptive of the phrase, if you accept the argument that, in a manner of speaking, you are quoting or citing a word from another source, the source in this case being a language other than that of the body text; and 3) it's easy and quick to code, and I have a fair amount of confidence browsers know what to do with it. I'm more persuaded by <cite></cite> than I will ever be by <em></em> or italics.

However, that isn't precisely what's going on. I'm not "citing" the language, I'm really just using it. I looked for this issue around the web. This guy has some interesting points, and his concerns seem valid. But he doesn't resolve the damn thing by a long shot.

PSU.edu has some tips, and of course there's the venerable W3 Consortium. I guess that's where I land with this: the lang attribute. It feels clunky, for starters because it's an attribute, which means they haven't thought of a way for it to stand on its own, so it has to be applied to another element like P or SPAN. Would a standalone element be more or less structural?

Am I the only person who loathes the <span></span> tag? Or maybe I feel like there's a lacuna in it--somewhat because there's a lacuna in my own knowledge of the language codes, probably. RFC 1766 provides a little more insight. The ISO 639-2 list helps, though now you have to decide if you want UTF-8 or ISO 8859, cripes. HERE is a list of language codes that's actually useful. (Oh, here's something: Burmese has two codes, bur and mya. So even with structural tags you get to make a choice of whose side you are on. For the record, I oppose the neofascist SLORC and thus choose bur. Myanmar, by providing a language code definition which clouds semantic precision, has slowed the development of the semantic web. As if brutal repression wasn't bad enough!)

So, for the record, do you think I should write it this way: <span lang="es">tortillas, bolillos, empanadas, churros</span>? My study leads me to think I should, but now I'm going to hit the preview button to see if LJ and my browser render it correctly.

(Aaand it did not. Which means I bollixed it up, because the first time I did it I didn't put it in the span tag. Not to mention I forgot to use the ol' ASCII "ampersand-lt-semicolon" bit inside the code tags I had.)
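
For reference, here's a minimal sketch of the markup this is circling around, assuming an HTML page whose main language is English. The page wrapper, the sample sentences, and the italic style rule are illustrative assumptions, not anything taken from LJ or the W3C pages; the only parts lifted from the post are the lang="es" span and the escaped less-than sign.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Marking up foreign phrases</title>
  <style>
    /* Assumption for illustration: italicize any span that declares a language,
       so the display comes from the language markup instead of an <i> tag. */
    span[lang] { font-style: italic; }
  </style>
</head>
<body>
  <!-- lang on an inline element: only the phrase is marked as Spanish -->
  <p>The bakery sells <span lang="es">tortillas, bolillos, empanadas, churros</span>.</p>

  <!-- lang on a block element: the whole paragraph is Spanish, no span needed -->
  <p lang="es">Tortillas, bolillos, empanadas, churros.</p>

  <!-- To show a tag as literal text, escape its angle brackets with &lt; and &gt; -->
  <p>Write it as <code>&lt;span lang="es"&gt;churros&lt;/span&gt;</code>.</p>
</body>
</html>
```

The style rule keeps the familiar italics, but now they hang off the language declaration rather than a display-only tag. In XHTML served as XML, you would generally add xml:lang alongside lang.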

I think it didn't render those two Latin bits above (I love the word "lacuna"), which may be on LJ or may be my misuse of the tags. At this point I'm fed up for a while. That's why I'm journaling this, btw: to see what best practices other people prefer. You come at me with italics, though, and I'll just gag a little.

unclebastard.livejournal.com 2007-09-14 04:53 pm (UTC)
Or you could only use good, solid AMERICAN words!