Sunday, January 19, 2003
“I'm turning Japanese, I think I'm turning Japanese, I really think so.”
JeffK mentioned that over the past few days, whenever he views The Boston Diaries, his browser asks if he wants to download and install Japanese language support. I found the notion odd, but like some stories I've heard, computers can be affected by weird things, so it was remotely possible that for whatever reason his browser felt the need to install Japanese language support whenever my page was loaded.
So we head over to his computer and as he's bringing up my blog, it suddenly hits me why his computer is asking to install Japanese language support: my entry on the 14th! (Don't worry, it's fixed for now.)
When writing English, I was taught that you italicize foreign words.
Easy enough to do in HTML: just slap some <I> tags around the word and be done with it. But semantically that doesn't really mean anything, what with the semantic web being a current hot topic and all. While it's apparent to most readers that garçon is French and über is German, what about slumpmässig? Could be German for all you know (it's not; it's Swedish). By using the features inherent in HTML we can add semantics to foreign words beyond just italicizing them.
And that's what I do, in fact. For a foreign word like slumpmässig I'll encode it up like:
<I LANG="sv" TITLE="chance; luck, hazard">slumpmässig</I>
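If you'd rather not type that out by hand every time, here's a minimal sketch in Python of a helper that builds the markup. The function name foreign() and its parameters are my own invention for illustration (this isn't how the blog is actually generated), and the gloss "waiter" for garçon is just sample data.

```python
from html import escape


def foreign(word: str, lang: str, gloss: str) -> str:
    """Wrap a foreign word in <I> with LANG and TITLE attributes.

    Hypothetical helper: escapes the attribute values and the word so a
    quote or ampersand in the translation can't break the markup.
    """
    return '<I LANG="{}" TITLE="{}">{}</I>'.format(
        escape(lang, quote=True),
        escape(gloss, quote=True),
        escape(word),
    )


print(foreign("garçon", "fr", "waiter"))
# <I LANG="fr" TITLE="waiter">garçon</I>
```

The only real point of the helper is the escaping: the translation goes into an attribute, so a stray double quote in it has to become &quot; or the tag falls apart.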
Certain browsers, like MSIE and Mozilla, will display a tooltip with the text in the TITLE attribute, where I stick the translation of the word (if you happen to be using MSIE or Mozilla, try holding the mouse over a foreign word). And an intelligently programmed HTML vocalizer (used, perhaps, by the blind to speak pages) can use the language tag to recognize which language the word is written in and use that to guide the pronunciation.
Semantically much better than just <I>slumpmässig</I>.
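To make the "semantics" part a bit more concrete, here's a rough sketch in Python of how a program on the reading end (a vocalizer, say) might pull the language and translation back out of the markup. It uses the standard library's html.parser; the class name ForeignWordFinder and the function find_foreign_words() are made up for this example, and I'm assuming only <I> elements carrying a LANG attribute are of interest. I have no idea how any real vocalizer does it.

```python
from html.parser import HTMLParser


class ForeignWordFinder(HTMLParser):
    """Collect (lang, title, text) for every <I LANG=...> element."""

    def __init__(self):
        super().__init__()
        self.found = []    # finished (lang, title, text) tuples
        self._open = []    # stack of <i> elements still being read

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "i":
            a = dict(attrs)
            self._open.append([a.get("lang"), a.get("title"), ""])

    def handle_data(self, data):
        if self._open:
            self._open[-1][2] += data

    def handle_endtag(self, tag):
        if tag == "i" and self._open:
            lang, title, text = self._open.pop()
            if lang:  # a plain <I> with no LANG tells us nothing
                self.found.append((lang, title, text))


def find_foreign_words(html: str):
    finder = ForeignWordFinder()
    finder.feed(html)
    finder.close()
    return finder.found


sample = ('The <I LANG="ja" TITLE="fan art">dojinshi</I> market, '
          'not <I>just</I> italics')
print(find_foreign_words(sample))
# [('ja', 'fan art', 'dojinshi')]
```

Note that the plain <I>just</I> is skipped: without the LANG attribute there's nothing for the program to go on, which is the whole argument for adding it.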
So, when I wrote that entry on the 14th I did what I've been doing now for some time and slapped some semantics around the Japanese terms.
The <I LANG="ja" TITLE="fan art">dojinshi</I> market .... <I LANG="ja" TITLE="comic book">Manga</I> publishers ...
They are Japanese terms after all.
Since I already have Japanese language support installed, I didn't notice anything odd when I loaded the page to proofread the entry. But it seems that browsers without Japanese language support saw the language attribute for “Japanese,” realized the support wasn't installed, and decided to ask the user if it was okay to install it. But I'm using an Anglicized spelling for a Japanese word, so there's no real need to download Japanese language support for what I wrote. So how do I get around that?
That, I don't know. I'm fudging it right now by using LANG="x-ja", which is allowed (any language code starting with “x-” is for private use, intended for words like Nazgûl which don't have an officially designated language, so it shouldn't trigger any download message from browsers). That, I suppose, is better than nothing.
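For what it's worth, the fudge is easy to express in code. Here's a small sketch in Python, with hypothetical function names of my own (is_private_use() and private_use()), that treats a tag as private use only when its first subtag is exactly “x”. That distinction matters because a code like xh (Xhosa) also starts with the letter x but is a perfectly ordinary language code.

```python
def is_private_use(lang: str) -> bool:
    """True if the tag's first subtag is the private-use marker 'x'."""
    return lang.lower().split("-")[0] == "x"


def private_use(lang: str) -> str:
    """Turn a language code into a private-use tag, e.g. 'ja' -> 'x-ja'.

    Tags that are already private use are returned unchanged.
    """
    return lang if is_private_use(lang) else "x-" + lang


print(private_use("ja"))        # x-ja
print(is_private_use("x-ja"))   # True
print(is_private_use("xh"))     # False
```

This only rewrites the tag; whether a given browser really stays quiet when it sees x-ja is something I'm assuming, not something the code can check.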
Update on Saturday, September 23rd, 2023
I think it's more semantically correct to use the <I> tag than the <SPAN> tag to mark foreign words, so I'm going back and making that change.