Encoding (non)issues

Feedback, issues or suggestions for the forum
Locked
Chris Vogel
Posts: 5140
Joined: Fri Jan 10, 2003 1:14 am

Post by Chris Vogel »

Although this forum declares its character set as ISO 8859-1, it actually seems to be using windows-1252, which is basically ISO 8859-1 with most of the control characters in the 0x80 to 0x9F range replaced by displayable ones. Below are the displayable characters that are part of windows-1252 but not ISO 8859-1:

€‚ƒ„…†‡ˆ‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ

Those characters should be escaped in the source code, but they are not. Documents should not contain characters outside their character sets, although I’ll admit that, since this is a common error, most browsers will just treat ISO 8859-1 as windows-1252 anyway. I’m just being pedantic, I suppose. :P
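
For anyone who wants to check this, here is a rough Python sketch (Python is just a convenient choice here, not what the forum software runs on). It derives the list of extra characters from Python's own codec tables and shows one way such characters could be escaped as numeric character references before being sent as ISO 8859-1. The function names cp1252_extras and escape_for_latin1 are made up for this example.

```python
def cp1252_extras() -> str:
    """Return the printable characters windows-1252 assigns to bytes 0x80-0x9F."""
    chars = []
    for byte in range(0x80, 0xA0):
        try:
            chars.append(bytes([byte]).decode("cp1252"))
        except UnicodeDecodeError:
            # Python's cp1252 codec leaves five bytes in this range
            # unassigned (0x81, 0x8D, 0x8F, 0x90, 0x9D), so skip them.
            continue
    return "".join(chars)


def escape_for_latin1(text: str) -> str:
    """Replace anything ISO 8859-1 cannot encode with a numeric character reference."""
    encoded = text.encode("iso-8859-1", errors="xmlcharrefreplace")
    # The result is pure ISO 8859-1, so decoding it back is lossless.
    return encoded.decode("iso-8859-1")


if __name__ == "__main__":
    print(cp1252_extras())
    print(escape_for_latin1("\u201ccurly quotes\u201d and a dash \u2013 escaped"))
```

Characters that ISO 8859-1 can represent directly, such as é, pass through escape_for_latin1 unchanged; only the ones outside the set are turned into references.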

Characters outside both ISO 8859-1 and windows-1252 seem to be escaped as expected (歯ブラシ, for example).

Archived topic from Anythingforums, old topic ID:3179, old post ID:58518
michaelk1993adg
Posts: 413
Joined: Wed Feb 07, 2007 10:41 pm

Post by michaelk1993adg »

what?


Archived topic from Anythingforums, old topic ID:3179, old post ID:58540
Red Squirrel
Posts: 29209
Joined: Wed Dec 18, 2002 12:14 am
Location: Northern Ontario

Post by Red Squirrel »

I've noticed a lot of sites have screwed-up encoding. I have yet to figure out how to make Firefox force it to iso-8859. The problem is only visible on pages with French characters; they'll appear as a diamond.

Pictures are also screwed up on this board, have no idea what causes that one.

Archived topic from Anythingforums, old topic ID:3179, old post ID:58548
Chris Vogel
Posts: 5140
Joined: Fri Jan 10, 2003 1:14 am

Post by Chris Vogel »

Red Squirrel wrote: I have yet to figure out how to make Firefox force it to iso-8859.
As a Web author or a Web user? If you want your pages to be served as ISO 8859-1, you would make sure Content-Type: text/html; charset=iso-8859-1 is in your HTTP headers and <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> is in your HTML, but that character set is usually the default anyway. If you want to force a different character set on a page you’re browsing in Firefox, you would go to View → Character Encoding → Western (ISO-8859-1).
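
To make the Web-author side concrete, here is a minimal sketch of a page served with both declarations in place. It assumes a plain Python WSGI application, which is only an illustration; the names PAGE and app are hypothetical, and a real site would set the header in its server or forum software configuration instead.

```python
# A tiny page whose <meta> declaration matches the HTTP header below.
PAGE = (
    "<!DOCTYPE html>\n"
    "<html><head>\n"
    '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">\n'
    "<title>Caf\u00e9</title>\n"
    "</head><body><p>Caf\u00e9</p></body></html>\n"
)


def app(environ, start_response):
    """WSGI application that serves PAGE as ISO 8859-1."""
    # Encode with the same character set that the header announces.
    body = PAGE.encode("iso-8859-1")
    headers = [
        ("Content-Type", "text/html; charset=iso-8859-1"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the standard library can host it: wsgiref.simple_server.make_server("localhost", 8000, app).serve_forever(). The point of the sketch is only that the bytes, the header, and the meta element all agree on one character set.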

I don’t know of a way to turn Firefox’s error correction off and treat ISO 8859-1 like, well, ISO 8859-1 instead of windows-1252, but that shouldn’t matter since the ISO 8859-1 characters that don’t correspond to the ones in windows-1252 can’t be used in HTML anyway. The problem arises when you use the new displayable characters in windows-1252 but serve the document as ISO 8859-1 and then rely on browsers to treat it as windows-1252 despite having declared it as something else. Browsers are not obligated to do that, and using those code points in ISO 8859-1 will make your HTML invalid.
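
The mismatch can be seen outside a browser too. The sketch below (Python, with hypothetical helper names decode_as_declared and decode_as_browsers_do) takes curly quotes saved as windows-1252 bytes and decodes them both ways. It assumes nothing about how any particular browser is implemented; it only shows what the two character sets say those bytes mean.

```python
import unicodedata

# Curly quotes as a windows-1252 editor would save them: bytes 0x93 and 0x94.
RAW = "\u201cquoted\u201d".encode("cp1252")


def decode_as_declared(raw: bytes) -> str:
    """Decode strictly as ISO 8859-1, as the declaration asks."""
    return raw.decode("iso-8859-1")


def decode_as_browsers_do(raw: bytes) -> str:
    """Decode as windows-1252, the lenient interpretation."""
    return raw.decode("cp1252")


if __name__ == "__main__":
    strict = decode_as_declared(RAW)
    lenient = decode_as_browsers_do(RAW)
    # Category "Cc" means a control character, not a printable quote mark.
    print(repr(strict), unicodedata.category(strict[0]))
    print(repr(lenient), unicodedata.category(lenient[0]))
```

Read strictly, the first and last characters come out as control characters rather than quotation marks, which is why the page only looks right when the reader's software quietly substitutes windows-1252.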
Red Squirrel wrote: The problem is only visible on pages with French characters; they'll appear as a diamond.
Do you have an example? ISO 8859-1 mostly supports French, but it lacks single guillemets, the OE/oe ligature, the capital Y with diaeresis, and the euro sign. (Of course, those are available via HTML entities.)
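
A quick way to check that list, sketched in Python: try to encode each character as ISO 8859-1 and, for the ones that fail, look up the HTML entity name in the standard library's html.entities table. The names FRENCH_EXTRAS, missing_from_latin1 and entity_for are invented for this example.

```python
from html.entities import codepoint2name

# Single guillemets, OE/oe ligature, capital Y with diaeresis, euro sign.
FRENCH_EXTRAS = "\u2039\u203a\u0152\u0153\u0178\u20ac"


def missing_from_latin1(text: str) -> list:
    """Return the characters in text that ISO 8859-1 cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode("iso-8859-1")
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


def entity_for(ch: str) -> str:
    """Return the named HTML entity for a single character."""
    return "&" + codepoint2name[ord(ch)] + ";"


if __name__ == "__main__":
    for ch in missing_from_latin1(FRENCH_EXTRAS):
        print(ch, entity_for(ch))
```

Ordinary accented letters and the double guillemets encode without trouble, so they never show up in the missing list.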
Red Squirrel wrote: Pictures are also screwed up on this board, have no idea what causes that one.
Yeah, it just happened out of the blue. Maybe it was something we said? :P



Now I’ve come to a dilemma: Do I stop trying to be so typographically correct and dump my curly quotes and dashes, or do I ignore the problem?

Archived topic from Anythingforums, old topic ID:3179, old post ID:58561