Comment character encoding has been mangled

Postby The_mad_one » January 16th, 2015, 11:05 pm

It looks like special (Unicode?) characters have broken in a lot of old comments.

For instance, the word Pokémon has turned into PokÃ©mon everywhere.

This only seems to affect older comments; newer comments display é just fine.

Example:
http://mokepon.smackjeeves.com/comics/458480/prologue/

Just Ctrl+F for PokÃ©mon and you'll get 2 results right in that first page's comment section.

I'm not sure whether this can still be fixed (maybe check each comment's date when converting the character encoding?), but I figured I'd mention it anyway.
The_mad_one
 
Posts: 52
Joined: September 5th, 2008, 11:17 am

Re: Comment character encoding has been mangled

Postby eishiya » January 16th, 2015, 11:08 pm

The comments show up fine for me. Are you sure your browser isn't accidentally auto-detecting the wrong encoding (something other than Unicode)?
eishiya
 
Posts: 9728
Joined: December 5th, 2009, 11:17 am

Re: Comment character encoding has been mangled

Postby The_mad_one » January 16th, 2015, 11:12 pm

I get it on Chrome, Firefox and Internet Explorer. Odd.

I never touched any character encoding settings. Could it be a Windows 8 update thing?

I'll mention just in case: Specifically, I'm talking about the comment
Gobi-Aoi (Guest), 12 Aug 2011 06:09 pm
o.o
Now that I tink about it, they actualy drop ten years old and small, almost inofensive, pokÃ©mons in a world where dragons and gigantic mad insect things wander... PokÃ©mon is so indecent >.>

on that page I linked. Although I imagine you would've found it if it was like that for you.

It happens on every single other comment with the word Pokémon but I figure one example should be enough.

Re: Comment character encoding has been mangled

Postby eishiya » January 16th, 2015, 11:27 pm

I saw that exact comment and it shows up perfectly for me.

What encoding is your browser of choice using for the site? What encoding does it use for newer comments that are apparently fine? Are you viewing these newer comments on the comic site, or on the SJ profile?
The Mokepon site doesn't have an encoding specified, so your browsers are choosing one automatically based on the site's content (unless you have a default encoding set, which isn't likely if you haven't fiddled with it). It might just be that the content on the older pages is somehow tricking the browsers into thinking it's something other than Unicode. That's fairly plausible: most of the content is basic ASCII, so the é character looks just as out of place among the rest of the content as the wrong characters you're seeing.

Re: Comment character encoding has been mangled

Postby The_mad_one » January 16th, 2015, 11:35 pm

I'm currently trying to find out how I can even see what encoding a web page I'm viewing has.

As for the newer comments, I only checked that by posting a comment myself, but obviously if the problem is on my side, then I might've just sent a comment in whatever different encoding I have. I tried to find a recent comment with a special character in the recently updated comics but had no luck. Maybe you could post a comment somewhere I could check?

Note that I don't only have it on Moképon, I also have it everywhere on this much newer comic (in fact, that's where I just now noticed it, including on my own old comments):
http://www.smackjeeves.com/comicprofile.php?id=143512

Re: Comment character encoding has been mangled

Postby The_mad_one » January 16th, 2015, 11:42 pm

Update: W3C tells me:

No Character Encoding Found! Falling back to UTF-8.

None of the standards sources gave any information on the character encoding labeling for this document. Without encoding information it is impossible to reliably validate the document. As a fallback solution, the "UTF-8" encoding was used to read the content and attempt to perform the validation, but this is likely to fail for all non-trivial documents.

Re: Comment character encoding has been mangled

Postby The_mad_one » January 16th, 2015, 11:58 pm

Another update:

Firefox tells me:
The character encoding of the HTML document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol.


I can't get any of the browsers to tell me what encoding they're using beyond that message, but it seems the UTF-8 pages are defaulting to ASCII for me. Maybe the site's pages could be altered to specify their UTF-8 encoding to fix it?

ps: I'll be off now so I won't reply until tomorrow.

pps: If I save the webpage and open it in Notepad++, it won't even tell me what encoding it is, which it normally does.
If I save Wikipedia's homepage (which has tons of strange characters in it that display as normal) Notepad++ tells me it's UTF-8 without BOM. Strange magics are at work.

Re: Comment character encoding has been mangled

Postby eishiya » January 17th, 2015, 12:10 am

In Firefox, you can check the current encoding by going to View -> Character Encoding and seeing what's selected. If "Unicode" is selected, then the page is most likely being interpreted as UTF-8. If you're seeing something else, then there's the problem - the browser's guessing wrong and trying to use the wrong encoding. Selecting "Unicode" would solve the problem. If you're seeing Unicode or UTF-8 selected and you're seeing broken comments, then there's something off with how your computer is interpreting the data, and I'm afraid I don't know enough to know what could cause that other than "lmao Windows, you so silly".

Since SJ apparently spits comments out in Unicode (presumably UTF-8), it might be good for all the templates to have that specified as their default encoding instead of leaving it unspecified. People making their own templates can do whatever they want, but there's not much harm in specifying it in all the default templates.

Notepad++ doesn't tell you the encoding because one isn't specified, probably (I don't know, I've only ever used N++ for writing code). Wikipedia is very good about specifying its encoding everywhere, on the other hand. If N++ autodetects encodings, then it could just be because Wikipedia's pages provide way more data to work with. When é is the only non-ASCII character appearing on the page, it's hard for a program to know whether it's meant to be é or another sequence of non-ASCII characters because there isn't enough context.
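
To illustrate why guessing is hard without a declared encoding, here is a minimal Python sketch. It is not how Notepad++ or any browser actually detects encodings; the helper name `decodes_as` and the two-candidate list are assumptions made for the example. It only shows that the UTF-8 bytes for "Pokémon" are also perfectly valid windows-1252, so a program cannot rule out either reading from validity alone.

```python
def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if the bytes can be decoded with the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# The bytes a UTF-8 page would contain for the word "Pokémon".
page_bytes = "Pokémon".encode("utf-8")

# Hypothetical candidate list; real detectors consider many more encodings.
candidates = ["utf-8", "cp1252"]
plausible = [enc for enc in candidates if decodes_as(page_bytes, enc)]
print(plausible)  # ['utf-8', 'cp1252'] -> both readings are "valid"

# The reverse case is easier to catch: windows-1252 bytes for "é"
# do not form a valid UTF-8 sequence here.
legacy_bytes = "Pokémon".encode("cp1252")
print(decodes_as(legacy_bytes, "utf-8"))  # False
```

With only one non-ASCII character on the page, both decodings succeed, which is why an explicit declaration is more reliable than detection.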

Re: Comment character encoding has been mangled

Postby The_mad_one » January 17th, 2015, 9:54 am

Ahahahaha what.

Your guess at "lmao Windows, you so silly" was quite right. Turns out the encoding according to Firefox is called windows-1252. I've never even heard of that. XD
http://i.imgur.com/XxneAfy.png
(and yes, that is the character encoding, on Wikipedia it says UTF-8 there)

Looks like the problem is indeed a very awkward guess from my browsers/OS, although I suppose the main problem is that the encoding isn't specified in the first place.
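
For reference, the mangling can be reproduced in a few lines of Python. This is only a sketch of the mechanism, assuming the page bytes are UTF-8 (as suspected earlier in the thread) and the browser decodes them as windows-1252 (`cp1252` is Python's name for that encoding).

```python
original = "Pokémon"

# What the server sends: "é" becomes the two bytes 0xC3 0xA9 in UTF-8.
utf8_bytes = original.encode("utf-8")

# What the browser shows when it guesses windows-1252:
# each of the two bytes is read as a separate character.
misread = utf8_bytes.decode("cp1252")

# What the browser shows when it is told (or guesses) UTF-8.
correct = utf8_bytes.decode("utf-8")

print(misread)  # PokÃ©mon
print(correct)  # Pokémon
```

The stored comment never changes; only the decoding step differs, which matches the comments looking fine for one reader and broken for another.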

I'd actually suggest specifying UTF-8 encoding on the entire website, not just the default layouts, because I'm also getting the same issue on comic profiles and such.


On an unrelated note, perhaps admin should even fight it out with the W3C compliance checker some time, to get rid of all these sorts of incompatibility issues, although I imagine that would be a very large amount of work.

Re: Comment character encoding has been mangled

Postby eishiya » January 17th, 2015, 11:04 am

I agree, setting a default charset everywhere would be good.

It isn't possible to make SJ templates standards-compliant because of the ads, which rely on there being JavaScript somewhere outside of the head. Rating stars also work this way, to reduce loading time. The validator also doesn't handle JavaScript on HTML pages very well, which is causing a lot of false positives on the main site and probably some on the templates. That said, some of the default templates and the main site definitely have some errors that should be fixed. Like you said, it would be a lot of work (and for very little reward). It would be nice if at least future templates and such were made as error-free as possible before being released. Some of the older ones have outright tag soup.

Re: Comment character encoding has been mangled

Postby Admin » January 18th, 2015, 2:45 pm

I added a UTF-8 header to the main site and comics. That should take care of it.
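
The thread does not say whether this was an HTTP `Content-Type` header or an HTML meta tag, so the following Python sketch is only an assumption-laden illustration of the header case. The function names `declared_charset` and `decode_body`, and the `cp1252` fallback standing in for the browser's guess, are hypothetical and chosen to mirror what was observed above.

```python
from email.message import Message
from typing import Optional


def declared_charset(content_type: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type value, if present."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lowercased, or None when missing


def decode_body(body: bytes, content_type: str, fallback: str = "cp1252") -> str:
    """Decode with the declared charset; otherwise use the fallback guess."""
    return body.decode(declared_charset(content_type) or fallback)


body = "Pokémon".encode("utf-8")

# Before the fix: no charset declared, so the fallback guess is used.
print(decode_body(body, "text/html"))  # PokÃ©mon

# After the fix: the declared charset wins and nothing has to be guessed.
print(decode_body(body, "text/html; charset=UTF-8"))  # Pokémon
```

Declaring the charset removes the guessing step entirely, which is consistent with the fix working for every affected page at once.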
Admin
Site Admin
 
Posts: 1462
Joined: August 17th, 2005, 11:10 pm

Re: Comment character encoding has been mangled

Postby The_mad_one » January 18th, 2015, 8:29 pm

Yup, it's fixed now!

