Web Hosting Forum | Lunarpages


May 24, 2012, 08:27:09 AM

Topic: Viewing special characters (Read 989 times)
wboudx
Newbie
*
Offline Offline

Posts: 4


« on: November 28, 2011, 07:11:07 AM »

I have Virtual Weather Station software that creates .htm pages and uploads them to my Lunarpages site. When I view the index.htm file from my weather station, it shows 45.3°. When I view the same page after it is uploaded to Lunarpages, it shows 45.3�.

I did not have this problem on my current web site on Cox. 

Does anyone know how I can solve this problem?

Walt
MrPhil
Senior Moderator
Berserker Poster
*****
Offline Offline

Posts: 5214



« Reply #1 on: November 28, 2011, 04:53:03 PM »

It sounds like the font being used doesn't contain this particular character (degrees). You said that you viewed the HTML file produced by the weather station; how? Did you load it as a "file" into your PC's browser, or do something else? Are you looking at the file in an editor and seeing the degree symbol there? It's not good practice to hard-code non-ASCII characters (e.g., the degree sign) in HTML; they should be written as HTML entities (&deg;, for example).

Anyway, does your PC's browser show the degree sign correctly when the file is loaded directly? Are any changes made to the file before you try to load it from your server? What is its character encoding (Latin-1, Windows-1252, UTF-8, or something else)? If you're getting a box symbol, that usually means that the font (on your PC) doesn't include the glyph for this particular code point. In that case, I would expect that loading the HTML file locally in your PC's browser (as "File") and serving it from your site would produce identical results. Do they?
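To illustrate the entity advice, here is a minimal Python sketch. It assumes the page text is already available as a Unicode string, and `to_ascii_html` is a hypothetical helper name, not anything from the weather station software. It uses Python's built-in `xmlcharrefreplace` error handler, which emits numeric references such as `&#176;` rather than the named `&deg;`; browsers render both as the degree sign.

```python
def to_ascii_html(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference.

    The result is pure ASCII, so it displays the same whether the browser
    treats the page as Latin-1, Windows-1252, or UTF-8.
    """
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    print(to_ascii_html("45.3°"))  # prints 45.3&#176;
```

Because the output contains only ASCII bytes, the declared character encoding no longer matters for that symbol.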
-= From the ashes shall rise a sooty tern =-
wboudx


« Reply #2 on: November 29, 2011, 05:51:32 AM »

Thanks for the reply MrPhil.

I copied my file "index.htm" from my weather station system to the desktop I am on now.

I can double-click this file in Windows Explorer, and it opens in my default browser and shows the temperature correctly (45.3°).

I have the web site I am leaving available until December 6.

I can upload the same file to the web site I am leaving, then go to that web site with a browser, and it shows the temperature correctly (45.3°).

I can upload the same file to the web site I am moving to on Lunarpages, then go to that web site with a browser, and it shows the temperature incorrectly (45.3�).

I can go to the same web site with my iPhone and it also shows the temperature incorrectly (45.3�).

I have been uploading the weather page to the web site I am leaving for over 10 years and I never saw the temperature incorrect.

I don't know whether any changes are made when uploading it to Lunarpages.

Walt
MrPhil



« Reply #3 on: November 29, 2011, 06:29:57 AM »

OK, first of all, what is the weather station using for a "degree" symbol? Is it a hard-coded "binary" character or an HTML entity (&deg; or &#nnnn;)? If it's a hard-coded character, what is the declared character encoding on the page, if any (default should be Latin-1)? Does the page declare any specific font to use, or does it use whatever the browser defaults to? Have you tried telling your browser to use other character encodings (e.g., UTF-8) to see if maybe this symbol (and page) are in some unexpected encoding? View > Character set or Page > Encoding or something like that. There's also the possibility that whoever wrote the weather station code was a doofus who chose to use some exotic symbol (not in Latin-1) instead of a proper "degree", but that wouldn't explain why you can properly display the page on your PC.

If the site is using a hard-coded binary character, there's a chance that it was corrupted at some point, although it's unusual for a font not to cover the entire Latin-1 set. Just for grins, if you have source for the weather station (PHP, Perl, etc.) you might try commenting out that line (with the hard-coded degree) and substituting &deg;. If you can't get to the source, can you post the output HTML here, so I can see if any odd things are being done? Is there any evidence that the software sends some odd HTTP headers? If the captured HTML page appears to work correctly, then it's probably not.

LunarPages servers are normally configured to output Latin-1 character set (unless overridden by an HTTP header or HTML meta tag to change character encoding). There's always a chance that there's something different about yours, but it would be surprising.
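The two places an encoding can be declared (the HTTP `Content-Type` header and an HTML meta tag) can be compared with a small Python sketch. This is only an illustration: the function names are hypothetical, the header value is assumed to have been copied by hand from the browser's developer tools or a similar tool, and only the first 1024 bytes of the page are scanned for a meta tag. When both are present, browsers generally give the HTTP header priority over the meta tag.

```python
import re
from typing import Optional


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Extract the charset parameter from a Content-Type header value."""
    match = re.search(r'charset\s*=\s*"?([\w.:-]+)"?', header_value, re.IGNORECASE)
    return match.group(1).lower() if match else None


def charset_from_meta(html_bytes: bytes) -> Optional[str]:
    """Look for a charset declared in a <meta> tag near the top of the page."""
    head = html_bytes[:1024].decode("ascii", errors="replace")
    match = re.search(r'<meta[^>]+charset\s*=\s*["\']?([\w.:-]+)', head, re.IGNORECASE)
    return match.group(1).lower() if match else None


def effective_charset(header_value: str, html_bytes: bytes) -> Optional[str]:
    """Header charset wins if present; otherwise fall back to the meta tag."""
    return charset_from_content_type(header_value) or charset_from_meta(html_bytes)


if __name__ == "__main__":
    page = b'<html><head><meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"></head>'
    print(effective_charset("text/html; charset=UTF-8", page))
    print(effective_charset("text/html", page))
```

If the header says UTF-8 while the page itself says (or assumes) ISO-8859-1, the server configuration is the likely place to look.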
wboudx


« Reply #4 on: November 29, 2011, 08:39:41 AM »

This is the line from source code of index.htm on old web site:
</font><font face="Arial" size="5">40.8°</font></b></td>

This is the line from source code of index.htm on lunarpages:
</font><font face="Arial" size="5">40.8�</font></b></td>

Is there another place I can find whether it is &deg; or &#nnnn;?

How do I determine the declared character encoding?

The web page file is 77k.  Are you asking to see the source of the whole file?
And do you want both correctly labeled and incorrectly labeled?

Walt

I found out the encoding. The page is shown as ISO-8859-1 on my old web site (where it is correct) and as UTF-8 on Lunarpages.
« Last Edit: November 29, 2011, 09:42:50 AM by wboudx »
MrPhil



« Reply #5 on: November 29, 2011, 11:58:54 AM »

I can see from the source that it's a hard-coded binary character. It should have been coded as 40.8&deg;, but that's water under the bridge if you don't have access to the weather station source (PHP, Perl, etc.). "Arial" is a reasonably standard font face (font family/typeface), at least on Windows PCs.

Bring up your site in a browser and play with the character encoding. What is it declared as in the code (meta tag for character set)? What does the browser see it as (the default when you go into View > Encoding or the equivalent)? Does it show properly if you change the encoding (in the browser) from Latin-1 to UTF-8 or vice versa?

If it's a Latin-1 character in the "upper" section (0x80-0xFF), most browsers would show it (in UTF-8 mode) as an invalid character (typically a ?-in-black-diamond), rather than as a missing glyph. What browser are you using? I'd have to check what the lead byte in a valid UTF-8 sequence is -- maybe you got unlucky and the degree symbol is a valid starting byte (but then I would think you'd see an "invalid character" mark, unless you're doubly unlucky and what followed formed the remainder of a valid UTF-8 multibyte code).
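The lead-byte question can be checked directly. The Python sketch below classifies a byte by its role in UTF-8 (the helper name `utf8_byte_role` is made up for this example) and then decodes a Latin-1 "40.8°" as if it were UTF-8. The Latin-1 degree sign is the single byte 0xB0, which falls in the continuation range 0x80-0xBF, so it can never start a UTF-8 sequence and is replaced by U+FFFD, the ?-in-black-diamond.

```python
def utf8_byte_role(byte: int) -> str:
    """Classify a single byte value by the role it can play in UTF-8."""
    if byte <= 0x7F:
        return "ascii"
    if 0x80 <= byte <= 0xBF:
        return "continuation"  # only valid after a lead byte
    if 0xC2 <= byte <= 0xF4:
        return "lead"          # starts a 2-, 3-, or 4-byte sequence
    return "invalid"           # 0xC0, 0xC1, 0xF5-0xFF never occur in UTF-8


# The temperature as a Latin-1 page would store it: the degree sign is one byte.
latin1_bytes = "40.8°".encode("latin-1")
degree_byte = latin1_bytes[-1]

# What a decoder sees when told the same bytes are UTF-8.
shown = latin1_bytes.decode("utf-8", errors="replace")

if __name__ == "__main__":
    print(hex(degree_byte), utf8_byte_role(degree_byte))
    print(shown == "40.8\ufffd")
```

This matches the symptom in the thread: a stray continuation byte in a page treated as UTF-8 shows up as a single replacement character after the number.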

Anyway, if this weather station software put out ISO-8859-1 (Latin-1) on other sites, why would it change to UTF-8 on LP? Is the software actually outputting a meta tag for character set (encoding) UTF-8 on LP, but Latin-1 on other servers? That would be most strange.
wboudx


« Reply #6 on: November 29, 2011, 03:09:49 PM »

Does Lunarpages convert ISO-8859-1 to UTF-8?

I can bring up the page with Internet Explorer and it shows the error.  I can change viewing from ISO to UTF-8 and it is correct.

Walt
MrPhil



« Reply #7 on: November 29, 2011, 04:28:57 PM »

LP shouldn't be changing encodings or individual characters. You haven't said what the page declares itself to be: ISO-8859-1 (Latin-1) or UTF-8 (if it doesn't declare an encoding, it is Latin-1 by default). When you bring up the page in the browser, you say that the default encoding (in Page > Encoding or whatever) is Latin-1, and if you change it to UTF-8 the character shows up correctly? In Latin-1, it shows a bogus character, while in UTF-8 the degree sign is correct? OK, that means that the character is UTF-8 (multibyte). Does the saved .htm file have a BOM (3-byte Byte Order Mark) at the very beginning? You won't see it in any editor if your PC itself is displaying in UTF-8. If you look at the file on a PC with a single-byte encoding such as Latin-1 or Windows-1252 (not in UTF-8), I would expect to see 2 bytes for a true degree sign (or 3 or even 4 bytes if some other symbol was used).
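The BOM and byte-count checks described above can be sketched in Python. The helper names are hypothetical, and the sample byte strings stand in for the contents of the saved .htm file, which would normally be read in binary mode. A UTF-8 degree sign is the two bytes 0xC2 0xB0, which a Latin-1 viewer displays as "Â°".

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """True if the data starts with the 3-byte UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def is_valid_utf8(data: bytes) -> bool:
    """True if the data decodes as UTF-8 without any errors."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# A degree sign stored as UTF-8 occupies two bytes.
utf8_degree = "°".encode("utf-8")

# The same two bytes as a single-byte (Latin-1) viewer would display them.
as_latin1 = utf8_degree.decode("latin-1")

if __name__ == "__main__":
    print(utf8_degree.hex(" "))               # the raw bytes in hex
    print(as_latin1)                          # two characters instead of one
    print(is_valid_utf8(b"40.8\xb0"))         # a Latin-1 degree byte
    print(is_valid_utf8(b"40.8\xc2\xb0"))     # a UTF-8 degree sequence
```

Running `is_valid_utf8` on the real file is a quick way to tell which of the two encodings the weather station software actually writes: a lone 0xB0 byte fails the check, while 0xC2 0xB0 passes.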

If this weather station software is intended to be used all over the world, it's quite possible that they chose UTF-8 for various symbols. That encoding would have to be declared in the <head> section. Now, there is still the question of why the page is being displayed in Latin-1, rather than UTF-8. Everything else looks fine when you force UTF-8? It's possible that there is a misconfiguration of the server, where some HTTP header is overriding the specified page encoding and forcing Latin-1. I've heard of that happening, but I don't think I've heard of it happening at LP. The thing to do would be to open a problem ticket and ask the LP tech to check whether something server-wide, or just on your account, is overriding the specified character encoding for a page.