Scandinavian character corruption...

User 298877 Photo


Ambassador
292 posts

Hi Lup,

OK, I now see the issue, but in the markup it is also incorrect.

For example, the headline is written as SYNDALENS GENTLEMANNAT?VLING 2013 in the page source code, which suggests that the problem may be introduced during the FTP transfer.

Could you post the file GEN10to.html as an attachment here so I can look at it as you intended it? Maybe then we can find what is causing it to upload with errors.

Also, you didn't clarify the binary option in CC DFTP. Did you change the default to binary in the transfer mode settings of the CoffeeCup FTP client?

Dave :)
User 2574201 Photo


Registered User
14 posts

Hi Eric and Dave,

Thanks for your input. I'm too busy with other matters at the moment but will return to this thread later in the week!
-Lup

PS
I used the binary option in CC DFTP!
User 2574201 Photo


Registered User
14 posts

Hi,

I was struck by the fact that I now have three sets of pages behaving in different ways although the three sets have almost identical source, so I decided to have a closer look at that. The three sets are:
- the pages created in 2012 (File_A)
- the pages created in 2013 early in the process, that is with very few modifications done (File_B)
- the pages created in 2013 by the software with all the suggested modifications included (File_C)

A further variation comes from the fact that CC HE proposed a Save for two of the above variants after they were viewed in split screen mode (no edits done!), and I saved all three under changed names.

That gave me a total of six files, the three original plus another three created by saving them after viewing in CC HE, all done before the synchronisation with S-Drive.

It was my intention to include a table here of the differences in behaviour and content of those six files, but I then found out that there is a limit of three attachments per post. As I furthermore ran into problems getting a table to display properly here, all I can do is send the three files mentioned above. As .html files are not accepted as attachments, I changed the file extensions to .TXT.

-Lup
Attachments:
User 187934 Photo


Senior Advisor
20,188 posts

Did you try putting the doc type and character set in the page?
I can't hear what I'm looking at.
It's easy to overlook something you're not looking for.

This is a site I built for my work.(RSD)
http://esmansgreenhouse.com
This is a site I built for use in my job.(HTML Editor)
https://pestlogbook.com
This is my personal site used for testing and as an easy way to share photos.(RLM imported to RSD)
https://ericrohloff.com
User 2574201 Photo


Registered User
14 posts

Hi Eric,

I have made no changes to the pages since last week.

I'm not quite sure what you mean by the terms "doc type" and "character set" but there are two lines that could possibly be what you refer to:

a. '<meta http-equiv="content-type" content="text/html; charset=UTF-8">' and
b. '<meta charset="utf-8">'

File_A contains neither. (File_A displays properly on non-s-drive sites but not on s-drive)
File_B contains both. (File_B displays properly on s-drive)
File_C contains only line a. (File_C does not display properly on s-drive)

When File_B is saved after being viewed in split-screen mode, line a is removed.
The change makes no difference to the page display; it's still OK.

-Lup
User 1948478 Photo


Senior Advisor
1,850 posts

Hi lup,

Since Eric seems to have taken a rare moment to sleep, I'll take a crack at this...

The "doctype" Eric is referring to is this line, which should always be the very first line in an HTML document:

<!DOCTYPE html>

In this (simplified) format, the doctype declaration refers to the latest, although still experimental, version: HTML5. It can safely be used, though, since in most regards it is backwards compatible.

Your lines a. and b. are essentially equivalent to each other, but applicable to HTML4.01 and HTML5, respectively. Again, HTML5 is simplifying things...
The character set specifies how characters (including your Swedish characters å, ä, ö ;) ) are represented in the code, i.e. according to which standard they are encoded.
For more about 'charset', read about it here: http://en.wikipedia.org/wiki/Character_encoding
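
To make that a bit more concrete, here is a small Python sketch of how the same three letters end up as different bytes depending on the encoding. It is an illustration only: it assumes that "Ansi" on a Western European Windows machine means code page 1252, which may not match your setup.

```python
# Illustration only. Assumption: the "Ansi" code page in question is cp1252.
text = "åäö"

ansi_bytes = text.encode("cp1252")  # one byte per letter
utf8_bytes = text.encode("utf-8")   # two bytes per letter

print(ansi_bytes.hex(" "))  # e5 e4 f6
print(utf8_bytes.hex(" "))  # c3 a5 c3 a4 c3 b6

# Reading UTF-8 bytes as if they were cp1252 gives the familiar garbage.
mojibake = utf8_bytes.decode("cp1252")
print(mojibake)  # Ã¥Ã¤Ã¶

# Reading cp1252 bytes as if they were UTF-8 fails; with errors="replace"
# the letters that cannot be decoded become U+FFFD replacement characters.
broken = ansi_bytes.decode("utf-8", errors="replace")
print(broken)
```

The point is that the charset line in the head does not change the bytes in the file; it only tells the browser how to read them, so it has to match what the editor actually saved.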

To put this into context then, the structure and minimum content of an HTML document should generally be as follows:

<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">

[...other content for head portion...]

</head>
<body>

[...body content...]

</body>
</html>
User 2574201 Photo


Registered User
14 posts

Thanks Per!

Your input certainly helps me to better understand what is going on!
I had hoped to get by with a minimal knowledge of HTML and so far succeeded fairly well, but then I ran into this problem with s-drive. Obviously it is high time to get a better understanding of the subject!

-Lup
User 2574201 Photo


Registered User
14 posts

Finally solved the problem!
The basic cause was my software: it was producing Ansi-encoded output even when told to produce UTF8.
This error was hidden by the "smart" browsers, which happily decoded Ansi and displayed the pages OK.
The only condition was that no line in the page head should indicate that the code was UTF8; if one did, the browser failed.

Uploading the Ansi-coded pages using Ipswitch FTP to my old website resulted in Ansi-coded pages that were readable in the browsers. When synchronising with s-drive, however, the non-UTF8 characters were found and marked as unreadable during the sync process.

The big confusion was caused by the pages usually being ok before upload but failing after upload to s-drive while still being ok if uploaded to my old site. Adding <meta charset="utf-8"> just caused more confusion by making previously readable pages on my disk unreadable while having no impact on the final webpage. (Just as one could expect: it doesn't help much to pretend that a page is UTF8 encoded when it is in fact in Ansi!)
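
For anyone who runs into the same thing, a check along these lines would have shown the mismatch straight away. This is only a minimal sketch, not what I actually used: the function names `is_valid_utf8` and `ansi_to_utf8` are made up for the example, and it assumes that "Ansi" means Windows code page 1252.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def ansi_to_utf8(data: bytes, ansi_codec: str = "cp1252") -> bytes:
    """Re-encode Ansi bytes as UTF-8. Assumption: the Ansi code page is cp1252."""
    return data.decode(ansi_codec).encode("utf-8")


# A page that claims UTF-8 in its head but was really saved as Ansi.
ansi_page = '<meta charset="utf-8"><title>åäö</title>'.encode("cp1252")

print(is_valid_utf8(ansi_page))                # False: the claim does not match the bytes
print(is_valid_utf8(ansi_to_utf8(ansi_page)))  # True after re-encoding
```

For a real file the bytes could be read with `pathlib.Path(name).read_bytes()`. Note that a page containing only plain ASCII passes the check either way, since the two encodings only differ for characters such as å, ä and ö.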

Many thanks to all who have helped me get this problem solved!

-Lup
User 2147626 Photo


Ambassador
2,958 posts

Glad you figured it out, and thanks for posting back and letting us know! :cool:
Graphics for the web, email, blogs and more!
-------------------------------------
https://sadduck.com
User 122279 Photo


Senior Advisor
14,450 posts

Lup, I see that you have solved it. That's good. But the next time you make a site or page, don't create it in MS Word. When saved as HTML you get all sorts of weird code, and there's no telling what the outcome will be. When using HTML5, which starts like this:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">

then you can simply type your ä, ö and å, just like I do with my Norwegian letters æøå.
Ha en riktig god dag! (Have a really good day!)
Inger, Norway

My work in progress:
Components for Site Designer and the HTML Editor: https://mock-up.coffeecup.com


