Scott Swedorski wrote:
Let me know if you can reproduce a good page this way and then we can work from there to see if I can get it to fail using the steps you are taking.
Download video (9.2MB).
Update: If I use iso-8859-1, then I do see those strange characters you are describing. Is there any reason to use that character set over UTF-8? Once I switched back from iso-8859-1 to UTF-8, everything worked again.
I did find this article which may help explain things:
http://www.webmasterworld.com/forum23/4227.htm
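The symptom described above (strange characters under iso-8859-1, correct text under UTF-8) can be reproduced outside any editor or browser. The following is only a minimal sketch, assuming the page is saved as UTF-8 bytes and then interpreted as iso-8859-1; the sample string and variable names are made up for illustration.

```python
# Sketch: UTF-8 bytes misread as iso-8859-1.
# The sample text is an assumption; any text with æ, ø or å behaves the same way.
text = "æøå"

# What ends up on disk when the page is saved as UTF-8 (two bytes per letter).
utf8_bytes = text.encode("utf-8")

# What a viewer shows if it assumes iso-8859-1 (one character per byte).
misread = utf8_bytes.decode("iso-8859-1")
print(misread)  # each letter turns into "Ã" followed by a second symbol

# The bytes are undamaged: decoding them as UTF-8 gives the original text back.
recovered = misread.encode("iso-8859-1").decode("utf-8")
print(recovered)
```

In this sketch å comes out as Ã followed by a yen sign and æ as Ã followed by a broken vertical bar, which matches the kind of characters reported in this thread. It suggests the saved file itself may be fine and only the encoding assumed by the viewer is wrong.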
I have viewed the video and read the postings in the link you provided.
I'm a bit confused about what is messing up the Norwegian text: the new Editor, its code cleaner, IE not understanding a charset declaration that comes before the doctype, or something else. It looks like I can't use XHTML for any site...
Here is what I'm doing:
1. Create a page with these settings:
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
2. Paste in a block of Norwegian text.
3. Before saving, preview in the Editor: all is ok.
4. Save, preview in the Editor again: the text looks like my index2.html that I have mentioned earlier in this thread, with yen, Â, broken vertical bars - you name it.
5. Checking the text in code view: nothing wrong there.
6. Preview in FF: all is good. Character encoding is utf-8.
7. Preview in IE: same errors as in the Editor preview. Character encoding is still at Western European (ISO).
8. Manually changing the encoding to utf-8: text is ok.
9. Applying the code cleaner to change to Latin-1, doctype at auto.
10. Checking the code: æøå have become æ, ø and å, but there is no change in the doctype or charset declaration!
11. Before saving, preview in the Editor: all is ok.
12. After saving: same results as in #4, 6, 7 and 8 above.
13. Applying the code cleaner again to change it back to utf-8 (just for the record; I wouldn't do that in 'real life'...): again no change in the head section, but instead of the weird characters I had before, I now have this: '�'.
14. The same characters show up in the built-in preview before saving.
15. Preview in IE with utf-8 produces a lot of squares; with Latin-1 it looks the same as in #13. In FF I get diamonds with a question mark.
16. Then save and everything gets weirder still: � in the built-in preview, same as #15 in the browsers.
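The '�' characters from step 13 onwards are consistent with the opposite mismatch: bytes written as Latin-1 but read as UTF-8. This is only a sketch under that assumption, not a statement about what the code cleaner actually does; the sample word is made up for illustration.

```python
# Sketch: Latin-1 bytes misread as UTF-8.
# Assumption: the text was stored as iso-8859-1 (one byte per letter)
# while the viewer decodes it as UTF-8.
text = "blåbær"

latin1_bytes = text.encode("iso-8859-1")

# A lone 0xE5 or 0xE6 byte is not valid UTF-8, so a tolerant decoder
# substitutes U+FFFD, the replacement character usually drawn as a
# diamond with a question mark (or as an empty square).
misread = latin1_bytes.decode("utf-8", errors="replace")
print(misread)

# Unlike the UTF-8-read-as-Latin-1 case, the original letters cannot be
# recovered from the misread text: both å and æ became the same U+FFFD.
print(misread.count("\ufffd"))
```

If a file is saved again after the text has been through this kind of misreading, the replacement characters are what gets stored, which would explain why switching the browser encoding no longer helps at that point.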
I think this is enough testing of an XHTML file to say that I can't use that format. It used to be ok in the old version. Normally I would not use the code cleaner to go back and forth as I have done here, so steps 13 onwards would not happen.
I'll have a go at html5 after a break...