The formula:

String src = "abcåäö";
String res = new String( src.getBytes("ISO-8859-1"), "ISO-8859-1" );
assertEquals( src, res );

Will *NOT* work if the compiler thinks that the original text you put into the string (the literal expression) contains characters outside of the ISO-8859-1 range. Remember that UTF-8 encodes non-ASCII characters as sequences of byte values above 128. If the compiler were to decide that your source file was UTF-8, and you happened to be unlucky enough that the literal was a valid UTF-8 byte sequence, then the string would end up containing characters outside of the 8859-1 range.
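
To make that failure mode concrete, here is a minimal sketch (the class name and the choice of U+20AC as the out-of-range character are just illustrative, not from the original example): once the string holds a character that ISO-8859-1 cannot represent, getBytes("ISO-8859-1") substitutes '?' for it and the round-trip no longer returns the original.

// Illustrative class name; uses only 1.4-era API (charset names as strings).
public class RoundTripDemo {
    public static void main(String[] args) throws Exception {
        // "abc" plus a character ISO-8859-1 cannot represent (the euro sign U+20AC),
        // the kind of character you would get if the literal were decoded as UTF-8.
        String src = "abc\u20ac";

        byte[] bytes = src.getBytes("ISO-8859-1");   // U+20AC has no mapping, becomes '?'
        String res = new String(bytes, "ISO-8859-1");

        System.out.println(src.equals(res));         // prints false
    }
}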

It is possible that editing the file and saving it again might change it so the compiler thinks it is 8859-1.

Yeah, I am grasping at straws, but do you have any better idea?

--KeithSwenson, 18-Sep-2003


Except that the string is NOT valid UTF-8. But actually, now that you mention it, I think JDK 1.4 no longer throws exceptions on invalid UTF-8, whereas 1.3 would. So the byte stream would still be decoded, no matter what.
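
If I understand the 1.4 behaviour correctly, something like the following sketch shows the difference (class name and byte values are only illustrative): a plain new String(bytes, "UTF-8") quietly substitutes replacement characters for the malformed bytes, while a CharsetDecoder told to REPORT malformed input throws instead.

import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

public class Utf8DecodeDemo {
    public static void main(String[] args) throws Exception {
        // "åäö" as ISO-8859-1 bytes -- not a valid UTF-8 sequence.
        byte[] invalid = { (byte) 0xE5, (byte) 0xE4, (byte) 0xF6 };

        // Lenient: the String constructor does not throw, it just inserts
        // replacement characters (U+FFFD) where the bytes are malformed.
        System.out.println(new String(invalid, "UTF-8"));

        // Strict: a decoder explicitly set to REPORT malformed input
        // throws a MalformedInputException instead of substituting.
        CharsetDecoder strict = Charset.forName("UTF-8").newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT);
        strict.decode(ByteBuffer.wrap(invalid));     // throws here
    }
}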

--JanneJalkanen, 18-Sep-2003


