This is Mark's Typepad Profile.
Mark
Recent Activity
Nice feature! I'd also like playlists for the tracks in each newsletter, so I can conveniently load all the new albums to try them out.
I appreciate your support of Linux, open file formats and open standards!
I want to further clarify what the bug is here. According to the Perl docs, it is against best practices to check the UTF-8 flag. Instead, the programmer should keep track of the encodings of her strings and explicitly encode when necessary. Since this code sample does check the UTF-8 flag, it departs from the recommended best practices, and that could be considered a bug of sorts. I get that. However, in practical terms, what is wrong with encoding something as UTF-8 that the language has marked as being UTF-8? In your comment you mentioned EUC-JP and JPEG data. I can't see how either of these would have the UTF-8 flag set, so neither would be UTF-8 encoded by the code snippet. In practice, both CGI.pm and Catalyst have been using this method for years without any bug reports that I can see in their bug queues.
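To make the question concrete, here is a minimal sketch of the flag-checking pattern being discussed. The helper name `escape_if_flagged` is hypothetical, and the code is not copied from CGI.pm or Catalyst; it only assumes the core functions `utf8::is_utf8`, `utf8::encode`, and `utf8::upgrade`.

```perl
use strict;
use warnings;

# Hypothetical helper illustrating the flag-checking pattern under discussion.
# Not the actual CGI.pm or Catalyst code.
sub escape_if_flagged {
    my ($str) = @_;
    return unless defined $str;
    # Encode to UTF-8 octets only when Perl's internal UTF-8 flag is set.
    utf8::encode($str) if utf8::is_utf8($str);
    # Percent-encode everything outside the RFC 3986 unreserved set.
    $str =~ s/([^A-Za-z0-9\-._~])/sprintf("%%%02X", ord($1))/ge;
    return $str;
}

my $bytes    = "\xFF\xD8\xFF";   # raw octets (e.g. the start of a JPEG); flag off
my $wide     = "\x{263A}";       # character above 0xFF; flag on
my $latin1   = "\xE9";           # character below 0x100; flag off
my $upgraded = $latin1;
utf8::upgrade($upgraded);        # same string value, but flag now on

print escape_if_flagged($bytes),    "\n";   # %FF%D8%FF
print escape_if_flagged($wide),     "\n";   # %E2%98%BA
print escape_if_flagged($latin1),   "\n";   # %E9
print escape_if_flagged($upgraded), "\n";   # %C3%A9
```

As expected, the raw octets pass through without UTF-8 encoding. The last two lines show the case the best-practice advice seems aimed at: `$latin1` and `$upgraded` compare equal with `eq`, yet they escape differently because only the internal representation differs.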
Also, do you think it is appropriate for a URI percent-encoding solution to handle UTF-16 surrogate pairs? CGI.pm and URI::Escape::XS do this (using code from the same author).
Thanks for the feedback. I updated my post to note which modules use the code you suggest is buggy (CGI::Util and Mojo::Util). The W3C once clarified that the first step of URI percent-encoding should be converting the data to UTF-8. They spell out that step here: http://www.w3.org/International/O-URL-code.html But, judging by the warning at the top of that page, the advice may not be completely current. Would you say, then, that the current URI::Escape approach is best: providing one function that encodes arbitrary data, and one that first encodes the data as UTF-8 (regardless of the state of the UTF-8 flag)?
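A rough sketch of that two-function design, using only core Perl. URI::Escape exposes the idea as `uri_escape` and `uri_escape_utf8`; the helpers `escape_bytes` and `escape_utf8` below are hypothetical stand-ins for illustration, not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the two-function design; not URI::Escape's code.

# Percent-encode a string of octets exactly as given.
sub escape_bytes {
    my ($octets) = @_;
    # Characters above 0xFF cannot be octets, so refuse them outright.
    die "wide character in octet string\n" if $octets =~ /[^\x00-\xFF]/;
    $octets =~ s/([^A-Za-z0-9\-._~])/sprintf("%%%02X", ord($1))/ge;
    return $octets;
}

# Treat the input as characters: always encode to UTF-8 first,
# without consulting the internal UTF-8 flag.
sub escape_utf8 {
    my ($chars) = @_;
    utf8::encode($chars);
    return escape_bytes($chars);
}

my $plain    = "\xE9";     # internal flag off
my $upgraded = "\xE9";
utf8::upgrade($upgraded);  # same string value, internal flag on

print escape_utf8($plain),     "\n";   # %C3%A9
print escape_utf8($upgraded),  "\n";   # %C3%A9
print escape_bytes($plain),    "\n";   # %E9
print escape_bytes($upgraded), "\n";   # %E9
```

With this split, the caller decides whether the data is characters or octets, and the result no longer depends on how Perl happens to store the string internally.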
Mark added a favorite at bulknews.typepad.com
Dec 30, 2009
Mark is now following The Typepad Team
Dec 3, 2009