[Alpine-info] URL detection doesn't think accented characters are part of the URL
mbmiller+l at gmail.com
Thu Mar 3 15:54:34 PST 2011
On Thu, 3 Mar 2011, Eduardo Chappa wrote:
> On Thu, 3 Mar 2011, Andreas Schamanek wrote:
> :) On Thu, 3 Mar 2011, at 11:52, Eduardo Chappa wrote:
> :) > the correct way to do it was encoding non-ascii, as below.
> :) > http://en.wikipedia.org/wiki/Tomato_pur%E9e
> :) I don't think so. See http://en.wikipedia.org/wiki/URL_encoding
> Well, this is an interesting way to reply. My comment was in regards to
> the past, not to the present. I made a comment that was true in the past,
> at least previous to 2005. I did not mean by past "yesterday".
> Anyway, URIs still must be encoded, that has not changed. That this is
> displayed in a readable way to the user is another matter.
> And, just to complete the information, Alpine implements rfc 1738
> recognition, and this URL is not encoded according to that RFC.
I noticed that there were two suggestions for encoding:
http://en.wikipedia.org/wiki/Tomato_pur%E9e
http://en.wikipedia.org/wiki/Tomato_pur%C3%A9e
One interesting difference is that when I paste the first one into
chromium-browser it does not change. The second one changes back to the
original URL after pasting into the location bar. That is, the %C3%A9
part of it changes into an 'e' with an acute accent (é). I like that.
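The difference between the two forms comes down to which byte encoding is percent-escaped: %E9 is the single ISO-8859-1 byte for é, while %C3%A9 is its two-byte UTF-8 sequence. A minimal sketch using Python's standard urllib.parse, purely as an illustration (this is not what Alpine or Chromium does internally, and the variable names are made up for the example):

```python
from urllib.parse import quote, unquote

title = "Tomato_purée"

# Legacy form: escape the character as a single ISO-8859-1 (Latin-1) byte.
latin1_form = quote(title, encoding="latin-1")

# Modern form: escape the character as its two UTF-8 bytes.
utf8_form = quote(title, encoding="utf-8")

print(latin1_form)  # Tomato_pur%E9e
print(utf8_form)    # Tomato_pur%C3%A9e

# Decoding as UTF-8 recovers the accent only from the UTF-8 form.
print(unquote(utf8_form))  # Tomato_purée

# The Latin-1 form is not valid UTF-8, so it has to be decoded explicitly.
print(unquote(latin1_form, encoding="latin-1"))  # Tomato_purée
```

That would be consistent with the browser showing the accent for %C3%A9 but leaving %E9 untouched, assuming it only prettifies escapes that decode cleanly as UTF-8.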
Also, if I highlight and copy the URL from the browser location bar, then
paste it into Alpine (in an xterm), the %C3%A9 comes back to replace the
"é", which is convenient.
Regarding RFC 1738 -- do we really want to do it like this? What we
really want is for the URL to work as often as possible. We want Alpine
to work for "bad" URLs as well as "good" (RFC compliant) ones, don't we?
In the example we were given, when you look at the URL, you just know that
the é is part of it. The computer can't figure it out because it was
programmed badly, not because the computer's decision is superior to that
of the human user.
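To make the trade-off concrete, here is a rough sketch of strict versus lenient detection. Both patterns are hypothetical and written only for this example; they are not Alpine's actual recognition code, and the strict one is only an approximation of the RFC 1738 character set.

```python
import re
from urllib.parse import quote

# Hypothetical strict pattern: only ASCII characters that RFC 1738 allows
# unescaped in a URL, plus "%" so that existing escapes still match.
STRICT_URL = re.compile(r"https?://[A-Za-z0-9$\-_.+!*'(),;/?:@=&%]+")

# Hypothetical lenient pattern: anything up to whitespace or a delimiter,
# which lets accented characters through.
LENIENT_URL = re.compile(r'https?://[^\s<>"]+')

text = "See http://en.wikipedia.org/wiki/Tomato_purée for the recipe"

strict_match = STRICT_URL.search(text).group(0)
lenient_match = LENIENT_URL.search(text).group(0)

print(strict_match)   # http://en.wikipedia.org/wiki/Tomato_pur
print(lenient_match)  # http://en.wikipedia.org/wiki/Tomato_purée

# Once the whole URL is captured, it can still be escaped before use.
encoded = quote(lenient_match, safe=":/")
print(encoded)        # http://en.wikipedia.org/wiki/Tomato_pur%C3%A9e
```

The lenient pattern has its own cost: it will also swallow trailing punctuation such as a period or comma at the end of a sentence, so a real implementation would need some extra trimming. The quote() call here also assumes the URL contains no existing % escapes, since those would be escaped a second time.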