[Imap-uw] utf8_mime2text doesn't decode QP correctly?

Mark Crispin mrc at CAC.Washington.EDU
Thu Oct 6 11:12:55 PDT 2005


On Thu, 6 Oct 2005, Tim Mooney wrote:
>> In conclusion, the problem is with the entity that generated that From 
>> address, not with c-client.
> I'm not disagreeing, but what about being "... generous in what you
> accept"?

My position is:

This argument is based upon a terrible misunderstanding of Jon Postel's 
robustness principle: "be liberal in what you accept, be conservative in 
what you generate."

I knew Jon personally.  Although I can't speak for Jon, and sadly Jon is 
no longer with us, I am confident that he never intended it to be used in 
cases like this.

Jon's point was that ARPAnet (later Internet) protocols had many 
facilities, but only a subset of these facilities were commonly used. 
This had caused interoperability problems when one implementation depended 
upon a facility that another implementation did not implement.  TELNET, 
FTP, and the early email protocols suffered greatly from such problems.

Thus, "be conservative" by eschewing the facilities that aren't commonly 
used; but "be liberal" by implementing all the facilities described in the 
base specification even if they seem silly/useless/meaningless.

As an example of the Postel Principle, it was perfectly reasonable in RFCs 
733 and 822 to have a From header looking like:
 	From: Joe Mooch <(personal) joe (mooch) @ (company) example.com >
The c-client library (and hence Pine) is liberal and will interpret it
correctly as being equivalent to:
 	From: Joe Mooch <joe@example.com>
but many other MUAs will not!
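
To make the liberal reading above concrete, here is a minimal Python sketch
that drops RFC 822 comments (parenthesized text, which may nest) and the
whitespace between tokens from the text inside the angle brackets.  It is
not c-client's code; the function name strip_comments_and_space is made up
for this illustration, and the sketch ignores the rest of the address
grammar.

```python
def strip_comments_and_space(addr: str) -> str:
    """Remove RFC 822 comments and inter-token whitespace (illustrative only)."""
    out = []
    depth = 0          # current nesting level of "(" ... ")" comments
    in_quote = False   # inside a "quoted string"
    i = 0
    while i < len(addr):
        ch = addr[i]
        # A backslash quotes the next character inside comments and quoted strings.
        if ch == "\\" and i + 1 < len(addr) and (depth > 0 or in_quote):
            if depth == 0:
                out.append(addr[i:i + 2])  # keep quoted-pairs in quoted strings
            i += 2
            continue
        if in_quote:
            out.append(ch)
            if ch == '"':
                in_quote = False
        elif depth > 0:
            # Inside a comment: track nesting, emit nothing.
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
        elif ch == "(":
            depth = 1
        elif ch == '"':
            in_quote = True
            out.append(ch)
        elif not ch.isspace():
            out.append(ch)
        i += 1
    return "".join(out)


print(strip_comments_and_space("(personal) joe (mooch) @ (company) example.com "))
```

Run on the contents of the angle brackets in the first From header, this
prints joe@example.com, the same address as in the second form.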

Incidentally, RFC 2822 threw in the towel and outlawed sending the former. 
So, there is no longer a stated need for MUAs to "be liberal" and accept 
that form.  But some MUAs still do so out of respect for the past.

The abuse of the Postel Principle is its extension to "be liberal" by 
accepting out-of-specification protocol, even in those cases when the 
situation has always been explicitly prohibited by the specification.

This abuse encourages the practice of browbeating standards-compliant 
implementations to be non-compliant because some other implementation is 
non-compliant: "blurdybloop works with it, so your software is broken."

It also creates long-term interoperability problems; nobody really knows 
what the standard is.  A new implementor cannot implement from the 
specification because much of the standard is in folklore that is not in 
any written specification.

It also causes harm.  Over the past 30 years, we've had numerous sad 
examples in which correct protocol was "repaired" into bogusity by 
well-intentioned software that incorrectly thought "it didn't really mean 
that, it meant this other thing, I'll fix it."  Email protocols in 
particular have suffered greatly because of this problem, which was 
largely brought on by the lack of rigor in the early specifications.

The IMAP protocol's strict syntax rules were a reaction to this problem. 
IMAP tries very hard to have just One Right Way That Everybody Must Obey. 
It occasionally has faltered in enforcement; and goodness knows people 
have complained about IMAP being so strict.  Nevertheless IMAP has had 
much better interoperability than had previously existed in email 
protocols.

> There is a lot of software that generates QP incorrectly in some cases;
> it would be really nice if c-client (and pine, by extension) would work
> around the deficiencies of that other software -- assuming it's not
> terribly onerous to do so.

In my opinion, it is onerous for the following reasons:
  1) (Multiple) slippery slope argument(s):
     a) standard undocumented in any specification
     b) repetitive fixing of bad effects of previous fixes
     c) "you did that, so you should do this too"... :-(
     d) etc.
  2) It introduces a bug: valid, standard-compliant data *will* be
     misinterpreted, and when that happens, the aggrieved party will
     rightfully say that it's c-client's fault.
  3) It's additional work; doing this means that the code can no longer use
     the rule, explicitly stated in RFC 2047, that an encoded-word is an
     RFC 2822 atom.
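
Point 3 can be illustrated with a small Python sketch of a strict decoder.
It assumes that a header value can be split on whitespace and that a piece
is an encoded-word only when the whole piece has the form
=?charset?encoding?encoded-text?= with no embedded whitespace.  This is not
c-client's utf8_mime2text; the names decode_strict and _decode_q are
hypothetical, the sample headers are invented, and the RFC 2047 rule about
whitespace between adjacent encoded-words is left out.

```python
import base64
import re

# Strict shape: no whitespace or "?" inside the charset or the encoded text.
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([BbQq])\?([^?\s]+)\?=")
HEX = "0123456789ABCDEFabcdef"


def _decode_q(text: str) -> bytes:
    """Decode the "Q" encoding: "_" is a space, "=XX" is a hex-escaped byte."""
    raw = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "_":
            raw.append(0x20)
            i += 1
        elif ch == "=":
            pair = text[i + 1:i + 3]
            if len(pair) != 2 or pair[0] not in HEX or pair[1] not in HEX:
                raise ValueError("bad hex escape")
            raw.append(int(pair, 16))
            i += 3
        else:
            raw.append(ord(ch))  # raises ValueError for non-byte characters
            i += 1
    return bytes(raw)


def decode_strict(header_value: str) -> str:
    """Decode only whitespace-delimited pieces that are whole encoded-words."""
    out = []
    for piece in re.split(r"(\s+)", header_value):
        m = ENCODED_WORD.fullmatch(piece)
        if m is None:
            out.append(piece)  # not an encoded-word: pass through untouched
            continue
        charset, enc, text = m.groups()
        try:
            if enc in "Qq":
                raw = _decode_q(text)
            else:
                raw = base64.b64decode(text, validate=True)
            out.append(raw.decode(charset))
        except (ValueError, LookupError):
            out.append(piece)  # malformed data or unknown charset: leave as is
    return "".join(out)


print(decode_strict("=?ISO-8859-1?Q?Tim_Mooney?= <tim@example.com>"))
print(decode_strict("=?ISO-8859-1?Q?Tim Mooney?= <tim@example.com>"))
```

The first call prints Tim Mooney <tim@example.com>.  The second input has a
raw space inside the would-be encoded-word, so neither whitespace-delimited
piece matches and the header comes back unchanged.  Accepting that second
form would mean giving up the simple split-then-match structure, which is
the additional work point 3 refers to.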

Your mileage may vary... :-)

-- Mark --

http://panda.com/mrc
Democracy is two wolves and a sheep deciding what to eat for lunch.
Liberty is a well-armed sheep contesting the vote.

