
RE: another take on hackers and painters



At 2:26 PM -0400 5/21/03, Anton van Straaten wrote:
>Dan Sugalski wrote:
>
>>  At 9:06 AM -0500 5/21/03, Matt Hellige wrote:
>...
>>  >Wow... That just makes me cringe... And sets off my "perl proximity"
>>  >alarm. By the same token, why not have 1 + "3" produce the string "13"?
>>
>>  It could, sure, but + generally notes addition, and both sides *are*
>>  numbers, after all. Why not coerce to types based on value? Seems the
>>  logical extension to dynamic typing to me.
>
>You say "both sides *are* numbers, after all" - hmmm, methinks you've been
>using that language for too long!  ;oP

You say that like it's a bad thing... :-P

>As an aside, if I were forced to pick a rule for implicit coercion, I'd say
>that the type of the first (leftmost) value should determine the type of the
>coercion: so 1 + "3" would be 4, but "1" + 3 would be "13".

At this point, having had the interesting coolness that is proper 
multimethods pushed on me, I'd have to disagree. The types of both 
sides should be taken into account. You wouldn't, after all, want 
these:

    1 + 1.4
    1.4 + 1

to produce different answers just because the first has an integer on 
the LHS and the second a float.
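
A rough sketch of what taking both sides into account might look like, in Python purely for illustration. The `add` function and its `RULES` table are made-up names, not any real language's dispatch mechanism:

```python
# Hypothetical two-argument dispatch table: the rule is chosen from the
# types of BOTH operands, not just the leftmost one.
RULES = {
    (int, int): lambda a, b: a + b,
    (int, float): lambda a, b: float(a) + b,
    (float, int): lambda a, b: a + float(b),
    (float, float): lambda a, b: a + b,
}


def add(a, b):
    """Look up the addition rule by the types of both operands."""
    key = (type(a), type(b))
    if key not in RULES:
        raise TypeError(f"no rule for {key[0].__name__} + {key[1].__name__}")
    return RULES[key](a, b)


# Operand order does not change the answer.
print(add(1, 1.4) == add(1.4, 1))
```

With a leftmost-wins rule the two calls would go through different conversions; here both land on float addition.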

>   But I'd rather
>be more explicit about my intent,

You have been, though. Or, rather, you've been as explicit as you can 
manage, and are coming against a semantic ambiguity in your language. 
If your language wasn't ambiguous, the meaning would be clear and 
wouldn't have to be inferred based on the types of the arguments.

This is as much an argument against + doing double duty as addition 
and string concatenation as it is against automatic type coercion. 
(Well, that's *my* argument at least :)

>  because:
>
>How do you (or the language) know that "3" is a number?

The same way it knows, when it reads in data the user has entered from 
the keyboard or that the program has read from a file, that "3" is a 
number. What, after all, is the difference between them? Either way, 
it's ASCII 51.

If we've defined "+" to represent addition, then there should be no 
difference between

    1 + "2"

and

    1 + atof("2")

or whatever your string-to-number conversion is. We're adding, we 
need two numbers, so doing type coercion of the string to a number is 
a reasonable thing to do.
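
In sketch form, with Python's `float()` standing in for atof and a made-up `plus` function standing in for a "+" that only ever means addition (both names are assumptions for illustration):

```python
def to_num(x):
    """Coerce a string to a number; pass anything else through untouched."""
    if isinstance(x, str):
        # Stand-in for atof. Unlike C's atof, float() raises ValueError
        # on non-numeric input instead of quietly returning 0.
        return float(x)
    return x


def plus(a, b):
    """A '+' that only means addition: both operands are coerced first."""
    return to_num(a) + to_num(b)


# The implicit and the explicit conversion give the same result.
print(plus(1, "2") == plus(1, float("2")))
```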

>Perl has a rule for inferring the type of numbers contained in
>strings.  Only in that sense, can the string "3" be said to "be" a number.

And it's not an unreasonable thing. You may choose not to define it 
in the semantics of a language you're creating, but it's still a 
reasonable thing. Languages that don't provide this sort of facility, 
at least in some form, end up leading to the nasty hack of iterating 
through the string character by character and building up the 
resulting number. Yech.

>This highlights one of the major things types are supposed to do in the
>first place: they give context to values, in the same sort of way that
>physical quantities have units like meters or grams.

I'd argue instead that types provide meta-information about the data. 
That meta-information provides some context, yes, but non-type 
meta-information can provide it, as can the actual data itself.

Granted, it's one extra step away from static typing--not only do you 
not know the type at compile time, you don't know at runtime, and 
have to actually look at the data--which isn't appropriate in all 
cases, but it is a powerful thing. Perl takes it to one extreme, 
certainly, but it isn't alone, as there are languages that do similar 
things with values as part of doing multimethod dispatch.

What, after all, is the difference between perl's automatic coercion 
and having some MM-defined overloaded operators that have signatures 
like:

     operator +(looks_like_a_number(string a), num b);
     operator +(num a, looks_like_a_number(string b));

?

(Though I know some folks dislike the thought of value checks as part 
of MM dispatch, or MM dispatch itself, which is fine)
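
A minimal sketch of dispatch with value checks, in Python for illustration only. `PLUS_METHODS`, `is_num`, and this `plus` are hypothetical names, and the linear scan is just the simplest way to show the idea, not how a real multimethod system would do it:

```python
from numbers import Number


def looks_like_a_number(s):
    """Value check: True if s is a string that parses as a number."""
    if not isinstance(s, str):
        return False
    try:
        float(s)
    except ValueError:
        return False
    return True


def is_num(x):
    """Type check: a real number (bool excluded on purpose)."""
    return isinstance(x, Number) and not isinstance(x, bool)


# Each entry: (check for left operand, check for right operand, implementation).
PLUS_METHODS = [
    (is_num, is_num, lambda a, b: a + b),
    (looks_like_a_number, is_num, lambda a, b: float(a) + b),
    (is_num, looks_like_a_number, lambda a, b: a + float(b)),
]


def plus(a, b):
    """Run the first method whose signature matches both operands."""
    for left_ok, right_ok, impl in PLUS_METHODS:
        if left_ok(a) and right_ok(b):
            return impl(a, b)
    raise TypeError(f"no applicable method for {a!r} + {b!r}")


print(plus(1, "3"), plus("1", 3))
```

A string that doesn't look like a number matches no signature, so the call fails loudly rather than guessing.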

>When dealing with objects in the OO sense, even dynamically typed languages
>respect this issue: values are tagged with their type, and if you make a
>type error, the language implementation will tell you.  What is it about
>simple values like numbers and strings that makes them exempt from this
>issue?

Nothing, certainly. But why are you thinking this is some sort of 
conceptual violation? Addition would, in a fully-OO language, call 
the gimme_your_num_value method on the left and right sides. It 
wouldn't at all be out of line for the String class's version of that 
to return the numeric value you'd get if you turned the string into a 
number.
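
Sketched in Python, assuming two toy classes (`Num` and `String` here are illustrative, not any real runtime's types) that both answer the gimme_your_num_value method:

```python
class Num:
    """Toy numeric value object."""

    def __init__(self, value):
        self.value = value

    def gimme_your_num_value(self):
        return self.value


class String:
    """Toy string object that knows how to present itself as a number."""

    def __init__(self, text):
        self.text = text

    def gimme_your_num_value(self):
        # The String class's own conversion routine.
        return float(self.text)


def add(left, right):
    """Addition asks each side for its numeric value, then adds those."""
    return Num(left.gimme_your_num_value() + right.gimme_your_num_value())


print(add(Num(1), String("3")).value)
```

The addition code never inspects the operand types itself; each class decides what its numeric value is.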

>I think the answer, from a high level perspective, is that languages that
>use duck typing are attempting to use some very simple heuristics to
>simulate the human ability to give meaning to values based on context - the
>whole DWIM thing.

A perfectly reasonable thing to do in many circumstances. 
Unreasonable in others, in which case you shouldn't do that. It 
really isn't duck typing, not conceptually. It's just a set of 
classes with a rich set of conversion routines, certainly not an 
unreasonable thing.

>The problem with this, at least in this case, is that it trains people to
>think loosely about types, and then they end up debugging the indirect
>consequences of their typing errors.  I'm basing this on some experience:
>I've seen programmers with prior Perl experience make these exact mistakes -
>for example, testing values as strings which should be tested as numeric,
>causing incorrect program behavior when the string wasn't formatted as
>expected.

Erm, this argument, boiled down, turns into "people make logic errors 
when writing perl, therefore perl makes people make logic errors". 
Sure, perl's autoconversion can bite you, but so can checking objects' 
capabilities at runtime by looking for methods by name, containers 
that are strictly typed, or the assumption that all strings are a 
sequence of non-zero 8-bit characters with a null terminator.

It may well be something that you, personally, aren't comfortable 
with. That's fine. It's certainly something that people do, on 
occasion, mess up, and that's fine too. (I sometimes swap < and >, 
but that's not an argument for eliminating one or the other.) It's a 
language behaviour that is, in some circumstances, very useful, just 
like almost any other language behaviour. Any feature may render a 
language sub-optimal for some use or other, but no language is 
universally appropriate.
-- 
                                         Dan

--------------------------------------"it's like this"-------------------
Dan Sugalski                          even samurai
dan@sidhe.org                         have teddy bears and even
                                       teddy bears get drunk