Does any language have this feature?
Oct. 12th, 2007 10:08 am
Many languages that have references (pointers, whatever) have a distinguished null value that refers to "nowhere": None, a null pointer, undef etc. Dereferencing this value produces some kind of runtime error (a segmentation fault or an exception, for instance), so you have to check every possible use of any reference that can ever be null. (In some cases it's acceptable to take the exception or even the crash, but often it is not.)
What I'd like is a distinction between optional and mandatory references. An optional reference can be null, but the compiler won't let you dereference it. A mandatory reference is never null, and the compiler won't let you do anything that makes it null. Finally, all references would be either null or safe to dereference and no constructions which violated this would be permitted.
Then, there'd be some convenient way to get the compiler to do the check and let you dereference only known-to-be-safe optional pointers.
In something very C-like, you might have a new qualifier, optional, which applies to pointers only, and a with construction to safely strip the qualifier:
int *optional op = 0; /* optional pointer to int */
int *optional uop; /* default-initialized to 0 */
int x, *mp = &x; /* mandatory pointer to int */
int *ump; /* error: must be initialized */
*mp; /* OK */
mp = op; /* error: cannot assign optional to mandatory */
mp = 0; /* error: mandatory pointer cannot be null */
*op; /* error: cannot dereference optional pointer */
op = mp; /* OK */
op = 0; /* OK */
with(op) {
    /* op behaves as if declared 'int *op' within this block */
    *op; /* OK */
} else { /* else part is optional */
    assert(op == NULL); /* never fails */
}
(There'd also be some obvious rules about initialization but they are not the interesting bit.)
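For comparison, part of this can be approximated as a library in C++17. The following is only a rough sketch under stated assumptions, not the language feature described above: Mandatory, Optional, with() and value_or() are made-up names, and nothing stops someone from going behind the wrappers with raw pointers.

```cpp
// Hypothetical wrapper: a pointer that is never null once constructed.
// It has no default constructor, so "Mandatory<int> ump;" does not compile.
template <typename T>
class Mandatory {
public:
    explicit Mandatory(T &target) : p_(&target) {}  // built from a reference
    T &operator*() const { return *p_; }            // always safe
    T *get() const { return p_; }
private:
    T *p_;
};

// Hypothetical wrapper: may be null, and deliberately has no operator*.
template <typename T>
class Optional {
public:
    Optional() : p_(nullptr) {}                 // default-initialized to null
    Optional(Mandatory<T> m) : p_(m.get()) {}   // "op = mp" is allowed
    bool is_null() const { return p_ == nullptr; }

    // Stand-in for the with construction: the null check happens here, once,
    // and the first branch is handed a Mandatory<T>.
    template <typename Then, typename Else>
    auto with(Then then_branch, Else else_branch) const {
        if (p_ != nullptr)
            return then_branch(Mandatory<T>(*p_));
        return else_branch();
    }
private:
    T *p_;
};

// Example use: read through an optional pointer, or fall back to a default.
inline int value_or(const Optional<int> &op, int fallback) {
    return op.with(
        [](Mandatory<int> mp) { return *mp; },
        [fallback]() { return fallback; });
}
```

With these types, assigning an Optional to a Mandatory or writing *op on an Optional is rejected by the compiler, which is the part of the idea a library can capture.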
(no subject)
Date: 2007-10-12 09:36 am (UTC)
(no subject)
Date: 2007-10-12 09:48 am (UTC)
(no subject)
Date: 2007-10-12 09:50 am (UTC)
(no subject)
Date: 2007-10-12 09:58 am (UTC)
(no subject)
Date: 2007-10-12 10:03 am (UTC)
sw:unset sw:enumerated_format_mixin sw:magikc_objects_mixin sw:object sw:unset_mixin
and all the associated method table etc. you'd expect. The important method on this is default(), which takes one argument. On object (and everything that inherits from it) default() returns _self; on unset it returns its argument. So the first thing you often do in a method that might be passed unset as some of its arguments is
(no subject)
Date: 2007-10-12 10:09 am (UTC)
I'm sure there's an OO language with non-nullable references but my memory fails me...
(no subject)
Date: 2007-10-12 10:16 am (UTC)
I'm actually slightly horrified that C++ will let you overload that sort of thing.
(no subject)
Date: 2007-10-12 10:21 am (UTC)
(no subject)
Date: 2007-10-12 10:48 am (UTC)
&my_phony_variable as a second special pointer value.
So perhaps we should solve both these problems by defining a basic "pointer" type in such a way that you can't ever create a value of that type which is null; then we'd also provide Haskell-like types with alternatives, along the lines of
datatype NullablePointer = NULL | PointsTo(char *);
That lets you distinguish between a mandatory and optional reference, and it also lets me define
datatype PointerOrThreeErrorIndicators = Error1 | Error2 | Error3 | PointsTo(char *);
for my special purposes.
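C++17 can express roughly this shape with std::variant. A minimal sketch, assuming const char * in place of char * so that string literals can be used, and with describe() as a hypothetical helper that is not part of the proposal:

```cpp
#include <string>
#include <variant>

// One empty struct per error indicator, plus the one case that carries a pointer.
struct Error1 {};
struct Error2 {};
struct Error3 {};
struct PointsTo { const char *p; };  // by convention, never built from a null pointer

using PointerOrThreeErrorIndicators =
    std::variant<Error1, Error2, Error3, PointsTo>;

// Hypothetical helper: the pointer is only reachable through the PointsTo case.
inline std::string describe(const PointerOrThreeErrorIndicators &v) {
    if (std::holds_alternative<Error1>(v)) return "error 1";
    if (std::holds_alternative<Error2>(v)) return "error 2";
    if (std::holds_alternative<Error3>(v)) return "error 3";
    return std::string("points to: ") + std::get<PointsTo>(v).p;
}
```

Unlike a real algebraic datatype, nothing here stops PointsTo from being constructed with a null pointer; that part remains a convention.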
(no subject)
Date: 2007-10-12 10:58 am (UTC)
(no subject)
Date: 2007-10-12 11:01 am (UTC)
int * ptr = 0;
int &ref = *ptr;
.. which then blows up on dereferencing ref. This is unhelpful.
(no subject)
Date: 2007-10-12 11:06 am (UTC)
(no subject)
Date: 2007-10-12 11:37 am (UTC)
(no subject)
Date: 2007-10-12 11:57 am (UTC)
Private Sub FrobFoo(ByRef theFoo As Foo, Optional howToFrob As FrobType)
    If IsMissing(howToFrob) Then
        DefaultFrob theFoo
    Else
        Select Case howToFrob
            Case ...
        End Select
    End If
End Sub
[NB: I haven't touched VB itself in eight years or so, but this is how I remember it working.]
I also heartily second
[Sorry for the comment-spam, forgot about formatting the first time!]
(no subject)
Date: 2007-10-12 12:04 pm (UTC)
(no subject)
Date: 2007-10-12 12:16 pm (UTC)
(no subject)
Date: 2007-10-12 12:37 pm (UTC)
(no subject)
Date: 2007-10-12 01:05 pm (UTC)
C# has "nullable" types, though, which have some similar semantics -- any use in a context where the corresponding non-nullable type is required will result in a compile-time error. I haven't worked with C# much yet though, so am not entirely sure of the semantics apart from that they seemed sensible when I first encountered them!
(no subject)
Date: 2007-10-12 02:00 pm (UTC)
(no subject)
Date: 2007-10-12 02:12 pm (UTC)
Ah, yes. I remember the annoyance I encountered when trying to do this with a special string in Java.
Can't use a constant string (because the language guarantees that all instances of the same constant string are the same object), and the bug-checking plugin we had didn't like instantiating a new String(), giving a warning if I tried (roughly, "just go ahead and use a constant empty string; it's the same thing for nearly all intents and purposes").
(no subject)
Date: 2007-10-12 02:16 pm (UTC)
(no subject)
Date: 2007-10-12 02:47 pm (UTC)
In our codebase we have a lot of trouble with boost::shared_ptr: when they are passed between code modules it's unclear whether they're allowed to be null or not. A non-null shared_ptr would be a nice type.
But, then again, there are lots of constraints on values it would be nice to be able to have checked automatically; null pointers are just one of the more common examples.
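A thin wrapper gets some of the way there. The sketch below uses std::shared_ptr rather than boost::shared_ptr, and NonNullShared and read_value() are hypothetical names; the null check happens once at run time when the wrapper is built, not at compile time.

```cpp
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical wrapper: a shared_ptr that is checked once, at construction.
template <typename T>
class NonNullShared {
public:
    explicit NonNullShared(std::shared_ptr<T> p) : p_(std::move(p)) {
        if (!p_) throw std::invalid_argument("NonNullShared given a null pointer");
    }
    T &operator*() const { return *p_; }
    T *operator->() const { return p_.get(); }
    const std::shared_ptr<T> &get() const { return p_; }  // for nullable APIs
private:
    std::shared_ptr<T> p_;
};

// A module boundary can now say in its signature that null is not allowed.
inline int read_value(const NonNullShared<int> &p) { return *p; }
```

One known gap in this sketch: a moved-from NonNullShared is left holding an empty shared_ptr, so the guarantee only covers objects that have not been moved from.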
(no subject)
Date: 2007-10-12 03:57 pm (UTC)
I like this idea. I'm not sure how one would do it in the C-like model above though - you'd have to define entirely new pointer-like types.
Another important kind of undereferenceable pointer is the pointer just beyond the end of an array. This (and its generalization to iterators) is a useful thing to be able to think about, as it lets you use half-open intervals to refer to (sub-)ranges of arrays. Anything that supported multiple invalid values should support it fairly easily.
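As a small illustration of the half-open convention (the function names here are made up for the example): the end pointer is compared against but never dereferenced, and an empty range is simply first == last.

```cpp
#include <cstddef>

// Sum the half-open range [first, last): last is compared, never dereferenced.
inline int sum_range(const int *first, const int *last) {
    int total = 0;
    for (const int *p = first; p != last; ++p)
        total += *p;
    return total;
}

// Hypothetical helper: sum elements [from, to) of an array, assuming
// from <= to <= length of the array. a + length is a valid end pointer.
inline int sum_slice(const int *a, std::size_t from, std::size_t to) {
    return sum_range(a + from, a + to);
}
```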
(Other bad pointers include ones that you've freed, or pointers to auto variables that have gone out of scope. I think garbage collection is the underlying answer here.)
(no subject)
Date: 2007-10-12 04:13 pm (UTC)
So if any other part of the code has a pointer to X, it's now semantically stale even if not pointer-dereferencingly stale: making use of that pointer is a bug, and having the GC environment conscientiously preserve X so that such code can accidentally shovel things into X and not notice is not a sensible move. I want to specifically mark X as finished with, so that any part of the program which subsequently tries to put things into it will throw an exception that I can debug.
Admittedly, the usual C approach of declaring X finished with and having future attempts to modify it lead to subtle undefined behaviour is not optimal either. But out of the two options above, I'd pick the non-garbage-collected one every time, because I prefer my misbehaviour unsubtle so I can spot it and fix it easily.
And that's without even getting into the question of what happens when X is not merely data but contains handles to some sort of external I/O; you certainly want to close output files (for example) explicitly rather than relying on the GC to get round to it at some point, and after you've closed them the last thing you want is for the data structure that described them to be "helpfully" left around.
Garbage collection is at its best when dealing with non-mutable data, because in that situation there's no semantic repercussion at all. So purely functional languages are its absolute sweet spot. But for normal procedural code with mutable data? Call me a C programmer if you must, but I've never been convinced.
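The "mark X as finished with, and have later use throw" behaviour can be sketched independently of how the memory is managed. This is only an illustration: Collector and finish() are hypothetical names standing in for X, and the object itself has to stay allocated for the check to be meaningful.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical container that can be explicitly marked as finished with.
class Collector {
public:
    void add(int value) {
        check_open();
        items_.push_back(value);
    }
    std::size_t size() const {
        check_open();
        return items_.size();
    }
    // After this call, every further use is reported instead of accepted.
    void finish() {
        items_.clear();
        finished_ = true;
    }
private:
    void check_open() const {
        if (finished_) throw std::logic_error("Collector used after finish()");
    }
    std::vector<int> items_;
    bool finished_ = false;
};
```

Any stale pointer to a finished Collector then fails loudly on its next add(), rather than quietly shovelling data into a dead structure.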
(no subject)
Date: 2007-10-12 04:34 pm (UTC)
I like the C++ RAII model, and how it works so nicely when scope = lifetime. I'd like to see a good compromise between that and GC to deal with more complicated lifetimes.
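For anyone who hasn't met it, the scope = lifetime case looks roughly like this. ScopedResource and run_scoped_demo() are made-up names, and the log vector just stands in for a real resource such as a file or lock.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical resource: records when it is acquired and when it is released.
class ScopedResource {
public:
    ScopedResource(std::string name, std::vector<std::string> &log)
        : name_(std::move(name)), log_(log) {
        log_.push_back("open " + name_);
    }
    ~ScopedResource() { log_.push_back("close " + name_); }  // runs at end of scope
    ScopedResource(const ScopedResource &) = delete;
    ScopedResource &operator=(const ScopedResource &) = delete;
private:
    std::string name_;
    std::vector<std::string> &log_;
};

// Resources are released in reverse order of acquisition, with no explicit
// close call anywhere in the function.
inline std::vector<std::string> run_scoped_demo() {
    std::vector<std::string> log;
    {
        ScopedResource a("a", log);
        ScopedResource b("b", log);
    }
    return log;
}
```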
Stoopid Question from A VB developer...
Date: 2007-10-12 05:34 pm (UTC)
(no subject)
Date: 2007-10-12 07:31 pm (UTC)
http://nice.sourceforge.net/safety.html has some details.
Re: Stoopid Question from A VB developer...
Date: 2007-10-12 08:16 pm (UTC)
int *p = 0;
printf("%d\n", *p);
Any C compiler will happily accept this code, which will coredump when I run the resulting executable.
Re: Stoopid Question from A VB developer...
Date: 2007-10-12 10:09 pm (UTC)
Re: Stoopid Question from A VB developer...
Date: 2007-10-12 10:14 pm (UTC)
(no subject)
Date: 2007-10-13 04:18 pm (UTC)
(no subject)
Date: 2007-10-13 05:35 pm (UTC)
close() and future accesses to the thing are going to give an error anyway, you might as well garbage-collect it now rather than waiting for any remaining stale pointers to go away, because you don't want memory to be retained long-term based on pointers that will (must) never be accessed again merely because they still happen to exist. The obvious way to do that is to have close() also be free().
I suppose you could just mark it as freeable and wait for the next run of a garbage collector to collect it along with everything else, if you really wanted to.
(no subject)
Date: 2007-10-13 10:38 pm (UTC)
Re: Stoopid Question from A VB developer...
Date: 2007-10-19 03:56 am (UTC)
There. Fixed the coredump.
Re: Stoopid Question from A VB developer...
Date: 2007-10-19 04:00 am (UTC)
Re: Stoopid Question from A VB developer...
Date: 2007-10-19 08:01 am (UTC)