strings and NULL argument passing

**Jeff Schwab** · Nov 14 '08, 12:35 AM

Re: strings and NULL argument passing

Rolf Magnus wrote:

Jeff Schwab wrote:
>

>James Kanze wrote:

>>On Nov 13, 8:05 pm, Jeff Schwab <j...@schwabcenter.com> wrote:
>>>Rolf Magnus wrote:
>>>>sanjay wrote:

>[snipped code that (oops) initialized a std::string from NULL]

>>Compared to the rest of what the constructor has to do, I rather
>>suspect that the run-time cost of checking isn't measurable.

>Which constructor? std::string?

>
Yes. The memory allocation alone would outweigh that simple check by far.

For small strings, there may not even be any dynamic allocation.

>If the OP's std::string isn't detecting null initializers, the decision
>apparently was made by the implementor that the check was expensive
>enough to avoid.

>
The problem is that the C++ standard doesn't require it,

I'm not sure that's a "problem." I agree that it would be a good idea
for std::string to detect null initializers, though.

so I guess some
library implementor could think that there is no point in doing such a
check, since the user can't rely on it anyway.

Maybe, but the typical implementor isn't setting out to provide the
"lowest common denominator;" in fact, the opposite tends to be true.
Each implementor typically provides features the others don't, both as a
matter of QoI, and to achieve lock-in.

**James Kanze** · Nov 14 '08, 09:25 AM

Re: strings and NULL argument passing

On Nov 14, 12:02 am, Jeff Schwab <j...@schwabcenter.com> wrote:

James Kanze wrote:

On Nov 13, 8:05 pm, Jeff Schwab <j...@schwabcenter.com> wrote:

Rolf Magnus wrote:
>sanjay wrote:

[snipped code that (oops) initialized a std::string from NULL]

Overloading for char const* is a fine option. Using a
string type that performs the run-time check is also OK.
The problem with either of those approaches is that it
imposes the run-time check, even for C-style string
literals (e.g. "hi") whose type cannot be null, but which
decay to the pointer type.

Compared to the rest of what the constructor has to do, I
rather suspect that the run-time cost of checking isn't
measurable.

Which constructor? std::string? If the OP's std::string
isn't detecting null initializers, the decision apparently was
made by the implementor that the check was expensive enough to
avoid.

Or that it wasn't necessary, because the underlying system would
take care of it in the call to strlen (which dereferences the
pointer). If the system you're running on guarantees a core
dump or its equivalent in the case of a dereferenced null
pointer, you've got the check, even if you didn't want it:-).
If the library is only designed for use on Windows and Unix
based systems, there's no point in doing more.

It may also be possible (and is, in the OP's case) to avoid
the std::string altogether, by working directly with the
c-style string.

That's a different issue; if profiling shows that he's spending
too much time constructing the string, such an alternative
should surely be considered.

/* Print a C-style string literal. */
template<std::size_t Size>
void print(char const (&c_str)[Size]) {
    print(std::string( c_str ));

Or better yet:
print( std::string( c_str, Size - 1 ) ) ;

No need to count the characters if you already know how many
there are.

Nice catch.

Except as I go on to point out, it doesn't work:-(.

Of course, this fails if the string literal was "a\0b", or
something of the sort. It also doesn't work (but nor does
your suggestion) when interfacing with C (where all you've
got is a char const*).
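To make the difference concrete, here is a minimal sketch comparing the two constructions. The names `fromLiteral` and `fromPointer` are hypothetical, chosen only for this illustration. The array overload takes the length from the array type, so it keeps an embedded '\0' that the pointer version stops at; it would also report the full buffer size for a partly filled `char buf[100]`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: length comes from the array type, so no strlen call.
// Size includes the terminating '\0', hence Size - 1.
template <std::size_t Size>
std::string fromLiteral(char const (&c_str)[Size]) {
    return std::string(c_str, Size - 1);
}

// Hypothetical helper: length is found by scanning for the first '\0'.
// The pointer must not be null.
std::string fromPointer(char const* c_str) {
    return std::string(c_str);
}
```

With the literal "a\0b" the two helpers disagree about the length, which is the failure mode mentioned above.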

The template cannot even be declared in a C header, so it is
clearly not meant to be a C-language interface function. I'm
not sure why you bring that up; I don't see it as relevant
here. If the function is to be callable from C, it also
cannot be overloaded for char const* and std::string, since it
must be declared extern "C". This is possible only in C++.

I didn't mean that the function itself would be called from C.
I was wondering about the more general issue---how you handle a
string in the form of a char const* which you got from a C
interface.

Of course, the simplest and the safest is just to convert it to
an std::string immediately. So you're right that my comments
really aren't that relevant. Except that such interfaces could
easily be a source of null pointers.

/* Generate a compile time error for unacceptable types. */
template<typename String>
void print(String const& s) {
    s.is_not_of_an_acceptable_string_type();
}

As pointed out earlier, this trick (with some adaption)
could be used directly in std::string.
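For reference, the overload set under discussion might look roughly like the sketch below in more recent C++ (C++11 or later). This is an assumption-laden illustration, not the code from the thread: `lastPrinted` and `AlwaysFalse` are hypothetical names, and `static_assert` replaces the call to a nonexistent member function.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

std::string lastPrinted;  // hypothetical stand-in for real output

// Accepted: an actual std::string.
void print(std::string const& s) { lastPrinted = s; }

// Accepted: a char array such as a string literal; it cannot be null.
template <std::size_t Size>
void print(char const (&c_str)[Size]) {
    print(std::string(c_str, Size - 1));
}

// Hypothetical helper so the assertion fires only on instantiation.
template <typename T>
struct AlwaysFalse : std::false_type {};

// Everything else, including a plain char const*, is an exact match here
// and is therefore rejected at compile time.
template <typename String>
void print(String const&) {
    static_assert(AlwaysFalse<String>::value,
                  "print: argument is not an acceptable string type");
}
```

A call such as `print(p)` with `char const* p` selects the catch-all and fails to compile, so a null pointer can only arrive through an explicit `std::string(p)` written by the caller.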

It's not a panacea, however.

It is only (intended to be) an optimization of the run-time
code.

I thought that the original problem was to catch NULL pointers,
so that they wouldn't be used to create an std::string.

You really do have to support constructing strings from char
const*, which can be a null pointer, even if it isn't a
literal.

I don't have to support any such thing. Of course, if the
client writes something like print(std::string(0)), there's
not much I can do. There is a definite trade-off between
convenience and safety.

I was considering std::string, not this particular function.

(Of course, it can also be an invalid pointer, and there's
no way you can check for that.)

I don't know of a fool-proof way, but you can sometimes detect
nonsense if you control the memory allocation. You can check
that the pointer value is within (or outside) some range. You
can also catch a bunch of bugs by stomping on all deallocated
memory with a magic byte pattern (I like 0xDeadBeef) and
checking for that byte pattern in the addressed memory.
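A minimal sketch of the magic-pattern idea follows. It assumes a debugging deallocator would call something like `stomp` before giving memory back; `kDeadPattern`, `stomp` and `looksStomped` are hypothetical names. The check is only a heuristic, since live data can contain the same bytes, and it is only meaningful on memory the program still owns (for example a pool the debugging allocator keeps).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::uint32_t kDeadPattern = 0xDEADBEEF;  // hypothetical constant

// Fill a block with the pattern, one 32-bit word at a time.
// Trailing bytes that do not make up a whole word are left untouched.
void stomp(void* block, std::size_t bytes) {
    unsigned char* p = static_cast<unsigned char*>(block);
    for (std::size_t i = 0; i + sizeof kDeadPattern <= bytes;
         i += sizeof kDeadPattern) {
        std::memcpy(p + i, &kDeadPattern, sizeof kDeadPattern);
    }
}

// Heuristic: does the first word of the block hold the pattern?
bool looksStomped(void const* block, std::size_t bytes) {
    if (bytes < sizeof kDeadPattern) {
        return false;
    }
    std::uint32_t word;
    std::memcpy(&word, block, sizeof word);
    return word == kDeadPattern;
}
```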

Certainly. One's debugging operator new and operator delete
already take care of much of that. And potentially could do
more; under Solaris or Linux, for example, there are a number of
additional controls you could make on a pointer; something like:

bool
isValid( void const* userPtr )
{
    extern int end ;    // first address past the static data
    int stack ;         // a local, so its address lies on the stack
    return (char*)userPtr < (char*)&end         // static
        || (char*)userPtr > (char*)(&stack)     // stack
        || MemoryChecker::isAllocated( userPtr ) ;
}

The MemoryChecker::isAllocated can be as simple or as complex as
you want. (For the simplest solution, just drop this term, and
replace "(char*)&end" with "(char*)sbrk(0)" in the first term.
But you should be able to do better than that with a specialized
function, even at reasonable cost.)

I believe that similar system dependent solutions are possible
on most systems. (And this solution doesn't work in a
multithreaded environment, where you have more than one stack.)

You could (and probably should?) make the isValid function a
template, and verify pointer alignment as well.
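The alignment part could be sketched as below. `isAlignedFor` is a hypothetical name, and the test assumes that converting a pointer to `std::uintptr_t` yields its address as a plain integer, which holds on mainstream platforms but is implementation-defined.

```cpp
#include <cstdint>

// Hypothetical helper: true if p is suitably aligned for an object of type T.
// Assumes the uintptr_t value of a pointer is its numeric address.
template <typename T>
bool isAlignedFor(void const* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

A templated isValid could combine this with the range tests, rejecting for example an odd address passed as an `int const*`.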

In the end, of course, it's probably simpler and more effective
to just use Purify, valgrind or something similar. (In the
past, I developed a lot of such solutions, because Purify was
the only existing product, and it's expensive. Today, I doubt
that I'd bother, but I continue to use my older tools because my
test harnesses are built around them.)

--
James Kanze (GABI Software) email:james.kanze@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

**James Kanze** · Nov 14 '08, 09:35 AM

Re: strings and NULL argument passing

On Nov 14, 1:25 am, Rolf Magnus <ramag...@t-online.de> wrote:

James Kanze wrote:

It's undefined behavior. With a good implementation of
std::string, it won't compile.

Would it actually be allowed by the standard to have an
additional constructor in std::string?

That's a good question. If it doesn't affect the overload
resolution of legal calls to the constructor, I think so, if
only under the as-if rule. Thus, if as a library implementor, I
do something like:

namespace std {
    template< ... >
    class basic_string
    {
        // ...
    private:
        struct _Hidden {} ;
        basic_string( int _Hidden::*,
                      Allocator const& = Allocator() ) ;
    } ;
}

can a legal program detect the presence of the additional
constructor?
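The effect of such a hidden constructor can be tried out on a user-defined class without touching the library. In the sketch below, `CheckedString` and `Hidden` are hypothetical names and the class is only a thin wrapper for illustration. A null pointer constant converts equally well to `char const*` and to `int Hidden::*`, so the initialization is ambiguous and is rejected at compile time; a `char const*` variable still compiles.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical wrapper used only to demonstrate the overload trick.
class CheckedString {
public:
    CheckedString(char const* s) : value_(s) {}  // s must not be null
    std::string const& str() const { return value_; }

private:
    struct Hidden {};
    // Declared but never defined: it exists only so that a null pointer
    // constant (0, NULL, nullptr) makes overload resolution ambiguous.
    CheckedString(int Hidden::*);

    std::string value_;
};

// A real pointer is still accepted ...
static_assert(std::is_constructible<CheckedString, char const*>::value,
              "char const* should be accepted");
// ... but a null pointer constant no longer is.
static_assert(!std::is_constructible<CheckedString, std::nullptr_t>::value,
              "null pointer constants should be rejected");
```

This says nothing about whether a conforming library may add the constructor; it only shows what the addition does to overload resolution.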

(One issue might be whether the following program is legal:

#include <string>

void
f( bool t )
{
    if ( t ) {
        std::string s(0) ;
    }
}

int
main()
{
    f( false ) ;
}

.)

In practice, on thinking about it, I'm not sure that it's worth
the effort. It only catches the case where you initialize with a
null pointer constant (0, NULL or the like), which is, one would
hope, pretty rare. You still need a run-time check (generally
provided directly by the hardware) in case of a variable which
happens to contain a null pointer.


**James Kanze** · Nov 14 '08, 09:45 AM

Re: strings and NULL argument passing

On Nov 14, 12:14 am, Jeff Schwab <j...@schwabcenter.com> wrote:

James Kanze wrote:

If his interface requires a string, then passing it a null
pointer should cause an assertion failure.

That does not follow. I consider it an abuse of assertions to
use them as detectors of contract violation. Assertions are
often appropriate for post-conditions, but rarely for
pre-conditions.

Assertions are useful for detecting programming errors.
Violation of a pre-condition is a programming error.

Exceptions should, in my opinion, not be part of the interface
definition of functions; exceptions are best reserved for
error reporting, and that specifically includes run-time contract
violations.

I agree with the middle clause: exceptions are best reserved for
error reporting. Which means that I disagree with the other two
parts: error reporting is a vital part of the interface
definition of a function, and run-time contract violations are
programming errors: "impossible" conditions (in a correct
program) not covered by the interface, and not reported as
"errors".

In the case at hand, std::invalid_argument (or a derivative)
seems obviously to be the best choice.

If the contract says so. The contract can specify many things:

-- The caller is not allowed to pass a null pointer. Doing so
violates the contract, which results in "undefined
behavior"---an assertion failure, unless performance
considerations deem otherwise.

-- The caller is allowed to pass a null pointer, and is
guaranteed a specific type of exception. I'd consider this
case fairly rare, but there are probably cases where it is
reasonable.

-- The caller is allowed to pass a null pointer, which the
function maps into a specific string, e.g. "" or
"<<NULL>>", or whatever.
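As a rough illustration, the three contracts could be written as three different functions. The names `makeStrict`, `makeThrowing` and `makeLenient` are hypothetical, and which one is appropriate depends entirely on what the interface documents.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Contract 1: a null pointer is forbidden; passing one is a programming
// error, caught by an assertion (unless compiled with NDEBUG).
std::string makeStrict(char const* s) {
    assert(s != nullptr && "precondition violated: s must not be null");
    return std::string(s);
}

// Contract 2: a null pointer is allowed and reported through a documented
// exception.
std::string makeThrowing(char const* s) {
    if (s == nullptr) {
        throw std::invalid_argument("null string pointer");
    }
    return std::string(s);
}

// Contract 3: a null pointer is allowed and maps to a specific string,
// here the empty string.
std::string makeLenient(char const* s) {
    return s != nullptr ? std::string(s) : std::string();
}
```

Only the first version treats the null pointer as a contract violation; in the other two it is part of the defined behavior of the function.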

In general (and there are exceptions), a programming error
should result in the fastest and most abrupt termination of the
program possible.

