program crashes on exit

  • jaddle
    New Member
    • Feb 2008
    • 2


    A program I'm writing crashes with a segfault when it exits. I'm running under Linux. Using gdb, I see that the crash happens after the last line of main(), which is just a return(EXIT_SUCCESS); If I put an exit(EXIT_SUCCESS); right before that line, it doesn't crash though!

    In gdb, when the crash occurs, a backtrace gives the following:
    #0 0x0804e090 in ?? ()
    #1 0xb7ea2ff4 in ?? () from /usr/lib/debug/libc.so.6
    #2 0xb7ea4140 in ?? () from /usr/lib/debug/libc.so.6
    #3 0x0804e008 in ?? ()
    #4 0xbfac1928 in ?? ()
    #5 0xb7ddade9 in *__GI___libc_free (mem=0x0) at malloc.c:3622
    #6 0x08048c51 in _start ()

    I'm using a debug version of libc, as you can see, but it doesn't help much.

    http://c-faq.com/strangeprob/crashatexit.html describes the problem, but the solutions don't seem to apply to my program.

    The really strange thing is that the problem is erratic - usually it happens this way, but occasionally (maybe every 10th time or so) there's no problem at all, or else it just freezes without a segfault. When it freezes, I've attached gdb, and it reports that the program is just sitting on the last line - the final curly bracket - over and over again with nothing changing.

    Valgrind shows the following when it crashes:
    ==24604== Use of uninitialised value of size 4
    ==24604== at 0x8049AAB: main (ttuner.c:234)
    ==24604==
    ==24604== Invalid read of size 1
    ==24604== at 0x41E5FF6: (within /usr/lib/debug/libc-2.6.1.so)
    ==24604== by 0x8048C50: (within /home/jono/src/ttuner/ttuner)
    ==24604== Address 0x0 is not stack'd, malloc'd or (recently) free'd
    ==24604==
    ==24604== Process terminating with default action of signal 11 (SIGSEGV)
    ==24604== Access not within mapped region at address 0x0
    ==24604== at 0x41E5FF6: (within /usr/lib/debug/libc-2.6.1.so)
    ==24604== by 0x8048C50: (within /home/jono/src/ttuner/ttuner)
    ==24604==
    ==24604== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 19 from 1)
    ==24604== malloc/free: in use at exit: 0 bytes in 0 blocks.
    ==24604== malloc/free: 10 allocs, 10 frees, 1,381,121 bytes allocated.
    ==24604== For counts of detected errors, rerun with: -v
    ==24604== All heap blocks were freed -- no leaks are possible.


    Is there anything else that might be helpful in debugging this?
    Last edited by jaddle; Feb 25 '08, 08:39 PM. Reason: adding valgrind output
  • gpraghuram
    Recognized Expert Top Contributor
    • Mar 2007
    • 1275

    #2
    If you get a crash at exit and it happens inconsistently, it usually means there is some memory corruption in the code.
    Try a memory-corruption detection tool such as Purify to track it down, or do a code walkthrough.

    Raghuram


    • jaddle
      New Member
      • Feb 2008
      • 2

      #3
      Problem has been solved! Thanks for the suggestion.

      It turns out that I had an array that wasn't quite long enough for the data going into it - a pretty standard mistake!

      The interesting bit, and what I really learned from the whole thing, is why it was breaking in such an interesting way, instead of just crashing when the invalid memory got accessed.

      I had an array declared as char notename[3]; in main(). Unfortunately, this string could be any of A, B, C#, or Ebb (double-flat), etc. (it's a program to generate sounds to use for tuning instruments). The double flats take an extra character!
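
      As a rough sketch of the sizing issue described above (this is not the actual ttuner code): a three-character name such as "Ebb" needs four bytes once the terminating '\0' is counted. The names NOTENAME_LEN, NOTENAME_SIZE and set_notename are hypothetical, and the three-character maximum is an assumption based on the note names listed in this post.

```c
#include <assert.h>
#include <string.h>

/* Assumption: the longest note name is 3 characters, e.g. "Ebb". */
#define NOTENAME_LEN 3
/* A C string needs one extra byte for the terminating '\0'. */
#define NOTENAME_SIZE (NOTENAME_LEN + 1)

/* Hypothetical helper: copy src into dst only if the whole string,
 * including its terminator, fits in dst_size bytes.
 * Returns 0 on success, -1 if src is too long (dst is left untouched). */
int set_notename(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t needed = strlen(src) + 1;   /* characters plus '\0' */
    if (needed > dst_size) {
        return -1;                     /* refuse instead of overrunning */
    }
    memcpy(dst, src, needed);
    return 0;
}
```

      With a buffer declared as char notename[NOTENAME_SIZE], set_notename(notename, sizeof notename, "Ebb") succeeds, whereas the same call on a 3-byte buffer returns -1 rather than writing past the end.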

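      What happened then was that, because this was stack memory (i.e. not malloc'ed), the one byte written past the end of the array landed on something like the return address for main, i.e. where execution goes when main finishes. Once I corrupted that, main couldn't return properly and the program crashed on exit. That would also explain why calling exit() before the return avoided the crash, since exit() never returns through main's stack frame.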

      Changing the declaration to char *notename = malloc(sizeof(char) * 3); eliminated the ugly crashes, but still left the bug that the array was being overrun, so other data corruption (at the least) could have ensued. However, with the buffer on the heap, valgrind was able to see the overrun easily, and once the size was corrected there were no further problems.
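
      For the heap variant, one way to avoid hard-coding the size at all is to derive it from the string being stored. This is only a sketch under that assumption, and dup_notename is a hypothetical name, not something from the original program.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap-allocated copy of name, sized from
 * its actual length. The caller must free() the result.
 * Returns NULL if the allocation fails. */
char *dup_notename(const char *name)
{
    assert(name != NULL);

    size_t size = strlen(name) + 1;    /* +1 for the terminating '\0' */
    char *copy = malloc(size);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, name, size);          /* copies the terminator too */
    return copy;
}
```

      Because the allocation is exactly as large as the string needs, a tool like valgrind would flag any later write past it.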

      In addition to 'purify', suggested here (I'll have a look at that soon, thanks!) someone mentioned that compiling with -fmudflap (and possibly -lmudflap) could also detect things like this. Haven't tried it yet, but it's good to know for next time.
