segfaults / backend crashing

  • Jeffrey Melloy

    segfaults / backend crashing

    I'm having a problem with the backend occasionally crashing.

    I have interfaces with the database in two different applications -- a
    web viewer using JDBC and an insertion routine written using the c
    libraries. It makes fairly heavy use of the tsearch protocol. (I
    haven't been able to figure out anything repetitious, it happens at
    different times, and after I get a new connection the exact search
    query or insertion string works fine).

    All that's printed to the postgres log is this:
    2003-09-02 10:41:18 [428] LOG: server process (pid 479) was
    terminated by signal 11
    2003-09-02 10:41:18 [428] LOG: terminating any other active server
    processes
    2003-09-02 10:41:18 [428] LOG: all server processes terminated;
    reinitializing shared memory and semaphores
    2003-09-02 10:41:18 [510] LOG: database system was interrupted at
    2003-09-02 10:17:38 CDT
    2003-09-02 10:41:18 [428] LOG: all server processes terminated;
    reinitializing shared memory and semaphores
    2003-09-02 10:41:18 [510] LOG: database system was interrupted at
    2003-09-02 10:17:38 CDT
    2003-09-02 10:41:18 [510] LOG: checkpoint record is at 0/53E94490
    2003-09-02 10:41:18 [510] LOG: redo record is at 0/53E94490; undo
    record is at 0/0; shutdown TRUE
    2003-09-02 10:41:18 [510] LOG: next transaction id: 127192; next
    oid: 1562034
    2003-09-02 10:41:18 [510] LOG: database system was not properly
    shut down; automatic recovery in progress
    2003-09-02 10:41:18 [510] LOG: redo starts at 0/53E944D0
    2003-09-02 10:41:18 [510] LOG: ReadRecord: record with zero length
    at 0/53EA28E0
    2003-09-02 10:41:18 [510] LOG: redo done at 0/53EA282C
    2003-09-02 10:41:20 [510] LOG: database system is ready
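
For readers matching the log to the "segfault" in the subject line: the first entry reports the signal number, and on most Unix-like systems signal 11 is SIGSEGV. The sketch below pulls the pid and signal number out of that log line and looks up the signal name. The helper name `describe_crash` and the regular expression are assumptions made for illustration, and the number-to-name mapping is that of the machine running the script, not necessarily the server's.

```python
import re
import signal

# The first log entry from the post, joined back onto one line.
LOG_LINE = ("2003-09-02 10:41:18 [428] LOG: server process (pid 479) was "
            "terminated by signal 11")

# Hypothetical pattern for this one message format.
PATTERN = re.compile(
    r"server process \(pid (\d+)\) was terminated by signal (\d+)")


def describe_crash(line):
    """Return (pid, signal number, signal name), or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    pid, signum = int(match.group(1)), int(match.group(2))
    try:
        # Name lookup uses the local platform's signal table.
        name = signal.Signals(signum).name
    except ValueError:
        name = "unknown"
    return pid, signum, name


print(describe_crash(LOG_LINE))
```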

    The crash log associated with that crash isn't particularly useful, at
    least to me:
    Date/Time: 2003-09-02 10:41:17 -0500
    OS Version: 10.2.6 (Build 6L60)
    Host: dahak.local.

    Command: postmaster
    PID: 479

    Exception: EXC_BAD_ACCESS (0x0001)
    Codes: KERN_INVALID_ADDRESS (0x0001) at 0xffffe640

    Thread 0 Crashed:

    PPC Thread State:
    srr0: 0xbfffe5e0 srr1: 0x0200f930 vrsave: 0x00000000
    xer: 0x00000000 lr: 0xbfffe5e0 ctr: 0x90010000 mq: 0x00000000
    r0: 0xbfffe5e0 r1: 0xbfffe5d0 r2: 0x8fe4e3e0 r3: 0x00000000
    r4: 0x0088ec1c r5: 0x00000007 r6: 0x00000006 r7: 0x0088eba4
    r8: 0x0000000c r9: 0x00000000 r10: 0x004594d4 r11: 0x008a0150
    r12: 0x90010000 r13: 0x00000000 r14: 0x00000000 r15: 0x00000000
    r16: 0x00000000 r17: 0x00000000 r18: 0x00000000 r19: 0x00000000
    r20: 0x00000000 r21: 0x00000000 r22: 0x00000000 r23: 0x00000000
    r24: 0x00000000 r25: 0x00000000 r26: 0xbffffbd4 r27: 0x0000000c
    r28: 0x00000002 r29: 0x00459534 r30: 0x00000000 r31: 0x00000000

    One of the crashlogs for the JDBC connection looks like this:
    Date/Time: 2003-08-13 01:10:21 -0500
    OS Version: 10.2.6 (Build 6L60)
    Host: dahak.local.

    Command: postmaster
    PID: 460

    Exception: EXC_BAD_ACCESS (0x0001)
    Codes: KERN_PROTECTION_FAILURE (0x0002) at 0x00daec05

    Thread 0 Crashed:
    #0 0x90010070 in __swsetup
    #1 0x00d9f8d8 in compare
    #2 0x00d9f5a0 in merge
    #3 0x00d9f788 in sort
    #4 0x00d9fbc0 in create_pool
    #5 0x00da1b40 in setup_english_stemmer
    #6 0x00da23f4 in initmorph
    #7 0x00da43c4 in tsearch
    #8 0x00064904 in ExecCallTriggerFunc (trigger.c:1120)
    #9 0x00064a7c in ExecBRInsertTriggers (trigger.c:1171)
    #10 0x00071018 in ExecInsert (execMain.c:1216)
    #11 0x00070ee4 in ExecutePlan (execMain.c:1102)
    #12 0x000701fc in ExecutorRun (execMain.c:195)
    #13 0x000d31dc in ProcessQuery (pquery.c:247)
    #14 0x000d0f8c in pg_exec_query_string (postgres.c:839)
    #15 0x000d24d4 in PostgresMain (postgres.c:2020)
    #16 0x000b379c in DoBackend (postmaster.c:2293)
    #17 0x000b2ff4 in BackendStartup (postmaster.c:1916)
    #18 0x000b1ec0 in ServerLoop (postmaster.c:1006)
    #19 0x000b19f8 in PostmasterMain (postmaster.c:785)
    #20 0x00086670 in main (main.c:210)
    #21 0x00001b58 in _start (crt.c:267)
    #22 0x000019d8 in start

    PPC Thread State:
    srr0: 0x90010070 srr1: 0x0200f930 vrsave: 0x00000000
    xer: 0x00000000 lr: 0x90010070 ctr: 0x90003900 mq: 0x00000000
    r0: 0x90010070 r1: 0xbfffe5f0 r2: 0x00000000 r3: 0x000000e3
    r4: 0x00000000 r5: 0x000000e3 r6: 0x0000000a r7: 0x00000020
    r8: 0x00000030 r9: 0x000000e3 r10: 0x00000060 r11: 0xa00043ac
    r12: 0x90003900 r13: 0x001a0c10 r14: 0x009b4f58 r15: 0xbfffed60
    r16: 0x00d37e00 r17: 0x00000000 r18: 0x00000000 r19: 0x00000000
    r20: 0x00d367e8 r21: 0x00d385f0 r22: 0x00d36958 r23: 0x00d36af8
    r24: 0x00000001 r25: 0x00d36f28 r26: 0x00d36a48 r27: 0x00d385f0
    r28: 0xbfffeaa0 r29: 0x00000000 r30: 0x00daebd5 r31: 0x9001000c

    Any help would be appreciated.
    Jeffrey Melloy
    jmelloy@visualdistortion.org


    ---------------------------(end of broadcast)---------------------------
    TIP 5: Have you checked our extensive FAQ?

    http://www.postgresql.org/docs/faqs/FAQ.html


  • Joshua D. Drake

    #2
    Re: segfaults / backend crashing

    Hello,

    What version of PostgreSQL and JDBC are you running?

    Sincerely,

    Joshua Drake




    --
    Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
    Postgresql support, programming shared hosting and dedicated hosting.
    +1-503-222-2783 - jd@commandprompt.com - http://www.commandprompt.com
    The most reliable support for the most reliable Open Source database.



    • Tom Lane

      #3
      Re: segfaults / backend crashing

      Jeffrey Melloy <jmelloy@visualdistortion.org> writes:
      > I'm having a problem with the backend occasionally crashing.
      > I have interfaces with the database in two different applications -- a
      > web viewer using JDBC and an insertion routine written using the c
      > libraries. It makes fairly heavy use of the tsearch protocol. (I
      > haven't been able to figure out anything repetitious, it happens at
      > different times, and after I get a new connection the exact search
      > query or insertion string works fine).

      That sounds suspiciously like a memory-stomp problem --- that is,
      something is scribbling on RAM that doesn't belong to it, and at some
      later point there's a crash because the code expects the original values
      to be in that memory. It's a good bet (but far from certain) that the
      bug is in the tsearch module, since that's not been used as much as
      the rest of the code. I would not however put the blame on the exact
      spot where you are seeing the crash.
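
To make the "memory stomp" idea concrete, here is a toy model rather than real backend code: a single `bytearray` stands in for process memory, split into two adjacent regions. A writer that never checks its length spills into the neighbouring region, and the damage only shows up later, when unrelated code inspects that neighbour. Every name here (`ARENA`, `set_name`, `counter_intact`) is hypothetical and has nothing to do with tsearch or PostgreSQL internals.

```python
# Toy model: one bytearray stands in for process memory.
NAME_SIZE = 8                      # bytes 0..7 belong to the "name" buffer
ARENA = bytearray(16)              # bytes 8..15 belong to the "counter"
COUNTER_OFFSET = NAME_SIZE
FILL = b"\x2a" * (len(ARENA) - COUNTER_OFFSET)


def init_counter():
    """Store a known pattern in the counter region."""
    ARENA[COUNTER_OFFSET:] = FILL


def set_name(data):
    """Buggy writer: never compares len(data) with NAME_SIZE."""
    end = len(data)
    if end > len(ARENA):
        raise ValueError("write would leave the modeled memory entirely")
    ARENA[0:end] = data            # the actual bug lives here


def counter_intact():
    """Innocent reader: this is where the problem becomes visible."""
    return ARENA[COUNTER_OFFSET:] == FILL


init_counter()
set_name(b"short\0")               # 6 bytes, fits in the name buffer
print("after short write:", counter_intact())
set_name(b"far too long\0")        # 13 bytes, spills 5 bytes into the counter
# ... unrelated work could happen here before anyone notices ...
print("after long write:", counter_intact())
```

The failure is reported by `counter_intact`, which is correct code; the fault is in `set_name`, which ran earlier. That is the sense in which the spot where a crash appears need not be the spot to blame.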

      I'd recommend rebuilding the backend with memory checking and asserts
      enabled (configure --enable-cassert, and you might as well
      --enable-debug too). This might help catch the culprit sooner.
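
One general technique that memory-checking builds rely on is a guard (sentinel) byte placed just past each allocation and verified later, so an overrun is flagged near the chunk that was overrun instead of wherever the corrupted data is eventually used. The sketch below models that idea only. It is not PostgreSQL's allocator, and `CheckedArena`, the `SENTINEL` value, and the method names are all assumptions made for illustration.

```python
SENTINEL = 0x7E  # arbitrary guard value chosen for this sketch


class CheckedArena:
    """Toy bump allocator that places one guard byte after every chunk."""

    def __init__(self, size):
        self.mem = bytearray(size)
        self.top = 0
        self.chunks = {}               # chunk offset -> requested size

    def alloc(self, size):
        if self.top + size + 1 > len(self.mem):
            raise MemoryError("arena exhausted")
        offset = self.top
        self.mem[offset + size] = SENTINEL   # guard byte right after the chunk
        self.chunks[offset] = size
        self.top += size + 1
        return offset

    def write(self, offset, data):
        """Unchecked write, standing in for buggy C code."""
        end = offset + len(data)
        if end > len(self.mem):
            raise ValueError("write would leave the modeled memory")
        self.mem[offset:end] = data

    def check(self):
        """Return offsets of chunks whose guard byte was overwritten."""
        return [off for off, size in sorted(self.chunks.items())
                if self.mem[off + size] != SENTINEL]


arena = CheckedArena(32)
a = arena.alloc(8)
b = arena.alloc(8)
arena.write(a, b"12345678")        # exactly fills chunk a
clean = arena.check()
arena.write(a, b"123456789")       # one byte too many: lands on a's guard byte
damaged = arena.check()
print(clean, damaged)
```

Running `check()` often narrows the window between the bad write and its detection, which is the benefit being suggested here. A one-byte guard can miss an overrun that happens to write the guard value itself, so this is a heuristic, not a proof of correctness.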

      regards, tom lane





      • Jeffrey Melloy

        #4
        Re: segfaults / backend crashing

        When this started happening, I upgraded from 7.3.2 to 7.3.4, with the
        JDBC drivers included with that distribution.
        Jeff
        On Tuesday, September 2, 2003, at 12:12 PM, Joshua D. Drake wrote:
        > Hello,
        >
        >   What version of PostgreSQL and JDBC are you running?
        >
        > Sincerely,
        >
        > Joshua Drake
