DBCC and Failed Assertion Errors - HELP!

**Paul S Randal [MS]** · Jul 20 '05, 03:50 AM

Re: DBCC and Failed Assertion Errors - HELP!

Hi Morgan,

Have you actually checked the event logs and run hardware diagnostics on
your IO system to see if there are hardware problems?

If so, and there are no clues there, you should call Product Support to help
you diagnose the problem.

Regards.

--
Paul Randal
Dev Lead, Microsoft SQL Server Storage Engine

This posting is provided "AS IS" with no warranties, and confers no rights.

"Morgan Leppink" <mleppink@hotmail.com> wrote in message
news:806e6d7.0405271455.1bf6a2d4@posting.google.com...
> Hey all -
>
> We are running SQL 2000 with ALL available service packs, etc.
> applied. We just built a brand new database server, which has dual
> 2GHz Xeons, 2GB memory, and the following disk configuration:
>
> RAID 1 array (2 disks)    Operating System (Windows Server 2003)
> RAID 1 array (2 disks)    Database Logs
> RAID 10 array (4 disks)   Database Data
>
> Disks are SATA, with a 3Ware hardware RAID controller. The machine
> SCREAMS.
>
> We run 5 databases on this machine. 2 of these are fairly large (by
> our standards, anyway). The second largest database (and the busiest
> and most important) is consistently generating consistency errors that
> bring many important queries down. These are almost ALWAYS in the
> form of index corruption on one single table. The corruption does not
> normally occur on other tables (although it DOES happen once in a
> while - rarely - on one of the other tables), nor does it EVER occur
> on any other databases on the server.
>
> The corruption seems to happen right in the neighborhood of midnight
> ALMOST every day, give or take a few minutes, but does not seem
> directly associated with any of our MANY scheduled database cleanup
> tasks (believe me, we've tried desperately to find an association
> using SQL Profiler). At midnight, our database traffic is fairly low,
> so it does not seem associated with a high traffic level.
>
> We are using the FULL recovery model, with log backups every 15
> minutes, and full backups daily at 12:15am. However, the corruption
> happens consistently BEFORE 12:15, like between 11:50pm and 12:10am.
> The most frustrating thing is, the database can go WEEKS without any
> corruption at all, and then it'll go 4 or 5 days in a row with this
> strange corruption stuff.
>
> *************************************************************************
> Typical query errors when the corruption exists include:
> *************************************************************************
>
> SQL Server Assertion: File:
> <p:\sql\ntdbms\storeng\drs\include\record.inl>, line=1447
> Failed Assertion = 'm_SizeRec > 0 && m_SizeRec <= MAXDATAROW'.
>
>
> SQL Server Assertion: File: <recbase.cpp>, line=1378
> Failed Assertion = 'm_offBeginVar < m_SizeRec'.
>
>
> Server: Msg 3624, Level 20, State 1, Line 7
> Location: recbase.cpp:1374
> Expression: m_nVars > 0
>
>
> Connection Broken
>
> *************************************************************************
>
> Most of the responses to this type of issue (failed assertions) on the
> newsgroups appear to point to hardware failures. However, this is
> brand new hardware, AND, it seems to us that if this was a hardware
> issue, other databases, tables, and indexes would be affected
> randomly. Isn't that a valid assumption (that if it was hardware,
> particularly the RAID controller, the corruption would not be in such
> a predictable place)? What if we moved the physical database files to
> another location on the disk? Would/could that help?
>
> If anyone could offer some suggestions as to what may be causing this
> corruption, we would be eternally grateful. It is getting to be a
> real pain in the A*** to run DBCC CHECKDB with REPAIR_ALLOW_DATA_LOSS
> every day or two (it always seems to solve the problem without data
> loss, but still...).
>
> Again, thanks in advance for your response.
>
>
> Sincerely,
>
>
> Morgan Leppink
> mleppink@hotmail.com

**Morgan Leppink** · Jul 20 '05, 03:50 AM

Re: DBCC and Failed Assertion Errors - HELP!

Paul -

The only information in the event logs is the text of the failed
assertion error itself. I have never seen any OS-reported problems
with the hardware.

I hate to seem stupid, but can you be more specific about what you
mean when you say "hardware diagnostics?" Are you talking about the
simple Windows CheckDisk utility or something more advanced? This is
the first time I've used a hardware RAID controller - is Windows even
capable of checking the hardware-controlled disk array, or do I need
to use a utility provided by the RAID controller manufacturer?

Or would you suggest some sort of third-party utility for "burning in"
the hardware? Would you suspect disk drives, memory, or what? Could
it be ANY of the hardware, or just specific things?

One last question: What's the most effective method for contacting
product support if I need to do so?

Thanks,

Morgan Leppink

**druss** · Jul 20 '05, 04:00 AM

Re: DBCC and Failed Assertion Errors - HELP!

I am also running a 3ware SATA RAID card and have been getting consistency
errors randomly. I have to run repair_allow_data_loss to fix them. I wish
I knew the cause. No drive errors. Microsoft cannot pinpoint it either. All
they can tell me is that it is most likely hardware related and to move my
database to another server.

**Greg D. Moore (Strider)** · Jul 20 '05, 04:00 AM

Re: DBCC and Failed Assertion Errors - HELP!

"druss" <dean@corp.dslextreme.com> wrote in message
news:6c23824f8890833bc7e6bd07c5331636@localhost.talkaboutdatabases.com...
> I am also running a 3ware SATA RAID card and have been getting consistency
> errors randomly. I have to run repair_allow_data_loss to fix them. I wish
> I knew the cause. No drive errors. Microsoft cannot pinpoint it either. All
> they can tell me is that it is most likely hardware related and to move my
> database to another server.

I would suggest they're probably right in this case.
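
Morgan's observation earlier in the thread, that the corruption clusters around midnight, is the kind of claim that can be checked against the timestamps of the assertion errors in the event log. Here is a minimal sketch in Python, assuming the timestamps have already been extracted into a list. The function names and the sample data are hypothetical and not taken from the thread:

```python
from datetime import datetime


def minutes_from_midnight(ts: datetime) -> int:
    """Distance in minutes from the nearest midnight (0 to 720)."""
    minutes = ts.hour * 60 + ts.minute
    # 23:52 is 8 minutes before midnight, so take the shorter way round.
    return min(minutes, 24 * 60 - minutes)


def fraction_near_midnight(timestamps, window_minutes=15):
    """Fraction of timestamps within window_minutes of midnight."""
    if not timestamps:
        return 0.0
    near = sum(
        1 for ts in timestamps
        if minutes_from_midnight(ts) <= window_minutes
    )
    return near / len(timestamps)


# Hypothetical error timestamps, for illustration only.
sample = [
    datetime(2004, 5, 24, 23, 52),
    datetime(2004, 5, 25, 0, 8),
    datetime(2004, 5, 26, 23, 58),
    datetime(2004, 5, 27, 14, 30),
]
print(fraction_near_midnight(sample))
```

A high fraction would justify lining the errors up against whatever jobs run in that window. It would not by itself distinguish a hardware cause from a software one.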
