failover cluster using 64 bit windows 2k3 on single HBAs and DS400 with dual controller

  • uspensky@gmail.com

    failover cluster using 64 bit windows 2k3 on single HBAs and DS400 with dual controller

    We are trying to set up a system-to-system failover cluster using two
    nodes (x346), each with a single HBA running to a separate controller
    on the DS400.

    For full redundancy, IBM recommends dual paths from each node, but we
    don't need that. The current setup has two completely separate paths:
    the HBA on node 1 goes to controller A on the DS400, and the HBA on
    node 2 goes to controller B. If I take a controller offline, failover
    works fine: storage jumps to the other controller and all resources
    move to its node. But if I shut down a node, the cluster loses all
    attached storage, and the DS400 does not know to switch ownership to
    the other controller.
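
    The behaviour described above can be pictured with a rough toy model in
    Python. This is only a sketch: the names (SINGLE_PATH, DUAL_PATH,
    owner_after_failure, nodes_with_storage) are made up for illustration
    and are not DS400, QLogic, or MSCS interfaces. It assumes a LUN is
    served by one owning controller at a time and that the array moves
    ownership only when it detects a controller failure.

```python
# Toy reachability model. All names here are hypothetical illustrations,
# not DS400, QLogic, or MSCS APIs.
# Assumption: a LUN is served by one "owning" controller at a time, and the
# array moves ownership only when it detects a controller failure.

# Each pair is (node, controller) and represents one HBA-to-controller path.
SINGLE_PATH = {("node1", "A"), ("node2", "B")}
DUAL_PATH = {(n, c) for n in ("node1", "node2") for c in ("A", "B")}


def owner_after_failure(owner, failed_controllers):
    """Return the controller serving the LUN once the array has reacted."""
    if owner not in failed_controllers:
        return owner  # no controller fault seen, so ownership stays put
    survivors = sorted({"A", "B"} - set(failed_controllers))
    return survivors[0] if survivors else None


def nodes_with_storage(paths, owner, failed_nodes=(), failed_controllers=()):
    """Return the set of surviving nodes that can still do I/O to the LUN."""
    current = owner_after_failure(owner, failed_controllers)
    if current is None:
        return set()
    return {n for (n, c) in paths if c == current and n not in failed_nodes}


if __name__ == "__main__":
    # Controller A fails: ownership moves to B, so node2 can take over.
    print(nodes_with_storage(SINGLE_PATH, "A", failed_controllers=["A"]))
    # node1 shuts down: ownership stays on A, and node2 has no path to A.
    print(nodes_with_storage(SINGLE_PATH, "A", failed_nodes=["node1"]))
    # Same node failure with dual paths: node2 still reaches controller A.
    print(nodes_with_storage(DUAL_PATH, "A", failed_nodes=["node1"]))
```

    In this model the single-path layout survives a controller failure but
    not a node failure, which matches what we are seeing, while the
    dual-path layout survives both.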

    Is there a way to use MSCS without dual paths from each node? In other
    words, if either a node or a controller fails on a single path, we
    want the other path to become active.

    Our main goal is to run SQL Server 2005 clustering on this cluster.
    Everything checks out perfectly if I use only one controller on the
    DS400 for both nodes, but that brings us back to another single point
    of failure.

    I saw that QLogic has MPIO drivers on their website for the DS400, but
    they seem to be for 32-bit systems, and the install errors out with:

    C:\Drivers\mpio\1.0.8.4 (w32)>install.exe -i
    Pre-Installing the Multi-Path Adapter Filter...
    Success
    Installing the Multi-Path Bus Driver...
    Failure. Error code (0xe0000235)

    configuration:
    2 X IBM x346 w/ single QLogic 2340 HBAs running win2k3 64bit Enterprise
    DS400 w/ dual controllers

  • Rob Turk

    #2
    Re: failover cluster using 64 bit windows 2k3 on single HBAs and DS400 with dual controller

    <uspensky@gmail.com> wrote in message
    news:1144859495.769566.133510@e56g2000cwe.googlegroups.com...
    > We are trying to set up a system-to-system failover cluster using two
    > nodes (x346), each with a single HBA running to a separate controller
    > on the DS400.
    >
    > For full redundancy, IBM recommends dual paths from each node, but we
    > don't need that. The current setup has two completely separate paths:
    > the HBA on node 1 goes to controller A on the DS400, and the HBA on
    > node 2 goes to controller B. If I take a controller offline, failover
    > works fine: storage jumps to the other controller and all resources
    > move to its node. But if I shut down a node, the cluster loses all
    > attached storage, and the DS400 does not know to switch ownership to
    > the other controller.
    >
    > Is there a way to use MSCS without dual paths from each node? In other
    > words, if either a node or a controller fails on a single path, we
    > want the other path to become active.

    It sounds like you're trying to persuade the DS400 to control your failover
    action. You're making a LUN available to one node, and when a failure occurs
    you're expecting the DS400 to switch ownership of that LUN to the other node
    so it can proceed. That's not what you want. You want both nodes to see and
    share the LUN(s) on the DS400 at all times. MSCS will then figure out
    between the two nodes which one will access the LUN.

    Rob



    • uspensky@gmail.com

      #3
      Re: failover cluster using 64 bit windows 2k3 on single HBAs and DS400 with dual controller

      Logically, it would make sense for MSCS to be responsible for
      everything. However, both nodes can see the storage but can only read
      the drives when their respective controller is the active one.

      Both initiators have access to all the LUNs on the storage, and both
      HBAs have access to all LUNs.

      > It sounds like you're trying to persuade the DS400 to control your failover
      > action. You're making a LUN available to one node, and when a failure occurs
      > you're expecting the DS400 to switch ownership of that LUN to the other node
      > so it can proceed. That's not what you want. You want both nodes to see and
      > share the LUN(s) on the DS400 at all times. MSCS will then figure out
      > between the two nodes which one will access the LUN.
      >
      > Rob


      • Rob Turk

        #4
        Re: failover cluster using 64 bit windows 2k3 on single HBAs and DS400 with dual controller

        <uspensky@gmail.com> wrote in message
        news:1144869108.727995.55910@i40g2000cwc.googlegroups.com...
        > Logically, it would make sense for MSCS to be responsible for
        > everything. However, both nodes can see the storage but can only read
        > the drives when their respective controller is the active one.
        >
        > Both initiators have access to all the LUNs on the storage, and both
        > HBAs have access to all LUNs.

        The DS400 wasn't certified for MSCS when it was initially introduced. If you
        have a model from before mid-2005 then you may need to update firmware or
        contact IBM about the exact features required to make it work with MSCS. The
        latest firmware is available from Adaptec's website.


        Rob




          • uspensky@gmail.com

            #6
            Re: failover cluster using 64 bit windows 2k3 on single HBAs and DS400 with dual controller

            OK,

            We have now extended the configuration to provide multiple paths
            to both nodes from both controllers.

            Each node now has two HBAs with connections to both controllers.
            Everything seems to work as expected: failover occurs system to
            system if a node fails, and controller to controller if a
            controller fails.

            When I fail over from system to system, it works flawlessly.
            When I fail over from controller to controller, however, the active
            node seems to kick in fine once the resources are back up and
            available, but it shows an error in the taskbar and event log saying:

            Windows - Delayed Write Failed: Windows was unable to save all the data
            for the file M:\ The data has been lost. This error may be caused by a
            failure of your computer hardware or network connection. Please try to
            save this file elsewhere.

            Since this cluster will host a SQL Server 2005 cluster, losing
            data is not acceptable. The controllers have 256 MB of
            battery-backed memory on them. Given that, are the controllers
            taking care of this and Windows is simply not aware of it, or do
            we actually have an issue where we might lose data?

