running same script on same data on two different machines -->different result

  • Christopher Brewster

    running same script on same data on two different machines -->different result

    I am running the same script on the same data on two different
    machines (the folder is synchronised with Dropbox).
    I get two different results. All the script does is count words in
    different files and perform a simple set operation on the word lists.
    The laptop is a MacBook Pro (2 1/2 years old) running OS X 10.5.5 with
    Python 2.5.1.
    The desktop is an iMac (brand new) running OS X 10.5.5, also with
    Python 2.5.1.

    I have tried running the script on an Ubuntu server with Python 2.5.2,
    and the results corresponded with my laptop's output.
    How can I find out the cause of this anomaly? What tests can I
    perform?

    Thank you,

    Christopher Brewster
    Aston University
  • Steve Holden

    #2
    Re: running same script on same data on two different machines --> different result

    Christopher Brewster wrote:
    > I am running the same script on the same data on two different
    > machines (the folder is synchronised with Dropbox).
    > I get two different results. All the script does is count words in
    > different files and perform a simple set operation on the word lists.
    > The laptop is a MacBook Pro (2 1/2 years old) running OS X 10.5.5 with
    > Python 2.5.1.
    > The desktop is an iMac (brand new) running OS X 10.5.5, also with
    > Python 2.5.1.
    >
    > I have tried running the script on an Ubuntu server with Python 2.5.2,
    > and the results corresponded with my laptop's output.
    > How can I find out the cause of this anomaly? What tests can I
    > perform?

    OK, as a university denizen you are presumably a smart type. Do you
    *really* think this is an adequate problem description for debugging?

    You might drop lucky, but more information couldn't possibly hurt. We
    *try* to be mindreaders, but it would help to know whether you are
    talking about string handling or floating-point computations, for example.

    If the latter then it's probably because one machine is based on PowerPC
    architecture and the other is a more recent Intel-architecture Mac.

    regards
    Steve
    --
    Steve Holden +1 571 484 6266 +1 800 494 3119
    Holden Web LLC http://www.holdenweb.com/

    • Christopher Brewster

      #3
      Re: running same script on same data on two different machines --> different result

      On Nov 14, 3:22 pm, Steve Holden <st...@holdenweb.com> wrote:
      > Christopher Brewster wrote:
      >> I am running the same script on the same data on two different
      >> machines (the folder is synchronised with Dropbox).
      >> I get two different results. All the script does is count words in
      >> different files and perform a simple set operation on the word lists.
      >> The laptop is a MacBook Pro (2 1/2 years old) running OS X 10.5.5 with
      >> Python 2.5.1.
      >> The desktop is an iMac (brand new) running OS X 10.5.5, also with
      >> Python 2.5.1.
      >>
      >> I have tried running the script on an Ubuntu server with Python 2.5.2,
      >> and the results corresponded with my laptop's output.
      >> How can I find out the cause of this anomaly? What tests can I
      >> perform?
      >
      > OK, as a university denizen you are presumably a smart type. Do you
      > *really* think this is an adequate problem description for debugging?
      >
      > You might drop lucky, but more information couldn't possibly hurt. We
      > *try* to be mindreaders, but it would help to know whether you are
      > talking about string handling or floating-point computations, for example.
      >
      > If the latter then it's probably because one machine is based on PowerPC
      > architecture and the other is a more recent Intel-architecture Mac.
      >
      > regards
      > Steve
      > --
      > Steve Holden +1 571 484 6266 +1 800 494 3119
      > Holden Web LLC http://www.holdenweb.com/

      Thanks for the suggestion, but they are both Intel machines.
      There is no floating point, just simple additions.

      No matter how smart you are, if you do not do this sort of thing
      often, you do not know exactly what sort of information to provide
      or what questions to ask.
      So that is exactly my question - what are the right questions?
      What information do I need to provide to try to solve this?

      Christopher
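
One way to approach "what information do I need to provide?" is to capture the same facts about each machine and compare them side by side. The following is a minimal sketch, assuming a modern Python 3 interpreter (the thread itself concerns Python 2.5) and a hypothetical helper name, `environment_report`. The fields chosen are a guess at settings that can differ between otherwise identical setups, not a definitive checklist.

```python
import locale
import platform
import sys


def environment_report():
    """Collect interpreter and platform settings as a dict of strings.

    `environment_report` is a hypothetical helper, not part of any library.
    """
    return {
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "machine": platform.machine(),
        "byteorder": sys.byteorder,
        "default_encoding": sys.getdefaultencoding(),
        "filesystem_encoding": sys.getfilesystemencoding(),
        # False: report the preferred encoding without changing the locale.
        "preferred_encoding": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    # Print in a fixed key order so the output of two machines can be diffed.
    for key, value in sorted(environment_report().items()):
        print("%s: %s" % (key, value))
```

Running this on each machine and diffing the two outputs would show whether locale or encoding settings differ even though the OS and Python versions match.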

      • Philip Semanchuk

        #4
        Re: running same script on same data on two different machines --> different result


        On Nov 14, 2008, at 10:14 AM, Christopher Brewster wrote:
        > I am running the same script on the same data on two different
        > machines (the folder is synchronised with Dropbox).
        > I get two different results. All the script does is count words in
        > different files and perform a simple set operation on the word lists.
        > The laptop is a MacBook Pro (2 1/2 years old) running OS X 10.5.5 with
        > Python 2.5.1.
        > The desktop is an iMac (brand new) running OS X 10.5.5, also with
        > Python 2.5.1.
        >
        > I have tried running the script on an Ubuntu server with Python 2.5.2,
        > and the results corresponded with my laptop's output.
        > How can I find out the cause of this anomaly? What tests can I
        > perform?

        No idea what Dropbox is, but it is a potential point of failure.
        Ensure it is doing its job. Programmatically ensure that the source
        files are exactly the same before you start your Python program.

        Then try your program on different source files. If the problem shows
        up on some source files and not on others, try to figure out the
        pattern that relates the files.

        Or take your "problem" data file and chop it in half by deleting the
        lines from the first half of the file. See if the problem still
        occurs. If not, try using the latter half of the file. By using a
        binary search like this, maybe you can isolate the problem data to a
        very small portion making visual detection of the problem easier.

        Until you get more info, this is just generic debugging and isn't
        specific to Python.

        Good luck
        Philip
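
The halving approach Philip describes can be sketched roughly as follows. This assumes Python 3 and two hypothetical names: `isolate`, and a caller-supplied predicate `shows_problem` that returns True when a chunk of lines still reproduces the discrepancy. In this thread's situation that predicate would mean running the script on both machines and comparing the output, which the sketch does not attempt to automate.

```python
def isolate(lines, shows_problem):
    """Repeatedly halve `lines`, keeping a half that still shows the problem.

    Stops when one line remains, or when neither half reproduces the problem
    alone (the trigger then depends on lines from both halves).
    """
    lines = list(lines)
    while len(lines) > 1:
        mid = len(lines) // 2
        first, second = lines[:mid], lines[mid:]
        if shows_problem(second):
            lines = second
        elif shows_problem(first):
            lines = first
        else:
            break
    return lines


if __name__ == "__main__":
    # Stand-in predicate for illustration only: treat any non-ASCII
    # character as "the problem".
    def has_non_ascii(chunk):
        return any(ord(ch) > 127 for line in chunk for ch in line)

    sample = ["alpha", "beta", "caf\u00e9", "delta", "epsilon"]
    print(isolate(sample, has_non_ascii))
```

With the stand-in predicate, the search narrows five lines down to the single line containing the accented character.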


        • Steven D'Aprano

          #5
          Re: running same script on same data on two different machines --> different result

          On Fri, 14 Nov 2008 07:14:20 -0800, Christopher Brewster wrote:
          > I am running the same script on the same data on two different machines
          > (the folder is synchronised with Dropbox). I get two different results.
          > All the script does is count words in different files and perform a
          > simple set operation on the word lists. The laptop is a MacBook Pro (2
          > 1/2 years old) running OS X 10.5.5 with Python 2.5.1.
          > The desktop is an iMac (brand new) running OS X 10.5.5, also with Python
          > 2.5.1.
          >
          > I have tried running the script on an Ubuntu server with Python 2.5.2,
          > and the results corresponded with my laptop's output. How can I find out
          > the cause of this anomaly? What tests can I perform?

          Try eliminating files and see if you can narrow the problem down to a
          single file.

          Make sure the files really are synchronized. Try comparing their md5
          checksums.

          Create a batch of test files, copy them from one machine to the other,
          and then confirm that the script calculates the same result.

          Lastly, make sure that both machines really are using the same script!

          And if you do find the cause, please let us know... I'm intrigued.



          --
          Steven
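
The md5 comparison suggested above could look something like this. It is a sketch assuming Python 3; `md5_of` and `checksum_report` are hypothetical helper names. Pointing it at the synchronised folder on each machine, with the script itself inside that folder, would cover both the "same data" and the "same script" checks.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of one file, read in binary chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_report(folder):
    """Map each file's path (relative to `folder`) to its MD5 digest."""
    root = Path(folder)
    return {
        path.relative_to(root).as_posix(): md5_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


if __name__ == "__main__":
    import sys

    # Usage: python checksums.py /path/to/synchronised/folder
    for name, digest in sorted(checksum_report(sys.argv[1]).items()):
        print(digest, name)
```

Files are opened in binary mode so that differences in line endings or encoding show up as differing digests instead of being hidden by text decoding.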

          • John Machin

            #6
            Re: running same script on same data on two different machines --> different result

            On Nov 15, 2:14 am, Christopher Brewster <cbrews...@gmail.com> wrote:
            > I am running the same script on the same data on two different
            > machines (the folder is synchronised with Dropbox).
            > I get two different results. All the script does is count words in
            > different files and perform a simple set operation on the word lists.

            1. "same data" versus "different files": are you using "different" in
            the same sense as in "different machines" and "different results"? How
            do you know the data is the same?

            2. Either show us your script, or tell us (with a reasonable degree of
            precision):
            * how do you define a "word"
            * what is a "word list"
            * what is "a simple set operation on the word lists"
            * does the script use any of: random module, current date/time,
            iteration over dictionaries while updating them, etc

            3. (a) Which of the two sets of results is correct? (b) What is your
            basis for answering (a)?
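
To illustrate the kind of precision point 2 asks for, here is a sketch in which every definition is written down explicitly. The original script was never posted, so all of it is assumed: a "word" is taken to be a run of the letters a to z after lower-casing, a "word list" is a `Counter` of those words, and the "simple set operation" is taken to be an intersection. The names `WORD_RE`, `count_words` and `shared_words` are hypothetical, and the code targets Python 3.

```python
import re
from collections import Counter

# Assumed definition of a "word": one or more ASCII letters a-z.
# An explicit character class avoids relying on locale-dependent matching.
WORD_RE = re.compile(r"[a-z]+")


def count_words(text):
    """Count words in `text` after lower-casing it."""
    return Counter(WORD_RE.findall(text.lower()))


def shared_words(counts_a, counts_b):
    """Return the words present in both counts, as a sorted list.

    Sorting gives a fixed output order, so two runs can be compared
    line by line regardless of set iteration order.
    """
    return sorted(set(counts_a) & set(counts_b))


if __name__ == "__main__":
    first = count_words("The cat and the hat")
    second = count_words("A hat for the CAT!")
    print(shared_words(first, second))
```

Pinning down the word definition and sorting before printing removes two common sources of run-to-run variation; whether either applies to the script in question is not known from the thread.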
