view data in python class

  • natachai
    New Member
    • Jan 2010
    • 6

    view data in python class

    I am following some code written for a binary tree.
    The tree variable is an instance of the class.
    The program runs fine. However, I am really confused by the way Python stores data in an object. I cannot view the data stored in this tree; the only thing I see when I try to print it is:

    >>> print tree
    <treepredict.decisionnode instance at 0x03E0E6C0>

    How can I view all the data stored in this "tree" object?
  • bvdet
    Recognized Expert Specialist
    • Oct 2006
    • 2851

    #2
    To view the data in a class instance using print, define a __str__() method. A simple example:
    Code:
    >>> class Point(object):
    ... 	def __init__(self, x=0.0, y=0.0, z=0.0):
    ... 		self.x = x
    ... 		self.y = y
    ... 		self.z = z
    ... 	def __str__(self):
    ... 		return "Point(%s, %s, %s)" % (self.x, self.y, self.z)
    ... 	
    >>> Point(2,2,2)
    <__main__.Point object at 0x00F47050>
    >>> print Point(2,2,2)
    Point(2, 2, 2)
    >>>
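
Note that in the session above, typing `Point(2,2,2)` at the prompt still shows the default `<__main__.Point object at ...>` form. The interactive prompt uses `__repr__()`, while `print` uses `__str__()`. A minimal sketch, assuming you want the same text in both places: define only `__repr__()`, and `str()` and `print` will fall back to it. The class is the same illustrative `Point`, not part of the original poster's code, and the `print(...)` calls are written so they behave the same in Python 2 and 3.

```python
class Point(object):
    def __init__(self, x=0.0, y=0.0, z=0.0):
        self.x = x
        self.y = y
        self.z = z

    def __repr__(self):
        # Used by the interactive prompt, by repr(), and for items in lists.
        return "Point(%r, %r, %r)" % (self.x, self.y, self.z)

    # No __str__ is defined, so str() and print fall back to __repr__.


p = Point(2, 2, 2)
print(repr(p))       # Point(2, 2, 2)
print(str(p))        # Point(2, 2, 2)
print([p, Point()])  # [Point(2, 2, 2), Point(0.0, 0.0, 0.0)]
```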

    • natachai
      New Member
      • Jan 2010
      • 6

      #3
      Thank you, bvdet.

      However, the object "tree" that I have is a kind of class for a binary tree. It seems there are many objects (node data) inside just one main object.
      Sorry for the confusing question. I have posted the code for you here:

      Code:
      mydata = [['slashdot', 'USA', 'yes', '18', 'None'],
                ['google', 'France', 'yes', '23', 'Premium'],
                ['digg', 'USA', 'yes', '24', 'Basic'],
                ['kiwitobes', 'France', 'yes', '23', 'Basic'],
                ['google', 'UK', 'no', '21', 'Premium'],
                ['(direct)', 'New Zealand', 'no', '12', 'None'],
                ['(direct)', 'UK', 'no', '21', 'Basic'],
                ['google', 'USA', 'no', '24', 'Premium'],
                ['slashdot', 'France', 'yes', '19', 'None'],
                ['digg', 'USA', 'no', '18', 'None'],
                ['google', 'UK', 'no', '18', 'None'],
                ['kiwitobes', 'UK', 'no', '19', 'None'],
                ['digg', 'New Zealand', 'yes', '12', 'Basic'],
                ['slashdot', 'UK', 'no', '21', 'None'],
                ['google', 'UK', 'yes', '18', 'Basic'],
                ['kiwitobes', 'France', 'yes', '19', 'Basic']]
      class decisionnode:
          def __init__(self,col=-1,value=None,results=None,tb=None,fb=None):
              self.col=col
              self.value=value
              self.results=results
              self.tb=tb
              self.fb=fb
      def divideset(rows,column,value):
          # make a function that tells us if a row is in the first group (true)
          # or the second group (false)
          if isinstance(value,int) or isinstance(value,float):
              split_function=lambda row:row[column]>=value
          else:
              split_function=lambda row:row[column]==value
          # divide the rows into two sets and return them
          set1=[row for row in rows if split_function(row)]
          set2=[row for row in rows if not split_function(row)]
          return(set1,set2)
      def uniquecounts(rows):
          results ={}
          for row in rows:
              # the result is the last column
              r=row[len(row)-1]
              if r not in results: results[r]=0
              results[r]+=1
          return results
      def entropy(rows):
          from math import log
          log2 = lambda x:log(x)/log(2)
          results=uniquecounts(rows)
          # now calculate the entropy
          ent=0.0
          for r in results.keys():
              p=float(results[r])/len(rows)
              ent=ent-p*log2(p)
          return ent
      def buildtree(rows,scoref=entropy):
          if len(rows)==0: return decisionnode()
          current_score=scoref(rows)
      
          # Set up some variables to track the best criteria
          best_gain=0.0
          best_criteria=None
          best_sets=None
      
          column_count=len(rows[0])-1
          for col in range(0,column_count):
              # Generate the list of different value in
              # this column
              column_values={}
              for row in rows:
                  column_values[row[col]]=1
              # now try dividing the rows up for each value
              # in this column
              for value in column_values.keys():
                  (set1,set2)=divideset(rows,col,value)
                  
                  # information gain
                  p=float(len(set1))/len(rows)
                  gain=current_score-p*scoref(set1)-(1-p)*scoref(set2)
                  if gain>best_gain and len(set1)>0 and len(set2)>0:
                      best_gain=gain
                      best_criteria=(col,value)
                      best_sets=(set1,set2)
          # Create the subbranches
          if best_gain>0:
              # pass scoref down so a custom scoring function is used at every level
              trueBranch=buildtree(best_sets[0],scoref)
              falseBranch=buildtree(best_sets[1],scoref)
              return decisionnode(col=best_criteria[0],value=best_criteria[1],tb=trueBranch,fb=falseBranch)
          else:
              return decisionnode(results=uniquecounts(rows))
      I saved this code in treecart.py and imported it in the Python shell. I then created the binary tree as follows:

      >>> tree = treecart.buildtree(treecart.mydata)

      Now my problem is how to view this kind of variable, because it seems the tree variable holds several objects (node data) inside it.

      Thank you in advance. This board has been a great help to me.

      • bvdet
        Recognized Expert Specialist
        • Oct 2006
        • 2851

        #4
        You will need a recursive function to visit all the branches. I wrote one that recursively reports the value and results attributes of each node and of its fb and tb branches.
        Code:
        x = buildtree(mydata)
        
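        def print_tree(x, pad="", results=None):
            # Use a fresh list per top-level call; a mutable default ([])
            # is shared between calls and would keep accumulating lines.
            if results is None:
                results = []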
            if x:
                s = "%sValue:%s - Results:%s" % (pad, getattr(x, 'value', "NONE"), getattr(x, 'results', "NONE"))
                results.append(s)
                for branch in ('fb', 'tb'):
                    obj = getattr(x, branch, None)
                    if obj:
                        for attr in ('value', 'results'):
                            s = "%sBranch: %s - %s: %s" % (pad, branch, attr, getattr(obj, attr, 'NONE'))
                            results.append(s)
                        print_tree(obj, pad+"  ", results)
                    else:
                        results.append("%s  Object '%s' evaluates False" % (pad, branch))
            return results
        
        print
        sList = print_tree(x)
        print "\n".join(sList)
        The output:
        Code:
        >>> 
        Value:google - Results:None
        Branch: fb - value: slashdot
        Branch: fb - results: None
          Value:slashdot - Results:None
          Branch: fb - value: yes
          Branch: fb - results: None
            Value:yes - Results:None
            Branch: fb - value: 21
            Branch: fb - results: None
              Value:21 - Results:None
              Branch: fb - value: None
              Branch: fb - results: {'None': 3}
                Value:None - Results:{'None': 3}
                  Object 'fb' evaluates False
                  Object 'tb' evaluates False
              Branch: tb - value: None
              Branch: tb - results: {'Basic': 1}
                Value:None - Results:{'Basic': 1}
                  Object 'fb' evaluates False
                  Object 'tb' evaluates False
            Branch: tb - value: None
            Branch: tb - results: {'Basic': 4}
              Value:None - Results:{'Basic': 4}
                Object 'fb' evaluates False
                Object 'tb' evaluates False
          Branch: tb - value: None
          Branch: tb - results: {'None': 3}
            Value:None - Results:{'None': 3}
              Object 'fb' evaluates False
              Object 'tb' evaluates False
        Branch: tb - value: 18
        Branch: tb - results: None
          Value:18 - Results:None
          Branch: fb - value: None
          Branch: fb - results: {'Premium': 3}
            Value:None - Results:{'Premium': 3}
              Object 'fb' evaluates False
              Object 'tb' evaluates False
          Branch: tb - value: yes
          Branch: tb - results: None
            Value:yes - Results:None
            Branch: fb - value: None
            Branch: fb - results: {'None': 1}
              Value:None - Results:{'None': 1}
                Object 'fb' evaluates False
                Object 'tb' evaluates False
            Branch: tb - value: None
            Branch: tb - results: {'Basic': 1}
              Value:None - Results:{'Basic': 1}
                Object 'fb' evaluates False
                Object 'tb' evaluates False
        >>>
        The output does not make much sense to me.
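
One reason the listing above is hard to read is that each node is reported twice: once as a "Branch" of its parent and once as a "Value" of its own. A more compact layout prints each decision as a question, followed by its true and false branches. The following is only a sketch under stated assumptions: `format_tree` is a hypothetical helper, the two-leaf tree is built by hand rather than by buildtree(), and the `==` label is right only for string values (divideset() uses `>=` for numbers).

```python
class decisionnode:
    # Same attributes as the class posted in #3.
    def __init__(self, col=-1, value=None, results=None, tb=None, fb=None):
        self.col = col
        self.value = value
        self.results = results
        self.tb = tb
        self.fb = fb


def format_tree(node, indent=""):
    """Return a list of text lines describing the tree rooted at node."""
    if node is None:
        return [indent + "(empty)"]
    if node.results is not None:
        # Leaf node: show the class counts produced by uniquecounts().
        return [indent + str(node.results)]
    # Decision node: show the test, then the true and false branches.
    lines = ["%scolumn %s == %r ?" % (indent, node.col, node.value)]
    lines.append(indent + "T->")
    lines.extend(format_tree(node.tb, indent + "   "))
    lines.append(indent + "F->")
    lines.extend(format_tree(node.fb, indent + "   "))
    return lines


# Hand-built example tree (hypothetical, for illustration only).
tree = decisionnode(col=0, value='google',
                    tb=decisionnode(results={'Premium': 3}),
                    fb=decisionnode(results={'None': 3}))
print("\n".join(format_tree(tree)))
```

Returning a list of lines instead of printing directly keeps the function usable from a `__str__()` method as well.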

        • Glenton
          Recognized Expert Contributor
          • Nov 2008
          • 391

          #5
          The point is that if you want to preview it, you have to tell the class how. If you don't know what the preview is meant to look like, how will the class know?

          When you define the __str__ method you can put anything in there. You can do whatever calculations you need in order to show the value you'd like to see, as long as the method returns a string.

          If you want help with that, you need to tell us what kind of thing you want to see, or at least what the code is doing.

          • natachai
            New Member
            • Jan 2010
            • 6

            #6
            Thank you for all of your answers. I have a couple more questions, though.

            1. Could you point me to references on this kind of variable? (Is it called recursion, or a generator?)
            2. To view the structure of the "tree" variable, is the only way to write code like the one you provided in the previous post?

            Thank you in advance, and thank you for your patience with a newbie.

            • bvdet
              Recognized Expert Specialist
              • Oct 2006
              • 2851

              #7
              #1 Here's a simple example of recursion:
              Code:
              >>> def int_with_commas(s):
              ... 	if len(s) <= 3:
              ... 		return s
              ... 	return '%s,%s' % (int_with_commas(s[:-3]), s[-3:])
              ... 
              >>> int_with_commas('1234567890')
              '1,234,567,890'
              >>>
              The same function without recursion:
              Code:
              >>> def int_with_commas1(s):
              ... 	output = ""
              ... 	while True:
              ... 		s, sub = s[:-3], s[-3:]
              ... 		if not sub:
              ... 			return output
              ... 		if output:
              ... 			output = "%s,%s" % (sub, output)
              ... 		else:
              ... 			output = sub
              ... 			
              >>> print int_with_commas1('1234567890')
              1,234,567,890
              >>> print int_with_commas1('90')
              90
              >>>
              Sometimes recursion makes a function easier to write, and sometimes not.


              #2 Actually, it could be done like this:
              Code:
              class decisionnode:
                  def __init__(self,col=-1,value=None,results=None,tb=None,fb=None):
                      self.col=col
                      self.value=value
                      self.results=results
                      self.tb=tb
                      self.fb=fb
                      
                  def __str__(self):
                      return "\n".join(self.print_tree(self))
              
                  def print_tree(self, node, pad="", results=None):
                      # Use a fresh list per top-level call; a mutable default ([])
                      # is shared between calls, so str(tree) would grow each time.
                      if results is None:
                          results = []
                      if node:
                          s = "%sValue:%s - Results:%s" % (pad, getattr(node, 'value', "NONE"), getattr(node, 'results', "NONE"))
                          results.append(s)
                          for branch in ('fb', 'tb'):
                              obj = getattr(node, branch, None)
                              if obj:
                                  for attr in ('value', 'results'):
                                      s = "%sBranch: %s - %s: %s" % (pad, branch, attr, getattr(obj, attr, 'NONE'))
                                      results.append(s)
                                  self.print_tree(obj, pad+"  ", results)
                              else:
                                  results.append("%s  Object '%s' evaluates False" % (pad, branch))
                      return results
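
For a quick look at an instance without writing a class-specific method, the built-in `vars()` returns the instance's attribute dictionary. The following is a rough sketch, not a definitive tool: `dump` is a hypothetical helper that lists every attribute and recurses into attributes that are instances of the same class. The small tree is built by hand for illustration.

```python
class decisionnode:
    # Same attributes as the class posted in #3.
    def __init__(self, col=-1, value=None, results=None, tb=None, fb=None):
        self.col = col
        self.value = value
        self.results = results
        self.tb = tb
        self.fb = fb


def dump(obj, indent=""):
    """Return lines listing every attribute of obj, recursing into
    attributes that are instances of the same class as obj."""
    lines = []
    for name, value in sorted(vars(obj).items()):
        if isinstance(value, obj.__class__):
            # Nested node: print its name, then its own attributes indented.
            lines.append("%s%s:" % (indent, name))
            lines.extend(dump(value, indent + "  "))
        else:
            lines.append("%s%s = %r" % (indent, name, value))
    return lines


# Hand-built example tree (hypothetical, for illustration only).
tree = decisionnode(col=0, value='google',
                    tb=decisionnode(results={'Premium': 3}),
                    fb=decisionnode(results={'None': 3}))
print("\n".join(dump(tree)))
```

Because it relies only on `vars()`, the same helper works on any class whose instances store their data as ordinary attributes, at the cost of showing everything rather than a curated view.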

              • Glenton
                Recognized Expert Contributor
                • Nov 2008
                • 391

                #8
                By the way, recursion, though sometimes easier to write, is often more expensive in processing time. So if your real data set is much bigger than what you've shown, you may need to rethink. Sometimes the best approach is what's known as memoization: you save the answers as you generate them and reuse them rather than regenerating them. The easy way to do this is to pass the saved data in and out of the function.
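
As a minimal sketch of that idea, here is a recursive Fibonacci function that passes a cache dictionary down through its calls. The function and its `cache` parameter are illustrative assumptions, not part of the tree code. Memoization pays off only when the same arguments recur, as they do here. A tree walk that visits each node once would not benefit from it.

```python
def fib(n, cache=None):
    """Return the n-th Fibonacci number, saving each answer in a dict
    that is passed down through the recursive calls."""
    if cache is None:
        cache = {}          # fresh cache for each top-level call
    if n < 2:
        return n
    if n not in cache:
        # Compute once, store, and reuse on every later request for n.
        cache[n] = fib(n - 1, cache) + fib(n - 2, cache)
    return cache[n]


print(fib(30))  # 832040
```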

                Also, "to understand recursion you first have to understand recursion" is a terribly old coders' joke, but it's almost obligatory to include it in any discussion, so there it is.

                To answer your second question: the only way to view the tree variable is to write code that returns the data you want to view.
                It could be as simple as:
                Code:
                def __str__(self):
                    # __str__ must return a string, not a tuple
                    return "%s, %s" % (self.value, self.results)
                However, it's fundamentally up to you to decide what the "tree" variable should look like. If you tell us what it should look like, we can help you with how to code it!
