dictionary comparison

**Bill Mill** · Jul 19 '05, 12:57 AM

Re: dictionary comparison

On 5 May 2005 08:19:31 -0700, rickle <devrick88@gmail.com> wrote:[color=blue]
> I'm trying to compare sun patch levels on a server to those of what sun
> is recommending. For those that aren't familiar with sun patch
> numbering here is a quick run down.
>
> A patch number shows up like this:
> 113680-03
> ^^^^^^ ^^
> patch# revision
>
> What I want to do is make a list. I want to show what server x has
> versus what sun recommends, and if the patch exists, but the revision
> is different, I want to show that difference.
>
> Here are some sample patches that sun recommends:
> 117000-05
> 116272-03
> 116276-01
> 116278-01
> 116378-02
> 116455-01
> 116602-01
> 116606-01
>
> Here are some sample patches that server x has:
> 117000-01
> 116272-02
> 116272-01
> 116602-02
>
> So there are some that are the same, some that sun recommends that
> server x doesn't have, and some where the patch is the same but the
> revision is different.
>
> I've thrown the data into dictionaries, but I just can't seem to figure
> out how I should actually compare the data and present it. Here's what
> I have so far (the split is in place because there is actually a lot
> more data in the file, so I split it out so I just get the patch number
> and revision). So I end up with (for example) 116272-01, then split so
> field[0] is 116272 and field[1] is 01.
>
> def sun():
>     sun = open('sun-patchlist', 'r')
>     for s in sun:
>         sun_fields = s.split(None, 7)
>         for sun_field in sun_fields:
>             sun_field = sun_field.strip()
>         sun_patch = {}
>         sun_patch['number'] = sun_fields[0]
>         sun_patch['rev'] = sun_fields[1]
>         print sun_patch['number'], sun_patch['rev']
>     sun.close()
>
> def serverx():
>     serverx = open('serverx-patchlist', 'r')
>     for p in serverx:
>         serverx_fields = p.split(None, 7)
>         for serverx_field in serverx_fields:
>             serverx_field = serverx_field.strip()
>         serverx_patch = {}
>         serverx_patch['number'] = serverx_fields[0]
>         serverx_patch['rev'] = serverx_fields[1]
>         print serverx_patch['number'], serverx_patch['rev']
>     serverx.close()
> [/color]

The first thing you should notice about this code is that you copied a
good amount of code between functions; this should be a huge warning
bell that something can be abstracted out into a function. In this
case, it's the parsing of the patch files.

Also, you should see that you're creating a new dictionary every
iteration through the loop, and furthermore, you're not returning it
at the end of your function. Thus, it's destroyed when the function
exits and it goes out of scope.

<snip>

Anyway, since you at least made an effort, here's some totally
untested code that should (I think) do something close to what you're
looking for:

def parse_patch_file(f):
    patches = {}
    for line in f:
        patch, rev = line.strip().split('-')
        patches[patch] = rev
    return patches

def diff_patches(sun, serverx):
    for patch in sun:
        if not serverx.has_key(patch):
            print "Sun recommends patch %s" % patch
    for patch in serverx:
        if not sun.has_key(patch):
            print "Serverx has unnecessary patch %s" % patch

def diff_revs(sun, serverx):
    for patch, rev in sun.iteritems():
        if serverx.has_key(patch) and rev != serverx[patch]:
            # revisions are strings such as '03', so format them with %s
            print "Sun recommends rev %s of patch %s; serverx has rev %s"\
                  % (rev, patch, serverx[patch])

if __name__ == '__main__':
    sun = parse_patch_file(open('sun-patchlist'))
    serverx = parse_patch_file(open('serverx-patchlist'))
    diff_patches(sun, serverx)
    diff_revs(sun, serverx)
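
For anyone running this on Python 3, where has_key(), iteritems() and the print statement are gone, the same idea can be sketched roughly as below. This is only a sketch under assumptions, not the code above: parse_patch_lines is a hypothetical name, every non-blank line is assumed to look like 116272-03, and the differences are returned rather than printed so they can be reused.

```python
def parse_patch_lines(lines):
    """Map patch number -> revision for lines shaped like '116272-03'."""
    patches = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        patch, rev = line.split('-')
        patches[patch] = rev  # a later line for the same patch replaces an earlier one
    return patches


def diff_patches(sun, serverx):
    """Return (missing, extra): patches only Sun lists, patches only the server has."""
    missing = [patch for patch in sorted(sun) if patch not in serverx]
    extra = [patch for patch in sorted(serverx) if patch not in sun]
    return missing, extra


def diff_revs(sun, serverx):
    """Return (patch, sun_rev, server_rev) for shared patches whose revisions differ."""
    return [(patch, rev, serverx[patch])
            for patch, rev in sorted(sun.items())
            if patch in serverx and rev != serverx[patch]]


# A few of the sample entries from the original question.
sun = parse_patch_lines(["117000-05", "116272-03", "116276-01", "116602-01"])
serverx = parse_patch_lines(["117000-01", "116272-02", "116272-01", "116602-02"])

missing, extra = diff_patches(sun, serverx)
for patch in missing:
    print("Sun recommends patch %s" % patch)
for patch in extra:
    print("Serverx has unnecessary patch %s" % patch)
for patch, sun_rev, server_rev in diff_revs(sun, serverx):
    print("Sun recommends rev %s of patch %s; serverx has rev %s"
          % (sun_rev, patch, server_rev))
```

Note that a plain patch -> revision dictionary keeps only the last revision seen for a patch, so the two 116272 entries on the server collapse into one.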

Hope this helps.

Peace
Bill Mill
bill.mill at gmail.com

**Jordan Rastrick** · Jul 19 '05, 12:58 AM

Re: dictionary comparison

rickle wrote:[color=blue]
> I'm trying to compare sun patch levels on a server to those of what sun
> is recommending. For those that aren't familiar with sun patch
> numbering here is a quick run down.
>
> A patch number shows up like this:
> 113680-03
> ^^^^^^ ^^
> patch# revision
>
> What I want to do is make a list. I want to show what server x has
> versus what sun recommends, and if the patch exists, but the revision
> is different, I want to show that difference.
>
> Here are some sample patches that sun recommends:
> 117000-05
> 116272-03
> 116276-01
> 116278-01
> 116378-02
> 116455-01
> 116602-01
> 116606-01
>
> Here are some sample patches that server x has:
> 117000-01
> 116272-02
> 116272-01
> 116602-02
>
> So there are some that are the same, some that sun recommends that
> server x doesn't have, and some where the patch is the same but the
> revision is different.
>
> I've thrown the data into dictionaries, but I just can't seem to figure
> out how I should actually compare the data and present it. Here's what
> I have so far (the split is in place because there is actually a lot
> more data in the file, so I split it out so I just get the patch number
> and revision). So I end up with (for example) 116272-01, then split so
> field[0] is 116272 and field[1] is 01.
>
> def sun():
>     sun = open('sun-patchlist', 'r')
>     for s in sun:
>         sun_fields = s.split(None, 7)
>         for sun_field in sun_fields:
>             sun_field = sun_field.strip()
>         sun_patch = {}
>         sun_patch['number'] = sun_fields[0]
>         sun_patch['rev'] = sun_fields[1]
>         print sun_patch['number'], sun_patch['rev']
>     sun.close()
>
> def serverx():
>     serverx = open('serverx-patchlist', 'r')
>     for p in serverx:
>         serverx_fields = p.split(None, 7)
>         for serverx_field in serverx_fields:
>             serverx_field = serverx_field.strip()
>         serverx_patch = {}
>         serverx_patch['number'] = serverx_fields[0]
>         serverx_patch['rev'] = serverx_fields[1]
>         print serverx_patch['number'], serverx_patch['rev']
>     serverx.close()
>
> if __name__=='__main__':
>     sun()
>     serverx()
>
>
> Right now I'm just printing the data, just to be sure that each
> dictionary contains the correct data, which it does. But now I need
> the comparison and I just can't seem to figure it out. I could
> probably write this in perl or a shell script, but I'm trying really
> hard to force myself to learn Python so I want this to be a python
> script, created with only built-in modules.
>
> Any help would be greatly appreciated,
> Rick[/color]

Well, it seems that what you're asking is more of a generic programming
question than anything specific to Python - if you can think of how
you'd solve this in Perl, for example, then a Python solution along the
same lines would work just as well. I'm not sure if there was some
specific issue with Python that was confusing you - if so, perhaps you
could state it more explicitly.

To address the problem itself, there are a few things about your
approach in the above code that I find puzzling. First of all, the
sun() and serverx() functions are identical, except for the name of the
file they open. This kind of code duplication is bad practice, in
Python, Perl, or any other language (even Shell scripting perhaps,
although I wouldn't really know) - you should definitely use a single
function that takes a filename as an argument instead.

Second, you are creating a new dictionary inside every iteration of the
for loop, one for every patch in the file; each dictionary you create
contains one patch number and one revision number. This data is
printed, and thereafter ignored (and thus will be reclaimed by Python's
garbage collector). Hence you're not actually storing it for later use.
I don't know whether this was because you were unsure how to proceed to
comparing the two datasets; however, I think what you probably
wanted was to have a single dictionary that keeps track of all the
patches in the file. You need to define this outside the for loop; and,
if you want to use it outside the body of the function, you'll need to
return it. Also, rather than have a dictionary of two values, keyed by
strings, I'd suggest a dictionary mapping patch numbers to their
corresponding revision numbers is what you want.

Once you've got two dictionaries - one for the server's patches,
and one for Sun's recommended patches - you can compare the
two sets of data by going through Sun's patches, checking if the
server has that patch, and if so, calculating the difference in
revision numbers.

So here's a rough idea of how I'd suggest modifying what you've got to
get the intended result:

def patchlevels(filename):
    patchfile = open(filename, 'r')
    patch_dict = {}
    for line in patchfile:
        fields = line.split(None, 7)
        for field in fields:
            field = field.strip()
        number = fields[0]
        rev = fields[1]
        patch_dict[number] = rev
        # print number, patch_dict[number]
    patchfile.close()
    return patch_dict

if __name__=='__main__':
    sun = patchlevels('sun-patchfile')
    serverx = patchlevels('serverx-patchfile')
    print "Sun recommends:\t\t", "Server has:\n"
    for patch in sun:
        if patch in serverx:
            rev = serverx[patch]
            diff = int(rev) - int(sun[patch])
            serverhas = "Revision: %s Difference: %s" % (rev, diff)
        else:
            serverhas = "Does not have this patch"
        print patch, sun[patch], "\t\t", serverhas

I've tried to stay as close to your code as possible and not introduce
new material, although I have had to use the inbuilt function int to
convert the revision numbers from strings to integers in order to
subtract one from the other; also, I used C printf-style string
formatting on the line after. I hope it's reasonably obvious what these
things do.

For the sample data you gave, this outputs:

Sun recommends:         Server has:

116276 01               Does not have this patch
116378 02               Does not have this patch
116272 03               Revision: 01 Difference: -2
116278 01               Does not have this patch
116602 01               Revision: 02 Difference: 1
116606 01               Does not have this patch
116455 01               Does not have this patch
117000 05               Revision: 01 Difference: -4

Here negative differences mean the server's version of the patch is out
of date, whereas zero or positive differences mean it's as recent as Sun's
recommendation or better. You could change the nature of the output to
whatever your own preference is easily enough. Or, if you want to store
the data in some other structure like a list for further processing,
instead of just printing it, that's also pretty simple to do.
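
As a rough illustration of collecting the comparison into a list instead of printing it, here is a Python 3 sketch. The function name compare_revisions and the tuple layout are made up for this example, and the two small dictionaries stand in for the ones built from the patch files.

```python
def compare_revisions(sun, serverx):
    """Return one (patch, sun_rev, server_rev, diff) tuple per recommended patch.

    server_rev and diff are None when the server does not have the patch.
    """
    rows = []
    for patch in sorted(sun):
        if patch in serverx:
            rev = serverx[patch]
            rows.append((patch, sun[patch], rev, int(rev) - int(sun[patch])))
        else:
            rows.append((patch, sun[patch], None, None))
    return rows


# Stand-in data: patch number -> revision, as strings.
sun = {"117000": "05", "116272": "03", "116276": "01"}
serverx = {"117000": "01", "116272": "01"}

rows = compare_revisions(sun, serverx)

# Further processing is then ordinary list handling, e.g. the out-of-date patches:
out_of_date = [row[0] for row in rows if row[3] is not None and row[3] < 0]
print(out_of_date)
```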

This code isn't exactly a work of art, I could have put more effort
into a sensible name for the function and variables, made it more
'pythonic' (e.g. by using a list-comprehension in place of the
whitespace stripping for loop), etc; but I think it achieves the
desired result, or something close to it, right?
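
To make the list-comprehension remark concrete, here is a small Python 3 sketch. The sample line is invented (the real file layout is not shown in the thread); it only needs enough columns for split(None, 7) to leave trailing whitespace on the last field. It also shows that the stripping for loop rebinds its loop variable and leaves the list itself unchanged.

```python
# Hypothetical input line: patch number, revision, then further columns.
line = "116272 01 a b c d e final words here\n"

fields = line.split(None, 7)
for field in fields:
    field = field.strip()  # rebinds the loop variable only; fields is not modified

# The list comprehension builds a new list of stripped fields instead.
stripped = [field.strip() for field in line.split(None, 7)]

print(repr(fields[-1]))    # still carries the trailing newline
print(repr(stripped[-1]))  # newline removed
```

Since only the first two fields are used and split() on whitespace already trims those, the stripping makes no practical difference to the patch number and revision here.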

Let me know if I was on completely the wrong track.

**rickle** · Jul 19 '05, 12:58 AM

Re: dictionary comparison

Bill and Jordan, thank you both kindly. I'm not too well versed in
functions in python and that's exactly what I needed. I could see I
was doing something wrong in my original attempt, but I didn't know how
to correct it.

It's working like a charm now, thank you both very much.
-Rick

**James Stroud** · Jul 19 '05, 12:58 AM

Re: dictionary comparison

On Thursday 05 May 2005 10:20 am, so sayeth rickle:[color=blue]
> Bill and Jordan, thank you both kindly. I'm not too well versed in
> functions in python and that's exactly what I needed. I could see I
> was doing something wrong in my original attempt, but I didn't know how
> to correct it.
>
> It's working like a charm now, thank you both very much.
> -Rick[/color]

I thought I'd throw this in to show some things in Python that make such comparisons very easy to write, and also to recommend using the patch as key and version as value in the dict:

Note that the meat of the code is really about 4 lines because of (module) sets and list comprehension. Everything else is window dressing.

James

====================================

#!/usr/bin/env python

from sets import Set

# pretending these stripped from file
recc_ary = ["117000-05", "116272-03", "116276-01", "116278-01", "116378-02", "116455-01", "116602-01", "116606-01"]
serv_ary = ["117000-01", "116272-02", "116272-01", "116602-02"]

# use patch as value and version as key
recc_dct = dict([x.split("-") for x in recc_ary])
serv_dct = dict([x.split("-") for x in serv_ary])

# use Set to see if patches overlap
overlap = Set(recc_dct.keys()).intersection(serv_dct.keys())

# find differences (change comparison operator to <,>,<=,>=, etc.)
diffs = [patch for patch in overlap if recc_dct[patch] != serv_dct[patch]]

# print a pretty report
for patch in diffs:
    print "recommended patch for %s (%s) is not server patch (%s)" % \
          (patch, recc_dct[patch], serv_dct[patch])
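
The sets module was later removed in Python 3, where the built-in set type does the same job. A rough Python 3 equivalent of the script above, using the same sample lists and sorting the result only so that the report order is stable:

```python
recc_ary = ["117000-05", "116272-03", "116276-01", "116278-01",
            "116378-02", "116455-01", "116602-01", "116606-01"]
serv_ary = ["117000-01", "116272-02", "116272-01", "116602-02"]

# patch number is the key, revision is the value
recc_dct = dict(x.split("-") for x in recc_ary)
serv_dct = dict(x.split("-") for x in serv_ary)

# built-in set replaces sets.Set; iterating a dict yields its keys
overlap = set(recc_dct).intersection(serv_dct)

# patches present on both sides whose revisions differ
diffs = sorted(patch for patch in overlap if recc_dct[patch] != serv_dct[patch])

for patch in diffs:
    print("recommended patch for %s (%s) is not server patch (%s)"
          % (patch, recc_dct[patch], serv_dct[patch]))
```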

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/

**Bengt Richter** · Jul 19 '05, 12:58 AM

Re: dictionary comparison

On 5 May 2005 08:19:31 -0700, "rickle" <devrick88@gmail.com> wrote:
[color=blue]
>I'm trying to compare sun patch levels on a server to those of what sun
>is recommending. For those that aren't familiar with sun patch
>numbering here is a quick run down.
>
>A patch number shows up like this:
>113680-03
>^^^^^^ ^^
>patch# revision
>
>What I want to do is make a list. I want to show what server x has
>versus what sun recommends, and if the patch exists, but the revision
>is different, I want to show that difference.
>
>Here are some sample patches that sun recommends:
>117000-05
>116272-03
>116276-01
>116278-01
>116378-02
>116455-01
>116602-01
>116606-01
>
>Here are some sample patches that server x has:
>117000-01
>116272-02
>116272-01
>116602-02
>
>So there are some that are the same, some that sun recommends that
>server x doesn't have, and some where the patch is the same but the
>revision is different.
>
>I've thrown the data into dictionaries, but I just can't seem to figure
>out how I should actually compare the data and present it. Here's what
>I have so far (the split is in place because there is actually a lot
>more data in the file, so I split it out so I just get the patch number
>and revision). So I end up with (for example) 116272-01, then split so
>field[0] is 116272 and field[1] is 01.
>
>def sun():
>    sun = open('sun-patchlist', 'r')
>    for s in sun:
>        sun_fields = s.split(None, 7)
>        for sun_field in sun_fields:
>            sun_field = sun_field.strip()
>        sun_patch = {}
>        sun_patch['number'] = sun_fields[0]
>        sun_patch['rev'] = sun_fields[1]
>        print sun_patch['number'], sun_patch['rev']
>    sun.close()
>
>def serverx():
>    serverx = open('serverx-patchlist', 'r')
>    for p in serverx:
>        serverx_fields = p.split(None, 7)
>        for serverx_field in serverx_fields:
>            serverx_field = serverx_field.strip()
>        serverx_patch = {}
>        serverx_patch['number'] = serverx_fields[0]
>        serverx_patch['rev'] = serverx_fields[1]
>        print serverx_patch['number'], serverx_patch['rev']
>    serverx.close()
>
>if __name__=='__main__':
>    sun()
>    serverx()
>
>
>Right now I'm just printing the data, just to be sure that each
>dictionary contains the correct data, which it does. But now I need
>the comparison and I just can't seem to figure it out. I could
>probably write this in perl or a shell script, but I'm trying really
>hard to force myself to learn Python so I want this to be a python
>script, created with only built-in modules.
>
>Any help would be greatly appreciated,
>[/color]
In place of sun_rec.splitlines() and x_has.splitlines() you can substitute
open('sun-patchlist') and open('serverx-patchlist') respectively,
and you can wrap it all in some routine for your convenience etc.
But this shows which recommended revs are there, which are missing, and which unrecommended revs are present.
I added some test data to illustrate. You might want to make the input a little more forgiving about
e.g. blank lines etc. or raise exceptions for what's not allowed or expected.

----< sunpatches.py >--------------------------------------------------------------
#Here are some sample patches that sun recommends:
sun_rec = """\
117000-05
116272-03
116276-01
116278-01
116378-02
116455-01
116602-01
116606-01
testok-01
testok-02
testok-03
test_0-01
test_0-02
test_0-03
test_2-01
test_2-02
test_2-03
test23-02
test23-03
"""

#Here are some sample patches that server x has:
x_has = """\
117000-01
116272-02
116272-01
116602-02
testok-01
testok-02
testok-03
test_2-01
test_2-02
test23-01
test23-02
test23-03
"""

def mkdict(lineseq):
    dct = {}
    for line in lineseq:
        # strip() drops the trailing newline when lineseq is an open file
        patch, rev = line.strip().split('-')
        dct.setdefault(patch, set()).add(rev)
    return dct

dct_x_has = mkdict(x_has.splitlines()) # or e.g., mkdict(open('sunrecfile.txt'))
dct_sun_rec = mkdict(sun_rec.splitlines())

for sunpatch, sunrevs in sorted(dct_sun_rec.items()):
    xrevs = dct_x_has.get(sunpatch, set())
    print 'patch %s: recommended revs %s, missing %s, actual other %s'%(
        sunpatch, map(str,sunrevs&xrevs) or '(none)',
        map(str,sunrevs-xrevs) or '(none)', map(str,xrevs-sunrevs) or '(none)')
----------------------------------------------------------------------------------
Result:

[12:51] C:\pywk\clp>py24 sunpatches.py
patch 116272: recommended revs (none), missing ['03'], actual other ['02', '01']
patch 116276: recommended revs (none), missing ['01'], actual other (none)
patch 116278: recommended revs (none), missing ['01'], actual other (none)
patch 116378: recommended revs (none), missing ['02'], actual other (none)
patch 116455: recommended revs (none), missing ['01'], actual other (none)
patch 116602: recommended revs (none), missing ['01'], actual other ['02']
patch 116606: recommended revs (none), missing ['01'], actual other (none)
patch 117000: recommended revs (none), missing ['05'], actual other ['01']
patch test23: recommended revs ['02', '03'], missing (none), actual other ['01']
patch test_0: recommended revs (none), missing ['02', '03', '01'], actual other (none)
patch test_2: recommended revs ['02', '01'], missing ['03'], actual other (none)
patch testok: recommended revs ['02', '03', '01'], missing (none), actual other (none)

Oops, didn't put multiple revs in sort order. Oh well, you can do that if you like.
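
One possible way to get the revs in sort order, and to make the input more forgiving, is sketched below for Python 3. It is a variation under assumptions, not the script above: blank lines are skipped, anything that is not PATCH-REV raises ValueError, and fmt is a hypothetical helper that joins the sorted revs into a string.

```python
def mkdict(lineseq):
    """Map patch -> set of revisions, skipping blank lines."""
    dct = {}
    for lineno, line in enumerate(lineseq, 1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        patch, sep, rev = line.partition('-')
        if not sep or not patch or not rev:
            raise ValueError("line %d: expected PATCH-REV, got %r" % (lineno, line))
        dct.setdefault(patch, set()).add(rev)
    return dct


def fmt(revs):
    """Join revisions in sorted order, or show '(none)' for an empty set."""
    return ', '.join(sorted(revs)) or '(none)'


# A small subset of the test data, with a blank line thrown in.
sun_rec = mkdict("116272-03\n\ntest_2-01\ntest_2-02\ntest_2-03\n".splitlines())
x_has = mkdict("116272-02\n116272-01\ntest_2-02\ntest_2-01\n".splitlines())

report = []
for patch, sunrevs in sorted(sun_rec.items()):
    xrevs = x_has.get(patch, set())
    report.append('patch %s: recommended revs %s, missing %s, actual other %s' % (
        patch, fmt(sunrevs & xrevs), fmt(sunrevs - xrevs), fmt(xrevs - sunrevs)))
print('\n'.join(report))
```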

Regards,
Bengt Richter

**Bengt Richter** · Jul 19 '05, 12:58 AM

Re: dictionary comparison

On Thu, 5 May 2005 10:37:23 -0700, James Stroud <jstroud@mbi.ucla.edu> wrote:
[...]
We had the same impulse ;-)
(see my other post in this thread)[color=blue]
>
># use patch as value and version as key[/color]
??? seems the other way around (as it should be?)
[color=blue]
>recc_dct = dict([x.split("-") for x in recc_ary])
>serv_dct = dict([x.split("-") for x in serv_ary])
>[/color]
But what about multiple revs for the same patch?
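
The concern is that dict() keeps only the last value it sees for a repeated key. A minimal Python 3 sketch with the sample server list from the thread shows the effect:

```python
serv_ary = ["117000-01", "116272-02", "116272-01", "116602-02"]

# Each "patch-rev" string becomes a [patch, rev] pair; repeated patches overwrite.
serv_dct = dict(x.split("-") for x in serv_ary)

print(len(serv_dct))        # three patches, although the list has four entries
print(serv_dct["116272"])   # the later "116272-01" replaced "116272-02"
```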

Regards,
Bengt Richter

**James Stroud** · Jul 19 '05, 12:58 AM

Re: dictionary comparison

On Thursday 05 May 2005 01:18 pm, so sayeth Bengt Richter:[color=blue]
> On Thu, 5 May 2005 10:37:23 -0700, James Stroud <jstroud@mbi.ucla.edu>
> wrote: [...]
> We had the same impulse ;-)
> (see my other post in this thread)
>[color=green]
> ># use patch as value and version as key[/color]
>
> ??? seems the other way around (as it should be?)[/color]

Sorry, typo in the comment.
[color=blue]
>[color=green]
> >recc_dct = dict([x.split("-") for x in recc_ary])
> >serv_dct = dict([x.split("-") for x in serv_ary])[/color]
>
> But what about multiple revs for the same patch?[/color]

My Bad...

serv_dct = dict([(a,max([z for y,z in [f.split("-") for f in serv_ary] if a==y]))
                 for a,b in [g.split("-") for g in serv_ary]])

;o)
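
A less compact way to keep the highest revision per patch is a single pass over the list. The Python 3 sketch below uses a hypothetical helper name, highest_revs, and assumes revisions are numeric strings, so it compares them as integers rather than as text.

```python
def highest_revs(entries):
    """Map each patch to its highest revision, given 'patch-rev' strings."""
    best = {}
    for entry in entries:
        patch, rev = entry.split("-")
        # compare numerically so that e.g. '10' ranks above '9'
        if patch not in best or int(rev) > int(best[patch]):
            best[patch] = rev
    return best


serv_ary = ["117000-01", "116272-02", "116272-01", "116602-02"]
serv_dct = highest_revs(serv_ary)
print(serv_dct)
```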

James

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
