User Profile
-
Thanks, that does make more sense now!
-
Thanks so much...
Now, seeing as I don't really see something like this in the documentation, could you explain this line a bit more?
datetime.datetime(*time.strptime(s, "%Y-%m-%d %H:%M:%S")[:6])
I understand calling datetime.datetime, but why the * before time? And what exactly does the [:6] pertain to?
Thanks again for your help!!!
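For anyone landing here with the same question, here is a minimal sketch that takes that one-liner apart step by step. The sample string s is an assumption, borrowed from the date format discussed in this thread.

```python
import time
from datetime import datetime

s = "2010-01-15 23:15:30"  # assumed sample value

# time.strptime returns a struct_time: a 9-item sequence
# (year, month, day, hour, minute, second, weekday, day of year, DST flag).
parsed = time.strptime(s, "%Y-%m-%d %H:%M:%S")

# [:6] is an ordinary slice: it keeps only the first six items,
# year through second, and drops the last three.
first_six = parsed[:6]

# The * unpacks that tuple into separate positional arguments, so this
# is the same call as datetime(2010, 1, 15, 23, 15, 30).
dt = datetime(*first_six)
print(dt)
```

Without the slice, the weekday, day-of-year and DST items would also be passed along, and those are not arguments the datetime constructor expects.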
-
dealing with datetime while parsing a csv
Hello all!
I am parsing a csv file and one of the fields is a date time field that looks something like this
2010-01-15 23:15:30
year-month-day hour24:minute:second
As I loop through this csv I am going to need to do some time arithmetic. My question is, how do I turn that field from a string in my list into a datetime object?
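A minimal sketch of one way to do this, assuming the timestamp sits in the second column; the in-memory sample below stands in for the real file and both of its rows are made up. It uses datetime.strptime, which exists from Python 2.5 onward; on older interpreters the time.strptime form is the usual substitute.

```python
import csv
import io
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# Stand-in for the real csv file; column layout and values are assumptions.
sample = io.StringIO("a,2010-01-15 23:15:30\nb,2010-01-16 00:45:30\n")

stamps = []
for row in csv.reader(sample):
    # Convert the string field into a datetime object.
    stamps.append(datetime.strptime(row[1], FMT))

# Subtracting two datetimes gives a timedelta.
elapsed = stamps[1] - stamps[0]
# Adding a timedelta to a datetime gives another datetime.
later = stamps[0] + timedelta(hours=2)
print(elapsed, later)
```

Once the field is a datetime, comparisons such as stamps[0] < stamps[1] also work directly.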
...
Code:
from datetime import datetime
TmpArr = []
reader = open("c:/test.csv", 'r')
-
wow, thanks dwblas!
That dramatically increased the search speed! Still a lot to learn with Python...
Very much appreciated!
Eric
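The suggestion being thanked here is not shown in this feed. A common way to get this kind of speed-up when matching on a unique ID is to index one table in a dict first, so the sketch below assumes that approach; the table contents and names are hypothetical.

```python
# Hypothetical rows: (unique_id, value) pairs for each table.
table_a = [("41", "apple"), ("46", "pear"), ("52", "plum")]
table_b = [("46", 3.5), ("41", 1.25), ("99", 7.0)]

# Build the index once, with a single pass over table_b.
b_by_id = {}
for uid, val in table_b:
    b_by_id[uid] = val

# Each match is now a dict lookup instead of a scan of table_b.
matched = []
for uid, name in table_a:
    if uid in b_by_id:
        matched.append((uid, name, b_by_id[uid]))

print(matched)
```

With nested loops the work grows with len(table_a) * len(table_b); with the dict it grows roughly with len(table_a) + len(table_b).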
-
best way to match values in two tables...
Hey all,
Sorry the subject should have said..
"best way to match values in TWO tables"
I have two tables that I need to match based on a unique ID in both tables. I'm running this process using Hadoop streaming with Python, so the actual code is a bit different (i.e. using csv files to debug locally). I've tried a couple of different methods and neither is quite fast enough... ha!
First was like this, using the...
-
Alright, well, ignore this post... Just figured out that if I print the final values after the loop, it works.
...
Code:
#!/usr/bin/python
import sys
TmpArr = []
OutArr = []
i = int(0)
j = int(0)
id = ""
VAL1 = int(0)
VAL2 = int(0)
TOTAL = int(0)
for line in sys.stdin:
    j += 1
    try:
        line = line.strip()
        TmpArr
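A minimal sketch of why printing after the loop matters, assuming the goal is to total VAL1 and VAL2 per ID (the full script is cut off, so that goal and the function name are assumptions). The sample lines are the ones from the original question.

```python
def reduce_lines(lines):
    """Sum VAL1 and VAL2 per ID for input that is already sorted by ID."""
    results = []
    current_id = None
    val1 = val2 = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        uid, v1, v2 = line.split(",")
        if current_id is not None and uid != current_id:
            # ID changed: emit the finished group and start a new one.
            results.append((current_id, val1, val2))
            val1 = val2 = 0
        current_id = uid
        val1 += int(v1)
        val2 += int(v2)
    # The last group never sees an "ID changed" event, so emit it here.
    if current_id is not None:
        results.append((current_id, val1, val2))
    return results


sample = ["41,0,1", "41,1,0", "41,1,0", "46,0,1", "46,0,1", "46,1,0", "46,1,0"]
totals = reduce_lines(sample)
print(totals)
```

In a streaming job the same function could be fed sys.stdin instead of the sample list.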
-
looping through values error....
Hey all.
I do not understand what is wrong with my script and would love some help... First off, the examples in my script are based on running a map reduce in Hadoop. The part I am struggling with is the reduce. My basic input is something like this
ID--VAL1--VAL2
41,0,1
41,1,0
41,1,0
46,0,1
46,0,1
46,1,0
46,1,0
Basically I need to loop through each line and check to see if the ID from...
-
Alright, after a few more web searches I found it, so I thought I would share in case other folks ever need to do this.
...
Code:
Imports System
Imports System.IO
Imports System.Net
Dim uriWebSite As New Uri("http://ourserver...com:port#/filetos...es/actual_file")
Dim WReq As WebRequest = System.Net.WebRequest.Create(uriWebSite)
Dim wResp As WebResponse = WReq.GetResponse()
Dim
-
reading a file from a url to array
Hey All,
I have been trying to research how to get a file (basically a csv file) from a URL into an array, and my search has come up with some examples, but not quite what I am looking for. I would appreciate any direction and/or other examples from folks who have done this before.
Basically we have a cloud computer set up (Hadoop) and the output of some of our processes is what I am trying to get to..
the file is something like
...
-
Oh... I found my error...
In line 15 I was appending all of a to d, then joining c and d into tuple e. What I should have done is just join c and a into e...
My output is now correct... thank you so much!
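For reference, a minimal sketch of the tuple method (decorate, sort, undecorate) that also runs on old interpreters, since list.sort() only gained its key= argument in Python 2.4. The first sample row is from this thread; the other two rows and the choice of sort columns (second, then third) are assumptions.

```python
lines = [
    "0001 , 4 , 0.34 , 3 , 15 , 25.3",   # row from the thread
    "0002 , 2 , 0.90 , 3 , 11 , 12.0",   # made-up row
    "0003 , 4 , 0.10 , 1 , 15 , 30.1",   # made-up row
]

rows = [[field.strip() for field in line.split(",")] for line in lines]

# Decorate: put the sort columns (converted to numbers) in front of each row.
decorated = [(int(row[1]), float(row[2]), row) for row in rows]

# Sort: tuples compare element by element, so the second column decides
# first and the third column breaks ties.
decorated.sort()

# Undecorate: keep only the original row.
sorted_rows = [item[-1] for item in decorated]

for row in sorted_rows:
    print(row)
```

The key point matches the fix described above: each tuple pairs the sort values with one whole row, not with a list of all rows.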
-
sorry,
I didn't mean to imply that I wasn't willing or able to try. I definitely worked on the tuple method yesterday and it still did not sort correctly. The code that I originally posted was that method... well, at least that's what I thought that method was, where e was the tuple.
...
Code:
a = []
d = []
i = 0
c = []
line = "0001 , 4 , 0.34 , 3 , 15 , 25.3"
a.append(line.split(','))
-
Thanks for the reply...
Still not sure how that would help me, as it's still sorting by all the values and not just select values/columns in my MD list.
-
How to sort a multi-dimensional list in python 2.3
Yes, I know... Python 2.3... Unfortunately our development servers are RedHat and that is the default version installed on them, and apparently upgrading can cause system failures (I'm trying to figure out a workaround to run multiple versions of Python, but not there yet). In the meantime, I need to see if anyone can help me with sorting a multidimensional list by certain elements within that list. I've read about Schwartzian Transform in Python? but...
-
This is what I've been doing... I'm pretty new to Python too though...
Code:
TmpArr = []
for line in openfile:
    # strip surrounding whitespace, then split the line on tabs
    line = line.strip()
    TmpArr.append(line.split('\t'))
Now you have a multidimensional list (TmpArr)... you can sort by columns by doing something like this
Code:
TmpArr.sort(key=lambda a: a[0])
if, say, you wanted to sort by authorlist
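The same idea extends to more than one column by returning a tuple from the key function. A minimal sketch, assuming tab-separated rows of author, year and title (the rows are made up); note that key= needs Python 2.4 or later.

```python
from operator import itemgetter

# Hypothetical tab-separated rows: author, year, title.
raw = ["smith\t2009\tB", "jones\t2010\tA", "smith\t2001\tC"]
TmpArr = [line.strip().split("\t") for line in raw]

# Sort by one column; itemgetter(0) does the same job as lambda a: a[0].
by_author = sorted(TmpArr, key=itemgetter(0))

# Sort by author, then by year compared as a number rather than as text.
by_author_year = sorted(TmpArr, key=lambda a: (a[0], int(a[1])))

for row in by_author_year:
    print(row)
```

Converting inside the key function leaves the stored rows as strings, so nothing else in the script has to change.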
-
Thanks!
Alright... well, the only reason I check for line == 0 (or in this case line = "") is the actual end of file; there will be no NULL lines from the map input. I am still having to pretty much duplicate the code, as you see. The code is all over the place too, as I'm still in debug mode... but it is working properly with a csv file as the input. I am calculating the average, standard deviation, median, min and max values too. Will...
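A minimal sketch of those summary statistics using the standard library's statistics module, which needs Python 3.4 or later (an assumption about the environment). The sample values are made up, and the population form of the standard deviation is also an assumption.

```python
import statistics

values = [4, 8, 15, 16, 23, 42]  # made-up sample for one key

summary = {
    "mean": statistics.mean(values),
    # pstdev is the population form; statistics.stdev is the sample form.
    "stdev": statistics.pstdev(values),
    "median": statistics.median(values),
    "min": min(values),
    "max": max(values),
}

print(summary)
```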
-
I am messing around with just running the reducer.py with a txt file and am able to process the whole file by adding this
Code:
while True:
    line = reader.readline()
    if len(line) != 0:
        <my code here>
    else:
        # readline() returns "" at end of file, so this branch
        # also has to break out of the loop
        <repeat my code here>
It seems slightly wrong to have to repeat all my code in the if and the else, but it works... I am not able to get it to work using sys.stdin......
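One way to avoid the repeated block, sketched under the assumption that the work is done once per run of lines sharing an ID in the first comma-separated field. Both a file object and sys.stdin can be iterated line by line, and itertools.groupby closes the last group on its own when the input ends. The function names and the row-counting placeholder are hypothetical.

```python
import io
from itertools import groupby


def key_of(line):
    # Assumes the ID is the first comma-separated field.
    return line.split(",", 1)[0]


def process(stream):
    """Yield (id, row_count) for each run of consecutive lines sharing an ID."""
    stripped = (line.strip() for line in stream)
    non_empty = (line for line in stripped if line)
    for uid, group in groupby(non_empty, key=key_of):
        # The per-group code goes here once; counting rows is a placeholder.
        yield uid, sum(1 for _ in group)


reader = io.StringIO("41,0,1\n41,1,0\n46,0,1\n")  # stand-in for a file or sys.stdin
counts = list(process(reader))
print(counts)
```

Passing sys.stdin or an open file to process() uses the same code path, so local debugging and the streaming job do not need separate loops.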
-
parsing a file
Hey all.
(hopefully) a quick question here. I am processing data using Hadoop Streaming Map/Reduce. The map.py is straightforward: it basically takes the input data (in the form of sys.stdin), loads it into a list, sorts that list, then... well, not exactly sure what Hadoop does with that, but pretty sure it creates a temporary file much like a csv in memory
...
Code:
for line in sys.stdin:
    <append into a list then sort>
-
One last post (on this subject anyhow!). Just wanted to let you know that I was able to complete my first Map/Reduce job on Hadoop with Python! Thanks again for all your help!
-
Hey all!
Well, I am now able to run this code on my sample csv file (3 million rows) on my desktop. It completes in under 1 minute, which is great! Still having issues on the Hadoop end, but I think that problem is not for this forum. I would still appreciate any suggestions or improvements on the code itself, as I'm still very much a newbie!
...
Code:
import time
t1 = time.clock()
TmpArr = []
Unique = []
SortArr = []