PaaS for parallel processing

  • shifali
    New Member
    • Dec 2013
    • 4

    I am looking for a PaaS (or maybe IaaS) offering that provides a virtual environment for parallel processing. I have a large dataset (1.5 GB in CSV format) that I need to process and visualize repeatedly. My PC has 2 GB of RAM and an Intel(R) Pentium(R) Dual CPU E2180; with this configuration I cannot even open the whole dataset in Excel or in R. I use Gephi for visualization, so I need a platform that lets me process the data and use Gephi too.

    Please suggest a solution for dealing with massive data, especially in terms of infrastructure support.

    My requirement is similar to what Hadoop offers, but Hadoop doesn't work for me because I need to use Gephi and Python programs. If it can work, kindly point out how.
  • sicarie
    Recognized Expert Specialist
    • Nov 2006
    • 4677

    #2
    I don't believe there is a preset program to take "any" CSV data and put it into a visualization that is useful to "every" user.

    There are utilities that can split the file into smaller files so they can be opened on your computer, but you'll either need to learn how or get someone who knows a programming language to open the file, read in the relevant data, and then create the visualization you require.
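
As an illustration of the splitting idea above, here is a minimal Python sketch that cuts a large CSV into smaller part files using only the standard library. The function name `split_csv`, the `part_NNNN.csv` naming, and the default of 100,000 rows per part are assumptions made for this example; it also assumes the first row of the file is a header, which it repeats in every part so each one can be opened on its own.

```python
import csv
from pathlib import Path


def split_csv(src, out_dir, rows_per_part=100_000):
    """Split src into numbered part files, repeating the header in each one."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []
    with open(src, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)  # assumption: the first row is a header
        out = writer = None
        for i, row in enumerate(reader):
            if i % rows_per_part == 0:  # time to start a new part file
                if out:
                    out.close()
                path = out_dir / f"part_{len(parts):04d}.csv"
                out = open(path, "w", newline="")
                writer = csv.writer(out)
                writer.writerow(header)
                parts.append(path)
            writer.writerow(row)
        if out:
            out.close()
    return parts
```

Only one row is held in memory at a time, so this should run on a machine with far less RAM than the file size. A call such as `split_csv("big.csv", "parts")` (both names hypothetical) returns the list of part files it wrote.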

    • shifali
      New Member
      • Dec 2013
      • 4

      #3
      Yes, there is no program (to my knowledge either) that can accept "any" CSV data. Programs need a specified input format; for this reason I first have to prepare the file as required by Gephi, R, or whatever software or program I will use for processing and visualization.

      If you are saying that I should first put all my effort into a small dataset (a part of the big CSV file) and then, with the conclusions from that, move to the full data for the final conclusion, that is actually what I am doing. On Ubuntu (Linux-based) I can read the first or last n lines of a file using the head and tail commands. Work on that small part is almost done, so I am searching for a platform that can help me work with the whole dataset.

      Well, I have probably found a solution: http://aws.amazon.com/ec2/. Their Free Tier should work for me, though I have not explored it further yet.
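
For readers who want the same head/tail sampling from inside a Python program, a rough equivalent is sketched below. The function names `head` and `tail` are chosen here only to mirror the shell commands; both read the file line by line, so neither loads the whole file into memory.

```python
from collections import deque
from itertools import islice


def head(path, n=10):
    """Return the first n lines of the file, like `head -n`."""
    with open(path) as f:
        return list(islice(f, n))


def tail(path, n=10):
    """Return the last n lines of the file, like `tail -n`."""
    with open(path) as f:
        # A bounded deque keeps only the n most recent lines while scanning.
        return list(deque(f, maxlen=n))
```

On the shell side, something like `head -n 1000 big.csv > sample.csv` (file names hypothetical) writes the sample to its own file, header included, so it can be opened in R or Gephi directly.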

      • sicarie
        Recognized Expert Specialist
        • Nov 2006
        • 4677

        #4
        You mentioned that the file was too big to open on your desktop. If you were looking for a quick load into a tool, you could split the file and then load each view separately. However, that will not help if all the data needs to be compiled together (i.e., is not historic) to get an accurate picture of what is going on.
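
When the data does have to be compiled together, one option that sometimes avoids extra infrastructure is to stream the file once and keep only a running aggregate in memory. The sketch below is an assumption-heavy illustration, not a known fit for this dataset: it supposes the CSV has two columns naming the endpoints of each relationship (called `source` and `target` here, both hypothetical), counts how often each pair occurs, and writes a much smaller edge list with `Source`, `Target`, and `Weight` columns, which is the layout Gephi's spreadsheet import is generally understood to accept.

```python
import csv
from collections import Counter


def build_edge_list(src, dst, source_col="source", target_col="target"):
    """Stream src once and write one weighted edge per distinct pair to dst."""
    weights = Counter()
    with open(src, newline="") as f:
        for row in csv.DictReader(f):  # one row in memory at a time
            weights[(row[source_col], row[target_col])] += 1
    with open(dst, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Source", "Target", "Weight"])
        for (s, t), w in sorted(weights.items()):
            writer.writerow([s, t, w])
    return len(weights)  # number of distinct edges written
```

Memory use here grows with the number of distinct pairs rather than the number of rows, so this only helps when the data repeats itself; if nearly every row is a unique pair, a larger machine is still the more realistic route.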

        • shifali
          New Member
          • Dec 2013
          • 4

          #5
          Well, yes. At present my requirement is only to work on the large data file all together (compiled together).

          I have already used split/head/tail.

          Thanks for your suggestions.
