Help with data analysis in python

  • darkapostle
    New Member
    • Sep 2018
    • 1

    Help with data analysis in python

    Hello, I am struggling with an assignment that uses Python for data analysis. We are allowed to use pandas, but I had a problem displaying the data with pandas, so I continued the code without it. I am having trouble extracting multiple fields from the data (I have to get the year, the northern latitude, and the name of a hurricane) as well as accessing certain strings in the data.

    Here is the text file with the data: https://www.nhc.noaa... 2017-050118.txt

    Here is the .ipynb file. It contains code to check your answers, the code for reading the file, and an explanation of how to access its records.
    http://www.cis.umass.. .8fa/a1/a1.ipynb

    This is my code so far for problems 1-3, but I have no idea how to access the records for problems 4 and 5.

    Problem 1: Unique Hurricanes
    Code:
    names = set()
    for record in records:
        # access the name field of the header line (second comma-separated field)
        first_entry = record[0].split(',')[1]
        first_entry = first_entry.split(' ')[-1]
        # strip whitespace; strip() returns a new string, so assign the result back
        first_entry = first_entry.strip()
        # if the hurricane name is not UNNAMED, add it to the set, which keeps only unique names
        if first_entry != 'UNNAMED':
            names.add(first_entry)
    # the answer is the number of unique hurricane names
    answer = len(names)
    print(answer)
    Code:
    # import Counter
    from collections import Counter
    names = []
    for record in records:
        # access the name field of the header line (second comma-separated field)
        first_entry = record[0].split(',')[1]
        first_entry = first_entry.split(' ')[-1]
        # strip whitespace; strip() returns a new string, so assign the result back
        first_entry = first_entry.strip()
        # if the name is not UNNAMED, append it to the names list
        if first_entry != 'UNNAMED':
            names.append(first_entry)
    # after the loop, call most_common once to get the most common hurricane name
    answer = Counter(names).most_common(1)[0][0]
    print(answer)
    Code:
    # Counter is needed here as well
    from collections import Counter
    years = []
    for record in records:
        # access the ID field of the header line; its last four characters are the year
        first_entry = record[0].split(',')[0]
        year = first_entry[-4:]
        # append the year to the years list
        years.append(year)
    # after the loop, call most_common to get the year with the most hurricanes
    answer = Counter(years).most_common(1)[0][0]
    print(answer)
    I need help with these:

    4. Most Northerly Hurricane (10 pts)
    Write code that computes the hurricane that went furthest north as measured by the greatest latitude. You need to find the name and the year of the hurricane.
    Hints:
    Check the documentation to find where the latitude is recorded.
    You will need to go through the tracking points to check all of the latitude points recorded.
    You need to keep track of three things: the maximum latitude seen so far, plus the name and year of the corresponding hurricane.
    The latitude value ends with an N character to indicate the northern hemisphere. This needs to be removed to do numeric comparisons.
    You can convert a string to a float or int by casting it. For example, float("81.5") returns a floating-point value of 81.5.
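    The hints for problem 4 could be sketched roughly as follows. This is only a sketch under assumptions: that each record is a list whose first element is the header line and whose remaining elements are the tracking-point lines, and that the latitude is the fifth comma-separated field of a tracking point (as in the HURDAT2 layout). The function name most_northerly and the sample records are made up for illustration.

```python
def most_northerly(records):
    """Return (name, year, latitude) for the greatest northern latitude seen."""
    best_lat, best_name, best_year = float("-inf"), None, None
    for record in records:
        # Assumption: record[0] is the header line, e.g. "AL011900,   ALPHA,   2,"
        header_fields = record[0].split(",")
        year = header_fields[0].strip()[-4:]
        name = header_fields[1].strip()
        # Assumption: record[1:] holds the tracking-point lines
        for line in record[1:]:
            # Assumption: latitude is the fifth field, e.g. " 25.5N"
            lat_text = line.split(",")[4].strip()
            if not lat_text.endswith("N"):
                continue  # skip anything that is not a northern latitude
            lat = float(lat_text[:-1])  # drop the trailing "N" before converting
            if lat > best_lat:
                best_lat, best_name, best_year = lat, name, year
    return best_name, best_year, best_lat


# Made-up sample records for illustration only (not real storm data)
sample_records = [
    ["AL011900,            ALPHA,      2,",
     "19000801, 0000,  , TS, 20.0N,  60.0W,  40, 1000,",
     "19000801, 0600,  , TS, 25.5N,  61.0W,  45,  998,"],
    ["AL021901,             BETA,      2,",
     "19010901, 0000,  , HU, 41.2N,  50.0W,  80,  970,",
     "19010901, 0600,  , HU, 39.0N,  49.0W,  75,  975,"],
]
print(most_northerly(sample_records))  # ('BETA', '1901', 41.2)
```

    Keeping the name and year in the same assignment as the latitude means all three always describe the same tracking point.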
    5. Hurricane with Maximum Sustained Wind (10 pts)
    Write code that determines the hurricane with the highest sustained windspeed. You need to find the name, year, and wind speed for this hurricane.
    Hints:
    Check the documentation to find where the wind speed is recorded.
    You will need to go through the tracking points to check all of the wind speeds recorded.
    You can convert a string to a float or int by casting it. For example, float("81.5") returns a floating-point value of 81.5.
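    Problem 5 follows the same pattern with a different field. The sketch below assumes the same record shape as in the posted code (header line first, tracking-point lines after it) and assumes the maximum sustained wind is the seventh comma-separated field of a tracking point, as in the HURDAT2 layout. The function name strongest_wind and the sample records are made up for illustration.

```python
def strongest_wind(records):
    """Return (name, year, wind) for the highest sustained wind speed seen."""
    best_wind, best_name, best_year = float("-inf"), None, None
    for record in records:
        # Assumption: record[0] is the header line with the ID and name
        header_fields = record[0].split(",")
        year = header_fields[0].strip()[-4:]
        name = header_fields[1].strip()
        # Assumption: record[1:] holds the tracking-point lines
        for line in record[1:]:
            # Assumption: wind speed is the seventh field, e.g. "  45"
            wind = int(line.split(",")[6].strip())
            # A negative placeholder for a missing value can never win this comparison
            if wind > best_wind:
                best_wind, best_name, best_year = wind, name, year
    return best_name, best_year, best_wind


# Made-up sample records for illustration only (not real storm data)
sample_records = [
    ["AL011900,            ALPHA,      2,",
     "19000801, 0000,  , HU, 20.0N,  60.0W, 120,  940,",
     "19000801, 0600,  , HU, 25.5N,  61.0W, 110,  950,"],
    ["AL021901,             BETA,      2,",
     "19010901, 0000,  , HU, 41.2N,  50.0W,  80,  970,",
     "19010901, 0600,  , HU, 39.0N,  49.0W, -99, -999,"],
]
print(strongest_wind(sample_records))  # ('ALPHA', '1900', 120)
```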

    I have tried using pandas, but when I read the data in my own way and create a DataFrame, it raises an error:
    "list indices must be integers or slices, not str"

    Here is the code for my failed implementation:
    Code:
    import pandas as pd
    
    hurricane_storm_dfs = []
    for storm_dict in hurricane_storms_r:
        storm_id, storm_name, storm_entries_n = storm_dict['header'].split(",")[:3]
        # remove hanging newline ('\n'), split fields
        data = [[entry.strip() for entry in datum[:-1].split(",")] for datum in storm_dict['data']]
        frame = pd.DataFrame(data)
        frame['id'] = storm_id
        frame['name'] = storm_name
        hurricane_storm_dfs.append(frame)
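    Python raises "list indices must be integers or slices, not str" when a list is indexed with a string key. Since hurricane_storms_r is not shown, one possible cause (an assumption) is that each storm is a plain list, like the records used for problems 1-3, rather than a dict with 'header' and 'data' keys. The sketch below reproduces the error on a made-up record and then reshapes it with a hypothetical helper, to_storm_dict, so that the string keys work.

```python
def to_storm_dict(record):
    """Hypothetical helper: wrap a list-shaped record in the dict shape the loop expects."""
    # Assumption: record[0] is the header line and record[1:] are the tracking points
    return {"header": record[0], "data": record[1:]}


# Made-up sample record for illustration only (not real storm data)
sample_record = [
    "AL011900,            ALPHA,      2,",
    "19000801, 0000,  , TS, 20.0N,  60.0W,  40, 1000,",
    "19000801, 0600,  , TS, 25.5N,  61.0W,  45,  998,",
]

# Indexing the list with a string key reproduces the reported error
try:
    sample_record["header"]
except TypeError as err:
    message = str(err)
    print(message)

# After reshaping, the same string keys are valid
storm = to_storm_dict(sample_record)
storm_id, storm_name, storm_entries_n = [
    field.strip() for field in storm["header"].split(",")[:3]
]
print(storm_id, storm_name, storm_entries_n)
```

    If the storms really are lists, an alternative is to leave them as they are and index with record[0] and record[1:] inside the loop.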