Possible to have different datatypes among columns of an array?

  • RockRoll
    New Member
    • Jul 2020
    • 10

    Hi Everyone,

    I create an extendable EArray with shape N x 4. Some columns require the float64 datatype; the others can be stored as int32. Is it possible to use different data types for different columns? Right now I use a single type (float64, below) for all of them, but the resulting files take up a huge amount of disk space (>10 GB).

    For example, how can I ensure that the elements of columns 1-2 are int32 and those of columns 3-4 are float64?

    Code:
    a = f1.create_earray(f1.root, "dataset_1", atom=tables.Float64Atom(), shape=(0, 4))
  • SioSio
    Contributor
    • Dec 2019
    • 272

    #2
    When reading with "read_csv()", specify each column's type as a dictionary passed to the "dtype" argument.
    Code:
    df = pd.read_csv('Nx4.csv',dtype = {'col1':'int64', 'col2':'int64', 'col3':'float64','col4':'float64'})
    Alternatively, convert the column types of an existing DataFrame in pandas with "astype()" and write the result to a file.
    Code:
    df_ = df.astype({'col1':'int8', 'col2':'int8', 'col3':'float64', 'col4':'float64'})
    df_.to_csv('output.csv')
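    To see what per-column types save, here is a minimal sketch that compares the in-memory size of an all-float64 frame with a mixed int32/float64 one. The three sample rows are made up for illustration. Note that this measures memory only: a CSV file is plain text and does not store dtypes, so the saving shows up in RAM or in a binary format rather than in the .csv itself.

```python
import pandas as pd

# Made-up sample data; the column names follow the examples in this thread.
df = pd.DataFrame({
    "col1": [1, 2, 3],
    "col2": [4, 5, 6],
    "col3": [0.1, 0.2, 0.3],
    "col4": [1.0, 2.0, 3.0],
})

# Everything stored as float64 (8 bytes per element).
all_float = df.astype("float64")

# Integer columns narrowed to int32 (4 bytes), the rest kept as float64.
mixed = df.astype({
    "col1": "int32",
    "col2": "int32",
    "col3": "float64",
    "col4": "float64",
})

# Sum the per-column byte counts, ignoring the index.
all_float_bytes = int(all_float.memory_usage(index=False).sum())
mixed_bytes = int(mixed.memory_usage(index=False).sum())

print(all_float_bytes, mixed_bytes)
```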
    If the output file is still too large, you can compress it as you write it.
    Code:
    df_.to_csv('output.csv.gz', compression='gzip')
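    If you would rather stay in PyTables, one possible approach (a sketch, not tested against your data) is to use a Table instead of an EArray. An EArray has a single atom for all elements, whereas a Table can be extended row by row and gives each column its own type. The file path, the column names and the sample values below are placeholders I chose for illustration.

```python
import os
import tempfile

import numpy as np
import tables

# Row layout: two int32 columns followed by two float64 columns.
# The column names are placeholders, not taken from the original post.
row_dtype = np.dtype([
    ("col1", np.int32),
    ("col2", np.int32),
    ("col3", np.float64),
    ("col4", np.float64),
])

# Placeholder output location in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "mixed_columns.h5")

with tables.open_file(path, mode="w") as f1:
    # Create an empty table whose columns follow the structured dtype.
    table = f1.create_table(f1.root, "dataset_1", description=row_dtype)

    # Build a small batch of rows as a structured array and append it.
    batch = np.zeros(3, dtype=row_dtype)
    batch["col1"] = [1, 2, 3]
    batch["col3"] = [0.5, 1.5, 2.5]
    table.append(batch)
    table.flush()

# Read everything back as a NumPy structured array.
with tables.open_file(path, mode="r") as f1:
    stored = f1.root.dataset_1.read()

print(stored["col1"].dtype, stored["col3"].dtype, len(stored))
```

    With this layout each row takes 24 bytes (2 x 4 + 2 x 8) before any compression, compared with 32 bytes for four float64 values.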

    • madankarmukta
      Contributor
      • Apr 2008
      • 308

      #3
      @RockRoll, can you please elaborate on what you want to achieve?
