Python/NumPy: have I already written the fastest code for a large array?

  • justforkicks1
    New Member
    • Dec 2017
    • 1

    I would like to get my script's total execution time down from 4 minutes to under 30 seconds. I have a large 1-D array (3,000,000+ elements) of distances containing many duplicates. I am trying to write the fastest function that returns all distances that appear n times in the array. I have written a function in NumPy, but there is a bottleneck at one line of the code. Performance matters because the calculation is done in a for loop over 2400 different large distance arrays.

    Code:
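     import numpy as np

     def distances_appearing_3_times(a):
         # b[v] holds the number of times the value v occurs in a
         b = np.bincount(a, minlength=np.size(a))
         c = np.where(b == 3)[0]  # SLOW STATEMENT/BOTTLENECK
         return c

     for t in range(0, 2400):
         # random stand-in for one of the 2400 distance arrays
         a = np.random.randint(1000000000, 5000000000, 3000000)
         c = distances_appearing_3_times(a)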
    Given a 1-D array of distances [2000000000, 3005670000, 2000000000, 12345667, 4000789000, 12345687, 12345667, 2000000000, 12345667],
    I would expect to get back [2000000000, 12345667] when asking for all distances that appear 3 times in the main array.
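
    To make the expected behaviour concrete, here is a minimal sketch of it on the small example above. It uses np.unique with return_counts=True rather than np.bincount, so it does not build a count array as long as the largest distance. The helper name values_with_count is hypothetical, and whether this is actually faster on the real data is an assumption that has not been measured.

```python
import numpy as np

def values_with_count(a, n):
    """Return the distinct values of `a` that occur exactly `n` times.

    Hypothetical helper for illustration only.
    """
    # np.unique sorts the array and returns each distinct value with its count
    values, counts = np.unique(a, return_counts=True)
    # keep only the values whose count equals n
    return values[counts == n]

# the small example from the post, stored as 64-bit integers
distances = np.array(
    [2000000000, 3005670000, 2000000000, 12345667, 4000789000,
     12345687, 12345667, 2000000000, 12345667],
    dtype=np.int64,
)

result = values_with_count(distances, 3)
print(result)
```

    Note that np.unique returns its values in sorted order, so the two distances come back as 12345667 followed by 2000000000 rather than in order of first appearance.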

    What should I do?