I have a query that returns a large number of rows from a database (somewhere near a million) to an application that then does some processing on them. Some of these rows may be duplicates, and I considered using DISTINCT to eliminate them; however, I suspect I would be better off filtering the duplicates within the application rather than within the database.
My justification would be as follows:
- Using DISTINCT means the entire query needs to complete before any results are returned
- Filtering myself means the results will start to be returned almost immediately, so I can be processing rows while the query is still executing.
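
To make the second option concrete, here is a rough sketch of the kind of application-side filtering I have in mind. It is only an illustration under some assumptions: the `StreamingDedup` class and `DistinctStreaming` method are names I made up, and the string array stands in for rows that would really arrive from the database cursor. Each row is handed to the caller the first time it is seen, at the cost of keeping every distinct row in memory.

```csharp
using System;
using System.Collections.Generic;

public static class StreamingDedup
{
    // Yields each row the first time it is seen, so the caller can start
    // processing before the source has been fully read. The HashSet grows
    // with the number of distinct rows, so memory is the trade-off.
    public static IEnumerable<T> DistinctStreaming<T>(IEnumerable<T> rows)
    {
        var seen = new HashSet<T>();
        foreach (var row in rows)
        {
            // HashSet<T>.Add returns false when the item is already present.
            if (seen.Add(row))
            {
                yield return row;
            }
        }
    }

    public static void Main()
    {
        // Hypothetical stand-in for rows coming back from the query.
        var rows = new[] { "A", "B", "A", "C", "B" };

        foreach (var row in DistinctStreaming(rows))
        {
            Console.WriteLine(row);
        }
    }
}
```

For rows with several columns, the element type would need value-based equality (for example a tuple of the column values) for the set lookup to treat identical rows as duplicates.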
Just wondering whether people more familiar with how databases (in particular Oracle) work would agree with this reasoning.
(This application is written in C#, however I suppose the bigger point is questioning whether there's any point using 'distinct' in a cursor (other than in a sub-query) in any language....)