Finding "missing" rows

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • maury

    Finding "missing" rows

    Hello, I have a DB table with data filled from a weather sensor probe,
    I have one row every 10 minutes and the data fields is not
    in DateTime format but in string format:
    yyyyMMddHHmm

    So for example I have
    200804140340
    200804140350
    200804140400
    200804140410
    and so on

    I need to write a query in SQL server that "tells" me if there is some
    hole
    in this table, for example if some data was not retrieved or the
    sensor
    probe didn't collected for an hour or other time interval...
    for example if I have
    200804140340
    200804140350
    200804140410
    the query have to spot that the record with 200804140400 date is
    missing

    Some suggestion about building this query?

    Thanks
  • Anith Sen

    #2
    Re: Finding "missing&q uot; rows

    Create table ( temporary table/view/udf or even an SELECT expression that
    can be used as a derived table) with yyyyMMddHHmm values starting from the
    smallest value to the largest value for that column in your table. Below
    you'll see a simple way to create such a table. Now you should be able to
    write an simple query using OUTER JOIN or a correlated subquery to get the
    list of missing values.

    SET NOCOUNT ON
    DECLARE @t TABLE( c CHAR(12) NOT NULL PRIMARY KEY) ;
    DECLARE @i BIGINT ;
    SET @i = 200804140330 ;
    WHILE @i < 200804140410 BEGIN
    SET @i = @i + 10 ;
    IF ( @i % 100 ) < 60
    INSERT @t SELECT CAST( @i AS CHAR(12)) ;
    END
    SELECT * FROM @t ;

    To get the missing values, try:

    SELECT * FROM @t t
    WHERE NOT EXISTS ( SELECT * FROM <your table>
    WHERE <your col= t.c );

    --
    Anith


    Comment

    • --CELKO--

      #3
      Re: Finding &quot;missing&q uot; rows

      Build a look up table of "hhmm" strings (6 periods * 24 hours = 144
      rows) then create a query for a given day like this:

      SELECT @my_date + T.probe_time
      FROM TimeSlots AS T
      WHERE @my_date + T.probe_time
      NOT IN (SELECT P.probe_time FROM ProbeReadings);

      Acutally, an entire year of strings would be only (365 days * 144
      slots) = 52560 rows, so you could put this in a VIEW and be ahead of
      the game.

      Comment

      • maury

        #4
        Re: Finding &quot;missing&q uot; rows

        Can I use something similare a FOR statement
        in T-SQL?

        I wouldn't create a new table with all the rows
        (the DB is very 'huge' by now)

        Thanks

        M.A.

        Comment

        • --CELKO--

          #5
          Re: Finding &quot;missing&q uot; rows

          >Can I use something similar a FOR statement in T-SQL? <<

          Technically, yes. It is a WHILE loop. But you are missing the whole
          idea of SQL and RDBMS. This is a declarative data retrieval language.
          Loops are for procedural, computational languages.
          >I wouldn't create a new table with all the rows (the DB is very 'huge' by now) <<
          Huge means Petabytes these days and will soon mean Exabytes. A table
          with only 52560 rows is tiny; fill it for ten years and it is still
          small-- less than 1 million rows of one column.

          If this really bothers you, then set up a table with strings of just
          the time slots for a day. Then create a VIEW which uses the
          CURRENT_TIMESTA MP, extracts the date from it, concatenates the date
          and time together. This VIEW will good for one day at a time. There
          might be problems at midnight if the database and the data source are
          not in synch.

          Comment

          • maury

            #6
            Re: Finding &quot;missing&q uot; rows

            On 15 Apr, 19:29, --CELKO-- <jcelko...@eart hlink.netwrote:
            Huge means Petabytes these days and will soon mean Exabytes. A table
            with only 52560 rows is tiny; fill it for ten years and it is still
            small-- less than 1 million rows of one column.
            .... my concept of "huge database" was really wrong...
            If this really bothers you, then set up a table with strings of just
            the time slots for a day.
            ok, this is a great idea...
            thanks

            M.A.

            Comment

            Working...