"slicing" records

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Claudio Lapidus

    "slicing" records

    Hello

    I have a table with objects' descriptions:

    id | length
    ---------+--------
    object1 | 40
    object2 | 66
    object3 | 12
    object4 | 107
    object5 | 220

    But I need to export data to a legacy system that doesn't handle lengths
    greater than 50 (don't ask me why...). Instead, it expects the data in this
    format:

    id | length | fragment | offst
    ---------+--------+----------+-------
    object1 | 40 | whole | 0
    object2 | 50 | start | 0
    object2 | 16 | end | 50
    object3 | 12 | whole | 0
    object4 | 50 | start | 0
    object4 | 50 | middle | 50
    object4 | 7 | end | 100
    object5 | 50 | start | 0
    object5 | 50 | middle | 50
    object5 | 50 | middle | 100
    object5 | 50 | middle | 150
    object5 | 20 | end | 200

    So when length becomes greater, it is break up in as many pieces as
    necessary, each of max allowed length except the last one, in such a way
    that the sum of partial lengths equals the original one.

    Now I couldn't manage to get a query capable of doing this. If anybody has
    an idea, I'll be very much appreciated.

    TIA,
    cl.

    ---------------------------(end of broadcast)---------------------------
    TIP 4: Don't 'kill -9' the postmaster

  • Richard Huxton

    #2
    Re: "slicing&q uot; records

    On Saturday 11 October 2003 06:00, Claudio Lapidus wrote:[color=blue]
    > Hello
    >
    > I have a table with objects' descriptions:
    >
    > id | length
    > ---------+--------
    > object1 | 40
    > object2 | 66
    > object3 | 12
    > object4 | 107
    > object5 | 220
    >
    > But I need to export data to a legacy system that doesn't handle lengths
    > greater than 50 (don't ask me why...). Instead, it expects the data in this
    > format:
    >
    > id | length | fragment | offst
    > ---------+--------+----------+-------
    > object1 | 40 | whole | 0
    > object2 | 50 | start | 0
    > object2 | 16 | end | 50
    > object3 | 12 | whole | 0
    > object4 | 50 | start | 0
    > object4 | 50 | middle | 50
    > object4 | 7 | end | 100
    > object5 | 50 | start | 0
    > object5 | 50 | middle | 50
    > object5 | 50 | middle | 100
    > object5 | 50 | middle | 150
    > object5 | 20 | end | 200[/color]

    Simplest way is probably to write either a plpgsql function within PG or a
    perl script outside it to split up the data.

    If doing it within PG, you might find Stephan Szabo's article on set-returning
    functions useful (http://techdocs.postgresql.org)

    --
    Richard Huxton
    Archonet Ltd

    ---------------------------(end of broadcast)---------------------------
    TIP 4: Don't 'kill -9' the postmaster

    Comment

    • Jan Wieck

      #3
      Re: "slicing&q uot; records

      Claudio Lapidus wrote:
      [color=blue]
      > Hello
      >
      > I have a table with objects' descriptions:
      >
      > id | length
      > ---------+--------
      > object1 | 40
      > object2 | 66
      > object3 | 12
      > object4 | 107
      > object5 | 220
      >
      > But I need to export data to a legacy system that doesn't handle lengths
      > greater than 50 (don't ask me why...). Instead, it expects the data in this
      > format:[/color]

      Oh, it's one of these _don't ask me why_ things ... well, then "what is
      the target legacy system?" ... hehe.
      [color=blue]
      >
      > id | length | fragment | offst
      > ---------+--------+----------+-------
      > object1 | 40 | whole | 0
      > object2 | 50 | start | 0
      > object2 | 16 | end | 50
      > object3 | 12 | whole | 0
      > object4 | 50 | start | 0
      > object4 | 50 | middle | 50
      > object4 | 7 | end | 100
      > object5 | 50 | start | 0
      > object5 | 50 | middle | 50
      > object5 | 50 | middle | 100
      > object5 | 50 | middle | 150
      > object5 | 20 | end | 200
      >[/color]

      If there is a total upper maximum for the object length and it's not way
      too obscenely large, then you can create a view that get's you this:

      select id, length(data), data from t1;
      id | length | data
      ----+--------+-------------------------------------------------
      1 | 6 | 123456
      2 | 10 | 1234567890
      3 | 15 | 123456789012345
      4 | 20 | 123456789012345 67890
      5 | 27 | 123456789012345 678901234567
      6 | 47 | 123456789012345 678901234567890 123456789012345 67
      (6 rows)

      select * from t1_sliced order by id, fragoffset;
      id | fragoffset | fraglength | fragtype | fragdata
      ----+------------+------------+----------+------------
      1 | 0 | 6 | whole | 123456
      2 | 0 | 10 | whole | 1234567890
      3 | 0 | 10 | start | 1234567890
      3 | 10 | 5 | end | 12345
      4 | 0 | 10 | start | 1234567890
      4 | 10 | 10 | end | 1234567890
      5 | 0 | 10 | start | 1234567890
      5 | 10 | 10 | middle | 1234567890
      5 | 20 | 7 | end | 1234567
      6 | 0 | 10 | start | 1234567890
      6 | 10 | 10 | middle | 1234567890
      6 | 20 | 10 | middle | 1234567890
      6 | 30 | 10 | middle | 1234567890
      6 | 40 | 7 | end | 1234567
      (14 rows)


      See attached sample script. I didn't know if you really wanted this
      fancy "whole|start|mi ddle|end" string or if that was supposed to be the
      data of the fragment itself. Please notice that the view in the sample
      is "configured " for data sized up to 100 characters.


      Jan

      --
      #============== =============== =============== =============== ===========#
      # It's easier to get forgiveness for being wrong than for being right. #
      # Let's break this rule - forgive me. #
      #============== =============== =============== ====== JanWieck@Yahoo. com #

      drop view t1_sliced;
      drop table t1;
      drop sequence t1_id_seq;
      drop table slice_config;
      drop function slice_length (integer, integer, integer);
      drop function slice_type (integer, integer, integer);

      create table t1 (
      id serial primary key,
      data text
      );

      insert into t1 (data) values ('123456');
      insert into t1 (data) values ('1234567890');
      insert into t1 (data) values ('1234567890123 45');
      insert into t1 (data) values ('1234567890123 4567890');
      insert into t1 (data) values ('1234567890123 45678901234567' );
      insert into t1 (data) values ('1234567890123 456789012345678 901234567890123 4567');

      create table slice_config (
      s_off integer primary key,
      s_len integer
      );
      insert into slice_config (s_off, s_len) values (0, 10);
      insert into slice_config (s_off, s_len) values (10, 10);
      insert into slice_config (s_off, s_len) values (20, 10);
      insert into slice_config (s_off, s_len) values (30, 10);
      insert into slice_config (s_off, s_len) values (40, 10);
      insert into slice_config (s_off, s_len) values (50, 10);
      insert into slice_config (s_off, s_len) values (60, 10);
      insert into slice_config (s_off, s_len) values (70, 10);
      insert into slice_config (s_off, s_len) values (80, 10);
      insert into slice_config (s_off, s_len) values (90, 10);

      create function slice_length (integer, integer, integer) returns integer
      as '
      declare
      data_size alias for $1;
      slice_off alias for $2;
      slice_len alias for $3;
      frag_len integer;
      begin
      frag_len = data_size - slice_off;
      if frag_len > slice_len then
      return slice_len;
      end if;
      return frag_len;
      end;
      ' language plpgsql;

      create function slice_type (integer, integer, integer) returns text
      as '
      declare
      data_size alias for $1;
      slice_off alias for $2;
      slice_len alias for $3;
      begin
      if slice_off = 0 then
      if data_size <= slice_len then
      return ''whole'';
      end if;
      return ''start'';
      end if;
      if data_size <= slice_off + slice_len then
      return ''end'';
      end if;
      return ''middle'';
      end;
      ' language plpgsql;

      create view t1_sliced as
      select T.id, C.s_off as fragoffset,
      slice_length (length(T.data) , C.s_off, C.s_len) as fraglength,
      slice_type (length(T.data) , C.s_off, C.s_len) as fragtype,
      substr (T.data, C.s_off + 1, C.s_len) as fragdata
      from t1 T, slice_config C
      where C.s_off = 0 or length(T.data) > C.s_off;

      select id, length(data), data from t1;

      select * from t1_sliced order by id, fragoffset;


      ---------------------------(end of broadcast)---------------------------
      TIP 6: Have you searched our list archives?



      Comment

      • Claudio Lapidus

        #4
        Re: &quot;slicing&q uot; records

        Jan Wieck wrote:[color=blue]
        > Oh, it's one of these _don't ask me why_ things ... well, then "what is
        > the target legacy system?" ... hehe.
        >[/color]
        Of course, "don't ask me why" is my own way of saying "I don't know why!"
        :-)
        [color=blue]
        > If there is a total upper maximum for the object length and it's not way
        > too obscenely large, then you can create a view that get's you this:
        >[/color]
        [snip][color=blue]
        > See attached sample script. I didn't know if you really wanted this
        > fancy "whole|start|mi ddle|end" string or if that was supposed to be the
        > data of the fragment itself. Please notice that the view in the sample
        > is "configured " for data sized up to 100 characters.[/color]

        No, the destination system actually needs the labels as a flag of the
        fragment position or if it's a fragment at all (i.e. not 'whole'). Actually,
        your view/functions seem to almost fit my original need, I think they'll
        just need minor touch up. Thanks a lot Jan, really nice code.

        cheers
        cl.

        ---------------------------(end of broadcast)---------------------------
        TIP 4: Don't 'kill -9' the postmaster

        Comment

        Working...