Import/Normalize approach - column-DML, or loop

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • Steve Jorgensen

    #16
    Re: Import/Normalize approach - column-DML, or loop

    On Sat, 31 Jan 2004 00:06:28 GMT, "David W. Fenton"
    <dXXXfenton@bwa y.net.invalid> wrote:
    [color=blue]
    >Steve Jorgensen <nospam@nospam. nospam> wrote in
    >news:75ll105g7 st0pdnoqhn1r3vp v664erila9@4ax. com:
    >[color=green]
    >> On Fri, 30 Jan 2004 19:39:25 GMT, "David W. Fenton"
    >><dXXXfenton@b way.net.invalid > wrote:
    >>
    >> ...[color=darkred]
    >>>> Yeah. If i get what you're saying, I don't thing ripping
    >>>> arguments into 10 or 40 new columns will be a good thing, and
    >>>> anything short of the 40 or so means writing more code to handle
    >>>> the arguments past #10 or so differently than the rest.
    >>>
    >>>Well, I'm rather shocked at the idea that there'd be 40 separate
    >>>values stored in a single field. I know you said it was a
    >>>spreadshee t, but what kind of morons would write a spreadsheet
    >>>that complex when they obviously need a database?[/color]
    >>
    >> It probably is stored in a database inside the proprietary system
    >> it comes from. What I get is the spreadsheet exported from that
    >> system. Also, the items in the export are very short codes.[/color]
    >
    >So, you're normalizing data that has been denormalized for export?
    >
    >How stupid is that?
    >
    >Wouldn't it be better to have someone skip the spreadsheet and have
    >a normalized export process instead?[/color]

    Of course, that was the first thing I asked the client after they gave me the
    requirements. It's a totally closed system. Since it provides valuable
    market data on a subscription basis, presumably, they think giving customers
    too much access would compete with their own analysis consulting business - so
    we end up working around them.

    It's not the first time I've seen this sort of thing, and it probably won't be
    the last.

    Comment

    • Peter Miller

      #17
      Re: Import/Normalize approach - column-DML, or loop


      Steve,

      On Sat, 31 Jan 2004 04:03:07 GMT, Steve Jorgensen
      <nospam@nospam. nospam> wrote in comp.databases. ms-access:
      [color=blue]
      >Of course, that was the first thing I asked the client after they gave me the
      >requirements . It's a totally closed system. Since it provides valuable
      >market data on a subscription basis, presumably, they think giving customers
      >too much access would compete with their own analysis consulting business - so
      >we end up working around them.
      >
      >It's not the first time I've seen this sort of thing, and it probably won't be
      >the last.[/color]

      I've run into this before, under similar circumstances.

      I would suggest the following. Just because the vendor only provides
      a largely unsuitable export routine and doesn't expose the data
      through standard means (odbc, a propr. driver, an embedded prog. lang,
      etc) does not mean that the data is not accessible by more reliable
      means.

      Consider the following. If there is any logical construction to the
      data stored by this other app (and if there is any complexity at all
      to the app, its data storage must be logically structured or it would
      have been unmanagable) then the vendor probably used a third party
      solution for storage. In other words, the data files probably are
      some recognizable format, even if this is not acknowledged by the
      vendor. Open the data file in an editor. What does it look like?
      Jet? dbase? sql server? mysql? Are there any processes that need
      to be run in order for this other app to work (ie, is there a db
      server of some sort)? If you can determine what they used for
      storage, you can access the data directly without using their
      poorly-designed export interface. I would recommend strongly though
      that this export be enforced as an offline read-only process. You do
      not want to impact performance of the other app, or interfere with
      your licensing or provision for vendor support of the existing app.

      But that said, in most cases, although the vendor protects the means
      of obtaining and using the information (ie the protocols for updating
      and the 'program' itself) this does not necessarily mean that the data
      itself is owned or protected legally by the vendor. They may have
      obfuscated its storage format and provided a controlled export process
      to limit your ability to access the data, but in most cases, a
      licensed user would not be prohibited from using the data via an
      external routine, should you be able to devise one for them.

      That said, everything is, of course, dependant upon the actual license
      involved here. But in numerous cases where the license wasn't
      prohibitive in this regard, and the data storage architecture was
      obfuscated and the programs for accessing it limited, I have written
      external routines for directly reading data from such systems for
      clients who needed faster and more flexible access to their data. I'd
      encourage you to take a closer look at both the license involved and
      the proprietary (but almost inevitably non-custom) data storage
      formats and decide for yourself whether there isn't a better way,
      without taking your client's word for it. You may well find a lot
      better solution than the one you are putting together now.

      HTH

      Peter Miller
      _______________ _______________ _______________ _______________
      PK Solutions -- Data Recovery for Microsoft Access/Jet/SQL
      Free quotes, Guaranteed lowest prices and best results
      www.pksolutions.com 1.866.FILE.FIX 1.760.476.9051

      Comment

      • Mike Sherrill

        #18
        Re: Import/Normalize approach - column-DML, or loop

        On Fri, 30 Jan 2004 19:39:25 GMT, "David W. Fenton"
        <dXXXfenton@bwa y.net.invalid> wrote:
        [color=blue]
        >Well, I'm rather shocked at the idea that there'd be 40 separate
        >values stored in a single field. I know you said it was a
        >spreadsheet, but what kind of morons would write a spreadsheet that
        >complex when they obviously need a database?[/color]

        "Obvious" is in the eye of the beholder.

        One of my first system conversions involved building a SQL database
        out of 30 megs of WordPerfect (not a typo) tables. Only forty values
        in a cell would have been a real treat.

        --
        Mike Sherrill
        Information Management Systems

        Comment

        • Mike Sherrill

          #19
          Re: Import/Normalize approach - column-DML, or loop

          On Sat, 31 Jan 2004 04:00:27 GMT, Steve Jorgensen
          <nospam@nospam. nospam> wrote:
          [color=blue][color=green]
          >>Did you consider using code to generate SQL statements, then executing
          >>them? (Possibly within explicit transactions?)[/color]
          >
          >Yes, that's precisely what I am doing.[/color]

          I got the impression your SQL statements were executing VBA functions.
          I was talking (unclearly) of building SQL statements that contain only
          literal values, no function calls.

          --
          Mike Sherrill
          Information Management Systems

          Comment

          • Steve Jorgensen

            #20
            Re: Import/Normalize approach - column-DML, or loop

            On Sat, 31 Jan 2004 17:12:35 -0500, Mike Sherrill
            <MSherrillnonon o@compuserve.co m> wrote:
            [color=blue]
            >On Sat, 31 Jan 2004 04:00:27 GMT, Steve Jorgensen
            ><nospam@nospam .nospam> wrote:
            >[color=green][color=darkred]
            >>>Did you consider using code to generate SQL statements, then executing
            >>>them? (Possibly within explicit transactions?)[/color]
            >>
            >>Yes, that's precisely what I am doing.[/color]
            >
            >I got the impression your SQL statements were executing VBA functions.
            >I was talking (unclearly) of building SQL statements that contain only
            >literal values, no function calls.[/color]

            Oh, right. I see what you're getting at.

            The issue was that the only way I could figure out to use a query to split up
            the multi-vlaued fields was to call a UDF. That's because expressions that
            can be built to do that using native functions (FWICS would become longer and
            more deeply nested for each successive argument number, and would quickly
            exceed the limits of what JET could be reasonably expected to parse (long
            before argument 40). If this was SQL Server, I think I could use PATINDEX to
            find, say, the text between the 12th and 13th delimiter characters.

            Comment

            • Mike Sherrill

              #21
              Re: Import/Normalize approach - column-DML, or loop

              On Sat, 31 Jan 2004 22:35:32 GMT, Steve Jorgensen
              <nospam@nospam. nospam> wrote:
              [color=blue][color=green]
              >>I got the impression your SQL statements were executing VBA functions.
              >>I was talking (unclearly) of building SQL statements that contain only
              >>literal values, no function calls.[/color]
              >
              >Oh, right. I see what you're getting at.
              >
              >The issue was that the only way I could figure out to use a query to split up
              >the multi-vlaued fields was to call a UDF.[/color]

              And you were trying to approach this in a set-based way. Sorry; I
              lost the context somewhere.

              Did you consider using third-party software like DataJunction? I had
              a project that involved moving loads of data between a SQL system and
              Lotus Notes, and a third-party program really made my life easier.
              (It wasn't DataJunction, but it was the same kind of thing.)

              --
              Mike Sherrill
              Information Management Systems

              Comment

              Working...