B-trees

Collapse
This topic is closed.
X
X
 
  • Time
  • Show
Clear All
new posts
  • rsprawls

    B-trees

    I found a disk for a b-tree algorithm that I purchased back in 93 or
    so. I'd hoped to find this, but now I'd like to know how worthwhile
    are b-trees in today's advancements? This is old C code using pascal
    and cdecl declarations, but it would be a minor project to bring up to
    date in C, but I was thinking it might be a good project to convert it
    over to C++, especially since I don't know C++ very well and would
    like a good sorting algorithm for medium sized datafiles.

    Are b-trees still a valued data structure today? It's been so long
    since I set foot in programming, I'm itching to start again, but I'm
    wanting a starting point.
  • =?UTF-8?B?RXJpayBXaWtzdHLDtm0=?=

    #2
    Re: B-trees

    On 2008-08-03 00:02, rsprawls wrote:
    I found a disk for a b-tree algorithm that I purchased back in 93 or
    so. I'd hoped to find this, but now I'd like to know how worthwhile
    are b-trees in today's advancements? This is old C code using pascal
    and cdecl declarations, but it would be a minor project to bring up to
    date in C, but I was thinking it might be a good project to convert it
    over to C++, especially since I don't know C++ very well and would
    like a good sorting algorithm for medium sized datafiles.
    >
    Are b-trees still a valued data structure today? It's been so long
    since I set foot in programming, I'm itching to start again, but I'm
    wanting a starting point.
    Not really topical for this group, you might want to ask in a general
    programming group like comp.programmin g.

    AFAIK B-Trees or one of its derivatives are used in most modern
    filesystems, since the time it takes to read in a node from the disk is
    so high you want to have as few reads as possible. I would also suspect
    that they are commonly used in databases and other places which deals
    with on-disk storage. If the datastructur is in RAM other structures are
    generally better.

    --
    Erik Wikström

    Comment

    • James Kanze

      #3
      Re: B-trees

      On Aug 3, 2:43 am, Erik Wikström <Erik-wikst...@telia. comwrote:
      On 2008-08-03 00:02, rsprawls wrote:
      I found a disk for a b-tree algorithm that I purchased back
      in 93 or so. I'd hoped to find this, but now I'd like to
      know how worthwhile are b-trees in today's advancements?
      This is old C code using pascal and cdecl declarations, but
      it would be a minor project to bring up to date in C, but I
      was thinking it might be a good project to convert it over
      to C++, especially since I don't know C++ very well and
      would like a good sorting algorithm for medium sized
      datafiles.
      Are b-trees still a valued data structure today? It's been
      so long since I set foot in programming, I'm itching to
      start again, but I'm wanting a starting point.
      Not really topical for this group, you might want to ask in a
      general programming group like comp.programmin g.
      AFAIK B-Trees or one of its derivatives are used in most
      modern filesystems, since the time it takes to read in a node
      from the disk is so high you want to have as few reads as
      possible.
      I'm not sure what you're considering a B-Tree here. I don't
      think that the OS does anything special with directories,
      however: the general philosophy is that if your directory
      structure is well organized, no individual directory will
      contain enough entries to make anything other than linear search
      worth the effort. Also, the main use of B-Trees is to keep
      sorted data; the OS's I know don't maintain the directory
      entries sorted, so a B-Tree wouldn't improve access in any way.
      I would also suspect that they are commonly used in databases
      and other places which deals with on-disk storage.
      In a relational database, it probably depends on the types of
      indexes, and the use which is made of them. I think modern
      databases dynamically tune the organization according to the
      actual accesses they see, which means that if sorted access
      isn't useful, they won't use a B-Tree. (Sorted access is useful
      not only when you request sorted output, but also when you have
      index criteria like "where x a and x < b". On the other hand,
      a hash table is generally better for "where x = a", and off
      hand, I can't think of any "optimizing " structure for things
      like "where x like '...%'"---I've also noticed that when I use
      such requests, my access times go up significantly:-).)
      If the data structure is in RAM other structures are generally
      better.
      That used to be true, but I'm not sure now, with paging, etc.
      If you need random access to sorted data, using a B-Tree with
      each node filling an entire page (and page aligned) might be
      worth it (but you'd need one awfully big table to see the
      difference).

      --
      James Kanze (GABI Software) email:james.kan ze@gmail.com
      Conseils en informatique orientée objet/
      Beratung in objektorientier ter Datenverarbeitu ng
      9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

      Comment

      • =?UTF-8?B?RXJpayBXaWtzdHLDtm0=?=

        #4
        Re: B-trees

        On 2008-08-03 10:08, James Kanze wrote:
        On Aug 3, 2:43 am, Erik Wikström <Erik-wikst...@telia. comwrote:
        >On 2008-08-03 00:02, rsprawls wrote:
        >
        I found a disk for a b-tree algorithm that I purchased back
        in 93 or so. I'd hoped to find this, but now I'd like to
        know how worthwhile are b-trees in today's advancements?
        This is old C code using pascal and cdecl declarations, but
        it would be a minor project to bring up to date in C, but I
        was thinking it might be a good project to convert it over
        to C++, especially since I don't know C++ very well and
        would like a good sorting algorithm for medium sized
        datafiles.
        >
        Are b-trees still a valued data structure today? It's been
        so long since I set foot in programming, I'm itching to
        start again, but I'm wanting a starting point.
        >
        >Not really topical for this group, you might want to ask in a
        >general programming group like comp.programmin g.
        >
        >AFAIK B-Trees or one of its derivatives are used in most
        >modern filesystems, since the time it takes to read in a node
        >from the disk is so high you want to have as few reads as
        >possible.
        >
        I'm not sure what you're considering a B-Tree here. I don't
        think that the OS does anything special with directories,
        however: the general philosophy is that if your directory
        structure is well organized, no individual directory will
        contain enough entries to make anything other than linear search
        worth the effort. Also, the main use of B-Trees is to keep
        sorted data; the OS's I know don't maintain the directory
        entries sorted, so a B-Tree wouldn't improve access in any way.
        The nice things about B-Trees is that they keep the depth of the tree
        very low, which means that the number of nodes that have to be examined
        before you find the data you want is also low. But making each node in
        the tree the size of (or a multiple of) a disk-block you can make the
        number of disk-reads equal to the number of nodes examined. And since
        disk-reads are far more expensive than almost any operation you might do
        on the data this is a Good Thing.

        An example of how one might organise a filesystem would be to keep four
        things on disk, a directory (containing directories and files), a B-Tree
        of inodes (where an inode contains the data about a file, includin
        information about where the actual file data is stores), file data, and
        a B-Tree with free space information.

        Each entry in the directory represents a file in the system and contains
        the inode number of the file. The directory can probably be made small
        enough to keep at least part of it in cache.

        To read the contents of a file you first look it up in the directory
        which gives you the inode number, which is used to find the inode in the
        B-Tree. The inode then contains data about the file and in which disk-
        blocks the file data is stored. For large files (or sparse files) you
        can once again use a B-Tree for the disk-blocks to speed up random access.

        The free space tree is much simpler, it just contains the address of
        each free disk-block.

        --
        Erik Wikström

        Comment

        • Jerry Coffin

          #5
          Re: B-trees

          In article <345093ab-bd86-40a4-bdb6-
          d7ef9f4e992a@t5 4g2000hsg.googl egroups.com>, james.kanze@gma il.com
          says...

          [ ... Warning: this post is long, and topicality is marginal at best: ]
          I'm not sure what you're considering a B-Tree here. I don't
          think that the OS does anything special with directories,
          however: the general philosophy is that if your directory
          structure is well organized, no individual directory will
          contain enough entries to make anything other than linear search
          worth the effort. Also, the main use of B-Trees is to keep
          sorted data; the OS's I know don't maintain the directory
          entries sorted, so a B-Tree wouldn't improve access in any way.
          Windows NTFS does use B-trees for directories. Small directories (up to
          something like 4K of total data) are stored entirely in the directory's
          MFT entry as essentially a vector. Large directories (anything where the
          directory entries won't fit entirely in the MFT) are stored as B-trees.

          Most Unix file systems use hashing instead. E.g. ext2 and ext3 are quite
          common on Linux. There's a pretty decent description of their hashing
          at: http://www.nongnu.org/ext2-doc/ext2....EXED-DIRECTORY

          Interestingly, though this uses hashing, it's not "flat" hashing -- the
          hash is used as the key in (you've probably guessed by now) a balanced
          tree -- pretty much a B-tree.

          Though the decision as to when to use this indexing is made a bit
          differently than it is in NTFS, the same basic idea applies: only large
          directories are indexed in this fashion.

          There are certainly a few primitive file systems (e.g. FAT) that don't
          support indexing directories at all, but at this point, I'd consider
          them more the exception than the rule. Oddly, one of the primary uses of
          FAT at this point is in things like cards in digital cameras -- which
          typically organize your pictures as a single directory containing _lots_
          of files (hundreds or thousands of files is quite common). IOW, one of
          the most common uses of FAT also closely resembles its worst case.

          [ ... ]
          In a relational database, it probably depends on the types of
          indexes, and the use which is made of them. I think modern
          databases dynamically tune the organization according to the
          actual accesses they see, which means that if sorted access
          isn't useful, they won't use a B-Tree. (Sorted access is useful
          not only when you request sorted output, but also when you have
          index criteria like "where x a and x < b". On the other hand,
          a hash table is generally better for "where x = a", and off
          hand, I can't think of any "optimizing " structure for things
          like "where x like '...%'"---I've also noticed that when I use
          such requests, my access times go up significantly:-).)
          If you didn't mind a huge index, you could create an index that pointed
          to a particular record with the string starting from every possible
          offset in a field. For example, given two records with keys of "abcd"
          and "xyz", the index would look something like:

          key record number
          abcd 0
          bcd 0
          cd 0
          d 0
          xyz 1
          yz 1
          z 1

          Given its (almost inevitably large) size, you'd probably want to
          organize that index as a b-tree.

          Then, matching a search with a leading wildcard would be essentially
          similar to matching something without the leading wildcard. Given the
          larger index size, you might easily have an extra level or two in the
          tree, so access might be somewhat slower, but not the orders of
          magnitude slower that we expect with current SQL servers.

          Of course, on its own, this design only optimizes access in the case of
          a leading wildcard -- but IME, that's a very common scenario. OTOH, with
          only a small amount of extra data, this index could also be used for
          embedded wildcards: you'd also store the offset into the string for each
          entry. In this case, searching for something like 'ab%f' would consist
          of searching for 'ab' with an offset of 0 and 'f' with an offset >= 2.

          [ ... ]
          That used to be true, but I'm not sure now, with paging, etc.
          If you need random access to sorted data, using a B-Tree with
          each node filling an entire page (and page aligned) might be
          worth it (but you'd need one awfully big table to see the
          difference).
          I wouldn't be surprised if it did pretty well -- like paging, the
          performance of caching also tends to depend heavily upon locality of
          reference -- and that's a factor B-trees were designed to maximize to a
          large degree.

          --
          Later,
          Jerry.

          The universe is a figment of its own imagination.

          Comment

          • Jerry Coffin

            #6
            Re: B-trees

            In article <044d17e0-e39a-4103-9aa6-
            be0c263c9d1f@m3 6g2000hse.googl egroups.com>, james.kanze@gma il.com
            says...

            [ ... ]
            When you create the first file in a directory, you probably
            want to allocate from a relatively large block of free space,
            to support creating more files nearby as above, since it's
            _fairly_ unusual to create a new directory that'll only ever
            contain one file.
            >
            (You don't know the Unix culture. I've seen empty directories
            created for locking purposes:-).)
            That's not a problem at all -- an empty directory (or an empty file)
            only involves creating a directory entry. In addition, you're only
            really concerned when you have difficulty allocating free space on the
            disk because it has all been fragmented -- so something created for
            locking purposes isn't likely to last long enough to make any real
            difference anyway.
            More generally, those both sound like interesting heuristics.
            But somehow, given the evolution of systems (if it ain't broke,
            don't fix it, etc.), I doubt that there widespread in current
            Unix or Windows. (Maybe Linux... where the philosophy seems at
            times to be "if it ain't broke, it ain't complicated enough".)
            :-)

            [ I don't have time to write more now, but I'll probably post another
            follow-up to your post when I can...]

            --
            Later,
            Jerry.

            The universe is a figment of its own imagination.

            Comment

            • James Kanze

              #7
              Re: B-trees

              On Aug 4, 3:50 pm, Erik Wikström <Erik-wikst...@telia. comwrote:
              On 2008-08-04 14:41, James Kanze wrote:
              On Aug 3, 7:52 pm, Erik Wikström <Erik-wikst...@telia. comwrote:
              On 2008-08-03 16:59, James Kanze wrote:
              On Aug 3, 12:25 pm, Erik Wikström <Erik-wikst...@telia. comwrote:
              On 2008-08-03 10:08, James Kanze wrote:
              An example of how one might organise a filesystem would
              be to keep four things on disk, a directory (containing
              directories and files), a B-Tree of inodes (where an
              inode contains the data about a file, includin
              information about where the actual file data is stores),
              file data, and a B-Tree with free space information.
              I rather suspect that most systems keep the mapping inode
              number to where the inode is located on disk in memory,
              initializing it once, when the disk is mounted. (The
              original Unix kept the inodes in a separate contiguous
              sub-partition, with the inode number being used as a
              direct index. A very clever scheme to maximize head
              movement:-). I'm not sure what is being used today, but
              for large volumes, at least, a hash table would seem most
              appropriate.
              With old FSs it was common to pre-allocate a certain number
              of inodes when the disk was formatted, which had two
              problems, either you had more inodes than you needed or you
              did not have enough. In the first case you wasted space,
              in the second you were out of luck. New FSs allocates
              inodes on the fly and the trend seems to be to put as much
              as possible in the B-Tree (or Hash-tree) nodes and only
              keep the directory and file data separate.
              New, being at least everything since the Berkley 4.2 kernel,
              right:-). Except that I've never heard of inodes being kept
              in a B-Tree, and I don't see what that would gain (or even
              how it would be possible, since you don't even have any
              reasonable ordering criteria).
              New as in the last few years. I might have confused you by
              using the term inode, perhaps meta-data block or some other
              term would be better to avoid associations with UFS. One
              ordering criteria would be to use the inode (or meta-data
              block) number, another to hash the path.
              But why? That's what I'm trying to figure out. I'm familiar
              with what B-Trees are classically used for (ISAM), and they seem
              very much optimized for that: rapid direct access, but also
              sequential access according to an ordering criterion. I don't
              see where the second would ever apply to inodes or meta-blocks.
              As for the gain, while I'm not an expert on the area there are
              at least two FSs that I know of (one in early development)
              which uses unified B-Trees to store everything but the actual
              file data (and perhaps the free blocks, not too sure about how
              that works).
              OK. I can sort of see why one might want a unified method, and
              if B-Trees have distinct advantages in some cases, why not use
              them systematically. Jokingly: if all you have is a hammer,
              everything looks like a nail. Realistically, however: you have
              finite develoment time, and its probably preferable spending it
              to get one structure 100% right, rather than three different
              structures only more or less right. Once you've got the B-Tree,
              you use it unless there is a distinct disadvantage in doing so.

              [...]
              Solaris (AFAIK) uses UFS which is hardly a modern FS,
              I didn't say it was modern:-).
              but it is old and proven which is usually more important than
              features and performance in the enterprise market (especially
              when both features and performance can be added by in the form
              of extra software and RADI arrays).
              That's sort of what I hinted at in my reply to Jerry: if it
              ain't broke, don't fix it (vs if it ain't broke, it ain't
              complicated enough). There's also the fact that there doesn't
              seem to be as much money to be made making your system more
              effecient as there is to be made selling a new, more powerful
              hardware, and that I've never heard of anyone benchmarking this
              sort of thing in the procuration procedures. Solaris is by far
              the system I'm most familiar with, and it was definitly my
              impression that what I'd learned back when I was working on
              kernel software (some 20 or more years ago) hadn't really
              changed that much.

              [...]
              XFS used two synced B-Trees to manage free space, one was
              sorted by logical address, which is handy when trying to find
              free space close to some already allocated file (such as when
              appending data). The other tree was sorted by size of the free
              space, which is nice when trying to find the best place for a
              file (as when creating).
              If the file system supports contiguous files, this is
              definitely an advantage. The filesystems I'm familiar with
              today don't. :-(.
              No FS I know of do, but that does not mean that trying to keep
              up locality of reference is a bad idea.
              It depends. If you're going to read sector by sector anyway, on
              a multi-user system, it probably doesn't matter. Someone else
              is going to access elsewhere between your accesses anyway.

              --
              James Kanze (GABI Software) email:james.kan ze@gmail.com
              Conseils en informatique orientée objet/
              Beratung in objektorientier ter Datenverarbeitu ng
              9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

              Comment

              Working...