How do I make this quicker?

I should probably start referring to database textbooks at this point. Lookups and indexing:

  • Hashing is almost definitely quicker, given a decent hash bucket size…
  • .. but doesn’t allow next/previous operations very well.
  • The hash vales can be easily stored to disk, so it’s not like having to build or rebuild indices.
  • Tree structures can also be stored to disk, but you need to distinguish between whether you’re storing a disk tree (pointers are blocks / offsets / etc), or storing a memory tree with has been indirected / translated to a stream datastructure.
  • Unfortunately, if doing the latter, where the offsets are an indirection of some sort, if you’re not careful, reading the tree off disk can be more time consuming than building it from scratch: you need another datastructure to store the translation from disk blocks / offsets to memory addresses, which has a similar set of costs to a tree (unless you use another hash table).
  • I suppose I could increase the branching factor, but even if you do, you still have the same set of costs to build the tree.

So, as you might guess, the datastructures I’ve got have good amortized cost, but there’s a still a startup cost to be minimised. I also have other techniques to help minimise the startup costs (background loading). This is not critical, and an optimisation to be made after completing basic functionality. Answers on a postcard please.

Leave a Reply

Your email address will not be published. Required fields are marked *