Rob,
Over the past 25 years, the relevant principles that I have followed are...
- reduce the volume of I/Os, then next...
- reduce the latency of I/Os
Step 1 translates to "make it more efficient" by generating fewer I/Os. Step 2 recognizes that there is only so far you can go with step 1, so when step 1 is exhausted, then (and only then) go on to step 2. Caching falls into step 2. So does more memory, faster CPU, etc. If you take care of step 1, you rarely have to worry about step 2.
Often, people try to reverse the order and perform step 2 first, but putting inefficient processes on faster hardware (a.k.a. "KIWI" or "kill it with iron") is only a temporary solution, never a permanent one, if it solves anything at all. Expensive and disappointing.
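To make step 1 concrete, here's a toy sketch (Python, with an invented table and data) of the most common volume reduction: turning N round trips into one batched query. Each `execute` stands in for an I/O.

```python
import sqlite3

# Invented example data: a small in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(i, f"v{i}") for i in range(100)])

ids = list(range(100))

# Inefficient: one round trip per row -> 100 "I/Os".
rows_slow = [conn.execute("SELECT val FROM t WHERE id = ?", (i,)).fetchone()[0]
             for i in ids]

# Step 1 applied: one round trip for all rows -> 1 "I/O".
placeholders = ",".join("?" * len(ids))
rows_fast = [r[0] for r in conn.execute(
    f"SELECT val FROM t WHERE id IN ({placeholders}) ORDER BY id", ids)]

assert rows_slow == rows_fast  # same answer, 1/100th of the calls
```

Nothing about the hardware changed; the same work now takes one call instead of a hundred. That's the whole idea.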
Also, remember that indexing isn't always the right solution; there are numerous situations where a full scan is far more efficient.
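A back-of-envelope model shows why. With invented numbers (1M rows, 100 rows per page), an index costs roughly one random page read per matching row, while a full scan costs one sequential read per page regardless of how many rows match. Once a query touches more than about a percent of the table, the scan wins on sheer I/O count, before you even account for random reads being slower than sequential ones:

```python
# Invented numbers for illustration only.
rows = 1_000_000
rows_per_page = 100
pages = rows // rows_per_page  # 10,000 sequential page reads for a full scan

def index_cost(selectivity):
    # Rough worst case: one random page read per matching row
    # (index pages themselves ignored for simplicity).
    return rows * selectivity

def scan_cost():
    return pages

for sel in (0.001, 0.01, 0.1, 0.5):
    better = "index" if index_cost(sel) < scan_cost() else "full scan"
    print(f"selectivity {sel:>5}: index ~{int(index_cost(sel))} random reads "
          f"vs scan {scan_cost()} sequential reads -> {better}")
```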
Reducing the number of I/Os and the latency of I/Os requires that you're able to measure both. You can't improve that which you cannot measure.
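For instance, a tiny wrapper (an invented helper, just a sketch) can count both things at once, calls made and time spent in them, for any file-like object:

```python
import io
import time

class CountingFile:
    """Wraps a file-like object and records both the number of read
    calls (volume) and the time spent in them (latency).
    Invented helper for illustration, not a production tool."""
    def __init__(self, f):
        self._f = f
        self.reads = 0
        self.seconds = 0.0

    def read(self, n=-1):
        t0 = time.perf_counter()
        data = self._f.read(n)
        self.seconds += time.perf_counter() - t0
        self.reads += 1
        return data

# Usage: 1 KiB read in 64-byte chunks -> 16 data reads + 1 EOF read.
f = CountingFile(io.BytesIO(b"x" * 1024))
while f.read(64):
    pass
print(f"{f.reads} reads, {f.seconds * 1e6:.0f} us in I/O")
```

Once numbers like these are in front of you, it's obvious whether step 1 (fewer reads) or step 2 (faster reads) is the lever to pull.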
Sorry if that's a bit preachy and perhaps too high-level, but that's how I approach this stuff, and I did it for a living for a number of years.
Hope this helps...
-Tim