lookup("/bin", "top")           => -ENOENT
lookup("/usr/bin", "top")       => -ENOENT
lookup("/usr/local/bin", "top") => -ENOENT
lookup("/opt/bin", "top")       => SUCCESS, returns filehandle, etc.
A similar sequence would occur for all other execs, and even for attempted executions of non-existing binaries or scripts.
Simple caching of recent failed lookups is insufficient for two reasons. First, the timeout must be very short, perhaps as low as 100-200ms, since it is imperative that a file deemed not to exist really does not exist (consider lock files created on one host and used on another). Caching a failed lookup for 3 to 5 seconds (as successful lookups are cached) would destroy NFS semantics. Second, there is little reason to believe (and little evidence to show) that failing lookups on the same filename in the same directory recur very frequently. The work of maintaining the cache would be wasted most of the time.
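To make the constraint concrete, a simple negative-lookup cache of the kind dismissed above might look like the following sketch (the class and method names are invented here; the 200ms TTL is the figure from the text). Even with the short timeout, each entry must be invalidated the moment it expires, because the file may have since been created by another host:

```python
import time

class NegativeLookupCache:
    """Minimal negative-lookup cache with a short TTL.

    A sketch only: keying on (directory, name) and the expiry
    handling are assumptions, not the paper's implementation.
    """

    def __init__(self, ttl=0.2):
        self.ttl = ttl
        self.entries = {}  # (directory, name) -> expiry timestamp

    def record_miss(self, directory, name, now=None):
        """Remember that lookup(directory, name) returned -ENOENT."""
        now = time.monotonic() if now is None else now
        self.entries[(directory, name)] = now + self.ttl

    def known_missing(self, directory, name, now=None):
        """True only if a recent, still-fresh lookup already failed."""
        now = time.monotonic() if now is None else now
        expiry = self.entries.get((directory, name))
        if expiry is None:
            return False
        if now > expiry:
            # Stale entry: the file may exist now; must re-ask the server.
            del self.entries[(directory, name)]
            return False
        return True
```

Note that the short TTL is exactly what makes this scheme unprofitable: entries rarely live long enough to be hit again, so the bookkeeping is mostly wasted work.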
An alternative, currently in a research implementation, is to keep a failed lookup cache with a tunable timeout defaulting to 200ms. The cache is populated not with recent failed lookup results, but with a queue of asynchronously prefetched lookups. These prefetches are driven by a simple prediction algorithm, which maintains a matrix recording which directories are accessed soon after each other directory within a fixed-size window of the most recent failed lookups. For example, in the above execution of top, the lookup for the binary in /bin is done synchronously, but (assuming the matrix is warm from earlier use) when that lookup returns -ENOENT, the matrix is consulted and asynchronous lookup RPCs are issued for the two directories most often visited after /bin in the last 32 failed lookups. If the prediction was correct and one of the prefetched lookups does occur, the asynchronous call has often already returned, and its result (just a boolean: whether the lookup returned -ENOENT) is available immediately.
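The prediction side of this scheme can be sketched as follows. The window size (32) and the number of prefetches (two) come from the text; the class name, the data layout, and the simplification of counting only immediate predecessor/successor pairs (rather than all pairs within the window) are assumptions of this sketch:

```python
from collections import defaultdict, deque

class FailedLookupPredictor:
    """Sketch of the directory-transition predictor described above.

    Simplification (ours, not the paper's): only the immediately
    preceding failed-lookup directory updates the matrix, rather
    than every directory in the 32-entry window.
    """

    def __init__(self, window=32, prefetches=2):
        self.prefetches = prefetches
        self.recent = deque(maxlen=window)  # recent failed-lookup dirs
        # matrix[a][b]: how often a failure in b followed one in a
        self.matrix = defaultdict(lambda: defaultdict(int))

    def record_failure(self, directory):
        """Note a failed lookup and update the transition counts."""
        if self.recent:
            prev = self.recent[-1]
            if prev != directory:
                self.matrix[prev][directory] += 1
        self.recent.append(directory)

    def predict(self, directory):
        """Return up to `prefetches` directories most often searched
        right after `directory`; these would get async lookup RPCs."""
        followers = self.matrix.get(directory, {})
        ranked = sorted(followers, key=followers.get, reverse=True)
        return ranked[:self.prefetches]
```

Fed the PATH walk from the top example (/bin, /usr/bin, /usr/local/bin, /opt/bin in order), a warm matrix would predict /usr/bin as soon as the lookup in /bin fails.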
Preliminary results, using data on failed lookups logged while running our benchmarks, show that 88% of the predictions made are needed before they time out. This means that we are not generating much additional network traffic, or loading the server unreasonably with wasted lookups (presumably, the lookup RPC is fairly inexpensive for the server anyway). We correctly predict about 53% of the failed lookups that occur, and are able to reuse about 6.5% of the failed lookups within 200ms (i.e., the file was searched for again, so we not only saved the latency of the original lookup but also avoided a second one). These numbers suggest that almost 60% of the synchronous lookups could be eliminated by a simple linear scan of the 10-element predictively prefetched failed lookup cache. A cache hit saves the synchronous RPC call, along with its associated delays and context switches.
However, the cost of maintaining and using the failed lookup cache is not trivial. For each lookup, the 10-element cache is scanned and the matrix is updated. Also, whenever predictions are made, queuing the (up to two) asynchronous lookups delays the return to the user code whose lookup triggered the predictions. It is not yet clear whether the implementation can be optimized enough to yield an overall performance improvement.