Write Operations

The other major area that could be improved is the write subsystem. Because NFS requires the server to commit writes to stable storage, little can be done from the client side to reduce real transfer time (aside from hardware solutions like Presto and NFS V3's additional commit RPC). A partially attainable goal is improved perceived response time: ultimately we still have to send the bits, but we could extend the use of delayed writes to let the user program that issued the write continue computing while the data is being sent.
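The delayed-write idea can be illustrated with a minimal sketch. It is not the implementation described here: the class name WriteBehindQueue and the send callback (a stand-in for the blocking NFS write RPC) are hypothetical, and retry logic is omitted.

```python
import queue
import threading


class WriteBehindQueue:
    """Hypothetical client-side queue of delayed writes (illustration only)."""

    def __init__(self, send):
        # `send(offset, data)` is an assumed stand-in for the blocking
        # write RPC to the server; it is not a real NFS client call.
        self._send = send
        self._pending = queue.Queue()
        self.errors = []  # failures recorded by the background worker
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def write(self, offset, data):
        # Copy the data and return at once, so the caller keeps computing
        # while the worker thread performs the slow transfer.
        self._pending.put((offset, bytes(data)))

    def flush(self):
        # Block until every queued write has been passed to `send`.
        self._pending.join()

    def _drain(self):
        while True:
            offset, data = self._pending.get()
            try:
                self._send(offset, data)
            except Exception as exc:
                # A real client would retry or report this to the caller.
                self.errors.append(exc)
            finally:
                self._pending.task_done()
```

Here write returns as soon as the data is queued, and flush is the point at which the caller pays for the transfer. The total transfer time is unchanged; only the perceived response time improves.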

There are two interesting issues with writing in general, and asynchronous writing in particular, that need to be addressed. The first is the interaction between reads and writes. Logically, a read issued on a client immediately after a write on that client should return the new data, independent of the server state and the world view of other clients. To do otherwise would weaken the already weak NFS consistency model further than is reasonable. However, the simple "solution," whereby data is first written to the local cache and later sent to the server from there, cannot work in the NFS model. When a new page of a file is written to the server, the timestamp on the server's file changes. To the client, the file looks changed, but the client cannot tell whether the change is the result of the update it just sent (assuming it remembers that it just wrote a page) or came from some other host. NFS does not allow the client to discover the instigator of the change, so the client is forced to either invalidate or update its cache as a hedge against the (admittedly unlikely) possibility that the change on the server was not its own. NFS V3 corrects this shortcoming by having write return the prior modified time as well as the new one [PJS⁺94, p. 142]. If the prior modified time matches the client's, then its write was the only one that transpired. Furthermore, again as a limitation of NFS, the client cannot discover which pages of a file have changed, so if it chooses to update its cache, it must re-fetch the entire file. Our current implementation does no caching of writes and invalidates the NFS disk cache whenever data is written to a file.
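The NFS V3 check described above can be sketched as follows. The WriteReply type, its field names, and the dictionary-based cache are simplifications assumed for illustration; they are not the protocol's actual data structures.

```python
from dataclasses import dataclass


@dataclass
class WriteReply:
    # Hypothetical, simplified view of what an NFS V3 write returns:
    # the file's modified time before and after the write was applied.
    mtime_before: int
    mtime_after: int


def only_our_write(cached_mtime, reply):
    """True if no other host changed the file since we cached it."""
    return reply.mtime_before == cached_mtime


def apply_write_reply(cache, reply):
    # `cache` is an assumed structure: {"mtime": int, "pages": dict}.
    if not only_our_write(cache["mtime"], reply):
        # Someone else modified the file and we cannot learn which
        # pages changed, so the cached pages are discarded.
        cache["pages"].clear()
    # In both cases, remember the server's new modified time.
    cache["mtime"] = reply.mtime_after
    return cache
```

Without the prior modified time (as in earlier NFS versions), only_our_write cannot be computed, and the client must always take the invalidation branch.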

The second concern with asynchronous writing pertains to closing files. Clients that are especially interested in data integrity will prefer a policy where the close operation blocks until all pending writes to the file have actually been written to stable storage on the server. On the other hand, a client concerned more with performance might prefer to return immediately from close and allow the data to be sent to the server later. Although NFS requires the former semantics for close, the latter seems useful, so it would make sense to provide a mount flag ([no]dc_force_on_close) to control this possibly dangerous behavior.¹⁸
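The two close policies can be contrasted in a small sketch. The class CachedFile, its methods, and the force_on_close parameter (modeling the proposed [no]dc_force_on_close mount flag) are hypothetical names chosen for illustration.

```python
import threading


class CachedFile:
    """Hypothetical per-file state tracking outstanding delayed writes."""

    def __init__(self, force_on_close=True):
        # Models the proposed [no]dc_force_on_close mount flag.
        self.force_on_close = force_on_close
        self._cond = threading.Condition()
        self._pending = 0

    def write_started(self):
        # Called when a delayed write is queued for this file.
        with self._cond:
            self._pending += 1

    def write_committed(self):
        # Called when the server reports the write is on stable storage.
        with self._cond:
            self._pending -= 1
            if self._pending == 0:
                self._cond.notify_all()

    def close(self):
        """Return the number of writes still outstanding at return."""
        with self._cond:
            if self.force_on_close:
                # Integrity-first policy: block until all writes commit.
                self._cond.wait_for(lambda: self._pending == 0)
            # Performance-first policy falls through immediately.
            return self._pending
```

With force_on_close set, close always returns 0; without it, a nonzero return value marks exactly the window in which a client or server crash could lose data the application believes was written.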

Some studies suggest that 20-30% of newly written data is deleted within 30 seconds [NWO88, p. 138]. Naïvely, these data appear to further support delaying writes, because there is a good chance that the file will be removed before we get around to actually performing them. However, files are generally closed before they are deleted, and a close that blocks on pending writes would simply make us wait at that point for the writes to complete, so the transfer would not be avoided.

Greg Badros
1998-04-23