X-Git-Url: https://git.decadent.org.uk/gitweb/?p=nfs-utils.git;a=blobdiff_plain;f=utils%2Fmount%2Fnfs.man;h=87e27e1519615d3663172dc0b29d7aef33514695;hp=be91a252150c37dda50610abcff10c7de70d17c4;hb=9a5293a10551c03b4fb976503dd24da569fcadb3;hpb=4bbd6d624c000f26ab828852ee90a4624df26c49

diff --git a/utils/mount/nfs.man b/utils/mount/nfs.man
index be91a25..87e27e1 100644
--- a/utils/mount/nfs.man
+++ b/utils/mount/nfs.man
@@ -46,11 +46,10 @@ files on this mount point.
 The fifth and sixth fields on each line are not used
 by NFS, thus conventionally each contain the digit zero. For example:
 .P
-.SP
-.NF
-.TA 2.5i +0.75i +0.75i +1.0i
+.nf
+.ta 8n +14n +14n +9n +20n
  server:path /mountpoint fstype option,option,... 0 0
-.FI
+.fi
 .P
 The server's hostname and export pathname
 are separated by a colon, while
@@ -113,12 +112,16 @@ option may mitigate some of the risks of using the
 option.
 .TP 1.5i
 .BI timeo= n
-The time (in tenths of a second) the NFS client waits for a
-response before it retries an NFS request. If this
-option is not specified, requests are retried every
-60 seconds for NFS over TCP.
-The NFS client does not perform any kind of timeout backoff
-for NFS over TCP.
+The time in deciseconds (tenths of a second) the NFS client waits for a
+response before it retries an NFS request.
+.IP
+For NFS over TCP the default
+.B timeo
+value is 600 (60 seconds).
+The NFS client performs linear backoff: After each retransmission the
+timeout is increased by
+.BR timeo
+up to the maximum of 600 seconds.
 .IP
 However, for NFS over UDP, the client uses an adaptive
 algorithm to estimate an appropriate timeout value for frequently used
@@ -369,14 +372,8 @@ Valid security flavors are
 .BR sys ,
 .BR krb5 ,
 .BR krb5i ,
-.BR krb5p ,
-.BR lkey ,
-.BR lkeyi ,
-.BR lkeyp ,
-.BR spkm ,
-.BR spkmi ,
 and
-.BR spkmp .
+.BR krb5p ,
 Refer to the SECURITY CONSIDERATIONS section for details.
 .TP 1.5i
 .BR sharecache " / " nosharecache
@@ -503,6 +500,8 @@ Specifying a netid that uses TCP forces all traffic from the
 command and the NFS client to use TCP.
 Specifying a netid that uses UDP forces all traffic types to use UDP.
 .IP
+.B Before using NFS over UDP, refer to the TRANSPORT METHODS section.
+.IP
 If the
 .B proto
 mount option is not specified, the
@@ -517,6 +516,8 @@ The
 option is an alternative to specifying
 .BR proto=udp.
 It is included for compatibility with other operating systems.
+.IP
+.B Before using NFS over UDP, refer to the TRANSPORT METHODS section.
 .TP 1.5i
 .B tcp
 The
@@ -752,8 +753,8 @@ If
 is specified, the client assumes that POSIX locks are local and uses
 NLM sideband protocol to lock files when flock locks are used.
 .IP
-To support legacy flock behavior similar to that of NFS clients < 2.6.12, use
-'local_lock=flock'. This option is required when exporting NFS mounts via
+To support legacy flock behavior similar to that of NFS clients < 2.6.12,
+use 'local_lock=flock'. This option is required when exporting NFS mounts via
 Samba as Samba maps Windows share mode locks as flock. Since NFS clients >
 2.6.12 implement flock by emulating POSIX locks, this will result in
 conflicting locks.
@@ -900,40 +901,40 @@ The following example from an
 file causes the mount command to negotiate
 reasonable defaults for NFS behavior.
 .P
-.NF
-.TA 2.5i +0.7i +0.7i +.7i
+.nf
+.ta 8n +16n +6n +6n +30n
  server:/export /mnt nfs defaults 0 0
-.FI
+.fi
 .P
 Here is an example from an /etc/fstab file for an NFS version 2 mount over UDP.
 .P
-.NF
-.TA 2.5i +0.7i +0.7i +.7i
+.nf
+.ta 8n +16n +6n +6n +30n
  server:/export /mnt nfs nfsvers=2,proto=udp 0 0
-.FI
+.fi
 .P
 Try this example to mount using NFS version 4 over TCP
 with Kerberos 5 mutual authentication.
 .P
-.NF
-.TA 2.5i +0.7i +0.7i +.7i
+.nf
+.ta 8n +16n +6n +6n +30n
  server:/export /mnt nfs4 sec=krb5 0 0
-.FI
+.fi
 .P
 This example can be used to mount /usr over NFS.
 .P
-.NF
-.TA 2.5i +0.7i +0.7i +.7i
+.nf
+.ta 8n +16n +6n +6n +30n
  server:/export /usr nfs ro,nolock,nocto,actimeo=3600 0 0
-.FI
+.fi
 .P
 This example shows how to mount an NFS server
 using a raw IPv6 link-local address.
 .P
-.NF
-.TA 2.5i +0.7i +0.7i +.7i
+.nf
+.ta 8n +40n +5n +4n +9n
  [fe80::215:c5ff:fb3e:e2b1%eth0]:/export /mnt nfs defaults 0 0
-.FI
+.fi
 .SH "TRANSPORT METHODS"
 NFS clients send requests to NFS servers via
 Remote Procedure Calls, or
@@ -1073,6 +1074,83 @@ or
 options are specified more than once on the same mount command line,
 then the value of the rightmost instance of each of these options
 takes effect.
+.SS "Using NFS over UDP on high-speed links"
+Using NFS over UDP on high-speed links such as Gigabit
+.BR "can cause silent data corruption" .
+.P
+The problem can be triggered at high loads, and is caused by problems in
+IP fragment reassembly. NFS read and writes typically transmit UDP packets
+of 4 Kilobytes or more, which have to be broken up into several fragments
+in order to be sent over the Ethernet link, which limits packets to 1500
+bytes by default. This process happens at the IP network layer and is
+called fragmentation.
+.P
+In order to identify fragments that belong together, IP assigns a 16bit
+.I IP ID
+value to each packet; fragments generated from the same UDP packet
+will have the same IP ID. The receiving system will collect these
+fragments and combine them to form the original UDP packet. This process
+is called reassembly. The default timeout for packet reassembly is
+30 seconds; if the network stack does not receive all fragments of
+a given packet within this interval, it assumes the missing fragment(s)
+got lost and discards those it already received.
+.P
+The problem this creates over high-speed links is that it is possible
+to send more than 65536 packets within 30 seconds. In fact, with
+heavy NFS traffic one can observe that the IP IDs repeat after about
+5 seconds.
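The wrap-around figure quoted in the added text can be checked with a rough calculation. The following is only a sketch and not part of the patch: it assumes one IP ID is consumed per UDP datagram, that NFS traffic alone saturates the link, and that protocol headers can be ignored. The 8 KiB datagram size and the names `wrap_time_seconds`, `gigabit` and `fast_ethernet` are illustrative choices, not values taken from nfs.man.

```python
# Back-of-the-envelope estimate of how quickly the 16-bit IP ID space wraps.
# Assumptions (not from the man page): one IP ID per UDP datagram, a link
# saturated by NFS traffic, and no allowance for protocol header overhead.

IP_ID_SPACE = 2 ** 16        # a 16-bit IP ID field has 65536 distinct values
REASSEMBLY_TIMEOUT_S = 30    # default reassembly timeout cited in the text


def wrap_time_seconds(link_bits_per_s: float, datagram_bytes: int) -> float:
    """Return the seconds until IP IDs repeat at full link utilisation."""
    datagrams_per_s = link_bits_per_s / 8 / datagram_bytes
    return IP_ID_SPACE / datagrams_per_s


# Hypothetical 8 KiB NFS datagrams on two link speeds.
gigabit = wrap_time_seconds(1e9, 8192)
fast_ethernet = wrap_time_seconds(100e6, 8192)

for name, seconds in (("1 Gbit/s", gigabit), ("100 Mbit/s", fast_ethernet)):
    where = "inside" if seconds < REASSEMBLY_TIMEOUT_S else "outside"
    print(f"{name}: IP IDs wrap after about {seconds:.1f} s, "
          f"{where} the {REASSEMBLY_TIMEOUT_S} s reassembly timeout")
```

Under these assumptions a saturated Gigabit link reuses IP IDs well within the 30 second reassembly window, while a 100 Mbit/s link does not. Smaller datagrams or a busier host would shorten both figures.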
+.P
+This has serious effects on reassembly: if one fragment gets lost,
+another fragment
+.I from a different packet
+but with the
+.I same IP ID
+will arrive within the 30 second timeout, and the network stack will
+combine these fragments to form a new packet. Most of the time, network
+layers above IP will detect this mismatched reassembly - in the case
+of UDP, the UDP checksum, which is a 16 bit checksum over the entire
+packet payload, will usually not match, and UDP will discard the
+bad packet.
+.P
+However, the UDP checksum is 16 bit only, so there is a chance of 1 in
+65536 that it will match even if the packet payload is completely
+random (which very often isn't the case). If that is the case,
+silent data corruption will occur.
+.P
+This potential should be taken seriously, at least on Gigabit
+Ethernet.
+Network speeds of 100Mbit/s should be considered less
+problematic, because with most traffic patterns IP ID wrap around
+will take much longer than 30 seconds.
+.P
+It is therefore strongly recommended to use
+.BR "NFS over TCP where possible" ,
+since TCP does not perform fragmentation.
+.P
+If you absolutely have to use NFS over UDP over Gigabit Ethernet,
+some steps can be taken to mitigate the problem and reduce the
+probability of corruption:
+.TP +1.5i
+.I Jumbo frames:
+Many Gigabit network cards are capable of transmitting
+frames bigger than the 1500 byte limit of traditional Ethernet, typically
+9000 bytes. Using jumbo frames of 9000 bytes will allow you to run NFS over
+UDP at a page size of 8K without fragmentation. Of course, this is
+only feasible if all involved stations support jumbo frames.
+.IP
+To enable a machine to send jumbo frames on cards that support it,
+it is sufficient to configure the interface for a MTU value of 9000.
+.TP +1.5i
+.I Lower reassembly timeout:
+By lowering this timeout below the time it takes the IP ID counter
+to wrap around, incorrect reassembly of fragments can be prevented
+as well.
To do so, simply write the new timeout value (in seconds)
+to the file
+.BR /proc/sys/net/ipv4/ipfrag_time .
+.IP
+A value of 2 seconds will greatly reduce the probability of IPID clashes on
+a single Gigabit link, while still allowing for a reasonable timeout
+when receiving fragmented traffic from distant peers.
 .SH "DATA AND METADATA COHERENCE"
 Some modern cluster file systems provide
 perfect cache coherence among their clients.
@@ -1413,7 +1491,7 @@ security flavor encrypts every RPC request
 to prevent data exposure during network transit; however,
 expect some performance impact
 when using integrity checking or encryption.
-Similar support for other forms of cryptographic security (such as lipkey and SPKM3)
+Similar support for other forms of cryptographic security
 is also available.
 .P
 The NFS version 4 protocol allows
@@ -1558,10 +1636,10 @@ To ensure that the saved mount options are not erased during a remount,
 specify either the local mount directory, or the server hostname and
 export pathname, but not both, during a remount. For example,
 .P
-.NF
-.TA 2.5i
+.nf
+.ta 8n
  mount -o remount,ro /mnt
-.FI
+.fi
 .P
 merges the mount option
 .B ro