git.decadent.org.uk Git - nfs-utils.git/log

statd: Add API to canonicalize mon_names

Provide a shared function to generate canonical names that statd
uses to index its on-disk monitor list. This function can resolve
DNS hostnames, and IPv4 and IPv6 presentation addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

libnsm.a: Add support for multiple lines in monitor record files

To support IPv6, statd must support multi-homed remote peers.  For our
purposes, "multi-homed peer" means that more than one unique IP
address maps to the one canonical host name for that peer.

An SM_MON request from the local lockd has a "mon_name" argument that
statd reverse maps to a canonical hostname (ie the A record for that
host).  statd assumes the canonical hostname is unique enough that
it stores the callback data for this mon_name in a file named after
that canonical hostname.

Because lockd can't distinguish between two unique IP addresses that
may be from the same physical host, the kernel can hand statd a
mon_name that maps to the same canonical hostname as some previous
mon_name.  So that the kernel can keep this instance of the mon_name
unique, it creates a fresh priv cookie for each new address.

Note that a mon_name can be a presentation address string, or the
caller_name string sent in each NLMPROC_LOCK request.  There's
nothing that requires the caller_name to be a fully-qualified
hostname, thus it's uniqueness is not guaranteed.  The current
design of statd assumes that canonical hostnames will be unique
enough.

When a mon_name for a fresh SM_MON request maps to the same canonical
hostname as an existing monitored peer, but the priv cookie is new,
statd will try to write the information for the fresh request into an
existing monitor record file, wiping out the contents of the file.
This is because the mon_name/cookie combination won't match any record
statd already has.

Currently, statd doesn't check if a record file already exists before
writing into it.  statd's logic assumes that the svc routine has
already checked that no matching record exists in the in-core monitor
list.  And, it doesn't use O_EXCL when opening the record file.  Not
only is the old data in that file wiped out, but statd's in-core
monitor list will no longer match what's in the on-disk monitor list.

Note that IPv6 isn't needed to exercise multi-homed peer support.
Any IPv4 peer that has multiple addresses that map to its canonical
hostname will trigger this behavior.  However, this scenario will
become quite common when all hosts on a network automatically get both
an IPv4 address and an IPv6 address.

I can think of a few ways to address this:

1.  Replace the current on-disk format with a database that has a
uniqueness constraint on the monitor records

2.  Create a new file naming scheme; eg. one that uses a truly
unique name such as a hash generated from the mon_name, my_name, and
priv cookie

3.  Support multiple lines in each monitor record file

Since statd's on-disk format constitutes a formal API, options 1 and 2
are right out.  This patch implements option 3.  There are two parts:
adding a new line to an existing file; and deleting a line from a file
with more than one line.  Interestingly, the existing code already
supports reading more than one line from these files, so we don't need
to add extra code here to do that.

One file may contain a line for every unique mon_name / priv cookie
where the mon_name reverse maps to the same canonical hostname.  We
use the atomic write facility added by a previous patch to ensure the
on-disk monitor record list is updated atomically.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

libnsm.a: Factor atomic write code out of nsm_get_state()

We're about to use the same logic (mktemp, write, rename) for
other new purposes, so pull it out into its own function.

This change also addresses a latent bug: O_TRUNC is now used when
creating the temporary file. This eliminates the possibility of
getting stale data in the temp file, if somehow a previous "atomic
write" was interrupted and didn't remove the temporary file.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

sm-notify: Save mon_name and my_name strings

Currently sm-notify does not use the mon_name and my_name strings
passed to smn_get_host(). Very soon we're going to need the mon_name
and my_name strings, so add code to store those strings in struct
nsm_host, and free them when each host is forgotten.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

statd: Support IPv6 in sm_simu_crash_1_svc

Ensure that SM_SIMU_CRASH does not allow non-AF_INET callers to
bypass the localhost check.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

statd: Support IPv6 is caller_is_localhost()

For the time being, statd is not going to support receiving SM_MON
calls from the local lockd via IPv6.

However, the upcalls (SM_MON, etc.) from the local lockd arrive on the
same socket that receives calls from remote peers. Thus
caller_is_localhost() at least has to be smart enough to notice that
the caller is not AF_INET, and to display non-AF_INET addresses
appropriately.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

statd: add IPv6 support in sm_notify_1_svc()

We have all the pieces in place, so update sm_notify_1_svc() to handle
SM_NOTIFY requests sent from IPv6 remotes.

This also eliminates a memory leak: the strdup'd memory containing the
callers' presentation address was never freed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

statd: add nsm_present_address() API

Add an API to convert a socket address to a presentation address
string. This is used for displaying error messages and the like.

We prefer getnameinfo(3) over inet_?to?(3) as it supports IPv6 scope
IDs. Since statd has to continue to build correctly on systems whose
glibc does not have getnameinfo(3), an inet_?to?(3) version is also
provided.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

statd: Introduce statd version of matchhostname()

For the near future, statd will support IPv6 but exportfs will not.
Thus statd will need a version of matchhostname() that can deal
properly with IPv6 remotes.  To reduce the risk of breaking exportfs,
introduce a separate version of matchhostname() for statd to use while
exportfs continues to use the existing AF_INET-only implementation.

Note that statd will never send matchhostname() a hostname string
containing export wildcards, so is_hostame() is not needed in the
statd version of matchhostname().  This saves some computational
expense when comparing hostnames.

A separate statd-specific implementation of matchhostname() allows
some flexibility in the long term, as well.  We might want to enrich
the matching heuristics of our SM_NOTIFY, for example, or replace
them entirely with a heuristic that is not dependent upon DNS.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

nfs-utils: Collect socket address helpers into one location

Introduce generic helpers for managing socket addresses.  These are
general enough that they are useful for pretty much any component of
nfs-utils.

We also include the definition of nfs_sockaddr here, so it can be
shared.  See:

  https://bugzilla.redhat.com/show_bug.cgi?id=448743

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

sm-notify: Support IPv6 DNS lookups in smn_lookup

When IPV6_SUPPORTED is enabled and the local system has IPv6 support,
request AF_INET6 and AF_INET addresses from the DNS resolver.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

sm-notify: Use getaddrinfo(3) to create bind address in smn_create_socket()

This patch updates the "bind to a user-specified port" arm of
smn_create_socket() so it can deal with IPv6 bind addresses.

A single getaddrinfo(3) call can convert a user-specified bind address
or hostname to a socket address, optionally plant a provided port
number, or whip up an appropriate wildcard address for use as the main
socket's bind address.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

sm-notify: IPv6 support in reserved port binding in smn_create_socket()

This patch updates the "bind to an arbitrary privileged port" arm of
smn_create_socket() so it can deal with IPv6 bind addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

sm-notify: Support creating a PF_INET6 socket in smn_create_socket()

Socket creation is unfortunately complicated by the need to handle the
case where sm-notify is built with IPv6 support, but the local system
has disabled it entirely at run-time (ie, socket(3) returns
EAFNOSUPPORT when we try to create an AF_INET6 socket).

The run-time address family setting is made available in the global
variable nsm_family. This setting can control the family of the
socket's bind address and what kind of addresses we want returned by
smn_lookup(). Support for that is added in subsequent patches.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

sm-notify: factor socket creation out of notify()

The top half of the notify() function creates the main socket that
sm-notify uses to do its job. To make adding IPv6 support simpler,
refactor that piece into a separate function.

The logic is modified slightly so that exit(3) is invoked only in
main(). This is not required, but it makes the code slightly easier
to understand and maintain.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

statd: Update rmtcall.c

Replace the open code to construct NLM downcalls and PMAP_GETPORT RPC
requests with calls to our new library routines.

This clean up removes redundant code in rmtcall.c, and enables the
possibility of making NLM downcalls via IPv6 transports. We won't
support that for a long while, however.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

sm-notify: Replace RPC code

Replace the open code to construct SM_NOTIFY and PMAP_GETPORT RPC
requests with calls to our new library routines that support
IPv6 and RPCB_GETADDR as well.

This change allows sm-notify to send RPCB_GETADDR, but it won't do
that until the main sm-notify socket supports PF_INET6 and the DNS
resolution logic is updated to return IPv6 addresses.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

libnsm.a: Add RPC construction helper functions

To manage concurrency, both statd and sm-notify construct raw RPC
requests in socket buffers, and use a minimal request scheduler
to send these requests and manage replies.  Both statd and sm-notify
open code the RPC request construction.

Introduce helper functions that can construct and send raw
NSMPROC_NOTIFY, NLM downcalls, and portmapper calls over a datagram
socket, and receive and parse their replies.  Support for IPv6 and
RPCB_GETADDR is featured.  This code (and the IPv6 support it
introduces) can now be shared by statd and sm-notify, eliminating
code and bug duplication.

This implementation is based on what's in utils/statd/rmtcall.c now,
but is wrapped up in a nice API and includes extra error checking.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Revert "Automatically set 'nohide' on referral exports."

This partially reverts commit ec637de16210c1c6fcb3a0df34d7889592f577dc.

Only NFSv4 clients will actually want to see referall points--others are
better off just seeing an empty directory, that they can manually (or
with automount) mount the appropriate filesystem on.

So we want the kernel to automatically traverse only in the v4 case (as
recent kernels do).

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

mountd: better hiding of v4root exports from mountd clients

We've hidden v4root exports from get_exportlist (hence from the
showmount command), but not from other mountd operations--allowing
clients to attempt to mount exports when they should be getting an
immediate error.

Symptoms observed on a linux client were that a mount that previously
would have returned an error immediately now hung. This restores the
previous behavior.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

mountd: minor v4root_set cleanup, check strdup return

Move more of v4root_set into a helper function.

Also, check the return value from strdup. (We don't really handle the
error well yet--we'll end up giving negative replies to export upcalls
when we should be giving the kernel exports, resulting in spurious
-ENOENTs or -ESTALE's--but that's better than crashing with a NULL
dereference.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

mountd: simplify export list deferral in v4root_set

We're adding new entries, but not deleting them, so we don't need to do
the usual double-counter trick here.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

mountd: minor optimization in v4root_set

Since we're adding new exports as we traverse the export list, it's
possible we may find ourselves revisiting an export we just added.  It's
harmless to reprocess those exports, as we're currently doing.  But it's
also pointless.

(Actually, the current code appears to always add new export entries at
the head of each list, so we shouldn't hit this case.  It still may be a
good idea to keep this check, though, as insulation against future
changes to that data structure.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

mountd: kill unnecessary m_mayexport check

Only exportfs uses m_mayexport; mountd always populates the export list
with auth_reload(), which always sets m_mayexport on the entries it
creates.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

mountd: mountlist_del_all cleanup

Common exit code.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

mountd: auth_authenticate_internal further cleanup

Move newcache case into its own function.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

mountd: auth_authenticate_internal cleanup

Break up another big function.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

exports: common exportent initializer

Consolidate duplicated initialization code.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

exports: export_read() cleanup

Use standard indentation, move warnings to helper function.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

mountd: get_exportlist() cleanup

Comment clarification, minor style cleanup.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

mountd: move most of get_exportlist() into helpers

I needed to understand get_exportlist() recently, and it gave me
trouble.

Move detail work into helper functions to make the basic logic clear,
and to remove need for excessive nesting (and fix inconsistent
indentation levels). Also remove unnecessary casts of void returns from
xmalloc().

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

exports: turn on pseudo exports

If a pseudo root is not defined in the export file, the
v4root_needed global variable will be set, signaling
v4root_set() create the dynamic pseudo root.

Signed-off-by: Steve Dickson <steved@redhat.com>

exports: hide pseudo exports from clients

Don't show pseudo exports when clients ask to see what
is exported via the showmount mount command.

Signed-off-by: Steve Dickson <steved@redhat.com>

mountd: prefer non-V4ROOT exports.

If paths A and A/B are both exported, then we have a choice of exports
to return for A (or under A but still above A/B): we could return A
itself, or we could return a V4ROOT export leading to B.

For now, we will always prefer the non-V4ROOT export, whenever that is
an option. This will allow clients to reach A/B as long as
adminstrators keep to the rule that the security on a parent permits the
union of the access permitted on any descendant.

In the future we may support more complicated arrangements.

(Note: this can't be avoided by simply not creating v4root exports with
the same domain and path, because different domains may have some
overlap.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

exports: NFSv4 pseudoroot support routines

Create v4root exports for each directory that is a parent of an explicit
export. Give each the minimal security required to traverse to any of
its children.

Signed-off-by: Steve Dickson <steved@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

exports: add new flag for NFSv4 pseudoroot

Signed-off-by: Steve Dickson <steved@redhat.com>

mountd: don't require mountpoint in crossmnt case

Currently,

mount --bind /path /path

where /path is a subdirectory of a crossmnt export, can cause client
hangs, since the kernel detects that as a mountpoint, but nfs-util's
is_mountpoint() function does not.

I don't see any sure-fire way to detect such mountpoints. But that's
OK: it's harmless to allow this upcall to succeed even when the
directory is not a mountpoint, so let's just remove this check.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

mountd: further split up lookup_export

More trivial cleanup (no change in functionality) to group logical
operations together into a single function.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

mountd: move export lookup into separate function

Move this main loop to a separate function, to make it a little easier
to follow the logic of the caller.

Also, instead of waiting till we find an export to do the dns
resolution, do it at the start; it will normally be needed anyway, and
this simplifies the control flow.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

exports: let kernel decide which flags vary by flavor

Query the kernel to ask which flavors vary by pseudoflavor, and use that
instead of a fixed constant. To allow the possibility of more flags
varying by pseudoflavor, use the set/clear_flags functions for all
options instead of setting some by hand.

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

exports: minor parse_opts cleanup

Move this into a helper function. (We'll be adding a little more code
here.)

Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>

gssd: on krb5 upcall, have gssd send a more granular error code

Currently if a krb5 context expires, GSSAPI authenticated RPC calls
start returning error (-EACCES in particular). This is bad when someone
has a long running job that's doing filesystem ops on a krb5 authenticated
NFS mount and just happens to forget to redo a 'kinit' in time.

The existing gssd always does a downcall with a '-1' error code if there
are problems, and the kernel always ignores this error code. Begin to
fix this by having gssd distinguish between someone that has no
credcache at all, and someone who has an expired one. In the case where
there is an existing credcache, have gssd downcall with an error code of
-EKEYEXPIRED. If there's not a credcache, then downcall with an error of
-EACCES.

We can then have the kernel use this error code to handle these
situations differently.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

Added the following files to .gitignore
tests/nsm_client/nlm_sm_inter.h
tests/nsm_client/nlm_sm_inter_clnt.c
tests/nsm_client/nlm_sm_inter_svc.c
tests/nsm_client/nlm_sm_inter_xdr.c

Signed-off-by: Steve Dickson <steved@redhat.com>

nfs-utils: add initial tests for statd that run via "make check"

Leverage the support that automake already has for running tests via
make check. Add a simple test that just checks that the statd mon and
unmon calls actually work.

Adding more tests should be a simple matter of adding new scripts
exit 0 on success and non-zero on fail, and adding those to the
Makefile.am.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

nfs-utils: add statdb_dump utility

To dump contents of statd's monitor DB.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

nfs-utils: introduce new statd testing simulator

rpc.statd is often prone to subtle, difficult to detect breakage. When
it has problems, they're often invisible and only manifest themselves
as failed lock recovery.

This program is intended to function as part of a test harness for
statd. It's a multicall binary that serves as a synthetic NSM client
program, and a daemon that can simulate lockd for purposes of testing
the NSM to NLM downcall.

A new top level "tests/" directory is also added to nfs-utils to start
as a repository for automated tests of nfs-utils components.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

nfs-utils: make private cookie to hex conversion a library routine

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

statd: Use the new nsm_ file.c calls in rpc.statd

Replace open-coded accesses to on-disk NSM information in rpc.statd
with calls to the new API.

Behavior should be much the same as it was before.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

statd: Use the new nsm_ file.c calls in sm_notify

Replace open-coded accesses to on-disk NSM data with calls to the new
libnsm.a API.

One major change is that sync(2) is no longer called when the NSM
state number is updated at boot time. Otherwise sm-notify should
behave much the same as it did before.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

libnsm.a: Introduce common routines to handle persistent storage

rpc.statd and sm-notify access the same set of files under
/var/lib/nfs/statd, but both have their own code base to handle this.
They should share this code.

In addition, the on-disk format used by statd and friends is
considered a formal interface, so this new code will codify the API
and provide documentation for it.

The shared code handles switching from the default parent statd
directory, reducing privileges at start-up, and managing the NSM
state files, in addition to handling normal operations on the
monitored host and notification lists on disk.

The new code is simply a copy of the same logic that was used in
rpc.statd and sm-notify, but wrapped in a nice API. There should be
minimal behavioral and no on-disk format changes with the new
libnsm.a code.

The new code is more careful to check for bad corner cases.
Occassionally this code may not allow an operation that was permitted
in the past, but hopefully the error reporting has improved enough
that it should be easy to track down any problems.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

showmount: Try the highest mount version then fall back to lower ones

Showmount should try the highest mount version first then fall
back to the lower ones when the server returns a RPC_PROGVERSMISMATCH
error. The idea being not using the lower mount versions will begin
the process of moving away from NFSv2 support.

Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

mount.nfs: don't use IPv6 unless IPV6_SUPPORTED is set
Commit 1f3fae1fb25168aac187ff1881738c8ad53a8763 made mount.nfs start
looking up and trying to use IPv6 addresses when mount.nfs was built
against libtirpc (even when --enable-ipv6 wasn't specified).

The problem seems to be that nfs_nfs_proto_family() is basing the family
on HAVE_LIBTIRPC. I think it should be basing it on IPV6_SUPPORTED
instead.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Acked-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

libnsm.a: Move the sm_inter XDR pieces to libnsm.a

Clean up: Move the .x file and the generated C source for NSM to
libnsm.a, echoing the architecture of mountd and exportfs. This makes
the NSM protocol definitions, data types, and XDR routines available
to be shared across nfs-utils.

This simplifies the addition of other NSM-related code (for example
for testing or providing clustering support), and also provides
public data type definitions that can be used to make sense of the
contents of statd's on-disk database.

Because sim_sm_inter.x still resides in utils/statd, I've left some
rpcgen build magic in utils/statd/Makefile.am.

This is an internal organization change only. This patch should not
affect code behavior in any way.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

libexport.a: fix a long-standing typo in name_cmp()

Not sure what "(!*a || !a == ',')" means... but just a few lines later
is
"(!*a || *a == ',')". I think "a is '\0' or ','" is what was intended.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

statd: replace smn_{get,set}_port() with the shared equivalents

Use shared sockaddr port management functions instead of duplicating
this functionality in sm-notify. This is now easy because sm-notify
is linked with libnfs.a, where nfs_{get,set}_port() reside.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

statd: squelch compiler warning in sm-notify.c

Clean up: Get rid of a false positive compiler warning, seen with
-Wextra.

sm-notify.c: In function ¿record_pid¿:
sm-notify.c:690: warning: comparison between signed and unsigned integer
expressions

Document some ignored return codes while we're here.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

gssd: have gssd deal with scopeid field in upcall

Recent kernels (2.6.32) have started displaying the scopeid for some
addresses in the upcall. gssd doesn't know how to deal with them. Change
gssd to use getaddrinfo instead of inet_pton since that can deal with
scopeid's in addresses. That also allows us to elminate the port
conversion in read_service_info.

If getaddrinfo returns an address with a non-zero sin6_scope_id however,
reject it. getnameinfo ignores that field and just uses the sin6_addr
part when resolving. But, two addresses that differ only in
sin6_scope_id could refer to completely different hosts.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

NFS man page: update nfs(5) with details about IPv6 support

Add details to nfs(5) about how to specify raw IPv6 addresses when
mounting an
NFS server. Mounting via an IPv6 NFS server via hostname should work as
it
does with IPv4.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

mount.nfs: Remove nfs_name_to_address()

Clean up: nfs_name_to_address() has no more callers.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

mount.nfs: Teach umount.nfs to recognize netids in /etc/mtab

umount.nfs has to detect the correct address family to use when
looking up the server.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

mount.nfs: proto=netid forces address family when resolving server names

Using the netid settings, determine the correct address family to use
for NFS and MNT server name resolution. Use this family when
resolving the server name for the addr= and mountaddr= options.

This patch assumes the kernel can recognize a netid, instead of a
protocol name, as the value of the proto= options.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

mount.nfs: Fix sockaddr pointer aliasing in stropts.c

Using a sockaddr_storage and casting a sockaddr pointer to it breaks
C's aliasing rules.

See:

https://bugzilla.redhat.com/show_bug.cgi?id=448743

Replacing sockaddr_storage makes this code less likely to break when
optimized by gcc. It also saves a significant amount of stack space
by replacing a 130 byte structure with a union that is less than 32
bytes.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

mount.nfs: Add new API for getting protocol family from netids

Introduce a couple of new functions that extract the protocol family
from the value of the proto= and mountproto= mount options.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

mount.nfs: make nfs_lookup() global

Expose a DNS query API that allows callers to request DNS results from
a specific address family.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

mount.nfs: support netids in v2/v3 version/transport negotiation

When rewriting mount options during v2/v3 negotiation, restore the
correct netids, rather than protocol names, in the rewritten protocol
options. If TI-RPC is not available, the traditional behavior is
preserved.

This patch assumes the kernel can recognize a netid, instead of a
protocol name, as the value of the proto= options.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

mount.nfs: support netids in nfs_options2pmap()

When parsing mount options in nfs_options2pmap(), treat the value of
proto= (and mountproto=) as a netid by looking it up in local
netconfig and protocol databases to convert it to a protocol number.
If TI-RPC is not available, the traditional behavior is preserved.

The meaning of the "udp" and "tcp" mount options is not affected by
this change.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

libnfs.a: Provide shared helpers for managing netids

Introduce a couple of shared functions that can convert netids to
protocol numbers and families, and back.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

mount.nfs: Retry v4 mounts with v3 on ENOENT errors

Retry v4 mounts with a v3 mount when the version
is not explicitly specified and the mount fails
with ENOENT. The will help deal with Linux servers
that do not automatically export a pseudo root

Signed-off-by: Steve Dickson <steved@redhat.com>

statd: Replace nsm_log() with xlog() in sm-notify command

To facilitate code sharing between statd and sm-notify (and with other
components of nfs-utils), replace sm-notify's nsm_log() with xlog().

Since opt_quiet is used in only a handful of insignificant cases, it
is removed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

statd: Replace note() with xlog() in rpc.statd

To facilitate code sharing between statd and sm-notify (and with other
components of nfs-utils), replace sm-notify's nsm_log() with xlog().

Since opt_quiet is used in only a handful of insignificant cases, it
is removed.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

nfs-utils: NFSv4: fix backgrounding

he nfsmount() function checks if !bg before running
switch(rpc_createerr.cf_stat). On the other hand, the nfs4mount()
function does not, and results in exiting the loop on the first
iteration even with the bg mount option.

NOTE: This and the previous patch ("nfs-utils: mount options can be lost
when using bg option") are relevant to non text-based mount options.

See https://bugzilla.redhat.com/show_bug.cgi?id=529370 for details.

Signed-off-by: Harshula Jayasuriya <harshula@redhat.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

mount options can be lost when using bg option

When mounting an NFS export *without* the "bg" option, try_mount() is
called only once. Before calling it, the variables mount_opts and
extra_opts are set up. Then try_mount() calls nfsmount(), the latter
assumes that the aforementioned variables can be modified. Most
significantly, it allows the variable extra_opts to be modified.

When the "bg" mount option is used *and* the first try_mount() attempt
fails, it daemonizes the process and calls try_mount() again,
unfortunately, we've lost the required mount options in the variable
extra_opts.

See https://bugzilla.redhat.com/show_bug.cgi?id=529370 for details.

Signed-off-by: Harshula Jayasuriya <harshula@redhat.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

relax insecure option on mountd

In nfs-utils 1.2.0, I noticed that the insecure option validates that
the client port is a
subset of IPPORT_RESERVED as opposed to just validating it is a valid
reserved port. The following proposed patch would correct that issue.

Acked-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Robert Gordon <rbg@openrbg.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

mount.nfs: Assume v2/v3 if mount-related options are present

Don't try NFSv4 if any MNT protocol related options were
presented by the user.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

gssd: process service= attribute in new upcall

Add processing of the "service=" attribute in the new gssd upcall.

If "service" is specified, then the kernel is indicating that
we must use machine credentials for this request.  (Regardless
of the uid value or the setting of root_uses_machine_creds.)
If the service value is "*", then any service name can be used.
Otherwise, it specifies the service name that should be used.
(For now, the values of service will only be "*" or "nfs".)

Restricting gssd to use "nfs" service name is needed for when
the NFS server is doing a callback to the NFS client.  In this
case, the NFS server has to authenticate itself as "nfs" --
even if there are other service keys such as "host" or "root"
in the keytab.

Another case when the kernel may specify the service attribute
is when gssd is being asked to create the context for a
SETCLIENT_ID operation.  In this case, machine credentials
must be used for the authentication.  However, the service name
used for this case is not important.

Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu>
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>

gssd: process target= attribute in new upcall

Add processing of the "target=" attribute in the new gssd upcall.
Information in this field is used to construct the gss service name
of the server for which gssd will create a context .

This, along with the next patch handling "service=", is needed
for callback security.

For Kerberos, the NFS client will use a service principal present
in its keytab during authentication of the SETCLIENT_ID operation.
When establishing the context for the callback, the gssd on the
NFS server will attempt to authenticate the callback against the
principal name used by the client.

Note: An NFS client machine must have a keytab for the callback
authentication to succeed.

Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu>
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>

gssd: handle new client upcall

Add support for handling the new client-side upcall. The kernel,
beginning with 2.6.29, will attempt to use a new pipe, "gssd",
which can be used for upcalls for all gss mechanisms.

The new upcall is text-based with an <attribute>=<value> format.
Attribute/value pairs are separated by a space, and terminated
with a new-line character.

The intial version has two required attributes,
mech=<gss_mechanism_name> and uid=<user's_UID_number>, and two
optional attributes, target=<gss_target_name> and service=<value>.

Future kernels may add new attribute/value pairs.

Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu>
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>

gssd: print full client directory being handled

For convenience, add the full name of the upcall pipe being processed.
(Distinquishes between "normal" upcall, and a callback upcall.)

Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu>
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>

gssd: add upcall support for callback authentication

Change the processing so that all subdirectories within the rpc_pipefs
directory are treated equally.  Any "clnt" directories that show up
within any of them are processed.  (As suggested by Bruce Fields.)

Note that the callback authentication will create a new "nfs4d_cb"
subdirectory.  Only new kernels (2.6.29) will create this new directory.
(The need for this directory will go away with NFSv4.1 where the
callback can be done on the same connection as the fore-channel.)

Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>

gssd: refactor update_client_list()

Split out the processing for a pipe to a separate routine. The next
patch adds a new pipe to be processed.

Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu>
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>

This patch adds the krb5 hostbased principal, name which the
nfs client used to authenticate, to the svcgssd downcall
information. This information is needed for the callback
authentication.

When estabishing the callback, nfsd will pass the principal
name in the upcall to the gssd. gssd will acquire a service
ticket for the specified principal name.

Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu>
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: Steve Dickson <steved@redhat.com>

Remove the AI_ADDRCONFIG hint flag to getaddrinfo() when it's
call by nfsd to set up the file descriptors that are
sent to the kernel. The flag causes the getaddrinfo()
to fail, with EAI_NONAME, when there is not a non-loopback
network interface configured.

Signed-off-by: Steve Dickson <steved@redhat.com>

Release 1.2.1

Signed-off-by: Steve Dickson <steved@redhat.com>

Fixed configuration error when --disable-mount was used.

Signed-off-by: Steve Dickson <steved@redhat.com>

mount: Fix po_join() call site in nfs_try_mount_v4()

Make sure the copied options string is freed in case po_join() fails.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

mount.nfs: Assume v2/v3 if mount-related options are present

Don't try NFSv4 if any MNT protocol related options were presented by
the user.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

Made some aesthetic changes to the code that sets
the defaults that were a result of the code review.

Signed-off-by: Steve Dickson <steved@redhat.com>

Retry v4 mounts with a v3 mount when the version
is not explicitly specified and the mount fails
with ENOENT. The will help deal with Linux servers
that do not automatically export a pseudo root

Signed-off-by: Steve Dickson <steved@redhat.com>

Added wrappers around the setting of default values
from the config file which will be compiled out
when the config file is not enabled.

Signed-off-by: Steve Dickson <steved@redhat.com>

Added the defaultproto and defaultvers variable to the mount
configuration file.

Signed-off-by: Steve Dickson <steved@redhat.com>

Use the default protocol and version values, when they
are set in the configuration file, to start the negation
with the server

Signed-off-by: Steve Dickson <steved@redhat.com>

Introducing the parsing of both 'defaultvers' and 'defaultproto'
config variables which will be used to set the the default
version and network protocol.

A global variable will be set for each option with the
corresponding value. The value will be used as the
initial value in the server negation.

Signed-off-by: Steve Dickson <steved@redhat.com>

Make sure all protocol version options are checked in check_vers()

Signed-off-by: Steve Dickson <steved@redhat.com>

Make the network transports value in the mount
config file case sensitive, since they are in the
mount command's parsing code.

Signed-off-by: Steve Dickson <steved@redhat.com>

There are a number of different mount options that can be
used to set the protocol version on the command line. The
config file code needs to know about each option so the
command line value will override the config file value.

Signed-off-by: Steve Dickson <steved@redhat.com>

mount: Support negotiation between v4, v3, and v2

When negotiating between v3 and v2, mount.nfs first tries v3, then v2.
Take the same approach for v4: try v4 first, then v3, then v2, in
order to get the highest NFS version both the client and server
support.

No MNT request is needed for v4.  Since we want to avoid an rpcbind
query for the v4 attempt, just go straight for mount(2) without a MNT
request or rpcbind negotiation first.  If the server reports that v4
is not supported, try lower versions.

The decisions made by the fg/bg retry loop have nothing to do with
version negotation.  To avoid a layering violation, mount.nfs's
multi-version negotiation strategy is wholly encapsulated within
nfs_try_mount().  Thus, code duplication between nfsmount_fg(),
nfsmount_parent(), and nfsmount_child() is avoided.

For now, negotiating version 4 is supported only on kernels that can
handle the vers=4 option on type "nfs" file systems.  At some point
we could also allow mount.nfs to switch to an "nfs4" file system in
this case.

Since mi->version == 0 can now mean v2, v3, or v4, limit the versions
tried for RDMA mounts.  Today, only version 3 supports RDMA.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

The user's mount options and the set of versions to try should not
change over the course of mount retries.

With this patch, each version-specific mount attempt is compartment-
alized, and starts from the user's original mount options each time.
Thus these attempts can now be safely performed in any order,
depending on what the user has requested, what the server advertises,
and what is up and running at any given point.

Don't regress the fix in commit 23c1a452. For v2/v3 negotation, only
the user's mount options are written to /etc/mtab, and not any options
that were negotiated by mount.nfs. There's no way to guarantee that
the server configuration will be the same at umount time as it was at
mount time.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

mount.nfs: Keep server's address in nfsmount_info

We want to pass the server's address around. Put it in the mount
context structure.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

mount.nfs: Add API to duplicate a mount option list

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Steve Dickson <steved@redhat.com>

nfs-utils: nfs-iostat.py autofs cleanup and option to sort by ops/s

Adds --sort option to display mount point stats sorted by ops/s
Adds --list=<n> option to only display stats for first <n> mount points
E.g. the use of "--sort --list=1" should be useful in seeing stats for
only the mountpoint with the highest ops/s.

Signed-off-by: Lans Carstensen <Lans.Carstensen@dreamworks.com>
Signed-off-by: Steve Dickson <steved@redhat.com>