Chuck Lever [Thu, 14 Jan 2010 17:24:08 +0000 (12:24 -0500)]
statd: Add API to canonicalize mon_names
Provide a shared function to generate canonical names that statd
uses to index its on-disk monitor list. This function can resolve
DNS hostnames, and IPv4 and IPv6 presentation addresses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Thu, 14 Jan 2010 17:24:00 +0000 (12:24 -0500)]
libnsm.a: Add support for multiple lines in monitor record files
To support IPv6, statd must support multi-homed remote peers. For our
purposes, "multi-homed peer" means that more than one unique IP
address maps to the one canonical host name for that peer.
An SM_MON request from the local lockd has a "mon_name" argument that
statd reverse maps to a canonical hostname (i.e., the A record for that
host). statd assumes the canonical hostname is unique enough that
it stores the callback data for this mon_name in a file named after
that canonical hostname.
Because lockd can't distinguish between two unique IP addresses that
may be from the same physical host, the kernel can hand statd a
mon_name that maps to the same canonical hostname as some previous
mon_name. So that the kernel can keep this instance of the mon_name
unique, it creates a fresh priv cookie for each new address.
Note that a mon_name can be a presentation address string, or the
caller_name string sent in each NLMPROC_LOCK request. There's
nothing that requires the caller_name to be a fully-qualified
hostname, so its uniqueness is not guaranteed. The current
design of statd assumes that canonical hostnames will be unique
enough.
When a mon_name for a fresh SM_MON request maps to the same canonical
hostname as an existing monitored peer, but the priv cookie is new,
statd will try to write the information for the fresh request into an
existing monitor record file, wiping out the contents of the file.
This is because the mon_name/cookie combination won't match any record
statd already has.
Currently, statd doesn't check if a record file already exists before
writing into it. statd's logic assumes that the svc routine has
already checked that no matching record exists in the in-core monitor
list. And, it doesn't use O_EXCL when opening the record file. Not
only is the old data in that file wiped out, but statd's in-core
monitor list will no longer match what's in the on-disk monitor list.
Note that IPv6 isn't needed to exercise multi-homed peer support.
Any IPv4 peer that has multiple addresses that map to its canonical
hostname will trigger this behavior. However, this scenario will
become quite common when all hosts on a network automatically get both
an IPv4 address and an IPv6 address.
I can think of a few ways to address this:
1. Replace the current on-disk format with a database that has a
uniqueness constraint on the monitor records
2. Create a new file naming scheme; e.g., one that uses a truly
unique name such as a hash generated from the mon_name, my_name, and
priv cookie
3. Support multiple lines in each monitor record file
Since statd's on-disk format constitutes a formal API, options 1 and 2
are right out. This patch implements option 3. There are two parts:
adding a new line to an existing file; and deleting a line from a file
with more than one line. Interestingly, the existing code already
supports reading more than one line from these files, so we don't need
to add extra code here to do that.
One file may contain a line for every unique mon_name / priv cookie
where the mon_name reverse maps to the same canonical hostname. We
use the atomic write facility added by a previous patch to ensure the
on-disk monitor record list is updated atomically.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
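The two operations described above (adding a line to a record file and deleting one line from a multi-line file) can be sketched on an in-memory copy of the file contents. This is only an illustration under stated assumptions: the function names record_add_line() and record_delete_line() are hypothetical, the record lines are treated as opaque strings, and the real libnsm.a code writes the result back with its atomic write facility.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a newly allocated copy of "records"
 * with "line" (plus a newline) appended. Caller frees the result.
 */
char *record_add_line(const char *records, const char *line)
{
	assert(records != NULL && line != NULL);

	size_t rlen = strlen(records);
	size_t llen = strlen(line);
	char *out = malloc(rlen + llen + 2);

	if (out == NULL)
		return NULL;
	memcpy(out, records, rlen);
	memcpy(out + rlen, line, llen);
	out[rlen + llen] = '\n';
	out[rlen + llen + 1] = '\0';
	return out;
}

/*
 * Hypothetical helper: return a newly allocated copy of "records"
 * with every line equal to "line" removed. Caller frees the result.
 */
char *record_delete_line(const char *records, const char *line)
{
	assert(records != NULL && line != NULL);

	size_t want = strlen(line);
	char *out = malloc(strlen(records) + 1);
	char *dst = out;
	const char *p = records;

	if (out == NULL)
		return NULL;

	while (*p != '\0') {
		const char *eol = strchr(p, '\n');
		size_t len = eol ? (size_t)(eol - p) : strlen(p);
		size_t step = eol ? len + 1 : len;

		/* Keep the line (and its newline) unless it matches. */
		if (len != want || memcmp(p, line, len) != 0) {
			memcpy(dst, p, step);
			dst += step;
		}
		p += step;
	}
	*dst = '\0';
	return out;
}
```

If deleting a line leaves the buffer empty, a caller would presumably remove the record file instead of rewriting it.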
Chuck Lever [Thu, 14 Jan 2010 17:23:53 +0000 (12:23 -0500)]
libnsm.a: Factor atomic write code out of nsm_get_state()
We're about to use the same logic (mktemp, write, rename) for
other new purposes, so pull it out into its own function.
This change also addresses a latent bug: O_TRUNC is now used when
creating the temporary file. This eliminates the possibility of
getting stale data in the temp file, if somehow a previous "atomic
write" was interrupted and didn't remove the temporary file.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
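A minimal sketch of the write-to-temp-then-rename pattern this commit factors out. It is not the libnsm.a function: the name atomic_write_sketch() and the fixed ".new" suffix are assumptions made for illustration. It does show why O_TRUNC matters when a temporary file name can be reused.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: replace the contents of "path" in one atomic step. */
bool atomic_write_sketch(const char *path, const void *buf, size_t len)
{
	char tmp[4096];
	ssize_t written;
	int fd, n;

	assert(path != NULL && buf != NULL);

	/* Assumed naming scheme: "<path>.new" in the same directory. */
	n = snprintf(tmp, sizeof(tmp), "%s.new", path);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return false;

	/* O_TRUNC discards stale data left by an interrupted earlier attempt. */
	fd = open(tmp, O_CREAT | O_TRUNC | O_SYNC | O_WRONLY, 0644);
	if (fd == -1)
		return false;

	/* A short write is treated as a failure in this sketch. */
	written = write(fd, buf, len);
	if (close(fd) == -1 || written != (ssize_t)len) {
		(void)unlink(tmp);
		return false;
	}

	/* rename(2) atomically replaces "path" with the finished temp file. */
	if (rename(tmp, path) == -1) {
		(void)unlink(tmp);
		return false;
	}
	return true;
}
```

Readers of "path" see either the old contents or the new contents, never a partially written file.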
Chuck Lever [Thu, 14 Jan 2010 17:23:48 +0000 (12:23 -0500)]
sm-notify: Save mon_name and my_name strings
Currently sm-notify does not use the mon_name and my_name strings
passed to smn_get_host(). Very soon we're going to need the mon_name
and my_name strings, so add code to store those strings in struct
nsm_host, and free them when each host is forgotten.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Thu, 14 Jan 2010 17:23:37 +0000 (12:23 -0500)]
statd: Support IPv6 in caller_is_localhost()
For the time being, statd is not going to support receiving SM_MON
calls from the local lockd via IPv6.
However, the upcalls (SM_MON, etc.) from the local lockd arrive on the
same socket that receives calls from remote peers. Thus
caller_is_localhost() at least has to be smart enough to notice that
the caller is not AF_INET, and to display non-AF_INET addresses
appropriately.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Thu, 14 Jan 2010 17:23:23 +0000 (12:23 -0500)]
statd: add nsm_present_address() API
Add an API to convert a socket address to a presentation address
string. This is used for displaying error messages and the like.
We prefer getnameinfo(3) over inet_?to?(3) as it supports IPv6 scope
IDs. Since statd has to continue to build correctly on systems whose
glibc does not have getnameinfo(3), an inet_?to?(3) version is also
provided.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Thu, 14 Jan 2010 17:23:19 +0000 (12:23 -0500)]
statd: Introduce statd version of matchhostname()
For the near future, statd will support IPv6 but exportfs will not.
Thus statd will need a version of matchhostname() that can deal
properly with IPv6 remotes. To reduce the risk of breaking exportfs,
introduce a separate version of matchhostname() for statd to use while
exportfs continues to use the existing AF_INET-only implementation.
Note that statd will never send matchhostname() a hostname string
containing export wildcards, so is_hostname() is not needed in the
statd version of matchhostname(). This saves some computational
expense when comparing hostnames.
A separate statd-specific implementation of matchhostname() allows
some flexibility in the long term, as well. We might want to enrich
the matching heuristics of our SM_NOTIFY, for example, or replace
them entirely with a heuristic that is not dependent upon DNS.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Thu, 14 Jan 2010 17:22:59 +0000 (12:22 -0500)]
sm-notify: Use getaddrinfo(3) to create bind address in smn_create_socket()
This patch updates the "bind to a user-specified port" arm of
smn_create_socket() so it can deal with IPv6 bind addresses.
A single getaddrinfo(3) call can convert a user-specified bind address
or hostname to a socket address, optionally plant a provided port
number, or whip up an appropriate wildcard address for use as the main
socket's bind address.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
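As a rough illustration of the single getaddrinfo(3) call described above: with AI_PASSIVE, a NULL node yields the wildcard address for the requested family, while a non-NULL node is converted to a socket address with the port planted in it. The name bind_address_sketch() and its parameters are assumptions, not the smn_create_socket() code.

```c
#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: build a bind address for "family".
 * "node" may be NULL (wildcard); "port" may be NULL (any port).
 * Returns 0 on success, otherwise a getaddrinfo(3) error code.
 */
int bind_address_sketch(sa_family_t family, const char *node,
			const char *port, struct sockaddr_storage *out,
			socklen_t *outlen)
{
	struct addrinfo hints, *ai = NULL;
	int err;

	assert(out != NULL && outlen != NULL);

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = family;
	hints.ai_socktype = SOCK_DGRAM;
	/* AI_PASSIVE: a NULL node produces the wildcard address. */
	hints.ai_flags = AI_PASSIVE | AI_NUMERICSERV;

	/* getaddrinfo() needs a node or a service; "0" means "any port". */
	err = getaddrinfo(node, port ? port : "0", &hints, &ai);
	if (err != 0)
		return err;

	memcpy(out, ai->ai_addr, ai->ai_addrlen);
	*outlen = ai->ai_addrlen;
	freeaddrinfo(ai);
	return 0;
}
```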
Chuck Lever [Thu, 14 Jan 2010 17:22:33 +0000 (12:22 -0500)]
sm-notify: Support creating a PF_INET6 socket in smn_create_socket()
Socket creation is unfortunately complicated by the need to handle the
case where sm-notify is built with IPv6 support, but the local system
has disabled it entirely at run-time (i.e., socket(2) returns
EAFNOSUPPORT when we try to create an AF_INET6 socket).
The run-time address family setting is made available in the global
variable nsm_family. This setting can control the family of the
socket's bind address and what kind of addresses we want returned by
smn_lookup(). Support for that is added in subsequent patches.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Thu, 14 Jan 2010 17:22:26 +0000 (12:22 -0500)]
sm-notify: factor socket creation out of notify()
The top half of the notify() function creates the main socket that
sm-notify uses to do its job. To make adding IPv6 support simpler,
refactor that piece into a separate function.
The logic is modified slightly so that exit(3) is invoked only in
main(). This is not required, but it makes the code slightly easier
to understand and maintain.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Thu, 14 Jan 2010 17:22:12 +0000 (12:22 -0500)]
statd: Update rmtcall.c
Replace the open code to construct NLM downcalls and PMAP_GETPORT RPC
requests with calls to our new library routines.
This clean up removes redundant code in rmtcall.c, and enables the
possibility of making NLM downcalls via IPv6 transports. We won't
support that for a long while, however.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Thu, 14 Jan 2010 17:22:09 +0000 (12:22 -0500)]
sm-notify: Replace RPC code
Replace the open code to construct SM_NOTIFY and PMAP_GETPORT RPC
requests with calls to our new library routines that support
IPv6 and RPCB_GETADDR as well.
This change allows sm-notify to send RPCB_GETADDR, but it won't do
that until the main sm-notify socket supports PF_INET6 and the DNS
resolution logic is updated to return IPv6 addresses.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Chuck Lever [Tue, 12 Jan 2010 21:41:43 +0000 (16:41 -0500)]
libnsm.a: Add RPC construction helper functions
To manage concurrency, both statd and sm-notify construct raw RPC
requests in socket buffers, and use a minimal request scheduler
to send these requests and manage replies. Both statd and sm-notify
open code the RPC request construction.
Introduce helper functions that can construct and send raw
NSMPROC_NOTIFY, NLM downcalls, and portmapper calls over a datagram
socket, and receive and parse their replies. Support for IPv6 and
RPCB_GETADDR is featured. This code (and the IPv6 support it
introduces) can now be shared by statd and sm-notify, eliminating
code and bug duplication.
This implementation is based on what's in utils/statd/rmtcall.c now,
but is wrapped up in a nice API and includes extra error checking.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Only NFSv4 clients will actually want to see referral points--others are
better off just seeing an empty directory, that they can manually (or
with automount) mount the appropriate filesystem on.
So we want the kernel to automatically traverse only in the v4 case (as
recent kernels do).
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
J. Bruce Fields [Wed, 13 Jan 2010 00:27:21 +0000 (19:27 -0500)]
mountd: better hiding of v4root exports from mountd clients
We've hidden v4root exports from get_exportlist (hence from the
showmount command), but not from other mountd operations--allowing
clients to attempt to mount exports when they should be getting an
immediate error.
Symptoms observed on a linux client were that a mount that previously
would have returned an error immediately now hung. This restores the
previous behavior.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
J. Bruce Fields [Sat, 9 Jan 2010 17:44:57 +0000 (10:44 -0700)]
mountd: minor v4root_set cleanup, check strdup return
Move more of v4root_set into a helper function.
Also, check the return value from strdup. (We don't really handle the
error well yet--we'll end up giving negative replies to export upcalls
when we should be giving the kernel exports, resulting in spurious
-ENOENTs or -ESTALE's--but that's better than crashing with a NULL
dereference.)
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
J. Bruce Fields [Sat, 9 Jan 2010 17:22:17 +0000 (10:22 -0700)]
mountd: minor optimization in v4root_set
Since we're adding new exports as we traverse the export list, it's
possible we may find ourselves revisiting an export we just added. It's
harmless to reprocess those exports, as we're currently doing. But it's
also pointless.
(Actually, the current code appears to always add new export entries at
the head of each list, so we shouldn't hit this case. It still may be a
good idea to keep this check, though, as insulation against future
changes to that data structure.)
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
J. Bruce Fields [Fri, 27 Nov 2009 20:01:12 +0000 (15:01 -0500)]
mountd: move most of get_exportlist() into helpers
I needed to understand get_exportlist() recently, and it gave me
trouble.
Move detail work into helper functions to make the basic logic clear,
and to remove need for excessive nesting (and fix inconsistent
indentation levels). Also remove unnecessary casts of void returns from
xmalloc().
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Steve Dickson [Tue, 1 Dec 2009 12:20:43 +0000 (07:20 -0500)]
exports: turn on pseudo exports
If a pseudo root is not defined in the export file, the
v4root_needed global variable will be set, signaling
v4root_set() to create the dynamic pseudo root.
J. Bruce Fields [Thu, 24 Dec 2009 20:51:20 +0000 (15:51 -0500)]
mountd: prefer non-V4ROOT exports.
If paths A and A/B are both exported, then we have a choice of exports
to return for A (or under A but still above A/B): we could return A
itself, or we could return a V4ROOT export leading to B.
For now, we will always prefer the non-V4ROOT export, whenever that is
an option. This will allow clients to reach A/B as long as
administrators keep to the rule that the security on a parent permits the
union of the access permitted on any descendant.
In the future we may support more complicated arrangements.
(Note: this can't be avoided by simply not creating v4root exports with
the same domain and path, because different domains may have some
overlap.)
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Steve Dickson [Tue, 1 Dec 2009 12:16:13 +0000 (07:16 -0500)]
exports: NFSv4 pseudoroot support routines
Create v4root exports for each directory that is a parent of an explicit
export. Give each the minimal security required to traverse to any of
its children.
Signed-off-by: Steve Dickson <steved@redhat.com> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
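To make "each directory that is a parent of an explicit export" concrete, here is a small sketch that lists the ancestors of an export path. The function name list_parents_sketch() is hypothetical and the sketch assumes a normalized absolute path; the real v4root code also has to pick the security for each generated export, which is not shown.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: write every proper ancestor of "path" into
 * "out", one per line, starting with "/". Returns the number of
 * ancestors, or -1 if "out" is too small.
 */
int list_parents_sketch(const char *path, char *out, size_t outlen)
{
	size_t used = 0;
	int count = 0;
	const char *p;

	assert(path != NULL && out != NULL && path[0] == '/');

	if (outlen == 0)
		return -1;
	out[0] = '\0';
	if (path[1] == '\0')
		return 0;		/* "/" itself has no parent */

	/* p marks the end of each ancestor: the root, then each later '/'. */
	for (p = path; p != NULL; p = strchr(p + 1, '/')) {
		size_t len = (p == path) ? 1 : (size_t)(p - path);

		if (p[1] == '\0')
			break;		/* ignore a trailing slash */
		if (used + len + 2 > outlen)
			return -1;
		memcpy(out + used, path, len);
		used += len;
		out[used++] = '\n';
		out[used] = '\0';
		count++;
	}
	return count;
}
```

For an export of /a/b/c this yields "/", "/a", and "/a/b", which are the directories a v4 client must be able to traverse to reach the export.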
J. Bruce Fields [Tue, 22 Dec 2009 18:02:08 +0000 (13:02 -0500)]
mountd: don't require mountpoint in crossmnt case
Currently,
mount --bind /path /path
where /path is a subdirectory of a crossmnt export, can cause client
hangs, since the kernel detects that as a mountpoint, but nfs-utils'
is_mountpoint() function does not.
I don't see any sure-fire way to detect such mountpoints. But that's
OK: it's harmless to allow this upcall to succeed even when the
directory is not a mountpoint, so let's just remove this check.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
J. Bruce Fields [Tue, 22 Dec 2009 16:22:58 +0000 (11:22 -0500)]
mountd: move export lookup into separate function
Move this main loop to a separate function, to make it a little easier
to follow the logic of the caller.
Also, instead of waiting till we find an export to do the dns
resolution, do it at the start; it will normally be needed anyway, and
this simplifies the control flow.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
J. Bruce Fields [Mon, 14 Dec 2009 22:07:19 +0000 (17:07 -0500)]
exports: let kernel decide which flags vary by flavor
Query the kernel to ask which flags vary by pseudoflavor, and use that
instead of a fixed constant. To allow the possibility of more flags
varying by pseudoflavor, use the set/clear_flags functions for all
options instead of setting some by hand.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Jeff Layton [Tue, 12 Jan 2010 12:32:51 +0000 (07:32 -0500)]
gssd: on krb5 upcall, have gssd send a more granular error code
Currently if a krb5 context expires, GSSAPI authenticated RPC calls
start returning error (-EACCES in particular). This is bad when someone
has a long running job that's doing filesystem ops on a krb5 authenticated
NFS mount and just happens to forget to redo a 'kinit' in time.
The existing gssd always does a downcall with a '-1' error code if there
are problems, and the kernel always ignores this error code. Begin to
fix this by having gssd distinguish between someone that has no
credcache at all, and someone who has an expired one. In the case where
there is an existing credcache, have gssd downcall with an error code of
-EKEYEXPIRED. If there's not a credcache, then downcall with an error of
-EACCES.
We can then have the kernel use this error code to handle these
situations differently.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve Dickson <steved@redhat.com>
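The error selection described above amounts to a two-way choice. A minimal sketch, with a hypothetical enum and function name (EKEYEXPIRED is a Linux-specific errno value):

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical summary of what gssd found for the requesting user. */
enum cred_state {
	CRED_MISSING,	/* no credential cache at all */
	CRED_EXPIRED	/* a credential cache exists but has expired */
};

/* Hypothetical helper: pick the error code to send in the downcall. */
int downcall_error_sketch(enum cred_state state)
{
	assert(state == CRED_MISSING || state == CRED_EXPIRED);

	return state == CRED_EXPIRED ? -EKEYEXPIRED : -EACCES;
}
```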
Steve Dickson [Tue, 12 Jan 2010 11:03:22 +0000 (06:03 -0500)]
Added the following files to .gitignore
tests/nsm_client/nlm_sm_inter.h
tests/nsm_client/nlm_sm_inter_clnt.c
tests/nsm_client/nlm_sm_inter_svc.c
tests/nsm_client/nlm_sm_inter_xdr.c
Jeff Layton [Tue, 12 Jan 2010 01:27:54 +0000 (20:27 -0500)]
nfs-utils: add initial tests for statd that run via "make check"
Leverage the support that automake already has for running tests via
make check. Add a simple test that just checks that the statd mon and
unmon calls actually work.
Adding more tests should be a simple matter of adding new scripts
that exit 0 on success and non-zero on failure, and adding those to the
Makefile.am.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve Dickson <steved@redhat.com>
Jeff Layton [Tue, 12 Jan 2010 10:55:20 +0000 (05:55 -0500)]
nfs-utils: introduce new statd testing simulator
rpc.statd is often prone to subtle, difficult to detect breakage. When
it has problems, they're often invisible and only manifest themselves
as failed lock recovery.
This program is intended to function as part of a test harness for
statd. It's a multicall binary that serves as a synthetic NSM client
program, and a daemon that can simulate lockd for purposes of testing
the NSM to NLM downcall.
A new top level "tests/" directory is also added to nfs-utils to start
as a repository for automated tests of nfs-utils components.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve Dickson <steved@redhat.com>
Chuck Lever [Tue, 12 Jan 2010 00:10:49 +0000 (19:10 -0500)]
statd: Use the new nsm_ file.c calls in sm_notify
Replace open-coded accesses to on-disk NSM data with calls to the new
libnsm.a API.
One major change is that sync(2) is no longer called when the NSM
state number is updated at boot time. Otherwise sm-notify should
behave much the same as it did before.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steve Dickson <steved@redhat.com>
Chuck Lever [Tue, 12 Jan 2010 00:08:10 +0000 (19:08 -0500)]
libnsm.a: Introduce common routines to handle persistent storage
rpc.statd and sm-notify access the same set of files under
/var/lib/nfs/statd, but both have their own code base to handle this.
They should share this code.
In addition, the on-disk format used by statd and friends is
considered a formal interface, so this new code will codify the API
and provide documentation for it.
The shared code handles switching from the default parent statd
directory, reducing privileges at start-up, and managing the NSM
state files, in addition to handling normal operations on the
monitored host and notification lists on disk.
The new code is simply a copy of the same logic that was used in
rpc.statd and sm-notify, but wrapped in a nice API. There should be
minimal behavioral and no on-disk format changes with the new
libnsm.a code.
The new code is more careful to check for bad corner cases.
Occasionally this code may not allow an operation that was permitted
in the past, but hopefully the error reporting has improved enough
that it should be easy to track down any problems.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steve Dickson <steved@redhat.com>
Steve Dickson [Mon, 11 Jan 2010 23:26:41 +0000 (18:26 -0500)]
showmount: Try the highest mount version then fall back to lower ones
Showmount should try the highest mount version first, then fall
back to the lower ones when the server returns an RPC_PROGVERSMISMATCH
error. The idea is that avoiding the lower mount versions begins
the process of moving away from NFSv2 support.
Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steve Dickson <steved@redhat.com>
Jeff Layton [Mon, 4 Jan 2010 20:42:51 +0000 (15:42 -0500)]
mount.nfs: don't use IPv6 unless IPV6_SUPPORTED is set
Commit 1f3fae1fb25168aac187ff1881738c8ad53a8763 made mount.nfs start
looking up and trying to use IPv6 addresses when mount.nfs was built
against libtirpc (even when --enable-ipv6 wasn't specified).
The problem seems to be that nfs_nfs_proto_family() is basing the family
on HAVE_LIBTIRPC. I think it should be basing it on IPV6_SUPPORTED
instead.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steve Dickson <steved@redhat.com>
Chuck Lever [Wed, 23 Dec 2009 16:29:19 +0000 (11:29 -0500)]
libnsm.a: Move the sm_inter XDR pieces to libnsm.a
Clean up: Move the .x file and the generated C source for NSM to
libnsm.a, echoing the architecture of mountd and exportfs. This makes
the NSM protocol definitions, data types, and XDR routines available
to be shared across nfs-utils.
This simplifies the addition of other NSM-related code (for example
for testing or providing clustering support), and also provides
public data type definitions that can be used to make sense of the
contents of statd's on-disk database.
Because sim_sm_inter.x still resides in utils/statd, I've left some
rpcgen build magic in utils/statd/Makefile.am.
This is an internal organization change only. This patch should not
affect code behavior in any way.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steve Dickson <steved@redhat.com>
Chuck Lever [Fri, 11 Dec 2009 17:36:42 +0000 (12:36 -0500)]
statd: replace smn_{get,set}_port() with the shared equivalents
Use shared sockaddr port management functions instead of duplicating
this functionality in sm-notify. This is now easy because sm-notify
is linked with libnfs.a, where nfs_{get,set}_port() reside.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steve Dickson <steved@redhat.com>
Jeff Layton [Fri, 11 Dec 2009 18:05:06 +0000 (13:05 -0500)]
gssd: have gssd deal with scopeid field in upcall
Recent kernels (2.6.32) have started displaying the scopeid for some
addresses in the upcall. gssd doesn't know how to deal with them. Change
gssd to use getaddrinfo instead of inet_pton since that can deal with
scopeid's in addresses. That also allows us to eliminate the port
conversion in read_service_info.
If getaddrinfo returns an address with a non-zero sin6_scope_id however,
reject it. getnameinfo ignores that field and just uses the sin6_addr
part when resolving. But, two addresses that differ only in
sin6_scope_id could refer to completely different hosts.
Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Steve Dickson <steved@redhat.com>
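A sketch of the parsing rule above: let getaddrinfo(3) convert the presentation address (including any "%scope" suffix) and the port in one call, then reject results with a non-zero sin6_scope_id. The function names are hypothetical and this is not the read_service_info() code.

```c
#include <assert.h>
#include <netdb.h>
#include <stdbool.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* True if "sap" is an IPv6 address that carries a non-zero scope ID. */
bool has_scope_id(const struct sockaddr *sap)
{
	assert(sap != NULL);

	if (sap->sa_family != AF_INET6)
		return false;
	return ((const struct sockaddr_in6 *)sap)->sin6_scope_id != 0;
}

/*
 * Hypothetical helper: convert an address string and a port string
 * to a socket address. Scoped addresses are rejected because a later
 * getnameinfo() lookup would ignore the scope ID.
 */
bool parse_upcall_address_sketch(const char *addr, const char *port,
				 struct sockaddr_storage *out)
{
	struct addrinfo hints, *ai = NULL;
	bool ok;

	assert(addr != NULL && out != NULL);

	memset(&hints, 0, sizeof(hints));
	/* Numeric only: the upcall carries addresses, not hostnames. */
	hints.ai_flags = AI_NUMERICHOST | AI_NUMERICSERV;

	if (getaddrinfo(addr, port, &hints, &ai) != 0)
		return false;

	ok = !has_scope_id(ai->ai_addr);
	if (ok)
		memcpy(out, ai->ai_addr, ai->ai_addrlen);
	freeaddrinfo(ai);
	return ok;
}
```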
Chuck Lever [Fri, 11 Dec 2009 15:53:13 +0000 (10:53 -0500)]
NFS man page: update nfs(5) with details about IPv6 support
Add details to nfs(5) about how to specify raw IPv6 addresses when
mounting an NFS server. Mounting an IPv6 NFS server via hostname should
work as it does with IPv4.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steve Dickson <steved@redhat.com>
Chuck Lever [Fri, 11 Dec 2009 15:48:24 +0000 (10:48 -0500)]
mount.nfs: proto=netid forces address family when resolving server names
Using the netid settings, determine the correct address family to use
for NFS and MNT server name resolution. Use this family when
resolving the server name for the addr= and mountaddr= options.
This patch assumes the kernel can recognize a netid, instead of a
protocol name, as the value of the proto= options.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steve Dickson <steved@redhat.com>
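The netid-to-family step can be pictured with a small lookup table. This is a simplification and an assumption: the patch consults the local netconfig database when TI-RPC is available, whereas the sketch hard-codes the four common netids. The function name is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: map a netid from proto= or mountproto= to the
 * address family to use when resolving the server name. Returns
 * AF_UNSPEC for netids this sketch does not know.
 */
sa_family_t netid_to_family_sketch(const char *netid)
{
	static const struct {
		const char *netid;
		sa_family_t family;
	} table[] = {
		{ "udp",  AF_INET  },
		{ "tcp",  AF_INET  },
		{ "udp6", AF_INET6 },
		{ "tcp6", AF_INET6 },
	};
	size_t i;

	assert(netid != NULL);

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
		if (strcmp(netid, table[i].netid) == 0)
			return table[i].family;
	return AF_UNSPEC;
}
```

With this mapping, proto=tcp6 would restrict resolution of the addr= value to IPv6 addresses, while proto=tcp would restrict it to IPv4.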
Replacing sockaddr_storage makes this code less likely to break when
optimized by gcc. It also saves a significant amount of stack space
by replacing a 130 byte structure with a union that is less than 32
bytes.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steve Dickson <steved@redhat.com>
Chuck Lever [Fri, 11 Dec 2009 15:38:50 +0000 (10:38 -0500)]
mount.nfs: support netids in v2/v3 version/transport negotiation
When rewriting mount options during v2/v3 negotiation, restore the
correct netids, rather than protocol names, in the rewritten protocol
options. If TI-RPC is not available, the traditional behavior is
preserved.
This patch assumes the kernel can recognize a netid, instead of a
protocol name, as the value of the proto= options.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steve Dickson <steved@redhat.com>
Chuck Lever [Fri, 11 Dec 2009 15:37:02 +0000 (10:37 -0500)]
mount.nfs: support netids in nfs_options2pmap()
When parsing mount options in nfs_options2pmap(), treat the value of
proto= (and mountproto=) as a netid by looking it up in local
netconfig and protocol databases to convert it to a protocol number.
If TI-RPC is not available, the traditional behavior is preserved.
The meaning of the "udp" and "tcp" mount options is not affected by
this change.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steve Dickson <steved@redhat.com>
Neil Brown [Mon, 7 Dec 2009 22:23:48 +0000 (17:23 -0500)]
mount.nfs: Retry v4 mounts with v3 on ENOENT errors
Retry v4 mounts with a v3 mount when the version
is not explicitly specified and the mount fails
with ENOENT. This will help deal with Linux servers
that do not automatically export a pseudo root.
The nfsmount() function checks if !bg before running
switch(rpc_createerr.cf_stat). On the other hand, the nfs4mount()
function does not, and results in exiting the loop on the first
iteration even with the bg mount option.
NOTE: This and the previous patch ("nfs-utils: mount options can be lost
when using bg option") are relevant to non text-based mount options.
See https://bugzilla.redhat.com/show_bug.cgi?id=529370 for details.
Signed-off-by: Harshula Jayasuriya <harshula@redhat.com> Signed-off-by: Steve Dickson <steved@redhat.com>
When mounting an NFS export *without* the "bg" option, try_mount() is
called only once. Before calling it, the variables mount_opts and
extra_opts are set up. try_mount() then calls nfsmount(), which
assumes that these variables can be modified. Most significantly, it
allows the variable extra_opts to be modified.
When the "bg" mount option is used *and* the first try_mount() attempt
fails, the process is daemonized and try_mount() is called again.
Unfortunately, by then the required mount options in the variable
extra_opts have been lost.
See https://bugzilla.redhat.com/show_bug.cgi?id=529370 for details.
Signed-off-by: Harshula Jayasuriya <harshula@redhat.com> Signed-off-by: Steve Dickson <steved@redhat.com>
Robert Gordon [Mon, 16 Nov 2009 18:25:02 +0000 (13:25 -0500)]
relax insecure option on mountd
In nfs-utils 1.2.0, I noticed that the insecure option validates that
the client port is a subset of IPPORT_RESERVED as opposed to just
validating it is a valid reserved port. The following proposed patch
would correct that issue.
Acked-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Robert Gordon <rbg@openrbg.com> Signed-off-by: Steve Dickson <steved@redhat.com>
Add processing of the "service=" attribute in the new gssd upcall.
If "service" is specified, then the kernel is indicating that
we must use machine credentials for this request. (Regardless
of the uid value or the setting of root_uses_machine_creds.)
If the service value is "*", then any service name can be used.
Otherwise, it specifies the service name that should be used.
(For now, the values of service will only be "*" or "nfs".)
Restricting gssd to use "nfs" service name is needed for when
the NFS server is doing a callback to the NFS client. In this
case, the NFS server has to authenticate itself as "nfs" --
even if there are other service keys such as "host" or "root"
in the keytab.
Another case when the kernel may specify the service attribute
is when gssd is being asked to create the context for a
SETCLIENT_ID operation. In this case, machine credentials
must be used for the authentication. However, the service name
used for this case is not important.
Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: Steve Dickson <steved@redhat.com>
Add processing of the "target=" attribute in the new gssd upcall.
Information in this field is used to construct the gss service name
of the server for which gssd will create a context.
This, along with the next patch handling "service=", is needed
for callback security.
For Kerberos, the NFS client will use a service principal present
in its keytab during authentication of the SETCLIENT_ID operation.
When establishing the context for the callback, the gssd on the
NFS server will attempt to authenticate the callback against the
principal name used by the client.
Note: An NFS client machine must have a keytab for the callback
authentication to succeed.
Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: Steve Dickson <steved@redhat.com>
Add support for handling the new client-side upcall. The kernel,
beginning with 2.6.29, will attempt to use a new pipe, "gssd",
which can be used for upcalls for all gss mechanisms.
The new upcall is text-based with an <attribute>=<value> format.
Attribute/value pairs are separated by a space, and terminated
with a new-line character.
The initial version has two required attributes,
mech=<gss_mechanism_name> and uid=<user's_UID_number>, and two
optional attributes, target=<gss_target_name> and service=<value>.
Future kernels may add new attribute/value pairs.
Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: Steve Dickson <steved@redhat.com>
For convenience, add the full name of the upcall pipe being processed.
(Distinguishes between "normal" upcall, and a callback upcall.)
Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: Steve Dickson <steved@redhat.com>
gssd: add upcall support for callback authentication
Change the processing so that all subdirectories within the rpc_pipefs
directory are treated equally. Any "clnt" directories that show up
within any of them are processed. (As suggested by Bruce Fields.)
Note that the callback authentication will create a new "nfs4d_cb"
subdirectory. Only new kernels (2.6.29) will create this new directory.
(The need for this directory will go away with NFSv4.1 where the
callback can be done on the same connection as the fore-channel.)
Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: Steve Dickson <steved@redhat.com>
Split out the processing for a pipe to a separate routine. The next
patch adds a new pipe to be processed.
Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: Steve Dickson <steved@redhat.com>
This patch adds the krb5 hostbased principal name, which the
NFS client used to authenticate, to the svcgssd downcall
information. This information is needed for the callback
authentication.
When establishing the callback, nfsd will pass the principal
name in the upcall to the gssd. gssd will acquire a service
ticket for the specified principal name.
Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: Kevin Coffman <kwc@citi.umich.edu> Signed-off-by: Steve Dickson <steved@redhat.com>
Steve Dickson [Thu, 12 Nov 2009 19:16:12 +0000 (14:16 -0500)]
Remove the AI_ADDRCONFIG hint flag to getaddrinfo() when it's
called by nfsd to set up the file descriptors that are
sent to the kernel. The flag causes getaddrinfo()
to fail with EAI_NONAME when no non-loopback
network interface is configured.
Steve Dickson [Tue, 3 Nov 2009 14:49:03 +0000 (09:49 -0500)]
Retry v4 mounts with a v3 mount when the version
is not explicitly specified and the mount fails
with ENOENT. This will help deal with Linux servers
that do not automatically export a pseudo root.
Steve Dickson [Sat, 17 Oct 2009 13:16:18 +0000 (09:16 -0400)]
Introducing the parsing of both 'defaultvers' and 'defaultproto'
config variables which will be used to set the default
version and network protocol.
A global variable will be set for each option with the
corresponding value. The value will be used as the
initial value in the server negotiation.
Steve Dickson [Fri, 9 Oct 2009 13:19:39 +0000 (09:19 -0400)]
There are a number of different mount options that can be
used to set the protocol version on the command line. The
config file code needs to know about each option so the
command line value will override the config file value.
Chuck Lever [Tue, 29 Sep 2009 14:38:52 +0000 (10:38 -0400)]
mount: Support negotiation between v4, v3, and v2
When negotiating between v3 and v2, mount.nfs first tries v3, then v2.
Take the same approach for v4: try v4 first, then v3, then v2, in
order to get the highest NFS version both the client and server
support.
No MNT request is needed for v4. Since we want to avoid an rpcbind
query for the v4 attempt, just go straight for mount(2) without a MNT
request or rpcbind negotiation first. If the server reports that v4
is not supported, try lower versions.
The decisions made by the fg/bg retry loop have nothing to do with
version negotiation. To avoid a layering violation, mount.nfs's
multi-version negotiation strategy is wholly encapsulated within
nfs_try_mount(). Thus, code duplication between nfsmount_fg(),
nfsmount_parent(), and nfsmount_child() is avoided.
For now, negotiating version 4 is supported only on kernels that can
handle the vers=4 option on type "nfs" file systems. At some point
we could also allow mount.nfs to switch to an "nfs4" file system in
this case.
Since mi->version == 0 can now mean v2, v3, or v4, limit the versions
tried for RDMA mounts. Today, only version 3 supports RDMA.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steve Dickson <steved@redhat.com>
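The negotiation order can be sketched as a loop over candidate versions. Everything named here is an assumption made for illustration: negotiate_version_sketch() is not nfs_try_mount(), the callback stands in for the real mount attempt, and EPROTONOSUPPORT is assumed to be the "version not supported" signal. fake_try_mount() exists only so the sketch can be exercised without a server.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: attempt a mount with one NFS version. */
typedef int (*try_mount_fn)(int version, void *arg);

/*
 * Hypothetical helper: try v4, then v3, then v2 (only v3 for RDMA).
 * Returns the version that worked, or -1 if none did.
 */
int negotiate_version_sketch(bool rdma, try_mount_fn try_mount, void *arg)
{
	static const int all[] = { 4, 3, 2 };
	static const int rdma_only[] = { 3 };
	const int *versions = rdma ? rdma_only : all;
	size_t count = rdma ? 1 : 3;
	size_t i;

	assert(try_mount != NULL);

	for (i = 0; i < count; i++) {
		if (try_mount(versions[i], arg) == 0)
			return versions[i];
		/* Only "version not supported" moves on to a lower version. */
		if (errno != EPROTONOSUPPORT)
			return -1;
	}
	return -1;
}

/*
 * Stand-in for a real mount attempt: "arg" points to the highest
 * version the pretend server supports.
 */
int fake_try_mount(int version, void *arg)
{
	int highest = *(const int *)arg;

	if (version <= highest)
		return 0;
	errno = EPROTONOSUPPORT;
	return -1;
}
```

Keeping the loop in one function mirrors the point made above: the fg/bg retry logic calls it as a unit and never needs to know which versions were tried.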
Chuck Lever [Tue, 29 Sep 2009 14:38:05 +0000 (10:38 -0400)]
The user's mount options and the set of versions to try should not
change over the course of mount retries.
With this patch, each version-specific mount attempt is
compartmentalized, and starts from the user's original mount options
each time.
Thus these attempts can now be safely performed in any order,
depending on what the user has requested, what the server advertises,
and what is up and running at any given point.
Don't regress the fix in commit 23c1a452. For v2/v3 negotiation, only
the user's mount options are written to /etc/mtab, and not any options
that were negotiated by mount.nfs. There's no way to guarantee that
the server configuration will be the same at umount time as it was at
mount time.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Steve Dickson <steved@redhat.com>
nfs-utils: nfs-iostat.py autofs cleanup and option to sort by ops/s
Adds --sort option to display mount point stats sorted by ops/s
Adds --list=<n> option to only display stats for first <n> mount points
E.g. the use of "--sort --list=1" should be useful in seeing stats for
only the mountpoint with the highest ops/s.
Signed-off-by: Lans Carstensen <Lans.Carstensen@dreamworks.com> Signed-off-by: Steve Dickson <steved@redhat.com>