[clue-tech] nfs frustrations

mhavlicek1 at yahoo.com mhavlicek1 at yahoo.com
Thu Jun 4 08:39:43 MDT 2009


--- On Wed, 6/3/09, Angelo Bertolli <angelo.bertolli at gmail.com> wrote:

> From: Angelo Bertolli <angelo.bertolli at gmail.com>
> Subject: Re: [clue-tech] nfs frustrations
> To: "CLUE technical discussion" <clue-tech at cluedenver.org>
> Date: Wednesday, June 3, 2009, 8:22 PM
> Yes.  I think that would be the next technology we move to, because
> network connection speeds seem to increase at a faster pace than these
> other attachment protocols do (at least for now).  But I'm not sure if
> that's always the case, or if they just take turns getting better.

Have you looked at the RFCs? 

-Mike

> I'm not sure how we would treat the mount points then, though; I've
> only looked at iSCSI briefly.  I don't expect it to do any "filesystem
> stuff", so I expect we will still have to use either NFS or a different
> filesystem.
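> 
> (From the quick look I took, the rough shape with open-iscsi seems to
> be something like this; the target name and portal address here are
> made up:
> 
>     # discover and log in to a target; it shows up as a plain block
>     # device (e.g. /dev/sdX) that still needs its own filesystem
>     iscsiadm -m discovery -t sendtargets -p 192.168.1.50
>     iscsiadm -m node -T iqn.2009-06.com.example:jbod0 -p 192.168.1.50 --login
>     mkfs.ext3 /dev/sdX
>     mount /dev/sdX /mnt/jbod0
> 
> so it just hands you a block device, and the filesystem question is
> still ours to answer.)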
> 
> I hear GPFS is the shizznit.  But there's no chance of us buying that,
> especially with the price on a "per core" basis.  Sheesh!
> 
> Nate Duehr wrote:
> > Any chance you could move to iSCSI?  Treat them more like a SAN?
> > 
> > -- Nate Duehr
> > Sent from my iPhone
> > 
> > On Jun 3, 2009, at 17:24, Angelo Bertolli <angelo.bertolli at gmail.com>
> > wrote:
> > 
> >> Nate Duehr wrote:
> >>> On Wed, 03 Jun 2009 17:22 -0400, "Angelo Bertolli"
> >>> <angelo.bertolli at gmail.com> wrote:
> >>> 
> >>>> Ok so none of the options on an NFS mount do what I want them to
> >>>> do.  Maybe automount is the only solution, but for regular nfs...
> >>>> 
> >>> 
> >>> autofs will mount and unmount things as they're "used", but it adds
> >>> some wait time when you first use the remote filesystem.  I also
> >>> forget how to tell it how long to wait before unmounting... it's
> >>> been years since I had to deal with developers' machines that used
> >>> it to auto-mount the development playground server...
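> >>>
> >>> (If I remember right, the idle timeout is set in the master map; a
> >>> rough sketch, with made-up paths and map names:
> >>>
> >>>     # /etc/auto.master -- unmount idle mounts after 60 seconds
> >>>     /remote    /etc/auto.remote    --timeout=60
> >>>
> >>> but double-check the autofs docs, I'm going from memory here.)
> >>>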
> >>> But if the problem really is network connectivity going away, it
> >>> won't help... the NFS mount will still be "hung" by bad network
> >>> connectivity.  NFS is from another era and assumes networks are
> >>> perfect.  When they're not, NFS becomes highly annoying.
> >>> 
> >> 
> >> That's exactly what I was thinking when I was going through this:
> >> it runs based on the assumption that the connectivity never goes
> >> away.
> >> 
> >>>> 2) There doesn't seem to be any way to tell NFS to fail within 1
> >>>> minute.  I know the maximum retrans timeout is supposed to be 60
> >>>> seconds, but even after tweaking it:
> >>>> 
> >>>> When a mount is unavailable (I'm using ls to test) ...
> >>>>    - soft/hard doesn't seem to make any difference (I'm using
> >>>>      ro,noexec)
> >>>>    - retrans, timeo, retry don't seem to make any difference no
> >>>>      matter what settings I use
> >>>> 
> >>> 
> >>> timeo should work, but it requires that there actually be file
> >>> access going on... if the mount is "quiet", it has no idea that
> >>> there's something to "timeout", so to speak.  If you already have
> >>> network issues going on, using soft would make your life a living
> >>> hell.  I highly recommend against it, unless you enjoy I/O errors
> >>> in your application level code.  (GRIN!)
> >>> 
> >> 
> >> I only played with timeo a little bit.  But the default is already
> >> at 0.7 seconds, and the behavior has always been at least 3 minutes,
> >> which doesn't match the 60-second rule in the man page, unless it's
> >> climbing up toward 60 seconds on each of the 3 tries (the retrans
> >> setting).  But that's not what it says.  The default says after the
> >> first transmission problem, wait 0.7 seconds, then wait 1.4 seconds,
> >> then wait 2.8 seconds...  unless of course it ONLY has a minor
> >> timeout when TCP times out, which would make timeo almost pointless
> >> unless you wanted really long wait times.
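> >>
> >> For what it's worth, this is the sort of line I've been testing with
> >> in fstab (server name and path made up; timeo is in tenths of a
> >> second, so timeo=100 should mean a 10-second wait per try):
> >>
> >>     filer1:/export/data  /mnt/data  nfs  ro,noexec,soft,intr,timeo=100,retrans=3  0 0
> >>
> >> One thing I'm not sure about: if these mounts are over TCP, I believe
> >> the default timeo is 600 (60 seconds) rather than 7, and a few
> >> retries at that rate would roughly add up to the 3 minutes I keep
> >> seeing.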
> >> 
> >> I'm ok with not caring about the mount if it's not being used.  It's
> >> just that when something goes to use it, I want it to return with
> >> some feedback within 60 seconds so I can do something like... write a
> >> script that tests all the mounts (rough sketch below).  We have over
> >> 400 NFS mounts on one machine, so I really can't wait 3 minutes for
> >> each one.  What I've decided to do for the time being is to test only
> >> the first mount per host in fstab, which gets me down to under 60.
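> >>
> >> Something along these lines is what I have in mind (an untested
> >> sketch; it assumes fstab lines look like "server:/export  /mountpoint
> >> nfs ...", and the flag-file path is made up):
> >>
> >>     #!/bin/sh
> >>     # Check one mount point per NFS server listed in fstab, waiting at
> >>     # most 60 seconds for each.  The ls runs in a background subshell
> >>     # that drops a flag file when it finishes, because a hung NFS ls
> >>     # can sit in disk-wait forever (and may not even be killable), so
> >>     # the safest thing is to note the timeout and move on.
> >>     i=0
> >>     awk '$3 == "nfs" { split($1, s, ":"); if (!seen[s[1]]++) print $2 }' /etc/fstab |
> >>     while read mnt; do
> >>         i=$((i + 1))
> >>         flag="/tmp/nfscheck.$$.$i"
> >>         ( ls "$mnt" >/dev/null 2>&1 && touch "$flag.ok" || touch "$flag.err" ) &
> >>         n=0
> >>         while [ $n -lt 60 ] && [ ! -e "$flag.ok" ] && [ ! -e "$flag.err" ]; do
> >>             sleep 1
> >>             n=$((n + 1))
> >>         done
> >>         if   [ -e "$flag.ok"  ]; then echo "OK      $mnt"
> >>         elif [ -e "$flag.err" ]; then echo "IOERROR $mnt"
> >>         else                          echo "TIMEOUT $mnt"
> >>         fi
> >>         rm -f "$flag.ok" "$flag.err"
> >>     done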
> >> 
> >> 
> >>>>    - I've tried at least 10 combinations of the above, and ls
> >>>>      returns with an IO error within 3 - 5 minutes every time.
> >>>> 
> >>> 
> >>> I've also farted around with it in the past.  There were a number
> >>> of implementation bugs in Linux NFS stacks over the years.  Those
> >>> weren't very helpful at the time.  Maybe they've cleaned those up.
> >>> The strongest NFS implementation has always been the one in Solaris,
> >>> but like many things Solaris, it traded features for robustness...
> >>> and you still couldn't really do anything about "hung" NFS mounts
> >>> very well.
> >>> 
> >>> 
> >>>> Oh well.  We're using nfs3.  Should I expect different behavior
> >>>> from nfs4?
> >>>> 
> >>> 
> >>> Doubt it.  NFSv4 really only dealt with authentication issues, and
> >>> is kinda a "too little, too late" approach to fixing things with
> >>> NFS.
> >>> I think other network filesystems, even the venerable and possibly
> >>> hated CIFS ("Windows shares") handle network outages better.  But
> >>> there's a whole new world of problems there... filenames,
> >>> permissions, ownership... Samba can also drive someone mad given
> >>> the wrong set of requirements for group access or other weird
> >>> requests.
> >>> 
> >>> I guess the only GOOD thing about NFS is that it certainly shows
> >>> you if your network or servers aren't up to snuff.  If you can fix
> >>> the root-cause connectivity problems, it's plenty fast and maps
> >>> better to unix permissions and other things "Linuxy", but eventually
> >>> NFS does drive one mad when network or server issues are happening.
> >>> 
> >>> I've always wanted to try out OpenAFS, but I can't think of a good
> >>> need I have for it right now...
> >> 
> >> Well actually, the network is pretty good, and machines rarely just
> >> crash.  Our biggest problem is storage hardware:  we have some
> >> problematic JBODs that were built poorly and are underpowered.  Once
> >> in a while they end up dropping all their devices.  Oops!  Then some
> >> user comes and does an ls on one of these guys via ftp and the
> >> system hangs, so they try again, etc.  Maybe the simple addition of
> >> intr will help with this problem, because it allows the hung I/O
> >> request to be interrupted and go away more easily.
> >> 
> >> I admit, I've been testing this by shutting off the nfs server, not
> >> by making the devices disappear.  But I figured that was a good
> >> test.
> >> 
> >> So far some kind of automounter sounds like the only possibility (if
> >> there's anything we can do besides just getting rid of these JBODs),
> >> but I've been told that the automounters aren't really able to
> >> unmount things that are having problems either (as you mentioned
> >> above).
> >> 
> >> Angelo
> >> 