[CLUE-Tech] Problems with FTP

Dave Anselmi anselmi at americanisp.net
Tue Mar 26 20:38:33 MST 2002


"Shapiro, Gary L" wrote:

> I have been having problems receiving files via FTP from another computer,
> and I wonder whether anyone has had this experience:
>
> I am running a program on a Windows NT computer which is automatically
> receiving files from my Linux computer, processing them, then sending files
> via FTP back to the Linux computer. Periodically, the process "hangs" on a
> "put" when the sending computer (the NT) does not receive verification from
> the Linux computer of completion of file transfer (even though it has
> completed). The FTP documentation says that this can happen if the receiving
> computer violates the FTP protocol.

Well, if you aren't using a well-known ftpd on Linux, it might not follow the
spec.  I'd bet that the well-known servers do.

If the transfer has completed, then the Linux box must have closed the
connection (on a timeout if nothing else).  If NT is hung, it probably missed
some traffic after it finished sending.  Does NT time out?  If not, that seems
broken.  If the issue is that you need to know the transfer was complete, or
resend, then perhaps the network isn't reliable enough for your application.
There may be other, more robust protocols you could use.
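
To make the timeout question concrete, here is a minimal sketch in Python using
the standard ftplib module.  It is not the NT program's code: the host,
credentials, file names, the 60-second timeout, and both helper names are
placeholders I made up for illustration.  The idea is only that the client
should give up with an error rather than wait forever for the final reply on
the control channel.

```python
import ftplib


def is_transfer_complete(reply):
    """Return True if a final FTP reply reports success.

    RFC 959 uses 226 (closing data connection, action successful)
    or 250 (requested file action okay) after a completed transfer.
    """
    return reply[:3] in ("226", "250")


def put_with_timeout(host, user, password, local_path, remote_name,
                     timeout=60.0):
    """Upload one file and return the server's final reply string.

    All arguments are hypothetical placeholders.  The timeout is applied
    to the control connection and to the data connection, so a missing
    reply raises an OSError subclass (socket.timeout) instead of hanging.
    ftplib itself raises ftplib.Error subclasses on non-2xx replies.
    """
    with ftplib.FTP(host, timeout=timeout) as ftp:
        ftp.login(user, password)
        with open(local_path, "rb") as fh:
            # storbinary() sends the file, then waits for the final reply.
            return ftp.storbinary("STOR " + remote_name, fh)
```

A caller could wrap put_with_timeout() in a try/except for OSError and
ftplib.Error, then decide whether to resend.  Whether a resend is safe depends
on the application, which is the reliability question raised above.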


> The only way to continue the program
> without restarting is to "kill" the daemon from the Linux side.

Hmm.  So when the control channel goes down, the application continues, most
likely because its ftp process returns (does it recognize that it may have hit
an error?).  But Linux hasn't dropped the control channel on a timeout.  I wonder
if the NT side isn't finishing the transfer properly?  Can you sniff the packets
by any chance?  Both ends over many connections would be best.  It may take some
learning to know what a good transfer looks like, but it may also make the
problem completely visible.


> Also, at some
> times, I suddenly start getting a message back from the Linux machine which
> says "connection refused". I then have to abort the program and I can't send
> anything to the Linux machine from anywhere for several minutes, after which
> it will usually clear itself. I am running Redhat version 6.2 on an HP
> Visualize PL-class workstation.

Any chance the NT code is reusing ports?  If you've aborted the NT program and
the Linux sockets are still open you'll have to wait for them to time out before
you can send to them again.  Typically a client program uses a different local
port on each successive connection attempt, so that isn't a problem.  But if
you insist on using the same ports at both ends, you can't have two connections
at once (even though it's only on one machine).  A sniffer would show that, too.
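
The port point can be checked on a single machine.  Below is a rough sketch in
Python using only the standard socket module; the function names are my own,
and it talks to a throwaway listener on 127.0.0.1, not to a real ftpd.  The
first function shows the OS handing out a different local port to each of two
simultaneous connections, and the second shows that insisting on an
already-bound local port fails.

```python
import socket


def local_ports_for_two_connections():
    """Open two simultaneous connections to a throwaway local listener
    and return the client-side port the OS chose for each."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a port
    listener.listen(2)
    target = listener.getsockname()
    first = socket.create_connection(target)
    second = socket.create_connection(target)
    try:
        return first.getsockname()[1], second.getsockname()[1]
    finally:
        first.close()
        second.close()
        listener.close()


def second_bind_fails():
    """Bind two sockets to the same local address and port.

    Returns True if the second bind is rejected (address already in use),
    which is the expected result when SO_REUSEADDR is not set.
    """
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))
        except OSError:
            return True
        return False
    finally:
        first.close()
        second.close()
```

If the NT program pins its local port like the second case, a stale socket on
either side would block the next attempt until it times out.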

HTH, let us know if you figure it out.

Dave




