[CLUE-Tech] Linux TCP sockets SYN -> long delay waiting for tcp_syn_retries

Jim Ockers ockers at ockers.net
Mon Oct 4 15:19:50 MDT 2004


Hi Dave/everyone,

> You didn't say what telnet you're using and there seem to be several for 
> Linux.  Perhaps you can find a "better" one or one that allows you to 
> configure the part that's causing problems.

Actually the problem is not with telnet; it's with a Linux JMS client,
but telnet exhibits the same behavior.  I am investigating what
we can do at the kernel/system level to fix this.  The telnet we are
using for the testing is just the stock /usr/bin/telnet from Red Hat 7.2
and RHEL-WS3.  (They behave the same.)

> Jim Ockers wrote:
> [...]
> > According to ethereal here's what happens for the Windows attempt:
> > 
> > SYN is sent to 6.7.8.9:23
> > RST is received from 6.7.8.9:23
> > SYN sent
> > RST received
> > SYN sent
> > RST received
> > ...then the telnet command returns and the error message above is 
> > printed.  Elapsed time is 8-9 seconds.
> 
> What's the time between SYNs (all the same or using a back-off algorithm)?

I'm glad you asked - I think this is important.  Here is what ethereal
says about the timing for Linux telnet to a closed port:

Time (s)  Event
  0.0     SYN sent (initial)
  1.4     RST received
  3.0     SYN sent, retry 1
  4.4     RST received
  9.0     SYN sent, retry 2
  9.9     RST received
 21.0     SYN sent, retry 3
 22.4     RST received
 45.0     SYN sent, retry 4
 45.9     RST received
 93.0     SYN sent, retry 5
 94.8     RST received
194.5     [telnet process exits (this is not shown in ethereal of course)]

I'm not sure why it takes 100 seconds for the telnet process to exit
after it gets the 6th RST in a row.  The JMS client shows the same
delay when giving up on the defunct server.
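
One possible reading of the SYN timing in the Linux trace, as a rough
sketch only: the send times (0, 3, 9, 21, 45, 93) fit a 3-second first
timeout that doubles after every unanswered SYN.  The 3-second starting
value and the doubling are inferred from the trace, not checked against
the kernel source, and syn_send_times is just an illustrative name.

```python
def syn_send_times(initial_timeout=3.0, retries=5):
    """Model the SYN send times for a doubling retry timeout.

    Assumption (inferred from the trace, not from kernel source): the
    first retry fires after initial_timeout seconds and the wait doubles
    after each retry.  Returns (send_times, give_up_time), where
    give_up_time is when the last doubled timeout would expire.
    """
    times = [0.0]
    timeout = initial_timeout
    for _ in range(retries):
        times.append(times[-1] + timeout)
        timeout *= 2  # assumed back-off: double the wait each time
    return times, times[-1] + timeout


times, give_up = syn_send_times()
print(times)    # [0.0, 3.0, 9.0, 21.0, 45.0, 93.0]
print(give_up)  # 189.0
```

If that model is right, the connect attempt would be abandoned roughly
189 seconds after the first SYN, i.e. one more doubled wait after retry
5.  That is in the neighborhood of the 194.5 seconds observed but does
not match it exactly, so treat it as a guess rather than an explanation.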

Here is what Windows does for the exact same test:

Time (s)  Event
  0.0     SYN sent (initial)
  1.4     RST received
  1.8     SYN sent, retry 1
  3.1     RST received
  3.5     SYN sent, retry 2
  4.4     RST received
  8.0     [telnet command exits (this is not shown in ethereal of course)]

> the right options).  If it is long then the bug is in the telnet code 
> and you need to debug it to find the cause and whether there might be a 
> reason for it (or use different code).

I wonder about that, since it's the stock telnet, and the JMS client
doesn't use the same code as telnet (maybe the same library though).
Over our iDirect satellite system everything is fast.  On the LAN
everything is fast.  This one VSAT network seems to break something
on Linux, but the ethereal traces look the same to me (except for 
the timing of course).  As I mentioned before, Windows telnet and
TCP programs work fine on both VSAT systems and the LAN.

> Seems broken that something would wait a long time after being told a 
> port is closed.  But it seems broken that several SYNs would be sent 
> after a RST ('course I don't know nuthin bout the TCP specs).

/proc/sys/net/ipv4/tcp_syn_retries is set to 5 by default; you can
set it lower.  Windows does 2 retries (as observed in ethereal).
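
For reference, a minimal sketch of reading and changing that setting
from a script.  The /proc path is the one named above; the helper names
are made up for illustration, writing requires root, and the change is
assumed not to survive a reboot unless it is also made persistent
elsewhere.

```python
from pathlib import Path

# Path taken from the message above; everything else here is illustrative.
SYN_RETRIES = Path("/proc/sys/net/ipv4/tcp_syn_retries")


def read_syn_retries(path=SYN_RETRIES):
    """Return the current SYN retry count as an integer."""
    return int(Path(path).read_text().strip())


def write_syn_retries(value, path=SYN_RETRIES):
    """Set the SYN retry count (needs root when used on the real /proc file)."""
    if value < 1:
        raise ValueError("retry count must be at least 1")
    Path(path).write_text(f"{value}\n")


if __name__ == "__main__":
    # Only touch /proc when it actually exists (i.e. on Linux).
    if SYN_RETRIES.exists():
        print("tcp_syn_retries =", read_syn_retries())
```

The path argument exists only so the helpers can be tried against an
ordinary file before pointing them at the real setting.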

I had the bright idea to try iptables on the server host with -j DROP
and with -j REJECT using different --reject-with types.  I was hoping
that if a TCP RST wasn't enough of a rejection, something else might be
stronger/quicker.  Here is how long telnet took to give up in each case:

-j DROP: 3m13.434s
-j REJECT --reject-with tcp-reset: 3m14.469s
-j REJECT --reject-with icmp-port-unreachable: 3m12.218s
-j REJECT --reject-with icmp-net-unreachable: 3m13.085s
-j REJECT --reject-with icmp-host-unreachable: 3m16.970s
-j REJECT --reject-with icmp-proto-unreachable: 3m15.092s
-j REJECT --reject-with icmp-host-prohibited: 3m13.085s
-j REJECT --reject-with icmp-net-prohibited: 3m13.100s

all of which made for a really boring half hour.  Interestingly
the DROP took the same amount of time for telnet to exit as the
others.
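
A test like this could also be timed without telnet.  The following is
only a sketch of that idea, not what was actually run for the numbers
above: time_connect is an illustrative helper that makes one TCP connect
attempt and reports how long it took and how it ended.

```python
import socket
import time


def time_connect(host, port, timeout=None):
    """Make one TCP connect attempt; return (elapsed_seconds, outcome).

    With timeout=None the attempt is left to the kernel's own SYN retry
    behavior; a number caps the wait at the application level instead.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except socket.timeout:
        # Must come before OSError: socket.timeout is a subclass of it.
        outcome = "timed out"
    except OSError as exc:
        outcome = f"failed: {exc.strerror or exc}"
    return time.monotonic() - start, outcome
```

Calling time_connect("6.7.8.9", 23) once per iptables rule would give
one elapsed time per rule, and passing a timeout value shows how long a
client waits when it sets its own limit rather than relying on
tcp_syn_retries.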

When I set tcp_syn_retries to 1, the telnet process exits in
12.3 seconds.  I think there is a good reason to use 2 or more
SYN retries, so I'm not sure setting it to 1 is a good solution.  Also,
you'll note from the timing above that the RST for the first retry is
received after only 4.4 seconds, but the telnet process takes roughly
another 8 seconds to exit after receiving that second RST.

Thanks for any more suggestions/ideas,
Jim

-- 
Jim Ockers, P.Eng. (ockers at ockers.net)
Contact info: please see http://www.ockers.net/


