[CLUE-Tech] HP ethernet switch UDP broadcast storm

Chris Schock black at clapthreetimes.com
Tue May 11 09:52:03 MDT 2004


When you say "mesh" are you referring to redundant links between switches?
If so, I'm assuming you're using spanning tree to remove loops.

Even though it was a layer 3 broadcast it got treated as a layer 2
broadcast: an IP datagram sent to the subnet broadcast address goes out
in a frame addressed to the Ethernet broadcast MAC (ff:ff:ff:ff:ff:ff),
and a switch never has a MAC table entry for that address. At least, it
shouldn't. :) So that would explain why you saw a SINGLE copy of the
particular packet all over, but it doesn't explain why you continued to
see it flooded all over.
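
As a quick illustration of why every switch floods it, here's a minimal
Python sketch (the payload and port 9 are arbitrary placeholders; the
addresses are the ones from Jim's capture). Any UDP datagram sent to
the subnet broadcast address leaves the host with destination MAC
ff:ff:ff:ff:ff:ff, which no switch will ever have learned on a port:

    # Minimal sketch: a UDP datagram to the subnet broadcast address is
    # framed to the Ethernet broadcast MAC, so every switch floods it.
    import socket

    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)  # allow broadcast
    s.sendto(b"test", ("172.16.255.255", 9))  # dst MAC: ff:ff:ff:ff:ff:ff
    s.close()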

Was anyone in your closet plugging stuff in at the time this happened? Any
possibility that someone created another loop in the network? Was anyone
plugging in switches anywhere else in the network?

Sounds like there was a loop someplace, or a new switch with a better
(numerically lower) bridge priority got plugged in and spanning tree got
confused. That theory gains credence when you say the problem was fixed
by disconnecting the "mesh" ports, which would physically break any
existing loop and also force spanning tree to recalculate.

Hope that helps... it's hard to tell exactly what happened. I've seen
spanning tree fail before, particularly when someone plugs cables in and
out faster than spanning tree can converge.
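
For reference, classic 802.1D is slow to settle: a topology change can
take up to max_age + 2 x forward_delay with default timers. A quick
back-of-the-envelope in Python (timer values are the 802.1D defaults):

    # Worst-case classic STP convergence with default timers.
    max_age, forward_delay = 20, 15           # seconds (802.1D defaults)
    worst_case = max_age + 2 * forward_delay  # blocking -> listening -> learning
    print(worst_case)                         # 50 seconds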

Even a switch someone plugs in at their own desk has the potential to
cause problems: if that shiny new switch advertises a better (lower)
bridge priority, it wins the root election and triggers the whole tree
to recalculate.
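
To be precise about "priority": in 802.1D the bridge ID is the pair
(priority, MAC) and the numerically lowest ID wins the root election,
so a factory-fresh switch configured with a low priority can steal root.
A toy sketch of the election (the bridge IDs below are made up):

    # Toy 802.1D root election: lowest (priority, MAC) tuple wins.
    bridges = [
        (32768, "00:01:e7:aa:aa:aa"),  # core switch, default priority
        (32768, "00:01:e7:bb:bb:bb"),  # edge switch, default priority
        (4096,  "00:0c:29:cc:cc:cc"),  # shiny new office switch
    ]
    root = min(bridges)  # Python compares priority first, then MAC
    print("root bridge:", root)  # the new switch wins; the tree recalculates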

Fortunately this doesn't happen very often.
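
If it ever happens again, one way to confirm who is really sourcing the
storm is to tally the source MACs of broadcast frames in your capture.
A minimal sketch that reads the PCAP with scapy (the library choice and
filename are my assumptions, not part of Jim's setup):

    # Count broadcast frames per source MAC in an Ethereal/tcpdump PCAP.
    from scapy.all import rdpcap, Ether

    counts = {}
    for pkt in rdpcap("storm.pcap"):  # hypothetical capture file
        if Ether in pkt and pkt[Ether].dst == "ff:ff:ff:ff:ff:ff":
            counts[pkt[Ether].src] = counts.get(pkt[Ether].src, 0) + 1

    for mac, n in sorted(counts.items(), key=lambda kv: -kv[1]):
        print(mac, n)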

> Hi everyone,
>
> We had a very unusual problem yesterday in our core network.
> This isn't strictly a Linux problem but I'm not sure who else
> to ask.
>
> We have HP Procurve 4000M managed ethernet switches.  Somehow
> a bogus ethernet frame (or frames) entered the network from
> a PC on our network through one of the switches.
>
> The traffic was all UDP, sent to the broadcast IP address for
> our LAN subnet.
>
> This traffic was replicated to the other switches but for
> some reason instead of dying out, it was replicated back to
> the original switch and turned into quite the packet storm.
> The storm grew until it was saturating our network and causing
> some serious connectivity problems.  Ethereal showed that
> some 98% of the traffic on the network was this UDP broadcast
> garbage.
>
> We disconnected the PC whose IP & MAC address were listed as
> the source, according to ethereal.  The traffic continued to
> grow even after that PC was disconnected.  Ethereal still
> showed that the traffic was from that IP & MAC address but
> when we did a search in the switches for that MAC address, all
> of the switches said it was on a MESH port and no switch would
> admit to being the source of the traffic - they all blamed the
> other switches in the mesh.
>
> We've never before had traffic from a MAC address that we
> could not trace back to a port on one of our switches.  This
> is clearly a switch glitch, right?
>
> We resolved the issue by disconnecting the "mesh" ports from
> the switch that the PC was originally connected to.  The cables
> were disconnected for about 30 seconds, and ethereal showed
> that the traffic went away.  We reconnected the mesh ports and
> the problem has not recurred.
>
> We did NOT reboot any of the switches.
>
> I did some google searches and didn't find anything about this.
> Does anyone have any ideas?  I have an ethereal PCAP showing
> the bogus traffic, and we know we weren't imagining things.
>
> Here are some more details.
>
> We have 7 80-port ethernet switches, all 10/100 except for one
> 1-port gigabit blade.  They are all connected with 4 "mesh" ports
> back to "switch2" which is the concentrator.  Switch2 has 6
> sets of 4 mesh ports, one set for each of the other switches
> that are connected to it.  We have a hub & spoke "switch mesh"
> with switch2 as the hub, that normally works quite well.
>
> The switches are HP Procurve 4000M running Firmware revision
> C.09.16, which I think is the most recent firmware.
>
> The bogus packet storm traffic itself was as follows:
>
> UDP, source 172.16.1.108, sport 138,  dest 172.16.255.255, dport 138,   243 bytes.
> UDP, source 172.16.1.108, sport 1783, dest 172.16.255.255, dport 42508, 260 bytes.
> 	same, 132 bytes.
> 	same, 234 bytes.
> UDP, source 172.16.1.108, sport 137,  dest 172.16.255.255, dport 137,   92 bytes.
>
> That traffic was repeated over and over again.  The dport 42508
> traffic was the majority.  We know that eTrust antivirus was the
> original source of those packets but the PC was disconnected from
> the network and we stopped the eTrust services on the eTrust
> server, and the packet storm continued anyway.
>
> Every port on the switches was receiving this traffic, regardless
> of subnet (we do have a couple of other subnets in use, and some
> external ports).  The Linux users had to turn off nmbd because it
> was using so much CPU handling the port 137 & port 138 traffic.
>
> We have only one VLAN and every port is in the same VLAN.  All VLAN
> ports are untagged.
>
> Weird eh?  Anyone seen anything like this before?  What did you do
> to resolve the problem?
>
> Thanks,
> Jim
>
> --
> Jim Ockers, P.Eng. (ockers at ockers.net)
> Contact info: please see http://www.ockers.net/



