|
Network environment:

Two Cisco 4006 switches are connected to each other over two fiber links on ports 1/1-2, both configured as trunks. Other network devices and hosts attach to these two switches.

Symptoms:

CPU utilization on the Cisco 4006 switches was abnormally high, and services were intermittent to the point of being unusable. The switch log showed the following:

2007 May 24 03:55:40 %SYS-4-P2_WARN: 1/Host 00:02:fd:06:d0:b0 is flapping between port 1/2 and port 1/1
2007 May 24 03:55:42 %SYS-4-P2_WARN: 1/Host 00:04:de:17:28:20 is flapping between port 1/2 and port 4/45
2007 May 24 03:55:44 %SYS-4-P2_WARN: 1/Host 00:00:0c:07:ac:01 is flapping between port 1/2 and port 4/47
2007 May 24 03:55:45 %SYS-4-P2_WARN: 1/Host 00:05:9a:20:78:20 is flapping between port 1/2 and port 4/47
2007 May 24 03:55:48 %SYS-4-P2_WARN: 1/Host 00:02:fd:06:d0:b0 is flapping between port 1/1 and port 1/2
2007 May 24 03:55:49 %SYS-4-P2_WARN: 1/Host 00:11:25:19:c3:c2 is flapping between port 1/2 and port 4/13
2007 May 24 03:55:53 %PAGP-5-PORTFROMSTP:Port 4/45 left bridge port 4/45
2007 May 24 03:55:54 %SYS-4-P2_WARN: 1/Host 00:06:29:ec:aa:f2 is flapping between port 1/2 and port 4/37
2007 May 24 03:55:54 %SYS-4-P2_WARN: 1/Host 00:10:5c:c5:6a:ca is flapping between port 1/1 and port 4/7
2007 May 24 03:55:54 %SYS-4-P2_WARN: 1/Host 00:09:6b:f5:0f:33 is flapping between port 1/1 and port 4/13
2007 May 24 03:55:54 %SYS-4-P2_WARN: 1/Host 00:10:5c:45:6a:ca is flapping between port 1/2 and port 1/1
2007 May 24 03:55:54 %SYS-4-P2_WARN: 1/Host 00:16:ec:7b:6c:b4 is flapping between port 1/1 and port 1/2
2007 May 24 03:55:55 %SYS-4-P2_WARN: 1/Host 00:10:5c:c5:6a:ca is flapping between port 1/1 and port 4/7

Cause analysis:

A loop formed between the two Cisco 4006 switches. For some reason the STP algorithm stopped doing its job, and the result was a broadcast storm on the network.

Resolution steps:

1. First, both Cisco 4006 switches were rebooted. (Two IBM midrange servers running as an HACMP cluster were also attached to this network. Because the cluster protects its shared resources, it issued a shutdown command to the standby node. The correct approach would have been to shut down one switch at a time, or to stop HACMP on the standby node before shutting down both switches.) After the reboot, CPU utilization dropped and services ran normally again.

2. Next, the ports named in the log messages were used as a starting point to check the network for loops. Apart from the loop between the two 4006 switches, no other loop was found, and the output of every command looked normal. The commands used were: show spantree active, show trunk, show config, show vlan, show port, and so on.

3. Port mirroring was used to capture the traffic passing through the switch, to look for suspicious ARP packets and determine whether an ARP virus was causing the loop. Nothing suspicious was found. The command used was set span; the capture tool was Sniffer.

4. Because we had previously run into a bug in Cisco's STP implementation, we decided to change the configuration between the two switches: bundle the two fiber ports 1/1-2 into a single channel, then configure the trunk on top of it. This keeps the redundant connection between the two switches while eliminating the loop. The commands used were:

set port channel 1/1-2 53
set port channel 1/1-2 mode on

After both sides were configured, show port channel reported the ports on 4006-2 as notconnect and the ports on 4006-1 as errdisable. On 4006-1 we ran:

set port enable 1/1-2

Checking again with show port channel, both sides showed connected.

The trunk was then configured on one of the switches:

set trunk 1/1-2 on 1

show trunk reported a normal status, and show spantree active also looked normal:

4006-2> (enable) show spantree
VLAN 1
Spanning tree enabled
Spanning tree type          ieee
Designated Root             00-05-32-db-b0-00
Designated Root Priority    32768
Designated Root Cost        3
Designated Root Port        1/1-2 (agPort 13/1)
Root Max Age   20 sec    Hello Time 2  sec   Forward Delay 15 sec
Bridge ID MAC ADDR          00-05-32-db-b4-00
Bridge ID Priority          32768
Bridge Max Age 20 sec    Hello Time 2  sec   Forward Delay 15 sec
Port                     Vlan Port-State    Cost      Prio Portfast Channel_id
------------------------ ---- ------------- --------- ---- -------- ----------
1/1-2                    1    forwarding            3   32 disabled 769

With this configuration, STP treats 1/1-2 as a single port in its calculation, which eliminates the loop. |
|
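The log excerpt in the symptoms section can also be summarized mechanically. The following is a minimal Python sketch, not part of the original troubleshooting: it assumes the `%SYS-4-P2_WARN` line format shown in that excerpt, and the names `FLAP_RE`, `SAMPLE`, and `summarize_flaps` are hypothetical helpers introduced only for illustration. It tallies how often each port and each port pair appears in flapping warnings. In the excerpt, every flapping warning involves port 1/1 or 1/2, which is consistent with a loop across the two inter-switch links.

```python
import re
from collections import Counter

# Assumed format, taken from the log excerpt in the symptoms section:
# "... %SYS-4-P2_WARN: 1/Host <mac> is flapping between port X/Y and port X/Y"
FLAP_RE = re.compile(
    r"%SYS-4-P2_WARN: \d+/Host (?P<mac>[0-9a-f:]{17}) is flapping "
    r"between port (?P<a>\d+/\d+) and port (?P<b>\d+/\d+)"
)

# A few lines copied from the excerpt; one is not a flapping warning
# and should be ignored by the parser.
SAMPLE = [
    "2007 May 24 03:55:40 %SYS-4-P2_WARN: 1/Host 00:02:fd:06:d0:b0 is flapping between port 1/2 and port 1/1",
    "2007 May 24 03:55:42 %SYS-4-P2_WARN: 1/Host 00:04:de:17:28:20 is flapping between port 1/2 and port 4/45",
    "2007 May 24 03:55:48 %SYS-4-P2_WARN: 1/Host 00:02:fd:06:d0:b0 is flapping between port 1/1 and port 1/2",
    "2007 May 24 03:55:53 %PAGP-5-PORTFROMSTP:Port 4/45 left bridge port 4/45",
]


def summarize_flaps(lines):
    """Count flapping warnings per port and per unordered port pair."""
    ports = Counter()
    pairs = Counter()
    for line in lines:
        match = FLAP_RE.search(line)
        if match is None:
            continue  # skip anything that is not a host-flapping warning
        a, b = match["a"], match["b"]
        ports.update([a, b])
        # frozenset makes "1/1 and 1/2" equal to "1/2 and 1/1"
        pairs[frozenset((a, b))] += 1
    return ports, pairs


if __name__ == "__main__":
    ports, pairs = summarize_flaps(SAMPLE)
    for port, count in ports.most_common():
        print(port, count)
    for pair, count in pairs.most_common():
        print(" <-> ".join(sorted(pair)), count)
```

Ports that dominate the per-port tally are the natural place to start checking for a loop; the per-pair tally shows which access ports are seeing the same MAC addresses as the uplinks.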