Based on the collected information, the packet-drop statistics for the 10GbE NICs are as follows:
eth2 Link encap:Ethernet HWaddr E4:A8:B6:97:A2:CE
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:22311179313 errors:0 dropped:212932 overruns:0 frame:0
TX packets:19863061884 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:49594989710939 (45.1 TiB) TX bytes:27686963916333 (25.1 TiB)
eth3 Link encap:Ethernet HWaddr E4:A8:B6:97:A2:CE
UP BROADCAST RUNNING SLAVE MULTICAST MTU:1500 Metric:1
RX packets:46459354889 errors:0 dropped:312597 overruns:0 frame:0
TX packets:58374998148 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:95965226801957 (87.2 TiB) TX bytes:69855612439915 (63.5 TiB)
ethtool -S eth2
NIC statistics:
rx_packets: 22311179703
tx_packets: 19863062276
rx_bytes: 49594989747075
tx_bytes: 27686963955170
rx_pkts_nic: 35662916328
tx_pkts_nic: 19863062274
rx_bytes_nic: 50620484550566
tx_bytes_nic: 27766799456380
lsc_int: 9
tx_busy: 0
non_eop_descs: 20975198746
rx_errors: 0
tx_errors: 0
rx_dropped: 0
tx_dropped: 0
multicast: 1091666
broadcast: 937755
rx_no_buffer_count: 0
collisions: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_errors: 0
hw_rsc_aggregated: 18940077554
hw_rsc_flushed: 5588340927
fdir_match: 21123563285
fdir_miss: 18307510764
fdir_overflow: 58857
rx_fifo_errors: 0
rx_missed_errors: 212932
ethtool -S eth3
NIC statistics:
rx_packets: 46459355292
tx_packets: 58374998542
rx_bytes: 95965226840776
tx_bytes: 69855612477675
rx_pkts_nic: 58927970523
tx_pkts_nic: 58374998539
rx_bytes_nic: 97025415708655
tx_bytes_nic: 70089871434535
lsc_int: 9
tx_busy: 0
non_eop_descs: 40836101363
rx_errors: 0
tx_errors: 0
rx_dropped: 0
tx_dropped: 0
multicast: 1091738
broadcast: 5089745
rx_no_buffer_count: 0
collisions: 0
rx_over_errors: 0
rx_crc_errors: 0
rx_frame_errors: 0
hw_rsc_aggregated: 20910806075
hw_rsc_flushed: 8442190819
fdir_match: 52579073218
fdir_miss: 6482918673
fdir_overflow: 116389
rx_fifo_errors: 0
rx_missed_errors: 312597
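Overall packet loss is below one in 100,000 (eth2: 212932 drops against 22311179313 received packets; eth3: 312597 against 46459354889). By the back-end lab's standards this is normal loss under heavy load. All drops are counted as rx_missed_errors, meaning the CPU could not drain the packets buffered in the DMA ring buffer fast enough. There is no anomaly at the NIC hardware level or the network protocol stack level.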
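The drop-ratio check above can be reproduced with a short calculation. This is a minimal sketch, not part of the original analysis: the counters are copied from the ifconfig output, and dividing dropped frames by the RX packet counter is an assumption about how the ratio was computed (using rx_pkts_nic as the denominator would give an even smaller ratio). The names `counters`, `THRESHOLD`, and `drop_ratio` are illustrative.

```python
# Counters copied from the ifconfig output for the two bonded slaves.
counters = {
    "eth2": {"rx_packets": 22311179313, "rx_dropped": 212932},
    "eth3": {"rx_packets": 46459354889, "rx_dropped": 312597},
}

# Assumed threshold: one in 100,000, as stated in the analysis.
THRESHOLD = 1e-5


def drop_ratio(rx_packets, rx_dropped):
    """Return dropped frames as a fraction of received packets."""
    if rx_packets == 0:
        return 0.0
    return rx_dropped / rx_packets


ratios = {
    nic: drop_ratio(c["rx_packets"], c["rx_dropped"])
    for nic, c in counters.items()
}

for nic, ratio in ratios.items():
    verdict = "below" if ratio < THRESHOLD else "at or above"
    print(f"{nic}: drop ratio {ratio:.2e} ({verdict} 1e-5)")
```

Both interfaces come out under the threshold with this denominator, with eth2 slightly higher than eth3.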
In addition, the on-site server is running a stock RHEL 6.5 installation that has never been updated:
Kernel /boot/vmlinuz-2.6.32-431.el6.x86_64 ro root=UUID=4646c097-2af0-44d5-9a1d-3c3c9a54f353 rd_NO_LUKS rd_NO_LVM LANG=en_US.UTF-8 rd_NO_MD SYSFONT=latarcyrheb-sun16 crashkernel=auto KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM rhgb quiet
The bonding module in this kernel has a defect that affects performance; see the following Red Hat article:
https://access.redhat.com/solutions/631123
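In summary, the packet drops on the on-site network ports occur because the CPU cannot process the packets in the DMA memory region in time. In addition, the bonding module in the current operating system kernel has a known defect. We recommend upgrading the operating system kernel and tuning the NICs.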
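To judge whether the upgrade and tuning help, one option is to compare rx_missed_errors between two `ethtool -S` samples taken some time apart. The following is a sketch under stated assumptions: the helper names `parse_ethtool_stats` and `counter_delta` are illustrative, the "before" sample is an excerpt of the eth2 output in this report, and the "after" sample is hypothetical data made up for the example.

```python
def parse_ethtool_stats(text):
    """Parse 'name: value' lines from `ethtool -S` output into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(":")
        value = value.strip()
        # Skip headers such as 'NIC statistics:' and any non-numeric lines.
        if sep and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def counter_delta(before, after, key="rx_missed_errors"):
    """Return how much a counter grew between two samples."""
    return after.get(key, 0) - before.get(key, 0)


# Excerpt of the eth2 statistics from this report.
before_text = """NIC statistics:
rx_packets: 22311179703
rx_missed_errors: 212932
"""

# Hypothetical later sample, invented purely for illustration.
after_text = """NIC statistics:
rx_packets: 22311999703
rx_missed_errors: 212940
"""

before = parse_ethtool_stats(before_text)
after = parse_ethtool_stats(after_text)

missed = counter_delta(before, after)
received = counter_delta(before, after, key="rx_packets")
print(f"rx_missed_errors grew by {missed} over {received} received packets")
```

In practice the two text samples would come from running `ethtool -S eth2` at the start and end of a measurement window. A delta that stays near zero under comparable load would suggest the drops are no longer occurring.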