(This is the third in a series of five posts. See part 1, part 2, part 4, part 5.)
The checksum test in nf_conntrack can be disabled with sysctl -w net.netfilter.nf_conntrack_checksum=0 (see, for example, the net->ct.sysctl_checksum test on line 185 of net/ipv4/netfilter/nf_conntrack_proto_icmp.c), but this just moves the error elsewhere:
br0: hw csum failure.
Call Trace:
[00000000006be93c] icmp_rcv+0x1a0/0x324
[0000000000695578] ip_local_deliver_finish+0x1b8/0x2a0
[00000000006950a4] ip_rcv_finish+0x3a4/0x3d0
[000000000066eaa0] netif_receive_skb+0x618/0x640
[0000000010271160] br_nf_pre_routing_finish+0x334/0x348 [bridge]
[00000000102719e4] br_nf_pre_routing+0x870/0x894 [bridge]
[000000000068e0cc] nf_iterate+0x34/0x90
[000000000068e2b0] nf_hook_slow+0x4c/0xec
[000000001026c2e8] br_handle_frame+0x24c/0x2d0 [bridge]
[000000000066e92c] netif_receive_skb+0x4a4/0x640
[000000000066eb38] process_backlog+0x70/0xc4
[000000000066ed70] net_rx_action+0x98/0x188
[000000000045ab84] __do_softirq+0x80/0x110
[0000000000429d60] do_softirq+0x54/0x80
[000000000045a870] irq_exit+0x38/0x90
[0000000000429e60] handler_irq+0xd4/0xec
A checksum test similar to the one in tcp_error/udp_error/icmp_error can be found on line 1000 of net/ipv4/icmp.c. Fundamentally, the failure is triggered by calling __skb_checksum_complete() on packets longer than RX_COPY_THRESHOLD that were produced by the SunHME driver. Next, I think I'll see if I can move the bug earlier in processing by adding such a test to the VLAN driver after the skb_pull_rcsum() call, which is the first point in processing at which the IP packet is available sans any Ethernet or 802.1q headers.
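To make the idea of such a test concrete, here is a minimal userspace sketch in C. It is not kernel code and not the patch itself. It only imitates the arithmetic: a device-supplied ones-complement sum over the received bytes, the adjustment skb_pull_rcsum() makes when headers are pulled, and the final comparison. As far as I understand it, __skb_checksum_complete() complains about a hardware checksum failure when its own software sum is valid but the device-supplied one was not, and the sketch mirrors that. The names sum16, sum16_sub, check_after_pull, build_sample and the sample packet bytes are all invented for illustration, and the sketch assumes the pulled headers have an even length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VLAN_LEN   4u   /* 802.1q tag following the Ethernet header */
#define IP_LEN     20u  /* IPv4 header without options */
#define ICMP_LEN   12u  /* echo request with 4 bytes of payload */
#define PULL_LEN   (VLAN_LEN + IP_LEN)
#define SAMPLE_LEN (PULL_LEN + ICMP_LEN)

enum csum_result { CSUM_OK, CSUM_HW_FAULT, CSUM_BAD_PACKET };

/* Ones-complement sum of buf as big-endian 16-bit words, folded to 16 bits. */
static uint16_t sum16(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Ones-complement subtraction: remove the contribution of pulled bytes. */
static uint16_t sum16_sub(uint16_t total, uint16_t pulled)
{
    uint32_t sum = (uint32_t)total + (uint16_t)~pulled;

    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/*
 * Imitates the check after pulling pull_len bytes of headers: adjust the
 * device sum, and if it does not verify, fall back to a software sum.
 * A valid ICMP message (checksum field included) sums to 0xffff.
 */
static enum csum_result check_after_pull(uint16_t hw_sum, const uint8_t *pkt,
                                         size_t pull_len, size_t total_len)
{
    uint16_t rest = sum16_sub(hw_sum, sum16(pkt, pull_len));

    if (rest == 0xffff)
        return CSUM_OK;
    if (sum16(pkt + pull_len, total_len - pull_len) == 0xffff)
        return CSUM_HW_FAULT;   /* packet is fine, device sum was wrong */
    return CSUM_BAD_PACKET;     /* packet really is corrupt */
}

/* Fills pkt with a made-up VLAN tag, IPv4 header and ICMP echo request. */
static void build_sample(uint8_t pkt[SAMPLE_LEN])
{
    static const uint8_t tmpl[SAMPLE_LEN] = {
        0x00, 0x64, 0x08, 0x00,                         /* VLAN 100, IPv4 */
        0x45, 0x00, 0x00, 0x20, 0x12, 0x34, 0x00, 0x00, /* IPv4 header; its */
        0x40, 0x01, 0x00, 0x00, 0xc0, 0xa8, 0x01, 0x01, /* own checksum is  */
        0xc0, 0xa8, 0x01, 0x02,                         /* left zero here   */
        0x08, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x01, /* ICMP echo, csum 0 */
        'p', 'i', 'n', 'g',
    };
    uint16_t c;

    memcpy(pkt, tmpl, SAMPLE_LEN);
    /* Fill in the ICMP checksum field so the message verifies. */
    c = (uint16_t)~sum16(pkt + PULL_LEN, ICMP_LEN);
    pkt[PULL_LEN + 2] = (uint8_t)(c >> 8);
    pkt[PULL_LEN + 3] = (uint8_t)(c & 0xff);
    assert(sum16(pkt + PULL_LEN, ICMP_LEN) == 0xffff);
}
```

Calling build_sample() and then check_after_pull() with sum16() over the whole buffer as the device sum gives CSUM_OK. Perturbing only the device sum gives CSUM_HW_FAULT, which is the situation the trace above corresponds to. Corrupting a payload byte gives CSUM_BAD_PACKET.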