openstack-ansible liberty 6e3815d, l2pop=off
At first I ran into this issue: https://bugs.launchpad.net/openstack-ansible/liberty/+bug/1563448 , but even after deleting the vxlan interfaces and restarting the neutron agents, there were still problems.
- Unicast packets in VM tenant networks work: from the agent namespaces you can ping VMs at their IPs in that subnet.
- Broadcast packets in VM tenant networks don’t work: VMs can’t get DHCP or ARP replies, and neither can the DHCP agents. Strangely, in my case the L3 agent can still get the VMs’ ARP replies.
- After further investigation, I found that the vxlan-X interface on compute and agent containers doesn’t receive the broadcast packets that others send.
- Further, on the physical interface (in my case `eth1.30`, a slave of `br-vxlan`), I can see packets going out to `vxlan_group` (220.127.116.11 by default), but not packets from 18.104.22.168.
- A packet capture on the switch showed that the switch was not forwarding the packets.
jimmdenton on #openstack-ansible pointed me to this article: http://movingpackets.net/2013/06/04/multicast-problems-juniper-ex/ , and suggested disabling IGMP snooping.
- It works!
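
To check whether multicast actually crosses the switch, independent of Neutron, a small probe like the one below may help. It is only a sketch: the group `239.1.1.1` and port `5007` are placeholder assumptions (substitute your own `vxlan_group`, and keep the port away from the one VXLAN itself uses), the function names are made up for illustration, and `local_ip` is assumed to be the host's address on `br-vxlan`.

```python
import ipaddress
import socket

GROUP = "239.1.1.1"  # assumed test group; substitute your own vxlan_group
PORT = 5007          # arbitrary unused UDP port, NOT the VXLAN port


def is_multicast(addr: str) -> bool:
    """Return True if addr is an IP multicast address (224.0.0.0/4 for IPv4)."""
    return ipaddress.ip_address(addr).is_multicast


def build_mreq(group: str, local_ip: str) -> bytes:
    """Build the 8-byte ip_mreq structure: group address + local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(local_ip)


def send_probe(group: str, port: int, local_ip: str,
               payload: bytes = b"mcast-probe", ttl: int = 1) -> None:
    """Send one UDP datagram to the multicast group via the interface owning local_ip."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                        socket.inet_aton(local_ip))
        sock.sendto(payload, (group, port))
    finally:
        sock.close()


def wait_for_probe(group: str, port: int, local_ip: str, timeout: float = 10.0):
    """Join the group and wait for one datagram.

    Returns the sender's IP address, or None if nothing arrives before the timeout.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        # Joining the group makes the kernel send an IGMP membership report.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        build_mreq(group, local_ip))
        sock.settimeout(timeout)
        try:
            _data, (src, _src_port) = sock.recvfrom(1024)
            return src
        except socket.timeout:
            return None
    finally:
        sock.close()
```

Run `wait_for_probe(GROUP, PORT, "<receiver br-vxlan IP>")` on one host and, while it is waiting, `send_probe(GROUP, PORT, "<sender br-vxlan IP>")` on another. If the receiver times out while unicast between the same two hosts works, the switch is a likely suspect, which matches the symptoms above.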
Thanks to the Rackspace guys at #openstack-ansible, who spent a lot of time helping me debug the issue.
This article helped me understand how VXLAN and L2 population work: https://kimizhang.wordpress.com/2014/04/01/how-ml2vxlan-works/ . (Don’t use L2 population! Neutron developers advise against it!)