Hi!
tl;dr: The Azure ILB and Linux IP-spoofing protection together prevent a machine from connecting to itself via the ILB.
A few days ago, I talked to a customer who had a fair amount of trouble using the Azure Internal Load Balancer (ILB) with his Linux VMs.
From his tests, he concluded that the ILB “is broken”, “is buggy” and “is unstable”. Here is what he observed:
– He created two Linux VMs in a virtual network on Azure: Machine A on IP address 10.0.0.11 and Machine B on IP address 10.0.0.12. He then set up an internal load balancer for HTTP with the following configuration: frontend IP address 10.0.0.10, backend pool 10.0.0.11 and 10.0.0.12, and frontend, backend, and probe ports all set to 80.
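For reference, a roughly equivalent setup with today’s Azure CLI could look like the following. This is only a sketch: the resource group, vnet, subnet, and resource names are placeholders of mine, not from his deployment.

az network lb create --resource-group rg --name ilb --sku Standard \
  --vnet-name vnet --subnet default \
  --frontend-ip-name fe --private-ip-address 10.0.0.10 --backend-pool-name be
az network lb probe create --resource-group rg --lb-name ilb --name http-probe \
  --protocol Tcp --port 80
az network lb rule create --resource-group rg --lb-name ilb --name http \
  --protocol Tcp --frontend-port 80 --backend-port 80 \
  --frontend-ip-name fe --backend-pool-name be --probe-name http-probe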
Then he opened an SSH connection to Machine A (10.0.0.11) and observed the following behavior:
root@ILB-a:~# curl 10.0.0.10
^C
root@ILB-a:~# curl 10.0.0.10
<html><head><title>B</title></head>
<body> B </body> </html>
So it seemed that only every second connection worked. Or, to be more precise: whenever the ILB forwarded the connection back to the machine he was issuing the request from, the connection failed.
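A quick way to see the pattern is to fire a few requests in a row (the IP is from the example above; the 2-second timeout is an arbitrary choice of mine):

for i in 1 2 3 4 5 6; do
  curl --silent --max-time 2 http://10.0.0.10/ || echo "request $i timed out"
done

Roughly half the requests print B’s page and the other half time out, depending on which backend the ILB picks.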
So I recreated this setup and tried it myself, this time looking at the tcpdump output for the case where the connection did not work:
root@ILB-a:~# tcpdump -A port 80
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
10:32:05.210953 IP 10.0.0.11.58705 > 10.0.0.10.http: Flags [S], seq 265927267, win 29200, options [mss 1460,sackOK,TS val 1471701 ecr 0,nop,wscale 7], length 0
10:32:05.216395 IP 10.0.0.11.58705 > 10.0.0.11.http: Flags [S], seq 265927267, win 29200, options [mss 1418,sackOK,TS val 1471701 ecr 0,nop,wscale 7], length 0
10:32:06.210783 IP 10.0.0.11.58705 > 10.0.0.10.http: Flags [S], seq 265927267, win 29200, options [mss 1460,sackOK,TS val 1471951 ecr 0,nop,wscale 7], length 0
10:32:06.212291 IP 10.0.0.11.58705 > 10.0.0.11.http: Flags [S], seq 265927267, win 29200, options [mss 1418,sackOK,TS val 1471951 ecr 0,nop,wscale 7], length 0
(raw ASCII packet dumps omitted)
So the ILB was in fact forwarding the packets: it just rewrites the destination IP address (and, where configured, the port). As you can see, the rest of the packet stays the same: source port, sequence number, and TCP timestamps are identical; only the MSS option is clamped from 1460 to 1418. And then the Linux kernel on Machine A just drops the forwarded packet. It turns out this behavior is actually a clever idea. But why?
Because the packet the Linux kernel sees in its interface input queue could never legitimately have got there in the first place: it carries the local IP address both as its source and as its destination. If the network stack wanted to send a packet to the local host, that packet would never have been put on the wire at all; it would have been handled inside the network stack, much like a packet to localhost (127.0.0.1). Any incoming packet with a local IP address as its source must therefore be suspect, probably a spoofing attack directed at some local service that accepts local connections without further authentication. So dropping this packet is actually clever.
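You can verify that the kernel treats the machine’s own address as a local destination, handled inside the stack rather than on the wire (run on Machine A; the exact output format varies with the iproute2 version):

root@ILB-a:~# ip route get 10.0.0.11
local 10.0.0.11 dev lo src 10.0.0.11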
But how can we prove that this is actually what happens? Fortunately, there is a kernel runtime configuration switch that enables logging of such packets. And here is where the title of this post comes from: the setting is called log_martians. It can be enabled globally (echo 1 > /proc/sys/net/ipv4/conf/all/log_martians) or for a specific interface (e.g. echo 1 > /proc/sys/net/ipv4/conf/eth0/log_martians). The kernel then logs these events to the kernel log, so they show up in syslog and in the output of dmesg.
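If you prefer sysctl over echoing into /proc, the same switch can be set like this (the sysctl key is the standard one; the persistence file path may vary by distribution):

sysctl -w net.ipv4.conf.all.log_martians=1
echo "net.ipv4.conf.all.log_martians = 1" >> /etc/sysctl.conf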
In syslog, these packets show up like this:
Sep 15 11:05:06 ILB-a kernel: [ 8178.696265] IPv4: martian source 10.0.0.11 from 10.0.0.11, on dev eth0
Sep 15 11:05:06 ILB-a kernel: [ 8178.696272] ll header: 00000000: 00 0d 3a 20 2a 32 54 7f ee 8f a6 3c 08 00 ..: *2T....<..
Conclusion: by default, the Linux kernel drops any packet that appears to come from a local IP address but shows up in the network input queue. And that is not the ILB’s fault. The ILB works just fine as long as you don’t connect to it from a machine that is also a potential destination of the ILB.
Fortunately, this limitation rarely occurs in real-life architectures. As long as the clients of an ILB and the servers load-balanced by the ILB are distinct machines (as they are in the Galera example in this blog), the ILB just “works”. In case you actually have to “connect back” to the same server, you either have to build a workaround with a timeout and retry in the connection to the ILB (see the sketch below) or reconfigure the Linux kernel to accept such packets in the input queue. How to do that with the kernel configuration parameters, I leave as an exercise to the reader. 😉
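For the retry route, something as simple as this can do the trick; each retry opens a new source port, so the ILB’s 5-tuple hash may well pick the other backend the next time around (the timeout and retry count are arbitrary values of mine):

curl --max-time 2 --retry 3 http://10.0.0.10/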
Hope this helps,
H.
Source: msdn