Properly stopping a SIP flood

At about 9am yesterday, my server monitor showed abnormally high CPU utilization on one of my servers, along with a sustained 1 Mbit/sec of inbound traffic and 2 Mbit/sec of outbound traffic. The syslog messages from Asterisk showed it to be a SIP brute-force attack, so I dropped the offending IP (an Amazon EC2 instance IP) into /etc/idiots to block it and went back to my work.

A while later, I noticed the traffic still hadn't died down, so I reported the incident to Amazon and my server's network provider. No luck on either front; Amazon just sent back a form reply stating the incident was forwarded to the EC2 instance's owner (yeah, seriously) and the network provider said they wouldn't bother adding an ACL to their border equipment unless it was needed to protect their entire network. With the IP blocked on my server, the CPU utilization had died down and it was no longer sending out reply traffic, but I was worried about the inbound garbage traffic counting towards the server's monthly bandwidth cap.

This morning, 24 hours after the flood started, there was still no progress on stopping it from either Amazon (even after I filed a second report) or the network provider. Usually these SIP attacks try sequential SIP account numbers and common usernames to find valid accounts, try to brute-force each one's password with common passwords, then move on. Presumably the valid account/password combinations are later used for spamming or other fraudulent purposes. For whatever reason, this particular scanner didn't "move on" and had been trying the same account hundreds of thousands of times over many hours.

I thought about reconfiguring Asterisk to use an empty password for the SIP account being brute-forced so the scanning script would think it had found a match and stop trying, but I wasn't sure whether Asterisk would accept a REGISTER request with any password as valid on an account with no password. Since the account in question was also in use at the time, I didn't want to disrupt it by changing its password, and taking down Asterisk just to have something custom respond to the attacker's SIP traffic was out of the question.

I took the IP out of the /etc/idiots file and reconfigured pf to redirect the traffic to a different local port where I could have something else listening:

pass in quick proto udp from 174.129.137.135 to any port 5060 rdr-to 127.0.0.1 port 5061

With the traffic from the attacking IP now going to port 5061 instead of 5060, I cooked up a simple Ruby script to bind to 5061 and show me the incoming SIP messages:

require "socket"

s = UDPSocket.new
s.bind(nil, 5061)
while true
  packet = s.recvfrom(1024)
  puts packet.inspect
end

Dozens of messages scrolled by within seconds of starting the listener:

REGISTER sip:[...] SIP/2.0
Via: SIP/2.0/UDP 10.242.91.15:5182;branch=z9hG4bK-2528031440;rport
Content-Length: 0
From: "2011" <sip:2011@[...]>
Accept: application/sdp
User-Agent: friendly-scanner
To: "2011" <sip:2011@[...]>
Contact: sip:123@1.1.1.1
CSeq: 1 REGISTER
Call-ID: 3361196543
Max-Forwards: 70

["AF_INET", 5182, "ec2-174-129-137-135.compute-1.amazonaws.com", "174.129.137.135"]

REGISTER sip:[...] SIP/2.0
Via: SIP/2.0/UDP 10.242.91.15:5209;branch=z9hG4bK-1678605574;rport
Content-Length: 0
From: "2011" <sip:2011@[...]>
Accept: application/sdp
User-Agent: friendly-scanner
To: "2011" <sip:2011@[...]>
Contact: sip:123@1.1.1.1
CSeq: 1 REGISTER
Call-ID: 1992838843
Max-Forwards: 70

["AF_INET", 5209, "ec2-174-129-137-135.compute-1.amazonaws.com", "174.129.137.135"]

Oh, the irony of the attacking script using a user-agent string of "friendly-scanner"…
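Comparing the two captured requests shows which headers the scanner changes between attempts. The following is only a rough sketch of that comparison: `parse_sip` and `sample_request` are hypothetical helpers written for this illustration, and the `[...]` parts stay elided exactly as in the capture.

```ruby
# Hypothetical helper: split a raw SIP message into [start_line, headers_hash].
def parse_sip(raw)
  lines = raw.split(/\r?\n/)
  start_line = lines.shift
  headers = {}
  lines.each do |line|
    break if line.empty? # a blank line ends the header section
    name, value = line.split(/:\s*/, 2)
    headers[name] = value
  end
  [start_line, headers]
end

# Rebuilds one of the captured requests; only the source port, Via branch,
# and Call-ID are parameters, everything else is copied from the capture.
def sample_request(port, branch, call_id)
  [
    "REGISTER sip:[...] SIP/2.0",
    "Via: SIP/2.0/UDP 10.242.91.15:#{port};branch=z9hG4bK-#{branch};rport",
    "Content-Length: 0",
    "From: \"2011\" <sip:2011@[...]>",
    "Accept: application/sdp",
    "User-Agent: friendly-scanner",
    "To: \"2011\" <sip:2011@[...]>",
    "Contact: sip:123@1.1.1.1",
    "CSeq: 1 REGISTER",
    "Call-ID: #{call_id}",
    "Max-Forwards: 70",
    "", ""
  ].join("\r\n")
end

_, first = parse_sip(sample_request(5182, "2528031440", "3361196543"))
_, second = parse_sip(sample_request(5209, "1678605574", "1992838843"))

# Header names whose values differ between the two requests.
changed = first.keys.select { |name| first[name] != second[name] }
puts changed.inspect
```

In these two samples only the Via and Call-ID values differ, so a reply has to carry those two over from each individual request rather than reuse a fixed string.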

With a little parsing added, the script could now send back a "200 OK" response built from the Via, From, To, Call-ID, and CSeq headers of each request. Hopefully these would be enough for the attacking script to think it got a valid reply for one of its requests and finally move on.

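require "socket"

s = UDPSocket.new
s.bind(nil, 5061)
while true
  # recvfrom returns [data, ["AF_INET", port, hostname, ip]]
  packet = s.recvfrom(1024)

  puts packet.inspect

  # only REGISTER requests get a reply
  next unless packet[0].match(/^REGISTER /)

  # pull out the headers to copy into the reply; [^\r\n] keeps the
  # trailing \r of each header line out of the captured value
  via = packet[0][/^Via: (.+);rport/, 1]
  from = packet[0][/^From: ([^\r\n]+)/, 1]
  to = packet[0][/^To: ([^\r\n]+)/, 1]
  call_id = packet[0][/^Call-ID: ([^\r\n]+)/, 1]
  cseq = packet[0][/^CSeq: (\d+) REGISTER/, 1]

  # skip requests missing a header instead of crashing on a nil match
  next unless via && from && to && call_id && cseq

  remote_ip = packet[1][3]
  remote_port = packet[1][1].to_i

  # note: RFC 3261 expects a response to echo the request's CSeq number;
  # this reply bumps it by one
  ret = "SIP/2.0 200 OK\r\n" +
    "Via: #{via};received=#{remote_ip}\r\n" +
    "From: #{from}\r\n" +
    "To: #{to}\r\n" +
    "Call-ID: #{call_id}\r\n" +
    "CSeq: #{cseq.to_i + 1} REGISTER\r\n" +
    "\r\n"

  puts "sending to #{remote_ip}:#{remote_port}:\n#{ret}"

  s.send(ret, 0, remote_ip, remote_port)
end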
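To check the reply format against a captured request without a live socket, the reply-building step can be pulled out into a plain function. This is a minimal sketch rather than the script that was actually run: `build_ok_reply` is a hypothetical name, it returns nil for anything it cannot answer, and it keeps the same CSeq increment as the script above.

```ruby
# Hypothetical helper: build a 200 OK for a REGISTER request. Returns nil if
# the datagram is not a REGISTER or is missing a header the reply needs.
def build_ok_reply(request, remote_ip)
  return nil unless request.start_with?("REGISTER ")

  # [^\r\n] keeps the trailing \r of each header line out of the capture.
  via     = request[/^Via: (.+);rport/, 1]
  from    = request[/^From: ([^\r\n]+)/, 1]
  to      = request[/^To: ([^\r\n]+)/, 1]
  call_id = request[/^Call-ID: ([^\r\n]+)/, 1]
  cseq    = request[/^CSeq: (\d+) REGISTER/, 1]
  return nil unless via && from && to && call_id && cseq

  [
    "SIP/2.0 200 OK",
    "Via: #{via};received=#{remote_ip}",
    "From: #{from}",
    "To: #{to}",
    "Call-ID: #{call_id}",
    "CSeq: #{cseq.to_i + 1} REGISTER", # same increment as the script above
    "", ""
  ].join("\r\n")
end

# One of the captured requests, with [...] left elided as in the capture.
request = [
  "REGISTER sip:[...] SIP/2.0",
  "Via: SIP/2.0/UDP 10.242.91.15:5182;branch=z9hG4bK-2528031440;rport",
  "Content-Length: 0",
  "From: \"2011\" <sip:2011@[...]>",
  "Accept: application/sdp",
  "User-Agent: friendly-scanner",
  "To: \"2011\" <sip:2011@[...]>",
  "Contact: sip:123@1.1.1.1",
  "CSeq: 1 REGISTER",
  "Call-ID: 3361196543",
  "Max-Forwards: 70",
  "", ""
].join("\r\n")

reply = build_ok_reply(request, "174.129.137.135")
puts reply
```

Keeping the parsing separate from the socket loop means a malformed datagram yields nil instead of raising, so the loop can skip it and keep listening.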

I ran this new version of the script and, after about 25 messages scrolled by in a second or two, everything stopped. The SIP traffic stopped coming in and the server's bandwidth utilization dropped back to nothing. I put the attacking IP back in /etc/idiots, and everything was back to normal.

screenshot of bandwidth graph