Well, here we are again. Indonesian porn breaking my infrastructure. We had a half-hour outage of all HTTP and VPN-based services on OPT1-4 due to overload from Pomf traffic. This was a little different, however. My script DIDN'T cause a cascade failure. No, this time there was just so much load spread across all four endpoints that the whole thing collapsed. Let's dig into that load profile.
First, an image.
At approximately 12:40 PM EST on 7/18/2022, my infrastructure collapsed due to dead or non-responsive edge nodes. You can see the giant gap in outbound (red) bandwidth. Then, once services were restored, they immediately jumped back to a full gigabit of outbound traffic. I ended up shedding load by removing some very popular files, then rolled those files back into Pomf's datastore to bring that traffic back online. We're now sitting at maximum outbound capacity, but without an outage, so all systems are operational again as of now. Still - it was impactful. One sad yellow line.
I can attribute a final root cause to one of the following scenarios:
- Extreme overload caused by millions of legitimate requests all inside a tiny timespan (e.g. someone posting a new video on a VERY popular platform).
- A well-planned, multi-stage attack on my infrastructure by someone with a vested interest in not allowing porn to be served to Indonesia.
Let's go over each scenario, assuming good faith and bad faith.
The Good Faith Argument
Let's say the scenario is that someone in Indonesia posted a batch of new videos on Facebook or some similar platform. What might happen?
- People with notifications turned on, for example, might all bum-rush the videos.
- Or, perhaps this was special in that a user linked a dozen or so videos instead of just one, and everyone clicked all of them at the same time.
- Or, perhaps these videos were posted somewhere where traffic is far greater than we've ever seen before.
I don't collect metrics, so I can't say for sure. I try to keep user privacy paramount. The above are guesses, but any of them is plausible. The initial rush could have been so great that my nodes completely collapsed under the inbound GET requests and subsequent outbound 206 range replies, causing severe CPU or other resource overload and effectively killing the nodes.
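For anyone unfamiliar with that request pattern, here's a minimal sketch of what a single chunked video fetch looks like from the client side. The URL is a made-up placeholder, and it just uses the Python requests library to ask for a byte range and expect a 206 back - it's an illustration of the pattern, not a reproduction of the actual traffic.

```python
# Minimal sketch of the request pattern described above: a client asks for a
# byte range of a large video and the server answers with a 206 Partial Content.
# The URL is a hypothetical placeholder, not a real Pomf file.
import requests

url = "https://pomf.example/a/some-large-video.mp4"  # hypothetical

# A typical video player fetches the file in chunks via Range headers.
resp = requests.get(url, headers={"Range": "bytes=0-1048575"}, timeout=10)

print(resp.status_code)                   # 206 if the server honors the range
print(resp.headers.get("Content-Range"))  # e.g. "bytes 0-1048575/734003200"
print(len(resp.content))                  # ~1 MiB of the file

# Thousands of players doing this at once means a flood of small GETs inbound
# and large 206 bodies outbound, which is what saturates the nodes.
```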
What's the evidence for this?
- We've seen this before, with the first Indonesian Porn Incident. It stands to reason it could happen again, but worse.
- The requests were all to the same batch of videos, all with the same content type.
- Internet metrics tell me that 50% of my traffic is from Indonesia now. Neat!
- Current data all appears legitimate, and we're 100% maxed out at 4Gbps. There must be real people behind this traffic.
The Bad Faith Argument
Okay, hear me out. I think this was an attack. I don't want to give away all the evidence because it could be used to evade the protections I now have in place, but there were some really interesting events here that I've never seen before.
Inbound Packet Overload
I'm telling you, I ran tcpdump, hit Ctrl+C immediately, and saw 4 million packets filtered on each node. That's not HTTP traffic anymore, that's a flood. No way we'd see 16Mpps inbound (4 million packets x 4 nodes) on a normal day. Hell, right now, at 4Gbps, it's only about 2Mpps total in the same timeframe. What's more interesting - this eventually disappeared. Perhaps PATH (the DDoS protection provider) stepped in. All four VMs cleared up immediately, like a light switch was flipped, and the packet count was back to normal.
Look at the inbound scaling on that! The top of that blue chart is 250 Mbps. 250! And no corresponding outbound traffic.
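If you want to do the same kind of spot check yourself, here's a rough sketch that runs tcpdump for a fixed window and parses its summary line into a packets-per-second figure. The interface name and the five-second window are assumptions, it needs root, and it leans on the coreutils timeout command - this is the idea, not the exact command I ran.

```python
# Rough spot check of inbound packet rate, approximating the tcpdump-and-Ctrl+C
# test described above. Interface name and duration are assumptions; needs root.
import re
import subprocess

INTERFACE = "eth0"   # assumed interface name
SECONDS = 5

# Run tcpdump for a fixed window; its summary line
# ("N packets received by filter") is printed to stderr on exit.
proc = subprocess.run(
    ["timeout", str(SECONDS), "tcpdump", "-i", INTERFACE, "-n", "-q", "-w", "/dev/null"],
    capture_output=True,
    text=True,
)

match = re.search(r"(\d+) packets received by filter", proc.stderr)
if match:
    pps = int(match.group(1)) / SECONDS
    print(f"~{pps:,.0f} packets/sec inbound on {INTERFACE}")
else:
    print("Could not parse tcpdump output:", proc.stderr)
```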
VPS Providers Making GET Requests
I found it odd that a crap ton of VPS providers would all suddenly request the same batch of files, over and over and over, all at the same time. It's almost like it was designed to create what's known as a Layer 7 DDoS, where they hit the application layer, which in this case is Nginx and its cache. It would be strikingly effective, as the downtime shows, to just make spurious video requests from, oh, I don't know, 100 different VMs across the world, all with 100 Mbit or better pipes, and just keep pulling that data and piping it into /dev/null. Now, I have request limiting in place, but that doesn't work so well when they have enough IPs that the network gets killed anyway.
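To make the "enough IPs" problem concrete, here's a toy sketch of per-IP request limiting - roughly the token-bucket idea behind nginx's limit_req, not my actual configuration, and the numbers are purely illustrative.

```python
# Toy sketch of per-IP request limiting (roughly what nginx's limit_req does),
# to show why it doesn't help when the attacker has many source IPs.
# The rate and bucket size are illustrative, not my real limits.
import time
from collections import defaultdict

RATE = 10     # tokens refilled per second, per IP (illustrative)
BURST = 1000  # bucket size per IP (illustrative)

buckets = defaultdict(lambda: {"tokens": BURST, "last": time.monotonic()})

def allow(ip: str) -> bool:
    """Refill this IP's bucket based on elapsed time, then spend one token."""
    b = buckets[ip]
    now = time.monotonic()
    b["tokens"] = min(BURST, b["tokens"] + (now - b["last"]) * RATE)
    b["last"] = now
    if b["tokens"] >= 1:
        b["tokens"] -= 1
        return True
    return False

# One client hammering from a single IP gets cut off quickly...
single = sum(allow("198.51.100.7") for _ in range(5000))
# ...but 100 VPSes each politely staying under the limit still get
# 100x the allowed traffic through.
spread = sum(allow(f"203.0.113.{i}") for i in range(100) for _ in range(BURST))
print(f"single IP allowed: {single} of 5000; 100 IPs allowed: {spread} of {100 * BURST}")
```

The point is that per-IP limits cap what any one box can do, but the attacker's budget is the sum over all of their boxes, and the network pipe is the thing that runs out first.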
Here's an example. OVH! I remember them.
- inetnum: 5.39.80.0 - 5.39.95.255
- netname: OVH
- descr: OVH SAS
- descr: Dedicated servers
- descr: http://www.ovh.com
- country: FR
- admin-c: OK217-RIPE
- tech-c: OTC2-RIPE
- status: ASSIGNED PA
- mnt-by: OVH-MNT
- created: 2013-08-23T22:14:05Z
- last-modified: 2013-08-23T22:14:05Z
- source: RIPE # Filtered
Note: There is a slight chance that all of these VPSes are actually VPN gateways for commercial VPN providers, e.g. NordVPN. Doubtful, but not impossible.
Strange ISPs Hitting the Request Cap
Want some more weird WHOIS replies? Here's one. And this guy used up all 1,000 of his request "tokens" mighty quick - there's a quick log-counting sketch after the record below showing how cap-hitters like this stand out. Smells like... a zombie?
- inetnum: 62.201.232.0 - 62.201.235.255
- netname: IQ-NETWORKS-FTTH1
- descr: IQ Networks Fiber To The Home Range 1
- country: IQ
- geoloc: 35.562 45.405
- language: ku
- language: ar
- language: en
- admin-c: SD13325-RIPE
- admin-c: IN1910-RIPE
- tech-c: IN1910-RIPE
- tech-c: SD13325-RIPE
- status: ASSIGNED PA
- mnt-by: IQNET-LIR-MNT
- mnt-lower: IQNET-LIR-MNT
- mnt-routes: IQNET-LIR-MNT
- created: 2013-07-04T11:04:45Z
- last-modified: 2021-11-02T12:42:14Z
- source: RIPE
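As for spotting clients like this one, here's the kind of quick log pass that surfaces them: count requests per source IP in the nginx access log and flag anything at or over the 1,000-request cap. The log path and the default combined log format are assumptions, not necessarily what my setup uses.

```python
# Sketch of how heavy hitters like the one above can be spotted: count requests
# per source IP in an nginx access log and flag anything near the request cap.
# The log path and the default "combined" log format are assumptions.
from collections import Counter
from pathlib import Path

LOG = Path("/var/log/nginx/access.log")  # assumed location
CAP = 1000

counts = Counter()
with LOG.open() as f:
    for line in f:
        # In the default combined format the first field is the client IP.
        ip = line.split(" ", 1)[0]
        counts[ip] += 1

for ip, n in counts.most_common(20):
    flag = "  <-- at/over the request cap" if n >= CAP else ""
    print(f"{n:7d}  {ip}{flag}")
```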
All of it Targeted the Indonesian Porn
Every single "bad" request that I could find pointed straight at the porn videos. This tells me that whoever launched the attack needed some way to exhaust my resources, and a Facebook post or some other distribution of my links gave them exactly that: a set of large files they could hammer with spurious requests. If this attack were motivated by anything else, they would have targeted different links, but no. They went after these ones. Hell, there were so many requests that you could see the CPU impact just from Pomf shooting back 404 responses after I yanked the files to curb bandwidth:
Indonesia has a History of Cyberattacks
If you read my transparency log, you may notice that the Indonesian Government has emailed me a few times asking for takedowns, requests which I generally refused. If you take a look at this fairly recent Reuters article, you may notice some of the same patterns. I'm not saying it was the Indonesian government. It could be someone sympathetic to the rule of law there trying to knock my site offline for violating their draconian laws. But when you factor in the complexity and scale of the attacks, you may come to the same conclusion I did: whoever this was, they were quite well-armed for this kind of action.
Other Factors
I found some common factors across all the various machines used to attack lain.la. I can't tell you what they are, as I'm using them for mitigation, but I can tell you that it's extremely odd that all the requests *look* the same despite being geographically independent. Also, the fact that all of this started at exactly the same time, instead of ramping up like the large waves of file requests I usually see, is suspicious.
What's Next
Meh, nothing. I'll keep fighting them off. All the files in question are being served right now, successfully, at reasonable speeds, despite the maxed-out network pipe. Hopefully Francisco doesn't get too mad about my quest to keep Pomf going. I do pay him a lot of money, so I'd hope not. I'll adjust my defenses and watch for new attack patterns. You can always watch my uptime portal to see the live battle.