The Trouble with Trackbacks
April 21, 2005 Whiterose Admin
The Mighty Whiterose Blogging Server is, in real life, a lowly PowerMac G3/350 with 512MB of memory. It generally has a few hundred MB available to handle user requests as they come in.

Scumbag Maggots selling Google PageRank do so by making it look like your site (which updates frequently) links to their penis enlargement clients. They can do this with comments or with trackbacks, both nice features of Movable Type that extend the web-based conversation on a topic posted here. MT can be extended to use blacklists, moderation, nofollow tags and other mechanisms to prevent the people who are selling your web space from getting any benefit out of it. However, the Scumbag Maggots put up their ads indiscriminately: they don't care if some or even most sites are protected as long as some aren't. To get as high a return as possible, they use automated spamming scripts coming from random IPs (which they may or may not have legitimate access to).

What is the intersection of the size of Whiterose and the escalating war between spammers and anti-spammers? Trackback floods from spammers can easily spawn 50 processes on Whiterose at 5 MB each, about 250 MB in total, and cause the server to stop responding to new requests. We tried several options, including rebooting the server, stopping the web server, and running under mod_perl, but nothing really solved the issue.

What did work for us was the following script, which we're leaving running on Whiterose.org until we come up with a better solution. The main loop does the following:
  1. gets the list of running processes for user www
  2. pares that list to only the ones with mt-tb.cgi in them
  3. trims the remaining list to only the process IDs
  4. counts the running mt-tb.cgi processes
  5. compares that to my spam-attack threshold
  6. kills every trackback process during a spam attack
  7. goes to sleep for a variable amount of time
  8. repeats
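Steps 1 through 4 can be tried on their own, without killing anything. This is only a minimal sketch, assuming a `ps` whose first output column is the PID. `tb_pids` is a hypothetical helper name, and the sample listing (with made-up PIDs) stands in for real `ps -U www` output so the pipeline can be checked safely.

```bash
#!/bin/bash
# tb_pids: hypothetical helper. Reads ps-style lines on stdin and prints
# the PID (first column) of every line that mentions mt-tb.cgi.
tb_pids() {
    awk '/mt-tb\.cgi/ { print $1 }'
}

# Sample listing standing in for `ps -U www` output (PIDs are made up).
sample='  PID  TT  STAT      TIME COMMAND
  301  ??  S      0:00.10 perl mt-tb.cgi
  302  ??  S      0:00.12 perl mt-tb.cgi
  303  ??  S      0:01.50 httpd'

# Steps 1-3: collect the PIDs; step 4: count them.
pids=( $(printf '%s\n' "$sample" | tb_pids) )
echo "${#pids[@]} mt-tb.cgi processes found: ${pids[*]}"
```

On a live system the input would come from `ps -U www | tb_pids` instead of the sample text.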
This approach won't work if you're running under mod_perl (there is no separate mt-tb.cgi process to find), or if you can't run a shell script as a user who can kill processes owned by the web server. It will also kill "real" trackbacks that come in during an attack. For us, it's the difference between an unreachable web server and one that may be slower but still works. Comments and suggestions welcome; it's my first useful shell script effort.
#! /bin/bash
# kill-tb.bash
# (c) 2005 Michael Croft
# http://www.whiterose.org
 
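starttime=$(date)   # start time as a single string
sleeptime=0
killcount=0
MaxProc=15          # more mt-tb.cgi processes than this counts as a spam attack
while true
do
# PIDs of user www's mt-tb.cgi processes; [m] keeps grep from matching itself
ProcArray=( $(ps -U www | grep '[m]t-tb\.cgi' | awk '{print $1}') )
ProcCount=${#ProcArray[@]}
echo "$(date) : ${ProcCount} mt-tb.cgi processes found"
if (( ProcCount > MaxProc ))
then
 # Under attack: kill every trackback process, then check again soon
 for i in "${ProcArray[@]}"
 do
  killcount=$((killcount + 1))
  kill -9 "$i"
  echo "$(date) : killing process ${i} -- $killcount processes killed since $starttime"
 done
 sleeptime=40
elif (( ProcCount > 1 ))
then
 # Some trackback activity: keep a closer watch
 sleeptime=20
else
 # Quiet: check rarely
 sleeptime=150
fi
sleep $sleeptime
done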
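The post doesn't say how the threshold of 15 was picked, so the following is only a rough sketch of what the numbers mean in memory terms, assuming the figure of about 5 MB per trackback process given earlier. `tb_mem_mb` is a hypothetical helper, not part of the script.

```bash
#!/bin/bash
# tb_mem_mb: hypothetical helper. Estimates the memory used by a number of
# trackback processes at a given size per process (both in whole MB).
tb_mem_mb() {
    local count=$1 per_proc_mb=$2
    echo $(( count * per_proc_mb ))
}

echo "flood of 50 processes: $(tb_mem_mb 50 5) MB"    # 250 MB
echo "threshold of 15:       $(tb_mem_mb 15 5) MB"    # 75 MB
```

A 50-process flood lands in the same range as the few hundred MB the server has spare, while the threshold trips at roughly a third of that.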
.:Posted by Michael on April 21, 2005 10:37 AM:.

Can you explain trackbacks to me?

.:Posted by Thomas on May 8, 2005 1:37 AM:.