Baiduspider, Twiceler, and Yeti – Bad Robots!

One of the websites I manage had been seeing a tremendous rise in bandwidth usage. "Sweet!" we thought, a solid boost in traffic is a good thing. But as time went by, it kept going up and up in an unreal way - something was awry. After checking the log files, I saw that the three bots named above were hammering the website. After some Googling, I found that some or all of these robots have run amok on other websites too, and generally send little or no valuable traffic - so I just blocked them all by placing the following in my .htaccess file:

Options +FollowSymLinks
RewriteEngine on
 
# block bad bots
 
RewriteCond %{HTTP_USER_AGENT} ^Baiduspider [NC]
RewriteRule ^.*$ http://google.com/ [R,L]
 
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/5(.*)Twiceler
RewriteRule . http://www.cuill.com/your_bot_sucks [R=301]
 
RewriteCond %{HTTP_USER_AGENT} ^Yeti [NC]
RewriteRule ^.*$ http://google.com/ [R,L]
 
# end bad bots

The above is pieced together from various Googling (I wish I had kept the links so I could thank the fine folks who helped me address this). You can send the bots away to any site you like - in the first and last rules I just kicked them to Google, and in the second I sent the robot back to its own site. My bandwidth now seems to be back down to normal levels. Try them out, monitor your bandwidth and log files, and tweak as needed - good luck!

Update - December 2010

I've consolidated the code into a single rule and added two more user agents, Java and Mail.Ru:

# block bad bots
 
RewriteCond %{HTTP_USER_AGENT} ^Baiduspider [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mozilla/5(.*)Twiceler [OR]
RewriteCond %{HTTP_USER_AGENT} ^Yeti [OR]
RewriteCond %{HTTP_USER_AGENT} ^Java.* [OR]
RewriteCond %{HTTP_USER_AGENT} ^Mail.Ru.*
RewriteRule ^(.*)$ http://help.naver.com/robots/ [R,L]
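
Before deploying a list like this, it can help to check which user-agent strings the patterns would actually catch. The following is a rough Python sketch, not part of the original setup: it applies the same five patterns with Python's re module, which for simple patterns like these behaves like the regex engine Apache uses. The names BAD_BOT_PATTERNS and is_blocked are made up for this example, and the sample user-agent strings are illustrative assumptions, not taken from real log files.

```python
import re

# The same patterns as the RewriteCond lines above. No [NC] flag is set
# there, so matching is case-sensitive here as well.
BAD_BOT_PATTERNS = [
    r"^Baiduspider",
    r"^Mozilla/5(.*)Twiceler",
    r"^Yeti",
    r"^Java.*",
    r"^Mail.Ru.*",
]
COMPILED = [re.compile(p) for p in BAD_BOT_PATTERNS]


def is_blocked(user_agent):
    """Return True if any of the anchored patterns matches the user agent."""
    return any(p.search(user_agent) for p in COMPILED)


# Illustrative user-agent strings (assumed for this example).
samples = [
    "Baiduspider+(+http://www.baidu.com/search/spider.htm)",
    "Mozilla/5.0 (Twiceler-0.9 http://www.cuil.com/twiceler/robot.html)",
    "Yeti/1.0 (NHN Corp.; http://help.naver.com/robots/)",
    "Java/1.6.0_22",
    "Mozilla/5.0 (compatible; Baiduspider/2.0)",
    "Mozilla/5.0 (Windows NT 6.1; rv:2.0) Gecko/20100101 Firefox/4.0",
]

for ua in samples:
    print("BLOCK" if is_blocked(ua) else "allow", ua)
```

Note that a bot which puts its name in the middle of the string, as in the fifth sample, slips past the ^-anchored patterns. Dropping the ^ from a pattern would catch it, at the cost of a broader match.
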

Posted in Apache

Drupal – Select List “default_value”

After beating my head against the desk for the past hour, I finally figured out that, at least in my case, setting a default value on a plain select list only worked with "#value" and not "#default_value". So, for example, say I have an array of banks consisting of the bank id and the bank name, like so:

$banks = array(14 => "bank one", 22 => "bank two");

I then want to construct a simple select form element. This WILL NOT work:

$form['drop'] = array(
  '#id' => "bank_drop",
  '#type' => 'select',
  '#title' => 'test',
  '#options' => $banks,
  '#default_value' => 22,  // will NOT work!
);

But this works:

$form['drop'] = array(
  '#id' => "bank_drop",
  '#type' => 'select',
  '#title' => 'test',
  '#options' => $banks,
  '#value' => 22,  // works!
);

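This was on Drupal 6.15. Be aware that "#value" fixes the element's value, so the Form API will not overwrite it with whatever the user submits.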

Hope that saves someone some headache :)

Posted in Drupal

Sizing and Positioning Fancybox

You've probably seen Fancybox all over the place. In a way, it's the living successor of ThickBox and Lightbox.

I was recently on a project where I had to both size and position it. Sizing it is easy. Say you have a link tag with an id of opener, like this:

<a id="opener" href="http://www.google.com">Google</a>

To create and size the fancybox that will open when it's clicked, you'd just add the following JavaScript to your page:

<script type="text/javascript" charset="utf-8">
$(document).ready(function(){
  $("#opener").fancybox({
    'width': 390,
    'height': 400
  });
});
</script>

Note that pixels are assumed: pass plain numbers rather than something like "390px" as you would in CSS, or it won't work.

So, above we have defined a fancybox that is 390 pixels wide, and 400 pixels high. Next we need to position it.

(Note this is based on Fancybox 1.3 - I can't say if it'll work with other versions.)

Fancybox puts everything in a layer with an id of "fancybox-wrap" - so we can easily manipulate it with basic css, like so:

#fancybox-wrap {
  margin: -70px 0 0 290px;
}

You can do the same with padding, maybe even positioning.

Hope that helps someone!

Posted in jquery

Drupal: Redirecting a User After Login

This had me beating my head on the desk a bit. I'm coding up a Drupal 6 module, and wanted to redirect the user after logging in, depending on what role they belong to.

In this example, let's say I have a role called "treasurer" - and anyone that's logged in and not a treasurer is an administrator. In my module (called mymod here), I knew that hook_user would do the trick somehow. Note that in hook_user, the $op variable contains what operation is being performed, and "login" is an option we can use here. So, this will fire right after the login form is processed and the user is loaded. The $account variable contains the entire $user object being worked on (so we can use it instead of a "global $user" declaration). Here's what I originally wanted to do but it does not work:

function mymod_user($op, &$edit, &$account, $category = NULL) {
  switch ($op) {
    case 'login':
      if (in_array('treasurer', $account->roles)) {
        drupal_goto('user/' . $account->uid);
      } else {
        drupal_goto('admin/users');
      }
      break;
    default:
      // nothing
      break;
  }
}

We can't use drupal_goto here, surprisingly. But after a little Googling, here is the way to do it - this works:

function mymod_user($op, &$edit, &$account, $category = NULL) {
  switch ($op) {
    case 'login':
      if (in_array('treasurer', $account->roles)) {
        $_REQUEST['destination'] = 'user/' . $account->uid;
      } else {
        $_REQUEST['destination'] = 'admin/user/user';
      }
      break;
    default:
      // nothing
      break;
  }
}

In short, we just need to set $_REQUEST['destination'] to whatever we want. This works because the redirect that follows the login form goes through drupal_goto(), which gives $_REQUEST['destination'] priority over the path it was passed. In the above example, if the user is a treasurer, we send them to their user account page. Otherwise, we know that it's an administrator (since that's the only other role I set up), and we show the admin all the users. Tailor to your needs. Note that if your destination has a URL alias set up, this will automatically fetch the alias too. Cheers.

Posted in Drupal

Tweeting from Launchbar

I've been checking out Launchbar, thinking of replacing my beloved Quicksilver because supposedly it is not being worked on much anymore. While viewing these very helpful Launchbar video tutorials, I noticed one of them was about tweeting from Launchbar. For firing off a quick tweet, this is very handy.

However, I had problems getting the script mentioned in the tutorial to work. After some more searching and tweaking, I came up with this script, which works well enough for me and gives Launchbar credit for the tweet. That makes you look super cool/geeky, as most folks will probably be like "what the hell is this Launchbar???"

To use this script, open up Script Editor, paste it in, compile it, and save it into ~/Library/Application Support/LaunchBar/Actions, where Launchbar should pick it up. From there, just use it as shown in the video. Note this assumes you have Growl installed - which you should ;)

using terms from application "LaunchBar"
    on handle_string(tweet)
        -- Init
        my growlRegister()
        set charcount_max to 140
        set charcount_tweet to (count characters of tweet)
 
        -- Check message length
        if charcount_tweet ≤ charcount_max then
            -- Get credentials for twitter.com
            tell application "Keychain Scripting"
                set twitter_key to first Internet key of current keychain whose server is "twitter.com"
                set twitter_login to quoted form of (account of twitter_key & ":" & password of twitter_key)
            end tell
 
            set twitter_status to quoted form of ("source=launchbarat&status=" & tweet)
            try
                -- Send tweet
                do shell script "curl --user " & twitter_login & " --data-binary " & twitter_status & " http://twitter.com/statuses/update.json"
                -- Display success message
                growlNotify("Tweet Sent:", tweet)
            on error
                -- Display error message
                growlNotify("Error Tweeting.", "Try again?")
            end try
        else
            -- Tweet is too long
            growlNotify("Tweet Too Long", "Tweet is " & charcount_tweet & " characters long. The maximum length is " & charcount_max & " characters.")
        end if
    end handle_string
end using terms from
 
using terms from application "GrowlHelperApp"
    -- Register Growl
    on growlRegister()
        tell application "GrowlHelperApp"
            register as application "Tweet" all notifications {"Alert"} default notifications {"Alert"} icon of application "Script Editor.app"
        end tell
    end growlRegister
 
    -- Notify using Growl
    on growlNotify(grrTitle, grrDescription)
        tell application "GrowlHelperApp"
            notify with name "Alert" title grrTitle description grrDescription application name "Tweet"
        end tell
    end growlNotify
end using terms from

Posted in OS X

Finding Duplicates in MySQL

I get quite a few Excel files from clients that need to be cleaned up and inserted into MySQL. Sometimes the import goes smoothly, sometimes it doesn't. This last time, I had 3,741 rows in an Excel file, but somehow wound up with 3,751 rows in the database - what went wrong? Don't know, and don't have time to figure it out - so I just used this query to isolate all the duplicates.

My database table consists of nothing more than an id, a part number (number), a description, and a price. Here's the query that located all duplicate part numbers - from there I was able to easily delete them:

SELECT NUMBER, 
 COUNT(NUMBER) AS NumOccurrences
FROM part_prices_new
GROUP BY NUMBER
HAVING ( COUNT(NUMBER) > 1 )
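
To try the query without touching a real database, here is a small self-contained Python sketch. It uses SQLite (through the standard sqlite3 module) as a stand-in for MySQL, and the sample rows are made up for illustration; only the table and column names come from the description above. The DELETE at the end, which keeps the lowest id for each part number, is just one possible way to do the cleanup and not necessarily how it was done here.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part_prices_new ("
    "id INTEGER PRIMARY KEY, number TEXT, description TEXT, price REAL)"
)

# Made-up sample data: A-100 appears twice, C-300 three times.
rows = [
    ("A-100", "Widget", 1.50),
    ("A-100", "Widget", 1.50),
    ("B-200", "Gasket", 0.75),
    ("C-300", "Bolt", 0.10),
    ("C-300", "Bolt", 0.10),
    ("C-300", "Bolt", 0.10),
]
conn.executemany(
    "INSERT INTO part_prices_new (number, description, price) VALUES (?, ?, ?)",
    rows,
)

# The duplicate-finding query from the post (ORDER BY added for stable output).
duplicates = conn.execute(
    "SELECT number, COUNT(number) AS NumOccurrences "
    "FROM part_prices_new "
    "GROUP BY number "
    "HAVING COUNT(number) > 1 "
    "ORDER BY number"
).fetchall()
print(duplicates)

# One possible cleanup: keep the lowest id per part number, delete the rest.
conn.execute(
    "DELETE FROM part_prices_new "
    "WHERE id NOT IN (SELECT MIN(id) FROM part_prices_new GROUP BY number)"
)
remaining = conn.execute("SELECT COUNT(*) FROM part_prices_new").fetchone()[0]
print(remaining)
```

One difference to keep in mind: MySQL does not allow a DELETE whose subquery selects from the table being deleted from, so there the inner SELECT would need to be wrapped in a derived table (or rewritten as a join).
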

Posted in mysql