Beautiful Soup Tutorial Featuring Select Boxes

This example deals primarily with pulling option values out of select boxes, and should be much easier to understand than my last, somewhat convoluted Beautiful Soup example. However, much like the previous example, this one stemmed from a “real world” need. I’m talking about trying to find a bargain vacation on the super slow website, AppleVacations.com.

The issue at hand is that the price differs for nearly every departure date, and they give you many to choose from in a SELECT element. You have to select each date, one at a time, from the list box to find out what the price is. If the site were snappy, this would be annoying at best. But the fact that the site is slow as dirt makes it a serious waste of time if you want to check them all out:

[Image: apple-vac.jpg]

So using Beautiful Soup, I wrote an 11-line Python script that grabs the value of each option in this list and opens the corresponding page in a new browser window or tab. While all the pages are loading, I can do something useful with my time, like contemplate whether my bathing suit can survive another year.

This was a good exercise in scraping info out of <option> values and in identifying a <select> list by its name attribute. The basic code is below, and here’s a link to the code with better comments.

import urllib2, webbrowser
from BeautifulSoup import BeautifulSoup

select_list_page = 'http://the-apple-page-with-the-select-element'

page = urllib2.urlopen(select_list_page)
soup = BeautifulSoup(page)

# This is basically their template page, sans the id that
# identifies the package (at very end of the URL)

url = 'http://the-page-that-lists-the-price-without-the-ending-id'

# Now we identify the select list that has the dates
# Easy enough, the name of this select list is also selectedBatchNumber

select = soup.find('select',{'name':"selectedBatchNumber"})
option_tags = select.findAll('option')

# The 1st one has no value, so strip it out

option_tags = option_tags[1:]

# Now we loop through each value
# webbrowser.open will attempt to identify the default browser,
# and open each page in a new window/tab

for option in option_tags:
    webbrowser.open(url + option['value'].strip())

# Best to get a cup of coffee while all the pages load
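
The script above is written for Python 2 and BeautifulSoup 3. If you want to try the idea without a network connection or any third-party install, here is a minimal sketch of the same extraction using Python 3's standard-library html.parser instead of Beautiful Soup. The sample markup, the BASE_URL placeholder, and the names SelectOptionParser and option_values are all made up for illustration; only the select name selectedBatchNumber comes from the script above. It also skips options by empty value rather than by position, which is a bit less brittle than slicing off the first one.

```python
from html.parser import HTMLParser


class SelectOptionParser(HTMLParser):
    """Collect the value of every <option> inside one named <select>."""

    def __init__(self, select_name):
        super().__init__()
        self.select_name = select_name
        self.inside_target = False  # True while inside the wanted <select>
        self.values = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "select" and attrs.get("name") == self.select_name:
            self.inside_target = True
        elif tag == "option" and self.inside_target:
            value = (attrs.get("value") or "").strip()
            if value:  # skip the empty "choose a date" placeholder
                self.values.append(value)

    def handle_endtag(self, tag):
        if tag == "select":
            self.inside_target = False


def option_values(html, select_name):
    """Return the non-empty option values of the named select list."""
    parser = SelectOptionParser(select_name)
    parser.feed(html)
    parser.close()
    return parser.values


# Made-up markup standing in for the real page (an assumption)
SAMPLE_HTML = """
<select name="otherList"><option value="x">X</option></select>
<select name="selectedBatchNumber">
  <option value="">Choose a date</option>
  <option value=" 1001 ">June 1</option>
  <option value="1002">June 8</option>
</select>
"""

# Placeholder for the price page URL without its trailing id (an assumption)
BASE_URL = "http://example.com/price?id="

urls = [BASE_URL + v for v in option_values(SAMPLE_HTML, "selectedBatchNumber")]
print(urls)
```

From here, passing each entry of urls to webbrowser.open would reproduce the behavior of the original loop. If you would rather stay with Beautiful Soup on Python 3, the rough equivalents are `from bs4 import BeautifulSoup`, `urllib.request.urlopen`, and `find_all` in place of `findAll`.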
