Code yourself a date on Match.com

Web Culture — Mark Percival @ 1:59 pm

Finally, Ruby code to solve your real-world problems - getting a date.

I recently had the idea to scrape Match.com’s database and use it to profile the number of users in each age bracket. Seeing as I’ve already got a girlfriend, this is an exercise with no end goal except that of personal curiosity, but don’t let that stop you.

It seemed pretty trivial, but when I looked at the page source I realized that the form was quite nasty, and I didn’t relish the thought of cutting through the cruft. Thankfully I was able to use the Ruby Mechanize library to submit the form and save the response in an orderly fashion. This way I could poll Match.com once for each year of age, get back a results page, and parse the page count out of the page list at the top.

Using the number of pages returned I could crudely estimate how many users were signed up in the area I selected, broken down by age. Thanks to Match.com’s liberal policy of letting anyone search its users, I didn’t even have to sign up.

The main limitation is Match.com’s cap of 32 results pages per search. In some cases I had to slim down the results by filtering on race and education, as even the smallest radius (5 miles) in NYC returned more than 32 pages. Obviously this information should only be used for casual observations, as my methods varied by city.
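The arithmetic behind that estimate is roughly the following. This is only a sketch: the profiles-per-page figure is a made-up placeholder rather than anything taken from Match.com, and the names PROFILES_PER_PAGE, PAGE_CAP and estimate_users are hypothetical, introduced just for illustration. Any count that hits the 32-page cap is treated as a lower bound.

```ruby
# Hypothetical figure: the number of profiles per results page is an
# assumption for illustration, not a value taken from Match.com.
PROFILES_PER_PAGE = 10
PAGE_CAP = 32 # the page limit described above

# Turn a page count (string or integer) into a crude user estimate.
# Returns [estimate, capped]; when capped is true the estimate is
# only a lower bound, because the search hit the page limit.
def estimate_users(page_count)
  pages = page_count.to_i
  capped = pages >= PAGE_CAP
  [pages * PROFILES_PER_PAGE, capped]
end

estimate, capped = estimate_users("12")
puts "#{capped ? 'at least' : 'about'} #{estimate} users"
```

Swap in the real per-page count for your own searches before reading anything into the totals.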

First, the code - it’s pretty simple thanks to Mechanize:

module MatchDotCom
  class Scrape
    require 'rubygems'
    require 'mechanize'
    require 'logger'

    MALE = 1
    FEMALE = 2

    attr_accessor :page, :agent, :options

    def initialize(options = {})
      default_options = {
        :zip_code => 30305,
        :radius => 5, #In miles, in increments of 5
        :age_min => 18,
        :age_max => 18,
        :photos_only => false,
        :gender => MALE,
        :looking_for => FEMALE,
        :caucasian => 'M_Ethnic_02',
        :degree => 'M_Edu_04'
      }
      self.options = default_options.merge(options)
      # agent = WWW::Mechanize.new { |a| a.log = Logger.new("mech.log") }
      @agent = WWW::Mechanize.new
      @agent.user_agent_alias = 'Windows IE 7'
    end

    def pages
      self.page = @agent.get("http://www.match.com/search")
      form = page.forms.first
      form['SearchIndexSearchForm:M_LAGE_A0'] = options[:age_min]
      form['SearchIndexSearchForm:M_UAGE_A1'] = options[:age_max]
      form['SearchIndexSearchForm:GenderCode'] = options[:gender]
      form['SearchIndexSearchForm:ThemRelationship'] = options[:looking_for]
      form['SearchIndexSearchForm:POSTALCODE'] = options[:zip_code]
      form['SearchIndexSearchForm:txtLivingWithin'] = options[:radius]
      form['SearchIndexSearchForm:chkPhotosOnly'] = options[:photos_only]
      form['M_Ethnic'] = options[:caucasian]
      form['M_Edu'] = options[:degree]
      form['SearchIndexSearchForm:SubmitButton.x'] = 18
      form['SearchIndexSearchForm:SubmitButton.y'] = 5
      form.action = "/search/?ER=sessiontimeout&trackingid=0&lid=1000002"
      self.page = form.submit
      # Read the page-count link once; fall back to "0" when it is missing
      link = page.search("a[@id=lnkPageTwenty]").first
      link ? link.html : "0"
    end

  end
end

require 'yaml'

ages = 18..70
# zip_codes = {"Dunwoody"=>30338, "Midtown"=>30309, "NWAtlanta"=>30318, "Decatur"=>30030}
zip_codes = {"Chicago"=>60605}
city = "Chicago" # Use this to keep track of the file names
radius = 5
gender = 2
looking_for = 1

data_file = File.new("#{city}-match-#{gender}4#{looking_for}-#{Time.now.strftime('%y%m%d%H%M')}.yaml", "w")
data = {}

zip_codes.each do |name, zip_code|
  results = {}
  details = {"radius"=>radius, "district"=>name}
  ages.each do |age|
    match = MatchDotCom::Scrape.new({:age_min=>age, :age_max=>age, :gender=>gender, :looking_for=>looking_for,
        :zip_code=>zip_code, :photos_only=>true})
    results.merge!(age=>match.pages)
  end
  data.merge!(zip_code=>{"results"=>results, "details"=>details})
end
data_file << data.to_yaml

It’s a fairly simple task to survey your own area: just modify the zip code hash to your needs.
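Once the script has run, the YAML file can be read back and summarized. Here is a minimal sketch, assuming the file has the shape the script writes (zip code, then "results" mapping each age to a page count, then "details"). The helper names peak_age and chart are hypothetical, and the sample numbers are invented purely to show the format.

```ruby
require 'yaml'

# Find the age with the most result pages in one district's results
# hash (age => page count, as written by the scraping script).
def peak_age(results)
  results.max_by { |_age, pages| pages.to_i }.first
end

# Render a crude text bar chart, one row per age, one "#" per page.
def chart(results)
  results.sort.map { |age, pages| format("%2d %s", age, "#" * pages.to_i) }
end

# Made-up sample in the same shape as the YAML file; not real Match.com data.
sample = YAML.load(<<~SAMPLE)
  60605:
    results:
      18: "3"
      19: "5"
      20: "4"
    details:
      radius: 5
      district: Chicago
SAMPLE

sample.each do |zip_code, entry|
  puts "#{entry['details']['district']} (#{zip_code}): peak at age #{peak_age(entry['results'])}"
  puts chart(entry['results'])
end
```

To run it against a real file, replace the inline sample with YAML.load_file and the name of the file the script produced.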

And now for my own results.

As you can see, if you live in Atlanta, and you’re a guy looking to find that special lady on Match.com, you’d be best advised to shoot for around age 30, as the numbers are clearly in your favor.

Yes, that’s right, you’ve just used Ruby to improve your odds of getting a date - and that’s probably something you should keep to yourself.

Wikis and their failings

Web Culture — Tags: — Mark Percival @ 6:05 am

I’m perennially impressed with Wikipedia. Seldom does a day go by that I don’t pull the page up at least a couple of times, especially on my morning news reads. When you want to get an overview of a historical topic, it’s hard to beat.

And then there’s the RubyOnRails Wiki. It’s painfully cluttered, and each topic is strewn with various solutions, some quite dubious. So while I sometimes find a useful tidbit on there, I often just continue my search elsewhere.

What’s the difference between these two wikis that makes one so successful, and the other a slurry of half-answers? I’d surmise that the problem lies in a lack of final authority. Wikipedia has its guardians on each page, and while one can skew topics, you can’t ignore the facts. On the Rails wiki there’s no one right solution, nor a person to enforce some standard way of doing things, and therefore it gets out of hand.

I’ve also used the Facebook Developers wiki, which is actually very well organized and maintained, but maybe it’s addressing a different problem - documenting an API, not trying to solve various programming dilemmas.

What’s the answer? What makes wikis so great for some technical data, and so bad for others?

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.
(c) 2008 WebChicanery | powered by WordPress with Barecity