Code yourself a date on Match.com
Finally, Ruby code to solve your real-world problems - getting a date.
I recently had the idea to scrape Match.com’s database and use it to profile the number of users in each age bracket. Seeing as I’ve already got a girlfriend, this is an exercise with no end goal except that of personal curiosity, but don’t let that stop you.
It seemed pretty trivial, but when I looked at the page source I realized that the search form was quite nasty, and I didn’t relish the thought of cutting through the cruft. Thankfully I was able to use the Ruby Mechanize library to submit the form and save the response in an orderly fashion. This way I could poll Match.com once for each year of age, get back a results page, and parse out the page list at the top.
Using the number of pages returned, I could crudely estimate how many users were signed up in the area I selected, broken down by age. Thanks to Match.com’s liberal policy of letting anyone search its users, I didn’t even have to sign up.
The main limitation is Match.com’s cap of 32 result pages. In some cases I had to slim down the results by filtering on race and education, as even the smallest 5 mile radius in NYC returned more than 32 pages. Obviously this information should only be used for casual observations, as my methods varied by city.
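To make the page-count idea concrete, here is a minimal sketch of turning a page count into a rough head count while flagging results that hit the 32-page cap. The `estimate_users` helper and the `per_page` value are my own assumptions for illustration; the scraper only returns the page count, so check how many profiles a results page actually shows before trusting the numbers.

```ruby
# Hypothetical helper, not part of the scraper itself.
PAGE_CAP = 32 # Match.com's page limit, as described above

# pages:    page count as returned by the scraper (a string such as "12")
# per_page: ASSUMED number of profiles per results page
def estimate_users(pages, per_page)
  count = pages.to_i
  {
    :estimate => count * per_page,  # crude head count
    :capped   => count >= PAGE_CAP  # true means the real number may be higher
  }
end

p estimate_users("12", 20)
p estimate_users("32", 20)
```

Any age bracket that comes back flagged as capped is a floor rather than an estimate, which is why narrowing the search was necessary in dense areas.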
First, the code - it’s pretty simple thanks to Mechanize:
require 'rubygems'
require 'mechanize'
require 'logger'

module MatchDotCom
  class Scrape
    MALE = 1
    FEMALE = 2

    attr_accessor :page, :agent, :options

    def initialize(options = {})
      default_options = {
        :zip_code => 30305,
        :radius => 5, # In miles, in increments of 5
        :age_min => 18,
        :age_max => 18,
        :photos_only => false,
        :gender => MALE,
        :looking_for => FEMALE,
        :caucasian => 'M_Ethnic_02',
        :degree => 'M_Edu_04'
      }
      self.options = default_options.merge(options)
      # To log requests: @agent = WWW::Mechanize.new { |a| a.log = Logger.new("mech.log") }
      # Mechanize 1.0 and later drop the WWW:: namespace; use Mechanize.new there.
      @agent = WWW::Mechanize.new
      @agent.user_agent_alias = 'Windows IE 7'
    end

    # Submits the search form and returns the contents of the
    # lnkPageTwenty pager link, or "0" when that link is absent.
    def pages
      self.page = @agent.get("http://www.match.com/search")
      form = page.forms.first
      form['SearchIndexSearchForm:M_LAGE_A0'] = options[:age_min]
      form['SearchIndexSearchForm:M_UAGE_A1'] = options[:age_max]
      form['SearchIndexSearchForm:GenderCode'] = options[:gender]
      form['SearchIndexSearchForm:ThemRelationship'] = options[:looking_for]
      form['SearchIndexSearchForm:POSTALCODE'] = options[:zip_code]
      form['SearchIndexSearchForm:txtLivingWithin'] = options[:radius]
      form['SearchIndexSearchForm:chkPhotosOnly'] = options[:photos_only]
      form['M_Ethnic'] = options[:caucasian]
      form['M_Edu'] = options[:degree]
      # Image submit buttons send the click coordinates
      form['SearchIndexSearchForm:SubmitButton.x'] = 18
      form['SearchIndexSearchForm:SubmitButton.y'] = 5
      form.action = "/search/?ER=sessiontimeout&trackingid=0&lid=1000002"
      self.page = form.submit
      link = page.search("a[@id=lnkPageTwenty]").first
      link ? link.html : "0"
    end
  end
end
require 'yaml'

ages = 18..70
# zip_codes = {"Dunwoody"=>30338, "Midtown"=>30309, "NWAtlanta"=>30318, "Decatur"=>30030}
zip_codes = {"Chicago"=>60605}
city = "Chicago" # Use this to keep track of the file names
radius = 5
gender = 2      # MatchDotCom::Scrape::FEMALE
looking_for = 1 # MatchDotCom::Scrape::MALE
data_file = File.new("#{city}-match-#{gender}4#{looking_for}-#{Time.now.strftime('%y%m%d%H%M')}.yaml", "w")
data = {}
zip_codes.each do |name, zip_code|
  results = {}
  details = {"radius"=>radius, "district"=>name}
  ages.each do |age|
    # One search per year of age; pass the radius so it matches what details records
    match = MatchDotCom::Scrape.new({:age_min=>age, :age_max=>age, :gender=>gender, :looking_for=>looking_for,
                                     :zip_code=>zip_code, :radius=>radius, :photos_only=>true})
    results.merge!(age=>match.pages)
  end
  data.merge!(zip_code=>{"results"=>results, "details"=>details})
end
data_file << data.to_yaml
data_file.close
It’s a fairly simple task to survey your own area: just modify the zip code hash to suit your needs.
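Once the script has written its YAML file, picking out the busiest age takes only a few lines. The sketch below is an illustration rather than part of the original script: `peak_age` is a hypothetical helper, and the sample hash is made-up data in the same shape the script writes (zip code, then "results" keyed by age, then "details"), round-tripped through YAML so it runs without touching Match.com.

```ruby
require 'yaml'

# Made-up sample in the same shape the script writes; NOT real Match.com data.
sample = {
  60605 => {
    "results" => { 18 => "3", 19 => "7", 20 => "5" },
    "details" => { "radius" => 5, "district" => "Chicago" }
  }
}
data = YAML.load(sample.to_yaml)

# Hypothetical helper: returns the [age, pages] pair with the highest page count.
# Page counts are stored as strings, so convert before comparing.
def peak_age(results)
  results.max_by { |_age, pages| pages.to_i }
end

data.each do |zip_code, entry|
  age, pages = peak_age(entry["results"])
  puts "#{entry['details']['district']} (#{zip_code}): most pages at age #{age} (#{pages})"
end
```

With a real run, replace the sample with `YAML.load_file` pointed at whatever file name the script generated.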
And now for my own results.
As you can see, if you live in Atlanta and you’re a guy looking to find that special lady on Match.com, you’d be best advised to shoot for around age 30, as the numbers are clearly in your favor.
Yes, that’s right, you’ve just used Ruby to improve your odds of getting a date - and that’s probably something you should keep to yourself.