Online Dating: Analyzing the Algorithms of Attraction
Robert L. Mitchell, Computerworld
Rather than hang out in bars or hope that random dates worked out, Joe, a 34-year-old aerospace engineer, signed up for eHarmony.com, an online dating service that uses detailed profiles, proprietary matching algorithms and a tightly controlled communications process to help people find their perfect soul mate.
Over a three-month period last fall, Joe found 500 people who appeared to fit his criteria. He initiated contact with 100 of them, corresponded with 50 and dated three before finding the right match. He's now happily in a relationship, and although he was skeptical at first, he says high tech played a big role in his success.
Internet dating sites are the love machines of the Web, and they're big business. eHarmony and similar sites drew 22.1 million unique visitors during just one month, December 2008, according to comScore Media Metrix.
And unlike many social networking sites, they actually make money -- the top sites bring in hundreds of millions per year, mostly in subscription fees.
These online dating services run on a curious mix of technology, science (some say pseudoscience), alchemy and marketing. Under the covers, they combine large databases with business intelligence, psychological profiling, matching algorithms and a variety of communications technologies (is your online avatar ready for a little virtual dating?) to match up lonely singles -- and to convert one-time visitors into paying monthly subscribers.
All is not chocolates and roses online, however. Security is one big challenge for e-dating services, which can attract pedophiles, sexual predators, scammers, spammers and plain old liars -- most notably, people who say they're single when in fact they're married. And sticky questions have yet to be answered over what rights such sites have to your personal information -- how they use it to market other services to you, if and how they share it with advertisers, and how long they store it after you've moved on.
Finally, there's the biggest question of all -- do these tech-driven, algorithm-heavy sites work any better to help people find true love than the local bar, church group or chance encounter in the street?
Armed with these questions, a passably decent head shot, and a very patient wife, I set out to discover what's under the covers in the world of online dating.
The Business Model Behind Online Dating
A well-oiled Internet dating machine can generate well in excess of $200 million a year in a market that's expected to top $1.049 billion in 2009 -- only gaming and digital music sites generate higher revenues -- and is expected to grow at a rate of 10% annually, according to Forrester Research.
Most popular online dating sites in 2008
| Rank and site | Market share |
| 3. Yahoo Personals | 5.21% |
| 8. Date Hookup | 2.89% |
Source: Hitwise. Market share numbers are based on percentage of all visits to U.S. sites in the online dating category, averaged over a 12-month period.
Most online dating sites generate the bulk of that revenue from subscriptions, although free, advertising-supported sites are starting to gain some ground.
Most dating sites allow users to sign up and create a profile for free.
Before communicating with matches, however, visitors must sign on as a paying member.
To succeed, a site needs to get several things right, as the steps below describe.
The battle isn't over once a service has its inventory in place and has paying customers. The business needs to keep priming the pump to bring on new subscribers because the typical customer -- one of the 10% who actually pay -- stays on less than three months.
But one man's folly is another man's fortune: A large percentage of customers fall off the love wagon after finding their "one true love." They keep coming back over and over again, producing a revenue stream that has a very long tail, says Herb Vest, CEO and founder of the dating site True.com.
Step 1: A perfect match, served up fast
Online dating sites take two basic approaches to provide users with matches.
Online personals services such as Yahoo Personals (which costs $29.99 for one month, $59.97 for three months or $95.94 for six months) are glorified search engines -- big, searchable databases. Users fill out a short profile with check-box items and short descriptions about themselves.
They then narrow down the search by filtering prospects using criteria such as gender, ZIP code, race, religion, marital status and whether or not a person is a smoker. Users filter through the results themselves, deciding on their own which prospects to pursue.
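The search-and-filter model amounts to applying a set of user-chosen criteria to a table of profiles. A minimal Python sketch of that idea follows; the `Profile` fields, the `search` helper and the sample data are all hypothetical and not taken from any real site.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    name: str
    gender: str
    zip_code: str
    religion: str
    smoker: bool


def search(profiles, **criteria):
    """Return the profiles whose fields equal every supplied criterion."""
    return [
        p for p in profiles
        if all(getattr(p, field) == wanted for field, wanted in criteria.items())
    ]


# Hypothetical sample data, for illustration only.
profiles = [
    Profile("A", "F", "02139", "none", False),
    Profile("B", "F", "02139", "none", True),
    Profile("C", "M", "10001", "none", False),
]

matches = search(profiles, gender="F", zip_code="02139", smoker=False)
print([p.name for p in matches])
```

The site only narrows the list; deciding which of the remaining prospects to pursue is still left to the user.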
The "scientific" matching services, such as eHarmony (which costs $59.95 for one month, $119.85 for three or $179.70 for six), PerfectMatch and Chemistry.com, attempt to identify the most compatible matches for the user by asking anywhere from a few dozen to several hundred questions. The services then assemble a personality profile and run it through an algorithm that ranks users within a set of predefined categories; from there, the system produces a list of appropriate matches.
Some sites take a hybrid approach. PerfectMatch.com, for example, issues recommended picks but also lets customers browse the "inventory" for themselves.
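The subscription prices quoted above for Yahoo Personals and eHarmony reward longer commitments. A short Python calculation of the effective monthly rate for each plan makes the discount visible; the prices are the ones cited in the article, while the `monthly_rates` helper is just an illustrative name.

```python
# Plan prices quoted in the article: (months, total price in dollars).
plans = {
    "Yahoo Personals": [(1, 29.99), (3, 59.97), (6, 95.94)],
    "eHarmony": [(1, 59.95), (3, 119.85), (6, 179.70)],
}


def monthly_rates(tiers):
    """Map each plan length to its effective monthly price, rounded to cents."""
    return {months: round(total / months, 2) for months, total in tiers}


for site, tiers in plans.items():
    print(site, monthly_rates(tiers))
```

On these figures, a six-month plan costs roughly half as much per month as a one-month plan at either site.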
The technology that powers these dating sites ranges from incredibly simple to incredibly complicated. Unsurprisingly, eHarmony has one of the most sophisticated data centers. Joseph Essas, vice president of technology, says the company stores 4 terabytes of data on some 20 million registered users, each of whom has filled out a 400-question psychological profile (eHarmony's founder is a clinical psychologist).
The company uses proprietary algorithms to score that data against 29 "dimensions of compatibility" -- such as values, personality styles, attitudes and interests -- and match up customers with the best possible prospects for a long-term relationship.
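eHarmony does not publish its algorithms, so the following is only a generic illustration of scoring two profiles across named dimensions and ranking the candidates. The four dimension names come from the article; the 0-to-1 trait scale, the weights and the similarity formula are assumptions made for this sketch.

```python
def compatibility(a, b, weights):
    """Weighted average of per-dimension similarity; result lies in [0, 1]."""
    total_weight = sum(weights.values())
    score = sum(w * (1 - abs(a[dim] - b[dim])) for dim, w in weights.items())
    return score / total_weight


def rank_matches(user, candidates, weights, top_n=3):
    """Return candidate names sorted from most to least compatible."""
    scored = sorted(
        candidates.items(),
        key=lambda item: compatibility(user, item[1], weights),
        reverse=True,
    )
    return [name for name, _ in scored[:top_n]]


# Hypothetical weights and trait scores, each trait on a 0-to-1 scale.
weights = {"values": 2.0, "personality": 1.0, "attitudes": 1.0, "interests": 1.0}
user = {"values": 0.9, "personality": 0.5, "attitudes": 0.7, "interests": 0.2}
candidates = {
    "p1": {"values": 0.9, "personality": 0.5, "attitudes": 0.7, "interests": 0.2},
    "p2": {"values": 0.1, "personality": 0.5, "attitudes": 0.7, "interests": 0.2},
    "p3": {"values": 0.8, "personality": 0.4, "attitudes": 0.6, "interests": 0.9},
}

print(rank_matches(user, candidates, weights))
```

Weighting `values` more heavily than `interests` reflects one possible design choice: a mismatch in hobbies costs less than a mismatch in core values.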
A giant Oracle 10g database spits out a few preliminary candidates immediately after a user signs up, to prime the pump, but the real matching work happens later, after eHarmony's system scores and matches up answers to hundreds of questions from thousands of users. The process requires just under 1 billion calculations that are processed in a giant batch operation each day. These MapReduce operations execute in parallel on hundreds of computers and are orchestrated using software written for the open-source Hadoop software platform.
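The MapReduce pattern mentioned here splits a batch job into a map phase that emits keyed records and a reduce phase that combines the records for each key. The sketch below imitates that shape in plain, single-process Python; it does not use the Hadoop API, and the toy `score` function and sample answers are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def score(a, b):
    """Toy pair score: number of questions answered identically."""
    return sum(1 for x, y in zip(a, b) if x == y)


def map_phase(answers):
    """Emit (user, (other_user, score)) records for every unordered pair."""
    for (u, a), (v, b) in combinations(sorted(answers.items()), 2):
        s = score(a, b)
        yield u, (v, s)
        yield v, (u, s)


def reduce_phase(records):
    """Group records by user and keep each user's highest-scoring partner."""
    grouped = defaultdict(list)
    for user, pair in records:
        grouped[user].append(pair)
    return {user: max(pairs, key=lambda p: p[1])[0] for user, pairs in grouped.items()}


# Hypothetical questionnaire answers.
answers = {
    "ann": [1, 2, 3, 4],
    "bob": [1, 2, 3, 0],
    "cat": [0, 0, 3, 4],
}

best = reduce_phase(map_phase(answers))
print(best)
```

Because each pair is scored independently in the map phase, the work can be spread across many machines, which is what makes the pattern a natural fit for a large daily batch.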
Once matches are sent to users, the users' actions and outcomes are fed back into the model for the next day's calculations. For example, if a customer clicked on many matches that were at the outer edge of his or her geographical range -- say, 25 miles away -- the system would assume distance wasn't a deal-breaker and next offer more matches that were just a bit farther away.
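The distance example can be sketched as a simple feedback rule: if enough of a user's clicks fall near the limit of the current search radius, widen the radius for the next batch. The article does not say how eHarmony implements this, so the `adjust_radius` function and its thresholds (`edge_fraction`, `threshold`, `step`) are hypothetical.

```python
def adjust_radius(current_miles, clicked_distances,
                  edge_fraction=0.8, threshold=0.5, step=1.2):
    """Widen the search radius when most clicks land near its limit.

    A click counts as "near the limit" if its distance is at least
    edge_fraction of the current radius. All thresholds are assumptions.
    """
    if not clicked_distances:
        return current_miles
    near_limit = [d for d in clicked_distances if d >= edge_fraction * current_miles]
    if len(near_limit) / len(clicked_distances) >= threshold:
        return current_miles * step
    return current_miles


# Three of four clicks are 20+ miles away, so the 25-mile radius grows.
print(adjust_radius(25, [22, 24, 25, 5]))
# Only one of three clicks is near the limit, so the radius is unchanged.
print(adjust_radius(25, [3, 5, 24]))
```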
"Our biggest challenge is the amount of data that we have to constantly score, move, apply and serve to people, and that is fluid," Essas says. To that end, the architecture is designed to scale quickly to meet growth and demand peaks around major holidays. The highest demand comes just before Valentine's Day. "Our demand doubles, if not quadruples," Essas says.
[Table: Online dating site visitors, snapshot for November 2008. Source: comScore Media Metrix]
PerfectMatch.com, which claims to have 5 million members, uses a matching algorithm, but its psychological test is shorter than that required by eHarmony. "We wanted to take the basic concept of the Myers-Briggs indicator and apply that to relationships," says Founder and CEO Duane Dahl. The core architecture of the system consists of five front-end Web servers and a large, back-end SQL Server database, plus a variety of servers that handle messaging, marketing and other functions. The matching process is immediate.
True.com also offers "scientific compatibility" matching based on how users answer about 200 questions. The site uses about 200 servers, including a 64-bit, 32-processor Unisys server running Microsoft SQL Server. The matching algorithm's calculations are performed on an array of 64-bit servers that hold a compressed version of the entire multi-terabyte database in memory to facilitate fast matching. "The system can shoot back [matches] with little or no delay," says CEO Vest.
On the other end of the spectrum, Plentyoffish.com's philosophy is to keep it simple. The service focuses on searching and filters: It uses a short questionnaire, and while it does offer some matching capabilities if users want them, CEO Markus Frind says he doesn't promote them -- and he is disdainful of the complex matching algorithms offered by some competitors.
The business operates on just three Web servers, five messaging servers and five database servers (the entire database is just 200GB in size), yet it serves up 200 billion pages a month to some 12 million users. "My entire cost is only a few hundred thousand dollars a year," says Frind. The biggest piece isn't the technology, he says, but the bandwidth required to keep traffic to the site flowing smoothly.
Step 2: From "just looking" to "paying customer"
When it comes to converting users to paid subscribers, the battle is all uphill in an industry in which more than 90% of users never pay a dime. That's where having extensive demographic and psychological data on customers comes in handy.
In fact, online dating sites are so adept at using personal data, potential customers can be forgiven for wondering just who is being "matched up" -- two strangers bent on true love, or lonely customers and the matchmaking site that needs them. (See Online dating: Your profile's long, scary shelf life for details on the ways dating sites mine the data they collect.)
Yahoo Personals uses all of the information at its disposal to tailor its sales pitch to the user. "We try to take advantage of what we know about the user and where they are in their level of engagement with the product," says Ellen Perelman, general manager.
Once users sign up for a free account and fill out a short questionnaire, Yahoo uses targeted messaging to push them through a "conversion tunnel." The messages that users see to persuade them to sign on as paying customers vary depending on the user's profile and his or her behavior on the site.
Similarly, PerfectMatch.com puts users on different "message tracks" based on their profile and what they're doing on the site at any given time. "Everything you do or don't do triggers a response," says Dahl. "We take the information and do a comparative analysis on the fly to serve up the best possible offers to you based on your profile."
Users who aren't "taking full advantage of the site" -- who haven't posted a photo, for example, or have failed to review all their matches -- are targeted by the system. "You will get an e-mail message custom to your situation, encouraging you to perform the action needed," Dahl says.
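The trigger-and-response behavior Dahl describes resembles a small rules engine: each rule checks one aspect of a user's state and, if it applies, contributes a message. The following Python sketch is a loose illustration only; the field names, rules and message text are hypothetical, not PerfectMatch's.

```python
def pick_messages(user):
    """Return the nudge messages triggered by gaps in a user's activity."""
    rules = [
        (lambda u: not u["has_photo"],
         "Add a photo to your profile."),
        (lambda u: u["matches_reviewed"] < u["matches_sent"],
         "You have matches you haven't reviewed yet."),
        (lambda u: not u["is_subscriber"],
         "Subscribe to contact your matches."),
    ]
    return [message for applies, message in rules if applies(user)]


# Hypothetical user state.
new_user = {"has_photo": False, "matches_reviewed": 2,
            "matches_sent": 10, "is_subscriber": False}
print(pick_messages(new_user))
```

Keeping the rules in a list makes it straightforward to add, reorder or test individual triggers without touching the selection logic.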
eHarmony, which has the most comprehensive user profiles, may be the most sophisticated in the ways in which it leverages that information. It pulls information -- more than a terabyte of data each day -- from its Oracle database into high-performance Netezza data warehouse appliances that slice and dice users into behavioral and demographic "buckets."
"We use [Netezza] to do a lot of offline calculations to try to understand patterns and business intelligence about user behavior," explains Essas. Some of that feeds back into the matching process, but it also helps eHarmony persuade users to subscribe to its service. "Because we know more about them, we can target them much better," says Essas. Messaging is tailored to each user's behavior on the site -- and their personality type.
Step 3: Make a high-quality connection
Once users have paid for a subscription, online dating sites offer different tech-driven options for contacting and getting to know prospective dates, everything from chat rooms to instant messaging, e-mail and even video chat.
eHarmony controls the process by moving users through a series of prescribed communication steps on its Web site. The idea is to make users of the site comfortable with each other, but sometimes the technology just gets in the way, or backfires, users say.
Mary, a 45-year-old executive for a large IT consultancy, says the process of moving from eHarmony's prewritten questions and responses to online chat to e-mail to telephone can be tedious when what you really want is to meet someone. "You continuously go through this job interview." Then, after all that, people will suddenly cut off communications. "What happened?" she asks.
Video chat is perhaps the most controversial communication method offered, if only because video sessions often take a "sexual tilt," especially with men, and that drives away the women, says Mark Brooks, editor of Online Personals Watch, a newsletter that covers online dating and social networking sites. Mary explains the situation more plainly: "You go look at their webcam, and they're naked."
Some sites try to police that. True.com, which refers to video chat as "virtual dating," has staffers who constantly watch banks of security monitors that alternate between the 300 to 700 video chat sessions occurring at any one time. Participants who are breaking the rules may be kicked offline for an hour -- or permanently -- or staff may "whisper" a message to them to knock off the deviant behavior. Flashing your breasts, showing a weapon or showing your kids will get you a whisper, while showing "below the belt" body parts or verbal abuse will get you kicked off for an hour. "Porn site girls," underage users and scammers get the boot.
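The escalating penalties described above can be written down as a policy table that maps each violation to an action, with the most severe action winning when several apply. This Python sketch encodes the article's examples; the violation labels are made up for illustration, and treating "get the boot" as a permanent ban is an assumption.

```python
from enum import IntEnum


class Action(IntEnum):
    NONE = 0
    WHISPER = 1
    KICK_ONE_HOUR = 2
    BAN = 3  # assumed meaning of "get the boot"


# Hypothetical violation labels mapped to the actions the article describes.
POLICY = {
    "flashing": Action.WHISPER,
    "weapon": Action.WHISPER,
    "showing_kids": Action.WHISPER,
    "below_the_belt": Action.KICK_ONE_HOUR,
    "verbal_abuse": Action.KICK_ONE_HOUR,
    "porn_site_promotion": Action.BAN,
    "underage": Action.BAN,
    "scammer": Action.BAN,
}


def moderate(violations):
    """Return the most severe action warranted by the observed violations."""
    return max((POLICY[v] for v in violations), default=Action.NONE)


print(moderate(["weapon", "verbal_abuse"]).name)
```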
Perhaps the most innovative communication method is virtual dates in a 3-D world. One company, OmniDate, offers an avatar-based virtual dating system that acts as a kind of front end to existing online dating sites and is developing a new version for rollout later this spring that will use photo-realistic avatars. (See Online dating: Avatars tackle the first date for you for a glimpse of just how foxy one reporter can look online.)
So far, few sites have adopted the technology. Frind at Plenty of Fish decided to pass. "At the end of the day, it creates a false sense of reality for people. The point is to meet someone as quickly as possible," he says.
Step 4: Weeding out cheats, scammers and married guys
Mary, who says she has used most of the major services out there, worries about stalkers and fraudsters when visiting online dating sites -- and for good reason.
Stories of negative user experiences associated with online dating sites range from the woman duped into sending $4,500 in emergency funds to a man she thought was stranded in Nigeria, to pedophiles who scan online dating sites looking for lonely women with kids, to the New York woman who was the victim of a romance scam that cost her $100,000. The Internet Crime Complaint Center's 2007 Internet Crime Report found Internet fraud had risen and that online dating fraud was one of the most commonly reported complaints.
The top 5 types of abuse on online dating sites
1. Identity mining/phishing and/or 1-1 credit card fraud - 61%
2. Spam - 14%
3. Profile misrepresentation - 7.6%
4. General misconduct - 5.9%
5. Solicitation - 2.9%
Source: Iovation compilation of incidents from online dating sites using its security services
Keeping out the riffraff is a big headache for Plenty of Fish. "Ten percent of sign-ups a day are people trying to scam someone -- or rude, obnoxious people, or spammers," Frind says, adding that he removes about 2,000 suspicious users from the system daily. The issue is such a large problem that Frind has spent more time writing programs to deal with undesirables than he did creating all of the other elements of the service.
Online dating sites use a variety of approaches to detect suspicious accounts. "These are not the sharpest guys out there. They use the same techniques over and over," says PerfectMatch.com's Dahl. He looks for scammers who set up an account and blast e-mail messages to thousands of people, as well as for certain keywords and phrases that might indicate trouble.
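The two signals Dahl mentions, mass messaging and telltale phrases, lend themselves to simple heuristics. The sketch below is one possible shape for such a check, not PerfectMatch's actual tooling; the phrase list, the recipient threshold and the message format are assumptions.

```python
import re

# Hypothetical phrase list; a real deployment would maintain its own.
SUSPICIOUS = re.compile(
    r"\b(wire transfer|western union|stranded|send money)\b", re.IGNORECASE
)


def flag_account(messages, max_recipients=100):
    """Return the reasons an account looks suspicious (empty list if none).

    Each message is a dict with "to" and "text" keys (assumed format).
    """
    reasons = []
    recipients = {m["to"] for m in messages}
    if len(recipients) > max_recipients:
        reasons.append("mass messaging")
    if any(SUSPICIOUS.search(m["text"]) for m in messages):
        reasons.append("suspicious phrase")
    return reasons


blast = [{"to": f"user{i}", "text": "hi there"} for i in range(150)]
scam = [{"to": "user1", "text": "I am stranded abroad, please send money"}]
print(flag_account(blast), flag_account(scam))
```

Heuristics like these produce false positives, so a flagged account would typically go to a human reviewer rather than being removed automatically.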
eHarmony has recruited outside help to combat the problem. In addition to in-house tools, Essas says, the company has contracted with Iovation Inc., which offers ReputationManager, a service that gathers information on individuals' illicit activity from online dating and other sites and makes it available to subscribers. (See Blocking the bad guys for more on how Iovation's service works.)
True.com takes a broad-brush approach to security by blocking users with IP addresses associated with specific countries, such as Nigeria. Such steps immediately filter out about 10% of applicants, says CEO and founder Vest. eHarmony flags certain IP addresses, but Essas says it doesn't do wholesale blocking because many of its clients travel.
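The difference between wholesale blocking and flagging can be illustrated with Python's standard `ipaddress` module. The network ranges below are reserved documentation ranges standing in for real country allocations, and the `check_ip` function and its three outcomes are hypothetical.

```python
import ipaddress

# Placeholder networks (RFC 5737 documentation ranges), not real allocations.
BLOCKED_NETWORKS = {
    "country_a": [ipaddress.ip_network("203.0.113.0/24")],
}
FLAGGED_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]


def check_ip(address):
    """Return "block", "flag" or "allow" for a sign-up's IP address."""
    ip = ipaddress.ip_address(address)
    if any(ip in net for nets in BLOCKED_NETWORKS.values() for net in nets):
        return "block"
    if any(ip in net for net in FLAGGED_NETWORKS):
        return "flag"
    return "allow"


print(check_ip("203.0.113.7"), check_ip("198.51.100.20"), check_ip("192.0.2.1"))
```

Flagging rather than blocking keeps the door open for legitimate customers who happen to be traveling, at the cost of needing a follow-up review step.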
True.com is the only major online dating site to run criminal background checks on everyone who subscribes to its service -- a fact that it trumpets in its marketing messages. Vest says True blocked 80,000 felons from subscribing last year -- about 5% of total requests. "Our view is to do more than anyone else is doing and make it so hard on the scammers that it's easier for them to go elsewhere," he says.
Other sites have been hesitant to embrace background checks. "Scammers use stolen credit cards all the time, so what good is a background check [on a stolen identity]? It's more of a [marketing] gimmick than anything," says Plenty of Fish's Frind.
Dahl doesn't think background checks are reliable. "There are hundreds of law enforcement databases that aren't communicating with each other," he says, adding that PerfectMatch does offer its users the option to buy background checks using a third-party service.
Users like Mary and "Michelle," a 45-year-old scientist who asked that her real name not be used, liked the idea of background checks. But a much bigger problem in their eyes was meeting "single" men on dating sites who turned out to be married. "There's supposedly a screening process. That's why you pay the extra money," Michelle says.
Vest understands the problem but says technology can't help. "We tried to screen for married people and it got to be almost impossible," he says. True.com dropped the practice last June.
Do Online Dating Sites Work?
While online dating sites may be helpful as an introduction service, the jury is out on how effective they are at creating better long-term matches.
eHarmony and other online dating sites have their own studies and success stories about the services, but no independent research has been completed that demonstrates the effectiveness of online dating services.
Do the matching algorithms produce better matches that lead to long-term relationships? Dan Ariely doesn't think so. "The sites are claiming a lot, but show no evidence of doing anything useful in terms of matches," says Ariely, a professor of behavioral economics at MIT who is researching ways in which online dating sites can do a better job.
Ariely hasn't examined how well those proprietary matching algorithms work, since eHarmony and other sites won't release the details. But he suspects that they're not very effective. "My unsupported guess is that their algorithms are placebos," he says.
His suggestions focus on providing more meaningful information -- more along the lines of what people typically exchange when they meet, such as the books they like to read and who their friends are. He also advocates virtual games as a way for people to get to know one another better.
Joe, the aerospace engineer who's now happily in a relationship, thinks people get out of online dating services what they put into them. While he was reluctant to consider online dating at first -- he says he was "bullied into" using eHarmony by friends and family -- he says the service worked well. "Most of the matches -- maybe 80% -- were pretty close to what I was interested in."
The key, he says, is being honest when filling out the profiles. "Honesty really is what makes the filtering work," he says. To that end, he not only tried to be honest with himself, but recruited two friends to review his answers. He says the service pushed him to consider people just outside the boundaries he had set for criteria such as age and distance. "I'm not sure we would be dating if I hadn't been matched up with her," he says of his new girlfriend, who was located outside of his initial distance limit.
Others have had less luck. Jake, a 56-year-old writer and editor, has used many of the free services online. He is still single, and his expectations aren't high. "I don't expect miracles from these sites, but they do increase the number of interactions I have, and that's all I'm looking for."
Michelle has all but given up. Online personals helped her meet people who were at least looking for someone themselves, but the quality of the matches -- and the number of married men on the sites -- soured her on the experience.
Ariely sees that situation as a tragedy. "This is a market that needs a lot of help -- people are single and want to find a match -- but the sites are not really helping solve this problem. They just provide a list of other people, somewhat like a catalog," he says.
While Joe met a girlfriend on eHarmony who is "pretty much everything I could hope for in a woman," he's still hedging his bets. "It has only been a few months," he says. "I'm interested to see if it will last."
If it doesn't, he'll be back in the game -- and the dating sites will be waiting for him. "The relationship doesn't end once they cancel the subscription," says Perelman at Yahoo Personals. "A high percentage of our users resubscribe."