In 2014, Opendoor set out to reinvent life's most important transaction with a new, radically simple way to buy and sell a home. Opendoor makes instant offers on homes, so that sellers can get a fair price without having to retain an agent, do a bunch of cosmetic repairs, and spend months hosting open houses. To make those offers, Opendoor uses algorithmic pricing products, which means machine learning accuracy and reliability have been at the core of their business model since day 0.
The company's mission is to empower everyone with the freedom to move, and they have served more than 50,000 customers who have come to Opendoor to make that move easier. Whether it's getting married, starting a family, or taking a new job, they help people get to their next step in one simple, seamless transaction.
Opendoor currently operates in twenty-one cities across the US, and is headquartered in San Francisco. In this post, we'll meet three of Opendoor's data scientists who will share insights on how they use data science to drive value. We hope you enjoy!
Yes! We have different positions, which range from analytics to building and maintaining data products. Many of our data science roles relate to pricing homes, which is at the core of our business. We also have data scientists working in all other parts of our business if pricing isn't your thing.
Without further ado, let's meet three of Opendoor's Data Science Team Members!
I am a data scientist on the Pricing team, working on pricing our inventory of homes for resale. I design, implement and monitor the models and systems that we use to set prices on thousands of homes every month. I think a lot about how our pricing should respond to demand signals at both the macro level and the hyper-local level. Right now, we're focusing on systematization and optimization. Measuring the quality of our resale pricing strategies is tricky and requires some creativity and flexibility, so I am also designing experiments to help us optimize those strategies and make sure they are responsive to market conditions.
I was introduced to Opendoor through a program called Insight that I participated in when I decided I wanted to leave academia and transition into data science. I had a great time during my PhD, which I did in Geophysics at Stanford, but in the months leading up to graduation, I was feeling more and more uncertain about a career in academia. So while I was writing applications for post-doc positions, a friend of mine convinced me to also apply to Insight. I had always loved coding and data analysis, which I did a lot of during grad school, so it definitely appealed to me (plus I really wanted to stay in the Bay Area). I heard back from Insight before any of the post-doc positions got back to me, so that was how I made my decision. At Insight, we heard pitches from about 40 different companies, and Opendoor really stood out to me because of the types of problems the team was tackling and how data science really was core to the business. When I visited the office and met some of the team, I was sold: the combination of economics, data science, customer focus and super smart colleagues was exactly what I was looking for. I have been here for over two and a half years now, worked on several different teams, and have loved learning and growing with the company as we've scaled from 200 employees to over 1300.
Recently, I have been working on a model for suggesting a list price for our listings. This is an important problem because if we list a home too high, it just won't sell, but if we list it too low, we might be leaving money on the table. The trickiest part about this problem is that we can only see what happened given the actual list price, and not the counterfactual of what would have happened given a different price. Because of this, it's not obvious how to set up the machine learning problem: what exactly should the model be solving for? Instead of directly trying to solve for the optimal price, I decided to simplify the problem and solve for a range of prices that market data suggests led to positive outcomes. There is so much interesting work to do in this space, and I am really excited to keep iterating and building.
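As a rough illustration of what "solving for a range" could look like, here is a minimal Python sketch. It is not the model described above. It assumes a hypothetical dataset in which each past listing records the ratio of its list price to a pre-listing valuation and the number of days it took to go under contract, and it treats "under contract within 30 days" as a stand-in definition of a positive outcome. The names `Listing`, `price_band`, and every number are invented for the example.

```python
from statistics import quantiles
from typing import List, NamedTuple, Tuple


class Listing(NamedTuple):
    """One historical listing (hypothetical schema)."""
    list_to_value: float    # list price divided by the pre-listing valuation
    days_to_contract: int   # days on market before going under contract


def price_band(
    history: List[Listing],
    valuation: float,
    max_days: int = 30,
    lo_pct: int = 25,
    hi_pct: int = 75,
) -> Tuple[float, float]:
    """Suggest a list-price range from listings that had a positive outcome.

    A listing counts as positive if it went under contract within
    `max_days` (an assumed definition). The band is the lo_pct..hi_pct
    percentile range of list-to-valuation ratios among those listings,
    scaled by the new home's valuation.
    """
    good = [l.list_to_value for l in history if l.days_to_contract <= max_days]
    if len(good) < 2:
        raise ValueError("need at least two positive-outcome listings")
    # quantiles(..., n=100) returns 99 cut points; index p-1 is percentile p.
    cuts = quantiles(good, n=100, method="inclusive")
    return valuation * cuts[lo_pct - 1], valuation * cuts[hi_pct - 1]


# Toy data: five listings that sold quickly, two overpriced ones that lingered.
history = [
    Listing(0.98, 12), Listing(1.00, 9), Listing(1.02, 21),
    Listing(1.04, 18), Listing(1.06, 28),
    Listing(1.10, 120), Listing(1.12, 90),
]

low, high = price_band(history, valuation=400_000)
print(f"suggested list price range: {low:,.0f} to {high:,.0f}")
```

A band like this sidesteps the counterfactual question rather than answering it: it only summarizes prices that worked in the past, so it inherits whatever selection effects are in the historical data.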
In terms of benefit to customers, anything that we do to improve the margins on our resales gets invested back into the fees we charge those selling to Opendoor. Our goal is to be as efficient as possible throughout the business so we can keep fees low.
Academia and tech are very different in so many ways, but one thing that academia does really well is building on the experience of others. As a grad student, I spent a lot of time reading papers and understanding how others had approached the problems I was interested in, and how I might slightly modify previous work to answer new questions. In industry, we don't always take the time to research how others have solved problems similar to ours, or think about how our super-specific problem might be really similar to other types of problems. Taking that time can lead to more efficient solutions because you aren't reinventing the wheel and can learn from others' experiences. I like reading blog posts by data scientists at other companies to see how they're thinking about their problems, and what has worked well for them.
On the flip side, one thing I wish I had been better about in grad school was code readability and version control. Most grad students do not write code assuming anyone else will need to understand it, and I know from experience that this inevitably leads to headaches down the line.
At Opendoor, I work on the Pricing team at the intersection of engineering, data science, and product. I joined Opendoor as a data scientist, transitioned into engineering, and am currently on the tech team, leading multiple teams that work together as a system to produce accurate pricing decisions, from buying to reselling homes and everything in between.
I learned very early on that accurate pricing is not just about having great algorithms and ML models (though they certainly are essential ingredients!). Since Opendoor cares deeply about individual predictions being accurate, we employ a human-in-the-loop system where some predictions are verified or overwritten by human experts. We also invest heavily in gathering accurate data (otherwise, garbage in, garbage out) in our consumer- and internal operator-facing user interfaces. My work is ensuring that systems, tooling, and algorithms work together to produce the best pricing decisions.
I joined Opendoor as a data scientist after finishing my graduate studies in theoretical physics. Prior to that, I wrapped up a degree in piano performance at a music conservatory.
While my background may seem somewhat eclectic, the move to data science at a tech startup felt natural to me since I have always been an enthusiast of using technology to make life better in small and big ways. Back in the early days of Android 1.5 (Cupcake, anyone?), I wrote a few apps to solve my annoyances with using my new phone, and doing late night coding sessions instead of actual school work felt like a guilty pleasure. Now I have legitimized my hobby and work on making the home buying and selling experience better for people across the country using data and technology!
Like most people trying to get a sense of what all the "data science" hype was about, I was initially introduced to the field through Kaggle competitions and was excited to learn about the huge variety of machine learning models. It was not surprising that before joining Opendoor, I thought data science and machine learning were just about improving algorithms and models to achieve better and better accuracy.
As a data scientist at Opendoor, I've done a fair amount of ML model performance optimization, but that is just the tip of the iceberg. I was attracted to data science over academia in large part because data science is a living and breathing thing that can directly impact the real world. And I learned very early on at Opendoor that the real world rarely comes in tidy, nicely packaged Xs and Ys just waiting for someone to train a model. So much of data science is about making the right connection between abstract algorithms and the real world.
There have been many times when identifying the target variable Y and establishing the right metric to optimize was the hardest part of the problem. If we use the wrong variable to train our model, the model is irrelevant regardless of how accurate it is. Other times, we realized that the model was fundamentally limited by a lack of high-signal data, so with limited time we were better off improving our product to gather better signals than optimizing model performance.
I'm a Data Scientist focused on solving data problems at Opendoor, on a team we call Pricing Data. It may sound obvious that a data scientist works with data, but there are a number of different ways data scientists make themselves useful at the company, ranging from building predictive services with data to using data to help inform strategic business decisions.
For the past year I've primarily been focusing on how to improve the quality and quantity of data that the predictive services teams within pricing can use to make better pricing decisions for our customers. The strategy we've landed on heavily uses human judgment, so the work involves a combination of data engineering, data science, and product engineering. It's a role where I'm excited to bring both my skills and experience in data science and my broader skillset as a full stack engineer to the table.
Before Opendoor I was a research scientist at OpenAI, an independent organization dedicated to ensuring the beneficial development of artificial general intelligence, focused primarily on developing new techniques for robotics and transfer learning. Before OpenAI, I spent some time working at Clara Labs developing NLP models, and before that I spent a few years building a mobile analytics company called Watchsend.
After working on long-term research at OpenAI, I wanted to swing in the other direction and find a place where I could learn how to use my machine learning expertise to build a product that helps people today. When my friends at Opendoor reached out to discuss the company, I was immediately impressed by the pace of development, product-market fit, and the opportunity to use ML in a way that felt very core to the business and beneficial to the customer. At Opendoor, in many ways the price is the product, and as a data scientist I felt like I could have more impact here than anywhere else - considering the proximity to the product, the phenomenal growth, and the product-market fit we already had with our customers.
One of my first projects at Opendoor, and still one of the most impactful, was to help us develop a more nuanced view of time-to-sale for the homes we buy, at the time we're buying them. In the traditional real estate model, a seller finds an agent to help them sell their home, and that agent and the buyer's agent collectively take about 6% in commission for the sale. There are a number of difficulties with this process, but one of the biggest is the uncertainty of the whole thing. It's always unclear how long it will take for you to sell your home, and at what price. At Opendoor, we offer you that certainty: with an instant offer on your home, you'll know the price immediately. We still need to sell your home, however, and are thus accepting that uncertainty for you, so we charge a variable fee to cover our costs. Right now, that fee is sometimes lower than the cost of the traditional agent-led process, and sometimes higher.
The better we can predict how long it takes to sell your home, the lower our fees and the more appealing our product becomes. Most of the trickiness with this problem involved framing it to take maximal advantage of the data we have on hand. While it could be tempting to simply use regression, for problems with censored data (here, homes that haven't sold yet, so their final time-to-sale is still unknown) that approach could result in enormous bias in your predictions. You can read more about the approach we landed on here.
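To make the censoring problem concrete, here is a small Python sketch on made-up numbers. It does not reproduce the approach referred to above. It uses the Kaplan-Meier estimator, a standard survival-analysis technique, purely to show why ignoring unsold homes biases time-to-sale estimates downward. The function names and the toy data are assumptions for illustration.

```python
from statistics import median
from typing import List, Optional, Tuple

# Days on market so far for eight hypothetical homes, and whether each has sold.
# Unsold homes are "censored": we only know they take at least this long.
days = [10, 20, 30, 30, 45, 60, 60, 60]
sold = [True, True, True, False, True, False, False, False]


def kaplan_meier(days: List[int], sold: List[bool]) -> List[Tuple[int, float]]:
    """Return (t, S(t)) at each sale time, where S(t) = P(still unsold after t)."""
    at_risk = len(days)          # homes still on the market just before time t
    survival = 1.0
    curve = []
    for t in sorted(set(days)):
        sales = sum(1 for d, s in zip(days, sold) if d == t and s)
        leaving = sum(1 for d in days if d == t)   # sold or censored at t
        if sales:
            survival *= 1.0 - sales / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def km_median_time(curve: List[Tuple[int, float]]) -> Optional[int]:
    """First time at which the unsold probability drops to 50% or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None   # median not reached within the observed window


# Naive approach: drop unsold homes and summarize only the completed sales.
naive_median = median(d for d, s in zip(days, sold) if s)

curve = kaplan_meier(days, sold)
km_median = km_median_time(curve)

print("naive median days to sale:", naive_median)
print("Kaplan-Meier median days to sale:", km_median)
```

On this toy data the naive median is 25 days, while the Kaplan-Meier median is 45 days. The naive number looks better only because the slowest homes, the ones still unsold, were thrown away. A regression trained only on completed sales has the same blind spot.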