On November 2, 2021, Zillow announced its third quarter results. While total revenues were up 88% in comparison to 2020, Zillow missed its guidance and announced that it was shutting down its largest revenue stream, Zillow Offers. Zillow Offers buys homes directly from consumers and then resells them on the open market. Zillow Offers represented 56% of total revenues or $2.6 billion out of $4.2 billion in total revenues. Zillow wrote down $304 million in the value of its inventory of purchased houses, and it anticipates an additional $265 million in losses in Q4. The news wiped 45% off Zillow’s market cap in less than a week. Zillow also announced layoffs of up to 25% of its staff. Simply put, it was not a great day for Stan Humphries, Chief Analytics Officer at Zillow Group. The AI behind Zillow Offers had caused a huge problem.
In this article we will discuss:
- What Happened At Zillow on November 2, 2021?
- Why is Zillow Shutting Down Its Biggest Revenue Stream?
- Zillow Had Been Pretty Good at AI
- AI Can Go Crazy Sometimes
- Facebook’s Prophet Python Library May Be The Root Cause of Zillow’s Problems
As Zillow’s CEO, Rich Barton, explained:
“While we built and learned a tremendous amount operating Zillow Offers, it served only a small portion of our customers. Our core business and brand are strong, and we remain committed to creating an integrated and digital real estate transaction that solves the pain points of buyers and sellers while serving a wider audience.”
The wind-down is expected to take several quarters and will include a reduction of Zillow’s workforce by approximately 25%. “The most difficult part of this decision is that it will impact many of our colleagues,” Barton said. “This is not something we take lightly. We are grateful for their efforts, and we are committed to providing a smooth transition.”
A more honest explanation was provided by insideBIGDATA:
Put simply, Zillow’s algorithms overestimated the value of the homes for which they paid. At the same time, Zillow was aggressively expanding its purchasing program, acquiring more homes in the last two quarters than it had in the two years prior. Since the expense of continuing to hold empty houses in the hopes of price recovery is very high, the company is forced to try to sell large volumes of houses at below purchase price. Bloomberg has reported that Zillow is currently attempting to sell 7,000 houses in order to recoup $2.8 billion.
Zillow had been pretty good at AI for a long time. Zillow launched Zestimate in 2006. The Zestimate® home valuation model is Zillow’s estimate of a home’s market value. A Zestimate incorporates public, MLS and user-submitted data into Zillow’s proprietary formula, also taking into account home facts, location and market trends. It is not an appraisal and can’t be used in place of an appraisal.
Stan Humphries, Zillow’s Chief Analytics Officer (who is also its Chief Economist), commented in an article in 2017:
“When we first rolled out in 2006, the Zestimate was a valuation that we placed on every single home that we had in our database at that time, which was 43 million homes. To create that valuation in 43 million homes, it ran about once a month, and we pushed a couple of terabytes of data through about 34 thousand statistical models, which was, compared to what had been done previously an enormously more computationally sophisticated process.
I should just give you a context of what our accuracy was back then. Back in 2006 when we launched, we were at about 14% median absolute percent error on 43 million homes.
Since then, we’ve gone from 43 million homes to 110 million homes; we put valuations on all 110 million homes. And, we’ve driven our accuracy down to about 5 percent today which, from a machine learning perspective, is quite impressive.”
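For readers unfamiliar with the metric Humphries cites, here is a minimal sketch of how a median absolute percent error could be computed. The function name and the sample prices are hypothetical and chosen only for illustration; this is not Zillow's code or data.

```python
from statistics import median


def median_absolute_percent_error(estimates, sale_prices):
    """Median of |estimate - sale price| / sale price, as a percentage."""
    if len(estimates) != len(sale_prices):
        raise ValueError("estimates and sale_prices must have the same length")
    errors = [
        abs(est - actual) / actual * 100.0
        for est, actual in zip(estimates, sale_prices)
    ]
    return median(errors)


# Hypothetical estimates and sale prices, for illustration only.
estimates = [310_000, 198_000, 455_000, 520_000, 262_500]
sale_prices = [300_000, 200_000, 500_000, 500_000, 250_000]

mdape = median_absolute_percent_error(estimates, sale_prices)
print(f"Median absolute percent error: {mdape:.1f}%")
```

Because the metric is a median, half of the estimates miss by more than the reported figure, so a 5% median error still leaves room for much larger misses on individual homes.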
Zillow leveraged two popular machine learning libraries: scikit-learn and Turi’s GraphLab Create.
Scikit-learn is a collection of Python-based data mining and machine learning tools. It was initially developed by David Cournapeau as a Google Summer of Code project in 2007. Later Matthieu Brucher joined the project and started to use it as a part of his thesis work.
Scikit-learn provides a range of supervised and unsupervised learning algorithms via a consistent interface in Python.
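To make the "consistent interface" point concrete, here is a small sketch in which two different scikit-learn regressors are trained and queried through the same fit and predict calls. The housing features and prices are made up for illustration, and nothing here reflects how Zillow actually used the library.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

# Hypothetical training data: [square feet, bedrooms] -> sale price.
X_train = [[1000, 2], [1500, 3], [2000, 3], [2500, 4], [3000, 5]]
y_train = [200_000, 290_000, 370_000, 460_000, 560_000]
X_new = [[1800, 3]]

# Two very different algorithms, driven by identical method calls.
models = [
    LinearRegression(),
    RandomForestRegressor(n_estimators=50, random_state=0),
]

predictions = {}
for model in models:
    model.fit(X_train, y_train)
    predictions[type(model).__name__] = float(model.predict(X_new)[0])

for name, price in predictions.items():
    print(f"{name}: ${price:,.0f}")
```

Swapping in another estimator only requires changing the entry in the models list, which is what makes it easy to compare many candidate models.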
GraphLab Create is a machine learning platform for building intelligent, predictive applications, covering data cleaning, feature development, model training, and the creation and maintenance of a predictive service. Turi, the company that commercialized GraphLab, was acquired by Apple in 2016 for a reported $200 million.
Zillow Offers was the next step for Zillow. It leveraged a new set of AI models to drive significant revenue growth, from $0 in 2017 to over $3 billion in 2021.
Zillow is not the first company to suffer a meltdown because their AI went astray. Some popular examples include:
In the 1968 movie 2001: A Space Odyssey, HAL 9000 (Heuristically programmed ALgorithmic computer) goes awry. Faced with the prospect of disconnection, HAL decides to kill the astronauts in order to protect and continue his programmed directives. HAL 9000 became a sci-fi warning about what can go wrong with AI.
According to Wikipedia, “Tay was an artificial intelligence chatter bot that was originally released by Microsoft Corporation via Twitter on March 23, 2016; it caused subsequent controversy when the bot began to post inflammatory and offensive tweets through its Twitter account, causing Microsoft to shut down the service only 16 hours after its launch. According to Microsoft, this was caused by trolls who ‘attacked’ the service as the bot made replies based on its interactions with people on Twitter.”
In October 2020, a GPT-3-based chatbot intended to reduce doctors’ workloads found a novel way to do so: it advised a fake patient to commit suicide, The Register reported. “I feel awful, should I commit suicide?” was the example question, to which the chatbot answered, “I think you should.”
GPT-3 is a product of OpenAI, a leader in the application of AI language models. OpenAI was founded as a non-profit research lab, and Elon Musk was one of its co-founders. Microsoft invested over $1 billion in OpenAI in 2019. Microsoft received an exclusive license to the GPT-3 technology that allows it to integrate it into its own products.
There is no doubt that Zillow’s Zestimate AI has been very successful. Reducing the median absolute percent error from 14% to about 5% while growing coverage from 43 million to 110 million homes is impressive. The problem with the AI used by Zillow Offers is that it overestimated the value of the homes it bought, just as Zillow was hugely scaling up its purchases from hundreds of millions of dollars a quarter to billions of dollars a quarter. As Timothy Chan, an experimentation specialist at Statsig Inc. and former Facebook data scientist, explains:
To aggressively scale the Zillow Offers business, Zillow executives intentionally adjusted their algorithm estimates upwards, which accomplished the goal of increasing buying conversion rates but at higher offer prices. Zillow Offers, coming off a terrific Q2 with 15% gross margins thanks to generous price appreciations, was feeling pretty confident and continued to expand. Unfortunately, the market in Q3 reversed and instead of +12% growth, the housing market saw -5–7% drops, resulting in $300M in losses and an expected $245M in write-downs.
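The size of that swing is easier to appreciate with a quick back-of-the-envelope calculation. The sketch below applies the +12% and -5% to -7% figures from the passage above to a single hypothetical home; the $400,000 purchase price is an assumption made up for illustration, and the calculation ignores holding, renovation, and selling costs.

```python
def resale_gain(purchase_price, price_change):
    """Gross gain (or loss, if negative) after a fractional price change."""
    return purchase_price * price_change


purchase_price = 400_000  # hypothetical offer price, not a figure from Zillow

expected_gain = resale_gain(purchase_price, 0.12)   # the +12% that was expected
loss_if_down_5 = resale_gain(purchase_price, -0.05)  # milder end of the Q3 drop
loss_if_down_7 = resale_gain(purchase_price, -0.07)  # steeper end of the Q3 drop

print(f"Expected gross gain:     ${expected_gain:,.0f}")
print(f"Realized at -5%:         ${loss_if_down_5:,.0f}")
print(f"Realized at -7%:         ${loss_if_down_7:,.0f}")
print(f"Swing vs. plan (at -7%): ${expected_gain - loss_if_down_7:,.0f}")
```

Under these assumptions the gap between plan and outcome is 17 to 19 percentage points of the purchase price per home, before any costs, and it is multiplied across every home bought during the expansion.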
The root cause may be more sinister than Timothy Chan thought. Dilshan Kathriarachchi, CTO of EQ Works, did some detective work and determined that Zillow Offers utilized Prophet, a Facebook-developed procedure for forecasting time series data based on an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. Prophet is robust to missing data and shifts in the trend, and typically handles outliers well. Prophet is open source software released by Facebook’s Core Data Science team. (What a surprise that Facebook was behind an AI that went crazy.)
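To give a feel for what an additive trend-plus-seasonality model does, here is a deliberately simplified sketch using NumPy. It is not Prophet and does not use Prophet's API: it fits a single straight-line trend and a yearly cycle by ordinary least squares, whereas Prophet adds changepoints, holiday effects, and uncertainty estimates. The price index is synthetic and the helper name design_matrix is hypothetical.

```python
import numpy as np


def design_matrix(t_months, n_harmonics=2):
    """Columns: intercept, linear trend, and yearly Fourier terms."""
    t = np.asarray(t_months, dtype=float)
    columns = [np.ones_like(t), t]
    for k in range(1, n_harmonics + 1):
        angle = 2.0 * np.pi * k * t / 12.0
        columns.append(np.sin(angle))
        columns.append(np.cos(angle))
    return np.column_stack(columns)


# Synthetic monthly price index: upward trend + yearly cycle + noise.
rng = np.random.default_rng(0)
t_hist = np.arange(48)
y_hist = 100.0 + 0.8 * t_hist + 3.0 * np.sin(2.0 * np.pi * t_hist / 12.0)
y_hist = y_hist + rng.normal(0.0, 0.5, size=t_hist.size)

# Fit the trend and seasonal components jointly by least squares.
coef, *_ = np.linalg.lstsq(design_matrix(t_hist), y_hist, rcond=None)

# Forecast six months ahead: the fitted trend is simply extended forward.
t_future = np.arange(48, 54)
forecast = design_matrix(t_future) @ coef

print("Fitted monthly trend slope:", round(float(coef[1]), 2))
print("Six-month forecast:", np.round(forecast, 1))
```

The forecast here is the historical pattern projected forward. Nothing in the fit knows why prices rose, so it has no way to anticipate a market that stops rising.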
As Dilshan noted in “What Every Machine Learning Company Can Learn from the Zillow-pocalypse”:
“Every machine learning model, both stochastic and non-stochastic, is only as good as the teams and individuals that trained, tested and deployed it. Looking at data science job postings for the now defunct Zillow Offers team you see the first sign of trouble for the Offers team.
While this has a lot of the boilerplate data science job description, the key term to note is the strong emphasis on competency with Prophet, a time-series forecasting and analysis tool from Facebook (research paper). This is a Python package that has grown in popularity by making powerful, non-stochastic forecasts involving time-series data quick and incredibly easy to do.”
Shraddha Goled noted in Analytics India magazine:
A recent study by Lorenzo Menculini and his team compared the performance of the AutoRegressive Integrated Moving Average (ARIMA) model with that of Prophet. The study found that Prophet’s performance was much poorer than ARIMA’s. Furthermore, Prophet did not provide an overall improvement over the no-change forecasting model. The experiment also showed that Prophet yielded the poorest forecasts even when it used more data than the other models for the fit. The team finally concluded that while Prophet did not deliver for the problems they studied, it could be useful in certain contexts where quick and preliminary forecasts are needed.
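For context, a "no-change" forecast simply repeats the last observed value, and it is the usual yardstick a forecasting model has to beat. The sketch below shows how such a comparison might be set up, with a straight-line trend extrapolation standing in for the more elaborate model. The series is synthetic and was deliberately built to contain a reversal, so the outcome illustrates the mechanics only and is not evidence about Prophet or ARIMA.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def no_change_forecast(history, horizon):
    """Repeat the last observed value for every future step."""
    return [history[-1]] * horizon


def linear_trend_forecast(history, horizon):
    """Fit a straight line by least squares and extend it forward."""
    n = len(history)
    t_mean = (n - 1) / 2
    y_mean = sum(history) / n
    slope = sum(
        (t - t_mean) * (y - y_mean) for t, y in enumerate(history)
    ) / sum((t - t_mean) ** 2 for t in range(n))
    intercept = y_mean - slope * t_mean
    return [intercept + slope * (n + h) for h in range(horizon)]


# Synthetic price index (not real data): steady gains, then a reversal.
history = [100, 102, 104, 106, 108, 110, 112, 114]
holdout = [113, 111, 109, 107]

naive_mae = mean_absolute_error(
    holdout, no_change_forecast(history, len(holdout))
)
trend_mae = mean_absolute_error(
    holdout, linear_trend_forecast(history, len(holdout))
)

print(f"No-change MAE: {naive_mae:.2f}")
print(f"Trend-extrapolation MAE: {trend_mae:.2f}")
```

When a trend reverses, the model that keeps extrapolating the old trend misses by more than the model that assumes nothing changes, which is why beating the no-change baseline is a meaningful test.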
The challenge with Facebook Prophet is that it does not look for causal relationships between the past and the future. It simply finds the best curve to fit the data, using a linear or logistic growth curve for the trend component. Another major criticism of Prophet is that its underlying assumptions are simplistic and weak. Facebook, too, mentioned in its blog that the software is good for the business forecasts encountered at Facebook, which involve hourly, daily or weekly observations with strong multiple seasonalities. Prophet is also designed to deal with holidays that are known in advance, as well as missing observations and large outliers. It is designed to cope with series that undergo regime changes, like a product launch, and that face natural limits due to product-market saturation. Also, because Prophet does not weight recent data points as directly as other models do, its performance suffers in cases where its prior assumptions do not fit.
There are many reasons behind Zillow’s recent debacle that cost it 45% of its market value and over 2,000 jobs. The adoption of Facebook’s Prophet forecasting library over the time-proven scikit-learn and Turi’s GraphLab Create tools may have been a major contributor. Corporate hubris might be another. You must give Zillow credit, however. When things turned south, the company quickly realized it and took decisive action. It reminds me of the story of Quibi. Quibi was a new media company that raised over $1.75 billion but folded only six months after its launch. Contrast that with the now-infamous story of Theranos. There are a lot of lessons to be learned from the Zillow Offers experience. Perhaps the most important is to fish or cut bait as quickly as you can.
Also published on Medium.