A few months ago I invested in Next Level Automotive, a company run by two young entrepreneurs who are working hard to realize their dream: restoring and selling supercars. And when we talk about supercars, we mean Japanese cars with a lot of horsepower. As you can see from the photo *, they are successful. The Toyota Supra is a car for real enthusiasts. But they also go for brutal machines like the Nissan Skyline GT-R R34. Or really for the entire line-up of cars from the first Fast & Furious films.
Jan and Marco know their Japanese cars. But we want more from the company. We would also like to find cars from other brands, import them where necessary, restore them and resell them. The cars must have a lot of horsepower, otherwise Jan and Marco are not interested. We have to search the entire European market for cars that meet their requirements. Those requirements are simple to summarize: a lot of horsepower and an asking price below the price at which we can resell the car.
Jan and Marco work from early in the morning until late at night. Systematically searching the market for cars that fit their concept is time-consuming. It is also not the work they enjoy most. A tool that could help them with it would be useful. It would also be nice to be able to determine the value of a car in an objective way. That gives a little more insight into the potential profit when purchasing a car. And that is what I find very important as an investor. I have absolutely no knowledge of cars and their valuation. But making tools...
So I thought it would be handy to make a tool that searches the internet for cars that have more than 300 hp, are not a Supra, and are of a certain brand: for example only Porsche, BMW and Audi, and definitely no Volvo.
In addition, we want a method to determine whether the asking price is reasonable and what the potential resale value could be. As a first step, I developed an application that stores the cars offered for sale on a set of selected websites. Once you have that data, the fun can begin: you can model it. In this case, we want to create a model that values cars based on brand, model, kilometers on the odometer and horsepower. And even then we are keeping it simple, because in practice options such as an open roof, air conditioning and leather upholstery also influence the price.
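As a minimal sketch of that search filter, assuming listings have already been collected into simple records: the `Listing` fields, the helper name `matches` and the sample cars are invented for illustration and are not taken from the real tool.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    # Hypothetical record layout for one car advertisement.
    brand: str
    model: str
    horsepower: int
    mileage_km: int
    asking_price: float


WANTED_BRANDS = {"Porsche", "BMW", "Audi"}  # example brands named in the text
MIN_HORSEPOWER = 300


def matches(listing: Listing) -> bool:
    """True when a listing has more than 300 hp, a wanted brand, and is not a Supra."""
    return (
        listing.horsepower > MIN_HORSEPOWER
        and listing.brand in WANTED_BRANDS
        and listing.model.lower() != "supra"
    )


# Invented sample data, only to show the filter at work.
listings = [
    Listing("Porsche", "911", 385, 60000, 85000.0),
    Listing("Volvo", "V60", 310, 40000, 30000.0),
    Listing("Toyota", "Supra", 330, 90000, 55000.0),
    Listing("BMW", "M3", 300, 120000, 25000.0),
    Listing("Audi", "RS4", 420, 110000, 32000.0),
]
selected = [car for car in listings if matches(car)]
```

With only three wanted brands the Supra check is redundant, but keeping it explicit means the rule still holds if Toyota is ever added to the brand list.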
This example is a good way to explain how Azure Machine Learning Studio works. Look at the top of the page.
1. Import data. I import a data set of about 10,000 cars that, for this example, I took from a website where cars are offered for sale. In the final system we can retrieve the data directly from the database in which we update our cars daily.
2. Transform data. We add some columns to the data set, rename others, and correct existing metrics or calculate new ones.
3. Visualize. A good way to get a picture of your data set and the model you can build, perhaps the best way when only a few variables are involved, is to visualize it.
4. Analyze and model. Here I have chosen decision forest regression. This method, which can be used for regression or classification, builds a large number of decision trees with many branches (clusters of cars), lets each tree state an expected price for each branch, and combines those statements into one prediction. Exactly what we need now.
Finally, I publish the model as a web service so that you can call it and integrate it into your business processes.
If we look at the picture below, we see a clear correlation between price (y-axis) and year of construction. We would see the same for horsepower: the more horsepower, the higher the price. And the higher the mileage, the lower the price.
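To make the idea of a decision forest a bit more concrete, here is a small sketch in plain Python. It is not the Azure module: it is a simple variant (bagged regression trees) trained on invented data, using only horsepower and kilometers as features, and every function name in it is made up for this illustration.

```python
import random
from statistics import mean


def sse(rows):
    """Sum of squared deviations of the prices in rows from their mean."""
    avg = mean(price for _, price in rows)
    return sum((price - avg) ** 2 for _, price in rows)


def build_tree(rows, max_depth=3, min_rows=2):
    """Grow one regression tree; a leaf is a number, a node is a tuple."""
    best = None
    if max_depth > 0:
        for f in range(len(rows[0][0])):
            for t in sorted({x[f] for x, _ in rows}):
                left = [r for r in rows if r[0][f] <= t]
                right = [r for r in rows if r[0][f] > t]
                if len(left) >= min_rows and len(right) >= min_rows:
                    err = sse(left) + sse(right)
                    if best is None or err < best[0]:
                        best = (err, f, t, left, right)
    if best is None:
        return mean(price for _, price in rows)  # leaf: average price of the cluster
    _, f, t, left, right = best
    return (f, t,
            build_tree(left, max_depth - 1, min_rows),
            build_tree(right, max_depth - 1, min_rows))


def predict_tree(tree, x):
    """Follow the branches until a leaf (a plain number) is reached."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree


def build_forest(rows, n_trees=25, seed=0):
    """Train each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    return [build_tree([rng.choice(rows) for _ in rows]) for _ in range(n_trees)]


def predict_forest(forest, x):
    """The forest's price is the average of the individual tree prices."""
    return mean(predict_tree(tree, x) for tree in forest)


# Invented training data: ((horsepower, kilometers), price).
cars = [((hp, km), 10000 + 100 * hp - km // 10)
        for hp in range(200, 501, 50)
        for km in range(20000, 140001, 40000)]
forest = build_forest(cars)
estimate = predict_forest(forest, (450, 30000))
```

Each tree splits the cars into clusters and answers with the average price of the cluster a car falls into; averaging over many trees smooths out the quirks of any single tree.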
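Calling such a web service from another application could look roughly like the sketch below. The URL, the API key and the JSON field names are placeholders invented for this example; the real request format is whatever the published service documents. The request is only built here, not sent.

```python
import json
import urllib.request

SCORING_URL = "https://example.invalid/score"  # hypothetical endpoint
API_KEY = "replace-me"                         # hypothetical key


def build_request(brand, model, horsepower, mileage_km):
    """Build (but do not send) a POST request asking the service for a price."""
    payload = {
        "brand": brand,
        "model": model,
        "horsepower": horsepower,
        "mileage_km": mileage_km,
    }
    return urllib.request.Request(
        SCORING_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + API_KEY,
        },
        method="POST",
    )


request = build_request("Porsche", "911", 385, 60000)
# Passing `request` to urllib.request.urlopen would perform the actual call.
```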
A decision forest is too big to explain properly here, but a simplified model that predicts the price of a Porsche could look like this:
The price that the model calculates first is the price of a Porsche 911 with x1 horsepower and x2 kilometers on the odometer. For other models, the model then deducts a fixed amount, as a penalty for not being a 911.
This way we can estimate, for each brand and model, the price the car should have. We can also determine an acceptable price range within which the car can be put up for sale. Our application can then produce an overview every day of all cars on the market, with the asking price and the expected asking price according to our model.
Cars with a green dot are good deals; cars with a red dot are not. The most interesting deals, the bargains, are at the top. These are the cars that we have to go after!
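Written out as code, such a simplified model might look like the sketch below. All numbers are invented for illustration (a real model would learn them from the data), and the deduction table only lists a few example models.

```python
# Hypothetical coefficients, not values from the actual model.
BASE_PRICE_911 = 40000.0   # starting price for a Porsche 911
PRICE_PER_HP = 150.0       # amount added per horsepower (x1)
PRICE_PER_KM = -0.25       # amount added per kilometer driven (x2), so a deduction
MODEL_DEDUCTION = {        # fixed penalty for not being a 911
    "911": 0.0,
    "Boxster": 25000.0,
    "Cayenne": 20000.0,
}


def estimate_price(model, horsepower, kilometers):
    """Price the car as if it were a 911, then deduct the model penalty."""
    price_as_911 = (BASE_PRICE_911
                    + PRICE_PER_HP * horsepower
                    + PRICE_PER_KM * kilometers)
    return price_as_911 - MODEL_DEDUCTION[model]


price_911 = estimate_price("911", 300, 50000)
price_boxster = estimate_price("Boxster", 300, 50000)
```

For two cars with the same horsepower and mileage, the difference between the two estimates is exactly the fixed deduction for the model.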
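A rough sketch of how such a daily overview could be put together is shown below. The rule used here (green when the asking price is below the expected price) is an assumption made for the example, as are the function name and the sample prices; in practice the cutoff would follow from the acceptable price range.

```python
def build_overview(listings):
    """Rate each car and sort so that the biggest bargains come first.

    listings: iterable of (name, asking_price, expected_price).
    The gap is the asking price relative to the expected price;
    a negative gap means the car is offered below expectation.
    """
    rows = []
    for name, asking, expected in listings:
        gap = (asking - expected) / expected
        label = "green" if gap < 0 else "red"
        rows.append((name, asking, expected, gap, label))
    return sorted(rows, key=lambda row: row[3])


# Invented sample: (name, asking price, expected price from the model).
todays_cars = [
    ("Porsche 911", 70000, 80000),
    ("BMW M3", 30000, 27000),
    ("Audi RS4", 28000, 35000),
]
overview = build_overview(todays_cars)
```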
We could even think about other applications. We can cluster vendors into groups that usually offer cars above or below the expected asking price. We can identify options that make cars more valuable, or perhaps identify colors that raise or lower the value. After all, a car in a color that does not belong to its model has been repainted. And we could use our application to predict how much a car that someone owns now will be worth next year. That can help you decide whether to trade in your car this year or wait another year.
With these models you can, of course, predict more than just the value of a car. If you do not care about cars, you might be interested in the quality of a Bordeaux. And there are countless other possibilities. I wonder what applications can be thought up for your business. Do you have questions or ideas? Or is an idea brewing that you cannot quite put your finger on yet? Do not hesitate: call us or email us, and we will gladly think along with you! And we might write about it next time.
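The vendor idea, for instance, could start as simply as the sketch below. It is a plain grouping by average price gap rather than a real clustering algorithm, and the function name, labels and sample data are invented for the example.

```python
from collections import defaultdict
from statistics import mean


def vendor_bias(offers):
    """Label each vendor by its average gap to the expected price.

    offers: iterable of (vendor, asking_price, expected_price).
    """
    gaps = defaultdict(list)
    for vendor, asking, expected in offers:
        gaps[vendor].append((asking - expected) / expected)
    return {
        vendor: "above" if mean(values) > 0 else "at or below"
        for vendor, values in gaps.items()
    }


# Invented sample offers.
offers = [
    ("Dealer A", 33000, 30000),
    ("Dealer A", 21000, 20000),
    ("Dealer B", 18000, 20000),
    ("Dealer B", 27000, 30000),
]
bias = vendor_bias(offers)
```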
Maurik van den Heuvel
Tecknoworks Nederland BV
Pascalstraat 13H | 2811 EL Reeuwijk | Nederland