Some friends and I took part in a MELI challenge that consisted of predicting the probability that a product's stock would be sold after N days. This was my first time working on this type of project with this kind of data. While we didn't win, I think it was a nice project. After analyzing the data, we saw that the cumulative sales followed Gaussian and Poisson distributions, which made it reasonable to try Poisson regression to predict the probabilities. The results were awesome!

Data

An example of the data is shown in the next figure.

Figure 1

Analysis

The main assumption we made was that all the information required to predict when the stock will be sold is contained in the data, and that the probability of selling a product is not affected by the sales of other products. Under this assumption, we created histograms of the probability of selling M products on a given day. Convolving these daily histograms is then expected to give the probabilities for successive days.
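To make the convolution step concrete, here is a minimal sketch in plain Python. It assumes the daily sales are independent and identically distributed, and the histogram values are made up for illustration; they are not taken from the challenge data. The `prob_sold_out` helper uses one simple, hypothetical definition of "sold out" (cumulative sales reach the stock), which is not necessarily the definition we used in the project.

```python
def convolve(p, q):
    """Distribution of the sum of two independent counts with pmfs p and q."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def cumulative_sales_pmf(daily_pmf, n_days):
    """pmf of total units sold after n_days, assuming i.i.d. daily sales."""
    total = [1.0]  # before any day has passed, zero units are sold for sure
    for _ in range(n_days):
        total = convolve(total, daily_pmf)
    return total


def prob_sold_out(total_pmf, stock):
    """Hypothetical definition: probability that total sales reach the stock."""
    return sum(total_pmf[stock:])


# Made-up daily histogram: P(0 units) = 0.5, P(1 unit) = 0.3, P(2 units) = 0.2
daily_pmf = [0.5, 0.3, 0.2]
two_days = cumulative_sales_pmf(daily_pmf, 2)
print(two_days)
print(prob_sold_out(two_days, 3))
```

Each extra day is one more convolution, so the distribution spreads out and shifts toward larger totals as N grows.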

Figure 2

What this example (and others) shows is that after several days, the distribution follows an exponential distribution. This means that the cumulative probability of a product being sold has to follow a Poisson distribution. This gave us the idea of using a Poisson regressor.

Poisson regressor

By implementing a Poisson regression, we trained a model for each product. One example is shown below.

Figure 3

The results were astonishing!
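For readers who have not seen a Poisson regression before, the sketch below fits one from scratch for a single product. It is not the code from our repository: the single feature (a day index), the synthetic sales values, and the function name `fit_poisson_regression` are all assumptions made for illustration. The model is log(mu) = a + b * x, fitted by Newton's method on the Poisson log-likelihood. In practice a library implementation such as scikit-learn's `PoissonRegressor` would be the more natural choice.

```python
import math


def fit_poisson_regression(x, y, n_iter=50, tol=1e-10):
    """Fit log(mu) = a + b * x by Newton's method on the Poisson log-likelihood."""
    a = math.log(sum(y) / len(y))  # start from the intercept-only model
    b = 0.0
    for _ in range(n_iter):
        mu = [math.exp(a + b * x_i) for x_i in x]
        # Gradient of the log-likelihood with respect to (a, b)
        g_a = sum(y_i - m_i for y_i, m_i in zip(y, mu))
        g_b = sum(x_i * (y_i - m_i) for x_i, y_i, m_i in zip(x, y, mu))
        # Entries of the negative Hessian (a 2x2 symmetric matrix)
        h_aa = sum(mu)
        h_ab = sum(x_i * m_i for x_i, m_i in zip(x, mu))
        h_bb = sum(x_i * x_i * m_i for x_i, m_i in zip(x, mu))
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: solve the 2x2 linear system in closed form
        step_a = (h_bb * g_a - h_ab * g_b) / det
        step_b = (h_aa * g_b - h_ab * g_a) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b


# Synthetic, noiseless data for one hypothetical product (not challenge data)
days = list(range(10))
sales = [math.exp(0.5 + 0.1 * d) for d in days]

a, b = fit_poisson_regression(days, sales)
expected_sales_day_10 = math.exp(a + b * 10)  # predicted mean for a new day
print(a, b, expected_sales_day_10)
```

Training one model per product, as we did, amounts to repeating this fit on each product's own sales history.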

Final considerations

This project involved many more parts, for instance a particular definition of the sold-out probability, the test data (which we didn't have access to), etc. I am not going into further detail because I think the main idea of the project has been covered; in case you want to check more, you have the link to the GitHub repository.