database design - How to build a scalable statistics system?
Say I have a table called products and I want to know how often the products are searched, viewed and purchased, and also when each product was searched, viewed and purchased.
My first approach was a table with the product_id, a field that indicates the event type (0 = searched, 1 = viewed and 2 = purchased), and another field that holds the date and time of the event, so I can filter by time.
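A minimal sketch of such an event table, using Python's built-in sqlite3 module. The table name product_events, the column names and the helper functions are assumptions made for illustration; only the product_id field and the 0/1/2 codes come from the description above.

```python
import sqlite3

# Event-type codes from the question: 0 = searched, 1 = viewed, 2 = purchased.
SEARCHED, VIEWED, PURCHASED = 0, 1, 2

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE product_events (
        product_id  INTEGER NOT NULL,
        event_type  INTEGER NOT NULL,  -- 0 = searched, 1 = viewed, 2 = purchased
        occurred_at TEXT    NOT NULL   -- ISO 8601 text, so string order is time order
    )
    """
)


def log_event(product_id, event_type, occurred_at):
    """Append one raw event row (hypothetical helper)."""
    conn.execute(
        "INSERT INTO product_events VALUES (?, ?, ?)",
        (product_id, event_type, occurred_at),
    )


def count_events(product_id, event_type, start, end):
    """Count events of one type for one product in the half-open range [start, end)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM product_events "
        "WHERE product_id = ? AND event_type = ? "
        "AND occurred_at >= ? AND occurred_at < ?",
        (product_id, event_type, start, end),
    ).fetchone()
    return row[0]


log_event(1, SEARCHED, "2010-03-01T10:00:00")
log_event(1, VIEWED, "2010-03-01T10:01:00")
log_event(1, PURCHASED, "2010-03-01T10:05:00")
log_event(1, SEARCHED, "2011-01-15T09:00:00")

searches_2010 = count_events(1, SEARCHED, "2010-01-01", "2011-01-01")
```

Every search, view and purchase becomes one row, which is what makes filtering by time easy and what makes the table grow quickly.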
It works well, but it is not scalable: if I have 50,000 products in the database and 1,000 users each make 5 searches every day, then I have 50,000 * 1,000 * 5 = 250,000,000 new records per day, so it does not look like the right solution to me.
I have some ideas about how to improve this, but I would really like to read about a better approach, because I am not happy with mine.

Keep storing this data (storage is cheap and relatively scalable, as long as you do not need to access it).
Overall, work out which statistics are interesting to you.
Once you know which statistics are interesting to you, you can maintain them incrementally as aggregates over the smallest period you care about. To take a simple example: if you are interested in the total number of sales for an item, but only on an annual basis, you can keep aggregates such as "Sales in 2010" and "Sales in 2009". Whenever possible, work with these aggregates.
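One way to keep such per-year totals up to date is a small counter table that is bumped on every sale. This is only a sketch under assumptions: the yearly_sales table and the record_sale and sales_in helpers are hypothetical names, and SQLite is used purely because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE yearly_sales (
        product_id INTEGER NOT NULL,
        year       INTEGER NOT NULL,
        total      INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (product_id, year)
    )
    """
)


def record_sale(product_id, year):
    """Increment the running total for (product_id, year) by one."""
    # Create the counter row if it does not exist yet, then increment it.
    conn.execute(
        "INSERT OR IGNORE INTO yearly_sales (product_id, year) VALUES (?, ?)",
        (product_id, year),
    )
    conn.execute(
        "UPDATE yearly_sales SET total = total + 1 "
        "WHERE product_id = ? AND year = ?",
        (product_id, year),
    )


def sales_in(product_id, year):
    """Read the pre-computed total; 0 if nothing was sold that year."""
    row = conn.execute(
        "SELECT total FROM yearly_sales WHERE product_id = ? AND year = ?",
        (product_id, year),
    ).fetchone()
    return row[0] if row else 0


record_sale(42, 2009)
record_sale(42, 2010)
record_sale(42, 2010)
```

Reading "Sales in 2010" for a product is then a single-row lookup, and the table holds at most one row per product per year, no matter how many sales occur.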
However, because you keep the raw data, you can generate new aggregates if you later find that another metric is interesting.
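As an illustration of rebuilding from raw data, suppose monthly purchase counts turn out to be interesting later on. The sketch below assumes a raw event table named product_events with an event_type code of 2 for purchases and ISO 8601 timestamps; all of these names and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product_events ("
    "product_id INTEGER NOT NULL, "
    "event_type INTEGER NOT NULL, "
    "occurred_at TEXT NOT NULL)"
)

# Made-up sample rows standing in for the stored raw events.
conn.executemany(
    "INSERT INTO product_events VALUES (?, ?, ?)",
    [
        (1, 2, "2010-03-01T10:05:00"),
        (1, 2, "2010-03-20T16:30:00"),
        (1, 2, "2010-04-02T08:00:00"),
        (1, 0, "2010-03-01T10:00:00"),  # a search, ignored by the aggregate
        (2, 2, "2010-03-05T12:00:00"),
    ],
)

# Build the new aggregate in one pass over the raw data.
# substr(occurred_at, 1, 7) keeps the 'YYYY-MM' prefix of the timestamp.
conn.execute(
    """
    CREATE TABLE monthly_purchases AS
    SELECT product_id,
           substr(occurred_at, 1, 7) AS month,
           COUNT(*) AS total
    FROM product_events
    WHERE event_type = 2
    GROUP BY product_id, substr(occurred_at, 1, 7)
    """
)

monthly = {
    (product_id, month): total
    for product_id, month, total in conn.execute(
        "SELECT product_id, month, total FROM monthly_purchases"
    )
}
```

The one-off scan over the raw table is expensive, but it only runs when a new metric is introduced; after that the new aggregate can be maintained incrementally like the others.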