For the sake of simplicity, let's assume we build weather stations:
Per station we collect about 100 sensor measurements per 15 minutes.
There might be 1000 or more stations.
A customer wants to see the data of one weather station (may of a few). but he wants to see it aggregated per day, week, month, or compare it to data from last year. He might want to study charts with temperature curves over time. For maintenance, one might also see actual sensor data but this is not the main use case.
Clearly, it is not sufficient to aggregate data on the fly for longer period of times as there would be many rows to summarize. So there might be other tables that hold the information on a daily, weekly, or whatever basis.
Here's the part where I am unsure: Would this be a typical task for OLAP / SQL Server Analytical Services? Or can/should it be done within the normal SQL Server database? Which approach would be best for this task?
I 've heard some terms before, but I have no glue if OLAP / Analytical Services would be the way to go. What do you think? Also, any recommendations which book would give me the basic knowledge?
Thanks for any pointers. Hartmut
Well it has to be looked at in a number of ways.
Are you wanting to do high-end decision support work on this? trend analysis and the like?
You are talking about 3.5 Billion single datapoints per year going by your numbers: 400 per hour * 24 hours * 365 days * 1000 stations.
SQL Server can handle this amount of data, it also depends upon how much data is being stored and what edition of sql server you have available.
If you just want to do aggregations (avg temps, rainfall etc), it would be possible to do that all in t-sql, storing the aggs separately from the raw data. Would your historical data have to be as granualar as current data? You could do something along the lines of current year = full granualarity, previous years = daily granularity.
You could still put OLAP on top of pre-agged data too, if you are wanting to do trending etc.
Sorry for the vague answer, both are possible, it depends on what you really want to do with the data.
answered Dec 17 '10 at 06:45 AM
Thank you for your answers. My needs are not that sophisticated. Actually, we are just collecting measurement data for many sensors. So the user might want to see that sensor data plotted per day, week, month or year. He might want to look at a single sensor or at the sum of all sensors of one station. So not much to aggregate either...
After a year, one value per day would be sufficient.
From what you all said, it looks like I am far from the limits of SQL Server. So there might be not need to add OLAP at all. We are not looking for trends, just logging of sensor data. The data might be compared to similar data from the year before. If we ever used a trend, it would be to see if the there is a component degradation over time...
Is this basically correct?
answered Dec 28 '10 at 08:44 AM
If I get the math right, you have 9600 measurements collected per station per day. I'm not sure I'd say that's terribly much to summarize.
I have created on-the-fly reports with much, much more data than that. Sometimes they have caused performance problems, but that's usually because of bad SQL programming and/or bad indexing, not because of data volume.
answered Dec 17 '10 at 06:42 AM