x

Aggregating large amount of data - OLAP or not?

For the sake of simplicity, let's assume we build weather stations:

Per station we collect about 100 sensor measurements per 15 minutes.

There might be 1000 or more stations.

A customer wants to see the data of one weather station (may of a few). but he wants to see it aggregated per day, week, month, or compare it to data from last year. He might want to study charts with temperature curves over time. For maintenance, one might also see actual sensor data but this is not the main use case.

Clearly, it is not sufficient to aggregate data on the fly for longer period of times as there would be many rows to summarize. So there might be other tables that hold the information on a daily, weekly, or whatever basis.

Here's the part where I am unsure: Would this be a typical task for OLAP / SQL Server Analytical Services? Or can/should it be done within the normal SQL Server database? Which approach would be best for this task?

I 've heard some terms before, but I have no glue if OLAP / Analytical Services would be the way to go. What do you think? Also, any recommendations which book would give me the basic knowledge?

Thanks for any pointers. Hartmut
more ▼

asked Dec 17, 2010 at 06:35 AM in Default

Hartmut gravatar image

Hartmut
23 1 1 1

(comments are locked)
10|1200 characters needed characters left

3 answers: sort newest

Well it has to be looked at in a number of ways.

Are you wanting to do high-end decision support work on this? trend analysis and the like?

You are talking about 3.5 Billion single datapoints per year going by your numbers: 400 per hour * 24 hours * 365 days * 1000 stations.

SQL Server can handle this amount of data, it also depends upon how much data is being stored and what edition of sql server you have available.

If you just want to do aggregations (avg temps, rainfall etc), it would be possible to do that all in t-sql, storing the aggs separately from the raw data. Would your historical data have to be as granualar as current data? You could do something along the lines of current year = full granualarity, previous years = daily granularity.

You could still put OLAP on top of pre-agged data too, if you are wanting to do trending etc.

Sorry for the vague answer, both are possible, it depends on what you really want to do with the data.
more ▼

answered Dec 17, 2010 at 06:45 AM

WilliamD gravatar image

WilliamD
25.9k 17 19 41

(comments are locked)
10|1200 characters needed characters left

Thank you for your answers. My needs are not that sophisticated. Actually, we are just collecting measurement data for many sensors. So the user might want to see that sensor data plotted per day, week, month or year. He might want to look at a single sensor or at the sum of all sensors of one station. So not much to aggregate either...

After a year, one value per day would be sufficient.

From what you all said, it looks like I am far from the limits of SQL Server. So there might be not need to add OLAP at all. We are not looking for trends, just logging of sensor data. The data might be compared to similar data from the year before. If we ever used a trend, it would be to see if the there is a component degradation over time...

Is this basically correct?

Hartmut
more ▼

answered Dec 28, 2010 at 08:44 AM

Hartmut gravatar image

Hartmut
23 1 1 1

(comments are locked)
10|1200 characters needed characters left

If I get the math right, you have 9600 measurements collected per station per day. I'm not sure I'd say that's terribly much to summarize.

I have created on-the-fly reports with much, much more data than that. Sometimes they have caused performance problems, but that's usually because of bad SQL programming and/or bad indexing, not because of data volume.
more ▼

answered Dec 17, 2010 at 06:42 AM

Magnus Ahlkvist gravatar image

Magnus Ahlkvist
16.4k 16 19 33

+1 for right maths! ;)

You are right for reports then for single stations being potentially quick. If he wants reports across all, then it starts getting tricky.
Dec 17, 2010 at 06:47 AM WilliamD
(comments are locked)
10|1200 characters needed characters left
Your answer
toggle preview:

Up to 2 attachments (including images) can be used with a maximum of 524.3 kB each and 1.0 MB total.

New code box

There's a new way to format code on the site - the red speech bubble logo will automatically format T-SQL for you. The original code box is still there for XML, etc. More details here.

Follow this question

By Email:

Once you sign in you will be able to subscribe for any updates here

By RSS:

Answers

Answers and Comments

SQL Server Central

Need long-form SQL discussion? SQLserverCentral.com is the place.

Topics:

x1833
x171
x58
x19

asked: Dec 17, 2010 at 06:35 AM

Seen: 1817 times

Last Updated: Aug 26, 2011 at 02:27 AM