question

Hamed_ji avatar image
Hamed_ji asked

SQL Server to store article

Hi, i going to build a web site that is very look like to "CodeProject" that needs to store articles. My question is how much of data should be stored in SQL and how much should store on hard disk ? Clearly: Should i Store article images in Database OR in hard drive ?
-> Store in database may make huge database files and also loading from HDD must be more fast because do not need any query to get file. but also make backup more hard and the chance of conflict between database and HDD data Should i store article Bodies in Database as nvarchar(MAX) OR as text file in Hard drive ? Store in database make database more huge but how much is performance of reading file from hard drive every time ? What to do with downloadable files ? And finally what to do with user comments as maybe they become so many for special article ? What's the best way ? as i must buy host for my application. Thank you
datastoragehard-disks
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

1 Answer

·
Magnus Ahlkvist avatar image
Magnus Ahlkvist answered
I wouldn't say storing binaries and large amounts of text-data in SQL Server is by default the wrong thing to do. As an example, I'm maintaining a picture database, with around 100 GB of product pictures. Every picture is stored as high resolution and in a number of downscaled formats. A web product catalogue is retrieving pictures on demand from this database, as well as a product maintenance system and an application used to upload and download product pictures. When we designed the solution most people thought it would be a terribly bad idea to store binaries in a database but there were reasons for us to at least test what it would mean for our performance. One reason was disk scalability as you mention. If we'd go for a file store on another server, we'd have to run the websites with a domain account. Etc. So we implemented a test solution, where we tested three different scenarios: - 1) Store files in the file system on another server than the webserver accessing them. - 2) Store the files in the file system on the webserver. - 3) Store the files in a SQL Server (on another server than the webserver). I didn't really believe SQL Server storage would stand a chance, at least not to (2). But to my surprise, accessing pictures stored in SQL Server was on average faster than storing files on the file system on another server and equally fast as storing on the webserver itself. Worst case was slower on SQL Server, but best case outperformed both the other scenarios. Without giving it too much thought, I'd say the reason is that SQL Server caches more data than the disc cache, and therefore it's more likely to get a result delivered directly from memory than from disk when using SQL Server. The high resolution pictures in our solution are very seldom queried - it's the lower resolution pictures that are displayed on the web that are often queried. The standard image format for the web catalogues make up around 2GB of the space int he database, and the instance of SQL Server we use for this purpose has a 4GB dedicated memory, so that's a factor to think about. I'm not saying this automatically means that it's always a good idea to store binaries and large chunks of text in SQL Server. I'm saying it's an option that at least could be considered. SQL Server is not automatically slower than a file system in delivering files. It probably depends a lot on the cache hit rate.
3 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

+1. This is a very good answer. About 4 years ago, Microsoft has published a very useful quantitative reasearch paper titled "To BLOB or Not To BLOB: Large Object Storage in a Database or a Filesystem". Here is the link leading to the paper in either pdf or Word format: http://research.microsoft.com/apps/pubs/default.aspx?id=64525. I believe their verdict (for SQL Server 2008 which allows the luxury of decorating the column with FileStream attribute) is this: 256 Kb or less works well as varbinary(max) or nvarchar(max) while 1 MB or more should be implemented as FileStream. Any of the 2 will definitely beat the storage of the contents as files in the file system. Of course it goes without saying that both text and image data types should be avoided at all cost in favour of nvarchar(max) and varbinary (max) respectively. From what I know in practice, varbinary(max) fields are easily serializable to xml/t-sql friendly format out of the box should the data be subjected to replication.
2 Likes 2 ·
Thanks for the reference Oleg. I haven't had the time to look into SQL Server 2008, but FileStream seems lika a good reason to do just that..
0 Likes 0 ·
Filestream is a very nice addition in 2008 and in many ways it is a half way point between storing the information in the database or in the file system.
0 Likes 0 ·

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.