question

Roman Pokrovskij avatar image
Roman Pokrovskij asked

assync IO in the SQL Server

I'm interested in to understand how SQL Server (all versions) make a decision to read pages using assync IO : 1) to use or not to use assync IO; 2) how many calls to start. My concrete problem that I can't explain 800 avg.queue length during 15 minutes. I found a session that generates huge amount of reads (1gb) but it is not enough to explain: huge scans are evil but they never generates such absurd queue's within my knowledge. I understand that there could be a more complex scheme: when SQL Server starts assync calls as many as it need (for example if it in need of 200 pages - it starts 200 assync reads), then Windows Assync IO API decides start them asynchronously or not... Thats how it works by Windows Assync IO API, but really I'm not sure does sql server use standard windows assync api or use its own. Thank you for any ideas and thank you very much for links to documentation. UPDATE: Hardware: 1) 32 bit, 9 GB in AWE buffer. 2) IO - Fibber Channel to SAS disk array; 800 is total = 500 on disk where mdf islocated, 300 on the disk where ndf is located. New information: I've found that in "normal" situation the same scan produce only 6-10 members in the queue and then there are only 3 entries in the dm_io_pending_io_requests at one moment. It seems that AWE flash is responsible for huge queue, but I still want to know is there any control/management of queue length on the system/server/sqlserver level? I start thinking that at least in the case of fiber channel/disk-array this control is just impossible: when may be it is possible to know what queue is on this logical volume, it's to costly to decide how many assync reads could be performed effectively.
sql-server-2005disk-io
3 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Blackhawk-17 avatar image Blackhawk-17 commented ·
How large are the underlying tables in question? Would they "fit" in cache or do they need to overwrite it? And does the execution plan use a clustered index?
0 Likes 0 ·
Roman Pokrovskij avatar image Roman Pokrovskij commented ·
OK if you want. It's about 1Gb (with indexes) and a 600Mb without, and yes there is clustered index scan in the execution plan. But actually I'm interesting to know not why I have a queue, but why it is so huge...
0 Likes 0 ·
Blackhawk-17 avatar image Blackhawk-17 commented ·
Are these only disk ASYNC or also Network?
0 Likes 0 ·
Grant Fritchey avatar image
Grant Fritchey answered
Just a suspicion, but I'd look at parallelism as a possibility there. That would make sense for why you're seeing such heavy queuing all at once.
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Roman Pokrovskij avatar image Roman Pokrovskij commented ·
As I know parallelism should be somehow be "watchable" in the plan. In this case plan is normal - just index scan.
0 Likes 0 ·
Blackhawk-17 avatar image Blackhawk-17 commented ·
An index scan is still a heavy operation... and it can be a parallel execution.
0 Likes 0 ·
Wilfred van Dijk avatar image
Wilfred van Dijk answered
What is your hardware configuration? Do you have your data and log separated? Is somebody hammering TempDB? Maybe Os related? (virusscanner, Pagefile activity?)
2 comments
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Roman Pokrovskij avatar image Roman Pokrovskij commented ·
OK. If you want... Hardware is crazy (32-bit, Enterprise Edition 9 GB AWE buffer), data and log are separated, TempDB not used. May be OS related, but not virusscanner. May be pagefile activity... Whole situation is very complex. There I'm looking just for Assync IO knowledge. Just want to know how SQL Server decide how many Assync IO calls to start. It's should be not very complex question, but I can't find any documentation about this.( OK, I suspect IO controller error, so I need to know how IO controller/disk participate in this decision making process).
0 Likes 0 ·
Blackhawk-17 avatar image Blackhawk-17 commented ·
What other consumers are running on the system? Are there applications or is this server only running SQL Server? Is the virus scanner set to exclude dB files?
0 Likes 0 ·
WilliamD avatar image
WilliamD answered
There is some awesome information awavilable on SQL Server IO on MSDN. The first thing to read and understand would be [SQL Server 2000 IO Basics][1], followed by [SQL Server IO Basics - Chapter 2][2] the first article is based on SQL 2000 (obvious by the title), but is a strong base for how SQL Server deals with I/O when talking to Windows and NTFS. There is a good section explaining the async IO methods (SQL Server uses the Windows APIs as you guessed). The major thing to consider is the read-ahead ability of SQL server, especially as you are using Enterprise Edition - as that feature is beefed up for EE. Those docs should shed some light on the whole thing. If you ever get chance, I suggest you watch Bob Ward present on the SQL Engine and compelte underbelly of SQL Server - he is/was a Senior Escalation Engineer for SQL Server and dives *deep* into the engine internals and talks about stuff like async io. He co-authored/tech reviewed those whitepapers I linked to too. [1]: http://technet.microsoft.com/en-us/library/cc966500.aspx [2]: http://technet.microsoft.com/en-us/library/cc917726.aspx
1 comment
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Roman Pokrovskij avatar image Roman Pokrovskij commented ·
Thank you. Where I could find Bob Ward video? I know about those two articles. There are an information that SS use async IO reading pages to buffer, but those documents does not explain haw many calls would be started, possibly because it is very natural for authors - one page, one call (for me this method looks too dangerous to be natural - it violates other session right to read quickly)..
0 Likes 0 ·
Blackhawk-17 avatar image
Blackhawk-17 answered
The SQLOS decides if calls will be ASYNC or not. **SYNC** calls follow the Windows API and enter the Windows queue. If the resource requested is not available the O/S puts it on a wait queue until it is. **ASYNC** calls tell Windows to initiate them straightaway. SQLOS then creates its own internal wait queues until Windows says the desired resource is available. In your question you say the table is a 600 MB Clustered Index. With an index scan the query will attempt to read through all 600 MB. How long does this query generally take? What stats are you looking at in arriving at your conclusions? Want to make sure we're all comparing the same apples :) Can you provide some PerfMon I/O metrics for us? Edit ==> **io_pending From BOL:** > Indicates whether the I/O request is > pending or has been completed by > Windows. An I/O request can still be > pending even when Windows has > completed the request, but SQL Server > has not yet performed a context switch > in which it would process the I/O > request and remove it from this list. So io_pending may not truly represent reality in all cases. Another Edit ==> A quote from **Paul Randal**: > High disk queue length = an I/O > subsystem that isn't optimal for the > load.
10 |1200

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.

Write an Answer

Hint: Notify or tag a user in this post by typing @username.

Up to 2 attachments (including images) can be used with a maximum of 512.0 KiB each and 1.0 MiB total.