VDA401: Advanced Data Access Patterns with ADO.NET 2.0 : VDA401: Advanced Data Access Patterns with ADO.NET 2.0 Julie Lerman
The Data Farm
jlerman@thedatafarm.com
About Me : About Me Independent Software Developer
20 years developing
Microsoft .NET MVP
ASPInsider
INETA Speaker
Various publications & conferences
Blogs: thedatafarm.com/blog, blog.ziffdavis.com/devlife
Founder and leader of Vermont .NET
INETA Advisory Board Member
Agenda : Agenda ADO.NET 2.0 DataSet performance
Pushing huge data back to SQL Server
Caching huge amounts of data
Querying huge amounts of data in memory
ADO.NET 2.0 Features for this talk : ADO.NET 2.0 Features for this talk Batch Updating with DataAdapter
SqlBulkCopy Class
SqlDependency Class
ADO.NET 2.0 Engine Improvements : ADO.NET 2.0 Engine Improvements DataSet Scalability across the board
Can handle lots of rows
Indexing Engine completely re-written
Incremental indexing updates
much more
Low overhead for using Data Views
Serialization: Faster, more compact
Supports Binary & Schema-less serialization
DEMO : DEMO Indexing Engine Performance
Moving Huge Amounts of Data : Moving Huge Amounts of Data Getting data is easy and fast!
Uploading is a different story
1.1 Update Options
DataAdapter.Update – one row at a time
Controlled stored procedure update – one row at a time
Updating in 2.0 : Updating in 2.0 DataAdapter Batch Update
Pushes groups of rows up to SQL Server
SQL Server still updates one at a time
Find sweet spot for UpdateBatchSize value
Bulk Copy
Streamed Insert using SQL Server Bulk Copy *slide slightly modified from original printed version
DEMO : DEMO Batch Update & Bulk Copy Class
Best of all worlds – BCP & DML : Best of all worlds – BCP & DML Use BCP to stream Inserts & Updates and Deletes to temptables
Create & run a DML TSQL command to process inserts, updates and deletes from the temptables into the real tables
Wrap it all in a transaction
End to End streaming with DataReader DML: Database Manipulation Language *slide slightly modified from original printed version
DEMO : DEMO Using BCL & DML to process huge amounts of data
DataSet Caching : DataSet Caching DataSet improvements enable storing lots of data into cache
SqlDependency enables automated notification of changes to .NET
Combine SqlDependency and Caching for serious resource reduction
DEMO : DEMO SqlDependency & DataSet Cache Part 1
Big DataSet cache begs granularity : Big DataSet cache begs granularity SqlDependency is all or nothing
Create separate dependencies for logical sections of your cache
DEMO : DEMO SqlDependency & DataSet Cache Part 2: Achieve granularity
Divide & Conquer your huge dataset : Divide & Conquer your huge dataset In-Memory Querying is coming!
LiNQ
VB9/C#3
What about today?
ADO.NET 2.0 has everything you need to build a simple query processor
Great example of QP from ADO.NET Team Technical Lead Pablo Castro (“for demo only”)
High Level view of Query Processor : High Level view of Query Processor Familiar Query Syntax
eg SELECT, WHERE, JOIN
Query is parsed to “Algebra Tree”
Some of the Algebraic Operators
Scan: enumeration over the logical tree
Projection: identifying what columns to return
Filter, Join, Sort
Just a data structure
Iterator is created for each operator
Does the actual work of processing the data and extracting what is defined by operator
Projection Iterator returns all the rows but only containing the desired fields
Filter iterator processes all the rows and returns only rows matching filter rules
Roll your own Query Processor : Roll your own Query Processor cached
data Filter
Iterator Project
Iterator Narrowed down to specific columns Flattened to match criteria Enhanced with related data
Our Custom QP Iterators : Our Custom QP Iterators Scan
Enumerates over dataset rows (eg leaf-nodes of tree)
Select (Project)
DataSet Expressions can pull out columns
Filter
DataTable.Select for on-the-fly indexing
Consider DataView for a permanent index and it’s raw .NET 2.0 power
DataTable.Select can leverage this index
Sort
Bundle this with the filter
leverage DataTable.Select or DataView and indexing
Use DataView’s new power for JOIN : Use DataView’s new power for JOIN JOIN will be your weakest link in a query operation
Iterates through joined table for every row in primary table
Try a DataView instead
DataView’s indexing will speed this up
Beware Operator Constraints : Beware Operator Constraints DataView/DT.Select uses == and !=
need to be creative for implementing things like >, IN or BETWEEN
Can’t leverage indexing
Roll your own?
DEMO : DEMO A look at a sample Query Processor written with ADO.NET 2.0
Summary : Summary ADO.NET 2.0’s performance and functional improvements put new power in our hands
You can work with huge amounts of data efficiently
Functionality like BatchUpdate gives you easy access to some of this power
With a little more code, you can do wonders!
Resources : Resources PDC05 Session DAT408, Pablo Castro, ADO.NET Team Technical Lead
ADO.NET Team Blog: blogs.msdn.com/dataaccess
MSDN Data Access Dev Center
msdn.microsoft.com/data
What’s New in ADO.NET 2.0, Julia Lerman, MSDN Magazine April 2005
ADO.NET 2.0 Bookshelf is filling up : ADO.NET 2.0 Bookshelf is filling up ADO.NET and System.Xml v 2.0 – The Beta Version: Alex Homer, Dave Sussman & Mark Fussell, Addison-Wesley
PRO ADO.NET 2.0, Sahil Malik, APress
Professional ADO.NET 2: Programming with SQL Server 2005, Oracle, and MySQL, Wallace B. McClure, Gregory A. Beamer, IV John J. Croft, J. Ambrose Little, Bill Ryan, Phil Winstanley, David Yack, Jeremy Zongker, WROX Press
Programming Microsoft ADO.NET 2.0 Applications: Advanced Topics, Glenn Johnson, Microsoft Press
Microsoft ADO.NET 2.0 Step by Step, Rebecca Riordan, Microsoft Press
Contact Info : Contact Info Julie Lerman
jlerman@thedatafarm.com
www.thedatafarm.com
Blogs
www.thedatafarm.com/blog
blogs.ziffdavis.com/devlife