Monday, October 11, 2010
Changing the security context with Dynamic SQL [T-SQL]
I’ve always found that understanding the SQL security model can be a finicky affair; that’s no criticism, its more to do with my reluctance to actually spend time immersing myself in it – I generally know what I need to know and that’s all. I remember back when I was first learning T-SQL I found the whole area of security around dynamic SQL difficult to grok until I saw an actual example of it and I figured it might be useful to anyone in the same position if I were to share a code sample demonstrating these peculiarities. Hence this blog post. The particular vagary that I want to demonstrate is that where the use of dynamic T-SQL inside a stored procedure will change the security context under which that dynamic T-SQL is executed. Code inside a stored procedure will execute as the owner of the stored procedure until some dynamic T-SQL is encountered at which time the security context switches to that of the user that called the stored procedure. Allow me to demonstrate: --Create demo DB with objects SET NOCOUNT ON; USE MASTER BEGIN TRY DROP DATABASE DynSQL; END TRY BEGIN CATCH END CATCH CREATE DATABASE DynSQL; GO USE DynSQL GO CREATE SCHEMA [sch]; GO CREATE TABLE [sch].[foo] ([bar] INT); GO --===The important bit============================ CREATE PROC [sch].[demoproc] AS BEGIN --The following statement will be executed as [sch] SELECT * FROM [sch].[foo]; DECLARE @sql NVARCHAR(MAX) = 'SELECT * from [sch].[foo];'; --The dynamic SQL in @sql will be executed as the user that called [sch].[demoproc] EXEC sp_executesql @sql; END GO CREATE LOGIN [u] WITH PASSWORD = 'p@ssw0rd'; CREATE USER [u] FOR LOGIN [u]; GRANT EXECUTE ON [sch].[demoproc] TO [u] EXECUTE AS LOGIN = 'u'; EXEC [sch].[demoproc]; -- <-Will throw error "The SELECT permission was denied on the object 'foo', database 'master', schema 'sch'." REVERT; --=============================================== --Cleanup DROP USER [u]; DROP PROC [sch].[demoproc]; DROP TABLE [sch].[foo]; DROP SCHEMA [sch]; DROP LOGIN [u] USE MASTER DROP DATABASE [DynSQL]; Let’s take a look at what happens. Firstly we get a (empty) resultset returned from the none-dynamic T-SQL inside the stored procedure because user [u] has got permission to execute [sch].[demoproc] and [sch].[demoproc] has got permission to select data from [sch].[foo]: However over on the messages tab we see that we also get an error from the dynamic T-SQL even though its the same T-SQL statement (SELECT * FROM [sch].[foo];). This is because the security context switches to the caller (i.e. [u]) and that caller does not have permission to select from [sch].[foo]: You can copy/paste the code above and execute it – it shouldn’t leave any objects lying around. Hope this helps! @Jamiet P.S. The code above was formatted for the web using The Simple-Talk Code Prettifier.
tSQL Regular Expressions
Join the UK SQL Usergroup – See Tony in wellies and Hawaiian shirt
Tony Rogerson who co-ordinates the UK SQL Usergroup has set a challenge. If we can get the number of members for the UK SQL Usergroup LinkedIn group to 1000 by the time of SQLBits. He’ll present in wellies and a Hawaiian shirt. So get joining the linkedIn...(read more)
Performance of SQL’s NullIf() Function (versus Case)
20 MVPs and 4 former MVPs at SQLBits
I thought I’d have a little count up and I’ve found that over almost 50% of the sessions will be delivered by current or former SQL MVPs . How awesome is that. Whats great is that we have MVPs from all over the world attending SQLBits. Which goes to show...(read more)
Newly closed Connect items auger well for SSIS in Denali
Todd Mcdermid spoke recently on his blog post Integration Services vNext Coming Soon about how some recently closed Connect items had encouraged him as to the future of SSIS. Also, Matt Masson from the SSIS team has written a similarly encouraging blog post about some upcoming SSIS enhancements at Upcoming Product Changes. Now its my turn. Just this morning I had three Connect items returned to me as "fixed" and, like Todd, I'm delighted to see these items getting closed as such. I have long complained about the shortcomings of SSIS's logging framework as I don't believe it produces enough "context" as to why a container happens to be executing; the consequence being that we end up getting lost in a plethora of log records where we can't see how each one relates to another. These three Connect items give me hope that this issue is getting addressed in the next version of SSIS (aka SQL11 aka Denali). The three Connect items in question are: Can't differrentiate between multiple instances of a task running in parallel - If a container happens to be running in parallel with another instance of itself (e.g. a dataflow task in a package that has been called from two Execute Package Tasks) there is no way to differentiate between the two. A solution would be to provide an Execution identifier for the execution of each container just like the ExecutionGUID that we get for a package. Please put ExecutionGUID property on DtsContainer - Pretty much the same as the last one SSIS: Make container stack available - This refers to what I call the context of a task being executed. We know that a task is executing but what are all the ancestral tasks and containers in the container hierarchy that have led to that task being executed? The provision of more execution metadata for logging purposes gives me hope that the next version of SSIS will have a much better story around logging. Hopefully we won't have to wait long to find out. @Jamiet
Setting up database unit testing as part of a Continuous Integration build process [VS2010 DB Tools - Datadude]
[This blog post assumes a familiarity with the terms Continuous Integration (CI), MSBuild & MSTest. If you don’t have a good appreciation of those terms then this blog post probably isn’t for you anyway so don’t worry about it!] Over the past few days I have been working to get database unit tests executing as part of our Continuous Integration (CI) build and in this blog post I’ll explain how I went about it because it is decidedly not straightforward. We are using the DB Tools in Visual Studio 2010 (aka DBPro or Datadude - I will refer to it as Datadude from here on in) which includes unit testing functionality. The rest of this blog post also assumes a familiarity with database unit testing in Datadude although if you want to do some background reading an excellent place to start is Jamie Laflen’s whitepaper Apply Test-Driven Development to your Database Projects on MSDN. We got to the point where we had a C# test project containing database unit tests that executed successfully inside Visual Studio. For demonstration purposes I have put together a solution that contains a simple database project and a test project containing a database unit test: And just to prove that the test executes successfully in Visual Studio: N.B. The code for the database object(s) and the unit test itself are not important, that is outside the scope of this blog post. At this point we have some tests that run in our development sandbox, the configuration for which is done using the Database Test Configuration dialog: Those config settings are stored in the app.config file which exists as part of our test project (screenshot of which is at the top of this blog post). If we take a look inside that file we can see the ConnectionString that we defined in the Database Test Configuration dialog: Note the <DatabaseUnitTesting> element, we’re going to be coming back to that a little later!! Once we have the tests running in Visual Studio the next step is to get them running inside a CI build and for that we call out to MSTest.exe from inside an MSBuild script: When we execute that script we get some positive results and all looks peachy: OK, we now have a test that runs successfully both in Visual Studio and by calling MSTest.exe directly from MSBuild. The problem I have to solve now though (and this is the real crux of this blog post) is that the test is still running against the server I specified in app.config; I don’t want this, I want the test to run against a server of my choosing, namely my build server. For this I had to call on the help of the aforementioned Jamie Laflen (who, luckily, I have met in the past) and he gave me guidance on how to achieve it. Below are the steps that you need to take. Firstly you need to tell Datadude that you want to specify some different credentials and that is done by adding an attribute AllowConfigurationOverride="true" to the <DatabaseUnitTesting> element of app.config like so: Don’t forget to build your test project after changing the app.config file!!! This will cause datadude unit testing framework to go and search for a file called *.dbunittest.config where the * indicates either: The name of machine upon which the tests are being run or The name of the user running the tests According to Jamie L datadude runs the following precedence checks: Is an override specified in app.config? If not, use app.config Does an override file exist called <machinename>.dbunittest.config? If so, use it, if not… Does an override file exist called <username>.dbunittest.config? If so, use it, if not… Fail! “So”, you may be asking, “what goes in this *.dbunittest.config file then?”. Fair question, that’s where the <DatabaseUnitTesting> element that I mentioned earlier comes in. Copy that element from the app.config file into your *.dbunittest.config file, remove the AllowConfigurationOverride="true" attribute and change the ConnectionString property accordingly: The next step is to tell MSTest.exe that there is another file that it needs to be aware of when it runs the tests and to do that we need to create a custom testsettings file that will be used instead of the default Local.testsettings. To create a new testsettings file use the Add New Item dialog: In the Test Settings dialog go to ‘Deployment’, click ‘Enable deployment’, then ‘Add File…’. Browse to your *.dbunittest.config file and add it to the project: At this point you’re good to go and all you need to do edit your call to MSTest.exe and tell it to use your new testsettings file: Notice the extra information in the output: You’ll have to trust me that it used the new config file and therefore the new ConnectionString, although having said that you do get some useful information in the test results file (*.trx) that MSTest.exe creates: I think its fair to say that this process is a little fiddly to say the least so I’m hoping that Microsoft come up with a better solution in the future, one that integrates better into the whole testing infrastructure. Meanwhile you can download the demo that I put together for this blog post from my SkyDrive. Hope that helps! Comments are welcome. Jamiet UPDATE: Atul Verma has written a 3-part blog series on how to write your database unit tests so if Jamie Laflen's whitepaper (linked to above) doesn't tickle your fancy try reading Atul's series instead: How to unit test SQL Server 2008 database using Visual Studio 2010 How to unit test SQL Server 2008 database using Visual Studio 2010 – Part 2 How to unit test SQL Server 2008 database using Visual Studio 2010 – Part 3
Enforcing naming conventions using database unit testing
My naming convention obsession Anyone that has ever worked with me will tell you that I am a stickler for naming conventions. I have a somewhat obsessive reputation for it; I can’t help it – I seem to have a deep seated uncontrollable desire to ensure that every object in my database(s) is/are named consistently (is there anyone else out there equally as obsessive?). I have tried various techniques down the years to try and enforce naming conventions but none of them really worked. I’ve got scripts that alter object names (such a script is in my script library in fact) but these are only any use if they actually get run, they don’t actually enforce the conventions – that’s a manual step. I’ve thought about using Policy-Based Management (PBM) to enforce naming conventions but given I’m a developer and not a DBA that’s not something that is readily available to me and besides, using PBM to enforce naming conventions is reactive rather than proactive if you are developing the code on a machine where the policies are not enforced. Another option I looked into using was Team Foundation Server (TFS) check-in policies; these are policies that can be applied to artefacts when they get checked-in to TFS’s source control system. This option really appealed to me because the naming conventions could be enforced during check-in (i.e. very very early) and didn’t require DBA intervention. In practice though enforcing naming conventions using TFS check-in policies has a few sizable issues: Its not easy. It would require you to parse the file that was getting checked-in, decide what sort of object is defined in the file, and then check the name of the object based on things like object name, schema, etc... TFS check-in policies are not installed on the TFS server, they are installed on the development workstations. This means there is a dependency and, even though the source code for the check-in policies can be distributed with your application source code, I didn’t really like this. You’re relying on each developer to enforce the check-in policy and with the greatest will in the world….that aint gonna happen. Its too easy to turn them off. There is the obvious dependency on using TFS, not something every development shop uses even in the Microsoft space. Database unit testing to the rescue No, a better solution was needed and I came up with one in the shape of automated database unit testing. I have spoken recently about how I have become a big fan of database unit testing (see my post Experiences from writing sp_CascadingDataViewer – DB unit testing and code distribution) and being able to enforce naming conventions is one very good reason for that. Enforcing naming conventions using automated unit tests has a number of advantages: They can be written against the metadata of the objects themselves (i.e. by querying SQL Server’s system views) so there’s no parsing that needs to be done. They can be employed as part of a Continuous Integration (CI) process and run as a build verification test (BVT). Someone checks-in an object that violates the naming convention? Bang: broken build! Developers can’t circumvent the tests. Nothing needs to be installed on the development workstations. The tests live wholly as part of your source code. Not dependent on use of a particular source control system Hence I have written some unit tests that enforce the following naming conventions: Check constraints must be of the form CK_<schemaName><tableName>_XXX Column names must begin with a capital letter Column names cannot contain underscores Default constraints must be named DF_<schemaName><tableName>_<ColumnName> Foreign keys must be of the form FK_<parentObjectSchema><parentObject>_REF_<referencedObjectSchema><referencedObject>XXX Non-unique clustered keys must be of the form IXC_<schemaName<TableName>_<Column><Column><Column>… Non-unique non-clustered keys must be of the form IX_<schemaName><TableName>_<Column><Column><Column>... Unique clustered keys must be of the form IXUN_<schemaName><TableName>_<Column><Column><Column>… Unique non-clustered keys must be of the form IXUN_<schemaName><TableName>_<ColumnColumnColumn>... Primary keys must be of the form PK_<schemaName><tableName> Stored procedure names should not contain underscores Stored procedure names must begin with a capital letter Table names must not contain underscores Table names must begin with a capital letter I’m not stating that you should agree with these naming conventions (I don’t necessarily agree with them myself – they were defined before I arrived on my current project), the point here is that all of these rules can be enforced and its very easy to do it. Here’s the code for the unit test that enforces the primary key naming convention: /*PK name is PK_<schemaName><tableName>*/ SET NOCOUNT ON DECLARE @cnt INT; SELECT * INTO #t FROM ( SELECT OBJECT_NAME(c.[parent_object_id]) AS [TableName],OBJECT_SCHEMA_NAME(c.[parent_object_id]) AS [SchemaName],c.* FROM [sys].[key_constraints] c INNER JOIN [sys].[tables] t ON c.[parent_object_id] = t.[object_id] LEFT OUTER JOIN sys.extended_properties ep ON t.[object_id] = ep.major_id AND ep.[name] = 'microsoft_database_tools_support' WHERE ep.[major_id] IS NULL AND c.[type] = 'PK' )q WHERE [name] <> N'PK_' + [SchemaName] + [TableName] SET @cnt = @@ROWCOUNT; IF (@cnt > 0) BEGIN DECLARE @msg NVARCHAR(2048); SELECT @msg = '%d Primary Keys do not conform to naming convention (PK_<schemaName><tableName>):' + STUFF((SELECT ', ' + [name] FROM #t a FOR XML PATH('') ),1,2,'') FROM ( SELECT 1 AS [Id],* FROM #t t )q GROUP BY q.[Id] SELECT @msg RAISERROR(@msg,11,1,@cnt); END Essentially all it does is pull all of the primary keys out of [sys].[key_constraints], checks to see what the name should be, then if it finds any that violate the naming convention raise an error containing the names of all the primary keys in question. Here’s the error obtained when running the test against [AdventureWorks] (I’ve highlighted the pertinent bit): Test method Prs.SchemaTests.NamingConventions.PrimaryKeys threw exception: System.Data.SqlClient.SqlException: 70 Primary Keys do not conform to naming convention (PK_<schemaName><tableName>):PK_ErrorLog_ErrorLogID, PK_Address_AddressID, PK_AddressType_AddressTypeID, PK_AWBuildVersion_SystemInformationID, PK_BillOfMaterials_BillOfMaterialsID, PK_Contact_ContactID, PK_ContactCreditCard_ContactID_CreditCardID, PK_ContactType_ContactTypeID, PK_CountryRegionCurrency_CountryRegionCode_CurrencyCode, PK_CountryRegion_CountryRegionCode, PK_CreditCard_CreditCardID, PK_Culture_CultureID, PK_Currency_CurrencyCode, PK_CurrencyRate_CurrencyRateID, PK_Customer_CustomerID, PK_CustomerAddress_CustomerID_AddressID, PK_DatabaseLog_DatabaseLogID, PK_Department_DepartmentID, PK_Document_DocumentID, PK_Employee_EmployeeID, PK_EmployeeAddress_EmployeeID_AddressID, PK_EmployeeDepartmentHistory_EmployeeID_StartDate_DepartmentID, PK_EmployeePayHistory_EmployeeID_RateChangeDate, PK_Illustration_IllustrationID, PK_Individual_CustomerID, PK_JobCandidate_JobCandidateID, PK_Location_LocationID, PK_Product_ProductID, PK_ProductCategory_ProductCategoryID, PK_ProductCostHistory_ProductID_StartDate, PK_ProductDescription_ProductDescriptionID, PK_ProductDocument_ProductID_DocumentID, PK_ProductInventory_ProductID_LocationID, PK_ProductListPriceHistory_ProductID_StartDate, PK_ProductModel_ProductModelID, PK_ProductModelIllustration_ProductModelID_IllustrationID, PK_ProductModelProductDescriptionCulture_ProductModelID_ProductDescriptionID_CultureID, PK_ProductPhoto_ProductPhotoID, PK_ProductProductPhoto_ProductID_ProductPhotoID, PK_ProductReview_ProductReviewID, PK_ProductSubcategory_ProductSubcategoryID, PK_ProductVendor_ProductID_VendorID, PK_PurchaseOrderDetail_PurchaseOrderID_PurchaseOrderDetailID, PK_PurchaseOrderHeader_PurchaseOrderID, PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID, PK_SalesOrderHeader_SalesOrderID, PK_SalesOrderHeaderSalesReason_SalesOrderID_SalesReasonID, PK_SalesPerson_SalesPersonID, PK_SalesPersonQuotaHistory_SalesPersonID_QuotaDate, PK_SalesReason_SalesReasonID, PK_SalesTaxRate_SalesTaxRateID, PK_SalesTerritory_Territor... I am currently including these tests inside a C# test project inside Visual Studio 2010. Visual Studio has a rather nice feature that allows you to link to artefacts in other projects and hence we can host our single test class containing all of these tests in one place and link to it from whichever test project we want (typically you will have a test project per database) thus following the DRY principle. Here I show the dialog that demonstrates adding a link to an existing test class: And how it appears in the project. Note that NamingConventions.cs exists in both test projects but one is just a link to the other: Wrap-up I’m not sure my colleagues are too happy about these new tests given that they’re now breaking the build more often but nonetheless I think they realise the value (I guess I’ll find out tomorrow when they read this!!!) All-in-all its working very well for us and I’m now a very happy bunny knowing that naming conventions are being enforced and will continue to be so with zero effort from here on in. I have made the test class that contains all of the tests that I detailed above available on my SkyDrive at http://cid-550f681dad532637.office.live.com/self.aspx/Public/BlogShare/20101005/MIControlTest.zip. If you want to use it you should simply be able to drop it into an existing C# database test project and away you go (change the tests to suit your naming conventions of course though). Hope this helps. If it does please let me know, I’d really love some feedback on this. @Jamiet
SQLBits videos (distinctly amateur)
Hi all, SQLBits 7 is here, I’m here, and I’m walking around being very obnoxious and sticking a video camera in people’s faces. Thankfully nobody has offered me outside yet so if you’re interested in having a look at what’s going on in and around SQLBits check out my videos at http://www.youtube.com/user/jamiekth Here’s a few for your delectation. more coming, I promise!! UPDATE: The videos are now available as a single-click playlist at http://www.youtube.com/view_play_list?p=01617594AB55E450 Brent Ozar SQLBits backstage - Rob Farley and Chris Webb shoot the breeze SQLBits entrance hall - Gavin Payne, Andre Kamman, Jon Kehayias More to come throughout the weekend so keep checking out my YouTube channel: http://www.youtube.com/user/jamiekth and keep an eye on my Twitter feed where I’ll be posting them all too! @Jamiet UPDATE: Direct links to the videos here: SQLBits backstage - Chris Webb SQLBits backstage - Rob Farley and Chris Webb shoot the breeze SQLBits backstage - Tim Kent SQLBits entrance hall - Gavin Payne, Andre Kamman, Jon Kehayias SQLBits York - Allan Mitchell SQLBits attendees SQLBits 7 - York - Kevin Kline SQLBits 7 - York - Brent Ozar SQLBits 7 - York - James Rowland-Jones SQLBits 7 - York - Delegates at lunch SQLBits 7 - York - Dave Bally SQLBits 7 - York - Chris Testa-O'Neill & Maciej Pilecki SQLBits 7 – York – Neil Hambly SQLBits 7 - York - Simon Sabin (and his phone) SQLBits 7 - York - Buck Woody's hat (and Buck Woody)
Take your script library with you [T-SQL]
As SQL professionals I’m sure many of you build up a library of SQL scripts that you take from project to project, right? Here’s a tip for those of you that have such a collection and want to make sure that you always have ready access to it. I’m a big fan of all things sync-related and there are many tools on the market now that will sync the contents of a Windows folder to the cloud or around different machines; Dropbox is a popular one, as is Windows Live Mesh which I use and I also hear good things about SugarSync (they’re all free by the way). It doesn’t matter which one you use, the basic premise of these tools is that any file dropped into a monitored folder automatically gets copied up to a cloud store and whenever you buy/inherit a new machine simply installing the client will bring that stuff down for you. In short, its an ideal solution for always having access to your SQL script library. Here’s a screenshot of the Windows Live Mesh client: Notice that I have a folder called “Personal SQL Store” (I also have one for my Powershell scripts too). Its synced to two of my laptops and also to SkyDrive Synced Storage which is Windows Live’s name for the cloud store that I referred to earlier. Up at that cloud store I have ready access to the contents of that folder: It makes it really simple to get to those folders from any computer in the world (although admittedly I have occasionally come a cropper thanks to some over zealous IT departments) and I’d highly recommend using a similar system if you have such a script library that you always need access to particularly if, like me, you’re always on the move around different client sites. Alternatively if you don’t have your own set of scripts I have the option to share my folder with up to nine people so if you want read-only access to my script library let me know. UPDATE: Adam Machanic posted a comment that made me realise it was pretty stupid to invite you to share my script library without telling you what was in it. Hence: cdf.TitleCase - Returns Title Case Formatted String dbo.FindNumberOfOccurencesOfOneStringInAnother fnSplitWithCTE - take a delimited list of characters and return them as a dataset All Sessions Data cache usage per table DatawarehouseCleardown - Deletes all data from all tables in a database, respecting FKs. Discovering_tempdb_Stats – Various queries to return info about tempdb Empty-SQL-Cache Get Index Fragmentation Get Table Row Counts Index information including column names Paul Randal myth - DDL Triggers are INSTEAD OF triggers Paul Randal myth - Null bitmap myth Split csv into a dataset using XML & XQuery SSISPackageStatsCollector_v2 tablestats – Stats for every table in a DB View all permissions csp_ToolRenameObjects2005 – Rename FKs, CHECK constraints, Indexes according to a defined naming convention. Work on SQL2005 and above dbo.csp_ToolIDAssignSurrogateKey – Assign/Reassign surrogate keys on a table sp_generate_inserts2005 – Generate inserts according to existing data in a table. Works on SQL2005 and above SqlClrToolkit – a suite of useful SQLCLR procs/functions. FIRST/LAST/MERGE aggregate functions, SafeDivision, RegexMatch, CountString, CapitaliseText, RemoveWhitespace, DynamicPivotSQL, Split, DynamicPivot, DirectoryInfo, RegistryInfo, FindChildTables, EnableDisableConstraints SSAS clearcache Hopefully the names are self-explanatory enough!UPDATE 2: Jezza101 left a comment below suggesting that I just zip the whole lot up and post it somewhere which is what I have done at: http://cid-550f681dad532637.office.live.com/self.aspx/Public/BlogShare/20101003/Personal%20SQL%20Store.zip Hope this is useful! @Jamiet
Experiences from writing sp_CascadingDataViewer – DB unit testing and code distribution
Its now been a month since I introduced sp_CascadingDataViewer and I wanted to take the opportunity to talk about a couple of my experiences while writing it. SQL Server needs packages Version 1 of sp_CascadingDataViewer is an 899 line stored procedure (view the code at changeset 58999), that’s a lot of code and much of it is repeated. I would have loved to have hived off different parts into dedicated functions but I couldn’t do that because my aim was to keep everything in a single stored procedure thus making it as easy as possible for someone to add it to their SQL Server instance. That is an unfortunate trade-off that we have to make with SQL Server. Oracle has a solution to this problem called packages. A package is ostensibly a collection of stored procedures and functions that can be distributed and installed as a single unit. That makes them extremely portable and would be an ideal mechanism for me to distribute a collection of stored procedures and functions that make up Cascading Data Viewer. SQL Server needs something equivalent to Oracle packages. You need to be doing automated database unit testing sp_CascadingDataViewer wouldn’t exist without automated database unit testing in Visual Studio. Period. With something as complicated as sp_CascadingDataViewer there were many many combinations of code, schema and data that could cause is to break. I had to deal with all those combinations and have a way of ensuring that any changes that I made did not break something else and automated testing was the means with which I did that. I wrote the code for sp_CascadingDataViewer in Visual Studio rather than SQL Server Management Studio (as I do for all my T-SQL code these days) which meant that my code and the tests lived in the same solution. At the time of writing that solution includes 30 tests and all are available in the source code on Codeplex: With a simple key chord (CTRL+R, CTRL+A) I could deploy my code and run all my tests against it – that’s an incredibly powerful mechanism and I actually find it to be very productive method of development even if you have many tests that need to be run. Writing sp_CascadingDataViewer switched me on to the value of automated database unit testing and I now advocate its use wherever I go. @jamiet
My SQLBits presentation this coming Friday – SSIS Performance Tuning
SQLBits is once again upon us, this time its number 7 and it starts on 30th September 2010 in York. I’ve been looking forward to this one since it was first announced because its in my original neck of the woods so as well as going to this great conference I get to go and visit family as well. The guys, Simon, James, Chris, Allan, Darren, Martin & Chris (and anyone else I have forgotten sorry), always put on a great conference and I’m sure this one will be no exception. I’ve taken a look at the agenda and am slightly disappointed to learn that I’m on at the same time as folks that I would really like to go and see for myself; I was especially keen to go to Jonathan Kehayias’ talk on Extended Events, sadly not to be: Can’t wait for this Friday, hope to see you there! @Jamiet
Be wary of using UNC paths in SSIS packages
I have recently discovered what I believe to be a bug in SQL Server Integration Services (SSIS) 2008 and am taking the opportunity herein to warn about it so that nobody else gets stung by it. The bug concerns the use of Uniform Naming Convention (UNC) paths inside a .dtsx package (i.e. paths that start with a “\\”). I have managed to reproduce a situation where a package will attempt to validate a File Connection Manager containing a UNC path even if the task(s) that use that Connection Manager have got DelayValidation=TRUE. In other words, the package may attempt to validate a Connection that will fail and this will cause errors in your package. The first screenshot below shows the execution of my repro package. Notice: When the package starts up @[User::FakeFileLocation] points to a non-existent drive "u:" Also be aware of a few things that aren’t evident from the screenshot: The ConnectionString of the "FILE" connection manager is set to @[User::FakeFileLocation] The script task changes @[User::FakeFileLocation] to be the same as @[User::RealFileLocation] The dataflow task, the connection manager & the package all have DelayValidation=TRUE In this case everything works OK, no errors are thrown, everything is peachy. Now take a look at this second screenshot. It shows the same package as before but with one important difference, @[User::FakeFileLocation] has been changed from u:\NonExistentFile to \\u\NonExistentFile. Notice in the output that we have some errors because SSIS has attempted to validate that UNC path; it did not attempt to do so before the change. One other point of note is that these errors did not cause the package to stop executing however the error will still “bubble-up” to any calling package and will be interpreted as a package failure which is what makes this such a dangerous bug. I have submitted the repro to Microsoft at https://connect.microsoft.com/SQLServer/feedback/details/585479/. No reply as yet. @Jamiet UPDATE: Microsoft have acknowledged the bug and have resolved to fix it in SQL11 (aka SQL Server Denali). See here for the following response: We believe we found the cause of the problem. One of the system API we use to help resolve path can return different error codes depending on the network configurations and what are available. We will fix the bug in Release 11 of SSIS.
Only a few places left for the SQLBits training days
All the training day sessions for SQLBits 7 are really popular which means there are only a few places left on each of them. I know lots of people that are thinking of going to the training day and haven’t booked yet. if you are one of them make sure...(read more)
Don’t turn the dataflow into a cursor [SSIS]
I saw a thread on the SSIS forum today that went something like this: I have the following dataset: AccountNo Date DailyMovement 00000001 28/08/2010 10 00000001 29/08/2010 5 00000001 30/08/2010 7 00000002 28/08/2010 8 00000002 29/08/2010 6 for which I want to compute a running total per [AccountNo] & [Date] like so: AccountNo Date DailyMovement RunningTotal 00000001 28/08/2010 10 10 00000001 29/08/2010 5 15 00000001 30/08/2010 7 22 00000002 28/08/2010 8 8 00000002 29/08/2010 6 14 How do I do that using a script component? That last comment: How do I do that using a script component? is a fairly common question. People assume that if a calculated value is dependant on prior rows then a script component needs to be involved because that running total needs to be tracked somewhere, but that isn’t necessarily the case. Ask yourself, how would you do this if the data were residing in a database table, would you do this?: SELECT '00000001' AS [AccountNo], CONVERT(date,'20080828') AS [Date],10 AS [DailyMovement] INTO #balances UNION ALL SELECT '00000001' AS [AccountNo], CONVERT(date,'20080829') AS [Date],5 AS [DailyMovement] UNION ALL SELECT '00000001' AS [AccountNo], CONVERT(date,'20080830') AS [Date],7 AS [DailyMovement] UNION ALL SELECT '00000002' AS [AccountNo], CONVERT(date,'20080828') AS [Date],8 AS [DailyMovement] UNION ALL SELECT '00000002' AS [AccountNo], CONVERT(date,'20080829') AS [Date],6 AS [DailyMovement] SELECT [AccountNo] , [Date] , [DailyMovement] , ( SELECT SUM(DailyMovement) FROM #balances b2 WHERE b1.[AccountNo] = b2.[AccountNo] AND b1.[Date] >= b2.[Date]) FROM #balances b1; OK that works, and here’s a screenshot to prove it: But really, would any of you actually do this? Hopefully not, the use of the correlated subquery has simply turned what should be a nice set operation into a cursor-in-disguise (something I have talked about before) because that subquery will be getting executed for every row in the table. Instead you could run the following on that same dataset: SELECT b1.[AccountNo] , b1.[Date] , b1.[DailyMovement] , SUM(b2.[DailyMovement]) AS [RunningTotal] FROM #balances b1 INNER JOIN #balances b2 ON b1.[AccountNo] = b2.[AccountNo] AND b1.[Date] >= b2.[Date] GROUP BY b1.[AccountNo],b1.[Date],b1.[DailyMovement]; and you get the same result but with a much more amenable execution plan (execute with SET STATISTICS IO ON if you don’t believe me)! The same principle applies in a SSIS dataflow. Often there is no need to resort to a script component, the same result can be achieved using some smart dataflow design. You can use, for example, the same approach as in the second SQL query (above) using a combination of SSIS’ Sort, Aggregate, Merge Join and Conditional Split components. That is not to say that one will be quicker than the other but at least you’ll have something that is more intuitive and arguably more maintainable. Of course if performance gain is your primary goal then the correct advice is, as always, “test and measure, test and measure, test and measure”!!! Ask yourself “If I could, how would I solve this using T-SQL?” and see if that influences your dataflow design at all. Invariably script components should be the last choice, not the first! @jamiet
New training session for SQLBits
After some thought we’ve decided to change Buck Woody’s training day session. Instead of doing a full day on career development he will be doing a session on SQL Server for the Non DBA. This is a great session for anyone that has been thrown into running...(read more)
tSQL concatenate strings “1, 2, 3” and select results, too
Last minute availability for SQLBits Training Days
Due to a cancellation there is some last minute availability on Chris Webb’s Performance Tuning Analysis Services and Chris Testa-Oneil’s Implementing Reporting Services training days. If you haven’t registered for a training day then register quick as...(read more)
Some thoughts on Visual Studio database references and how they should be used for SQL Server BI
Over the past few weeks I have been making heavy use of the Database tools in Visual Studio 2010 (formerly known as datadude, DBPro, VSTS for Database Professionals or one of a plethora of other names that it has gone by over the past few years) and one of the features that has most impressed me has been database references. Database references allow you to have stored procedures in your database project that refer to objects (tables, views, stored procedures etc…) that exist in other database projects and hence when you build your database project it is able to resolve those references. Gert Drapers has a useful introduction to them at his blog post Database References; in his words database references allow you to: represent and resolve 3 and/or 4-part name usage inside a database project. Database references are conceptually the same as assembly references inside a C# or VB.NET projects; they allow you to reference objects from your database project that live inside another namespace (database). It occurred to me that similar functionality would be incredibly useful for SQL Server Integration Services(SSIS), Analysis Services (SSAS) & Reporting Services (SSRS) projects. After all reports, packages and data source views are rife with references to database objects – why shouldn’t we be able to have design-time dependency checking in our BI projects the same way that database and .Net developers do? Here are some examples of ways in which this would be useful:If a SSRS report pulls data from a view, I firstly want to know (at design-time) that that view exists. Secondly, if the view definition changes I want to know that my report will still work.Similarly for a SSAS cube, if a view referenced in the DSV changes or disappears I want my SSAS project build to fail or succeed accordinglySimilarly for a SSIS Execute SQL Task, if it executes a stored procedure I want to know that the named parameters that I am using actually exist in the stored procedure. To this end I have submitted three identical suggestions to Connect, one each for SSIS, SSAS & SSRS: [SSAS] Declare database object dependencies[SSRS] Declare database object dependencies [SSIS] Declare database object dependencies I have said before in my blog post The SQL developer gap that we database and BI developers deserve as much love from Microsoft as our .Net counterparts do and I’m of the opinion that providing design-time dependency checking across the full gamut of BI projects would be a huge step in the right direction. The holy grail is an object dependency graph from database tables all the way up through database views, database functions, stored procedures, DAL, ORM layer & business logic layer all the way through to presentation – I don’t know how long it will take but we will get there one day I am sure, hopefully sooner rather than later! @JamieT
SSIS MSBuild task now included in MSBuild Extension Pack
The information in this blog post may be of interest to anyone out there that is using Continuous Integration with msbuild in order to build/deploy a project that includes SSIS packages. The MSBuild Extension Pack at http://msbuildextensionpack.codeplex.com/ now includes an MSBuild task that will build an SSIS project, the output from which is a .deploymentmanifest file that can be used in concert with the SSIS deployment utility. Building a SSIS project using the SSIS MSBuild Task essentially has the same effect as right-clicking on a SSIS project and selecting Build with CreateDeploymentUtility=True (i.e. produces a .deploymentmanifest file): The advantage of using the MSBuild task is of course that you can now do it in an MSBuild script, none of this manual mouse-clicking required thank you very much! The SSIS MSBuild task was previously available in the SSIS community samples project at http://sqlsrvintegrationsrv.codeplex.com but having it as part of the MSBuild Extension Pack is definitely advantageous as this is a very widely used set of utilities. The source code has been available in the MSBuild Extension pack since changeset 54481 which was checked-in on 26th August 2010 and was included in the August 2010 Release. Note that in order to deploy your packages after using the SSIS MSBuild task you will need to call the SSIS deployment utility and pass in the outputted .deploymentmanifest file. Information on doing that is decidedly thin on the ground so I’ll try and elucidate. The SSIS Deployment Utility is an executable that gets installed with SQL Server Integration Services workstation tools and is called dtsinstall.exe. You can call it on the command-line and pass in the path to a .deploymentmanifest file but as far as I can determine there is no way to do an unattended install (i.e. there is no way to tell dtsinstall.exe where to install the packages to, it simply opens up a GUI and prompts you for this information). This is a horribly antiquated method of deployment and I’m hunting around to see if there’s anything I’ve missed here – if you know any better please let me know. Hope this helps! @Jamiet
tSQL Running Total: CTE wins
Dates don’t work when copied from SQL Profiler - Error converting data type varchar to datetime
If you are running with British english as your language or some other language with a different date format to the guys in the US then you will have come across this handy error. Msg 8114, Level 16, State 5, Procedure foo, Line 0 Error converting data...(read more)
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment