When working with legacy versions of SQL Server, if an agent job fails with error 22050 (“error formatting query…”), it is entirely possible that the issue has nothing whatsoever to do with query formatting or parameterization. It is possible for a job to fail due to other causes, even if the query is formatted perfectly, while still seeing this error. The original error message can often be obtained by using the profiler. Thank you to Gianluca Sartori for suggesting that I try using the profiler when I was stuck on this issue.
The following was inspired by an actual story, but the names of all tables and indexes have been changed to protect the innocent. The specific system that I was working with at the time was running SQL Server 2008 R2 Standard edition on Windows Server 2008 R2. And yes, I know this blog post is already obsolete as I write it because I have not been able to reproduce the issue in SQL Server 2012 but it is still one of the more interesting issues that I’ve looked into lately. Additionally, at least as of the time of this writing, I’ve found that web search results for this particular error are less than helpful most treatments of this error are rather terse.
Enough with the disclaimers. A few months ago I was trying to get a handle on what tables had the most serious fragmentation issues. Intending to do something quick and dirty I just threw together a quick SQL job to send me email every morning while I got a gut feel for the database. If the below query doesn’t look familiar to you and you’re curious what I’m up to you can look, for example, here. The job contained a single step, which was
exec msdb.dbo.sp_send_dbmail @email@example.com', @subject='Fragmentation Report', @execute_query_database='IBrokeMyDB', @query = 'SELECT OBJECT_NAME(a.object_id) AS [Table], b.name, a.avg_fragmentation_in_percent, a.avg_page_space_used_in_percent, a.page_count FROM sys.dm_db_index_physical_stats(5, NULL, NULL, NULL, ''Sampled'') a INNER JOIN sys.indexes b ON a.object_id = b.object_id and a.index_id = b.index_id WHERE page_count >= 750 ORDER BY avg_fragmentation_in_percent DESC'
Or if you prefer a screenshot
The first few steps in the deployment process went fine. There were no issues setting up Database Mail and all of the test emails came through fine. Also, executing the above call in a query window worked fine and the email arrived as expected. When set up as an agent job, though, there was no joy.
Hmmmm. So the error suggests there is an issue with parameters, and I do have a quoted string in the parameter which had to be put in double single quotes since the entire query was quoted. The default for the mode parameter is ‘Limited’, which is actually fine with me, so I updated my exec statement to
exec msdb.dbo.sp_send_dbmail @firstname.lastname@example.org', @subject='Fragmentation Report', @execute_query_database='IBrokeMyDB', @query = 'SELECT OBJECT_NAME(a.object_id) AS [Table], b.name, a.avg_fragmentation_in_percent, a.avg_page_space_used_in_percent, a.page_count FROM sys.dm_db_index_physical_stats(5, NULL, NULL, NULL, NULL) a INNER JOIN sys.indexes b ON a.object_id = b.object_id and a.index_id = b.index_id WHERE page_count >= 750 ORDER BY avg_fragmentation_in_percent DESC'
Which, unfortunately, still doesn’t work
Hmmmmm. I swear I’ve done this before and it’s worked fine. At this point I’m not sure if I’m looking at a problem with dm_db_index_physical stats or some kind of permissions issue with the “sys” schema. I decide it’s easier to test for the latter and update the job step to
exec msdb.dbo.sp_send_dbmail @email@example.com', @subject='Fragmentation Report', @execute_query_database='IBrokeMyDB', @query = 'SELECT b.name FROM sys.indexes b'
Upon re-running the job, an email message listing all of the indexes created in the database appeared in my mailbox. Hmmmm. So the good news is that I’m not looking at some kind of weird issue with the sys schema, but the bad news is that I am stumped. Since I’ve done exactly this kind of thing many times before I can’t really convince myself there is an issue with using dm_db_index_physical_stats in this way. But is there any chance I’ve just been lucky so far? I wish I could say I figured the rest of this out on my own, but at this point I hit #sqlhelp on twitter to ask if there were any known issues like this with dm_db_index_physical_stats and after a few tweets Gianluca Sartori suggested that I try profiler.
In my case, the key to getting the real error was to include the “User Error Message” event to the profile. I used the standard profile and added this event to it. I then started the trace and ran the job as I had before. It helps a lot that the error messages appear in red, but after slogging through the trace output I eventually found the error message “The user does not have permission to perform this action.” In other words, the user that SQL Server Agent runs as has permission to access sys.indexes but not sys.dm_db_index_physical_stats.
This is the nudge that I needed. At this point it finally dawned on me that dm_db_index_physical_stats is actually a function, not a table or view. So one really big difference between it and sys.indexes is that a function requires execute permission. After a little more digging I discovered that
- The agent user was not a member of the sysadmins server role. I suspect this was accidental, the developer who handled the initial deployment phases probably did not fully appreciate the importance of using Configuration Manager.
- The user had been manually added to the SQLAgentOperatorRole, SQLAgentReaderRole, and SQLAgentUserRole database roles in the MSDB database so the agent worked fine, as long as appropriate database permissions are granted.
- The user had been added to the db_datareader and db_datawriter database roles in the database IBrokeMyDB, which means that all of the agent jobs which only operate on views and tables in this database worked fine.
- The user was not granted any other permissions in the database. Among other things, this means that the user could not execute any stored procedures or functions.
- Added 2013 Aug 28 Not so much a discovery as a lesson from the “do as I say not as I did” department. It’s usually best not to run agent jobs as the agent user which usually has administrative privileges. When possible it really is best to follow the principle of least privilege and run jobs as users with minimal permissions. Doing this will not make a user any more or less likely to encounter this particular issue but as long as I am dwelling on agent configuration it is important that I try not to get the reader into bad habits. More props to Gianluca for pointing out this omission in my first version of this post.
But that’s not the point of this post. The takeaway is that the information which I really needed to solve the problem did not find it’s way into the job history but was available when I dug deeper using profiler. I have not done the leg work to verify this, but I would expect that extended events could also be used to expose this information.
So what about SQL Server 2012
As I mentioned earlier, at least in my experience this seems to be much less of an issue in SQL Server 2012 because in my experience so far, 2012 seems to add the actual error which causes the query to fail to the generic 22050 error message. YMMV, I did not spend a tremendous amount of time trying to verify that this is always the case. But here is a screen shot taken from my 2012 instance which I tried to break in a similar way.