Grant Fritchey
SQL Down Under Show 54 - Guest: Grant Fritchey - Published: 2 Feb 2013
This show features SQL Server MVP Grant Fritchey discussing how to trace and optimize queries, along with tips on reading execution plans.
Details About Our Guest
Grant Fritchey is a SQL Server with over 20 years of experience in IT including time spent in support and development. He has worked with SQL Server since version 6 back in 1995, he has developed in VB, VB.net, C sharp and Java. Grant volunteers at PASS and that is where I first came across him. His authored books for Apress and Simple Talk and joined Redgate as a product evangelist in January 2011.
Show Notes And Links
Show Transcript
Greg Low: Introducing Show 54 with guest Grant Fritchey.
Welcome our guest today is Grant Fritchey. Grant is a SQL Server with over 20 years of experience in IT including time spent in support and development. He has worked with SQL Server since version 6 back in 1995, he has developed in VB, VB.net, C sharp and Java. Grant volunteers at PASS and that is where I first came across him. His authored books for Apress and Simple Talk and joined Redgate as a product evangelist in January 2011, so welcome grant!
Grant Fritchey: Thank you very much!
Greg Low: And so as I do with everyone first time on the show I get you to tell how did you ever come to be involved with SQL Server?
Grant Fritchey: Well I first started out in SQL Server as a developer. I have been working in development for many, many, many years and started off with databases and doing a lot of database work. I was looking in Paradox believe it or not. We were working in also Sybase and that was around the time that Sybase and Microsoft made the deal and it became SQL Server. I began working in that as a developer. I was building applications against databases. I became a DBA later; I was working at a Dotcom. We had gone through several DBAs they were having trouble keeping people and I was working on some projects that involve SQL Server and I was having all sorts of problems. I finally walked into the bosses office and said ‘look here are the problems we are having, this is horrible we need to get this all done. When are you getting a DBA because I can’t believe this is going on? The boss said which of these you are going to do first. Suddenly I was a DBA.
Greg Low: Often there is that discussion about the accidental DBA. It is whoever was standing closest to the server, at the time one was needed.
Grant Fritchey: I was the poor guy who didn’t notice anyone else step back.
Greg Low: Exactly, indeed, interesting yeah. Coming from a developer background and where I must. Yes I kind have done the same. I very much come from a development background although I’ve been involved in like systems engineering and things all through the 80s and so on but still tend to consider myself more of a developer bent. What do you find that is different about that having a developer background brings to the way you approach DBA issues?
Grant Fritchey: Interesting question. Two things that happen, one I tend to be a bit more of a cowboy than a lot of my administration friends. The guys that came into administration are very,very careful and methodical and they treat the production server in a very serious manner. I do most of the time but every once in a while I will be like hey what the heck no big deal. I am sure it will be fine in the production server and do things that I really shouldn’t. You know most of the time I get away with it.
Greg Low: In fact it is interesting, yesterday I was at a site and they have a guy there as a strict DBA role and he is a very, very conservative guy in his approach but it was just interesting as we were doing various tasks the level I suppose you can say the backside protection. The way he was so never assume anything be suspicious of everything you know it was it was an extraordinarily conservative approach in and in some roles that that is needed.
Grant Fritchey: Right I agree, I do agree. Probably not a role I would be suited for.
Greg Low: No that is all good and the topic we really wanted to cover today is getting into your approach on query tuning. I suppose the first thing is what leads you to be doing query tuning in the first? Like what what happens that makes you go I need to be looking at queries?
Grant Fritchey: Well basically the problem most of the time inside databases is not the server, it is not the disks, it is not CPU generally. Right you literally become problems, most of the time what we are looking at is code, code is the number one issue inside the database it just is. Over and over again, so when we have got slow performing going on when something is occurring that is going to be reflected out to the people who are using system more often than not it shows up as code. It shows up as code, queries running slow, this query is in a deadlock. There is missing indexes, you know all these things and they all just completely reflect out in the T-SQL real quick. It is the place where I just spent most of my time.
Greg Low: It's interesting you say that, I find it fascinating I often read articles from Microsoft product support and PSS will often say look here are the main things that tend to cause performance issues and I often look at those lists and I think you clearly live in a different world to make right? And what's occurred to me over the years is of course most of the issues that are real performance issues are dealt with long before anyone ever goes to product support and so I think what happened is product support then gets a very skewed view of what is it that causes most of the performance problems. But what I find yes exactly is it may be 70% of my problems at least are basic application design and that's the thing that I think are is a very big missing thing where somebody just has a strict DBA approach. For example I'd rather ask a question instead of saying how can I make this query run 5% faster I'd rather be asking the question why is it being asked 12,000 times a minute with the same parameters.
Grant Fritchey: Right, exactly. It is so funny you say it that way. I have seen that happen.
Greg Low: I think that the away more telling thing and if I look at performance tuning I have done over the years the biggest outcomes I get are where I am what I am able to change the application now you can’t always do that because of third-party apps but yet I see the biggest payback when you're able to change the app.
Grant Fritchey: Yes that has been pretty much the way I have done it too. I mean with a lot of the new tools coming out, Entity Framework, n-hibernate, some of the ORM tools, the code generated in some cases not all. The code is frequently great but in some cases it is a nightmare. I mean it is really bad. Not oh god you have a function in your where clause but you just turn this thing into a giant cursor and you are really, really hurting performance. But is interesting that the code that allows you to generate your application fast, also allows you to dig really deep holes quickly.
Greg Low: Oh yes, in fact it's interesting you mention that I recall with a lot of the early entity framework things, actually even back with linked to SQL those areas. I remember the first video that Scott Guthrie did and he sort of posted this video and said oh look if you're worried about the code that is generated underneath you can always hit a breakpoint. And he opened up a window and showed the code and there were two reactions in the crowd everyone I saw see that. One half of the crowd that were the developer side sort of looked at that and went I'm so glad I didn't have to write that query and the DBA side all looked and went who would write that? You could just see that it was like a hundred, way more complicated than what was required.
Grant Fritchey: Right, I see that coming out time and again. It is kind of creepy but you know it is clear that generated code is definitely the future. It is a question of figuring out how to work with that is the key.
Greg Low: It is interesting too, does that mean the SQL team in the backend need to start to recognize the patterns that are coming out of the ORM tools and optimize those specific patterns?
Grant Fritchey: Absolutely, I think I put this back to my development background. As a DBA, as a database developer somebody who has worked in lots and lots of new abs inside databases or in old ones. I definitely work with the development team to figure out what they're doing how the doing and how they are doing it and why they are doing it that way. Because the fact is, this may be huberous, I know but I assume that developers are smart.
Greg Low: Yes.
Grant Fritchey: I really do, I assume they are smart. I assume they are doing something for a good reason. I try to figure out what that is and even if it hurting the database. If they are doing it for a going reason, I will work with them to know why they are doing it but you are killing me over here so let’s can we change this, can we change that, can you try this approach and usually it works out pretty well. Occasionally you get the horrible nasty DBA versus developer fight.
Greg Low: Yes I see less and less and less of that as time goes on. I think most DBAs realize they have to get more involved in the development side as time goes on. I look at the future of the product and I really think is probably far less of the low-level roles. I think, I'm sure they will be there but I think they will be there in much smaller numbers of fewer numbers than what they will be.
Grant Fritchey: Yes I don't disagree at all, especially some of the stuff that is coming out with the cloud-based operations. Not even talking about Azure SQL databases. Just various SQL Server, it is completely changing the whole. You know we handle the big iron approach that people have. Because I have never even seen some of my servers now, they exist someplace else.
Greg Low: It is interesting too because I noticed I have been putting together a session which covers off lessons in learning while working with our Windows Azure SQL databases and I do find I'm doing, I'm getting the best results out of coding styles that I would not have ever considered doing on premises and that people get this idea it is just a case of just pointing your connection string to the cloud and I just think that nothing could be further from the truth.
Grant Fritchey: True, true, it is funny because I have been working with a bunch lately too. What I feel like I am doing is going back in time. I am noticing that I am going back to my old client server applications, you know only move the data you need to move. Only move it when you need to move it. Transactions are short, right. It is very much the old school style, in order to take into account, heh you know that database it might move. You know, at a moment’s notice. You never are going to know, so you got to make sure your code is written and your application and your queries all can deal with that fact and so it is kind of funny at least to me it feels old. Not new, it is kind of weird.
Greg Low: I suppose, the thing I am getting at is that for example I had a table sitting in the cloud that I have local data and performing like a SINK operation were saying insert these rows if they don't exist or write update them if they do straightforward sort of stuff and if that was on premises now I'd quite happily use a MERGE statement that I found the latency in the connection kills me when I'm doing a merge.
Grant Fritchey: Right.
Greg Low: Yet what I ended up doing instead as building and in INSERT or UPDATE trigger in the cloud and then just sending inserts so the whole thing just streams up their but then the trigger does the update if the row already exists instead and I was giving extraordinarily better performance I do by doing a MERGE. And so are again I look at things like that and go I would not have thought to ever do that on premises. In fact I would actively not do that but yet I am I'm finding approaches like that are interesting for high latency environment side I think there are some very interesting things coming there. Predominantly if we looking at on premises things at the moment which is where most people are, so how do you go about finding those queries instead of asking the person who is right into execution plan writing but anyway so how do you go about finding those queries? What is your best technique the ones that are causing the problems?
Grant Fritchey: I'm pretty boring in this regard I turn on extended events and I capture RPC calls complete, and SQL Batch complete and I aggregate the information if I can. On a critical system if I can I will run 24x7 and capture all the data. If I can’t I will try and find a very busy time either a month cycle or whatever we are doing? Some area where I know we are feeling pain and I will capture those queries and I will run aggregates. You know which ones were called the most frequently, which ones ran the longest, which ones you know took up the most time if you combine both the frequency and the time of execution.
Greg Low: Yes.
Grant Fritchey: And just aggregate those things together and you get a top 10 list. You know I joke and say I am not from Australia but you guys have this phrase, tall poppy. I look for a tall poppy and I find that query that stands out and I knock it down.
Greg Low: Look that is interesting you say that, one of the measures I have for when I'm done enough with query tuning as I think you never done is that when there are less queries that are standing out so I find that when I first look at those aggregates there will be the top two or three or four queries will probably be 90% of the system resources and over time as I knock them off other ones will appear and so on in that list but eventually I will get to a point where it is the top 30 or the 40 by the time you get to that 90% and that's what I think I know things are starting to balance out.
Grant Fritchey: Right pretty much, that is it exactly.
Greg Low: In terms of the aggregates we should talk about the specific things that you're looking to aggregate I'm interested in your thoughts on duration and logical reads and CPU and things like that. What is your take there?
Grant Fritchey: My take there is all the measures stick, duration is a lousy measure because contention adds to it.
Greg Low: Yes.
Grant Fritchey: So you maybe not be there so you can’t really use duration so you are going to go with logical reads. But logical reads is not necessarily good measure because you just happen to me moving data with this query and it is not running long per say but it is moving a bunch of stuff so you can’t really use that. You can with CPU, again CPU has just the same problems that the measures are not going to be accurate. So what I tend to do is I aggregate on all of them and cross reference the list. You have got 3 sets of 10. Which query is hitting all three or you also go back and look at your wait stats and if you see oh boy we are really maxing out on CPU, I am not going to worry about duration or reads right now. I am going to see which queries are using my CPU. It is just, I wish I can tell you that it is a science and you just do these three things and everything is cool, but it is really still an art form. You still have to investigate and make guesses. Hit things in multiple directions, you can’t rely on any one thing.
Greg Low: No it is interesting you say that because it depends what the system is short of and so still I find most sites I go to be the SQL box tends to be I/O bound. But that that's not always the case and so I am interested generally though in what whatever is going to reduce my I/O load and I just wish that logical reads were slightly better measure than they?
Grant Fritchey: Yes unfortunately as you already know but just repeating for the listeners, logical reads are basically talks about stuff moving in and out of memory not just in and off disk so it is painful. It won’t always be helpful.
Greg Low: Yes in fact it is one of the things when I have a Performance Tuning class I sort and try and explain to people, that if you are going to spend your time trying to make something that is 20 logical reads be 3 then you're probably kidding yourself. But if you are talking about something is 2 million and now it's 100 then okay.
Grant Fritchey: Yes, yes.
Greg Low: Most of the time that those sort of gigantic numbers that stick up are worth a look someone along the way and unless as you say it's doings like massive range scans you know there can be scenarios but even then I'd be going back looking at why the app is doing that in the first place. It is like what is the thing doing to consume that number of rows and could it not have been resolved in the database but perhaps?
Grant Fritchey: Yes absolutely, I recently did a consulting gig for a company and they had massive memory problems and they were completely and utterly memory bound. You think oh well your system is not configured properly. You have these other issues you are going to dig around a bit, and found they turned off Auto Update Statistics. Statistics are like four month old, the query plans being generated were horrifically bad and they were bound on memory because they were doing scans across multiple partitions and it was just literally chewing up every little bit of resources they had. We Updated Statistics and suddenly they didn’t have any memory problems, they still had other tuning problems. Once the memory problems went away then they hit IO and they were suddenly going oh my god, look, wow these were really bad queries but it was completely masked by the memory issue.
Greg Low: Yes it is interesting so I tend to find that most of them are I/O bound and so I do tend to get a better outcome looking at reads upfront but as you say it does change and particularly as you start to knock those queries off. I do find it is a good indication of finding queries that really are problematic in in the way they do things.
Grant Fritchey: Yes oh yeah, it is really is. It was explained to us that at the PASS summit by Dr Derwit, disks just haven’t gotten as fast as everything has. The more IO you have got, the more pain you are going to feel, it is just flat out. That is how it works, so you mean you have to try and knock that down. It is a good measure but as we have already said, it is not a complete measure you still have to look at the other stuff.
Greg Low: Now in terms of aggregating the queries, I presume you also normalize the query in some way before aggregating it?
Grant Fritchey: Right removing all the parameter values.
Greg Low: Parameters and things, what sort of of techniques to use when doing that?
Grant Fritchey: I cheat, I use the RML utility.
Greg Low: That’s perfectly fine, maybe mention the RML utilities for those that haven’t come across those.
Grant Fritchey: Right, look up RML utility it is a free resource from Microsoft and the only problem is that it does require you to have a trace output not extended events. But Jonathan Cahais has a script that will convert them across. Those were the two things that you want to look up. RML utilities and Cahais.
Greg Low: It is interesting if you look at the details around SQL Profiler and the SQL Trace events talk about it being deprecated for the database engine, yet it is curious that nearly every tool out there and including new ones that were shipped with SQL Server 2012 require a trace but not extended events out instead.
Grant Fritchey: Believe the fact they did the beautiful distributed replay, I love that thing. It is glorious.
Greg Low: Again based on trace.
Grant Fritchey: Yes all based on trace, it makes me so angry but what are you going to do?
Greg Low: Yes. I think it almost feels a bit premature to be talking about it being deprecated but anyway it will be interesting to see how that plays out and do you tend to use profiler or do you just normally script it out and use trace?
Grant Fritchey: I use profiler very, very sparingly. I script everything out. I generally use, most of the work I do nowadays I use extended events but what I do trace I do script it all out. It is just that the profiler GUI is problematic in a number of ways so lots of TSQL.
Greg Low: Yes indeed, although setting individual bits and things like I often find profilers are a good place for people start if not used to what all the different columns and events are and so on.
Grant Fritchey: Right, you can’t set up the trace in profiler and then just export it out as a script. That is a good approach, it just after having done it 50 times you more or less got what you need. You don’t dislike it as much.
Greg Low: So the approach is capture traces at interesting periods of time, normalize the queries so we are removing parameters and things and getting back to. So if I am saying find me customer where Customer ID is 12 and whether that's 12 or 14, 16 it really makes no difference we are interested in that query. Not parameter and it is important to normalize it out otherwise you might be undervaluing particular queries.
Grant Fritchey: Absolutely.
Greg Low: So once you get that aggregated you find that hey this is a query that is then a problem what's your approach from there?
Grant Fritchey: Well then I go to execution plans. I can’t help it but I will when I get a query you know again. I can visually look at a query and frequently just looking at the query spot issues.
Greg Low: Yes.
Grant Fritchey: I mentioned earlier, like a function in a WHERE clause, somebody has a function on a column.
Greg Low: Actually that's a really good point, I tend to call these I know a lot of people do come from developer background they call them code smells and a code smell is something you look at and just immediately make you realize that that this is probably worth having a look. So number one you said was calling typically scalar functions I presume.
Grant Fritchey: Well any function. Generally functions on columns not functions on variables or functions on columns.
Greg Low: So in the SELECT clause or the WHERE clause. Either way the things that are going to be called all the time.
Grant Fritchey: Especially the WHERE and the JOIN because they absolutely do Table scans.
Greg Low: Yes.
Grant Fritchey: Once you done, depending on the query again you do the code review, take a quick look at it and probably spot the stuff. Great, cool, wonderful but most of the time I have to go back into the Execution Plan because I want to go and look at what optimizer is doing. The only way to tell is to look at the query plan.
Greg Low: So it is interesting where people getting their head around the idea that certain types of objects, things like Views for example normally get in lined back to the query they came from and optimized as a whole. Whereas the minute you stick one of these functions in, even if it is a simple function that just isn’t what happens. That breaks unfortunately the rules where you try and encapsulate logic but in many cases where I see simple Scalar functions particularly that they are nicely encapsulated but called from a SELECT or WHERE had the person just simply put that same logic back in the SELECT it would get a completely different level of performance.
Grant Fritchey: Yes, over and over again. It is just freaky and weird but then also another code smell is when they start referencing the objects from other objects. So they treat views to views, nesting views you know when these view which calls a view, which calls a view think of functions, which calls a function, which calls a function. That kind of stuff very quickly becomes huge problems, I mean just nightmare problems. It is shame because what it is, is developers writing code the way they write their C sharp.
Greg Low: Correct.
Grant Fritchey: VB or whatever I mean, using TSQL as Object Oriented code. I can’t get upset with them because again they are smart. Unfortunately they are also a little bit ignorant and you know SQL Server.
Greg Low: Ignorant of how SQL Server works, that’s right they are applying rules of encapsulation that are actually good coding techniques in higher level languages.
Grant Fritchey: Right and TSQL just isn’t one of those languages. You know I like TSQL as a language and it works well for what it is supposed to do. But you really have to play by its rules. If you try and force it into a different paradigm it will hurt you.
Greg Low: Yes and so nesting of these objects yes I find the same it tends to be problematic. I think another part of the reason why I find that a hassle is it can make it, it obscures the code.
Grant Fritchey: Right.
Greg Low: So what I find is that if somebody says I needed a column and instead of going look looking for a value and finding where the column is I just find some view and then grab out of that and then then people can do that to my code and so on and so on and in the end you can end up with like a simple SELECT. I saw one a few years back, I posted it, it was like SELECT DISTINCT some column FROM something, your look at that and think how can that be a problem but that thing was a view and then when you look at the mess that was underneath that. I think that it is it just hides massive complexity.
Grant Fritchey: Right. I had a similar thing with functions. They had nested those 86 layers deep, it was just the most amazing and beautiful thing you have ever seen in your life. It didn’t work but it looked great and it was just beautiful code. It didn’t function at all.
Greg Low: So those functions, lots of nesting of code, other code smells?
Grant Fritchey: Well let’s see, try and thing off the top of my head here.
Greg Low: I was going to say, one for me is the minute I see tons of temporary tables because it tends to be a particular coding style so what I will see often is someone will create a temporary table, select a bunch of temporary data into it, do the processing very procedurally, select the rows back out of the temporary table, drop the temporary table and you look at the entire thing and it just should of been a SELECT statement.
Grant Fritchey: Right I have definitely seen those, or they build the data into temporary tables and join the temporary table together in order to do the output. That can frequently be an issue. You know that is definitely one of the things you look for, any query that just basically goes on for pages.
Greg Low: Yes, actually that's a really interesting thing you mention that it’s one of things I used to read years and years ago now this is really showing age but back in the days where you'd be using 15 x 11 paper
Grant Fritchey: Right.
Greg Low: I have read a lot of material when I was back doing academic work where they said that the minute a block of code exceeded a page of print you started to massively increase the chance of bugs and issue.
Grant Fritchey: Right
Greg Low: Simply because somebody could no longer visualize that entire chunk of code.
Grant Fritchey: I think that is still a rule, isn’t it?
Greg Low: Yes so you now get the equivalent thing when you go into sort of the screen-based environment but that it is interesting that’s right there is got to be some point that the longer that block of code gets unless it's very logical chunks inside it
Grant Fritchey: Right
Greg Low: There is some sort of complexity thing happening there too, where either the code ends up messier or the code ends up buggier
Grant Fritchey: Oh yes and messy code is nightmare to unravel. It makes you crazy. Let’s see code smells. There is certainly anytime you are seeing hints. Oh yeah that is a good one, query hints. When I am searching query hints in there and frequently and the ones I see most frequently is NOLOCK because everyone thinks that is the turbo button for SQL Server.
Greg Low: And again in fairness in SQL Server 2000 for a lot of people who are having issues with heh my reports when they're running our stopping me doing the transactional work basically they didn't have a lot of option but NOLOCK at that point. But yes it is worth so what are the issues with NOLOCK and what are the approaches to avoid that now.
Grant Fritchey: Right, the problem with NOLOCK is that it doesn’t lock which is nice. You think well okay, if the value is getting updated from cat to do, I don’t care if I see it as cat or dog as long as I get it back quickly. Right.
Greg Low: Yes.
Grant Fritchey: You explain that to a business person and you say ok that is great, obviously a banking concern wants it one way. They don’t want it, it has to be a 1 or a 0, don’t give me cat or dog. Give me what is right but what they don’t take into account is the fact that it is not simply a column update that is occurring it is index changes or page splits or page rearranging that occurs as data gets updated, deleted or inserted and you can if you are doing a NOLOCK and it is doing a scan you can miss rows because they have been moved.
Greg Low: Yes.
Grant Fritchey: Or you can get the same row multiple times.
Greg Low: Yes, so a simple example I am going down an index, it is based on date, you changed the date you could have pushed it into the area that I have already gone past or you could take one that I have already read and pushed it into my future so either way and it is really hard to explain to someone missing rows in a report. It is trickier to explain them being there twice the other one of course is because you don't either raise or honor locks is that you could report on data that never existed as well so somebody else could start to put data in and then roll the transaction back and yet there is on your report.
Grant Fritchey: Right.
Greg Low: So that's a really tricky one to explain to someone later when they look at your report and go where is this row when it never existed in the system.
Grant Fritchey: Yes, yes so that stuff, I mean it is shocking how easily you can do that kind of thing if you are using a NOLOCK process.
Greg Low: It is of course safe to use if you have read-only petitions and historical data and things like that. There are places for it but.
Grant Fritchey: Absolutely and is certainly occasionally going to be a situation of an exceptional situation where you are going to put NOLOCK back onto a query.
I don’t disagree but if you are in a place where you have got a read-only file group you don’t need to put NOLOCK on because SQL Server knows that is a read-only file group and it will not put locks on it. Better still if you have a read-only database SQL Server won’t put any locks on the database. It is a great way to get good performance.
Greg Low: Curiously not quite actually, it is one that came up in the Masters program, we did testing on that and if you have a read-only database it doesn't raise locks that’s cool but if you have read-only file groups have a look at the locking. It actually goes through the mechanism of raising them even though it doesn't need them. It is completely bizarre, but if you have NOLOCK then that is safe to do there.
Yeah I went back to them and said hey look why is this thing raising locks against data. Again it would of increased the of the locking behavior. If you look at the active locks you will see them raised against that which again is an interesting one because it just made no sense to me at all and as to why that actually would be there.
Grant Fritchey: Wow, I am going to do more testing. That is great thank you.
Greg Low: So there you go, but anyway as you said look lots of query hints, gives you think lots and lots of hints particularly I would say index hints freak me a bit.
Grant Fritchey: Absolutely, the one exception is if we are dealing with either the XML data the index hints can become very useful or if we are dealing with spatial data the index hints can be necessary.
Greg Low: I am glad to hear you mention that particularly with SQL Server 2008 it is getting better as time have gone on but you build spatial indexes but the optimizer not so good at working out what to use.
Grant Fritchey: No that is correct, but almost every time you are seeing index hints is because people have got some sort of non-searchable function where they are running on a function some place. They are getting scans instead of seeks and they are trying to force SQL Server to do whatever they want. They are trying to force loop joins are they are trying to force merge joins instead of hashes and they are building those index hints and that is frequently an issue.
Greg Low: It is a really, really bad approach so for example, an example I normally use in one of my classes, is that people would say look I have heard MERGE JOINS are more efficient than hashes or something and so that the thing is MERGE JOINS require sorted input so at you go. Yes in a merge join instead of inner join then sure but all SQL Server is going to do is it will do that but it will jam a sort in front of that and sort all the data before it pushes it in and sorting it then merging is way less efficient than doing a hash join in the first place.
Grant Fritchey: Right, no kidding, and then you are also slamming your TempDB more than before. Oh god yeah I see some really annoying things that have gone wrong there.
Greg Low: So listen query hints that are certainly problematic, another one that is people don't get the idea of CASE statements. I find the CASE statements are amazingly underutilized and so for example I would see someone who needs to update a row 4 different ways but they will actually execute 4 UPDATE statements rather than having a CASE.
Grant Fritchey: Yes that is odd, instead of just looking at the values that make up.
Greg Low: Yes just passing it through once, it's again they think in this situation I need to do this in this situation I need to do this but in many cases they end up hitting all those rows numerous times. Where simply applying good use of CASE statements would have made a huge difference to the query.
Grant Fritchey: The other thing that does is that the optimizer is going to create execution plans for all 4 of those statements based on the values that you pass the first time when it is calling the cache. One of those statements is going to be great but the other three are going to stink right because it does parameter values are different and therefore the sniffing occurs which is a good thing most of the time but it is going to create an index execution plan and it is not optimal for the later dates that are coming in that is going to run the second statement or third statement or fourth statement. That is actually a code smell too when I see multiple statements instead of split out. You know you have a wrapper procedure and then a series of procedures underneath so each one gets its’ own execution plan when it gets executed as opposed to, as part of a single batch.
Greg Low: Indeed, well listen that is a good point to take a break and we will come back after the break.
Greg Low: Welcome back from the break. Another thing I get people to do Grant, is just to tell us if there is a life outside SQL Server.
Grant Fritchey: No! Of course there is. Big thing in our house lately has been cross fit we just started doing it about three months ago and we have been doing all the Olympic weightlifting. And the crazy metabolic conditioning and all that stuff to lose weight. Get in shape, it’s fun, it’s hard, I am sore frequently, but it has definitely been very enlightening.
Greg Low: Look it is the interesting one too I suppose we are in an industry where we spent a lot of time in seats.
Grant Fritchey: Right, absolutely. It is actually freaky just how much we sit down. We would get a lot more out of a standing desk or a walking desk. Some of those things I mean you know it feels like a fad so instant instead I do activities like a crazy person.
Greg Low: Yes, I am so glad to hear you do that something I wish I had done all along the way myself. I find that as I think you get to older age mobility. Mobility the number one thing I think in most cases just to encourage people don’t just get chained to your desk all day every day.
Grant Fritchey: Absolutely.
Greg Low: Listen for those that haven’t been to the summit. Tell us about SQL quilt.
Grant Fritchey: SQL kilt, a few years back we were on Twitter with several people joking back and forth about the upcoming summit and somebody and somebody said something about wearing a skirt and then someone said does kilts count? They said I dare you. Then I said okay fine, I own several so no big deal. I declare the middle day of the summit at that time it was Wednesday but normally it is Thursday to be a kilt day. First time there was three of us and then the second time we talked to the women in technology and they agreed to sponsor us and we sponsored them and the next time there was about 20 of us and the third year there was 50 and this year there will there was 50+ I am not sure how many it has just grown. It doesn’t mean anything you don’t have to do be Scottish I am certainly not it is just a celebration with women in technology and a chance to wear a kilt.
Greg Low: Awesome that is the question I was going to ask is if you had some sort of Scottish ancestry? All how did you come to have some kilts in the first place?
Grant Fritchey: I was like everyone else in the world, I assume, I was cruising the Internet and I saw an ad for utility kilt which is an American company in Seattle actually.
Greg Low: That is the geeky kilt.
Grant Fritchey: it is kilt but it is very much a sports kilt if you will a practical kilt because it has pockets. I fell in love because I wear shorts all the time in the summer. And I said what the heck I tried it and it was wonderful. They’re very cool in summer and not that bad in spring or fall and I also wear it in winter time. It is just a great way to deal with the weather a little bit better. And it is a conversation starter.
Greg Low: Awesome. Absolutely use and it has been an interesting conversation in a variety of summits and totally awesome. Well listen the area a number of people might have come across you from is the writings on reading execution plans. If we just basic 101, with people getting started with execution plans what are the main things they need to know?
Grant Fritchey: Well, that is actually a good question there is a few things you need to be looking at. Just a few I have only got a basic six items, right, that you would start off with. The most important.
Greg Low: Well just maybe if even first up just grabbing the plans too we should just mention this that there are actual plans and estimation plans and what the difference is there?
Grant Fritchey: Right, actual plans are basically the same as estimated plans unless there is a recompile that has occurred. It really depends on where you get the plans and what kind plan you are looking at. If you run a query and capture the plan or capture a plan in after execute using trace or extended events those are actual plans and they will include a little bit of run-time information. The actual number of rows, the actual number of executions, other than that those queries are exactly identical to the estimated plans. All the values and everything are the same, it’s just that they have that little bit of extra data.
I prefer though, if I can get them but if I can’t my second choice is getting plans directly out of cache. Using dynamic management objects because those plans are 100% of the plans used by optimizer, even though they are estimated plans they are the final plans. If any recompile occurs, the recompile will stall that up in the cache and that is the best place for that. Say DM Exec query plans join with either sys dm exec query stats assisting sys dm request for actively running queries allows you to get execution plans.
Greg Low: Yes, so basically you are passing in the hash of the hash and then basically use that to look up the plan and that returns the XML of the plan.
Grant Fritchey: Yes, exactly you can click on it inside Management Studio only will come up as a graphical plan and you can use it from there.
Greg Low: Yes and so for a lot of people in simple cases I will find that people will start was Profiler or are all one of these tools, where there are low-volume things and capture a query that is problematic and paste back into Management Studio or something. Control+L will give an estimated plan and that is going to show them a query plan.
Grant Fritchey: Right.
Greg Low: Now, again another thing I think that is kind of nice about getting an estimated plan is it is great you can do that because some of these queries might be intrusive to run again to find what the actual plan was. So sometimes it is nice to but in the quality of that plan is going to depend upon these statistics that are available as to how good that plan is anyway. That is a topic for another day. So once we get another plan, where do we start?
Grant Fritchey: Well, you start off with the first operator. The first operator is the one all the way to the left. If you select insert update or delete and that operator the reason why it starts there is the properties inside the operator is all the fun, call and interesting stuff. I mean things like, how optimiser works with the plan it, is it a trivial plan? Did it go through a full optimization? Why did it stop optimising?
There is a thing called reason for early termination, and if it says there was no good plan found then the optimiser did its job. It found the best plan it could find and this is what you are looking at. And that is a good thing to know because it means with the plan while the query may be slow there may be opportunities for that plan to be consistent. Meaning it will roughly come up with the same way each time until you start making changes to the query.
Greg Low: So maybe define a trivial plan upfront as well?
Grant Fritchey: Okay, a trivial plan is a plan that can only be run one way. The optimiser really has no chance for choices in what it can do. For example, SELECT * FROM table, there is no WHERE clause no joins that is guaranteed to be a trivial plan because optimiser can’t do anything but a table scan. It has no choice so it is a trivial plan.
Greg Low: But the minute you add an ORDER BY then that could then change.
Grant Fritchey: Yes, then it has choices. Right, then it can say I could do this all I could do that. It starts to look at the table structure and determinations like do you have any index that I can use. Do you have two indexes I could use and stuff like that. It will do all kinds of crazy fun things, some of them good, some of them bad. There is no choice when you get a trivial plan.
Greg Low: Plus, it is also worth mentioning things like the DDL statements. So CREATE, ALTER, DROP, DCL, GRANT, DENY, REVOKE those sorts of things. There is only one way to do those things.
Grant Fritchey: Absolutely, funny thing is that for most of those there is a way to look at those plans. Most will actually show up in an execution plan, the normal execution plan. You have to hop through a few hoops to pull up the plans for those. You don’t need to, as you said there is a whole what different way to do them but if you are interested in what is going on or if you think you have a bug, it is a good thing to know that you can tune into that area. Don’t do with unless you need to all unless you have got extra time on your hands.
Greg Low: So beyond that SQL Server is going looking for a plan? Which we should also say is not the best but is sufficient.
Grant Fritchey: Well, like I said the reason for early determination, the reason it that stopped optimising is because it found a good enough plan. Right. It is not because it found the best plan, perfect plan, it just found one that said based on the juristic’s the calculation inside the optimizer. It said yes that this one is good enough and it just returned it.
That doesn’t mean that there is not a better plan out there. That there is not a different way to return the query, it just means that optimiser found what it thought was a lower cost.
Greg Low: Again, the more choices it has the longer it can spend working out how to do it, eventually though you what the query to be executed. It needs to draw a line.
Grant Fritchey: Yes you can see the compile time in the first operators you can see how long it took it to get the plan it is running. For some of those, I have seen it take 3 ½ minutes to compile a plan. I have seen it not compile plans for hours, there are a lot of other issues.
Greg Low: For the first operator we can see that.
Grant Fritchey: Yes, and a whole bunch of other stuff. That is where you go all out to just learn about stuff. The next thing you want to look for after the first operator, is you want to look for warnings. Where there is either going to be a little yellow alert sign or an exclamation point or a big red X. It is all kind of different things, it is basically a poke in the eye. You have got to call a potential problem, instead of a real problem. But you may have missing statistics, which means the optimiser is making bad choices. You might have a missing predicate which means you have basically got a Cartesian join which if you wrote it that way on purpose that is great no big deal but if you didn’t, you didn’t notice that and that is what you were doing it is able to get that.
Greg Low: Actually, that is something that does surprise me. Is when I write a statement that says SELECT FROM table, table then I do get those warnings saying paid this is a red X doing a Cartesian join, is this really what you were meant to be doing, but what does surprise me when I write FROM table, CROSS JOIN table, it still often gives me those warnings. And I think like I am being really specific here.
Grant Fritchey: Yes, I know that is one of those as you said is about the locking on the file group, that is an odd thing I don’t know why they would do that since you are doing a CROSS JOIN. I wanted to do a CROSS JOIN.
Greg Low: I came to the conclusion that they end up being passed into the same thing. In the end that thing will look at how to do it is saying hey are you missing something? Yeah.
Grant Fritchey: Yes, must be, another warning that is new in 2012 you can get them only in the properties in 2012 is Implicit conversions. When SQL Server is converting a data type for you, that can affect your plans again you can mess up the index but not obviously the index obviously. Mess up the use of the index, the optimiser because of the implicit conversions. Those now in 2012, are serviced directly in the execution plan have a warning which is good to have.
Greg Low: Look can you mention an example of that? Because I know that is something that is not well understood but it is a really common problem.
Grant Fritchey: The example I see most often is that people are storing their dates and times as a string but most of the time they are storing it as a string it is usually some old technology or an older design. Then they pass it in as a daytime data type and SQL Server will convert that for you which you can completely as long as the string evaluates date, SQL Server is more than happy to do that conversion for you. But it is invisible, it is under the full and that can cause scans to do that because the calculation and therefore you cannot seek to value it has to look at everything and do the calculations.
Greg Low: Actually, it is worth mentioning too the string formats for that too. People need should be very careful about the string formats they use for dates as well. Again the site I was at yesterday, I mean people have started to get the idea that most of the formats with dashes in it are problems but again the one I looked at yesterday’s said it is not a problem because I just write 29 Jan 2013. I said that yes that is fine but is somebody had different regional language settings on the machine then all of a sudden that will go bang as well. The one that I find is reliable all the time, in all the data types is just the eight digit integer if I am after a date so yyyymmdd with no dashes. Because it is at least date in variant, and so on but the minute you put dashes in, if it is a date time data type then it is a language setting reliant but if you have a DateTime2 then it is not language setting dependent. I just don’t really want to get into that territory so I find that if it is just a date then you if you write an eight digit integer, in quotes your life is good.
Grant Fritchey: Right, the whole implicit conversion can make you nuts.
Greg Low: Yes.
Grant Fritchey: I love Adventure Works because it has so many errors in it, it is a perfect example database. They have actually got implicit conversion in some of the calculating columns, so it is easy to generate the errors and showing examples to people.
Greg Low: So we go looking for the little yellow triangles on the operators. They are worth a look?
Grant Fritchey: Yes, that is basically looking for the most costly operations. That is tricky because those values are all estimates. They are based on statistics, I’ll will leave statistics for another day because that is a huge topic. The statistics can be a wrong, and that can give you their estimated value is being wrong. The one thing that everybody needs to know is the estimated values in an estimated plan and the estimated values in an actual plan are all estimated. They are not real numbers but..
Greg Low: Exactly, so you are looking at the actual plan, you are still looking at the estimated values.
Grant Fritchey: Absolutely, that is one that people wish because they think it is an actual plan that it is real numbers. Nope. They are still estimated but they are the only numbers we get so when you look at do those higher cost operations and figure out if it is a real number or not. We were talking earlier about functions, table valued functions, multi valued table functions will have a cost on most plans because it has no statistics so the optimiser will assume one row. Even though it may return 1 million rows, it’s estimated cost will be fewer because it is only returning one row.
Greg Low: Yes it could be the most expensive part of the query.
Grant Fritchey: Absolutely, that could be killing you. When you look at the most costly operators, you also in need to know what the costly operators are to understand what is going on.
Greg Low: Another question, how do you find the biggest numbers when you have a big plan?
Grant Fritchey: You know, for a medium-sized plan 40 to 50 operators I just look but as soon as I start getting past that, I will use X query. If the query is XML, so I can use X query to say okay the estimated cost and that will show you which ones are the most likely the most costly.
Greg Low: One of those you think they will have some graphical way of doing it inside directly inside the plan viewer. It is worth noting in SQL Azure portal for that, it does have things that help you go off and find those operators.
Grant Fritchey: Yes, I was just looking at a series of blog posts for that it does. Then there are third-party tools that I realise I work for Redgate but SQL century has a fantastic tool that you can use this will for. It will allow you to resort the objects if you don’t want and sit and write X query go talk to those guys.
Greg Low: Indeed, as I said navigating around large plans, the other thing a lot of people aren’t aware of two is just in the bottom right hand corner there is a little navigate button.
Grant Fritchey: Yes that is a good thing to have.
Greg Low: Which I find people just completely unaware that is there, so there being allows you to move around a plan.
Grant Fritchey: Yes, right that is what I used on a medium small plan I use that to cruise through a look through different operators real quick.
Greg Low: We shouldn’t mention their width of lines as well?
Grant Fritchey: Funnily enough fat pipes.
Greg Low: Yes fact pipes exactly.
Grant Fritchey: Fat pipes, the pipes are the little blue lines that are connecting up the operators by the plan and you see a little arrow some of them are thin, some of them are fat. The bigger they are the more, literally that direct representation the amount of data being moved. And that thing you want to look for is really fat pipes over on the right-hand side of your plan going down the really thin pipes over on the left hand side. Because what that is showing is you are accessing huge amounts of data but then filtering it later and you really want to filter it up front. So it is doing some kind of scan or something that is costly. That is a great indicator.
Greg Low: That is a great point if you see big pipes on the right and they end up as little pipes on the left all we are saying to get a little amount of data you are reading a very large amount of data.
Grant Fritchey: Also if you see thin pipes turning into big fat ones, you are multiplying data. That is another one to look for. The really scary one is when you see thin fat thin, it is doing some really odd calculations inside of the plan. You are probably hitting temp DB, there are all kinds of issues going on that is one that always makes me freak out and run in circles. It is something to look for if you see that give me a call because I want to talk to.
Greg Low: Yes, most of the operators, I mean you hover over them you see an awful lot of details, what are the main details you go looking for in there?
Grant Fritchey: Nowadays, I don’t. I skip right past the tool and go into the property sheet. The property sheet has more details so for example if you are looking at an index scan all hash joins so there is going to be two values you are looking for in a hash joint. One is the key value where you are joining on and you want to look for residual values that is a secondary filter and those properties and the key values are usually in the tooltip and is a residual can be but it isn’t always. If it is a big residual then you will have to go down the property sheet and do it from there. I will usually look at the properties from there.
Greg Low: Yes, it is goods concept people don’t realise is if you show a little bit visually the tooltip has more information, the properties window has even more again in the XML itself so you are getting different little of info but most of it is there in the properties window.
Grant Fritchey: With the properties window you get into the habit of having it up and it will make your life easier. Right. You click on the thing and he will show you all the information you need, you click on the next one it will show you the information. It is a good to have it. That’s a problem, that is something to look at and figure out what is going on. It is basically any kind of operators that you can’t explain why is that there. All wow I have never seen that one before, but that is one that you should also take a look at.
Greg Low: Yes, I find with my development hat from that side, one of the things I like with execution plans is not necessarily to stare at them and go what is wrong, I tend to hope with most queries if I have written it then I have an idea in my head on how that would be done and then I am interested to see if that is how it is actually being done.
Grant Fritchey: Right, it is like if I have a good index here and you have a WHERE clause in it, it should use that index. Then you go and look at the plan and is doing a scan and ask yourself why is it doing a scan even though there is an index.
Greg Low: That is the thing, the typical questions you get from people is that is right. Why why is this crew that looks like that query so very different? All, why is SQL Server ignoring the index even though I have supplied it? When I tested it used it but now it is not those sorts of things.
Grant Fritchey: You know, my favourite was the same query, same parameters same query different day, different execution plans.
Greg Low: Of course. Simply the data is self is different. In fact it is something that I’ll was explaining to something someone at a site I was at yesterday. Again is you are sort of analysing a query and sort of looking at the plan and he didn’t have a value full one of the thing that was used in a predicate and he said let’s just analyse it without that an predicate and I said hang on. I just wanted to get over the idea that that could fundamentally alter how that is going to be done.
Grant Fritchey: No kidding. I mean if you have compound keys, some kind of index that may be completely what it needs.
Greg Low: Yes, you could be just completely wasting your time having a look at that. One that was quite interesting I found yesterday to, I was working at a site there was this interesting issue with a cursor. It was something that had come from 2000 and worked perfectly but when they got to 2008 are to it wasn’t working perfectly. But again one of the problems he was having was he was looking at the definition of a cursor and optimising that query and looking up that query and going gee that seems like it is going okay but the problem is when it was doing the first fetch it was stumbling and taking for ever and of course the thing you have got to keep in mind with a cursor is it is just not be. How do I do the query in the first place, is if you have anything except a fast forward read only cursor then the fetch really becoming part of that statement as well when it is executed. So it is kind of like them being combined together to produce the first row. So it is how is it going to get that first row. It is one thing to execute the query. It is interesting in that case the different behaviour by changing into a fast forward read only cursor, again problem completely to disappeared. The thing went from was like 15 minutes to sort of save this screen, to half a second sort of thing. It was the interaction between the fetch and the definition of a cursor that was the difference there. I think you can never just look at part of a query and think this is going to give me enough of info to necessarily look at the entire query but I do like breaking queries and looking at pieces of it in some cases where it is nested.
Grant Fritchey: Oh, yes, the more you can dig down and understand how is it doing this it makes a big difference. As you combine, it is kind of funny you can run to queries and each one will perform a certain way but as soon as you combine the two, then optimiser will be pounding.
Greg Low: Yes, something completely different.
Grant Fritchey: You get a completely different thing and you I had this all worked out, but it is really messed up.
Greg Low: I suppose it is worth mentioning to that plans there is a SQL plan file type as well, so they are portable.
Grant Fritchey: They are really useful because if you ever need help, you want to go to stack overflow, stack exchange, SQL Server Central, MSDN forms and you want to ask for why is my query running slow? If you can post their execution plan with the sequel plan file and post it, you will get a lot more help than if you just either ask why is my query slow or here is my query why is it slow? The execution plan will get everyone who is trying to help you a lot more to work with.
Greg Low: Actually another really good hint when people are posting to forums and asking for help of any type is to also posted DDL and some sample statements. Again, don’t expect people will mockup DDL to try and work out how to help you with your query. Give them something that they can actually execute that will demonstrate the problem.
Grant Fritchey: Absolutely, I guarantee it I am not going to do that. My boss does not pay me to make your database on my system.
Greg Low: No indeed, anything else in particular that is really important to be looking at in plans?
Grant Fritchey: No, that is pretty much it those other general things, that pipes, scans. Scans are not necessarily a bad thing, not necessarily a bad feature, not necessarily good but scans are an indication a lot of data movement. That is what we talked about right at the beginning. Logical reads may not be the perfect measure but the more data you are moving the more likely you are having issues right? Scans are an indication of data movement, so it is something you look for.
Greg Low: In this case we are talking sometimes it is saying table scans, some sometimes it is clustered index scan.
Grant Fritchey: Right, or an index scan.
Greg Low: Or an index scan.
Grant Fritchey: Actually, it is interesting that the optimiser will look at the size in the pages and if they can satisfy a query against a non-clustered index by scanning it, it will use that over a clustered index. If the size is smaller, it is going to have to hit a substantial portion of a bigger index it will go to the smaller index and use that instead.
The optimiser is a smart cookie man, it is really an amazing piece of coding. We so talk much about the problems it has, because we deal with the problems mostly right. It is doing great things for us all day every day and it is kind of easy to ignore and take it for granted.
Greg Low: Yes, a really good point, that’s right. Even if an index isn’t structured to help with the query if it can if it completely covers the query in terms of providing all the data, then that’s right reading that may be way less work than reading the entire table. And so it is kind of interesting too, you get the same argument with people to where they talk about subset indexes. They say there is no point having an index on A and B if you have an index on A,B and C and I go maybe not. I mean if the most important query in the system, only involving A and B there may be an argument with having that in an index all by itself.
Grant Fritchey: Right, absolutely depending on the query the optimiser will make that choice for you. You don’t have to put in index hint on or anything. It is cool, these indexes are 1000 pages smaller when I use it.
Greg Low: Yes, I find it does a really good job of that. One of the things that surprise me though is if I have a table and I create an index, let’s say is based on a different key and have every column in the table and I think that is kind of strange. If I have a query it seems to always prefer the non-clustered covering index there, rather than the underlying table in that case and again I get a weird kind of optimiser thing. I don’t know why it seems to do that.
Grant Fritchey: Well, that is an odd one.
Greg Low: So I wonder if there is a rule a bias towards that yes. It is kind of interesting.
Grant Fritchey: Yes, that is really interesting.
Greg Low: I do wonder if it applies a rule that says, yes is there an index that already has all the things before it then goes off and looks that and compares that to the table itself. I often wonder how that works but I mean it is fairly rare to have indexes on all the columns in the table in the table in any way so.
Grant Fritchey: Well, unfortunately it is not all that rare.
Greg Low: Oh sorry, it should be rare.
Grant Fritchey: Yes, it should be. I have six different indexes and every single index had every column in the include so like when you run it is said oh man. Asked why am I insert so slow?
Greg Low: Yes, the other big smell there I find all the time is people who simply just create single indexes on every column in the table. You know, like that is probably not really useful that one. But again somebody has got the idea that this is a really good idea to do this but yeah it turns out to be tragic. Actually the other thing that I do really want to mention with query plans, the thing that surprises people often is how old messy plans are for inserts, updates, and deletes.
Grant Fritchey: Yes.
Greg Low: So if we said the create, altar, drop there is only one way of doing those things but insert, updates, deletes some of those plans are quite complicated.
Grant Fritchey: I have an example I use all the time for example I used the adventure works for my examples when I’m doing presentations because then anybody can just grab it and run it and not have to do anything special. One of my favourites is the deletes from products, it is just a simple little statement.
Greg Low: Delete from customer is the same. It is amazing.
Grant Fritchey: You say okay that is going to be really simple, you open up the execution plan and you’d 60 operators in it. At least 60 and it is pointing to all the foreign key constraints and doing a check without their data and the fun part about that one is that you can tell which part of the table it is checking but you can’t really tell how it is doing the checking until you look at the insert. The insert operator, has got this really funky formula in it it says where expression, 0052 equals this all where expression 053 which of these expressions is hitting is to go back through all the operators. Look through that all the operators and look at the output and figure out on this table the product, is expression 53. Okay cool now I know that. Then you have to walk back through the whole thing. It is crazy.
Greg Low: Actually I find if I do a query for like delete customer where customer ID equal thing or rather in Adventure Works, it is interesting that the plan that comes back you can tell all the different foreign keys that have been declared and it goes off hitting those. But again as you read across you will see this one is a seek, this one is a scan, and so on and you can tell straight away what sort of indexes have been put on the foreign keys in those other tables. It is a surprising even again in adventure works the number that end up having straight out table scans in every one of those simply there is no index on that foreign key. And so again for me we go looking for all the time when doing some sort of health check, is I go looking for declare foreign key where the keys are not the left-hand component of at least one non-clustered index and so. Long story but the bottom line is that’s right foreign keys aren’t indexed by default in SQL server and I was content that it should be because even though you can invariably build a better index the amount of pain that is caused by none is amazing.
Grant Fritchey: Yes, I agree. Which is kind of funny when you think back on the learning curves. I remember from a very long time ago now because I am old. I remember thinking oh foreign keys must be just like primary keys but just as an index on it and I was shocked when I found out that you didn’t. Which is probably why I remember it so well.
Greg Low: Of course, you delete the things where you really pay the price, so I go and delete customer in the orders table and there is 1 billion rows in it that says here is the customer and I am going to go and have to read the whole table to find out if that customer is in there or anywhere. I mean in the middle of deleting that one row of customer, sometimes those plans like the insert, update, delete updates and deletes in particular can get really, really nasty.
Grant Fritchey: Also also the very fun part about that is when someone tells me that my database is primarily right I always go really, cool. Let’s take a look at that you know again you sort of walk them through how to look at the index access statistics and staff and sure enough almost all OLTP systems I have ever seen is still primarily read.
Greg Low: It funny you mention that, I have got a site in Sydney that I work with and they do 10,000 inserts a second right but even then when I look at their load it is primarily read.
Grant Fritchey: Really, oh wow.
Greg Low: I have people all the time when I say to them what percentage of the time do you think these systems spends reading and writing I always hear numbers like 10%, 20%, 30%, every time I look it is 2% present. Or something like that and so while they are important I tend to probably lean towards a lot more indexes then what people are comfortable with. I am going to optimise for the 98% of the load, not for the 2% unless the 2% is incredibly critical.
Grant Fritchey: Absolutely, it is funny. I work with a team and there is a smart guy, one of the league guys is the one true index person right. One index on the system and you are done. The arguments we used to have an old will be like Jason one more just one more right here it is going to be a huge deal Jason fought tooth and nail. It really would help, if we had another index is not necessarily really painful.
Greg Low: Yes, it is interesting like even again I was involved in a White Paper a while ago. And some of the folk in Microsoft said they had figures in the paper where they were talking like 40% right load and I was saying to them due you ever see that? I was really struggling to convince them that it was just not a realistic number but every time I have gone looking in typical transactional systems it is so rare to see high write loads. It just really is rare and certainly my world.
Grant Fritchey: Yes, same here. I do know of release one system is not one where I am in thank God that is primarily write. I know the person in there, trust me, he knows what he’s doing but he actually has a system that is almost exclusively write during the day but every night it goes over and turns a whole bunch of reads but that is abnormal. It is a real abnormal situation.
Greg Low: There is this sort of myth where people say the more indexes you have, these slower days than manipulation operations will be. Boy that is the very broad statement, yes they have to be maintained but as we said as you go and delete a customer and reader billion orders to do that you are not talking about slowing down data manipulation operations by having an index in place. I mean you are talking about radically improving them all if I go to update some row in a table and it is not based on the primary key or something, let me tell you I have to find the row before I can go when updated. Again something that is going to avoid me reading the whole table to do that can have a profound impact on positively on an update statement.
Grant Fritchey: Oh huge. Just night and day. It is fun to be the guy that walks in puts one index on a table and takes a query from running hours.
Greg Low: Yes and the world changes yes. Yes usually the low hanging fruit and takes a query from
all that tall poppy thing, yes usually I can deal with most of those pretty quickly. But listen anyway thanks for your time today Grant, where will people, across you in upcoming days and months?
Grant Fritchey: Well I am getting around as we are starting to move into the two season. The first one is SQL Saturday in Silicon Valley, in March I am going to be in Connecticut at the SQL Saturday Newhaven. April we will be, is me and Steve Jones will be adds SQL Connections with Paul Randall, Kim trip and a whole bunch of really smart people. I am not sure how I snuck in but I am happy to be there. Oh also is SQL Saturday in Boston in April 2 so there is definitely a lot of stuff coming up and I think I can say this now but I’m not sure but I will say it anyway. I am going out to London for SQL Bits this year so I am looking forward to that.
Greg Low: Yes I haven’t managed to get to SQL Bits yet, but I have so many friends do, the folks that run it and yes it is high on my list that I will get to some time soon.
Grant Fritchey: Yes I have meant to get to the last five of them.
Greg Low: Yes, I love it how the guys move it around the country, it is kind of good. Magic ! Thank you very much for your time today.
Grant Fritchey: No, thank you Greg I really appreciate it this has been fun.
Greg Low: Great. Thank goodness.
Phone: 1300 SQL SQL (1300 775 775) l International +61 1300 775 775 l Fax: +61 3 8676-4913
Copyright 2017 by SQL Down Under | Terms Of Use | Privacy Statement