Sunday, February 19, 2012

Interpreting index statistics on SQL 2005

I ran the DBCC SHOW_STATISTICS command for all of my indexes; I was told that high density numbers are bad, low numbers good. I have some questions about my results, though; I'm not sure how to interpret them.

Of 48 indexes, 14 have a density of 0. Does this mean that the indexes are not selective enough? Does it mean they're garbage and I should toss them?

6 have a density of NULL. They are all primary keys. I suppose this just means that they're never used because these tables are rarely queried. Would this assumption be correct?

13 have a density of 1. I have no idea what this means.

The others have densities ranging from 0.01210491 to 0.5841165. I was told that the lower this number is, the more selective and thus more useful an index is. I think 0.5841165 is too high a number. Would this be correct?

Thanks in advance.First you need to ask good and bad for what. I can argue both ways equally well. Low density numbers are good for OLTP, and high density numbers are good for OLAP. Density is practically meaningless for compound keys, since they are almost always compound and the densisty only refers to the leftmost column.

Density is meant to convey one specific property of a table. I use it to get an initial feel for the table, but it doesn't mean much in the "big picture" to me... I really need more information than what the Density can convey by itself.

-PatP|||First you need to ask good and bad for what. I can argue both ways equally well. Low density numbers are good for OLTP, and high density numbers are good for OLAP. Density is practically meaningless for compound keys, since they are almost always compound and the densisty only refers to the leftmost column.

Density is meant to convey one specific property of a table. I use it to get an initial feel for the table, but it doesn't mean much in the "big picture" to me... I really need more information than what the Density can convey by itself.

-PatP

Ummm kaaaaaaay. Didn't understand hardly a word of that.

Basically I have a bunch of giant indexes chewing up space on my DB...one table's indexes take up 3 times as much space as the table's data does. Another tables indexes take up 20% more space than the table's data. I want to get rid of indexes that aren't being used or that are garbage; I thought I could use statistics to help determine that.

I know next to nothing about indexing.|||Ah, now we're getting somewhere!

First of all, I need to correct one typing error, I meant "natural keys, since they..." in the second paragraph above, I just didn't proofread it before I sent it.

Just FYI, the density has very little to do with how useful the index is to an application. There are lots of other things that factor in much more importantly than the density.

If you don't want to take the time to understand indexing, you really shouldn't try to make decisions about it... Kind of like if you don't know a distributor from a carburator, you probably shouldn't work on your motorcycle... Just ride it and enjoy! When it needs a tune up, call a professional.

If you want to learn about indexing, there are lots of web pages that explain it at varying levels of detail. I'm not sure what your needs are, so I can't really recommend one over another, but if you snoop around ten or fifteen minutes I'm sure you can find several that suit your needs.

If you don't want to learn about indexing, you've still got several choices. You can hire someone to come in and do the job for you. This is quick, efficient, and gets you about as good a solution as you're likely to get. You can post your index script here, and maybe someone will help you online. This isn't as good or as reliable, but its cheaper and might be faster too.

-PatP|||One thing you may try is the Index Tuning Wizard. It is not a perfect tool, but it may point out what indexes are not getting used. You will need to run a trace of the activity, and making sure you get enough activity is the big problem. If some of these indexes are used for the CEO's big report at the end of the quarter, and you drop that index, there could be a few uncomfortable explanations you will have to give. In short, have all the indexes scripted out, so you can add them back later on, if need be.|||Ah, now we're getting somewhere!

First of all, I need to correct one typing error, I meant "natural keys, since they..." in the second paragraph above, I just didn't proofread it before I sent it.

Just FYI, the density has very little to do with how useful the index is to an application. There are lots of other things that factor in much more importantly than the density.

If you don't want to take the time to understand indexing, you really shouldn't try to make decisions about it... Kind of like if you don't know a distributor from a carburator, you probably shouldn't work on your motorcycle... Just ride it and enjoy! When it needs a tune up, call a professional.

If you want to learn about indexing, there are lots of web pages that explain it at varying levels of detail. I'm not sure what your needs are, so I can't really recommend one over another, but if you snoop around ten or fifteen minutes I'm sure you can find several that suit your needs.

If you don't want to learn about indexing, you've still got several choices. You can hire someone to come in and do the job for you. This is quick, efficient, and gets you about as good a solution as you're likely to get. You can post your index script here, and maybe someone will help you online. This isn't as good or as reliable, but its cheaper and might be faster too.

-PatP

*sigh*

I never said I didn't want to learn. I said that I don't know very much about it and that I didn't understand what you had said earlier, which is not the same thing.

I have read many web pages dealing with indexing. I've found them to be as useful as a bucket of pee.

I'm sorry I asked. I don't feel that I did anything to deserve that barrage of condescending comments. I certainly won't be back asking any more questions or taking up any more of your precious time.|||One thing you may try is the Index Tuning Wizard. It is not a perfect tool, but it may point out what indexes are not getting used. You will need to run a trace of the activity, and making sure you get enough activity is the big problem. If some of these indexes are used for the CEO's big report at the end of the quarter, and you drop that index, there could be a few uncomfortable explanations you will have to give. In short, have all the indexes scripted out, so you can add them back later on, if need be.

We don't have access to that here; I did have the DBAs run it on a SQL Server 2000 database a short time ago and while they didn't let me see the actual output, they said that my indexes were "fine." Then they called me the next day griping about clustered index scans. Can't hit a moving target so why try? Seems nearly every time I ask, I get insulted and shouted down.|||Well, I think you need to chill just a bit...I don't think Pat was trying to be condesending...

If the DBA's are griping about scans...that is a totally different issue...|||Well, I think you need to chill just a bit...I don't think Pat was trying to be condesending...

If the DBA's are griping about scans...that is a totally different issue...

I'm trying to chill but it seems that indexing is a sacred subject because everywhere I've asked, I've basically been told to take a hike when I ask questions about that particular subject. The books I've read suck and I haven't found a decent website yet. I'm here asking for help and I'm getting fobbed off. My DBAs don't want to help me since they're about to be outsourced; you couldn't pay them to care about what's going on at the moment. Thing is, I still have a job and my database's performance is an issue. I'm trying to fix it. That's all I'm trying to do here.|||How about posting one of the queries that seem to be giving you a problem? Not all queries can be helped by indexes. In fact, some can be outright hurt by indexes, so even the best of experts have to guess more often than you might think.|||I've been watching Pat post for several years and he is just about the most even handed poster on here. I think you will probably find that he was trying to triangulate in on your experience and knowledge level to ensure his answer was appropriate and (it appears) simply missed the target a little.

And what is wrong with the UK anyway? Or have we covered that already before...?|||How about posting one of the queries that seem to be giving you a problem? Not all queries can be helped by indexes. In fact, some can be outright hurt by indexes, so even the best of experts have to guess more often than you might think.

Well...I think some of the queries are problematic because they're pulling down a ton of rows. The queries drive reports in an application I wrote and their where clauses are dynamic. I've tried to index the columns used in the where clauses but they just never seem to get used; I just get hit with a clustered index scan unless I keep narrowing the focus of the query; then I get the index seeks that the DBAs want to see.

For example, this query takes 24 seconds to run and returns 53,914 rows.

SELECT a.Pol_no
, a.OM_Available
, a.OM_Complete
, a.Agency
, a.Channel
, a.EDI_Flag
, a.NTU_Date
, a.PLCY_NAPI
, a.Plcy_Type_Cd
, PIE.Started AS 'Policy_Issue_Started'
, PR.Started AS 'PROPNTU_Started'
, Reg.Ended AS 'Registration_Ended'
, a.Team
, a.SumAssured
, a.RunDate
, a.MissingItems
, a.Distributor
, a.ParentCompany
, a.Site
, a.CI_Type
, a.ProcessOwner
, a.Status AS 'FullStatus'
, CriticalIllness = CASE WHEN a.CI_Type IS NULL THEN 'Life Only' ELSE 'Critical Illness' END
, QuartersOld = CAST(DATEDIFF(m, a.OM_Available, a.RunDate)/3 AS Int)
, DATEDIFF(d, a.OM_Available, a.RunDate) AS 'DaysElapsed'
FROM vwFESubset a
LEFT OUTER JOIN vwPIE PIE ON a.Pol_no = PIE.Pol_no
LEFT OUTER JOIN vwPROPNTU PR ON a.Pol_no = PR.Pol_no
LEFT OUTER JOIN vwReg Reg ON a.Pol_no = Reg.Pol_no
WHERE a.Status IN('Accepted','Registered','Pending')
AND a.OM_Complete IS NULL

We never run it on its own; we just query it for other reports. I get index scans in five different tables and index seeks in three, and sometimes multiple index scans or seeks for the same table. There are some tables with less than 12 rows in them which always have clustered index scans. I've been doing this less than a year so I'm not always sure how to interpret the query plan but our DBAs really hate seeing scans of any kind and they jump all over me whenever they see them.|||And what is wrong with the UK anyway? Or have we covered that already before...?

I don't like it.|||I am going to guess that the "vw" in the from clause means views. One of the first things to try here is to go through those views, and see if the views involve tables you do not need. If so, you can rewrite the query to go to the base tables, and you will avoid all the reads and joins against those extra tables.

As a percentage, how many rows have a.Status IN('Accepted','Registered','Pending')? I am going to guess it is a sizable percentage. If so, this query may not be eligible for indexes.

Is this query run for every report? If so, is the data aggregated or filtered further up the chain? If you can get that extra filtering back down to the database level, you may stand a better chance of making your DBAs less unpleasant.|||I am going to guess that the "vw" in the from clause means views. One of the first things to try here is to go through those views, and see if the views involve tables you do not need. If so, you can rewrite the query to go to the base tables, and you will avoid all the reads and joins against those extra tables.

As a percentage, how many rows have a.Status IN('Accepted','Registered','Pending')? I am going to guess it is a sizable percentage. If so, this query may not be eligible for indexes.

Is this query run for every report? If so, is the data aggregated or filtered further up the chain? If you can get that extra filtering back down to the database level, you may stand a better chance of making your DBAs less unpleasant.

The views are all necessary, and that table is indexed. The data is filtered sometimes, and the columns it is filtered on are indexed.

The Status IN value grabs 106800 rows out of 124727 possibles. I would think that wouldn't be eligible for indexing ever but I can't make my DBAs understand. They want me to further narrow the scope of the queries, which would make the reports pretty much useless.

No comments:

Post a Comment