Showing posts with label number. Show all posts

Friday, March 30, 2012

Invoice dimension - best way to model it

We have a system that will have ~15 million invoices per year. We would like to track many attributes unique to each invoice (invoice number, etc.). Currently we are using a junk dimension ([Dim Invoice Attribute]) that contains 4 or 5 of the attributes, but our changes will require going to the grain of the invoice table. I have read that having a physical dimension table with 1:1 cardinality to the fact table isn't a good idea (strictly because of the size of the table and the cost of joining fact to dimension). Currently we have 2 or 3 attributes of the invoice in the fact table (degenerate dimensions) in order to fulfill relational queries for tying things back to the MDX results.

My question is: what is the best way to model this? I have read both sides of the story - create an invoice dimension table and store everything there, with keys relating back to the fact table; or store all of the attributes in the fact table and create a named query in the .dsv that selects out only the attributes for the [Dim Invoice] dimension, thus saving space and join effort during processing.

Can anyone shed some light on this?

Thank you in advance,

John Hennesey

Typically in this situation, you have a fact table that represents the invoicing process. One of the dimensions is the invoice which will most likely have a one-to-one relationship with the fact. Does this describe your model?

Kimball would describe this as a degenerate dimension and would recommend you move the dimension into the fact table. SSAS provides support for this through a fact dimension. (See Books Online topic "Dimension Relationships" for a more detailed description.)

So this works if we're talking about a fact dimension with just a few attributes. After that, the model starts to look a little sloppy. I would look at the cardinality of some of these other attributes. If the cardinality is relatively low, you may want to move these into their own dimensions; if the cardinality is very low, lump them into a junk dimension.

The other thing I would consider is the user's experience. How often will users be accessing these other dimensions created to hold those attributes? If these are presented separately from the Invoice dimension, would they seem weird or be cumbersome for your users? Do you plan on providing user hierarchies that roll your invoice number up to these attributes? If not, then I'd recommend going ahead with moving them out.

From a performance standpoint, I don't think your design choice has too big an impact on SSAS. Every attribute hierarchy is given its own storage, and the only thing that would be different internally would be the relationship maps. The impact of having or not having these depends on how you intend to use your attributes.

So, I guess my general recommendation would be to keep to standard Kimball principles and put an emphasis on end-user experience. If you run into a performance problem down the road, you should be able to easily tweak the model.

If other folks have thoughts on this, especially regarding how the SSAS engine handles this stuff, please, please chime in.

Thanks,
Bryan

|||

I agree with the approach of keeping the invoice attributes in the fact table.

One issue is the number of invoice members (metadata) that will be downloaded to the client when they use the invoice dimension. This will be a really slow process if you show too many invoice members, and the solution is to create artificial levels above the invoice number. I built the first level from the first character of the invoice number using the T-SQL LEFT function in the data source view. A guideline is to have no more than 10-15 groups on each level. On the leaf level you can have larger groups.
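
A hedged sketch of what Thomas describes - a named query in the data source view that derives artificial grouping levels from the invoice number (the table and column names here are assumptions, not from the original post):

```sql
-- Named query for the invoice dimension in the data source view.
-- Each LEFT() expression derives an artificial level above the invoice
-- number, so the client never has to download all members of a flat,
-- multi-million-member level at once.
SELECT
    InvoiceNumber,
    LEFT(InvoiceNumber, 1) AS InvoiceGroupLevel1,  -- aim for ~10-15 groups
    LEFT(InvoiceNumber, 3) AS InvoiceGroupLevel2   -- finer grouping below that
FROM dbo.FactInvoice
```

The derived columns then become the upper levels of a user hierarchy over the invoice number.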

I have asked the SSAS 2005 development team about any performance issues with building a fact table dimension, and they said there is no negative impact.

HTH

Thomas Ivarsson

|||

Interesting - I have heard arguments on both sides. To implement a fact table dimension, it would use a named query that goes against the fact table, right? Is there a special way to set up a fact table dimension (any attributes of the dimension itself?) I have also read in just about every article that it is necessary to set up a hierarchy to limit the number of members returned to the client - indeed a very good idea.

Thank you all for your input,

John Hennesey

|||

Hello John! Or use named queries against the fact table for each artificial level in the invoice hierarchy/dimension.

Good luck!

/Thomas

Wednesday, March 28, 2012

Invariant part inside SELECT

All,
Can we do this trick, and if yes then how? Just schematically: the SP should
return the number of records if the parameter @Count = 1, and if not, the
records themselves. The problem is that there is a complicated JOIN and
a whole set of WHERE clauses that I wouldn't like to repeat in two
different queries that look almost identical except for the main SELECT part.
The idea described below doesn't work.
--Parameter
DECLARE @Count bit
SET @Count = 1

SELECT
    CASE
        WHEN @Count = 1 THEN pe.*, pn.*
        ELSE COUNT(*)
    END
    ...
FROM ...
INNER JOIN ... ON ...
WHERE ...
Any ideas?
Just D.

|||

Just D (no@.spam.please) writes:
> Can we do this trick and if yes then how? Just schematically: the SP
> should return the number of records if the parameter @.Count=1, if not,
> then the records themselves. The problem is that there is some
> complicated JOIN and the whole set of WHERE clauses that I wouldn't like
> to repeat in two different queries looking almost identically excluding
> the main SELECT part. The idea described below doesn't work.
The best is probably to put the whole JOIN-WHERE business in an
inline table-valued function. Then the procedure can read:

IF @Count = 1
    SELECT COUNT(*) FROM tblfunc(@par1, @par2, ...)
ELSE
    SELECT col1, col2, ...
    FROM tblfunc(@par1, @par2, ...)
You could also bounce the data over a temp table, but that would be more
expensive in terms of performance, not least for the COUNT. (Since for
the COUNT(*) SQL Server may find a quicker query plan when it does not have
to read all data pages.)
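
A hedged sketch of the approach Erland outlines - the shared JOIN/WHERE logic moved into an inline table-valued function that both branches reuse (the table, column, and parameter names are illustrative assumptions):

```sql
-- Inline TVF holding the shared JOIN/WHERE logic once.
CREATE FUNCTION dbo.tblfunc (@par1 int, @par2 int)
RETURNS TABLE
AS
RETURN
    SELECT pe.col1, pn.col2
    FROM dbo.PersonExt AS pe
    INNER JOIN dbo.PersonName AS pn ON pn.person_id = pe.person_id
    WHERE pe.col1 = @par1
      AND pn.col2 = @par2
GO

-- The procedure becomes a thin wrapper over the function.
CREATE PROCEDURE dbo.usp_GetRows @par1 int, @par2 int, @Count bit
AS
IF @Count = 1
    SELECT COUNT(*) FROM dbo.tblfunc(@par1, @par2)
ELSE
    SELECT col1, col2 FROM dbo.tblfunc(@par1, @par2)
GO
```

Because an inline TVF is expanded into the calling query before optimization, each branch still gets its own plan, including the cheaper COUNT(*) plan.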
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx

|||

Hi
Using SELECT * in production code is not a good idea. The only way to do
what you are doing without writing the query twice would be to use dynamic
SQL.
John
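
A hedged sketch of the dynamic SQL route John mentions - the SELECT list is chosen at run time while the shared FROM/WHERE text is written only once (the table, column, and parameter names are assumptions):

```sql
-- Build the query text once; only the SELECT list varies by @Count.
DECLARE @Count bit
DECLARE @sql nvarchar(4000)
SET @Count = 1

SET @sql = CASE WHEN @Count = 1
                THEN N'SELECT COUNT(*)'
                ELSE N'SELECT pe.col1, pn.col2'
           END
         + N' FROM dbo.PersonExt AS pe
              INNER JOIN dbo.PersonName AS pn ON pn.person_id = pe.person_id
              WHERE pe.col1 = @par1'

-- sp_executesql keeps @par1 a typed parameter instead of concatenated text.
EXEC sp_executesql @sql, N'@par1 int', @par1 = 42
```

As Erland notes below, this adds a layer of complexity that may not be worth it here.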
"Just D" wrote:

> All,
> Can we do this trick and if yes then how? Just schematically: the SP should
> return the number of records if the parameter @.Count=1, if not, then the
> records themselves. The problem is that there is some complicated JOIN and
> the whole set of WHERE clauses that I wouldn't like to repeat in two
> different queries looking almost identically excluding the main SELECT part.
> The idea described below doesn't work.
> --Parameter
> Declare @.Count bit
> SET @.Count = 1
>
> SELECT
> CASE
> WHEN @.Count = 1
> THEN pe.*, pn.*
> ELSE COUNT(*)
> END
> ...
> FROM ...
> INNER JOIN ... ON ...
> WHERE ...
> Any ideas?
> Just D.
>
>

|||

Erland,
Correct me if I am wrong. I don't see any performance benefit in using the
table-valued function over using the actual query, except for the fact that
the stored procedure looks better :)
The execution plan is not stored for the TVF but is stored in the calling
SP, and the plan will be recompiled every time the condition changes. I would
say it would be better performance-wise if we have two stored procedures, one
for returning the row count and one for returning the result set, and call
these two SPs from the main SP based on the condition. That way only the main
SP will get recompiled, and it will not be much of an overhead.
If it's a query with a simple execution plan, then what you suggest will be
fine.
-Omnibuzz (The SQL GC)
http://omnibuzz-sql.blogspot.com/
"Erland Sommarskog" wrote:

> Just D (no@.spam.please) writes:
> The best is probably to put the whole JOIN-WHERE business in an
> inline table-valued function. Then the procedure can read:
> IF @.count = 1
> SELECT COUNT(*) FROM tblfunc(@.par1, @.par2, ...)
> ELSE
> SELECT col1, col2, ...
> FROM tblfunc (@.par1, @.par2, ...)
> You could also bounce the data over a temp table, but that would be more
> expensive in terms of performance, not least for the COUNT. (Since for
> the COUNT(*) SQL Server may find a quicker query plan when it does not have
> to read all data pages.)
> --
> Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
> Books Online for SQL Server 2005 at
> http://www.microsoft.com/technet/pr...oads/books.mspx
> Books Online for SQL Server 2000 at
> http://www.microsoft.com/sql/prodin...ions/books.mspx
>

|||

>> Just schematically: the SP should return the number of records [sic] if the parameter @Count=1, if not, then the records [sic] themselves. <<
Did you ever have a software engineering course? Remember cohesion?
The idea that a properly designed code module will perform one
well-defined task. Good programmers do not write things that return
the square root of a number or translate Flemish depending on a
parameter.
Why did you make your low-level BIT flag a reserved word? Why are you
thinking in terms of assembly language style flags and variant records
instead of rows?
DECLARE @Count bit
SET @Count = 1

SELECT
    CASE
        WHEN @Count = 1 THEN pe.*, pn.*
        ELSE COUNT(*)
    END
    ...
FROM ...
INNER JOIN ... ON ...
WHERE ... ; <<
CASE is an expression and not a control flow device. You can use an
IF-THEN-ELSE construct in T-SQL to mimic procedural coding with variant
records instead of using declarative coding.
Did you also notice that you want to return one column and then want to
return two columns? A row in a relational table always has a fixed
number of columns, unlike records in a file. Basically, you are still
writing COBOL or some other procedural file-oriented language, but you
are doing it in SQL.
This is a simple matter of cut & paste, not the end of the world.
However, if you are just looking for a newsgroup kludge instead of a
real answer in one query, try:
SELECT
    CASE WHEN @assembly_language_flag = 1
         THEN 'violated cohesion'
         ELSE COUNT(*) END AS foobar,
    CASE WHEN @assembly_language_flag = 1
         THEN PA.x
         ELSE 'violated cohesion' END AS x,
    etc.
FROM ..
Boy that is awful, isn't it?

|||

You're so kind as usual, writing that in this style. :) Let me guess, you're
from the Western Ukraine, aren't you?
"--CELKO--" <jcelko212@.earthlink.net> wrote in message

> Did you ever have a software engineering course? Remember cohesion?
> The idea that a properly designed code module will perform one
> well-defined task. Good programmers do not write things that return
Tell that to the MS coders (mostly contractors from India :)) who were
usually adding 20 and more parameters like NULL (reserved) to the method
parameter list, overriding one method tons of times. That was always MS
style.

> the square root of a number or translate Flemish depending on a
> parameter.
Yea-yea, pretty close.
The flame is closed.

|||

Omnibuzz (Omnibuzz@.discussions.microsoft.com) writes:
> Correct me if I am wrong. I don't see any performance benifit by using
> the table valued function over using the actual query, except for the
> fact that the stored procedure looks better :)
Correct, but the presumption was that Just D wanted the procedure
to look better. That is, he did not want to repeat the conditions. And I can
think of four ways to achieve this aim:
1) view/inlined table function.
2) bounce over temp table.
3) dynamic SQL.
4) pre-processor.
In my post I only discussed the first two options, and of these the
TVF gives better performance than the temp table.
In my opinion, using dynamic SQL introduces another level of complexity
which is not worth the pain in this case.
And a preprocessor? Well, we have one in our environment, but most
people don't.

> The execution plan is not stored for the TVF but is stored in the calling
> SP. And the plan will be recompiled every time the condition changes.
As I understood it, the JOIN and WHERE conditions of the query are
stable. As for the condition on whether to return COUNT or a result set,
that should not lead to any recompilation.
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx

|||

Erland, thanks for your answer.
Yes, the main idea was to make the SP more flexible and maintainable. The
conditions are complex enough that repeating them more than once is
especially bad if we need to improve/modify them in the future: we can easily
make a simple mistake doing that in two places, OR we will have to
copy/paste each time we need to change something. That's why this idea
appeared. But on the other side, any change like that should not seriously
affect the speed of the code or the whole complexity, because in gaining this
advantage we'd be getting into another trouble, which is not acceptable.
That's why I asked this newsgroup for a new, better idea. To implement the
function - then we'll need to maintain this function and provide the
required set of tables and parameters, which I guess would be cached in a
different way if we call the function inside our SP. A temporary table is
an even worse scenario. Many different ways can change the whole
idea, and doing one thing can crash everything around it.
Just D.
"Erland Sommarskog" <esquel@.sommarskog.se> wrote in message
news:Xns97E0F409CDF8DYazorman@.127.0.0.1...
> Omnibuzz (Omnibuzz@.discussions.microsoft.com) writes:
> Correct, but the presumption was that Just D wanted the procedure
> to look better. That is, he did not want to repeat the conditions. And I can
> think of four ways to achieve this aim:
> 1) view/inlined table function.
> 2) bounce over temp table.
> 3) dynamic SQL.
> 4) pre-processor.
> In my post I only discussed the first two options, and of these the
> TVF gives better performance than the temp table.
> In my opinion, using dynamic SQL introduces another level of complexity
> which is not worth the pain in this case.
> And preprocessor? Well, we have one in our environment, but most
> people doesn't.
>
> As I understood it, the JOIN and WHERE conditions of the query are
> stable. As for the condition on whether to return COUNT or result set,
> that should lead to any recompilation.
>
>
> --
> Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
> Books Online for SQL Server 2005 at
> http://www.microsoft.com/technet/pr...oads/books.mspx
> Books Online for SQL Server 2000 at
> http://www.microsoft.com/sql/prodin...ions/books.mspx

|||

Just D (no@.spam.please) writes:
> That's why I asked this newsgroup for a new, better idea. To implement the
> function - then we'll need to maintain this function and provide the
> required set of tables and parameters that should be cached in a different
> way I guess if we call the function inside our SP.
Not really sure what you mean here. An inline-table function does not have
any query plan of its own. An inline table function is really a macro that
the optimizer pastes in before building the query plan. (Note that this
does not apply to multi-statement functions nor to scalar functions.)
As for the maintenance, you would move that to the function. The procedure
would just be a wrapper on the function.
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se
Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pr...oads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodin...ions/books.mspx

Wednesday, March 21, 2012

Invalid Logical Page

When trying to DTS a table (Auth_Details) from one database to another, I received the following error message:
Error at Source for row number 296443, Errors encountered so far in this task: 1
Attempt to fetch logical page (1:566896) in database abc belongs to object 'Ben_Tables', not to object 'Auth_Details'.
Can anyone tell me what may cause this and how to get rid of the problem?
Thanks,
Joe
Seems you have a corruption in your database. Here's some generic info:
http://www.karaszi.com/sqlserver/inf...suspect_db.asp
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
"JoeA" <anonymous@.discussions.microsoft.com> wrote in message
news:8027A9B8-23E8-4F1A-945C-5A5328CC9B0D@.microsoft.com...
> When trying to DTS a table (Auth_Details) from one database to another, I received the following error
message:
> Error at Source for row number 296443, Errors encountered so far in this task: 1
> Attempt to fetch logical page (1:566896) in database abc belongs to object 'Ben_Tables', not to object
'Auth_Details'.
> Can anyone tell me what may cause this and how to get rid of the problem?
> Thanks,
> Joe


Monday, March 19, 2012

Invalid dates passed to an SP

Hi,
We have a number of SP's which have a couple of datetime params.
We expect the format to be 'YYYYMMDD' to get around the issue of US
vs UK dates. Unfortunately, if an invalid date is passed, it errors straight
away with (e.g.)
Server: Msg 8114, Level 16, State 4, Procedure usp_GetUsage, Line 0
Error converting data type varchar to datetime.
Is there any way of getting around this? The error occurs BEFORE I can check
whether the dates are valid with the isdate() function.
I know that I can change all the params to varchar, and check them before
putting the values into datetime variables, but this seems a bit of a fudge.
Any help would be most appreciated.
Cheers!
Andy

|||

Well, you pass an invalid parameter to a stored procedure. Of course the
parameters are checked _before_ the stored procedure itself is executed.
And what is the exact use of checking the parameter in the stored procedure
itself (if you changed the parameter to a character datatype)? The only
thing you could do is raise a more custom error message, but I hope the
programmers that code against this stored procedure are smart enough to work
out the meaning of the error message they get now, and implement error handling
or fix bugs on their side accordingly.
Jacco Schalkwijk
SQL Server MVP
"Andy" <Andy@.discussions.microsoft.com> wrote in message
news:6D80C811-3ED4-4F2A-8A4F-2D18FCD111ED@.microsoft.com...
> Hi,
> We have a number of SP's which have a couple of datetime params.
> We expect the format to be such as 'YYYYMMDD' to get around the issue of
> US
> vs UK dates. Unfortunately if an invalid date is passed, it errors
> straight
> away with (e.g.)
> Server: Msg 8114, Level 16, State 4, Procedure usp_GetUsage, Line 0
> Error converting data type varchar to datetime.
> Is there any way of getting around this? The error occurs BEFORE I can
> check
> whether the dates are valid with the isdate() function.
> I know that I can change all the params to varchar, and check them before
> putting the values into datetime variables, but this seems a bit of a
> fudge.
> Any help would be most appreciated.
> Cheers!
> Andy

|||

I agree with what you're saying, but just wanted to write a more robust SP so
that I could provide users with a simple error message.
regards,
Andy
"Jacco Schalkwijk" wrote:

> Well, you pass an invalid parameter to a stored procedure. Of course the
> parameters are checked _before_ the stored procedure itself is executed.
> And what is the exact use of checking the parameter in the stored procedure
> itself (if you changed the parameter to a character datatype)? The only
> thing you could do is raise a more custom error message, but I hope the
> programmers that code against this stored procedure are smart enough to work
> out the meaning of the error message they get now, and implement error handling
> or fix bugs on their side accordingly.
> --
> Jacco Schalkwijk
> SQL Server MVP
>
> "Andy" <Andy@.discussions.microsoft.com> wrote in message
> news:6D80C811-3ED4-4F2A-8A4F-2D18FCD111ED@.microsoft.com...
>
>

|||

The applications that access the database should provide the users with a
simple error message. Users really shouldn't call stored procedures directly
IMO.
Jacco Schalkwijk
SQL Server MVP
"Andy" <Andy@.discussions.microsoft.com> wrote in message
news:DDAC2AB1-4794-4F79-B467-04EB4E36CDA7@.microsoft.com...
>I agree with what you're saying, but just wanted to write a more robust SP so
> that i could provide users with a simple error message.
> regards,
> Andy
> "Jacco Schalkwijk" wrote:
>

|||

Nothing reasonable you can do except make sure that when you build an
application you don't allow bad dates. There are lots of date controls that
can be purchased, or easily written.
You "could" change your parms to text and check them, but that is not a
great use of processing time. Checking universally known domain values like
this should always be done in the client where the possible 10 milliseconds
to perform the task will not be added to multiple other operations, causing
greater waiting all around. Doing it as the user leaves the date control
puts this work in a better place.
----
Louis Davidson - drsql@.hotmail.com
SQL Server MVP
Compass Technology Management - www.compass.net
Pro SQL Server 2000 Database Design -
http://www.apress.com/book/bookDisplay.html?bID=266
Blog - http://spaces.msn.com/members/drsql/
Note: Please reply to the newsgroups only unless you are interested in
consulting services. All other replies may be ignored :)
"Andy" <Andy@.discussions.microsoft.com> wrote in message
news:6D80C811-3ED4-4F2A-8A4F-2D18FCD111ED@.microsoft.com...
> Hi,
> We have a number of SP's which have a couple of datetime params.
> We expect the format to be such as 'YYYYMMDD' to get around the issue of
> US
> vs UK dates. Unfortunately if an invalid date is passed, it errors
> straight
> away with (e.g.)
> Server: Msg 8114, Level 16, State 4, Procedure usp_GetUsage, Line 0
> Error converting data type varchar to datetime.
> Is there any way of getting around this? The error occurs BEFORE I can
> check
> whether the dates are valid with the isdate() function.
> I know that I can change all the params to varchar, and check them before
> putting the values into datetime variables, but this seems a bit of a
> fudge.
> Any help would be most appreciated.
> Cheers!
> Andy
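
A hedged sketch of the varchar-parameter workaround Andy describes - accepting the dates as strings so the procedure body can validate them and raise a friendlier error (the procedure name matches the thread; the table and column names are assumptions):

```sql
-- Accept the dates as varchar so an invalid value reaches the procedure
-- body instead of failing during implicit parameter conversion.
CREATE PROCEDURE dbo.usp_GetUsage
    @StartDate varchar(8),   -- expected format: YYYYMMDD
    @EndDate   varchar(8)
AS
BEGIN
    IF ISDATE(@StartDate) = 0 OR ISDATE(@EndDate) = 0
    BEGIN
        RAISERROR('Dates must be valid and in YYYYMMDD format.', 16, 1)
        RETURN
    END

    DECLARE @Start datetime, @End datetime
    SET @Start = CONVERT(datetime, @StartDate, 112)  -- style 112 = YYYYMMDD
    SET @End   = CONVERT(datetime, @EndDate, 112)

    SELECT usage_date, usage_amount
    FROM dbo.Usage
    WHERE usage_date BETWEEN @Start AND @End
END
```

As Louis and Jacco point out, the cleaner place for this check is the calling application; this sketch only shows what the in-procedure "fudge" would look like.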

Monday, March 12, 2012

Invalid column name 'rowguiedcol' - error number 207

"The process could not deliver the snapshot to the Subscriber"
I added a new database to the same server as the publisher and distributor.
The error ocurred when pushing a new subscription to it.
I did indicate that the subscriber already has the schema and data.
Any ideas for correcting?
thanks,
bob
Sounds like the table you are replicating to has an extra column in it which
is not in the publisher. Look for a column called rowguidcol and delete it.
If you get an error message complaining that this table is published for
replication or a similar message, verify that this table is not published,
or published to.
If it is you might have to do the following:
exec sp_configure N'allow updates', 1
go
reconfigure with override
go
UPDATE syscolumns SET colstat = colstat & ~4096 WHERE colstat & 4096 <> 0
GO
UPDATE sysobjects SET replinfo = 0
GO

and then issue sp_MSunmarkreplinfo against the problem table
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
"Robert A. DiFrancesco" <bob.difrancesco@.comcash.com> wrote in message
news:umf1ACevEHA.2520@.TK2MSFTNGP15.phx.gbl...
> "The process could not deliver the snapshot to the Subscriber"
> I added a new database to the same server as the publisher and
distributor.
> The error ocurred when pushing a new subscription to it.
> I did indicate that the subscriber already has the schema and data.
> Any ideas for correcting?
> thanks,
> bob
>

Friday, February 24, 2012

intersection of dimension?

I have a query that counts the number of stores that exist in both of two dimensions:

nonemptycrossjoin(descendants(dimlocation.currentmember,9,leaves),descendants(dimtargetset.currentmember,9,leaves)).count

dimlocation is a parent-child dimension while dimtargetset is a star schema...

Later I realized that I'm only counting the stores with data and excluding stores with no data from the count. I tried a plain crossjoin, but it doesn't return the correct count...

Is there any way I could count them? I'm using AS 2000... thanks...

any ideas?

|||

See this post: http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=2139906&SiteID=1

Best regards

- Jens
