Saturday, February 25, 2012

"Multi relational" table design question

Hi! Im working on a webapplication and has serious thoughts about howto optimize my table structure. To explain:

My tablestructure today

(simplified):

tbl_customers
cust_id
name
....

tbl_contacts
con_id
name
....

tbl_groups
grp_id
name
....

My subtables look like this(alternative 1):

tbl_sub_phone
phone_id
parent_type
parent_id
phone_area
phone_nr
....

tbl_sub_email
mail_id
parent_type
parent_id
email
....

As seen above every contact, group and customer can be assigned an unlimited amount of phonenumbers or emailadresses.
For example when entering a new email or a customer following will be inserted in tbl_sub_email: parent_type = 'cst', parent_id= '2' (the cust_id from tbl_customers), email ='gwerg@.fe.com'

The problem is i am uncertain if this is a very unefficient way of handling it? i see two alternatives:

Alternative 2:
i create x subtables for each table for example tbl_customers will get its mailadresses and phonenumbers contained in tbl_customers_phone and tbl_customers_email
What i am uncertain of here is if this would make things alot more troublesome when searching p? example after a specific phonenumber.

Alternative 3:

(simplified):

tbl_customers
cust_id
name
....

tbl_contacts
con_id
name
....

tbl_groups
grp_id
name
....

tables connection objects to subobjects

tbl_customers_phone
id
cust_id
phone_id

tbl_contacts_phone
id
con_id
phone_id

tbl_customers_mail
id
cust_id
mail_id

subtables

tbl_sub_phone
phone_id
phone_area
phone_nr
....

tbl_sub_email
mail_id
email
....

Ranking these three models, wich would be the most efficient and most inefficient performanswise?
What i want to avoid is performanceproblems when listing the objects, my indexing skills are a bit limited although im doing alot of reading and testing regarding this.
So thats why im asking for advice so that i can minimize the need of rebuilding the table structure when the application already has been starting to get used.

I also have another general question.

I have alot of select querys when i need to fetch data from several different tables.
Most of them is that i for example get an application from tbl_applications table, and that tables contains the columns cat1, cat2 and cat3 (wich are categories and contain the primary key integer to the tbl_sub_categorys table)
With 3 joins i retrieve these 3 category names returning 1 result with all the info i need.

Since ive been getting som strange results from the query analyzer(i got results that using clustered indexing for the primary key resulted in a slower query (higher cost)) i actually have another question.

Can it generally be summed up that a single query(join or subquery) generaly ils faster than getting the data in separate selects?
In the example above this i have the options either of using joins = 1 query or doing 2 querys and sorting the categorys codewise in aspx pages or doing 4 querys, one for the app followed by 1 for every category.
Any input regarding this?

As i said earlier im looking for the most efficient way of doing the things abov, would greatly appriechiate any input!Big Smile

Hi,

I don't know what tools come with SQL Server Express, but most full version RDBMS have tools that let you look at execution plans for queries and show the reletive cost of actions a given query plan will take. Look for terms like 'show plan', 'show query plan', 'query optimizer', or options for whatever 'query designer' tool you may have. You can also save a querry with an execution plan that works well as a stored procedure, which helps even more. Selective denormalization of a database (adding redundant data columns to a table), can improve performance, if the number of tables referenced by a single query exceeds a certain number (something like >7), depending on the RDBMS (you can look that up in the help). Doing this means you'll have to cover any data integrity issues on your own, though.

Indexing generally helps, but you'll want to test to see if it does in any given situation by testing. Also, consider the physical ordering of whatever tables you have. For any given table, there can be only one order in which the rows are stored. When a query needs to read through a table to find the rows it needs, the physical sort oder is the most effecient.

Doesn't sound like data integrity is a prime concern in your app. Keep in mind though, that in many cases, what you do to make queries more effecient can open up your database to data integrity issues. Optimally, you can copy a subset of a database where data integrity is the prime concern, to a database setup just for queries (maybe that's what you have already). This way, you can optimize without worrying about effects that might have on the data you need to keep safe. Also, you won't have people hitting and locking up a line-of-business database, if they don't need to.

Just some things to look into. In the end, you can chase optimization forever, so get an idea of what degree or performance is good enough, and then weigh that against the other issues you face. It could be that other parts of your app will show a better return on the time you spend than data structure optimization. Hope it helps. BRN..

|||

Well as i understand the main thing is to avoid table scans since this can be a *** when u reach a certain number of querys. Have done some tests inserting approx 50 000 - 100 000 rows and then checking performance degradation, wich is quite significant. What i aim at is creating a table structure that can be optimised as easy as possible. I have some lack in indexing skills wich makes me wonder how much one can improve the table structure i have now without splitting the subtables or creating intermediate tables that are used when listning the objects.

|||

Think i will split the tables for now and, if i see any reason to it, join them later(wich i dont think + if i would, it would be easier than the other way around).

And yeah, optimizing the code probably would be something worth spending time on as well. But will go with what i have now and read more into indexing and query optimization (The ladder wich i think is quite important to be able to chosse the right query/code structure).

No comments:

Post a Comment