I use SQL Server with Entity Framework as the ORM.
Currently I have a table Product, which contains all products of any kind. The different kinds of products have different attributes.
- All products of kind
- Whereas all products of kind Car have attributes like
Based on this scenario I created a table called Attribute, which contains all the attributes of a product.
Now, to fetch a product from the database I always have to join all the attributes.
To insert a product I have to insert all the attributes one by one as single rows.
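To make the layout concrete, here is a minimal sketch of the design described above. It uses Python's built-in sqlite3 instead of SQL Server and EF so that it runs anywhere. The column names (Id, Kind, ProductId, Name, Value) and the color attribute are assumptions, since the question only names the two tables.

```python
import sqlite3

# Assumed schema: the question names the tables but not their columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (Id INTEGER PRIMARY KEY, Kind TEXT NOT NULL);
    CREATE TABLE Attribute (
        ProductId INTEGER NOT NULL REFERENCES Product(Id),
        Name      TEXT NOT NULL,
        Value     TEXT NOT NULL,
        PRIMARY KEY (ProductId, Name)
    );
""")

def insert_product(conn, kind, attributes):
    """Insert one product plus one Attribute row per attribute, in one transaction."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("INSERT INTO Product (Kind) VALUES (?)", (kind,))
        product_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO Attribute (ProductId, Name, Value) VALUES (?, ?, ?)",
            [(product_id, name, str(value)) for name, value in attributes.items()],
        )
    return product_id

def load_product(conn, product_id):
    """Fetch a product together with all of its attributes via a join."""
    rows = conn.execute(
        "SELECT p.Kind, a.Name, a.Value FROM Product p "
        "LEFT JOIN Attribute a ON a.ProductId = p.Id WHERE p.Id = ?",
        (product_id,),
    ).fetchall()
    if not rows:
        return None
    attributes = {name: value for _, name, value in rows if name is not None}
    return {"kind": rows[0][0], "attributes": attributes}

# "color" is a made-up attribute for illustration.
car_id = insert_product(conn, "Car", {"horsepowers": 90, "color": "red"})
car = load_product(conn, car_id)
```

Every value is stored as text here, which is the usual price of this layout: numeric comparisons need a cast, and the database cannot enforce per-attribute types.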
The application is not just a shop or anything like it. It should be possible to add/remove an attribute to/from a kind of product on the fly without changing the db.
But my questions to you are still:
- Is this a bad design?
- Is there another way of doing it?
- Will my solution slow down significantly? (E.g. an insert taking several seconds, assuming the product has hundreds of attributes…)
The problem is that my application is really complex. There are a lot of huge algorithms. The software is used for statistical purposes.
One problem, for example, is the following: in an algorithm table I’m storing which attributes are used for filters. Say an administrator wants to filter all cars that have less than 100 horsepower. The filters are dynamic, which means that I have a filter table which stores the filter type (lessThan) and the attribute (horsepowers). How can I keep this flexibility with the suggested approaches (with “hardcoded” columns)?
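The filter mechanism described here can be sketched roughly as follows, again with sqlite3 standing in for SQL Server. The table columns, the OPERATORS mapping and the find_products helper are hypothetical; only lessThan and horsepowers come from the description above.

```python
import sqlite3

# Whitelist from stored filter type to SQL operator. Only "lessThan" appears in
# the question; the other entries are hypothetical.
OPERATORS = {"lessThan": "<", "greaterThan": ">", "equals": "="}

def find_products(conn, kind, attribute, filter_type, value):
    """Return ids of products of `kind` whose `attribute` satisfies the filter."""
    op = OPERATORS[filter_type]  # raises KeyError for an unknown filter type
    sql = (
        "SELECT p.Id FROM Product p "
        "JOIN Attribute a ON a.ProductId = p.Id "
        "WHERE p.Kind = ? AND a.Name = ? "
        f"AND CAST(a.Value AS REAL) {op} ? "
        "ORDER BY p.Id"
    )
    # Only the whitelisted operator is interpolated; all data goes in as parameters.
    return [row[0] for row in conn.execute(sql, (kind, attribute, value))]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Product (Id INTEGER PRIMARY KEY, Kind TEXT NOT NULL);
    CREATE TABLE Attribute (ProductId INTEGER, Name TEXT, Value TEXT);
    INSERT INTO Product VALUES (1, 'Car'), (2, 'Car'), (3, 'Car');
    INSERT INTO Attribute VALUES
        (1, 'horsepowers', '90'), (2, 'horsepowers', '150'), (3, 'horsepowers', '75');
""")

# A filter row as it might be stored: (filter type, attribute, value).
stored_filter = ("lessThan", "horsepowers", 100)
weak_cars = find_products(
    conn, "Car", stored_filter[1], stored_filter[0], stored_filter[2]
)
```

The same whitelist idea would carry over to fixed columns: the stored attribute name would be looked up in a second mapping from attribute to column name, rather than being pasted into the SQL text.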
There is a thing about EF that I don’t think everybody is aware of when designing the relations.
When you query something, EF (at least up to version 4) tries to produce a single SELECT for that query.
What that implies is that if you have an entity A that has a one-to-many relationship to an entity B (say Item to Attributes), then EF joins the two together, so that one row is returned for every dependent B of each A. If A has many properties or multiple dependencies, or even worse if B has many sub-dependencies, then the returned table will be quite massive, since all A properties are copied into the row of each dependent B. Over time, as your entity model grows in complexity, this can turn into a real performance problem.
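The duplication is easy to see with a plain join. This is a small sketch using sqlite3 rather than EF itself; the Item and Attribute columns and the sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Item (Id INTEGER PRIMARY KEY, Name TEXT, Description TEXT);
    CREATE TABLE Attribute (ItemId INTEGER, Name TEXT, Value TEXT);
    INSERT INTO Item VALUES (1, 'Car', 'a long description');
    INSERT INTO Attribute VALUES (1, 'a1', 'x'), (1, 'a2', 'y'), (1, 'a3', 'z');
""")

# The kind of flat result a single joined SELECT produces.
rows = conn.execute(
    "SELECT i.Id, i.Name, i.Description, a.Name, a.Value "
    "FROM Item i JOIN Attribute a ON a.ItemId = i.Id"
).fetchall()

# One row per attribute; the three Item columns are repeated in every row.
item_columns = {row[:3] for row in rows}
```

With one item and three attributes the result has three rows, each carrying the full set of Item columns. With wide items and hundreds of attributes, most of the transferred data is repetition.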
EF only includes the Bs if you explicitly tell it to eager-load the dependencies with includes. If the includes are omitted, your data will initially load faster, but once you access the attributes, EF lazy-loads them. This is known as the SELECT N+1 problem: one query loads the N As, and then each A triggers an additional query for its Bs, which can be a huge overhead.
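To illustrate the difference in round trips, the following sketch imitates both loading styles by hand with sqlite3 and counts the queries. It does not use EF; the run helper, the counter and the sample data are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Item (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Attribute (ItemId INTEGER, Name TEXT, Value TEXT);
""")
conn.executemany(
    "INSERT INTO Item VALUES (?, ?)", [(i, f"item{i}") for i in range(1, 6)]
)
conn.executemany(
    "INSERT INTO Attribute VALUES (?, ?, ?)",
    [(i, f"attr{j}", "v") for i in range(1, 6) for j in range(3)],
)

counter = {"queries": 0}

def run(sql, params=()):
    """Execute a query and count it as one round trip."""
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()

def load_lazy():
    """One query for the items, then one more per item (the N+1 pattern)."""
    items = run("SELECT Id, Name FROM Item")
    return {
        item_id: run("SELECT Name, Value FROM Attribute WHERE ItemId = ?", (item_id,))
        for item_id, _ in items
    }

def load_eager():
    """A single joined query, grouped in memory afterwards."""
    result = {}
    for item_id, name, value in run(
        "SELECT i.Id, a.Name, a.Value FROM Item i JOIN Attribute a ON a.ItemId = i.Id"
    ):
        result.setdefault(item_id, []).append((name, value))
    return result

counter["queries"] = 0
lazy = load_lazy()
lazy_queries = counter["queries"]

counter["queries"] = 0
eager = load_eager()
eager_queries = counter["queries"]
```

For five items the lazy variant issues six queries and the eager one issues a single query, at the cost of the wider joined result.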
While this is not a straight answer to your question, it is something to consider when designing your tables.
Also note that EF supports several alternatives for mapping base classes. One strategy (table-per-type) is to have a common table that is automatically joined with the tables of the sub-entities. The alternative (table-per-hierarchy), which typically performs better but is harder to upgrade, is to have one table with a superset of all properties of all sub-classes.
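At the table level, the two layouts look roughly like this. This is only a sketch in sqlite3 to show the shape of the tables, not what EF would generate; the Book kind, the column names and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Layout 1: a common table plus one table per sub-entity (joined on read).
    CREATE TABLE Product (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Car  (ProductId INTEGER PRIMARY KEY REFERENCES Product(Id),
                       Horsepower INTEGER);
    CREATE TABLE Book (ProductId INTEGER PRIMARY KEY REFERENCES Product(Id),
                       Pages INTEGER);

    -- Layout 2: one wide table holding the superset of all columns.
    CREATE TABLE ProductFlat (
        Id INTEGER PRIMARY KEY,
        Name TEXT NOT NULL,
        Kind TEXT NOT NULL,     -- discriminator column
        Horsepower INTEGER,     -- NULL unless Kind = 'Car'
        Pages INTEGER           -- NULL unless Kind = 'Book'
    );

    INSERT INTO Product VALUES (1, 'Roadster');
    INSERT INTO Car VALUES (1, 90);
    INSERT INTO ProductFlat VALUES (1, 'Roadster', 'Car', 90, NULL);
""")

# Layout 1 needs a join to assemble a car; layout 2 reads a single table.
joined = conn.execute(
    "SELECT p.Name, c.Horsepower FROM Product p JOIN Car c ON c.ProductId = p.Id"
).fetchall()
flat = conn.execute(
    "SELECT Name, Horsepower FROM ProductFlat WHERE Kind = 'Car'"
).fetchall()
```

Both queries return the same car. The difference is that adding a new kind means a new table in the first layout and new nullable columns on the shared table in the second.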
More (over) generalized database design considerations:
- The devil is in the details. You can make a whole career out of making good database design choices. There are no silver-bullet database patterns.
- EF comes with a lot of limitations. This is the price for the convenience. If the model suits EF well, then EF is quite good, but do consider more flexible alternatives like NHibernate. Sometimes even plain old data tables with views and stored procedures are to be preferred.
- EF is not efficient if your model has a lot of small dependents (like a ton of attributes on an item table). It will result in either a monster query with a huge result table or the SELECT N+1 problem. You can write multi-part LINQ queries to compensate somewhat, but it is tricky.
- SQL’s strength is in integrity and reporting, which work best with rather rigid data models.
- Depending on the details, your model looks like a great candidate for a NoSQL backend such as RavenDB or MongoDB. NoSQL stores handle dynamic data models much better and scale really well.