1

do you store every variable in a relational db? more specifically, only just the class members not local vars. what if table has 200 columns and roughly 75% of the table is nulls. why or why not?

Comments
  • 4
    Assuming each of your fields gets its own column I would say your class is muuuuuuuch to big to have 200 columns. A normal class has I would say max. 15 fields if it's an entity. All other stuff I can think of would be references to other entities like a person class would have a reference to a Adress class. And both would life in their own table referenced by a link table.

    So in short. Save what has to be persistent but keep your classes small and so by using an ORM your tables in your database too.
  • 4
    Sounds like a horrible design.

    Store only what you need.

    Tables in a relational db should be designed to store the data so that it can be accessed easily, quickly and safely.

    Classes should be designed to work with the data that you get from the db.
    They should not be a reflection of your tables.
  • 2
    One of the most common mistakes....

    The more indices an table has, the less performant it will be regarding write operations and *read* operations.

    Most databases can only utilize one index per table **easily**. Yes, there are several techniques that allow using more than index, but most of these techniques have *very* specific rulesets, they cannot be used in most cases.

    So if we make it kind of a rule of thumb that the fewer the indexes per table and the higher the selectivity of indexes (meaning the ability to narrow the search for values, e.g. regarding cardinality) - the better.

    One can immediately infer that a higher number of columns spells trouble... After all, you can only add few indexes - and compound indices (btree) require evaluation of index columns in the same order as defined... So if you have 200 columns in a table, it's unlikely you need only few indexes... After all, the columns all contain data that might be queried.

    Even worse when you join the tables and need all fields.

    You're creating an memory blasting desaster.

    Try to minimize tables based on search criteria, use inner joins to increase selectivity by reference and make sure that the indexes are "properly" designed.

    E.g. avoid the fear of the unknown and making columns unnecessarily large regarding types - if an integer is enough, use an integer. If an tinyint is sufficient, use an tinyint.
  • 1
    Use child tables my friend, for storing part of the data which may not exist more than half of the time. It's fine even if each row in the parent table can only have one child in that table. This solves the 75% null (emptyness) because when a parent doesn't have that particular data, she simply doesn't have a child in that child table.
  • 1
    table Members: ObjectID, Class, Key, Value?

    or table Objects: ObjectID, blob BinarySerializedData

    or table Objects: ObjectID, text SerializedData

    ?

    obviously depends on whether you're just saving/loading the data, or you also need to do queries with the specific members.
Add Comment