23

I'm designing a database for a web site that will have at least 4 different object types represented (articles, blog posts, photos, stories), each of which have different enough data requirements to warrant their own tables. We want users to be able to post comments for any of these types. The data requirements for comments are simple and independent of the type of thing the comment regards (ie just a comment body, and the author's email).

I want to avoid the redundancy of creating and managing 4+ separate tables for the comments, so I'd like to be able to hold all comments in one table, possibly specifying the relation via 2 columns: one to designate the parent entity and one for the parent row Id.

but I don't understand how, then, I would implement foreign keys, since foreign keys establish a relation between 2 and only 2 tables (right?).

So with all that in mind, what would be the best approach?

36

Here's one way to implement supertype/subtype tables for your app.

First, the supertype table. It contains all the columns common to all subtypes.

CREATE TABLE publications (
  pub_id INTEGER NOT NULL PRIMARY KEY,
  pub_type CHAR(1) CHECK (pub_type IN ('A', 'B', 'P', 'S')),
  pub_url VARCHAR(64) NOT NULL UNIQUE,
  CONSTRAINT publications_superkey UNIQUE (pub_id, pub_type)
);

Next, a couple of subtype tables.

CREATE TABLE articles (
  pub_id INTEGER NOT NULL,
  pub_type CHAR(1) DEFAULT 'A' CHECK (pub_type = 'A'),
  placeholder CHAR(1) NOT NULL, -- placeholder for other attributes of articles
  PRIMARY KEY (pub_id, pub_type),
  FOREIGN KEY (pub_id, pub_type) REFERENCES publications (pub_id, pub_type)
);

CREATE TABLE stories (
  pub_id INTEGER NOT NULL,
  pub_type CHAR(1) DEFAULT 'S' CHECK (pub_type = 'S'),
  placeholder CHAR(1) NOT NULL, -- placeholder for other attributes of stories
  PRIMARY KEY (pub_id, pub_type),
  FOREIGN KEY (pub_id, pub_type) REFERENCES publications (pub_id, pub_type)
);

The CHECK() and FOREIGN KEY constraints in these subtype tables prevent rows from referencing the wrong kind of row in the supertype. It effectively partitions the pub_id values among the subtypes, guaranteeing that any given pub_id can appear in one and only one of the subtype tables. That's why you need either a PRIMARY KEY or NOT NULL UNIQUE constraint on the pair of columns {publications.pub_id, publications.pub_type}.

The table for comments is simple. Given that it is to have the same structure for all subtypes, you can reference the supertype.

CREATE TABLE comments (
  pub_id INTEGER NOT NULL REFERENCES publications (pub_id),
  comment_timestamp TIMESTAMP NOT NULL DEFAULT now(),
  commenter_email VARCHAR(10) NOT NULL, -- Only allow people who have 
                                        -- really short email addresses
  comment_text VARCHAR(30) NOT NULL,    -- Keep 'em short!
  PRIMARY KEY (pub_id, comment_timestamp, commenter_email)
);

Add a little bit of data.

INSERT INTO publications VALUES
(1,'A', 'url 1 goes here'),
(2,'A', 'url 2 goes here'),
(3,'S', 'url 3 goes here');

INSERT INTO articles VALUES
(1,'A', 'A'),
(2,'A', 'B');

INSERT INTO stories VALUES
(3,'S', 'A');

INSERT INTO comments VALUES
(1, now(), 'a@b.com','You''re stupid'),
(1, now(), 'b@c.com', 'You''re stupid, too!');

Now you can create a view to show all articles and resolve the join. You'd do the same for each of the subtypes.

CREATE VIEW articles_all AS
SELECT P.*, A.placeholder
FROM publications P
INNER JOIN articles A ON (A.pub_id = P.pub_id)

You might prefer names like "published_articles" instead of "articles_all".

To select one article and all its comments, you can just left join the two tables. (But see below why you probably won't do that.)

SELECT A.*, C.*
FROM articles_all A
LEFT JOIN comments C ON (A.pub_id = C.pub_id)
WHERE A.pub_id = 1;

You'd probably not actually do that for a web interface, because the dbms would have to return 'n' copies of the article, where 'n' equals the number of comments. But it does make sense to do this in some applications. In applications where it makes sense, you'd use one updatable view for each subtype, and application code would use the updatable views most of the time.


The more common business application of a supertype/subtype involves "Parties" (the supertype), "Organizations" and "Individuals" (the subtypes, informally companies and people. Addresses, like "comments" in the example above, are related to the supertype, because all the subtypes (organizations and individuals) have addresses.

  • 5
    @Snake: It's not a bad idea, but it's not necessary for joins, and you don't need dynamic SQL. (Dynamic SQL has risks that are better avoided if possible.) A supertype/subtype design for 'n' subtypes maps to 'n'+1 base tables and 'n' views. Each of the 'n' views joins the supertype table to one of the subtype tables; then clients use the views, not the base tables. (Write triggers, if necessary, to let you do INSERT, UPDATE, DELETE through the views.) Once you have a well-named, updatable view, you don't need to read that column yourself. It's just to help SQL maintain data integrity. – Mike Sherrill 'Cat Recall' Mar 9 '11 at 11:26
  • 1
    @Catcall: Thanks. Gotcha clear on the use of views. I kinda knew that but didn't think thru the question well enough. Another question: My plan would be to rely on the Publications table to generate the Pub_Id -- then this gets copied to the sub-type table, therefore the value will be unique within each table, so I wouldn't need to have a compound primary key on Publications table. Does that make sense? – Faust Mar 9 '11 at 15:57
  • 5
    @Snake: I understand what you're saying, but you really need that column and the compound key if you're using a SQL dbms. That CHAR(1) column, implemented in the supertype and every subtype and used as part of both a compound key and a foreign key reference, guarantees that each row in the supertype table can join to one and only one row in one and only one subtype table. Without it, you could insert the same pub_id in every subtype table, which makes the supertype/subtype design useless. – Mike Sherrill 'Cat Recall' Mar 9 '11 at 16:15
  • 2
    @CatCall: So back to the last question: given the role pubtype has in the primary key, is there a performance reason, or other, to keep it to 1 character? And if so, would an int be even better? (BTW: thanks for your time and patience. And pls excuse my "density" on these things. My experience is primarily front-end, but I'm a one-man project this time; though, fortunately, I've lobbied successfuly for lots of time so I can think things thru carefully.) – Faust Mar 10 '11 at 9:11
  • 5
    @Snake: CHAR(1) is long enough to be readable enough, and it takes less space than an integer. You might get marginally better performance with CHAR(1), or you might get marginally better performance with an integer. But I generally prefer text to numbers, because I find it easier to read text. (A)rticle, (B)log post, (S)tory is easier to remember than (1) article, (2) blog post, (3) story. I don't mind spending time; you shouldn't mind upvoting every one of my replies. And my original answer, too. ;) – Mike Sherrill 'Cat Recall' Mar 10 '11 at 11:10
5

You can use super-type/subtype in DB design to avoid that problem. Create a super-type for images, video, notes and then link to the super-type. Keep all common column in the super-type table.

Here are few links to several similar questions/answers with models:

  • 2
    Do you mean create a sub-type for images, video notes rather than super-type? – Pixelated Sep 24 '14 at 12:47
1

In my opinion you are better off having 4+ separate tables for the comments. Or you could have join tables. One table for all the comments... ex: blog table, comment table, blog_comment table. this would allow you to have your foreign keys.

Blog
--------
Blog_id
{other fields}

Blog_Comment
--------------
Blog_id
Comment_id


Comment
------------
Comment_id
{other fields}
  • 1
    This is the design I would recommend. The last possibility (and whether you need something like this would depend upon your design) would be to add a UserID to each entry in either the comment table or the blog/Video/Etx tables, such the one could retrieve all comments for a specific user (again, if this fits your design requirments, and with some possible adjustments for specific design requirements . . . ). Where the USerID ended up would probably require some thought. – XIVSolutions Feb 12 '11 at 16:56

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

Not the answer you're looking for? Browse other questions tagged or ask your own question.