Question
Let's take the CodeReview page as an example. If you go here, you will see the list of questions with their vote counts and comment counts.
Now, I am wondering whether these numbers are calculated on the fly (e.g. by counting the total comments in the db for this question), or whether there is an actual field in the questions table called voteCount that is incremented by one whenever someone votes for this question.
One more question: both VoteCount and CommentCount are just integers, but what about the user information that is displayed for each question?
Hope this is the correct place to ask this question.
Explanation / Answer
It really depends on the traffic of the site, and you probably shouldn't bother with such decisions unless:
You've identified an actual bottleneck
You've identified an actual bottleneck on a very similar scenario in the past
Premature optimization is the buzzword, and Donald Knuth describes it better than I possibly could:
Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.
The simplest approach is to have a vote table where you log all votes. Typical fields would be voteID, voterID, questionID, and voteTimestamp, and you could calculate the totals on each request. However, that will probably become a performance issue very soon, especially on a high-traffic site such as StackOverflow.
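To make the "log every vote, count on each request" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `vote`, the helper functions, and the in-memory database are my own assumptions for illustration; only the four field names come from the description above.

```python
import sqlite3

# Hypothetical schema built from the fields named above; the table name
# and the in-memory database are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE vote (
        voteID        INTEGER PRIMARY KEY,
        voterID       INTEGER NOT NULL,
        questionID    INTEGER NOT NULL,
        voteTimestamp TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
    )
    """
)


def cast_vote(conn, voter_id, question_id):
    """Log a single vote as one row in the vote table."""
    conn.execute(
        "INSERT INTO vote (voterID, questionID) VALUES (?, ?)",
        (voter_id, question_id),
    )


def vote_count(conn, question_id):
    """Count the votes for one question on the fly, on every call."""
    row = conn.execute(
        "SELECT COUNT(*) FROM vote WHERE questionID = ?", (question_id,)
    ).fetchone()
    return row[0]


cast_vote(conn, voter_id=1, question_id=42)
cast_vote(conn, voter_id=2, question_id=42)
cast_vote(conn, voter_id=1, question_id=7)
```

Every page view that shows a total pays for a COUNT query here, which is fine at low traffic and is exactly the cost that starts to hurt as the vote table and the request rate grow.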
In that case, my approach would be to run a scheduled background process that calculates the totals and stores them in a different table (or in the question table), possibly even in a document storage database, or even in a memory cache, if that makes sense.
There are other ways to cache calculated totals, but these are the simplest ones (I think).
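As a rough sketch of the background-process idea, the function below recomputes all totals from the vote log and stores them in a voteCount column on the question table. The schema, the function name `refresh_vote_totals`, and the sample rows are assumptions of mine; how the job gets scheduled (cron, a worker queue, and so on) is deliberately left out.

```python
import sqlite3


def refresh_vote_totals(conn):
    """Recompute every question's total from the vote log and store it."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            """
            UPDATE question
            SET voteCount = (
                SELECT COUNT(*) FROM vote
                WHERE vote.questionID = question.questionID
            )
            """
        )


# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE question (
        questionID INTEGER PRIMARY KEY,
        title      TEXT    NOT NULL,
        voteCount  INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE vote (
        voteID     INTEGER PRIMARY KEY,
        voterID    INTEGER NOT NULL,
        questionID INTEGER NOT NULL
    );
    INSERT INTO question (questionID, title) VALUES (1, 'First'), (2, 'Second');
    INSERT INTO vote (voterID, questionID) VALUES (10, 1), (11, 1), (12, 2);
    """
)

# A scheduler would call this periodically; here it runs once.
refresh_vote_totals(conn)
totals = dict(conn.execute("SELECT questionID, voteCount FROM question"))
```

Page requests then read the stored voteCount directly. The trade-off is that the displayed number can lag behind the vote log by up to one scheduling interval.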
As for the user information, it is expected to change less often, so you could probably get away with caching it, without any special database-level approach.
Generally speaking, reads are considerably faster than writes, and in common low-traffic scenarios you'd be fine just sensibly caching your views. There is no definitive approach; mix and match as you identify bottlenecks.
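One possible way to cache rarely changing user information is a small in-memory cache with an expiry time. This is only a sketch: the `TTLCache` class, the `load_user` stand-in, and the 300-second lifetime are all hypothetical, and a real site might just as well use an existing caching layer.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the database
        value = loader(key)  # missing or expired: reload and store
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


db_calls = []  # records which lookups actually reached the "database"


def load_user(user_id):
    """Stand-in for a real database lookup (hypothetical)."""
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user{user_id}"}


user_cache = TTLCache(ttl_seconds=300)
first = user_cache.get_or_load(7, load_user)
second = user_cache.get_or_load(7, load_user)  # served from the cache
```

The second lookup never touches `load_user`, so a question list showing the same user many times costs one database read per user per expiry window.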