Large forums (i.e. large number of posts) may encounter slow loading of home page. This is due to this query in html/modules/forums/index.php:
Code:
$sql = "SELECT t.forum_id, t.topic_id, p.post_time
FROM " . TOPICS_TABLE . " t, " . POSTS_TABLE . " p
WHERE p.post_id = t.topic_last_post_id
AND p.post_time > " . $userdata['user_lastvisit'] . "
AND t.topic_moved_id = 0";
As you notice, $userdata['user_lastvisit'] is different from one user to another, and from one visit to another. Therefore, this large query has to be re-calc'ed quite often, and cannot take advantage of mysql query cache (on by default in mysql).
So to speed things up, we attempt to have that large query cached. To do this, we do a compromise and show the user only new ads in the last 24 hours.
Locate:
Code:
$sql = "SELECT t.forum_id, t.topic_id, p.post_time
FROM " . TOPICS_TABLE . " t, " . POSTS_TABLE . " p
WHERE p.post_id = t.topic_last_post_id
AND p.post_time > " . $userdata['user_lastvisit'] . "
AND t.topic_moved_id = 0";
if ( !($result = $db->sql_query($sql)) )
{
message_die(GENERAL_ERROR, 'Could not query new topic information', '', __LINE__, __FILE__, $sql);
}
//steve: take advantage of caching, only return rows updated in the last 24 hours, and round that off to the nearest hour
$last_period = intval(time()/3600) - 24; $last_period *= 3600; //round it off, so it is cached every hour
$sql = "select SQL_CACHE t.forum_id, t.topic_id, p.post_time
FROM " . TOPICS_TABLE . " t, " . POSTS_TABLE . " p
WHERE p.post_id = t.topic_last_post_id
AND p.post_time > " . $last_period . "
AND t.topic_moved_id = 0
";
if ( !($result = $db->sql_query($sql)) )
{
message_die(GENERAL_ERROR, 'Could not query new topic information', '', __LINE__, __FILE__, $sql);
}
$timer->addmarker('marker 5a');
$new_topic_data = array();
while( $topic_data = $db->sql_fetchrow($result) )
{
if ($topic_data['post_time'] > $userdata['user_lastvisit']) { //steve: new topics are the ones posted after last visit
$new_topic_data[$topic_data['forum_id']][$topic_data['topic_id']] = $topic_data['post_time'];
}
}
one thing that may increase performance is to use the SQL_BIG_RESULT statement in conjunction with SQL_CACHE.
the first time the query is run it will of course be a little slower, then, the caching kicks in for future executions. now, if you are expecting a large amount of data to be returned, instruct the MySQL Optimizer to be prepared by using SQL_BIG_RESULT.
for example,
SELECT SQL_CACHE SQL_BIG_RESULT ........
you could also replace that with the SQL_SMALL_RESULT, however, from what i've read, it should no longer be needed for MySQL version 3.23 or higher.
give it a whirl, see if there any speed increases
steven111 Lieutenant
Joined: Dec 30, 2003
Posts: 283
Posted:
Sat Jun 05, 2004 8:23 pm
According to my research, SQL_BIG_RESULT affects the performance of distinct and group by queries. So perhaps it does not have an effect on this query? although I have not tested it.
yes, you are correct. for some reason as i scanned your post i thought i saw a 'group by' in there.
must be getting late here
Zhen-Xjell Nuke Cops Founder
Joined: Nov 14, 2002
Posts: 5939
Posted:
Sun Jun 13, 2004 11:08 am
Interesting, I may give this a try.
_________________ Paul Laudanski, Microsoft MVP Windows-Security
CastleCops: [de] [en] [wiki]
steven111 Lieutenant
Joined: Dec 30, 2003
Posts: 283
Posted:
Sun Jun 13, 2004 11:16 am
What I state above is the FIRST STEP, but not the only thing that has to be done. I also am addressing the case where the db is being frequently updated. In that case, mysql cache will get busted as soon as new data is written to any of the tables in the join.
So IN ADDITION to what I am doing above, I am also caching the resulting set from the sql query, and refreshing it every 15 minutes. This is giving me real performance boost. I use cache-lite for this.
I haven't used any of the caching capabilities for the site. I'd appreciate some details if you can.
_________________ Paul Laudanski, Microsoft MVP Windows-Security
CastleCops: [de] [en] [wiki]
steven111 Lieutenant
Joined: Dec 30, 2003
Posts: 283
Posted:
Sun Jun 13, 2004 11:48 am
What I have done may be considered extreme.... I limit to the last 24 hours the "what's new" for users. This compromise allows me to fix the sql query, and not make it user dependent. Then I cache the query as well, and refresh it every 15 minutes. So the big query in the home page is executed at most 4 times in an hour (vs. one time per user visit to home page). This is a big query hitting topics table and posts table, and the performance improvement is significat.
In module/forums/index.php, the NEW code is as follows (look at your source code for what it was like..)
//steve: take advantage of caching, only return rows updated in the last 24 hours, and round that off to the nearest hour
$last_period = intval(time()/3600) - 24; $last_period *= 3600; //round it off, so it is cached every hour
/* $sql = "select SQL_CACHE t.forum_id, t.topic_id, p.post_time
FROM " . TOPICS_TABLE . " t, " . POSTS_TABLE . " p
WHERE p.post_id = t.topic_last_post_id
AND p.post_time > " . $userdata['user_lastvisit'] . "
AND t.topic_moved_id = 0 */
$sql = "select SQL_CACHE t.forum_id, t.topic_id, p.post_time
FROM " . TOPICS_TABLE . " t, " . POSTS_TABLE . " p
WHERE p.post_id = t.topic_last_post_id
AND p.post_time > " . $last_period; // .
/* "AND t.topic_moved_id = 0 */
// ";
//caching area--------------------
if (isset($Cache_Lite) && $contentA = $Cache_Lite->get($_SERVER['SERVER_NAME'] . 'bigindex')) { //make it unique to each webserver address, to not interfere
$topic_dataA = unserialize($contentA);
}
else {
if ( !($result = $db->sql_query($sql)) )
{
message_die(GENERAL_ERROR, 'Could not query new topic information', '', __LINE__, __FILE__, $sql);
}
while( $topic_dataA[] = $db->sql_fetchrow($result) ) {
}
if (isset($Cache_Lite)) $Cache_Lite->save(serialize($topic_dataA)); //save it for next time
}
//end of caching ------------------
$new_topic_data = array();
foreach ($topic_dataA as $topic_data) {
if ($topic_data['post_time'] > $userdata['user_lastvisit']) { //steve: new topics are the ones posted after last visit
$new_topic_data[$topic_data['forum_id']][$topic_data['topic_id']] = $topic_data['post_time'];
}
}
}
cache-lite was initialized elsewhere (in mainfile.php) like this:
On a dedicated single-processor server (light load) with 730k posts,
I am getting:
-on home (index) page, logged in user: first run 2.14 seconds
-thereafter: .58 seconds
_________________ Paul Laudanski, Microsoft MVP Windows-Security
CastleCops: [de] [en] [wiki]
steven111 Lieutenant
Joined: Dec 30, 2003
Posts: 283
Posted:
Sun Jun 13, 2004 12:18 pm
Used to be about 50 seconds, and thrashing around by the unix box, going to swap space (bad!!).
I suspect that a simple indexing of topic_last_post_id in nuke_bbtopics would fix 90% of the performance problem, and the other stuff I have done may not be needed. If one has a perf. problem, should try the indexing first. It is discussed here:
http://nukecops.com/modules.php?name=Forums&file=viewtopic&p=132042#132042
By the way, nukecops seems to be running just fine Another 200k posts, and we will talk.
and if everything looks fine speedwise, not to worry about the changes suggested here. The changes in this forum are for forums largers than say 400k posts- if you are in that category, then you may want to take a look here.
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum