Nuke Cops :: View topic - sitemap.xml
oferlaor
Nuke Cadet

Joined: Dec 07, 2004
Posts: 2

Posted: Mon Jun 20, 2005 4:18 am

Is there a PHP-Nuke-friendly mod that builds sitemap.xml (and subordinate XML files) for the new index scheme that Google has just started implementing?

foxyfemfem
Support Staff

Joined: Jan 23, 2003
Posts: 668
Location: USA

Posted: Mon Aug 01, 2005 7:05 am

Hello,

You can go to Create Google Sitemap xml.

This is a Google sitemap XML generator.
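
For anyone curious what such a generator produces, here is a minimal sketch in Python. It is not the tool linked above; the `build_sitemap` helper, the page list, and the choice of the sitemaps.org namespace are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace from the sitemaps.org protocol. This is an assumption: the
# thread does not say which schema the linked generator emits.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes "&" to "&amp;" when serializing the text.
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical page list, for illustration only.
pages = [
    "http://www.yoursite.com/index.php",
    "http://www.yoursite.com/modules.php?name=Forums",
    "http://www.yoursite.com/modules.php?name=Downloads&d_op=viewdownload",
]
sitemap_xml = build_sitemap(pages)
```

Writing `sitemap_xml` to a file named sitemap.xml in the site root would give a file that can be submitted; a real generator would also need to discover the URLs, which this sketch leaves out.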

_________________
If you shoot for the moon and miss, you'll still be amongst the stars.

sixpack
Lieutenant

Joined: Oct 20, 2004
Posts: 165

Posted: Mon Aug 01, 2005 7:51 am

I went through this just recently and used a few different things. I made a post about what I found that worked, with links to a few free tools to get the job done. Check it out: Free Tools to Create XML Sitemaps for Google Sitemaps Beta. Good luck!

Steptoe
Captain

Joined: Oct 10, 2004
Posts: 563

Posted: Mon Aug 01, 2005 11:09 am

I have been messing with a few sitemap generators over the last month. Not being a coder and not really understanding what I'm doing, I found it rather confusing.
Yesterday I came across this:
http://johannesmueller.com/gs/
It takes a while to crawl the site, even on a LAN. After 3 crawls using the filters, it created a sitemap of only what I wanted. I crawled on a P3 with 512 MB of RAM. It works really hard and does an excellent job: I ended up with a 70k sitemap of a site with approx. 1200 posts and just over 200 members. It still took a few hours to crawl through... just doing its job.
With the filters one can take out the reply, new post, account, and many other similar links.

Snoboreders
Nuke Soldier

Joined: Jun 30, 2005
Posts: 31

Posted: Tue Aug 02, 2005 7:26 pm

What parameters would you remove? If I dropped "sid", do you think it would recognize that as the session ID? Also, since the random_num pages were aborted, does that mean that when Google crawls my site, it won't time out on those pages?

sixpack
Lieutenant

Joined: Oct 20, 2004
Posts: 165

Posted: Sat Aug 06, 2005 6:28 pm

sid would be good to filter, as well as reply, mark, search, etc. As for the other question, I am not sure what you are asking. random_num?

_________________
Fix and troubleshoot your computer!

Steptoe
Captain

Joined: Oct 10, 2004
Posts: 563

Posted: Sat Aug 06, 2005 7:26 pm

Like you, I'm trying to get my head around sitemaps, meta tags, .htaccess, and Google Tap, and I have no idea who "Sid" is or what he does for a living lol

Currently these threads are where the background and the things being looked into are explained to us 'lay' people:
http://www.storebuilder.co.uk/modules.php?name=Forums&file=viewtopic&p=1784#1784
http://ravenphpscripts.com/modules.php?name=Forums&file=viewtopic&p=44076#44076

Rockdrala
Sergeant

Joined: Aug 09, 2005
Posts: 97

Posted: Thu Aug 18, 2005 2:26 am

I have installed a sitemapper and spidered my own site using different bots with submitexpress.com. I have a problem: my PHP-Nuke is installed in a folder called Nuke2 instead of the main directory. Whenever the spider shows me the results, it brings up a list of bad URLs without the Nuke2 folder, for example
http://studio505.net/modules.php?name=Journal
Should be http://studio505.net/Nuke2/modules.php?name=Journal

Something somewhere in the PHP-Nuke site is sending out these wrong URLs and sending the search engine crawlers to dead ends. I thought installing the sitemapper would fix it, but it didn't. What do I do?

Here's the messed-up thing: after I installed sitemapper.php in the Nuke2 folder, which is the PHP-Nuke root folder (I installed the sitemapper to correct this issue), IT SHOWED UP ON THE CRAWLERS! But its name was messed up as well, thus sending crawlers to a "page cannot be found".
Example of what showed:
http://studio505.net/sitemapper.php
(should have been)
http://studio505.net/Nuke2/sitemapper.php

What is causing this?

Snoboreders
Nuke Soldier

Joined: Jun 30, 2005
Posts: 31

Posted: Thu Aug 18, 2005 8:39 am

Sorry RockDrala, I can't answer that one.

Here's what is under my "Remove Parameters"
gfx
orderby
osCsid
PhpSessId
PhpSessionID
random_num
Session
SessionID
SID
XTCsid

Here's what's under my "Drop parts"
ratenum
ratetype

Under your "ban URLs", add these:
www.yoursite.com/admin.php
www.yoursite.com/admin/
www.yoursite.com/blocks/
www.yoursite.com/images/
www.yoursite.com/includes/
www.yoursite.com/language/
www.yoursite.com/modules/
www.yoursite.com/themes/
&op=new_user
&op=pass_lost
&op=ShowCookies
&op=ShowCookiesRedirect



I hope this is of some use to you.

Oh yeah, I'm running Nuke Platinum. I've noticed that Googlebot and MSNbot are at my site every day. It's too bad Google hasn't updated their PageRank, because I'm still at 0 (it was indexed about 40 days ago).
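
To show roughly what those filter lists do, here is a small Python sketch. It is not how the crawler itself is implemented; the `clean_url` helper is hypothetical, the case-insensitive parameter match is an assumption, and the banned entries are reduced to path fragments so they work for any host. The "Drop parts" entries are left out because the thread does not explain how the tool applies them.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Parameter names from the "Remove Parameters" list above, lowercased
# (matching case-insensitively is an assumption of this sketch).
REMOVE_PARAMS = {"gfx", "orderby", "oscsid", "phpsessid", "phpsessionid",
                 "random_num", "session", "sessionid", "sid", "xtcsid"}

# Entries from the "ban URLs" list above, with the host name dropped.
# "&op=ShowCookies" also covers "&op=ShowCookiesRedirect" as a substring.
BANNED = ["/admin.php", "/admin/", "/blocks/", "/images/", "/includes/",
          "/language/", "/modules/", "/themes/",
          "&op=new_user", "&op=pass_lost", "&op=ShowCookies"]


def clean_url(url):
    """Return the URL without session-style parameters, or None if banned."""
    if any(fragment in url for fragment in BANNED):
        return None
    parts = urlsplit(url)
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in REMOVE_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))


# Hypothetical forum link carrying a session ID.
topic = clean_url("http://www.yoursite.com/modules.php"
                  "?name=Forums&file=viewtopic&t=5&sid=abc123")
```

Here `topic` comes back without the `sid` parameter, so every visit to the same thread maps to one URL instead of one per session, while admin and account-creation links return `None` and stay out of the sitemap.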

Rockdrala
Sergeant

Joined: Aug 09, 2005
Posts: 97

Posted: Thu Aug 18, 2005 10:02 am

Can anyone tell me why the hell my site is generating bad URLs for crawlers? This is really pissing me off!

Maybe it's some sick joke by what's-his-face that created PHP-Nuke...

Here is a clue: the bot I'm using is the Meta Tag Analyzer from submitexpress.com. You can choose whether you want Googlebot results or any other spider you want. You do it all online. Great tool.

These BAD URLs have to be generated FROM MY site, because it's not finding the actual URLs in the crawl results.

The bots are finding actual page names SOMEWHERE, or they wouldn't have shown sitemapper.php after I installed it.

So the question is: what is cutting out the Nuke2 directory?

Steptoe
Captain

Joined: Oct 10, 2004
Posts: 563

Posted: Thu Aug 18, 2005 11:39 am

I think this is the problem, though I do not understand the whys.
1/ Google doesn't like SIDs and long URLs.
2/ Therefore Google doesn't like going beyond the links on the front page.
3/ I think Google Tap still needs to be installed to take care of long URLs and make URLs like www.yoursite/forums.html
4/ Even then Google has trouble getting into forums (something to do with SIDs??), though if the latest forum block is on the front page, these are crawled.
5/ Google does get to downloads and web links OK.
6/ Other engines like MSN, etc. will crawl OK.
I don't think Google likes too many URLs on the front page, e.g. links to news posters' details, links to new members' details, and other similar stuff.
7/ Google (and most engines) doesn't like the user info that has visitor IPs with xxx replacing the last IP numbers (replace this with zzz).
8/ Google says the sitemap will not necessarily increase rankings, but it will crawl more parts of the site. It does this once or twice, then stops at the index page again.
9/ Somehow I think pages/posts need dynamic meta tags for the description of posts/threads??

Our site was in the top 5 for its subject on Google, and out of 19 subject + parameter searches (kakariki + ), 12 were also in the top 10s.
So unlike most people trying to get up there, my playing has been from the top down.
I accidentally messed up my meta tags. Don't believe that Google doesn't rate these very highly: overnight the main subject parameter stayed up, but ALL the rest dropped below rank 100 and 200! After 2 weeks they are slowly coming up. Other engines didn't drop as much and came up faster again.

I don't have the answers, just observations.

Rockdrala
Sergeant

Joined: Aug 09, 2005
Posts: 97

Posted: Thu Aug 18, 2005 1:31 pm

Steptoe, I appreciate the info. I want you to check this out: go to www.submitexpress.com and choose the free site meta tag analyzer.

Punch in this: www.studio505.net/nuke2

Select any crawler (Google, spiders, etc.) and take a look. It is not just Google but any crawler.

It's cutting out the Nuke2 directory in every URL found, so every URL goes to a dead end. Is this perhaps a meta engine hidden somewhere in PHP-Nuke?

Sure, I could just throw a static HTML page up with a sitemap and a redirect, but seriously, we should know this for future reference for future PHP-Nuke installations. I bet that anyone else who has installed 7.7 like me and put it in a subdirectory, and who runs the same test I did with the online crawlers, will see the same results.

I may be a little crazy to spider my own sites just to see the results, but I like to know what the bot is seeing, and right now it's just seeing a bunch of dead ends. Wouldn't you want to see what the hell the bots are seeing?

Where is this URL-generating engine hidden in PHP-Nuke?!

Thanks, Steve

Steptoe
Captain

Joined: Oct 10, 2004
Posts: 563

Posted: Thu Aug 18, 2005 2:21 pm

www.studio505.net/nuke2
I don't know much about setting up subdomains...
But somehow, the way yours is set up seems wrong???
Shouldn't it be something like www.nuke2.studio505.net/ and set up like that in your virtual hosts in the httpd.conf file in Apache?

Rockdrala
Sergeant

Joined: Aug 09, 2005
Posts: 97

Posted: Thu Aug 18, 2005 5:02 pm

Apache functions are not available on my server, so I might as well kiss mod_rewrite, NukeSentinel, and stuff like that goodbye. Everything I use is just core scripts and phpMyAdmin. That's why I'm looking to fix the engine itself, without having to resort to additional programs.

softplus
Nuke Cadet

Joined: Aug 25, 2005
Posts: 2
Location: Switzerland

Posted: Thu Aug 25, 2005 12:37 am

Hi Rockdrala
I think there's a reason for the sub-optimal results:
Your site is probably the single most broken website I have ever seen (and I've seen a lot lately, testing my GSiteCrawler :)). You have multiple head/body/html sections, and you seem to be mixing several sites into one single giant page.

There is no way any search engine will be able to get good results from your site, it's amazing it even renders in my browser. Sorry, that sounds really mean, but it isn't meant that way - it's just a real mess...

If I were you, I would try to clean it up a bit, or at least make sure that you have a valid structure around it, even if the contents of the body section don't validate (that would be better, but is not a requirement). Make sure your page starts with either a doctype or an html section, and keep a single head section and a single body section. Any search engine that comes along now will possibly pick a "random" one of your head sections, possibly skipping your meta tags (title, description, keywords), and possibly even just give up trying to read the page. I know for sure that Google is getting very strict about the quality of the sites that they list.
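
A quick way to spot the duplicated sections described above is to count the opening tags. The following is a rough Python sketch, not a validator; the `SectionCounter` class, the `duplicated_sections` helper, and the sample markup are made up for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class SectionCounter(HTMLParser):
    """Count opening <html>, <head>, and <body> tags in a page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag in ("html", "head", "body"):
            self.counts[tag] += 1


def duplicated_sections(markup):
    """Return the structural tags that open more than once."""
    parser = SectionCounter()
    parser.feed(markup)
    return {tag: n for tag, n in parser.counts.items() if n > 1}


# Hypothetical broken page: two documents pasted into one.
broken = ("<html><head><title>A</title></head><body>one</body></html>"
          "<html><head><title>B</title></head><body>two</body></html>")
problems = duplicated_sections(broken)
```

An empty result only means each section opens at most once; it says nothing about whether the rest of the markup validates.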

Once you have that done, you can get to work with sitemaps, etc. Before you clean up the code it is a waste of time, as Google will only mark you down as "potentially bad" and put you on the back burner regarding crawling.

Also, regarding the crawlers going from http://www.studio505.net/nuke2 to http://www.studio505.net/[bla bla] - that is an error in the crawler software (I suppose the test link you posted isn't a very smart, standards-conformant crawler; usually that doesn't matter). It thinks that "nuke2" is a file instead of a directory, i.e. it thinks it needs to place all new links below http://www.studio505.net/ instead of in /nuke2/. It's a simple mistake, but the server headers (which you don't see) tell the browser/crawler that it's a directory. But it's no big deal, Google + co. will do it right. You can confirm that using, for example, my GSiteCrawler, which finds the URLs correctly (however, your server is so slow that I didn't crawl the complete site :)).
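
The file-versus-directory mix-up can be reproduced with standard relative URL resolution. This Python sketch assumes the pages link to `modules.php?name=Journal` relatively, which the thread does not confirm, and it says nothing about how the online checker is actually written.

```python
from urllib.parse import urljoin

# Relative link of the kind a PHP-Nuke page might contain (assumed).
link = "modules.php?name=Journal"

# No trailing slash: "nuke2" is treated as a file name, so the link
# resolves against the site root and the directory is lost.
as_file = urljoin("http://www.studio505.net/nuke2", link)

# Trailing slash: "nuke2/" is a directory, so the link stays inside it.
as_directory = urljoin("http://www.studio505.net/nuke2/", link)
```

Here `as_file` is http://www.studio505.net/modules.php?name=Journal and `as_directory` is http://www.studio505.net/nuke2/modules.php?name=Journal. A crawler that ignores the server's hint that nuke2 is a directory (commonly a redirect to the trailing-slash URL) ends up with the first form.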

Hope that helps!
John

PS: There are very many sites out there that are not valid HTML, but that is no reason to do this for your site as well. You want as many good points in your favor as possible for the search engines; get to work :)

Edit: The meta-tag checker you posted doesn't respect subdirectories at all. Don't use it :)

_________________
Try the GSiteCrawler for Google Sitemap files!