10 Ways to Increase Pages Indexed
Or how to make Google pay more attention

For a while now, webmasters have fretted over why not all of the pages on their websites are indexed. As usual, there doesn't seem to be any definite answer. But some things are definite, if not automatic, and some things seem like pretty darn good guesses.
So, we scoured the forums, blogs, and Google's own guidelines for advice on increasing the number of pages Google indexes, and came up with our (and the community's) best guesses. The running consensus is that a webmaster shouldn't expect to get all of their pages crawled and indexed, but there are ways to increase the number.
PageRank
It depends a lot on PageRank. The higher your PageRank, the more of your pages will be indexed. PageRank isn't a blanket number for all your pages; each page has its own PageRank. A high PageRank gives the Googlebot more of a reason to return. Matt Cutts confirms, too, that a higher PageRank means a deeper crawl.
Links
Give the Googlebot something to follow. Links (especially deep links) from a high-PageRank site are golden, as the trust is already established.
Internal links can help, too. Link to important pages from your homepage. On content pages, link to relevant content on other pages.
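One way to check whether important pages are actually close to the homepage is to measure click depth over the internal link graph. The sketch below is only an illustration: the `site_links` map and its paths are hypothetical, and a real audit would build the map from a crawl of the site.

```python
from collections import deque

def click_depths(links, home):
    """Return the minimum number of clicks from `home` to each reachable page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Hypothetical site structure: each page maps to the pages it links to.
site_links = {
    "/": ["/products", "/blog"],
    "/products": ["/products/widget"],
    "/blog": ["/blog/post-1"],
    "/blog/post-1": ["/products/widget"],
    "/orphan": ["/"],
}

depths = click_depths(site_links, "/")
# Pages that no chain of internal links from the homepage reaches.
unreachable = sorted(set(site_links) - set(depths))
```

Pages with a large depth, or pages that land in `unreachable`, are candidates for a link from the homepage or from related content pages.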
Sitemap
There is a lot of buzz around this one. Some report that a clear, well-structured Sitemap helped get all of their pages indexed. Google's Webmaster guidelines recommend submitting a Sitemap file, too:
· Tell us all about your pages by submitting a Sitemap file; help us learn which pages are most important to you and how often those pages change.
That page has other advice for improving crawlability, like fixing violations and validating robots.txt.
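As a rough illustration of validating robots.txt, the sketch below uses Python's standard `urllib.robotparser` to list which URLs a given set of rules would block for the Googlebot. The rules and URLs are made-up examples, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt rules disallow for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]

urls = [
    "https://www.example.com/index.html",
    "https://www.example.com/private/report.html",
]
blocked = blocked_urls(ROBOTS_TXT, urls)
```

If a page you want indexed shows up in `blocked`, the robots.txt rules are the first thing to fix.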
Some recommend having a separate Sitemap for every category or section of a site.
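A minimal sketch of what that could look like, assuming the XML format from the sitemaps.org protocol and grouping pages by the first segment of their path. The `PAGES` list, the example.com URLs, and the "root" section name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(pages):
    """Build one Sitemap XML document (as a string) from a list of page dicts."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]
        # Optional hints: how often the page changes and how important it is.
        for field in ("lastmod", "changefreq", "priority"):
            if field in page:
                ET.SubElement(url, field).text = str(page[field])
    return ET.tostring(urlset, encoding="unicode")

def sitemaps_by_section(pages):
    """Group pages by their first path segment and build one Sitemap per group."""
    sections = {}
    for page in pages:
        path = urlsplit(page["loc"]).path
        section = path.strip("/").split("/")[0] or "root"
        sections.setdefault(section, []).append(page)
    return {name: build_sitemap(group) for name, group in sections.items()}

# Hypothetical pages for illustration.
PAGES = [
    {"loc": "https://www.example.com/", "changefreq": "daily", "priority": 1.0},
    {"loc": "https://www.example.com/blog/post-1", "changefreq": "monthly"},
    {"loc": "https://www.example.com/blog/post-2", "changefreq": "monthly"},
]

sitemaps = sitemaps_by_section(PAGES)
```

Each value in `sitemaps` could then be written to its own file and submitted through Google's Webmaster tools.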
Speed
A recent O'Reilly report indicated that page load time and the ease with which the Googlebot can crawl a page may affect how many pages are indexed. The logic is that the faster the Googlebot can crawl, the greater the number of pages that can be indexed.
This could involve simplifying the structure and/or navigation of the site. The spiders have difficulty with Flash and Ajax, so a text version should be added in those instances.
Google's crawl caching proxy
Matt Cutts provides diagrams of how Google's crawl caching proxy works on his blog. This was part of the Big Daddy update to make the engine faster. Any one of three indexes may crawl a site and send the information to a remote server, which is then accessed by the remaining indexes (like the blog index or the AdSense index), so the bots for those indexes don't have to visit your site themselves. They all use the mirror instead.
Verify
Verify the site with Google using the Webmaster tools.
Content, content, content
Make sure content is original. If a page is a verbatim copy of another page, the Googlebot may skip it.
Update frequently. This will keep the content fresh. Pages with an older timestamp might be viewed as static, outdated, or already indexed.
Staggered launch
Launching a huge number of pages at once could send off spam signals. One forum suggests that a webmaster launch a maximum of 5,000 pages per week.
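As a simple illustration of staggering, the sketch below splits a list of new pages into weekly batches capped at the 5,000 figure from that forum suggestion. The cap is a community guess rather than a documented Google limit, and the page paths are hypothetical.

```python
def weekly_batches(urls, max_per_week=5000):
    """Split urls into consecutive batches of at most max_per_week items."""
    return [urls[i:i + max_per_week] for i in range(0, len(urls), max_per_week)]

# Hypothetical launch of 12,000 new pages.
new_pages = [f"/catalog/item-{n}" for n in range(12000)]
schedule = weekly_batches(new_pages)
batch_sizes = [len(batch) for batch in schedule]  # [5000, 5000, 2000]
```

Under these assumptions, the launch would be spread across three weeks.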
Size matters
If you want tens of millions of pages indexed, your site will probably have to be on an Amazon.com or Microsoft.com level.
Know how your site is found, and tell Google
Find the top queries that lead to your site, and remember that anchor text helps in links. Use Google's tools to see which of your pages are indexed and whether there are violations of some kind. Specify your preferred domain so Google knows what to index.
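Alongside the preferred-domain setting, it can help to make your own internal links point consistently at that domain. The sketch below rewrites URLs from known aliases to a single preferred host; the host names are assumptions, and the function is only an illustration, not part of any Google tool.

```python
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.example.com"               # assumed preferred domain
ALIASES = ("example.com", "www.example.com")     # assumed hosts serving the same site

def to_preferred(url, preferred_host=PREFERRED_HOST, aliases=ALIASES):
    """Rewrite url to use the preferred host if its host is a known alias."""
    parts = urlsplit(url)
    if parts.hostname in aliases:
        # Note: replacing netloc also drops any explicit port or credentials.
        parts = parts._replace(netloc=preferred_host)
    return urlunsplit(parts)

internal = to_preferred("http://example.com/about?ref=nav")
external = to_preferred("http://other.example.org/page")
```

Here `internal` becomes `http://www.example.com/about?ref=nav`, while `external` is left unchanged because its host is not in the alias list.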
This article was originally published by seo部落; please credit the source URL when reposting.