Change In The DNA 11: Google, Yahoo, MSN Unite On Support For Nofollow Attribute For Links
In the first cooperative move for nearly ten years, the major search engines have unveiled a new indexing command for web authors that they all recognize, one that they hope will help reduce the link and comment spam that plagues many web sites, especially those run by bloggers.

The new "nofollow" attribute that can be associated with links originated as an idea at Google several weeks ago and was pitched to MSN and Yahoo, as well as major blogging vendors, gaining support.
The Nofollow Attribute
The new attribute is called "nofollow" and is written as rel="nofollow" within an anchor tag. When added to any link, it serves as a flag that the link has not been explicitly approved by the site owner.
For example, this is how the HTML markup for an ordinary link might look:
<a href="http://www.site.com/page.html">Visit My Page</a>This is how the link would look after the nofollow attribute has been added, with the attribute portion shown in bold
<a href="http://www.site.com/page.html" rel="nofollow">Visit My Page</a>This would also be acceptable, as order of elements within the anchor tag makes no difference:
<a rel="nofollow" href="http://www.site.com/page.html" >Visit My Page</a>Once added, the search engines supporting the attribute will understand that the link has not been vetted in some way by the site owner. Think of it as a way to flag to them, "I didn't post this link -- someone else did."
By the way, should you be one of the few using other values within the rel attribute of your links (a way to show the relationship between your page and the page you're linking to), Google advises separating the values with spaces.

For example, Google cited this page, which provides one example of multiple rel values in action, like this:
<a href="http://jane-blog.example.org/" rel="sweetheart date met">Jane</a>If you wanted to add nofollow to the existing one, you'd just put a space between it and the other attributes of sweetheart, date and met, like this:
<a href="http://jane-blog.example.org/" rel="sweetheart date met nofollow">Jane</a>Google also said upper or lower case is usage of the attribute is fine and that the creation of this new attribute is believed to meet W3C standards on markup, as they allow for anyone to create new attributes.
Causes Of Link Spam
Why would you want to use the attribute? Blog publishers, forum operators, sites with guest books and others who allow anyone to contribute in some way to their web sites have suffered when people have used these systems to spam them with links.
For search engine purposes, getting a link to your site from someone else's site can serve as a "vote" that your site is seen as good. In Googlespeak, getting a link increases the PageRank value of your page -- sometimes a tiny bit, sometimes much more.
In addition, getting a link may help better ensure that your page is indexed by the major search engines. Finally, getting a link with words you want to be found for embedded in the anchor text can help you not just be seen as popular but also help you rank better for particular words.
Here's an example of comment spam in action. I did a Google search for texas holdem comment to find some candidates and focused on this page as an illustration. From PoliPundit.com, it's a blog post from Nov. 2002 about a political development.
Below the post is the comment area. The area has been link spammed heavily -- 30 entries containing links to web sites promoting casinos, poker, dating and other topics, like this (I've removed the links):
http://www.-texas-holdem-poker.us holdem poker texas holdem poker

Comment by texas holdem poker | Email | Homepage | 12/26/2004 - 12:31 pm

Your blogg is smashing! Payday Loans http://www.payday-express.com

Comment by Payday Loans | Email | Homepage | 1/15/2005 - 4:04 am

Your blogg is full o information. HGH http://www.hgh-express.com

Comment by HGH | Email | Homepage | 1/15/2005 - 12:40 pm

Great article and great website. I wish you could update if more frequently. You're also welcome to visit my websites: Checks, Cigarette, Dating, Honda, Insurance, Las Vegas, Lawyers, Lexus, Online Poker, PDA, Toyota.

It's not just a Google problem. Do a Yahoo search, an Ask Jeeves search, or a search at MSN Search. All bring up examples of pages that contain link spam, which have been indexed by these search engines. As a result, they also might find their ranking systems impacted by the activity.
Google, nevertheless, often gets the blame -- which is why it was under the most pressure to come up with something to address the problem. The hope is that by allowing web authors to flag links in this manner, blogs, forums, guest books and other places accepting contributions will become less attractive to spammers.
What Nofollow Means
Below I'll cover what Google says it does if it sees a link with the nofollow attribute associated with it. Yahoo and MSN are likely to react in a similar fashion, though I haven't yet spoken with them to get exact details, since news of their support only just emerged.
If Google sees nofollow as part of a link, it will:
- NOT follow through to that page.
- NOT count the link in calculating PageRank link popularity scores.
- NOT count the anchor text in determining what terms the page being linked to is relevant for.
Now let's look at the impact of each action:
1) Not following the link to the page it points at means that potentially, Google might not index the page at all. As said, the more links that point at a particular page, the more likely it is that Google (and generally the other major search engines) will include that page within its index.
The nofollow attribute DOES NOT let someone prevent a page they don't actually control from being indexed, however. If Google finds even one ordinary link pointing at a page, it may then index that page.
In addition, people can submit their pages directly to Google (and most major search engines). So it's crucial to understand that just because someone might place nofollow in a link pointing at your site, this WILL NOT prevent your page from getting indexed.
2) As for PageRank calculations, it's important to remember that PageRank is a pure popularity score (other search engines have similar scoring mechanisms, just without catchy names, aside from Yahoo's Web Rank). The nofollow attribute means that a link will not be counted as a "vote" in this popularity contest. That can affect ranking in cases where factors beyond pure popularity come into play.
Huh? Say there are two pages, one with a PR score of 6, the other a PR of 7. Even though the PR7 page is more popular from a link counting point of view, it could still get outranked by the PR6 page if other factors such as the words on the page, or the anchor text pointing at the PR6 page, make it more relevant for a particular search.
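To see points 1 and 2 in action, here is a toy PageRank-style calculation in which nofollow links are simply dropped from the link graph before scores are computed. This is a simplified sketch of the general PageRank idea only, not Google's actual algorithm; the page names, link graph and damping factor are invented for the example.

```python
# Toy PageRank over a tiny link graph, ignoring links marked nofollow.
# Each link is (source_page, target_page, is_nofollow).
links = [
    ("home", "products", False),
    ("home", "blog", False),
    ("blog", "spammy-casino", True),   # comment spam link flagged nofollow
    ("products", "home", False),
]

pages = {p for link in links for p in link[:2]}

# Build the outgoing-link map using only followed links; nofollow links
# are treated as if they don't exist, so they pass along no "vote".
outlinks = {p: [] for p in pages}
for src, dst, nofollow in links:
    if not nofollow:
        outlinks[src].append(dst)

damping = 0.85
rank = {p: 1.0 / len(pages) for p in pages}

for _ in range(50):  # crude power iteration until roughly stable
    new_rank = {p: (1 - damping) / len(pages) for p in pages}
    for src, targets in outlinks.items():
        if not targets:
            continue
        share = damping * rank[src] / len(targets)
        for dst in targets:
            new_rank[dst] += share
    rank = new_rank

for page, score in sorted(rank.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 3))
# "spammy-casino" ends up with only the baseline score, because its one
# inbound link was nofollowed and therefore never counted as a vote.
```

The same logic extends to anchor text: a link that is never counted contributes neither a vote nor its words.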
It's also important to note that nofollow DOES NOT mean you are flagging a link as being bad in some way. Google isn't going to say, "Aha -- nofollow is on this link -- that's a bad link." Or as Matt Cutts, a Google software engineer who helped develop the attribute, said:
"It doesn't mean that it is a bad link, or that you that you hate it, just that this link doesn't belong to me."
Instead, nofollow effectively will cause Google to ignore the link, to pretend it doesn't exist. This also means you shouldn't worry that people will link to you and use nofollow as a way to hurt you -- Google says that won't happen.
3) This leads to anchor text. Getting the words you want to rank for embedded in the anchor text of links pointing at you is generally even more important for ranking well on a particular term. With nofollow added to a link, Google won't associate the anchor text in the link with the page the link points at. This, more than anything else, will sour things for link spammers.
Stop Spam? No. A Start, Yes!
The new attribute won't stop link spamming. Many people may still spam simply because they hope human beings will see the links, click through and perhaps convert. As with email spam, maybe only an incredibly tiny number will do so. But since there's no heavy cost to the spamming, that might still be enough.
In particular, much blog spamming is done through automation. So even with the new system in place, some of that automation will keep rolling along. It will no doubt even evolve to spot blogs and other areas that aren't making use of the nofollow attribute, just as smart spammers currently focus on blogs that have been abandoned, rather than irritating active bloggers.
This means other types of systems of blocking spam will likely still have to be used, such as forcing people to input characters from graphics (captchas), registration and so on (The Solution To Blog Spamming at ThreadWatch has a nice rundown on these, and also see Six Apart's Guide to Comment Spam).
While link or comment spamming isn't going away, it's still heartening that it will be less attractive. Site owners have been given an important new tool that lets them control indexing -- something they've not been offered in years. Perfect or not, I'm glad it's emerged.
Vendor Support
Google started developing the idea of a nofollow attribute several weeks ago and quietly shared it with a number of the major blogging vendors. Many of them have now signed on, pledging to support and implement the tag in the future, if they've not already done so now.
As a result, those using systems provided by one of the major vendors such as Blogger or Movable Type (see here for support news) should find that adding the attribute to links in comments is a matter of flipping a switch. OK, maybe clicking a radio button or drop-down box! Google provides a list of those supporting it here.
Google said it will soon begin talking with other companies, such as those that make forum software. But the makers of these or any other packages can implement support whenever they are ready.
Uses For The Attribute
The tag can be used by anyone anywhere, of course. It's not just for use with blog comment areas or forum posts. For example, Cutts said people might use it if they publish dynamically generated referrer stats and visitor information.
"Wherever it means that another person placed a link on your site, that would be appropriate," Cutts said.
Because of this, some page authoring tools will likely add support in the future, if the attribute is widely adopted, as seems likely. Some tools may allow adding it right now -- and those who know HTML can do an easy insertion.
That might be handy if you need to link to a site but are worried that a search engine might consider it a "bad neighborhood," as they've often described them. In reality, the chances are very small that the typical person might link to a site that would actually hurt them with a search engine. But if in doubt, nofollow could offer peace of mind.
Of course, those who are swapping links with other sites now have a whole new thing to look out for. If someone offers to link to you, you'll want to make sure they don't make use of the nofollow tag -- at least if you were hoping for some search engine gain. Otherwise, the link's not going to count.
Don't forget -- there are other good reasons to still get links even beyond search engines, of course. My Golden Rules Of Link Building article covers this more.
You definitely DO NOT want to use the attribute on links to your own pages. Do that, and you'll deprive your own pages of the chance to influence how your other pages rank.
Having said this, I've no doubt some people will try playing with the new tag as a means to "hoard" PageRank, passing it on to only a few pages of a site. For example, your home page might link to 25 of your internal pages. Using the new attribute, you could exclude all but five of these pages. Do that, and you might possibly cause Google to give those five pages more credit (see the Link Building & Link Analysis article for Search Engine Watch members for more about this).
Maybe. Perhaps. And perhaps the search engines may make other changes down the line. Rather than get tricky with this tag, I'd recommend using it as intended for now -- as a means to flag that there are certain links on your web site that you didn't place there.
Support From Other Search Engines
How about the other search engines? MSN and Yahoo are onboard. In fact, Yahoo beat Google out of the gate in blogging its support of the new tag first. A Defense Against Comment Spam offers a few details, an example and news that the change will be implemented in the coming weeks.
As for MSN, Working Together Against Blog Spam explains how the company made a snap decision today to support the tag, though the idea was something it had considered during its Search Champs meetings with bloggers and search marketers several months ago. It promises that its crawler will begin respecting the attribute in the coming weeks.
Google, of course, has been onboard from the start. It provides more details on its blog in Preventing comment spam.
So how about Ask Jeeves, the remaining major crawler? They're still looking at the new option and weighing it up.
"We'll consider it for the future, but because we use local [link] popularity and not global popularity, we are not going to rush into anything today. It has more impact for Google and Yahoo because of their similar methodologies. The upside for us is much more modest," Lanzone said.
By local popularity, Ask Jeeves is referring to how its Teoma search engine calculates the popularity of pages and does ranking only after culling a subset of pages deemed relevant, rather than looking at all links from across the entire web. My Make Room For Teoma article explains this more.
More Info
Google To Add "Nofollow" Tagging Of Links To Fight Spam? is where I explain more about how the news of the new attribute emerged, plus provides some background on the difference between it and the nofollow attribute of the meta robots tag.
Comment Spam? How About An Ignore Tag? How About An Indexing Summit! is my post wishing for an "ignore" tag similar to what's emerged here and how others have been wishing for this even longer.
It also looks at how it has been literally years since we've had an advancement in the type of indexing control given to site owners. This new attribute -- whether you love the idea or hate it -- is a welcome move for at least giving site owners themselves some choice in the matter.
The New Nofollow Link Attribute is a thread in our forums where you can discuss the new attribute.
Source: http://searchenginewatch.com/article/2062985/Google-Yahoo-MSN-Unite-On-Support-For-Nofollow-Attribute-For-Links
Change In The DNA 12: Google's Feb. 2005 Update
It's not your imagination. If you've noticed some changes to Google's web search results over the past week, there has indeed been an update to how those results are generated.

The changes have kept various search forums buzzing with discussion over what's happened. It's not on the level of the big upset caused by the December 2003 "Florida" update, but this February 2005 update is arguably the most talked about one since then.
"Things are always changing because we're always looking for ways to improve our algorithms and scoring. Most of the changes are not due to nofollow, but we are already starting to see a positive impact from the adoption of nofollow," said Google software engineer Matt Cutts.
Nofollow refers to the recent nofollow attribute that was introduced last month as a way for bloggers and others to help combat link spam.
So what were the changes? As is traditional for Google and the other major search engines, no specifics were provided.
Major Google Changes: Latent Semantic Analysis? from our forums recounts one popular bit of speculation that first drew a lot of attention last year. Is Google doing some type of analysis of pages to understand overall topics of what they might be about -- meanings beyond the keywords on the page itself? Latent Semantic Indexing (LSI) from our forum and Google Latent Semantic Indexing Technology from Aaron Wall provide some further background on the topic.
In general, when I looked at LSI as raised this time last year, my feeling was that for most site owners, even if it was being used, you'd have little control over influencing it. As long as you have pages rich in content, descriptive of the types of products, services or information you offer, your pages would already tap into meaning beyond keywords, if such analysis were happening.
As said, forum threads are hopping, if you want to contribute to discussions or read the speculation of others:
- What's Going On With Google: Feb. 2005 Update summarizes key threads on the topic in our forums.
- Google Assigning Less Weight to Links? from Search Engine Roundtable recaps chatter on other forums.
- Over at WebmasterWorld, you might check out Serious Google Update Algo Analysis Thread and if brave, dive into the multipage, multipart Update Allegra - Google Update 2-2-2005 and Update Allegra Part 2 Update 2-2-2005 discussions.
- At HighRankings, see the Big Change In Google Serps and Tweak Or Not For The New Google Update threads.
If you want to send feedback to Google about the update, keep the following in mind:

- This is not a way to get indexing support. If you need that type of support, visit Google's webmaster info section for answers to many questions. You can also send email describing your problem to webmaster@google.com, though responses are not guaranteed. Alternatively, try visiting our Google Web Search Forum where members offer advice to each other.
- Google's engineers will see the messages, review them and make any changes they think may help the index overall.
- If you love something, tell Google what search query you did and the page or pages you think shined.
- If you hate something, again -- tell Google the query you did and the page or pages that were disappointing.
"We do in-depth testing of the changes we make to ensure that we're improving our relevancy and results," Cutts said.
One last bit of advice. Updates like this are sometimes followed by a week or two of further significant tweaking. Google is always making changes to its ranking systems as a matter of course, but these may be more pronounced through the rest of this month -- which is one reason not to start running around immediately changing things if you've had a ranking drop. As tweaks happen, you might find rankings return without any action on your part.
Source: http://searchenginewatch.com/article/2047678/Googles-Feb.-2005-Update
Search gets personal
June 29, 2005
With the launch of Personalized Search, you can use that search history you've been building to get better results. You probably won't notice much difference at first, but as your search history grows, your personalized results will gradually improve.
This whole concept of personalization has been a big part of our lives since some of the team was in grad school at Stanford. We shared an office, which happened to be the same one Sergey had used before, and we were pretty familiar with the research he and Larry had done. Related to their work, we thought building a scalable system for personalizing search results presented an interesting challenge. We've still got a long way to go, but we're excited to release this first step. So check out this latest addition to Google Labs and tell us what you think.
Source: http://googleblog.blogspot.ro/2005/06/search-gets-personal.html
Change In The DNA 13: Google Relaunches Personal Search - This Time, It Really Is Personal
Google has released a new version of Google Personalized Search, this time in a format intended to constantly monitor what people select from search results and shape the results of future queries based on their choices.

The new service is linked to the My Search History feature that Google unveiled last April (see our Google My Search History Personalizes the Web for more on the feature). Google Personalized Search uses My Search History data to refine your results based on your searching habits.
The service hasn't been formally rolled out via Google Labs, something that should happen later today. But it is starting to show up in search results pages for some people, as Dirson's spotted here and here, with a screenshot here.
When it does appear, you should be able to access it here: Google Personalized Search. I can reach that page myself, but it currently generates errors if I try to do a web search. Similarly, the Personalized Search help area has yet to go up.
Here's what I can tell you so far. Google hasn't explained exactly how the My Search History data is used. The service is literally brand new, and I'll be doing a follow-up to hopefully provide more details later in the day. However, it's pretty likely that a profile of what you like is created based on the pages you visit via the search results, rather than the actual searches you do.
Huh? Google gives an example (not yet posted live) that says:
For the query [bass], Google Personalized Search may show the user results about the instrument and not the fish if that person was a frequent Google searcher for music information

How would Google know you are a frequent music information searcher? It could monitor the types of queries you do and use various methods to tell if you seem to be searching for music information often. But another method -- and one using technology Google has already demonstrated -- is to monitor what you click on in the results.
(FYI, a Google patent on personalization based on bookmarks that recently came to light is covered in this SEW Forums thread and in great depth in this Cre8asite thread. Another recently discussed patent also covers things like using clickthrough measurements to refine results. In addition, Google has personalization technologies and patents from past acquisitions, such as Outride).
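Google hasn't said how the profile behind Personalized Search is actually built, but the general click-based idea can be sketched in a few lines. The code below is my guess at the overall shape only: it keeps a running count of the topic categories a user clicks on, then nudges results in those topics up the list. The topic labels, scores and boost formula are invented for illustration and are not Google's.

```python
from collections import Counter

class ClickProfile:
    """Illustrative only: a per-user topic profile built from result clicks."""

    def __init__(self):
        self.topic_clicks = Counter()

    def record_click(self, result_topic):
        # Every time the user clicks a result, remember its topic.
        self.topic_clicks[result_topic] += 1

    def personalize(self, results):
        """Re-order (title, topic, base_score) results toward clicked topics."""
        total = sum(self.topic_clicks.values()) or 1

        def score(result):
            title, topic, base_score = result
            boost = self.topic_clicks[topic] / total  # 0..1 affinity for the topic
            return base_score * (1.0 + boost)

        return sorted(results, key=score, reverse=True)

profile = ClickProfile()
for _ in range(5):
    profile.record_click("music")          # a frequent music searcher

results_for_bass = [
    ("Bass fishing tips", "fishing", 0.90),
    ("Bass guitar lessons", "music", 0.85),
]
print(profile.personalize(results_for_bass))
# The music page now outranks the fishing page for the query [bass].
```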
Google Personalized Search 1.0 - Pages By Topic
The previous incarnation of Google Personalized Search that opened last year let you create a profile used to customize your results. By selecting categories, you could tell Google you were interested in things like movies, radio and music. Then by using a slider, you could "personalize" your results to skew them toward your particular interest areas.
More about how that service operates is covered in my past review of it, Google Gains Personalized Search Results. In it, I explained that Google was classifying pages across the web into topics. The "personal" results were simply those skewed more toward the topics areas you were interested in, a profile you had to manually create.
In the new system, a profile is created automatically. As said, exactly how isn't explained yet by Google. But almost certainly, measuring what you click on and then skewing your results over time to favor sites that fall into particular topic areas is part of it.
Turning Off Personalization
What if you don't want the skew? There's a "Turn OFF Personalized Search for these results" link on the search results page you can use. If you see that link, it's a sign that the results HAVE been personalized. No link, then no personalization happened. And if you get the link, clicking on it will bring back regular results on a per result basis.
Want regular results all the time? You'll need to sign out of Google, if you've signed in. Signed in? When did I sign in! You log into Google any time you use a service ranging from Gmail to Google Sitemaps. Anything that requires a Google Accounts sign-on logs you into Google. And if you activated My Search History, then it gets switched on along with personalized results once you've logged into ANY Google Accounts service.
Pausing Search History Recording
Understanding that signing-in automatically activates these features is important. Google's My Search History and Privacy from The Unofficial Google Weblog from a few weeks ago explains how some might not have realized that My Search History went active just because they went to check their email. So be aware. If you've enabled My Search History, it's going to switch on if you log into most anything at Google.
Dislike that? You don't have to sign up for My Search History, of course. Not signing up will block the recording of your searches and the personalization now happening. But you can also pause the service.
When I tested today, a pause was retained even after signing out and back in. It's an easy way to stop your history from being recorded unless you specifically want it to be. But remember, pausing will not stop personalization from happening. If you have any recorded search history at all, then Google will try to personalize your results whenever you are logged in. There's no "Pause Personalization" as of yet.
Finally -- SEO Faces A Thousand Fronts
When Eurekster kicked off round two of search personalization last year (why round one died in 1999 is covered here), I explained in my review for Search Engine Watch members that personalization was appealing to search engines as a spam fighting tactic:
Link analysis itself is facing problems. Link spammers and others overtly manipulate links. Links are also created naturally in ways much different than in the past, polluting their usefulness in search. Personalization poses a potential next leap forward -- and clickthrough measurement can provide that.

Since then, we've seen the major search engines add search history features but not actual personalization of results, as I explained last October in my article for Search Engine Watch members, Search Personalization: A Marketer's Perspective.
In addition, past uses of clickthrough measurements never delivered personalized results by default. Anyone was allowed to influence the results that everyone else saw. In Eurekster's system, only those within your search network can directly influence you. This effectively creates hundreds, thousands and even millions of different possible results for the same search.
Click spammers suddenly face many different "fronts" in the war to be in the top ten, and they only get to fight in that war by invitation -- if someone they know asks them to be part of their network. Eurekster assumes "friends don't spam friends," and it's a pretty safe assumption.
That was written when Yahoo's "My Web" search personalization features came out, including the ability for searchers to block sites and the issues and workarounds this poses for site owners. Now that Yahoo's My Web is offered to anyone as part of the regular search experience, search marketers are taking more notice of the "Block" and "Save" features that appear next to every page listed.
And so they should. While these features don't rerank results yet, the Block can certainly make pages disappear. In addition, the data could be used at any time as a way for Yahoo to decide what users may like or dislike. In fact, that Search Personalization: A Marketer's Perspective article covered how this was something Yahoo said it was considering.
Since then, Yahoo's dropped heavy hints that it will create a social search service where communities may reshape results in different ways, as Yahoo Wanted Flickr For The Tags (& Tagging Community) from last week covers briefly. Meanwhile, Google's gone and done it. Personalized results have firmly come to the major search engines, a third generational step toward improved relevancy and the beginning of the end of everyone seeing the same results.
Will marketers find a way to spam personalized search? That remains to be seen. History so far has shown that each improvement eventually gets less effective. Heck, the Google My Search History Spam from May shows how you can spam entries easily into someone's search history at Google. It's still working. But while you can leave entries, you aren't generating clicks -- and so you aren't impacting the personalized search results. I'm sure personalization will lose some spam resistance over time, but there's no doubt it will make spamming results much harder.
Postscript: If you log out of your Google Account, then you'll see the Personalized Search home page with this text:
Personalized Search is an improvement to Google search that orders your search results based on what you've searched for before. Learning from your history of searches and search results you've clicked on, Personalized Search brings certain results closer to the top when it's clear they're most relevant to you.

Part of Personalized Search is the Search History feature, which lets you view and manage your history of past searches and the search results you've clicked on. As you build up your search history, your personalized search results will continue to improve over time.

Want to discuss? Visit our forum thread, Google Getting New Personalized Search.
Source: http://searchenginewatch.com/article/2061728/Google-Relaunches-Personal-Search-This-Time-It-Really-Is-Personal
Change In The DNA 14: Google's Cutts Says Not An Update - I Say An Update, Just Not A Dance
Matt Cutts from Google weighs in on whether the changes people are seeing at Google right now constitute an update. He says no. I say yes. What you say depends on how you want to define update :)

From Matt's blog post, What's an update?, he notes that Google is constantly updating its index with new pages, updated pages, new backlinks and new PageRank data (the link count value assigned to particular pages). But some of this isn't visible to those who might do backlink lookups. As he says:
We only export new backlinks, PageRank, or directory data every three months or so though....When new backlinks/PageRank appear, we've already factored that into our rankings quite a while ago. So new backlinks/PageRank are fun to see, but it's not an update; it's just already-factored-in data being exported visibly for the first time in a while.

He also notes that Google is constantly going through "everflux" style changes, because of new data flowing in. In fact, all major search engines have (and have had even before Google) this type of constant change to some degree. Such low-level changes aren't an update to him:
The term "everflux" is often used to describe the constant state of low-level changes as we crawl the web and rankings consequently change to a minor degree. That's normal, and that's not an updateI agree with that. I certainly wouldn't call changes of this type of low level an update either. So what IS an update? To Matt, it's a major algorithmic change:
Usually, what registers with an update to the webmaster community is when we update an algorithm (or its data), change our scoring algorithms, or switch over to a new piece of infrastructure. Technically Update Gilligan is just backlink/PageRank data becoming visible once more, not a real update. There haven't been any substantial algorithmic changes in our scoring in the last few days.

Matt links over to WebmasterWorld, which initially dubbed this an update with the name of "Gilligan" but has since retitled the thread "False Alarm."
It's not a false alarm to me. That's because I don't define an "update" solely by whether there's an algorithm change that shows massive shifts in rankings. I'd define an update to be any major, significant change to the search engine's underlying index, noticed or not. And that's what's going on right now. Google has either added a significant number of new pages to its index or significantly changed the way that it reports counts.
Moreover, the change IS getting noticed and commented upon. One person emailed me happy that he suddenly went from 5 pages to having all of his 120 or so pages indexed. Another person emailed that an SEO contract that was to be based on how "competitive" a term is in Google had to be rewritten when the counts for various words shot up, suddenly making them seemingly a much more competitive challenge. This has been an update to both of those people!
Updates also used to be called dances, as in the Google Dance. As I wrote on our SEW Forums, perhaps we should be using those terms to mean different things.
We're not having a dance right now. The results aren't radically shifting around. To have a dance, you have to have an update -- to have an update, you don't necessarily have to dance.

Want to comment, disagree, enhance or read more? Visit our forum thread, Sept. 2005 Google Index Update & Size Increase Coming?
Source: http://searchenginewatch.com/article/2061165/Googles-Cutts-Says-Not-An-Update-I-Say-An-Update-Just-Not-A-Dance
What’s an update?
What is an update? Google updates its index data, including backlinks and PageRank, continually and continuously. We only export new backlinks, PageRank, or directory data every three months or so though. (We started doing that last year when too many SEOs were suffering from “B.O.”, short for backlink obsession.) When new backlinks/PageRank appear, we’ve already factored that into our rankings quite a while ago. So new backlinks/PageRank are fun to see, but it’s not an update; it’s just already-factored-in data being exported visibly for the first time in a while.
Google also crawls and updates its index every day, so different or more index data usually isn’t an update either. The term “everflux” is often used to describe the constant state of low-level changes as we crawl the web and rankings consequently change to a minor degree. That’s normal, and that’s not an update.
Usually, what registers with an update to the webmaster community is when we update an algorithm (or its data), change our scoring algorithms, or switch over to a new piece of infrastructure. Technically Update Gilligan is just backlink/PageRank data becoming visible once more, not a real update. There haven’t been any substantial algorithmic changes in our scoring in the last few days. I’m happy to try to give weather reports when we do our update scoring/algo data though.
Um, that’s all I can think of regarding taxonomies of updates, so I guess I’ll publish it.
Source: http://www.mattcutts.com/blog/whats-an-update/
Change In The DNA 15: Google Merges Local and Maps Products
Launch of Google Local Enables Seamless Local Search and Mapping Experience From One Location

MOUNTAIN VIEW, Calif. – Oct. 6, 2005 – Google Inc. (NASDAQ: GOOG) today announced the official launch of Google Local, merging the technologies behind Google Local and Google Maps. No longer in beta in the U.S. and Canada, users can visit maps.google.com/maps to find local search and mapping information in one place.
"With today’s launch of Google Local, users will be able to go to one location to find all the local and mapping information they need," said Marissa Mayer, director, Consumer Web Products, Google Inc. "Whether it’s directions to the nearest pharmacy or reviews of nearby dim sum restaurants, we will continue to develop innovative technologies that enrich our users’ lives."
Google Local offers users access to relevant information such as integrated local search results and detailed driving directions, and includes features such as draggable maps, satellite imagery, keyboard shortcuts, and more. With mapping data combined with relevant local information from Google’s web index and business listings such as Yellow Page directories, Google Local is a comprehensive local search and mapping product.
Access to local and mapping information is a natural extension of Google’s ongoing mission to organize the world’s information and make it universally accessible and useful. Google will continue to add local search and mapping functionality to Google Local as the product evolves.
More information about Google Local can be found at maps.google.com/maps.
About Google Inc.
Google’s innovative search technologies connect millions of people around the world with information every day. Founded in 1998 by Stanford Ph.D. students Larry Page and Sergey Brin, Google today is a top web property in all major global markets. Google’s targeted advertising program provides businesses of all sizes with measurable results, while enhancing the overall web experience for users. Google is headquartered in Silicon Valley with offices throughout the Americas, Europe and Asia. For more information, visit www.google.com.

Media Contacts:
Eileen Rodriguez
eileen@google.com
650.253.4235
###
Google is a registered trademark of Google Inc. All other company and product names may be trademarks of the respective companies with which they are associated.
Source: http://googlepress.blogspot.ro/2005/10/google-merges-local-and-maps-products_06.html
A Review Of The Jagger 2 Update
Matt Cutts has some information about the Jagger Update, looks like some people are over at WMW trying to figure it out and provide us with some info. Googleguy on WMW says:

McMohan, good eyes in spotting some changes at 66.102.9.104. I expect Jagger2 to start at 66.102.9.x. It will probably stay at 1-2 data centers for the next several days rather than spreading quickly. But that data center shows the direction that things will be moving in (bear in mind that things are fluxing, and Jagger3 will cause flux as well).

Matt Cutts posted how to send feedback on Jagger1 at http://www.mattcutts.com/blog/update-jagger-contacting-google/. Matt indicates that you submit a reinclusion request to Google by putting "Jagger 1, Jagger 2, or Jagger 3" in the subject line of the email.

If you’re looking at 66.102.9.x and have new feedback on what you see there (whether it be spam or just indexing related), please use the same mechanism as before, except use the keyword Jagger2. I believe that our webspam team has taken a first pass through the Jagger1 feedback and acted on a majority of the spam reports. The quality team may wait until Jagger3 is visible somewhere before delving into the non-spam index feedback.

If things stay on the same schedule (which I can’t promise, but I’ll keep you posted if I learn more), Jagger3 might be visible at one data center next week. Folks should have several weeks to give us feedback on Jagger3 as it gradually becomes more visible at more data centers.

He also says, "Jagger1, Jagger2, and Jagger3 are mostly independent changes, but they’re occurring closely enough in time (plus they interact to some degree) that it’s clearer just to act as if they were one update for feedback purposes."

Discussion at WMW

Source: http://www.seroundtable.com/archives/002711.html
He also says, "Jagger1, Jagger2, and Jagger3 are mostly independent changes, but they’re occurring closely enough in time (plus they interact to some degree) that it’s clearer just to act as if they were one update for feedback purposes. "
Discussion at WMW
Source: http://www.seroundtable.com/archives/002711.html
Google SEO News and Discussion Forum

Dealing With Consequences of Jagger Update
Your site dropped? Lost rankings? What to do now?

reseller msg:744901 | 8:25 am on Nov 12, 2005 (gmt 0)

Hi Folks

Jagger is winding down and life must go on. If Jagger has been kind to your site, congrats. But for the rest of the fellow members who lost rankings or whose sites dropped out of the index, it's time to do some thinking and decide what to improve or change on your affected websites. Still, ethical measures are what interest me most. Some food for thought.

My site was hit by Allegra (2-3 Feb 2005) and lost 75% of my Google referrals, then was hit a second time on 22nd July 2005, ending up with only 5-10% of pre-Allegra Google referrals. My site is now back to the level of around 50% of pre-Allegra Google referrals and growing... until further. I say "until further" because who knows what the next update or "everflux" will do to my site!

Before my site returned around 19-22 Sept 2005 (very slow at the beginning), I went through my site several times over the months and did the following:

- removed duplicate pages. In my case it was several testing pages (even back to 1997) which I had just forgotten on the server.
- removed one or two 100% frame pages.
- removed some pre-sell affiliate program pages with content provided entirely by affiliate program vendors.
- removed a few (affiliate referral) outbound links which were on the menu bar of all pages (maybe we are talking about sitewide linking).
- on resource pages, reduced the outbound links to fewer than 100.
- made a 301 redirect from non-www to www (thanks to my good Norwich friend Dayo-UK).
- finally filed a reinclusion request in accordance with the guidelines posted on Matt's blog (thanks Mr. Inigo).

Would you be kind enough to tell us how the Jagger update affected your site, and what you intend to do about it?

Thanks!
Change In The DNA 16: Indexing timeline
Fair enough. Some people don’t want to read the whole mind-numbingly long post while their eyes glaze over. For those people, my short summary would be two-fold. First, I believe the crawl/index team certainly has enough machines to do its job, and we definitely aren’t dropping documents because we’re “out of space.” The second point is that we continue to listen to webmaster feedback to improve our search. We’ve addressed the issues that we’ve seen, but we continue to read through the feedback to look for other ways that we could improve.
People have been asking for more details on “pages dropping from the index” so I thought I’d write down a brain dump of everything I knew about, to have it all in one place. Bear in mind that this is my best recollection, so I’m not claiming that it’s perfect.
Bigdaddy: Done by March
- In December, the crawl/index team were ready to debut Bigdaddy, which was a software upgrade of our crawling and parts of our indexing.
- In early January, I hunkered down and wrote tutorials about url canonicalization, interpreting the inurl: operator, and 302 redirects. Then I told people about a data center where Bigdaddy was live and asked for feedback.
- February was pretty quiet as Bigdaddy rolled out to more data centers.
- In March, some people on WebmasterWorld started complaining that they saw none of their pages indexed in Bigdaddy data centers, and were more likely to see supplemental results.
- On March 13th, GoogleGuy gave a way for WMW folks to give example sites.
- After looking at the example sites, I could tell the issue in a few minutes. The sites that fit “no pages in Bigdaddy” criteria were sites where our algorithms had very low trust in the inlinks or the outlinks of that site. Examples that might cause that include excessive reciprocal links, linking to spammy neighborhoods on the web, or link buying/selling. The Bigdaddy update is independent of our supplemental results, so when Bigdaddy didn’t select pages from a site, that would expose more supplemental results for a site.
- I worked with the crawl/index team to tune thresholds so that we would crawl more pages from those sorts of sites.
- By March 22nd, I posted an update to let people know that we were crawling more pages from those sorts of sites. Over time, we continued to boost the indexing even more for those sites.
- By March 29th, Bigdaddy was fully deployed and the old system was turned off. Bigdaddy has powered our crawling ever since.
Considering the amount of code that changed, I consider Bigdaddy pretty successful in that I only saw two complaints. The first was one that I mentioned, where we didn’t index pages from sites with less trusted links, and we responded and started indexing more pages from those sites pretty quickly. The other complaint I heard was that pages crawled by AdSense started showing up in our web index. The fact that Bigdaddy provided a crawl caching proxy was a deliberate improvement in crawling and I was happy to describe it in PowerPoint-y detail on the blog and at WMW Boston.
Okay, that’s Bigdaddy. It’s more comprehensive, and it’s been visible since December and 100% live since March. So why the recent hubbub? Well, now that Bigdaddy is done, we’ve turned our focus to refreshing our supplemental results. I’ll give my best recollection of that timeline too. Around the same time, there was speculation that our machines are full. From my personal perspective in the quality group, we certainly have enough machines to crawl/index/serve web results; in fact, Bigdaddy is more comprehensive than our previous system. Seems like a good time to throw in a link to my disclaimer right here to remind people that this is my personal take.
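Purely as a thought experiment -- this is in no way Google's actual system -- the thresholding described above can be pictured as a per-site crawl budget that scales with how much trust a site's links earn; raising the floor on that budget is roughly what "tuning thresholds" would look like in such a toy model. Every name and number below is invented.

```python
# Toy model of a trust-based crawl budget. Not Google's algorithm, just an
# illustration of "tune thresholds so that we would crawl more pages."

MIN_PAGES = 5          # hypothetical floor, raised after webmaster feedback
MAX_PAGES = 50_000     # hypothetical cap for fully trusted sites

def crawl_budget(link_trust, site_size):
    """Return how many pages of a site to crawl, given a 0..1 trust score.

    In this sketch, link_trust stands in for signals such as excessive
    reciprocal links, links to spammy neighborhoods, or link buying and
    selling, all of which would push the score down.
    """
    budget = int(MAX_PAGES * link_trust)
    return max(MIN_PAGES, min(budget, site_size))

print(crawl_budget(link_trust=0.9, site_size=20_000))   # well-linked site: 20000
print(crawl_budget(link_trust=0.01, site_size=20_000))  # heavy reciprocal linker: 500
print(crawl_budget(link_trust=0.0, site_size=20_000))   # no trusted links: floor of 5
```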
Refreshing supplemental results
Okay, moving right along. As I mentioned before, once Bigdaddy was fully deployed, we started working on refreshing our supplemental results. Here’s my timeline:
- In early April, we started showing some refreshed supplemental results to users.
- On April 13th, someone started a thread on WMW to ask about having fewer pages indexed.
- On April 24th, GoogleGuy gave a way for people to provide specifics (WebmasterWorld, like many webmaster forums, doesn’t allow people to post specific site names.)
- I looked through the feedback and didn’t see any major trends. Over the next week, I gave examples to the crawl/index team. They didn’t see any major trend either. The sitemaps team investigated until they were satisfied that it had nothing to do with sitemaps either.
- The team refreshing our supplemental results checked out feedback, and on May 5th they discovered that a “site:” query didn’t return supplemental results. I think that they had a fix out for that the same day. Later, they noticed that a difference in the parser meant that site: queries didn’t work with hyphenated domains. I believe they got a quick fix out soon afterwards, with a full fix for site: queries on hyphenated domains in supplemental results expected this week.
- GoogleGuy stopped back by WMW on May 8th to give more info about site: and get any more info that people wanted to provide.
Reading current feedback
Those are the issues that I’ve heard of with supplemental results, and those have been resolved. Now, what about folks that are still asking about fewer pages being reported from their site? As if this post isn’t long enough already, I’ll run through some of the emails and give potential reasons that I’ve seen:
- First site is a .tv about real estate in a foreign country. On May 3rd, the site owner says that they have about 20K properties listed, but says that they dropped to 300 pages. When I checked, a site: query shows 31,200 pages indexed now, and the example url they mentioned is in the index. I’m going to assume this domain is doing fine now.
- Okay, let’s check one from May 11th. The owner sent only a url, with no text or explanation at all, but let’s tackle it. This is also a real estate site, this time about an Eastern European country. I see 387 pages indexed currently. Aha, checking out the bottom of the page, I see this:
Linking to a free ringtones site, an SEO contest, and an Omega 3 fish oil site? I think I’ve found your problem. I’d think about the quality of your links if you’d prefer to have more pages crawled. As these indexing changes have rolled out, we’ve been improving how we handle reciprocal link exchanges and link buying/selling.
- Moving right along, here’s one from May 4th. It’s another real estate site. The owner says that they used to have 10K pages indexed and now they have 80. I checked out the site. Aha:
This time, I’m seeing links to mortgage sites, credit card sites, and exercise equipment. I think this is covered by the same guidance as above; if you were getting crawled more before and you’re trading a bunch of reciprocal links, don’t be surprised if the new crawler has different crawl priorities and doesn’t crawl as much.
- Someone sent in a health care directory domain. It seems like a fine site, and it’s not linking to anything junky. But it only has six links to the entire domain. With that few links, I can believe that out toward the edge of the crawl, we would index fewer pages. Hold on, digging deeper. Aha, the owner said that they wanted to kill the www version of their pages, so they used the url removal tool on their own site. I’m seeing that you removed 16 of your most important directories from Oct. 10, 2005 to April 8, 2006. I covered this topic in January 2006:
Q: If I want to get rid of domain.com but keep www.domain.com, should I use the url removal tool to remove domain.com?

A: No, definitely don’t do this. If you remove one of the www vs. non-www hostnames, it can end up removing your whole domain for six months. Definitely don’t do this. If you did use the url removal tool to remove your entire domain when you actually only wanted to remove the www or non-www version of your domain, do a reinclusion request and mention that you removed your entire domain by accident using the url removal tool and that you’d like it reincluded.

You didn't remove your entire domain, but you removed all the important subdirectories. That self-removal just lapsed a few weeks ago. That said, your site also has very few links pointing to you. A few more relevant links would help us know to crawl more pages from your site. Okay, let's read another.
- Somebody wrote about a “favorites” site that sells T-shirts. The site had about 100 pages, and now Google is showing about five pages. Looking at the site, the first problem that I see is that only 1-2 domains have any links at all to you. The person said that every page has original content, but every link that I clicked was an affiliate link that went to the site that actually sold the T-shirts. And the snippet of text that I happened to grab was also taken from the site that actually sold the T-shirts. The site has a blog, which I’d normally recommend as a good way to get links, but every link on the blog is just an affiliate link. The first several posts didn’t even have any text, and when I found an entry that did, it was copied from somewhere else. So I don’t think that the drop in indexed pages for this domain necessarily points to an issue on Google’s side. The question I’d be asking is why anyone would choose your “favourites” site instead of going directly to the site that sells T-shirts?
Closing thoughts
Okay, I’ve got to wrap up (longest. post. evar). But I wanted to give people a feel for the sort of feedback that we’re getting in the last few days. In general, several domains I’ve checked have more pages reported these days (and overall, Bigdaddy is more comprehensive than our previous index). Some folks that were doing a lot of reciprocal links might see less crawling. If your site has very few links where you’d be on the fringe of the crawl, then it’s relatively normal that changes in the crawl may change how much of your site we crawl. And if you’ve got an affiliate site, it makes sense to think about the amount of value-add that your site provides; you want to provide a reason why users would prefer your site.
In March, I was able to read feedback and identify an issue to fix in 4-5 minutes. With the most recent feedback, we did find a couple of ways that we could make site: more accurate, but despite having several teams (quality, crawl/index, sitemaps) read the remaining feedback, we’re seeing more of a grab-bag of feedback than any burning issues. Just to be clear, I’m not saying that we won’t find other ways to improve. Adam has been reading and replying to the emails and collecting domains to dig into, for example. But I wanted to give folks an update on what we were seeing with the most recent feedback.
Source: http://www.mattcutts.com/blog/indexing-timeline/