Tuesday, 31 March 2015

DNS Verification FTW

Webmaster Level: Advanced

A few weeks ago, we introduced a new way of verifying site ownership, making it easy to share verified ownership of a site with another person. This week, we bring you another new way to verify. Verification by DNS record allows you to become a verified owner of an entire domain (and all of the sites within that domain) at once. It also provides an alternative way to verify for folks who struggle with the existing HTML file or meta tag methods.

I like to explain things by walking through an example, so let's try using the new verification method right now. For the sake of this example, we'll say I own the domain example.com. I have several websites under example.com, including http://www.example.com/, http://blog.example.com/ and http://beta.example.com/. I could individually verify ownership of each of those sites using the meta tag or HTML file method. But that means I'd need to go through the verification process three times, and if I wanted to add http://customers.example.com/, I'd need to do it a fourth time. DNS record verification gives me a better way!

First I'll add example.com to my account, either in Webmaster Tools or directly on the Verification Home page.


On the verification page, I select the "Add a DNS record" verification method, and follow the instructions to add the specified TXT record to my domain's DNS configuration.
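As a rough illustration, the resulting entry in my DNS zone might look something like this (the token here is made up for this example; always copy the exact record shown on the verification page):

    example.com.   3600   IN   TXT   "google-site-verification=abcdefg1234567example"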



When I click "Verify," Google will check for the TXT record, and if it's present, I'll be a verified owner of example.com and any associated websites and subdomains. Now I can use any of those sites in Webmaster Tools and other verification-enabled Google products without having to verify ownership of them individually.

If you try DNS record verification and it doesn't work right away, don't despair!


Sometimes DNS records take a while to make their way across the Internet, so Google may not see them immediately. Make sure you've added the record exactly as it’s shown on the verification page. We'll periodically check, and when we find the record we'll make you a verified owner without any further action from you.
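If you'd like to check propagation yourself, here's a minimal sketch using the third-party dnspython package (an assumption on my part: that it's installed; the record value will be whatever the verification page gave you):

    import dns.resolver  # third-party package: dnspython

    # List the TXT records currently visible to your resolver, so you can
    # confirm the verification record has propagated.
    for record in dns.resolver.resolve("example.com", "TXT"):
        print(record.to_text())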

DNS record verification isn't for everyone—if you don't understand DNS configuration, we recommend you continue to use the HTML file and meta tag methods. But for advanced users, this is a powerful new option for verifying ownership of your sites.

As always, please visit the Webmaster Help Forum if you have any questions.

Monday, 30 March 2015

URL removal explained, Part I: URLs & directories

Webmaster level: All

There's a lot of content on the Internet these days. At some point, something may turn up online that you would rather not have out there—anything from an inflammatory blog post you regret publishing, to confidential data that accidentally got exposed. In most cases, deleting or restricting access to this content will cause it to naturally drop out of search results after a while. However, if you urgently need to remove unwanted content that has gotten indexed by Google and you can't wait for it to naturally disappear, you can use our URL removal tool to expedite the removal of content from our search results as long as it meets certain criteria (which we'll discuss below).

We've got a series of blog posts lined up for you explaining how to successfully remove various types of content, and common mistakes to avoid. In this first post, I'm going to cover a few basic scenarios: removing a single URL, removing an entire directory or site, and reincluding removed content. I also strongly recommend our previous post on managing what information is available about you online.

Removing a single URL

In general, in order for your removal requests to be successful, the owner of the URL(s) in question—whether that's you, or someone else—must have indicated that it's okay to remove that content. For an individual URL, this can be indicated in any of three ways: blocking the URL with robots.txt, blocking it with a noindex meta tag, or having the URL return a 404 or 410 status code.
Before submitting a removal request, you can check whether the URL is correctly blocked:
  • robots.txt: You can check whether the URL is correctly disallowed using either the Fetch as Googlebot or Test robots.txt features in Webmaster Tools.
  • noindex meta tag: You can use Fetch as Googlebot to make sure the meta tag appears somewhere between the <head> and </head> tags. If you want to check a page you can't verify in Webmaster Tools, you can open the URL in a browser, go to View > Page source, and make sure you see the meta tag between the <head> and </head> tags.
  • 404 / 410 status code: You can use Fetch as Googlebot, or tools like Live HTTP Headers or web-sniffer.net to verify whether the URL is actually returning the correct code. Sometimes "deleted" pages may say "404" or "Not found" on the page, but actually return a 200 status code in the page header; so it's good to use a proper header-checking tool to double-check.
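For example, here is a minimal Python sketch of the kind of header check described above (the URL is hypothetical; tools like Live HTTP Headers or web-sniffer.net show the same information):

    import urllib.error
    import urllib.request

    # Request only the headers and print the actual HTTP status code:
    # a page that displays "404" in its text can still return 200.
    url = "http://www.example.com/embarrassing-stuff.html"
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req) as resp:
            print(resp.status)
    except urllib.error.HTTPError as e:
        print(e.code)  # 404 or 410 is what you want to see here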
If unwanted content has been removed from a page but the page hasn't been blocked in any of the above ways, you will not be able to completely remove that URL from our search results. This is most common when you don't own the site that's hosting that content. We cover what to do in this situation in Part II of our removals series.

If a URL meets one of the above criteria, you can remove it by going to http://www.google.com/webmasters/tools/removals, entering the URL that you want to remove, and selecting the "Webmaster has already blocked the page" option. Note that you should enter the URL where the content was hosted, not the URL of the Google search where it's appearing. For example, enter
   http://www.example.com/embarrassing-stuff.html
not
   http://www.google.com/search?q=embarrassing+stuff

This article has more details about making sure you're entering the proper URL. Remember that if you don't tell us the exact URL that's troubling you, we won't be able to remove the content you had in mind.

Removing an entire directory or site

In order for a directory or site-wide removal to be successful, the directory or site must be disallowed in the site's robots.txt file. For example, in order to remove the http://www.example.com/secret/ directory, your robots.txt file would need to include:
   User-agent: *
   Disallow: /secret/

It isn't enough for the root of the directory to return a 404 status code, because it's possible for a directory to return a 404 but still serve out files underneath it. Using robots.txt to block a directory (or an entire site) ensures that all the URLs under that directory (or site) are blocked as well. You can test whether a directory has been blocked correctly using either the Fetch as Googlebot or Test robots.txt features in Webmaster Tools.
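If you'd like to sanity-check the rule locally as well, Python's standard library can parse a robots.txt file for you (a small sketch using the example.com rules above; the file name under /secret/ is made up):

    import urllib.robotparser

    # Load the live robots.txt and ask whether a crawler may fetch a URL
    # under the blocked directory.
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("http://www.example.com/robots.txt")
    rp.read()
    print(rp.can_fetch("*", "http://www.example.com/secret/plans.html"))  # expect False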

Only verified owners of a site can request removal of an entire site or directory in Webmaster Tools. To request removal of a directory or site, click on the site in question, then go to Site configuration > Crawler access > Remove URL. If you enter the root of your site as the URL you want to remove, you'll be asked to confirm that you want to remove the entire site. If you enter a subdirectory, select the "Remove directory" option from the drop-down menu.

Reincluding content

You can cancel removal requests for any site you own at any time, including those submitted by other people. In order to do so, you must be a verified owner of this site in Webmaster Tools. Once you've verified ownership, you can go to Site configuration > Crawler access > Remove URL > Removed URLs (or > Made by others) and click "Cancel" next to any requests you wish to cancel.

Still have questions? Stay tuned for the rest of our series on removing content from Google's search results. If you can't wait, much has already been written about URL removals, and troubleshooting individual cases, in our Help Forum. If you still have questions after reading others' experiences, feel free to ask. Note that, in most cases, it's hard to give relevant advice about a particular removal without knowing the site or URL in question. We recommend sharing your URL by using a URL shortening service so that the URL you're concerned about doesn't get indexed as part of your post; some shortening services will even let you disable the shortcut later on, once your question has been resolved.

Edit: Read the rest of this series:
Part II: Removing & updating cached content
Part III: Removing content you don't own
Part IV: Tracking requests, what not to remove

Companion post: Managing what information is available about you online

BlogHer 2007: Building your audience

Last week, I spoke at BlogHer Business about search engine optimization issues. I presented with Elise Bauer, who talked about the power of community in blogging. She made great points about the linking patterns of blogs. Link out to sites that would be relevant and useful for your readers. Comment on blogs that you like to continue the conversation and provide a link back to your blog. Write useful content that other bloggers will want to link to. Blogging connects readers and writers and creates real communities where valuable content can be exchanged. I talked more generally about search and a few things you might consider when developing your site and blog.

Why is search important for a business?
With search, your potential customers are telling you exactly what they are looking for. Search can be a powerful tool to help you deliver content that is relevant and useful and meets your customers' needs. For instance, do keyword research to find out the most common types of searches that are relevant to your brand. Does your audience most often search for "houses for sale" or "real estate"? Check your referrer logs to see what searches are bringing visitors to your site (you can find a list of the most common searches that return your site in the results from the Query stats page of webmaster tools). Does your site include valuable content for those searches? A blog is a great way to add this content. You can write unique, targeted articles that provide exactly what the searcher wanted.

How do search engines index sites?
The first step in the indexing process is discovery. A search engine has to know the pages exist. Search engines generally learn about pages from following links, and this process works great. If you have new pages, ensure relevant sites link to them, and provide links to them from within your site. For instance, if you have a blog for your business, you could provide a link from your main site to the latest blog post. You can also let search engines know about the pages of your site by submitting a Sitemap file. Google, Yahoo!, and Microsoft all support the Sitemaps protocol and if you have a blog, it couldn't be easier! Simply submit your blog's RSS feed. Each time you update your blog and your RSS feed is updated, the search engines can extract the URL of the latest post. This ensures search engines know about the updates right away.
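Besides submitting the feed in each engine's webmaster console, one simple way to point crawlers at it is a Sitemap line in your robots.txt (the feed URL below is just a placeholder for your blog's actual feed):

    Sitemap: http://blog.example.com/feeds/posts/default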

Once a search engine knows about the pages, it has to be able to access those pages. You can use the crawl errors reports in webmaster tools to see if we're having any trouble crawling your site. These reports show you exactly what pages we couldn't crawl, when we tried to crawl them, and what the error was.

Once we access the pages, we extract the content. You want to make sure that what your page is about is represented by text. What does the page look like with Javascript, Flash, and images turned off in the browser? Use ALT text and descriptive filenames for images. For instance, if your company name is in a graphic, the ALT text should be the company name rather than "logo". Put text in HTML rather than in Flash or images. This not only helps search engines index your content, but also makes your site more accessible to visitors with mobile browsers, screen readers, or older browsers.
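For instance, a logo image might be marked up along these lines (the file and company names are made up for illustration):

    <img src="acme-widgets-logo.png" alt="Acme Widgets">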

What is your site about?
Does each page have unique title and meta description tags that describe the content? Are the words that visitors search for represented in your content? Do a search of your pages for the queries you expect searchers to do most often and make sure that those words do indeed appear in your site. Which of the following tells visitors and search engines what your site is about?

Option 1
If you're plagued by the cliffs of insanity or the pits of despair, sign up for one of our online classes! Learn the meaning of the word inconceivable. Find out the secret to true love overcoming death. Become skilled in hiding your identity with only a mask. And once you graduate, you'll get a peanut. We mean it.

Option 2
See our class schedule here. We provide extensive instruction and valuable gifts upon graduation.

When you link to other pages in your site, ensure that the anchor text (the text used for the link) is descriptive of those pages. For instance, you might link to your products page with the text "Inigo Montoya's sword collection" or "Buttercup's dresses" rather than "products page" or the ever-popular "click here".
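In HTML, that difference is simply the text inside the link (the URL here is hypothetical):

    Descriptive: <a href="/swords/">Inigo Montoya's sword collection</a>
    Generic:     <a href="/swords/">click here</a>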

Why are links important?
Links are important for a number of reasons. They are a key way to drive traffic to your site: visitors of other sites can learn about your site through links to it, and you can use links to other sites to provide valuable information to your own visitors. Just as links let visitors know about your site, they also let search engines know about it. The anchor text describes what your site is about, and the number of relevant links to your pages is an indicator of how popular and useful those pages are. (You can find a list of the links to your site and the most common anchor text used in those links in webmaster tools.)

A blog is a great way to build links, because it enables you to create new content on a regular basis. The more useful content you have, the greater the chances someone else will find that content valuable to their readers and link to it. Several people at the BlogHer session asked about linking out to other sites. Won't this cause your readers to abandon your site? Won't this cause you to "leak out" your PageRank? No, and no. Readers will appreciate that you are letting them know about resources they might be interested in and will remember you as a valuable source of information (and keep coming back for more!). And PageRank isn't a set of scales, where incoming links are weighted against outgoing ones and cancel each other out. Links are content, just as your words are. You want your site to be as useful to your readers as possible, and providing relevant links is a way, just as writing content is, to do that.

The key is compelling content
Google's main goal is to provide the most useful and relevant search results possible. That's the key thing to keep in mind as you look at optimizing your site. How can you make your site the most useful and relevant result for the queries you care about? This won't just help you in the search results, which after all, are just the means to the end. What you are really interested in is keeping your visitors happy and coming back. And creating compelling and useful content is the best way to do that.

Will the Real <Your Site Here> Please Stand Up?

Webmaster Level: Intermediate



In our recent post on the Google Online Security Blog, we described our system for identifying phishing pages. Of the millions of webpages that our scanners analyze for phishing, we successfully identify 9 out of 10 phishing pages. Our classification system only incorrectly flags a non-phishing site as a phishing site about 1 in 10,000 times, which is significantly better than similar systems. In our experience, these “false positive” sites are usually built to distribute spam or may be involved with other suspicious activity. If you find that your site has been added to our phishing page list (“Reported Web Forgery!”) by mistake, please report the error to us. On the other hand, if your site has been added to our malware list (“This site may harm your computer”), you should follow the instructions here. Our team tries to address all complaints within one day, and we usually respond within a few hours.

Unfortunately, sometimes when we try to follow up on your reports, we find that we are just as confused as our automated system. If you run a website, here are some simple guidelines that will allow us to quickly fix any mistakes and help keep your site off our phishing page list in the first place.

- Don’t ask for usernames and passwords that do not belong to your site. We consider this behavior phishing by definition, so don’t do it! If you want to provide an add-on service to another site, consider using a public API or OAuth instead.

- Avoid displaying logos that are not yours near login fields. Someone surfing the web might mistakenly believe that your site is associated with the logo's owner, and could be misled into entering personal information on your site that they intended for the other site. Furthermore, we can’t always be sure that you aren’t doing this intentionally, so we might block your site just to be safe. To prevent misunderstandings, we recommend exercising caution when displaying these logos.

- Minimize the number of domains used by your site, especially for logins. Asking for a username and password for Site X looks very suspicious on Site Y. Besides making it harder for us to evaluate your website, you may be inadvertently teaching your visitors to ignore suspicious URLs, making them more vulnerable to actual phishing attempts. If you must have your login page on a different domain from your main site, consider using a transparent proxy to enable users to access this page from your primary domain. If all else fails...

- Make it easy to find links to your pages. It is difficult for us (and for your users) to determine who controls an off-domain page in your site if the links to that page from your main site are hard to find. All it takes to clear this problem up is to have each off-domain page link back to an on-domain page which links to it. If you have not done this, and one of your pages ends up on our list by mistake, please mention in your error report how we can find the link from your main site to the wrongly blocked page. However, if you do nothing else...

- Don’t send strange links via email or IM. It’s all but impossible for us to verify unusual links that only appeared in your emails or instant messages. Worse, using these kinds of links conditions your users/customers/friends to click on strange links they receive through email or IM, which can put them at risk for other Internet crimes besides phishing.

While we hope you consider these recommendations to be common sense, we’ve seen major e-commerce and financial companies break these guidelines from time to time. Following them will not only improve your experience with our anti-phishing systems, but will also help provide your visitors with a better online experience.

Saturday, 28 March 2015

An update on spam reporting

(Note: this post has been translated into English from our German blog.)

In 2006, one of our initiatives in the area of communication was to notify some webmasters in case of a violation of our Webmaster Guidelines (e.g. because they used supposedly "search engine friendly" software that generates doorway pages as an extra). Many of these good-will emails to webmasters were prompted by spam reports from our users.

We are proud of our users who alert us to potential abuses for the sake of the whole internet community. We appreciate this even more because PageRank™ (and thus Google search) is based on a democratic principle, i.e. a webmaster gives other sites a "vote" of approval by linking to them.

In 2007, as an extension and complement of this democratic principle, we want to further increase our users' awareness of webmaster practices that do or do not conform to Google's standards. Such informed users are then able to take counter-action against webspam by filing spam reports. This can start a mutually beneficial process: not only will all Google users benefit from the best possible search quality, but spammy webmasters will also realize that their attempts to unfairly manipulate their sites' rankings pay off less and less.

Our spam report forms are provided in two different flavors: an authenticated form that requires registration in Webmaster Tools, and an unauthenticated form. Currently, we investigate every spam report from a registered user. Spam reports to the unauthenticated form are assessed in terms of impact, and a large fraction of those are reviewed as well.

So, the next time you can't help thinking that the ranking of a search result was not earned by virtue of its content and legitimate SEO, then it is the perfect moment for a spam report. Each of them can give us crucial information for the continual optimization of our search algorithms.

Interested in learning more? Then find below answers to the three most frequent questions.

FAQs concerning spam reports:

Q: What happens to an authenticated spam report at Google?
A: An authenticated spam report is analyzed and then used for evaluating new spam-detecting algorithms, as well as to identify trends in webspam. Our goal is to detect all the sites engaging in similar manipulation attempts automatically in the future and to make sure our algorithms rank those sites appropriately. We don't want to get into an inefficient game of cat and mouse with individual webmasters who have reached into the wrong bag of tricks.

Q: Why are there sometimes no immediately noticeable consequences of a spam report?
A: Google is always seeking to improve its algorithms for countering webspam, but we also take action on individual spam reports. Sometimes that action will not be immediately visible to an outside user, so there is no need to submit a site multiple times in order for Google to evaluate a URL. There are different reasons that might account for a user's false impression that a particular spam report went unnoticed. Here are a few of those reasons:

  • Sometimes, Google might already be handling the situation appropriately. For example, if you are reporting a site that seems to engage in excessive link exchanging, it could be the case that we are already discounting the weight of those unearned backlinks correctly, and the site is showing up for other reasons. Note that changes in how Google handles backlinks for a site are not immediately obvious to outside users. Or it may be the case that we already deal with a phenomenon such as keyword stuffing correctly in our scoring, and therefore we are not quite as concerned about something that might not look wonderful, but that isn't affecting rankings.
  • A complete exclusion from Google's SERPs is only one possible consequence of a spam report. Google might also choose to give a site a "yellow card" so that the site cannot be found in the index for a short time. However, if a webmaster ignores this signal, then a "red card" with a longer-lasting effect might follow. So it's possible that Google is already aware of an issue and communicating with the webmaster about that issue, or that we have taken action other than a removal on a spam report.
  • Sometimes, simple patience is the answer, because it takes time for algorithmic changes to be thoroughly checked out, or for the externally displayed PageRank to be updated.
  • It can also be the case that Google is working on solving the more general instance of an issue, and so we are reluctant to take action on an individual situation.
  • A spam report may also simply have been considered unjustified. For example, this may be true of a report whose sole motivation appears to be harming a direct competitor with a better ranking.

Q: Can a user expect to receive feedback for a spam report?
A: This is a common request, and we know that our users might like verification of the reported URLs or simple confirmation that the spam report has been taken care of. Given the choice of how to spend our time, we have decided to invest our efforts in taking action on spam reports and making our algorithms more robust. But we are open to considering how to scale communication with our users going forward.

The Webmaster Academy goes international

Webmaster level: All

Since we launched the Webmaster Academy in English back in May 2012, its educational content has been viewed well over 1 million times.

The Webmaster Academy was built to guide webmasters in creating great sites that perform well in Google search results. It is an ideal guide for beginner webmasters but also a recommended read for experienced users who wish to learn more about advanced topics.

To support webmasters across the globe, we’re happy to announce that we’re launching the Webmaster Academy in 20 languages. So whether you speak Japanese or Italian, we hope we can help you to make even better websites! You can easily access it through Webmaster Central.

We’d love to read your comments here and invite you to join the discussion in the help forums.


Thursday, 26 March 2015

Tips for Eastern European webmasters

In 2006 we ramped up on international webmaster issues and particularly tried to support Eastern Europe. We opened several offices in the region, improved our algorithms with respect to these languages, and localized many of our products. If I had to find a single word to describe these markets, it would be "diverse". Still, they have two things in common: their online markets are currently in a developing phase, and a high number of webmasters and search engine optimizers work there in a variety of languages. We are aware that a certain amount of webspam is generated in this region and we would like to reinforce that we have been working hard to take action on it, both algorithmically and manually. Since I have seen some common phenomena in a number of these markets, here are a couple of suggestions for Eastern European webmasters and SEOs:
  • Avoid link exchanges. If a fellow webmaster approaches you with a sketchy offer, just refuse. Instead, work on the content of your site. Once you have quality content, you can use the buzzing blogger community and social web services in your language to earn natural linkbait. Creating good content for your language community will pay off. Help the high-quality sites in your language community and they will return the favor.
  • Use regional and geographical domains in line with their purpose. First, a sidenote for Western webmasters: some Eastern European countries, like Poland and Russia, have so-called regional or geographical domains. Imagine that every state in the U.S. had its own official second-level domain, and that if you wanted to open a webshop delivering to Kentucky, you could do it cheaply or for free on, e.g., ky.us. This could help Google serve geographically relevant search results. If you wish to sell organic soaps to people in Szczecin, do open your webshop on szczecin.pl. If you are from Kalmykia and would like to show the world the beauty of your area, go ahead and set up your Kalmykia travel site on kalmykia.ru. If you like a region, support it by hosting your site on the related regional or geographical domain. Be aware that webspam on these regional domains violates their intended use and hinders the development of your country's web culture.
  • Say no to cybersquatting! Sneakily registering strong online brands under Belarusian, Estonian or Slovak top-level domains is just bad. It will not particularly help boost the ranking of your site, and cybersquatting often brings disappointed users and legal action as side effects.
  • Think long-term. You have your share of responsibility for the development of your market. Creating quality sites that target users who search for highly specific content in your particular language will help you get your market into a more mature status -- and mature markets mean mature publisher revenue too.

Wednesday, 25 March 2015

A new opt-out tool

Webmasters have several ways to keep their sites' content out of Google's search results. Today, as promised, we're providing a way for websites to opt out of having their content that Google has crawled appear on Google Shopping, Advisor, Flights, Hotels, and Google+ Local search.

Webmasters can now choose this option through our Webmaster Tools, and crawled content currently being displayed on Shopping, Advisor, Flights, Hotels, or Google+ Local search pages will be removed within 30 days.

Tuesday, 24 March 2015

Increasing G11i Pro and HD7 internal memory

For those who have been following my G11i Pro or HD7 support threads (here and here, respectively), you may have noticed that my latest custom ROMs now support the ext2 file system (thanks to casacristo).

That means you can now use Android App2SD scripts and increase your phone's internal memory. Below are step-by-step instructions for doing exactly that.





Okay, let's start:
  • Install my latest custom ROM
  • Partition your SD card (create a FAT or FAT32 partition, an ext2 partition and optionally a swap partition)
This can be done in two different ways (be sure to back up the contents of your SD card first, as it will be completely erased):
  • Using the feature integrated into my custom recovery (for that you have to reboot into recovery again so that the phone picks up the newly flashed recovery image). Follow the instructions in this guide, in section 8.6.




    Delete the old partition on the SD card:
    Create the first partition, set it to primary and choose FAT if you have a 2GB or smaller card or FAT32 if you have 4GB or greater card:
    Then, create the second partition, set it to primary as well (very important) and choose ext2:
    Optionally, you can create a third partition for swap. In order to finish, click apply so that the partitions are created and card is formatted.
  • Boot your phone for the first time and let all userdata and Dalvik cache be created
  • Optionally, at this point you can restore some of your applications
  • Disable Quick boot (under Settings / Accessibility)


  • Install your preferred app2sd script (you can grab a modified version of ad2sdx from here, which is the script I actually recommend installing). Copy the script file into the /system/etc/init.d folder and make sure you set the proper permissions on the file; example commands are shown after this list.
 
  • Reboot the phone once after the script installation (you may notice that the SIM card is not detected, but that is perfectly normal)
  • Wait a minute or so and then reboot the phone again
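For reference, copying the script over and setting its permissions from a computer might look roughly like the following adb session (the script filename is hypothetical, and this assumes your ROM lets adb remount /system read-write):

    adb remount
    adb push 40ad2sdx /system/etc/init.d/40ad2sdx
    adb shell chmod 755 /system/etc/init.d/40ad2sdx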

Now, enjoy the increased space for lots of applications and games:


Important notes:

  • Do not try this with a slow SD card, or else phone performance will become really sluggish. For a good experience, a class 6 or faster SD card must be used.
  • Set a maximum of 1GB for the size of the ext2 partition. That should be more than enough!

Saturday, 21 March 2015

Easier management of website verifications

Webmaster level: All

To help webmasters manage the verified owners for their websites in Webmaster Tools, we’ve recently introduced three new features:

  • Verification details view: You can now see the methods used to verify an owner for your site. In the Manage owners page for your site, you can now find the new Verification details link. This screenshot shows the verification details of a user who is verified using both an HTML file uploaded to the site and a meta tag:

    Where appropriate, the Verification details will have links to the correct URL on your site where the verification can be found to help you find it faster.

  • Requiring the verification method be removed from the site before unverifying an owner: You now need to remove the verification method from your site before unverifying an owner from Webmaster Tools. Webmaster Tools now checks the method that the owner used to verify ownership of the site, and will show an error message if the verification is still found. For example, this is the error message shown when an unverification was attempted while the DNS CNAME verification method was still found on the DNS records of the domain:

  • Shorter CNAME verification string: We’ve slightly modified the CNAME verification string to make it shorter, in order to support a larger number of DNS providers. Some systems limit the number of characters that can be used in DNS records, which meant that some users were not able to use the CNAME verification method. The CNAME verification string now uses fewer characters. Existing CNAME verifications will continue to be valid.

We hope these changes make it easier for you to use Webmaster Tools. As always, please post in our Verification forum if you have any questions or feedback.

Friday, 20 March 2015

Making search-friendly mobile websites — now in 11 more languages

Webmaster level: Intermediate

As more and more users worldwide with mobile devices access the Internet, it’s fantastic to see so many websites making their content accessible and useful for those devices. To help webmasters optimize their sites we launched our recommendations for smartphones, feature-phones, tablets, and Googlebot-friendly sites in June 2012.

We’re happy to announce that those recommendations are now also available in Arabic, Brazilian Portuguese, Dutch, French, German, Italian, Japanese, Polish, Russian, Simplified Chinese, and Spanish. US-based webmasters are welcome to read the UK-English version.

We welcome you to go through our recommendations, pick the configuration that you feel will work best with your website, and get ready to jump on the mobile bandwagon!

Thanks to the fantastic webmaster-outreach team in Dublin, Tokyo and Beijing for making this possible!

Thursday, 19 March 2015

Five common SEO mistakes (and six good ideas!)

Webmaster Level: Beginner to Intermediate

To help you avoid common mistakes webmasters face with regard to search engine optimization (SEO), I filmed a video outlining five common mistakes I’ve noticed in the SEO industry. Almost four years ago, we also gathered information from all of you (our readers) about your SEO recommendations and updated our related Help Center article given your feedback. Much of the same advice from 2008 still holds true today -- here’s to more years ahead building a great site!




If you’re short on time, here’s the gist:

Avoid these common mistakes
1. Having no value proposition: Try not to assume that a site should rank #1 without knowing why it’s helpful to searchers (and better than the competition :)

2. Segmented approach: Be wary of setting SEO-related goals without making sure they’re aligned with your company’s overall objectives and the goals of other departments. For example, in tandem with your work optimizing product pages (and the full user experience once they come to your site), also contribute your expertise to your Marketing team’s upcoming campaign. So if Marketing is launching new videos or a more interactive site, be sure that searchers can find their content, too.

3. Time-consuming workarounds: Avoid implementing a hack rather than researching new features or best practices that could simplify development (e.g., changing the timestamp on an updated URL so it’s crawled more quickly instead of easily submitting the URL through Fetch as Googlebot).

4. Caught in SEO trends: Consider spending less time obsessing about the latest “trick” to boost your rankings and instead focus on the fundamental tasks/efforts that will bring lasting visitors.

5. Slow iteration: Aim to be agile rather than promote an environment where the infrastructure and/or processes make improving your site, or even testing possible improvements, difficult.
Six fundamental SEO tips
1. Do something cool: Make sure your site stands out from the competition -- in a good way!

2. Include relevant words in your copy: Try to put yourself in the shoes of searchers. What would they query to find you? Your name/business name, location, products, etc., are important. It's also helpful to use the same terms in your site that your users might type (e.g., you might be a trained “flower designer” but most searchers might type [florist]), and to answer the questions they might have (e.g., store hours, product specs, reviews). It helps to know your customers.

3. Be smart about your tags and site architecture: Create unique title tags and meta descriptions; include Rich Snippets markup from schema.org where appropriate. Have intuitive navigation and good internal links.

4. Sign up for email forwarding in Webmaster Tools: Help us communicate with you, especially when we notice something awry with your site.

5. Attract buzz: Natural links, +1s, likes, follows... In every business there's something compelling, interesting, entertaining, or surprising that you can offer or share with your users. Provide a helpful service, tell fun stories, paint a vivid picture and users will share and reshare your content.

6. Stay fresh and relevant: Keep content up-to-date and consider options such as building a social media presence (if that’s where a potential audience exists) or creating an ideal mobile experience if your users are often on-the-go.
Good luck to everyone!

Upcoming changes in Google’s HTTP Referrer

Webmaster level: all

Protecting users’ privacy is a priority for us and it’s helped drive recent changes. Helping users save time is also very important; it’s explicitly mentioned as a part of our philosophy. Today, we’re happy to announce that Google Web Search will soon be using a new proposal to reduce latency when a user of Google’s SSL-search clicks on a search result with a modern browser such as Chrome.

Starting in April, for browsers with the appropriate support, we will be using the "referrer" meta tag to automatically simplify the referring URL that is sent by the browser when visiting a page linked from an organic search result. This results in a faster time to result and a more streamlined experience for the user.
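For background, a page opts into this behavior with a meta tag along these lines (a generic illustration of the meta referrer mechanism, not necessarily the exact markup used on Google's result pages):

    <meta name="referrer" content="origin">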

What does this mean for sites that receive clicks from Google search results? You may start to see "origin" referrers—Google’s homepages (see the meta referrer specification for further detail)—as a source of organic SSL search traffic. This change will only affect the subset of SSL search referrers which already didn’t include the query terms. Non-HTTPS referrals will continue to behave as they do today. Again, the primary motivation for this change is to remove an unneeded redirect so that signed-in users reach their destination faster.

Website analytics programs can detect these organic search requests by detecting bare Google host names using SSL (like "https://www.google.co.uk/"). Webmasters will continue to see the same data in Webmaster Tools—just as before, you’ll receive an aggregated list of the top search queries that drove traffic to your site.

We will continue to look into further improvements to how search query data is surfaced through Webmaster Tools. If you have questions, feedback or suggestions, please let us know through the Webmaster Tools Help Forum.

Working with multilingual websites

Webmaster Level: Intermediate

A multilingual website is any website that offers content in more than one language. Examples of multilingual websites might include a Canadian business with an English and a French version of its site, or a blog on Latin American soccer available in both Spanish and Portuguese.

Usually, it makes sense to have a multilingual website when your target audience consists of speakers of different languages. If your blog on Latin American soccer aims to reach the Brazilian audience, you may choose to publish it only in Portuguese. But if you’d like to reach soccer fans from Argentina also, then providing content in Spanish could help you with that.

Google and language recognition


Google tries to determine the main languages of each one of your pages. You can help to make language recognition easier if you stick to only one language per page and avoid side-by-side translations. Although Google can recognize a page as being in more than one language, we recommend using the same language for all elements of a page: headers, sidebars, menus, etc.

Keep in mind that Google ignores all code-level language information, from “lang” attributes to Document Type Definitions (DTD). Some web editing programs create these attributes automatically, and therefore they aren’t very reliable when trying to determine the language of a webpage.

Someone who comes to Google and does a search in their language expects to find localized search results, and this is where you, as a webmaster, come in: if you’re going to localize, make it visible in the search results with some of our tips below.

The anatomy of a multilingual site: URL structure


There's no need to create special URLs when developing a multilingual website. Nonetheless, your users might like to identify what section of your website they’re on just by glancing at the URL. For example, the following URLs let users know that they’re on the English section of this site:

http://example.ca/en/mountain-bikes.html
http://en.example.ca/mountain-bikes.html

While these other URLs let users know that they’re viewing the same page in French:

http://example.ca/fr/mountain-bikes.html
http://fr.example.ca/mountain-bikes.html


Additionally, this URL structure will make it easier for you to analyze the indexing of your multilingual content.

If you want to create URLs with non-English characters, make sure to use UTF-8 encoding. UTF-8 encoded URLs should be properly escaped when linked from within your content. Should you need to escape your URLs manually, you can easily find an online URL encoder that will do this for you. For example, if I wanted to translate the following URL from English to French,

http://example.ca/en/mountain-bikes.html

It might look something like this:

http://example.ca/fr/vélo-de-montagne.html

Since this URL contains one non-English character (é), this is what it would look like properly escaped for use in a link on your pages:

http://example.ca/fr/v%C3%A9lo-de-montagne.html
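If you generate such links programmatically, the escaping can be done with Python's standard library (a small sketch reproducing the example above):

    from urllib.parse import quote

    # Percent-encode the UTF-8 bytes of the path; "/" is left unescaped by default.
    path = "/fr/vélo-de-montagne.html"
    print("http://example.ca" + quote(path))
    # http://example.ca/fr/v%C3%A9lo-de-montagne.html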

Crawling and indexing your multilingual website


We recommend that you do not allow automated translations to get indexed. Automated translations don’t always make sense and they could potentially be viewed as spam. More importantly, the point of making a multilingual website is to reach a larger audience by providing valuable content in several languages. If your users can’t understand an automated translation or if it feels artificial to them, you should ask yourself whether you really want to present this kind of content to them.

If you’re going to localize, make it easy for Googlebot to crawl all language versions of your site. Consider cross-linking page by page. In other words, you can provide links between pages with the same content in different languages. This can also be very helpful to your users. Following our previous example, let’s suppose that a French speaker happens to land on http://example.ca/en/mountain-bikes.html; now, with one click he can get to http://example.ca/fr/vélo-de-montagne.html where he can view the same content in French.
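In its simplest form, that cross-link is just an ordinary anchor on each page (the link text here is only an example):

    On the English page:  <a href="http://example.ca/fr/v%C3%A9lo-de-montagne.html">Français</a>
    On the French page:   <a href="http://example.ca/en/mountain-bikes.html">English</a>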

To make all of your site's content more crawlable, avoid automatic redirections based on the user's perceived language. These redirections could prevent users (and search engines) from viewing all the versions of your site.

And last but not least, keep the content for each language on separate URLs - don't use cookies to show translated versions.

Working with character encodings


Google directly extracts character encodings from HTTP headers, HTML page headers, and content. There isn’t much you need to do about character encoding, other than watching out for conflicting information - for example, between content and headers. While Google can recognize different character encodings, we recommend that you use UTF-8 on your website whenever possible.

If your tongue gets twisted...


Now that you know all of this, your tongue may get twisted when you speak many languages, but your website doesn’t have to!

For more information, read our post on multi-regional sites and stay tuned for our next post, where we'll delve into special situations that may arise when working with global websites. Until then, don't hesitate to drop by the Help Forum and join the discussion!

Monday, 16 March 2015

Site content and use of web catalogues

Sites with more content can have more opportunities to rank well in Google. It makes sense that having more pages of good content represents more chances to rank in search engine result pages (SERPs). Some SEOs, however, do not focus on the user’s needs, but instead create pages solely for search engines. This approach is based on the false assumption that increasing the volume of web pages with random, irrelevant content is a good long-term strategy for a site. These techniques are usually accomplished by abusing qlweb-style catalogues or by scraping content from sources known for good, valid content, like Wikipedia or the Open Directory Project.

These methods violate Google's webmaster guidelines. Purely scraped content, even from high quality sources, does not provide any added value to your users. It's worthwhile to take the time to create original content that sets your site apart. This will keep your visitors coming back and will provide useful search results.

In order to provide the best results possible to our Polish and non-Polish users, Google continues to improve its algorithms for validating web content.

Google is willing to take action against domains that try to rank more highly by just showing scraped or other autogenerated pages that don't add any value to users. Companies, webmasters, and domain owners who consider SEO consultation should take care not to spend time on methods which will not have worthwhile long-term results. Choosing the right SEO consultant requires in-depth background research, and their reputation and past work should be important factors in your decision.

PS: Head on over to our Polish discussion forum, where we're monitoring the posts and chiming in when we can!


Posted by Kaspar Szymanski, Search Quality

Sunday, 15 March 2015

Get a more complete picture about how other sites link to you

For quite a while, you've been able to see a list of the most common words used in anchor text to your site. This information is useful, because it helps you know what others think your site is about. How sites link to you has an impact on your traffic from those links, because it describes your site to potential visitors. In addition, anchor text influences the queries your site ranks for in the search results.

Now we've enhanced the information we provide and will show you the complete phrases sites use to link to you, not just individual words. And we've expanded the number we show to 100. To make this information as useful as possible, we're aggregating the phrases by eliminating capitalization and punctuation. For instance, if several sites have linked to your site using the following anchor text:

Site 1 "Buffy, blonde girl, pointy stick"
Site 2 "Buffy blonde girl pointy stick"
Site 3 "buffy: Blonde girl; Pointy stick."

We would aggregate that anchor text and show it as one phrase, as follows:

"buffy blonde girl pointy stick"

You can find this list of phrases by logging into webmaster tools, accessing your site, then going to Statistics > Page analysis. You can view this data in a table and can download it as a CSV file.

And as we told you last month, you can see the individual links to pages of your site by going to Links > External links. We hope these details give you additional insight into your site traffic.

Sharing advice from our site clinic

Webmaster Level: All

Members of the Google Search Quality Team have participated in site clinic panels on a number of occasions. We receive a lot of positive feedback from these events and we've been thinking of ways to expand our efforts to reach even more webmasters. We decided to organize a small, free of charge pilot site clinic at Google in Dublin, and opened the invitation to webmasters from the neighborhood. The response we received was overwhelming and exceeded our expectations.


Meet the Googlers who hosted the site clinic: Anu Ilomäki, Alfredo Pulvirenti, Adel Saoud, Fili Wiese, Kaspar Szymanski and Uli Lutz.

It was fantastic to see the large turnout and we would like to share the slides presented as well as the takeaways.

These are some questions we came across, along with the advice shared:
  1. I have 3 blogs with the same content, is that a problem?

    If the content is identical, it's likely only one of the blogs will rank for it. Also, with an effort that scattered, chances are your incoming links will be distributed across the different blogs instead of pointing to one source. Therefore you're running the risk of both users and search engines not knowing which of your blogs is the definitive source. You can mitigate that by redirecting to the preferred version or by using the cross-domain canonical to point to one source.

  2. Should I believe SEO agencies that promise to make my site rank first in Google in a few months and with a precise number of links?

    No one can make that promise; therefore the short answer is no, you should not. However, we have some great tips on how to find a trustworthy SEO in our Help Center.

  3. There are keywords that are relevant for my website, but they're inappropriate to be shown in the content e.g. because they could be misunderstood, slang or offensive. How can I show the relevance to Google?

    Depending on the topic of your site and expectations of the target group, you might consider actually using these keywords in a positive way, e.g. explaining their meaning and showing your users you're an authority on the subject. However if the words are plain abusive and completely inappropriate for your website, it's rather questionable whether the traffic resulting from these search queries is interesting for your website anyway.

  4. Would you advise to use the rewrite URL function?

    Some users may like seeing descriptive URLs in the search results. However, it's quite hard to correctly create and maintain rewrites that change dynamic URLs to static-looking URLs. That's why, generally speaking, we don't recommend rewriting them. If you still want to give it a try, please be sure to remove unnecessary parameters while maintaining a dynamic-looking URL and have a close look at our blog post on this topic. And if you don't, keep in mind that we might still make your URLs look readable in our search results no matter how weird they actually are.

  5. If I used the geo-targeting tool for Ireland, is Northern Ireland included?

    Google Webmaster Tools geo-targeting works on a country basis, which means that Northern Ireland would not be targeted if the setting was Republic of Ireland. One possible solution is to create a separate site or part of a website for Northern Ireland and to geo-target this site to the United Kingdom in Webmaster Tools.

  6. Is there any preference between TLDs like .com and .info in ranking?

    No, there is none. Our focus is on the content of the site.

  7. I have a website on a dot SO (.so) domain name with content meant for the Republic of Ireland. Will this hurt my rankings in the Irish search results?

    .so is the Internet country code top-level domain for Somalia. That is one of the factors we look at, and in this case it doesn't point to the desired target country. But we do look at a larger number of factors when ranking your website; the extension of the domain name is just one of them. Your website can still rank in the Irish search results if you have topic-specific content. However, keep in mind that it may take our algorithms a little bit longer to fully understand where to best serve your website in our search results.
We would like to thank all participants for their time and effort. It was a pleasure to help you and we hope that it was beneficial for you, too. For any remaining questions, please don't hesitate to join the community on our GWHF.

Saturday, 14 March 2015

Brand new German Webmaster Central Blog

For those German-speaking folks among our readers of this English Webmaster Central Blog we have exciting news: We have just launched the German Webmaster-Zentrale Blog! This is a tribute to the fact that the German-speaking webmaster community is our second biggest audience of this blog. The German Webmaster Blog will provide you with first-hand information tailored towards our German-speaking webmasters. The blog will contain a mix of German versions of postings from this blog as well as unique postings about market-specific issues.

So, German speakers around the world, check out this new resource for questions about indexing, ranking, quality guidelines for webmasters, and how to design websites with the user in mind. We'll also be participating in the German discussion forum, so head over there if you have questions or other things you'd like to talk about.

Don't speak German? We want to talk to webmasters all over the world, so stay tuned for more!

We created a first steps cheat sheet for friends & family


Webmaster level: beginner
Everyone knows someone who just set up their first blog on Blogger, installed WordPress for the first time, or has had a website for some time but never gave search much thought. We came up with a first steps cheat sheet for just these folks. It’s a short how-to list with basic tips on search engine-friendly design that can help Google and others better understand your content and increase your site’s visibility. We made sure it’s available in thirteen languages. Please feel free to read it, print it, share it, copy and distribute it!

We hope this content will help those who are just about to start their webmaster adventure or have so far not paid too much attention to search engine-friendly design. Over time as you gain experience you may want to have a look at our more advanced Google SEO Starter Guide. As always we welcome all webmasters and site owners, new and experienced to join discussions on our Google Webmaster Help Forum.




Friday, 13 March 2015

Video about pagination with rel=“next” and rel=“prev”

Webmaster Level: Beginner to Intermediate

If you’re curious about the rel=”next” and rel=”prev” for paginated content announcement we made several months ago, we filmed a video covering more of the basics of pagination to help answer your questions. Paginated content includes things like an article that spans several URLs/pages, or an e-commerce product category that spans multiple pages. With rel=”next” and rel=”prev” markup, you can provide a strong hint to Google that you would like us to treat these pages as a logical sequence, thus consolidating their linking properties and usually sending searchers to the first page. Feel free to check out our presentation for more information:


This video on pagination covers the basics of rel=”next” and rel=”prev” and how it could be useful for your site.


Slides from the pagination video
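
To see what the markup itself looks like, here’s a minimal sketch for a hypothetical three-page article on example.com (the URLs and the page parameter are placeholders, not a required pattern). The <link> elements go in the <head> of each component page; page 2 would contain:

  <!-- in the <head> of page 2 of the hypothetical three-page article -->
  <link rel="prev" href="http://www.example.com/article?page=1">
  <link rel="next" href="http://www.example.com/article?page=3">

Page 1 would carry only the rel="next" link, and the final page only the rel="prev" link.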

Additional resources about pagination include:
  • Webmaster Central Blog post announcing support of rel=”next” and rel=”prev”
  • Webmaster Help Center article with more implementations of rel=”next” and rel=”prev”
  • Webmaster Forum thread with our answers to the community’s in-depth questions, such as:

    Does rel=next/prev also work as a signal for only one page of the series (page 1 in most cases?) to be included in the search index? Or would noindex tags need to be present on page 2 and on?

    When you implement rel="next" and rel="prev" on component pages of a series, we'll then consolidate the indexing properties from the component pages and attempt to direct users to the most relevant page/URL. This is typically the first page. There's no need to mark page 2 to n of the series with noindex unless you're sure that you don't want those pages to appear in search results.

    Should I use the rel next/prev into [sic] the section of a blog even if the two contents are not strictly correlated (but they are just time-sequential)?

    In regard to using rel=”next” and rel=”prev” for entries in your blog that “are not strictly correlated (but they are just time-sequential),” pagination markup likely isn’t the best use of your time -- time-sequential pages aren’t nearly as helpful to our indexing process as semantically related content, such as pagination on component pages in an article or category. It’s fine if you include the markup on your time-sequential pages, but please note that it’s not the most helpful use case.

    We operate a real estate rental website. Our files display results based on numerous parameters that affect the order and the specific results that display. Examples of such parameters are “page number”, “records per page”, “sorting” and “area selection”...

    It sounds like your real estate rental site encounters many of the same issues that e-commerce sites face... Here are some ideas on your situation:

    1. It’s great that you are using the Webmaster Tools URL parameters feature to more efficiently crawl your site.

    2. It’s possible that your site can form a rel=”next” and rel=”prev” sequence with no parameters (or with default parameter values). It’s also possible to form parallel pagination sequences when users select certain parameters, such as one sequence of pages showing 15 records per page and a separate sequence when a user selects 30 records per page. Paginating component pages, even with parameters, helps us more accurately index your content (see the sketch after this list).

    3. While it’s fine to set rel=”canonical” from a component URL to a single view-all page, setting the canonical to the first page of a parameter-less sequence is considered improper usage. We make no promises to honor this implementation of rel=”canonical.”
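
To make points 2 and 3 a bit more concrete, here’s a minimal sketch under assumed URLs (example.com, the records parameter, and the view-all page are placeholders, not details from the site in question). Each component page points to its neighboring pages within the same parameter combination, and the optional canonical points to a view-all page rather than to page 1:

  <!-- in the <head> of the hypothetical URL http://www.example.com/rentals?records=30&page=2 -->
  <link rel="prev" href="http://www.example.com/rentals?records=30&amp;page=1">
  <link rel="next" href="http://www.example.com/rentals?records=30&amp;page=3">
  <!-- optional, and only if a genuine view-all page exists: -->
  <link rel="canonical" href="http://www.example.com/rentals-view-all">

Pointing rel=”canonical” from these component URLs at page 1 of a parameter-less sequence would be the improper usage described in point 3.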

Remember that if you have paginated content, it’s fine to leave it as-is and not add rel=”next” and rel=”prev” markup at all. But if you’re interested in pagination markup as a strong hint for us to better understand your site, we hope these resources help answer your questions!

Thursday, 12 March 2015

Crawl Errors: The Next Generation

Webmaster level: All

Crawl errors is one of the most popular features in Webmaster Tools, and today we’re rolling out some very significant enhancements that will make it even more useful.

We now detect and report many new types of errors. To help make sense of the new data, we’ve split the errors into two parts: site errors and URL errors.

Site Errors

Site errors are errors that aren’t specific to a particular URL—they affect your entire site. These include DNS resolution failures, connectivity issues with your web server, and problems fetching your robots.txt file. We used to report these errors by URL, but that didn’t make a lot of sense because they aren’t specific to individual URLs—in fact, they prevent Googlebot from even requesting a URL! Instead, we now keep track of the failure rates for each type of site-wide error. We’ll also try to send you alerts when these errors become frequent enough that they warrant attention.

View site error rate and counts over time

Furthermore, if you don’t have (and haven’t recently had) any problems in these areas, as is the case for many sites, we won’t bother you with this section. Instead, we’ll just show you some friendly check marks to let you know everything is hunky-dory.

A site with no recent site-level errors

URL errors

URL errors are errors that are specific to a particular page. This means that when Googlebot tried to crawl the URL, it was able to resolve your DNS, connect to your server, fetch and read your robots.txt file, and then request this URL, but something went wrong after that. We break the URL errors down into various categories based on what caused the error. If your site serves up Google News or mobile (CHTML/XHTML) data, we’ll show separate categories for those errors.

URL errors by type with full current and historical counts

Less is more

We used to show you at most 100,000 errors of each type. Trying to consume all this information was like drinking from a firehose, and you had no way of knowing which of those errors were important (your homepage is down) or less important (someone’s personal site made a typo in a link to your site). There was no realistic way to view all 100,000 errors—no way to sort, search, or mark your progress. In the new version of this feature, we’ve focused on trying to give you only the most important errors up front. For each category, we’ll give you what we think are the 1000 most important and actionable errors.  You can sort and filter these top 1000 errors, let us know when you think you’ve fixed them, and view details about them.

Instantly filter and sort errors on any column

Some sites have more than 1000 errors of a given type, so you’ll still be able to see the total number of errors you have of each type, as well as a graph showing historical data going back 90 days. For those who worry that 1000 error details plus a total aggregate count will not be enough, we’re considering adding programmatic access (an API) to allow you to download every last error you have, so please give us feedback if you need more.

We've also removed the list of pages blocked by robots.txt, because while these can sometimes be useful for diagnosing a problem with your robots.txt file, they are frequently pages you intentionally blocked. We really wanted to focus on errors, so look for information about roboted URLs to show up soon in the "Crawler access" feature under "Site configuration".

Dive into the details

Clicking on an individual error URL from the main list brings up a detail pane with additional information, including when we last tried to crawl the URL, when we first noticed a problem, and a brief explanation of the error.

Details for each URL error

From the details pane you can click on the link for the URL that caused the error to see for yourself what happens when you try to visit it. You can also mark the error as “fixed” (more on that later!), view help content for the error type, list Sitemaps that contain the URL, see other pages that link to this URL, and even have Googlebot fetch the URL right now, either for more information or to double-check that your fix worked.

View pages which link to this URL

Take action!

One thing we’re really excited about in this new version of the Crawl errors feature is that you can really focus on fixing what’s most important first. We’ve ranked the errors so that those at the top of the priority list will be ones where there’s something you can do, whether that’s fixing broken links on your own site, fixing bugs in your server software, updating your Sitemaps to prune dead URLs, or adding a 301 redirect to get users to the “real” page. We determine this based on a multitude of factors, including whether or not you included the URL in a Sitemap, how many places it’s linked from (and if any of those are also on your site), and whether the URL has gotten any traffic recently from search.

Once you think you’ve fixed the issue (you can test your fix by fetching the URL as Googlebot), users with full access permissions can let us know by marking the error as “fixed”. This will remove the error from your list. In the future, the errors you’ve marked as fixed won’t be included in the top errors list, unless we encounter the same error again when trying to re-crawl the URL.

Select errors and mark them as fixed

We’ve put a lot of work into the new Crawl errors feature, so we hope that it will be very useful to you. Let us know what you think and if you have any suggestions, please visit our forum!