Tuesday, 26 August 2014

The Impact of User Feedback, Part 2 (and more Popular Picks!)

As a follow-up to my recent post about how user reports of webspam and paid links help improve Google's search results for millions of users, I wanted to highlight one of the most essential parts of Google Webmaster Central: our Webmaster Help Group. With over 37,000 members in our English group and support in 15 other languages, the group is the place to get your questions about crawling, indexing, and Webmaster Tools answered. We're thankful for a fabulous group of Bionic Posters who have dedicated their time and energy to making the Webmaster Help Group a great place to be. When appropriate, Googlers (myself included) jump in to clarify issues or participate in the dialogue. One thing to note: we try hard to read most posts in the group, and although we may not respond to each one, your feedback and concerns help drive the features we work on. Here are a few examples:

Sitemap details
Submitting a Sitemap through Webmaster Tools is one way to let Google know which pages exist on your site. Users were quick to note that even though they submitted a Sitemap of all the pages on their site, they only found a sampling of URLs indexed through a site: search. In response, the Webmaster Tools team created a Sitemaps details page to better tell you how your Sitemap was processed. You can read a refresher about the Sitemaps details page in Jonathan's blog post.
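For anyone creating a Sitemap by hand, it's simply an XML file that lists your URLs. Here's a minimal sketch of generating one in Python; the URLs and the output file name are placeholders.

```python
# Minimal sketch: generate a basic XML Sitemap.
# The URLs below are placeholders for your site's real pages.
from xml.sax.saxutils import escape

urls = [
    "http://www.example.com/",
    "http://www.example.com/about.html",
    "http://www.example.com/products/widgets.html",
]

lines = ['<?xml version="1.0" encoding="UTF-8"?>',
         '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">']
for url in urls:
    lines.append("  <url><loc>%s</loc></url>" % escape(url))
lines.append("</urlset>")

with open("sitemap.xml", "w") as f:
    f.write("\n".join(lines))
```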

Contextual help
One request we received early on with Webmaster Tools was for better documentation of the data displayed. We saw several questions about meta description and title tag issues surfaced by our Content Analysis tool, which led us to beef up our documentation and link to the relevant Help Center article directly from that page. Similarly, we discovered that users needed clarification on the distinction between "top search queries" and "top clicked queries" and how the data can be used. We added an expandable section entitled "How do I use this data?" and placed contextual help information across Webmaster Tools to explain what each feature is and where to get more information about it.

Blog posts
The Webmaster Help Group is also a way for us to keep a pulse on what overarching questions are on the minds of webmasters so we can address some of those concerns through this blog. Whether it's how to submit a reconsideration request using Webmaster Tools, deal with duplicate content, move a site, or design for accessibility, we're always open to hearing more about your concerns in the Group. Which reminds me...

It's time for more Popular Picks!
Last year, we devoted two weeks to soliciting and answering five of your most pressing webmaster-related questions in what we called our Popular Picks. Seeing as this was a well-received initiative, I'm happy to announce that we're going to do it again. Head on over to this thread to ask your webmaster-related questions. See you there!

MT65x3 based smartphones are coming...


The first prototypes started to appear in June, but only in August did we get to see the first smartphones based on the new chipsets go on sale.

MediaTek released two versions of the new MT65x3 platform: MT6513 and MT6573. 

The first one, the MT6513, is the low-end version and can be considered the direct successor to the old MT6516. This chipset supports quad-band GSM and GPRS/EDGE data connections, and it features an ARM11™ processor running at 650 MHz as well as support for advanced 3D graphics. The high-end version, the MT6573, has all the same features as its little brother, but also integrates a 3G/HSPA modem supporting mobile broadband rates of up to 7.2 Mbps downlink and 5.76 Mbps uplink.


Judging from the several video reviews appearing on the web, these chipsets provide a good Android user experience. Personally, I have a lot of faith in these new MediaTek chipsets and will soon be reviewing an MT6513-based smartphone. Stay tuned!

Monday, 25 August 2014

#NoHacked: a global campaign to spread hacking awareness

Webmaster level: All

This June, we introduced a weeklong social campaign called #NoHacked. The goals of #NoHacked are to raise awareness of hacking attacks and to offer tips on how to keep your sites safe from hackers.

We held the campaign in 11 languages on multiple channels including Google+, Twitter and Weibo. About 1 million people viewed our tips and hundreds of users used the hashtag #NoHacked to spread awareness and to share their own tips. Check them out below!

Posts we shared during the campaign:


Some of the many tips shared by users across the globe:
  • Pablo Silvio Esquivel from Brazil recommends that users avoid pirated software (source)
  • Rens Blom from the Netherlands suggests using different passwords for your accounts, changing them regularly, and using an extra layer of security such as two-step authentication (source)
  • Дмитрий Комягин from Russia says to regularly monitor traffic sources, search queries and landing pages, and to look out for spikes in traffic (source)
  • 工務店コンサルタント from Japan advises everyone to choose a good hosting company that's knowledgeable in hacking issues and to set email forwarding in Webmaster Tools (source)
  • Kamil Guzdek from Poland advocates changing the default table prefix in wp-config to a custom one when installing a new WordPress site to lower the risk of the database being hacked (source)

Hacking is still a surprisingly common issue around the world so we highly encourage all webmasters to follow these useful tips. Feel free to continue using the hashtag #NoHacked to share your own tips or experiences around hacking prevention and awareness. Thanks for supporting the #NoHacked campaign!

And in the unfortunate event that your site gets hacked, we’ll help you toward a speedy and thorough recovery.

Friday, 22 August 2014

silver_medal_count++

Since both tennis and table tennis are in the Olympics, perhaps you're wondering: if there's soccer, why not "table soccer?" Of course, we know table soccer by another name; and while foosball may not be an Olympic sport, we still cheered Nathan Johns and Jan Backes—two members of our Search Quality team—as they brought home the foosball silver medal at the search engine foosball smackdown at SES San Jose.

"Smackdown" doesn't quite equate to "Olympics," but check out the intensity—you could hear a pin drop!

silver medalists at foosball

The gold medal (cup) went to the search engine down the road. :)

gold medalists at foosball
Yahoo's first place winners Daniel Wong and Jake Rosenberg.

Just to be sure they weren't ringers, I quizzed Daniel and Jake, "How can you prevent a file from being crawled?" They correctly answered, "robots.txt."

Gold cup well deserved.

Thursday, 21 August 2014

Hey Google, I no longer have badware

This post is for anyone who has been emailed or notified by Google about badware, received a badware warning when browsing their own site using Firefox, or has come across malware-labeled search results for their own site(s).  As you know, these warnings are produced by our automated scanning systems, which we've put in place to ensure the quality of our results by protecting our users.  Whatever the case, if you are dealing with badware, here are a few recommendations that can help you out. 

1.  If you have badware, it usually means that your web server, your website, or a database used by your website has been compromised. We have a nifty post on how to handle being hacked.  Be very careful when inspecting for malware on your site so as to avoid exposing your computer to infection.

2. Once everything is clean and dandy, you can follow the steps in our post about malware reviews via Webmaster Tools. Please note that the screenshot in that post is outdated; the new malware review form is on the Overview page and looks like this:



  • Other programs, such as Firefox, also use our badware data and may not recognize the change immediately due to their caching of the data.  So even if the badware label in search is removed, it may take some time for that to be visible in such programs.

3. Lastly, if you believe that your rankings were somehow affected by the malware, such as compromised content that violated our Webmaster Guidelines (e.g., hacked pages with hidden pharmacy text links), you should fill out a reconsideration request. To clarify: reconsideration requests are for issues stemming from violations of our Webmaster Guidelines, and are separate from malware reviews.

If you have additional questions, please review our documentation or post to the discussion group with the URL of your site. We hope you find this updated feature in Webmaster Tools useful in discovering and fixing any malware-related problems. 

Tuesday, 19 August 2014

Make your 404 pages more useful

Your visitors may stumble into a 404 "Not found" page on your website for a variety of reasons:
  • A mistyped URL, or a copy-and-paste mistake
  • Broken or truncated links on web pages or in an email message
  • Moved or deleted content
Confronted by a 404 page, they may then attempt to manually correct the URL, click the back button, or even navigate away from your site. As hinted in an earlier post for "404 week at Webmaster Central", there are various ways to help your visitors out of this dead-end situation. In our quest to make 404 pages more useful, we've just added a section in Webmaster Tools called "Enhance 404 pages". If you've created a custom 404 page, this feature allows you to embed a widget in it that helps your visitors find what they're looking for by offering suggestions based on the incorrect URL.


Example: Jamie receives the link www.example.com/activities/adventurecruise.html in an email message. Because of bad formatting by the email client, the URL is truncated to www.example.com/activities/adventur and, as a result, returns a 404 page. With the 404 widget added, however, she could instead see the following:



In addition to attempting to correct the URL, the 404 widget also suggests the following, if available:
  • a link to the parent subdirectory
  • a sitemap webpage
  • site search query suggestions and search box

How do you add the widget? Visit the "Enhance 404 pages" section in Webmaster Tools, which allows you to generate a JavaScript snippet. You can then copy and paste this into your custom 404 page's code. As always, don't forget to return a proper 404 code.
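To make this concrete, here's a minimal sketch of serving a custom 404 page that embeds such a snippet while returning a real 404 status code. It assumes Flask purely for illustration, and the script URL is a placeholder; use the actual snippet that the "Enhance 404 pages" section generates for your site.

```python
# Minimal sketch using Flask (an assumption; any server-side setup works).
from flask import Flask

app = Flask(__name__)

ERROR_PAGE = """<html>
  <head><title>Page not found</title></head>
  <body>
    <h1>Sorry, we couldn't find that page.</h1>
    <!-- Placeholder: paste the snippet generated by Webmaster Tools here -->
    <script type="text/javascript" src="PASTE-GENERATED-SNIPPET-URL-HERE"></script>
  </body>
</html>"""

@app.errorhandler(404)
def not_found(error):
    # Return the custom page *and* a real 404 status code,
    # so this never turns into a soft 404.
    return ERROR_PAGE, 404
```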

Can you change the way it looks? Sure. We leave the HTML unstyled initially, but you can edit the CSS block that we've included. For more information, check out our guide on how to customize the look of your 404 widget.

This feature is currently experimental -- we might not provide corrections and suggestions for your site but we'll be working to improve the coverage. In the meantime, let us know what you think in the comments below or in our group discussion. Thanks for helping us make the Internet a more friendly place!


Friday, 15 August 2014

More on 404

Now that we've bid farewell to soft 404s, in this post for 404 week we'll answer your burning 404 questions.

How do you treat the response code 410 "Gone"?
Just like a 404.

Do you index content or follow links from a page with a 404 response code?
We aim to understand as much as possible about your site and its content. So while we wouldn't want to show a hard 404 to users in search results, we may make use of a 404 page's content or links if doing so helps us better understand your site.

Keep in mind that if you want links crawled or content indexed, it's far more beneficial to include them in a non-404 page.

What about 404s with a 10-second meta refresh?
Yahoo! currently utilizes this method on their 404s. They respond with a 404, but the 404 content also shows:

<meta http-equiv="refresh" content="10;url=http://www.yahoo.com/?xxx">

We feel this technique is fine because it reduces confusion by giving users 10 seconds to make a new selection, only offering the homepage after 10 seconds without the user's input.

Should I 301-redirect misspelled 404s to the correct URL?
Redirecting (301-ing) 404s is a good idea when it's helpful to users (i.e. not confusing like soft 404s). For instance, if you notice that the Crawl Errors section of Webmaster Tools shows a 404 for a misspelled version of your URL, feel free to 301 the misspelled version to the correct version.

For example, if we saw this 404 in Crawl Errors:
http://www.google.com/webmsters  <-- typo for "webmasters"

we may first correct the typo if it exists on our own site, then 301 the URL to the correct version (as the broken link may occur elsewhere on the web):
http://www.google.com/webmasters
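As a minimal sketch of that fix (assuming Flask for illustration; the typo mapping mirrors the example above):

```python
from flask import Flask, redirect

app = Flask(__name__)

# Known typo paths seen in Crawl Errors, mapped to their correct URLs.
TYPO_REDIRECTS = {
    "/webmsters": "/webmasters",
}

@app.route("/<path:page>")
def serve(page):
    target = TYPO_REDIRECTS.get("/" + page)
    if target:
        # Permanent redirect: the broken link may exist elsewhere on the web.
        return redirect(target, code=301)
    return "Content for /%s" % page
```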

Have you guys seen any good 404s?
Yes, we have! (Confession: no one asked us this question, but few things are as fun to discuss as response codes. :) We've put together a list of some of our favorite 404 pages. If you have more 404-related questions, let us know, and thanks for joining us for 404 week!
http://www.metrokitchen.com/nice-404-page
"If you're looking for an item that's no longer stocked (as I was), this makes it really easy to find an alternative."
-Riona, domestigeek

http://www.comedycentral.com/another-404
"Blame the robot monkeys"
-Reid, tells really bad jokes

http://www.splicemusic.com/and-another
"Boost your 'Time on site' metrics with a 404 page like this."
-Susan, dabbler in music and Analytics

http://www.treachery.net/wow-more-404s
"It's not reassuring, but it's definitive."
-Jonathan, has trained actual spiders to build websites, ants handle the 404s

http://www.apple.com/iPhone4g
"Good with respect to usability."
http://thcnet.net/lost-in-a-forest
"At least there's a mailbox."
-JohnMu, adventurous

http://lookitsme.co.uk/404
"It's pretty cute. :)"
-Jessica, likes cute things

http://www.orangecoat.com/a-404-page.html
"Flow charts rule."
-Sahala, internet traveller

http://icanhascheezburger.com/iz-404-page
"I can has useful links and even e-mail address for questions! But they could have added 'OH NOES! IZ MISSING PAGE! MAYBE TIPO OR BROKN LINKZ?' so folks'd know what's up."
-Adam, lindy hop geek

Thursday, 14 August 2014

Specifying an image's license using RDFa

Webmaster Level: All

We recently introduced a new feature on Google Image Search which allows you to restrict your search results to images that have been tagged for free reuse. As a webmaster, you may be interested in how you can let Google know which licenses your images are released under, so I've prepared a brief video explaining how to do this using RDFa markup.



If you have any questions about how to mark up your images, please ask in our Webmaster Help Forum.

Tuesday, 12 August 2014

Farewell to soft 404s

We see two kinds of 404 ("File not found") responses on the web: "hard 404s" and "soft 404s." We discourage the use of so-called "soft 404s" because they can be a confusing experience for users and search engines. Instead of returning a 404 response code for a non-existent URL, websites that serve "soft 404s" return a 200 response code. The content of the 200 response is often the homepage of the site, or an error page.

How does a soft 404 look to the user? Here's a mockup of a soft 404: This site returns a 200 response code and the site's homepage for URLs that don't exist.



As exemplified above, soft 404s are confusing for users, and furthermore search engines may spend much of their time crawling and indexing non-existent, often duplicative URLs on your site. This can negatively impact your site's crawl coverage—because of the time Googlebot spends on non-existent pages, your unique URLs may not be discovered as quickly or visited as frequently.

What should you do instead of returning a soft 404?
It's much better to return a 404 response code and clearly explain to users that the file wasn't found. This makes search engines and many users happy.

Return 404 response code



Return clear message to users



Can your webserver return 404, but send a helpful "Not found" message to the user?
Of course! More info as "404 week" continues!
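In the meantime, here's one quick way to check whether a server soft-404s: request a URL that almost certainly doesn't exist and look at the status code. A minimal sketch, assuming the requests library and a placeholder domain:

```python
import uuid
import requests

def looks_like_soft_404(base_url):
    # Request a URL that almost certainly doesn't exist.
    bogus = base_url.rstrip("/") + "/" + uuid.uuid4().hex
    response = requests.get(bogus, allow_redirects=True)
    # A well-behaved server returns 404 (or 410) here.
    # A 200 for a nonsense URL is the signature of a soft 404.
    return response.status_code == 200

print(looks_like_soft_404("http://www.example.com/"))
```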

Monday, 11 August 2014

It's 404 week at Webmaster Central

This week we're publishing several blog posts dedicated to helping you with one response code: 404.

Response codes are numeric statuses (like 200 for "OK" or 301 for "Moved Permanently") that a webserver returns in response to a request for a URL. The 404 response code should be returned when a file is "Not Found".

When a user sends a request for your webpage, your webserver looks for the corresponding file for the URL. If a file exists, your webserver likely responds with a 200 response code along with a message (often the content of the page, such as the HTML).

200 response code flow chart


So what's a 404? Let's say that in the link to "Visit Google Apps" above, the link is broken because of a typing error when coding the page. Now when a user clicks "Visit Google Apps", the particular webpage/file isn't located by the webserver. The webserver should return a 404 response code, meaning "Not Found".

404 response code flow chart


Now that we're all on board with the basics of 404s, stay tuned 4 even more information on making 404s good 4 users and 4 search engines.

New tools for Google Services for Websites

Webmaster Level: All
(A nearly duplicate version :) cross-posted on the Official Google Blog)

Earlier this year, we launched Google Services for Websites, a program that helps partners, e.g., web hosters and access providers, offer useful and powerful tools to their customers. By making services such as Webmaster Tools, Custom Search, Site Search and AdSense accessible via the hoster control panel, hosters can easily enable these services for their webmasters. The tools help website owners understand search performance, improve user retention and monetize their content — in other words, run more effective websites.

Since we launched the program, several hosting platforms have enhanced their offerings by integrating with the appropriate APIs. Webmasters can configure accounts, submit Sitemaps with Webmaster Tools, create Custom Search Boxes for their sites and monetize their content with AdSense, all with a few clicks at their hoster control panel. More partners are in the process of implementing these enhancements.

We've just added new tools to the suite:
  • Web Elements allows your customers to enhance their websites with the ease of cut-and-paste. Webmasters can provide maps, real-time news, calendars, presentations, spreadsheets and YouTube videos on their sites. With the Conversation Element, websites can create more engagement with their communities. The Custom Search Element provides inline search over your own site (or others you specify) without requiring any code, plus various options for further customization.
  • Page Speed allows webmasters to measure the performance of their websites. Snappier websites help users find things faster; the recommendations from these latency tools allow hosters and webmasters to optimize website speed. These techniques can help hosters reduce resource use and optimize network bandwidth.
  • The Tips for Hosters page offers a set of tips for hosters for creating a richer website hosting platform. Hosters can improve the convenience and accessibility of tools, while at the same time saving platform costs and earning referral fees. Tips include the use of analytics tools such as Google Analytics to help webmasters understand their traffic and linguistic tools such as Google Translate to help websites reach a broader audience.
If you're a hoster and would like to participate in the Google Services for Websites program, please apply here. You'll have to integrate with the service APIs before these services can be made available to your customers, so the earlier you start that process, the better.

And if your hosting service doesn't have Google Services for Websites yet, send them to this page. Once they become a partner, you can quickly configure the services you want at your hoster's control panel (without having to come to Google).

As always, we'd love to get feedback on how the program is working for you, and what improvements you'd like to see.

Sunday, 10 August 2014

Help test some next-generation infrastructure

Webmaster Level: All

To build a great web search engine, you need to:
  1. Crawl a large chunk of the web.
  2. Index the resulting pages and compute how reputable those pages are.
  3. Rank and return the most relevant pages for users' queries as quickly as possible.
For the last several months, a large team of Googlers has been working on a secret project: a next-generation architecture for Google's web search. It's the first step in a process that will let us push the envelope on size, indexing speed, accuracy, comprehensiveness and other dimensions. The new infrastructure sits "under the hood" of Google's search engine, which means that most users won't notice a difference in search results. But web developers and power searchers might notice a few differences, so we're opening up a web developer preview to collect feedback.

Some parts of this system aren't completely finished yet, so we'd welcome feedback on any issues you see. We invite you to visit the web developer preview of Google's new infrastructure at http://www2.sandbox.google.com/ and try searches there.

Right now, we only want feedback on the differences between Google's current search results and our new system. We're also interested in higher-level feedback ("These types of sites seem to rank better or worse in the new system") in addition to "This specific site should or shouldn't rank for this query." Engineers will be reading the feedback, but we won't have the cycles to send replies.

Here's how to give us feedback: Do a search at http://www2.sandbox.google.com/ and look on the search results page for a link at the bottom of the page that says "Dissatisfied? Help us improve." Click on that link, type your feedback in the text box and then include the word caffeine somewhere in the text box. Thanks in advance for your feedback!

Update on August 11, 2009: [ If you have language or country specific feedback on our new system's search results, we're happy to hear from you. It's a little more difficult to obtain these results from the sandbox URL, though, because you'll need to manually alter the query parameters.

You can change these two values appropriately:
hl = language
gl = country code

Examples:
German language in Germany: &hl=de&gl=de
http://www2.sandbox.google.com/search?hl=de&gl=de&q=alle+meine+entchen

Spanish language in Mexico: &hl=es&gl=mx
http://www2.sandbox.google.com/search?hl=es&gl=mx&q=de+colores
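If you'd rather build these URLs programmatically, here's a tiny sketch using Python's standard library; the queries are just the examples above:

```python
from urllib.parse import urlencode

def sandbox_url(query, language, country):
    # hl = interface language, gl = country code (per the examples above)
    params = urlencode({"hl": language, "gl": country, "q": query})
    return "http://www2.sandbox.google.com/search?" + params

print(sandbox_url("alle meine entchen", "de", "de"))
print(sandbox_url("de colores", "es", "mx"))
```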

And please don't forget to add the word "caffeine" in the feedback text box. :) ]

Saturday, 9 August 2014

Optimize your crawling & indexing

Webmaster Level: Intermediate to Advanced

Many questions about website architecture, crawling and indexing, and even ranking issues can be boiled down to one central issue: How easy is it for search engines to crawl your site? We've spoken on this topic at a number of recent events, and below you'll find our presentation and some key takeaways on this topic.



The Internet is a big place; new content is being created all the time. Google has a finite number of resources, so when faced with the nearly-infinite quantity of content that's available online, Googlebot is only able to find and crawl a percentage of that content. Then, of the content we've crawled, we're only able to index a portion.

URLs are like the bridges between your website and a search engine's crawler: crawlers need to be able to find and cross those bridges (i.e., find and crawl your URLs) in order to get to your site's content. If your URLs are complicated or redundant, crawlers are going to spend time tracing and retracing their steps; if your URLs are organized and lead directly to distinct content, crawlers can spend their time accessing your content rather than crawling through empty pages, or crawling the same content over and over via different URLs.

In the slides above you can see some examples of what not to do—real-life examples (though names have been changed to protect the innocent) of homegrown URL hacks and encodings, parameters masquerading as part of the URL path, infinite crawl spaces, and more. You'll also find some recommendations for straightening out that labyrinth of URLs and helping crawlers find more of your content faster, including:
  • Remove user-specific details from URLs.
    URL parameters that don't change the content of the page—like session IDs or sort order—can be removed from the URL and put into a cookie. By putting this information in a cookie and 301 redirecting to a "clean" URL, you retain the information and reduce the number of URLs pointing to that same content (see the sketch after this list).
  • Rein in infinite spaces.
    Do you have a calendar that links to an infinite number of past or future dates (each with its own unique URL)? Do you have paginated data that returns a status code of 200 when you add &page=3563 to the URL, even if there aren't that many pages of data? If so, you have an infinite crawl space on your website, and crawlers could be wasting their (and your!) bandwidth trying to crawl it all. Consider these tips for reining in infinite spaces.
  • Disallow actions Googlebot can't perform.
    Using your robots.txt file, you can disallow crawling of login pages, contact forms, shopping carts, and other pages whose sole functionality is something that a crawler can't perform. (Crawlers are notoriously cheap and shy, so they don't usually "Add to cart" or "Contact us.") This lets crawlers spend more of their time crawling content that they can actually do something with.
  • One man, one vote. One URL, one set of content.
    In an ideal world, there's a one-to-one pairing between URL and content: each URL leads to a unique piece of content, and each piece of content can only be accessed via one URL. The closer you can get to this ideal, the more streamlined your site will be for crawling and indexing. If your CMS or current site setup makes this difficult, you can use the rel=canonical element to indicate the preferred URL for a particular piece of content.
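As promised above, here's a minimal sketch of the first recommendation: moving non-content parameters into cookies and 301-redirecting to the clean URL. It assumes Flask for illustration, and the parameter and cookie names are made up:

```python
from flask import Flask, request, redirect, make_response

app = Flask(__name__)

# Parameters that don't change the content (illustrative names).
NON_CONTENT_PARAMS = {"sessionid", "sort"}

@app.route("/products")
def products():
    extra = {k: v for k, v in request.args.items() if k in NON_CONTENT_PARAMS}
    if extra:
        # Stash the non-content parameters in cookies, then 301 to the
        # clean URL so crawlers see one URL for this content.
        # (A real app would preserve content-affecting parameters here.)
        resp = make_response(redirect("/products", code=301))
        for key, value in extra.items():
            resp.set_cookie(key, value)
        return resp
    sort_order = request.cookies.get("sort", "relevance")
    return "Product list sorted by %s" % sort_order
```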

If you have further questions about optimizing your site for crawling and indexing, check out some of our previous writing on the subject, or stop by our Help Forum.

Thursday, 7 August 2014

How to start a multilingual site

Have you ever thought of creating one or several sites in different languages? Let's say you want to start a travel site about backpacking in Europe, and you want to offer your content to English, German, and Spanish speakers. You'll want to keep in mind factors like site structure, geographic as well as language targeting, and content organization.

Site structure
The first thing you'll want to consider is whether it makes sense to buy country-specific top-level domains (TLDs) for all the countries you plan to serve. Your domains might be ilovebackpacking.co.uk, ichlieberucksackreisen.de, and irdemochilero.es. This option is beneficial if you want to target the countries that each TLD is associated with, a method known as geotargeting. Note that this is different from language targeting, which we'll get into a little later. Let's say your German content is specifically for users from Germany and not as relevant for German-speaking users in Austria or Switzerland. In this case, you'd want to register a domain on the .de TLD; German users will identify your site as a local one they're more likely to trust. On the other hand, it can be pretty expensive to buy domains on the country-specific TLDs, and it's more of a pain to update and maintain multiple domains. So if your time and resources are limited, consider buying one non-country-specific domain to host all the different versions of your website. In this case, we recommend either of these two options:
  1. Put the content of every language in a different subdomain. For our example, you would have en.example.com, de.example.com, and es.example.com.
  2. Put the content of every language in a different subdirectory. This is easier to handle when updating and maintaining your site. For our example, you would have example.com/en/, example.com/de/, and example.com/es/.
Matt Cutts wrote a substantial post on subdirectories and subdomains, which may help you decide which option to go with.
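As a small sketch of option 2, here's how language subdirectories might be wired up in a single application (Flask is assumed purely for illustration; the greetings are placeholder content):

```python
from flask import Flask, abort

app = Flask(__name__)

GREETINGS = {  # placeholder content per language
    "en": "Welcome to our backpacking guide!",
    "de": "Willkommen zu unserem Rucksackreisen-Guide!",
    "es": "¡Bienvenido a nuestra guía mochilera!",
}

@app.route("/<lang>/")
def home(lang):
    if lang not in GREETINGS:
        abort(404)  # unknown language prefix
    # Keep navigation and content in one language per page.
    return GREETINGS[lang]
```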

Geographic targeting vs. Language targeting
As mentioned above, if your content is especially targeted towards a particular region in the world, you can use the Set Geographic Target tool in Webmaster Tools. It allows you to set different geographic targets for different subdirectories or subdomains (e.g., /de/ for Germany).

If you want to reach all speakers of a particular language around the world, you probably don't want to limit yourself to a specific geographic location. This is known as language targeting, and in this case, you don't want to use the geographic target tool.

Content organization
The same content in different languages is not considered duplicate content. Just make sure you keep things organized. If you follow one of the site structure recommendations mentioned above, this should be pretty straightforward. Avoid mixing languages on each page, as this may confuse Googlebot as well as your users. Keep navigation and content in the same language on each page.

If you want to check how many of your pages are recognized as being in a certain language, you can perform a language-specific site search. For example, go to google.de, do a site: search for your domain, and choose the option below the search box to display only German results.

If you have more questions on this topic, you can join our Webmaster Help Group to get more advice.

Wednesday, 6 August 2014

HTTPS as a ranking signal

Webmaster level: all

Security is a top priority for Google. We invest a lot in making sure that our services use industry-leading security, like strong HTTPS encryption by default. That means that people using Search, Gmail and Google Drive, for example, automatically have a secure connection to Google.

Beyond our own stuff, we’re also working to make the Internet safer more broadly. A big part of that is making sure that websites people access from Google are secure. For instance, we have created resources to help webmasters prevent and fix security breaches on their sites.

We want to go even further. At Google I/O a few months ago, we called for “HTTPS everywhere” on the web.

We’ve also seen more and more webmasters adopting HTTPS (also known as HTTP over TLS, or Transport Layer Security) on their websites, which is encouraging.

For these reasons, over the past few months we’ve been running tests taking into account whether sites use secure, encrypted connections as a signal in our search ranking algorithms. We've seen positive results, so we're starting to use HTTPS as a ranking signal. For now it's only a very lightweight signal — affecting fewer than 1% of global queries, and carrying less weight than other signals such as high-quality content — while we give webmasters time to switch to HTTPS. But over time, we may decide to strengthen it, because we’d like to encourage all website owners to switch from HTTP to HTTPS to keep everyone safe on the web.


Lock


In the coming weeks, we’ll publish detailed best practices (it's in our help center now) to make TLS adoption easier and to avoid common mistakes. Here are some basic tips to get started, followed by a small verification sketch:

  • Decide the kind of certificate you need: single, multi-domain, or wildcard certificate
  • Use 2048-bit key certificates
  • Use relative URLs for resources that reside on the same secure domain
  • Use protocol relative URLs for all other domains
  • Check out our Site move article for more guidelines on how to change your website’s address
  • Don’t block your HTTPS site from crawling using robots.txt
  • Allow indexing of your pages by search engines where possible. Avoid the noindex robots meta tag.

If your website is already serving on HTTPS, you can test its security level and configuration with the Qualys Lab tool. If you are concerned about TLS and your site’s performance, have a look at Is TLS fast yet?. And of course, if you have any questions or concerns, please feel free to post in our Webmaster Help Forums.

We hope to see more websites using HTTPS in the future. Let’s all make the web more secure!

How do you use Webmaster Tools? Share your stories and become a YouTube star!

Our greatest resource is the webmaster community, and here at Webmaster Central we're constantly impressed by the knowledge and expertise we see among webmasters: real-life SEOs, bloggers, online retailers, and all those other people creating great online content.

How do real-life webmasters actually use Webmaster Tools? We'd love to know, and we'd like to showcase some real-life examples for the rest of the community. Create a video telling your story, and upload it via the gadget in our Help Center. We'll highlight the best on our Webmaster Central YouTube channel, and even embed some in relevant Help Center articles (with full credit to you, of course).


To share your stories: Make a video, upload it to YouTube, then go to our Help Center, and submit your vid via our Help Center gadget. Our full guidelines give more information, but here is a summary of the key points:

  • Keep the video short; 3-5 minutes is ideal. Think small: a short video is a good way to showcase your use of - for example - Top Search Queries, but it isn't enough time to walk through your whole SEO strategy.
  • Focus on a real-life example of how you used a particular feature. For example, you could show how you used link data to research your brand, or crawl errors to diagnose problems with your site structure. Do you have a great tip or recommendation?
  • Upload your video before September 30.
  • White hats are recommended. They show up better on screen.

Advanced Q&A from (the appropriately-named) SMX Advanced

Webmaster Level: Intermediate to Advanced

Earlier this summer SMX Advanced landed once again in our fair city—Seattle—and it was indeed advanced. I got a number of questions at some Q&A panels that I had to go back and do a little research on. Here, as promised, are answers:

Q. We hear that Google's now doing a better job of indexing Flash content. If I have a Flash file that pulls in content from an external file and the external file is blocked by robots.txt, will that content be indexed in the Flash (which is not blocked by robots.txt)? Or will Google not be able to index that content?

A. We won't be able to access that content if it's in a file that's disallowed by robots.txt; so even though that content would be visible to humans (via the Flash), search engine crawlers wouldn't be able to access it. For more details, see our blog post on indexing Flash that loads external resources.

Q. Sites that customize content based on user behavior or clickstream are becoming more common. If a user clicks through to my site from a search results page, can I customize the content of that page or redirect the user based on the terms in their search query? Or is that considered cloaking? For example, if someone searches for [vintage cameo pendants] but clicks on my site's general vintage jewelry page, can I redirect that user to my vintage cameo-specific page since I know that's what they were searching for?

A. If you're redirecting or returning different content to the user than what Googlebot would see on that URL (e.g., based on the google.com referrer or query string), we consider that cloaking. If the searcher decided to click on the 'vintage jewelry' result, you should show them the page they clicked on even if you think a different page might be better. You can always link between related pages on your website (i.e., link to your 'vintage jewelry' page from your 'vintage cameos' page and vice versa, so that anyone landing on those pages from any source can cross-navigate); but we don't believe you should make that decision for the searcher.

Q. Even though it involves showing different content to different visitors, Google considers ethical website testing (such as A/B or multivariate testing) a legitimate practice that does not violate Google's guidelines. One reason for this is because, while search engines may only see the original content of the page and not the variations, there's also a percentage of human users who see that same content; so the technique doesn't specifically target search engines.

However, some testing services recommend running 100% of a site's traffic through the winning combination for awhile after an experiment has completed, to verify that conversion rates stay high. How does this fit in with Google's view of cloaking?

A. Running 100% of traffic through one combination for a brief period of time in order to verify your experiment's results is fine. However, as our article on this subject states, "if we find a site running a single non-original combination at 100% for a number of months... we may remove that site from our index." If you want to confirm the results of your experiment but are worried about "how long is too long," consider running a follow-up experiment in which you send most of your traffic through your winning combination while still sending a small percentage to the original page as a control. This is what Google recommends with its own testing tool, Website Optimizer.

Q. If the character encoding specified in a page's HTTP header is different from that specified in the <meta http-equiv="Content-Type"> tag, which one will Google pay attention to?

A. We take a look at both of these, and also do a bit of processing/guessing on our own based on the content of the page. Most major browsers prioritize the encoding specified in the HTTP header over that specified in the HTML, if both are valid but different. However, if you're aware that they're different, the best answer is to fix one of them!
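If you want to spot such a mismatch yourself, here's a minimal sketch that reads the charset from both places (assuming the requests library; the regexes are rough heuristics, not a full HTML parser):

```python
import re
import requests

def compare_charsets(url):
    resp = requests.get(url)

    # Charset from the HTTP Content-Type header, e.g. "text/html; charset=UTF-8"
    header_charset = None
    match = re.search(r"charset=([\w-]+)", resp.headers.get("Content-Type", ""))
    if match:
        header_charset = match.group(1).lower()

    # Charset from the meta tag in the HTML itself.
    meta_charset = None
    match = re.search(
        r'<meta[^>]+charset=["\']?([\w-]+)', resp.text, re.IGNORECASE)
    if match:
        meta_charset = match.group(1).lower()

    print("HTTP header charset:", header_charset)
    print("Meta tag charset:   ", meta_charset)
    if header_charset and meta_charset and header_charset != meta_charset:
        print("Mismatch -- pick one and fix the other!")

compare_charsets("http://www.example.com/")
```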

Q. How does Google handle triple-byte UTF-8-encoded international characters in a URL (such as Chinese or Japanese characters)? These types of URLs break in some applications; is Google able to process them correctly? Does Google understand keywords that are encoded this way—that is, can you understand that www.example.com/%E9%9D%B4 is just as relevant to shoes as www.example.com/shoes is?

A. We can correctly handle %-escaped UTF-8 characters in the URL path and in query parameters, and we understand keywords that are encoded in this way. For international characters in a domain name, we recommend using punycode rather than %-encoding, because some older browsers (such as IE6) don't support non-ASCII domain names.
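Both encodings are easy to reproduce with Python's standard library. The sketch below confirms that %E9%9D%B4 is indeed the percent-escaped UTF-8 form of 靴 ("shoe"), and shows what punycode looks like for an internationalized domain (the domain itself is a made-up example):

```python
from urllib.parse import quote, unquote

# %-escaped UTF-8 in a URL path: 靴 ("shoe") becomes %E9%9D%B4
print(quote("靴"))            # -> %E9%9D%B4
print(unquote("%E9%9D%B4"))   # -> 靴

# Punycode for an internationalized domain name, using the built-in
# "idna" codec (made-up example domain); the result starts with "xn--".
print("靴.example".encode("idna").decode("ascii"))
```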

Have a question of your own? Join our discussion forum.

Tuesday, 5 August 2014

To infinity and beyond? No!

When Googlebot crawls the web, it often finds what we call an "infinite space". These are very large numbers of links that usually provide little or no new content for Googlebot to index. If this happens on your site, crawling those URLs may use unnecessary bandwidth, and could result in Googlebot failing to completely index the real content on your site.

Recently, we started notifying site owners when we discover this problem on their web sites. Like most messages we send, you'll find them in Webmaster Tools in the Message Center. You'll probably want to know right away if Googlebot has this problem - or other problems - crawling your sites. So verify your site with Webmaster Tools, and check the Message Center every now and then.



Examples of infinite spaces

The classic example of an "infinite space" is a calendar with a "Next Month" link. It may be possible to keep following those "Next Month" links forever! Of course, that's not what you want Googlebot to do. Googlebot is smart enough to figure out some of those on its own, but there are a lot of ways to create an infinite space and we may not detect all of them.


Another common scenario is websites that provide many ways to filter a set of search results. A shopping site might allow finding clothing items by filtering on category, price, color, brand, style, etc. The number of possible combinations of filters can grow exponentially. This can produce thousands of URLs, all finding some subset of the items sold. This may be convenient for your users, but is not so helpful for Googlebot, which just wants to find everything - once!
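The exponential growth is easy to see with a little arithmetic; here's a sketch with made-up filter counts:

```python
# Made-up example: a shopping site with five independent filters.
filters = {
    "category": 20,
    "price_range": 6,
    "color": 12,
    "brand": 30,
    "style": 8,
}

# Each filter can be left unset or set to one of its values, so the number
# of distinct filter-combination URLs is the product of (values + 1).
total_urls = 1
for values in filters.values():
    total_urls *= values + 1

print("Possible filter-combination URLs:", total_urls)  # -> 533169
```

That's over half a million URLs for one underlying set of products.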

Correcting infinite space issues

Our Webmaster Tools Help article describes more ways infinite spaces can arise, and provides recommendations on how to avoid the problem. One fix is to eliminate whole categories of dynamically generated links using your robots.txt file; if you do that, don't forget to verify that Googlebot can find all your content some other way. The Help Center has lots of information on how to use robots.txt. Another option is to block those problematic links with a "nofollow" link attribute. If you'd like more information on "nofollow" links, check out the Webmaster Help Center.
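If you go the robots.txt route, you can sanity-check your rules before deploying them. A minimal sketch using Python's standard library, with illustrative paths:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt rules blocking calendar pages and filter results.
rules = [
    "User-agent: *",
    "Disallow: /calendar/",
    "Disallow: /shop/results",
]

rp = RobotFileParser()
rp.parse(rules)

for url in [
    "http://www.example.com/calendar/2099-12/",
    "http://www.example.com/shop/results?color=red&brand=acme",
    "http://www.example.com/shop/item-42",
]:
    print(rp.can_fetch("Googlebot", url), url)
# -> False, False, True: the real content stays crawlable.
```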

Monday, 4 August 2014

Introducing the Google News Publisher Center

(Cross-posted on the Google News Blog)

Webmaster level: All

UPDATE: Great News -- The Publisher Center is now available in all countries where Google News has an edition.

If you're a news publisher, your website has probably evolved and changed over time -- just like your stories. But in the past, when you made changes to the structure of your site, we might not have discovered your new content. That meant a lost opportunity for your readers, and for you. Unless you regularly checked Webmaster Tools, you might not even have realized that your new content wasn’t showing up in Google News. To prevent this from happening, we're now letting you update our record of your news site using the just-launched Google News Publisher Center.

With the Publisher Center, your potential readers can be more informed about the articles they’re clicking on and you benefit from better discovery and classification of your news content. After verifying ownership of your site using Google Webmaster Tools, you can use the Publisher Center to directly make the following changes:

  • Update your news site details, including changing your site name and labeling your publication with any relevant source labels (e.g., “Blog”, “Satire” or “Opinion”)
  • Update your section URLs when you change your site structure (e.g., when you add a new section such as http://example.com/2014commonwealthgames or http://example.com/elections2014)
  • Label your sections with a specific topic (e.g., “Technology” or “Politics”)

Whenever you make changes to your site, we’d recommend also checking our record of it in the Publisher Center and updating it if necessary.

Try it out, or learn more about how to get started.

At the moment the tool is only available to publishers in the U.S. but we plan to introduce it in other countries soon and add more features.  In the meantime, we’d love to hear from you about what works well and what doesn’t. Ultimately, our goal is to make this a platform where news publishers and Google News can work together to provide readers with the best, most diverse news on the web.