Tuesday, 30 September 2014

Translate your website with Google: Expand your audience globally

(This has been cross-posted from the Official Google Blog)

How long would it take to translate all the world's web content into 50 languages? Even if all of the translators in the world worked around the clock, with the current growth rate of content being created online and the sheer amount of data on the web, it would take hundreds of years to make even a small dent.

Today, we're happy to announce a new website translator gadget powered by Google Translate that enables you to make your site's content available in 51 languages. Now, when people visit your page, if their language (as determined by their browser settings) is different than the language of your page, they'll be prompted to automatically translate the page into their own language. If the visitor's language is the same as the language of your page, no translation banner will appear.


After clicking the Translate button, the automatic translations are shown directly on your page.


It's easy to install — all you have to do is cut and paste a short snippet into your webpage to increase the global reach of your blog or website.
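The exact snippet is generated for you when you set up the gadget; as a rough sketch (assuming an English-language page, with the element ID and callback name as they appear in the generated code), it looks something like this:

<div id="google_translate_element"></div>
<script type="text/javascript">
  // Callback invoked by the gadget loader; 'en' is the language of your own page.
  function googleTranslateElementInit() {
    new google.translate.TranslateElement({pageLanguage: 'en'}, 'google_translate_element');
  }
</script>
<script type="text/javascript" src="//translate.google.com/translate_a/element.js?cb=googleTranslateElementInit"></script>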


Automatic translation is convenient and helps people get a quick gist of the page. However, it's not a perfect substitute for the art of professional translation. Today happens to be International Translation Day, and we'd like to take the opportunity to celebrate the contributions of translators all over the world. These translators play an essential role in enabling global communication, and with the rapid growth and ease of access to digital content, the need for them is greater than ever. We hope that professional translators, along with translation tools such as Google Translator Toolkit and this Translate gadget, will continue to help make the world's content more accessible to everyone.

Advanced Website Diagnostics with Google Webmaster Tools

Running a website can be complicated—so we've provided Google Webmaster Tools to help webmasters recognize potential issues before they become real problems. Some of the issues you can spot there are relatively small (such as having duplicate titles and descriptions), while others can be bigger (such as your website not being reachable). While Google Webmaster Tools can't tell you exactly what you need to change, it can help you recognize that there could be a problem that needs to be addressed.

Let's take a look at a few examples that we ran across in the Google Webmaster Help Groups:

Is your server treating Googlebot like a normal visitor?

While Googlebot tries to act like a normal user, some servers may get confused and react in strange ways. For example, although your server may work flawlessly most of the time, some servers running IIS may react with a server error (or some other action that is tied to a server error occurring) when visited by a user with Googlebot's user-agent. In the Webmaster Help Group, we've seen IIS servers return result code 500 (Server error) and result code 404 (File not found) in the "Web crawl" diagnostics section, as well as result code 302 when submitting Sitemap files. If your server is redirecting to an error page, you should make sure that we can crawl the error page and that it returns the proper result code. Once you've done that, we'll be able to show you these errors in Webmaster Tools as well. For more information about this issue and possible resolutions, please see http://todotnet.com/archive/0001/01/01/7472.aspx and http://www.kowitz.net/archive/2006/12/11/asp.net-2.0-mozilla-browser-detection-hole.aspx.

If your website is hosted on a Microsoft IIS server, also keep in mind that URLs are case-sensitive by definition (and that's how we treat them). This includes URLs in the robots.txt file, so be careful if your server treats URLs in a case-insensitive way. For example, "Disallow: /paris" will block /paris but not /Paris.
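As a quick illustration, here is a hypothetical robots.txt fragment showing the issue:

User-agent: *
# Blocks /paris (and anything below it), but NOT /Paris:
# crawlers compare robots.txt paths case-sensitively, even if an
# IIS server happens to serve the same page for both spellings.
Disallow: /paris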

Does your website have systematically broken links somewhere?

Modern content management systems (CMS) can make it easy to create issues that affect a large number of pages. Sometimes these issues are straightforward and visible when you view the pages; sometimes they're a bit harder to spot on your own. If an issue like this creates a large number of broken links, they will generally show up in the "Web crawl" diagnostics section in your Webmaster Tools account (provided those broken URLs return a proper 404 result code). In one recent case, a site had a small encoding issue in its RSS feed, resulting in over 60,000 bad URLs being found and listed in their Webmaster Tools account. As you can imagine, we would have preferred to spend time crawling content instead of these 404 errors :).

Is your website redirecting some users elsewhere?

For some websites, it can make sense to concentrate on a group of users in a certain geographic location. One method of doing that can be to redirect users located elsewhere to a different page. However, keep in mind that Googlebot might not be crawling from within your target area, so it might be redirected as well. This could mean that Googlebot will not be able to access your home page. If that happens, it's likely that Webmaster Tools will run into problems when it tries to confirm the verification code on your site, resulting in your site becoming unverified. This is not the only reason for a site becoming unverified, but if you notice this on a regular basis, it would be a good idea to investigate. On this subject, always make sure that Googlebot is treated the same way as other users from that location, otherwise that might be seen as cloaking.

Is your server unreachable when we try to crawl?

It can happen to the best of sites—servers can go down and firewalls can be overly protective. If that happens when Googlebot tries to access your site, we won't be able to crawl the website and you might not even know that we tried. Luckily, we keep track of these issues and you can spot "Network unreachable" and "robots.txt unreachable" errors in your Webmaster Tools account when we can't reach your site.

Has your website been hacked?

Hackers sometimes add strange, off-topic hidden content and links to questionable pages. If it's hidden, you might not even notice it right away; but nonetheless, it can be a big problem. While the Message Center may be able to give you a warning about some kinds of hidden text, it's best if you also keep an eye out yourself. Google Webmaster Tools can show you keywords from your pages in the "What Googlebot sees" section, so you can often spot a hack there. If you see totally irrelevant keywords, it would be a good idea to investigate what's going on. You might also try setting up Google Alerts or doing queries such as [site:example.com spammy words], where "spammy words" might be words like porn, viagra, tramadol, sex or other words that your site wouldn't normally show. If you find that your site actually was hacked, I'd recommend going through our blog post about things to do after being hacked.

There are a lot of issues that can be recognized with Webmaster Tools; these are just some of the more common ones that we've seen lately. Because it can be really difficult to recognize some of these problems, it's a great idea to check your Webmaster Tools account to make sure that you catch any issues before they become real problems. If you spot something that you absolutely can't pin down, why not post in the discussion group and ask the experts there for help?

Have you checked your site lately?

Friday, 26 September 2014

Keeping comment spam off your site and away from users

So, you've set up a forum on your site for the first time, or enabled comments on your blog. You carefully craft a post or two, click the submit button, and wait with bated breath for comments to come in.

And they do come in. Perhaps you get a friendly note from a fellow blogger, a pressing update from an MMORPG guild member, or a reminder from your Aunt Millie about dinner on Thursday. But then you get something else. Something... disturbing. Offers for deals that are too good to be true, bizarre logorrhean gibberish, and explicit images you certainly don't want Aunt Millie to see. You are now buried in a deluge of dreaded comment spam.

Comment spam is bad stuff all around. It's bad for you, because it adds to your workload. It's bad for your users, who want to find information on your site and certainly aren't interested in dodgy links and unrelated content. It's bad for the web as a whole, since it discourages people from opening up their sites for user-contributed content and joining conversations on existing forums.

So what can you, as a webmaster, do about it?

A quick disclaimer: the list below is a good start, but not exhaustive. There are so many different blog, forum, and bulletin board systems out there that we can't possibly provide detailed instructions for each, so the points below are general enough to make sense on most systems.

Make sure your commenters are real people
  • Add a CAPTCHA. CAPTCHAs require users to read a bit of obfuscated text and type it back in to prove they're a human being and not an automated script. If your blog or forum system doesn't have CAPTCHAs built in, you may be able to find a plugin like reCAPTCHA, a project which also helps digitize old books. CAPTCHAs are not foolproof, but they make life a little more difficult for spammers. You can read more about the many different types of CAPTCHAs, but keep in mind that just adding a simple one can be fairly effective (a sketch of a typical embed appears after this list).

  • Block suspicious behavior. Many forums allow you to set time limits between posts, and you can often find plugins to look for excessive traffic from individual IP addresses or proxies and other activity more common to bots than human beings.
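As an illustration of the first point, the current reCAPTCHA widget, for example, is embedded in a comment form roughly like this (a sketch; the site key is a placeholder and older versions of the service used different markup). The submitted response still has to be verified on your server before the comment is accepted.

<form action="/post-comment" method="post">
  <textarea name="comment"></textarea>
  <!-- The service's script replaces this container with the challenge widget. -->
  <script src="https://www.google.com/recaptcha/api.js" async defer></script>
  <div class="g-recaptcha" data-sitekey="YOUR_SITE_KEY"></div>
  <input type="submit" value="Post comment">
</form>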

Use automatic filtering systems
  • Block obviously inappropriate comments by adding words to a blacklist. Spammers obfuscate words in their comments so this isn't a very scalable solution, but it can keep blatant spam at bay.

  • Use built-in features or plugins that delete or mark comments as spam for you. Spammers use automated methods to besmirch your site, so why not use an automated system to defend yourself? Comprehensive systems like Akismet, which has plugins for many blogs and forum systems, and TypePad AntiSpam, which is open source and compatible with Akismet, are easy to install and do most of the work for you.

  • Try using Bayesian filtering options, if available. Training the system to recognize spam may require some effort on your part, but this technique has been used successfully to fight email spam.

Make your settings a bit stricter
  • Nofollow untrusted links. Many systems have a setting to add a rel="nofollow" attribute to the links in comments, or do so by default (see the example after this list). This may discourage some types of spam, but it's definitely not the only measure you should take.

  • Consider requiring users to create accounts before they can post a comment. This adds steps to the user experience and may discourage some casual visitors from posting comments, but may keep the signal-to-noise ratio higher as well.

  • Change your settings so that comments need to be approved before they show up on your site. This is a great tactic if you want to hold comments to a high standard, don't expect a lot of comments, or have a small, personal site. You may be able to allow employees or trusted users to approve posts themselves, spreading the workload. 

  • Think about disabling some types of comments. For example, you may want to disable comments on very old posts that are unlikely to get legitimate comments. On blogs you can often disable trackbacks and pingbacks, which are very cool features but can be major avenues for automated spam.
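To illustrate the first point of this list, a nofollowed comment link looks like this in the rendered HTML (markup only; the surrounding comment template will vary by system):

<p class="comment-body">
  Great post! Check out
  <!-- rel="nofollow" tells search engines not to pass ranking credit through this link -->
  <a href="http://www.example.com/" rel="nofollow">my site</a>.
</p>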

Keep your site up-to-date
  • Take the time to keep your software up-to-date and pay special attention to important security updates. Some spammers take advantage of security holes in older versions of blogs, bulletin boards, and other content management systems. Check the Quick Security Checklist for additional measures.

You may need to strike a balance on which tactics you choose to implement depending on your blog or bulletin board software, your user base, and your level of experience. Opening up a site for comments without any protection is a big risk, whether you have a small personal blog or a huge site with thousands of users. Also, if your forum has been completely filled with thousands of spam posts and doesn't even show up in Google searches, you may want to submit a reconsideration request after you clear out the bad content and take measures to prevent further spam.

As a long-time blogger and web developer myself, I can tell you that a little time spent setting up measures like these up front can save you a ton of time and effort later. I'm new to the Webmaster Central team, originally from Cleveland. I'm very excited to help fellow webmasters, and have a passion for usability and search quality (I've even done a bit of academic research on the topic). Please share your tips on preventing comment and forum spam in the comments below, and as always you're welcome to ask questions in our discussion group.

Thursday, 25 September 2014

Using named anchors to identify sections on your pages

We just announced a couple of new features on the Official Google Blog that enable users to get to the information they want faster. Both features provide additional links in the result block, which allow users to jump directly to parts of a larger page. This is useful when a user has a specific interest in mind that is almost entirely covered in a single section of a page. Now they can navigate directly to the relevant section instead of scrolling through the page looking for their information.

We generate these deep links completely algorithmically, based on page structure, so they could be displayed for any site (and of course money isn't involved in any way, so you can't pay to get these links). There are a few things you can do to increase the chances that they might appear on your pages. First, ensure that long, multi-topic pages on your site are well-structured and broken into distinct logical sections. Second, ensure that each section has an associated anchor with a descriptive name (i.e., not just "Section 2.1"), and that your page includes a "table of contents" which links to the individual anchors. The new in-snippet links only appear for relevant queries, so you won't see them in the results all the time — only when we think that a link to a section would be highly useful for a particular query.
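For example, a long page structured this way might look roughly like the following sketch (the headings, anchor names and IDs are just for illustration):

<h1>Complete guide to roller skates</h1>
<!-- Table of contents linking to the named anchors below -->
<ul>
  <li><a href="#choosing">Choosing the right skates</a></li>
  <li><a href="#maintenance">Maintenance and care</a></li>
</ul>

<h2 id="choosing"><a name="choosing">Choosing the right skates</a></h2>
<p>...</p>

<h2 id="maintenance"><a name="maintenance">Maintenance and care</a></h2>
<p>...</p>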

Tuesday, 23 September 2014

More webmaster questions - Answered!

When it comes to answering your webmaster-related questions, we just can't get enough. I wanted to follow up and answer some additional questions that webmasters asked in our latest installment of Popular Picks. In case you missed it, you can find our answers to image search ranking, sitelinks, reconsideration requests, redirects, and our communication with webmasters in this blog post.



Video Transcript:

Hi everyone, I'm Reid from the Search Quality team. Today I'd like to answer some of the unanswered questions from our latest round of popular picks.

Searchmaster had a question about duplicate content. Understandably, this is a popular concern among webmasters. You should check out the Google Webmaster Central Blog, where my colleague Susan Moskwa recently posted "Demystifying the 'duplicate content penalty'," which answers many questions and concerns about duplicate content.

Jay is the Boss wanted to know if e-commerce websites suffer if they have two or more different themes. For example, you could have a site that sells auto parts, but also sightseeing guides. In general, I'd encourage webmasters to create a website that they feel is relevant for users. If it makes sense to sell auto parts and sightseeing guides, then go for it. Those are the sites that perform well, because users want to visit those sites, and they'll link to them as well.

emma2 wanted to know if Google will follow links on a page using the "noindex" attribute in the "robots" meta tag. To answer this question, Googlebot will follow links on a page which uses the meta "noindex" tag, but that page will not appear in our search results. As a reminder, if you would like to prevent Googlebot from crawling any links on a page, use the "nofollow" attribute in the "robots" meta tag.
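To summarize the difference in markup, these are the standard robots meta tag values being discussed:

<!-- Page is not shown in search results, but links on it may still be followed -->
<meta name="robots" content="noindex">

<!-- Googlebot will not follow any links on this page -->
<meta name="robots" content="nofollow">

<!-- Combine both directives if needed -->
<meta name="robots" content="noindex, nofollow">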

Aaron Pratt wanted to know about some ways a webmaster can rank well for local searches. A quick recommendation is to add your business to the Local Business Center. There, you can add contact information as well as store operating hours and coupons. Another tip is to purchase a country-specific top-level domain, or use the geotargeting feature in Webmaster Tools.

jdeb901 said it would be helpful if we could let webmasters know if we are having problems with Webmaster Tools. This is an excellent point, and we're always thinking about better ways to communicate with webmasters. If you're having problems with Webmaster Tools, chances are someone else is as well, and they've posted to the Google Webmaster Help Group about this. In the past, if we've experienced problems with Webmaster Tools, we've also created a "sticky" post to let users know that we know about these issues with Webmaster Tools, and we're working to find a solution.

Well, that about wraps it up with our Popular Picks. Thanks again for all of your questions, and I look forward to seeing you around the group.

Monday, 22 September 2014

Dynamic URLs vs. static URLs

Chatting with webmasters often reveals widespread beliefs that might have been accurate in the past, but are not necessarily up-to-date any more. This was the case when we recently talked to a couple of friends about the structure of a URL. One friend was concerned about using dynamic URLs, since (as she told us) "search engines can't cope with these." Another friend thought that dynamic URLs weren't a problem at all for search engines and that these issues were a thing of the past. One even admitted that he never understood the fuss about dynamic URLs in comparison to static URLs. For us, that was the moment we decided to read up on the topic of dynamic and static URLs. First, let's clarify what we're talking about:

What is a static URL?
A static URL is one that does not change, so it typically does not contain any URL parameters. It can look like this: http://www.example.com/archive/january.htm. You can search for static URLs on Google by typing filetype:htm in the search field. Updating these kinds of pages can be time-consuming, especially if the amount of information grows quickly, since every single page has to be hard-coded. This is why webmasters who deal with large, frequently updated sites like online shops, forum communities, blogs or content management systems may use dynamic URLs.

What is a dynamic URL?
If the content of a site is stored in a database and pulled for display on pages on demand, dynamic URLs may be used. In that case the site basically serves as a template for the content. Usually, a dynamic URL would look something like this: http://code.google.com/p/google-checkout-php-sample-code/issues/detail?id=31. You can spot dynamic URLs by looking for characters like: ? = &. Dynamic URLs have the disadvantage that different URLs can have the same content. So different users might link to URLs with different parameters which have the same content. That's one reason why webmasters sometimes want to rewrite their URLs to static ones.

Should I try to make my dynamic URLs look static?
Following are some key points you should keep in mind while dealing with dynamic URLs:
  1. It's quite hard to correctly create and maintain rewrites that change dynamic URLs to static-looking URLs.
  2. It's much safer to serve us the original dynamic URL and let us handle the problem of detecting and avoiding problematic parameters.
  3. If you want to rewrite your URL, please remove unnecessary parameters while maintaining a dynamic-looking URL.
  4. If you want to serve a static URL instead of a dynamic URL you should create a static equivalent of your content.
Which can Googlebot read better, static or dynamic URLs?
We've come across many webmasters who, like our friend, believed that static or static-looking URLs were an advantage for indexing and ranking their sites. This is based on the presumption that search engines have issues with crawling and analyzing URLs that include session IDs or source trackers. However, as a matter of fact, we at Google have made some progress in both areas. While static URLs might have a slight advantage in terms of clickthrough rates because users can easily read the URLs, the decision to use database-driven websites does not imply a significant disadvantage in terms of indexing and ranking. Providing search engines with dynamic URLs should be favored over hiding parameters to make them look static.

Let's now look at some of the widespread beliefs concerning dynamic URLs and correct some of the assumptions which spook webmasters. :)

Myth: "Dynamic URLs cannot be crawled."
Fact: We can crawl dynamic URLs and interpret the different parameters. We might have problems crawling and ranking your dynamic URLs if you try to make your URLs look static and in the process hide parameters which offer Googlebot valuable information. One recommendation is to avoid reformatting a dynamic URL to make it look static. It's always advisable to use static content with static URLs as much as possible, but in cases where you decide to use dynamic content, you should give us the possibility to analyze your URL structure and not remove information by hiding parameters and making them look static.

Myth: "Dynamic URLs are okay if you use fewer than three parameters."
Fact: There is no limit on the number of parameters, but a good rule of thumb would be to keep your URLs short (this applies to all URLs, whether static or dynamic). You may be able to remove some parameters which aren't essential for Googlebot and offer your users a nice looking dynamic URL. If you are not able to figure out which parameters to remove, we'd advise you to serve us all the parameters in your dynamic URL and our system will figure out which ones do not matter. Hiding your parameters keeps us from analyzing your URLs properly and we won't be able to recognize the parameters as such, which could cause a loss of valuable information.

Following are some questions we thought you might have at this point.

Does that mean I should avoid rewriting dynamic URLs at all?
That's our recommendation, unless your rewrites are limited to removing unnecessary parameters, or you are very diligent in removing all parameters that could cause problems. If you transform your dynamic URL to make it look static you should be aware that we might not be able to interpret the information correctly in all cases. If you want to serve a static equivalent of your site, you might want to consider transforming the underlying content by serving a replacement which is truly static. One example would be to generate files for all the paths and make them accessible somewhere on your site. However, if you're using URL rewriting (rather than making a copy of the content) to produce static-looking URLs from a dynamic site, you could be doing harm rather than good. Feel free to serve us your standard dynamic URL and we will automatically find the parameters which are unnecessary.

Can you give me an example?
If you have a dynamic URL which is in the standard format like foo?key1=value&key2=value2 we recommend that you leave the URL unchanged, and Google will determine which parameters can be removed; or you could remove unnecessary parameters for your users. Be careful that you only remove parameters which do not matter. Here's an example of a URL with a couple of parameters:

www.example.com/article/bin/answer.foo?language=en&answer=3&sid=98971298178906&query=URL
  • language=en - indicates the language of the article
  • answer=3 - the article has the number 3
  • sid=98971298178906 - the session ID number is 98971298178906
  • query=URL - the query with which the article was found is [URL]
Not all of these parameters offer additional information. So rewriting the URL to www.example.com/article/bin/answer.foo?language=en&answer=3 probably would not cause any problems as all irrelevant parameters are removed.

The following are some examples of static-looking URLs which may cause more crawling problems than serving the dynamic URL without rewriting:
  • www.example.com/article/bin/answer.foo/en/3/98971298178906/URL
  • www.example.com/article/bin/answer.foo/language=en/answer=3/
    sid=98971298178906/query=URL
  • www.example.com/article/bin/answer.foo/language/en/answer/3/
    sid/98971298178906/query/URL
  • www.example.com/article/bin/answer.foo/en,3,98971298178906,URL
Rewriting your dynamic URL to one of these examples could cause us to crawl the same piece of content needlessly via many different URLs with varying values for session IDs (sid) and query. These formats make it difficult for us to understand that "URL" and "98971298178906" have nothing to do with the actual content which is returned via this URL. However, here's an example of a rewrite where all irrelevant parameters have been removed:
  • www.example.com/article/bin/answer.foo/en/3
Although we are able to process this URL correctly, we would still discourage you from using this rewrite as it is hard to maintain and needs to be updated as soon as a new parameter is added to the original dynamic URL. Failure to do this would again result in a static-looking URL which is hiding parameters. So the best solution is often to keep your dynamic URLs as they are. Or, if you remove irrelevant parameters, bear in mind to leave the URL dynamic, as this example of a rewritten URL shows:
  • www.example.com/article/bin/answer.foo?language=en&answer=3
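If you do decide to strip the irrelevant parameters on the server, here is a rough sketch of how that could look with Apache mod_rewrite in an .htaccess file. This is a hypothetical example built around the URL above; it assumes the parameters always appear in this exact order, which a real rule set could not rely on.

RewriteEngine On
# Redirect requests carrying sid and query parameters to the clean,
# still-dynamic URL that keeps only the language and answer parameters.
RewriteCond %{QUERY_STRING} ^language=([^&]+)&answer=([^&]+)&sid=[^&]+&query=[^&]+$
RewriteRule ^article/bin/answer\.foo$ /article/bin/answer.foo?language=%1&answer=%2 [R=301,L]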
We hope this article helps you and our friends shed some light on the various assumptions around dynamic URLs. Please feel free to join our discussion group if you have any further questions.

Sunday, 21 September 2014

Google does not use the keywords meta tag in web ranking

Recently we received some questions about how Google uses (or more accurately, doesn't use) the "keywords" meta tag in ranking web search results. Suppose you have two website owners, Alice and Bob. Alice runs a company called AliceCo and Bob runs BobCo. One day while looking at Bob's site, Alice notices that Bob has copied some of the words that she uses in her "keywords" meta tag. Even more interesting, Bob has added the words "AliceCo" to his "keywords" meta tag. Should Alice be concerned?

At least for Google's web search results currently (September 2009), the answer is no. Google doesn't use the "keywords" meta tag in our web search ranking. This video explains more, or see the questions below.


Q: Does Google ever use the "keywords" meta tag in its web search ranking?
A: In a word, no. Google does sell a Google Search Appliance, and that product has the ability to match meta tags, which could include the keywords meta tag. But that's an enterprise search appliance that is completely separate from our main web search. Our web search (the well-known search at Google.com that hundreds of millions of people use each day) disregards keywords meta tags completely. They simply don't have any effect in our search ranking at present.

Q: Why doesn't Google use the keywords meta tag?
A: About a decade ago, search engines judged pages only on the content of web pages, not any so-called "off-page" factors such as the links pointing to a web page. In those days, keyword meta tags quickly became an area where someone could stuff often-irrelevant keywords without typical visitors ever seeing those keywords. Because the keywords meta tag was so often abused, many years ago Google began disregarding the keywords meta tag.

Q: Does this mean that Google ignores all meta tags?
A: No, Google does support several other meta tags. This meta tags page documents more info on several meta tags that we do use. For example, we do sometimes use the "description" meta tag as the text for our search results snippets, as this screenshot shows:


Even though we sometimes use the description meta tag for the snippets we show, we still don't use the description meta tag in our ranking.
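For example, if Alice's pages contained the following tags (hypothetical values), only the description could ever surface as a snippet in Google's results, and neither tag would influence her ranking:

<meta name="keywords" content="AliceCo, roller skates, discount skates, BobCo">
<meta name="description" content="AliceCo sells hand-built roller skates and accessories for beginners and pros.">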

Q: Does this mean that Google will always ignore the keywords meta tag?
A: It's possible that Google could use this information in the future, but it's unlikely. Google has ignored the keywords meta tag for years and currently we see no need to change that policy.

Wednesday, 17 September 2014

Spanish Site Clinic now live

The Google Webmaster Central blog in Spanish has launched a Site Clinic especially for the Spanish-speaking market. We're offering to analyze a series of websites in order to share some best practices with our community using real web sites. The plan is to offer constructive advice on accessibility and improvements that can lead to better visibility in Google's search results.
During this month, we will be accepting submissions from any legitimate website, as long as it is primarily in Spanish. So before you submit your site, please visit the original post and, if you want to participate, fill out the form as soon as possible: we will only be selecting 3-5 websites from the first 200 submitted for this Site Clinic, so don't miss out!

Tuesday, 16 September 2014

Webmaster Tools made easier in French, Italian, German and Spanish

We're always working on new ways to make life a bit easier for webmasters. We've had great feedback on many of the initiatives that have taken place in Webmaster Tools and beyond, but given the complex nature of managing a website, there are some questions regarding the tools that come up quite often across the Webmaster Help Groups. This got us thinking: how can we best address these questions?

Well, if you're like me, then you find it a lot easier to learn how to use something if you actually get to see someone else doing it first; with that in mind, we'll launch a series of six video tutorials in French, German, Italian and Spanish over the next couple of months. The videos will take you through the basics of Webmaster Tools as well as how to use the information in the tools to make improvements to your site and hence your site's visibility in Google's index.

Our first video provides an overview of the different information you can access depending on whether you've verified ownership of your site in Webmaster Tools. We'll also explain the different verification methods available. And just to whet your appetite, here are the topics covered in the series:

Video 1: Getting started, signing in, benefits of verifying a site
Video 2: Setting preferences for crawling and indexing
Video 3: Creating and submitting Sitemaps
Video 4: Removing and preventing your content from being indexed
Video 5: Utilizing the Diagnostics, Statistics and Links sections
Video 6: Communicating between Webmasters and Google

You can access the first of these videos in the links provided below and keep a lookout in the local Webmaster Help Groups for upcoming video releases.

Italian Video Tutorials - Italian Webmaster Help Group
Latin America and Spain Video Tutorials - Spanish Webmaster Help Group
French Video Tutorials - French Webmaster Help Group
German Video Tutorials - German Webmaster Help Group - German Webmaster Blog

Enjoy!

Monday, 15 September 2014

Duplicate content and multiple site issues

Webmaster Level: All

Last month, I gave a talk at the Search Engine Strategies San Jose conference on Duplicate Content and Multiple Site Issues. For those who couldn't make it to the conference or would like a recap, we've reproduced the talk on the Google Webmaster Central YouTube Channel. Below you can see the short video based on the content presented at SES:



You can view the slides here:



Sunday, 14 September 2014

Recommendations for webmaster-friendly freehosts

Most of the recommendations we've made in the past are for individual webmasters running their own websites. We thought we'd offer up some best practices for websites that allow users to create their own websites or host users' data, like Blogger or Google Sites. This class of websites is often referred to as freehosts, although these recommendations apply to certain "non-free" providers as well.

  • Make sure your users can verify their website in website management suites such as Google's Webmaster Tools.

    Webmaster Tools provides your users with detailed reports about their website's visibility in Google. Before we can grant your users access, we need to verify that they own their particular websites. Verifying ownership of a site in Webmaster Tools can be done using a custom HTML file, a meta tag (see the sketch after this list), or seamless integration in your system via Google Services for Websites. Other website management suites such as Yahoo! Site Explorer and Bing Webmaster Tools may use similar verification methods; we recommend making sure your users can access each of these suites.

  • Choose a unique directory or hostname for each user.

    Webmaster Tools verifies websites based on a single URL, but assumes that users should be able to see data for all URLs 'beneath' this URL in the site URL hierarchy.  See our article on verifying subdomains and subdirectories for more information.  Beyond Webmaster Tools, many automated systems on the web--such as search engines or aggregators--expect websites to be structured in this way, and by doing so you'll be making it easier for those systems to find and organize your content.

  • Set useful and descriptive page titles.

    Let users set their own titles, or automatically set the titles of pages on your users' websites to be descriptive of the content on each page.  For example, all of the user page titles should not be "Blogger: Create your free blog".  Similarly, if a user's website has more than one page with different content, they should not all have the same title: "User XYZ's Homepage".

  • Allow the addition of meta tags to a page.

    Certain meta tags are reasonably useful for search engines and users may want to control them.  These include tags with the name attribute of "robots", "description", "googlebot", "slurp", or "msnbot". Click on the specific name attributes to learn more about what these tags do.

  • Allow your users to use third-party analytics packages such as Google Analytics.

    Google Analytics is free enterprise-class analytics software that can run on a website by just adding a snippet of JavaScript to the page.  If you don't want to allow users to add arbitrary JavaScript for security reasons, the Google Analytics code only changes by one simple ID.  If you let your users tell you their Google Analytics ID, you can set up the rest for them. Users get more value out of your service if they can understand their traffic better. For example, see Weebly's support page on adding Google Analytics. We recommend considering similar methods you can use for enabling access to other third-party applications.

  • Help your users move around.

    Tastes change.  Someone on your service might want to change their account name or even move to another site altogether.  Help them by allowing them to access their own data and by letting them tell search engines when they move part or all of their site via 301 redirects. Similarly, if users want to remove a page/site instead of moving it, please return a 404 HTTP response code so that search engines will know that the page/site is no longer around.  This allows users to use the urgent URL removal tool (if necessary), and makes sure that these pages drop out of search results as soon as possible.

  • Help search engines find the good content from your users.

    Search engines continue to crawl more and more of the web.  Help our crawlers find the best content across your site. Allow us to crawl users' content, including media like user-uploaded images.  Help us find users' content using XML Sitemaps.  Help us to steer clear of duplicate versions of the same content so we can find more of the good stuff your users are creating by creating only one URL for each piece of content when possible, and by specifying your canonical URLs when not.  If you're hosting blogs, create RSS feeds that we can discover in Google Blog Search.  If your site is down or showing errors, please return 5xx response codes.  This helps us avoid indexing lots of "We'll be right back" pages by letting crawlers know that the content is temporarily unavailable.

Can you think of any other best practices that you would recommend for sites that host users' data or pages?

Supporting Facebook Share and RDFa for videos

Have you ever wondered how to increase the chances of your videos appearing in Google's results? Over the last year, the Video Search team has been working hard to improve our index of video on the Web. Today, we're beginning the first in a series of posts to explain some best practices for sites hosting video content.

We previously talked about the importance of submitting a Video Sitemap or mRSS feed to Google and following Google's webmaster guidelines. However, we wanted to offer webmasters an additional tool, so today we're taking a page from the rich snippets playbook and announcing support for Facebook Share and Yahoo! SearchMonkey RDFa. Both of these markup formats allow you to specify information essential to video indexing, such as a video's title and description, within the HTML of a video page. While we've become smarter at discovering this information on our own, we'd certainly appreciate some hints directly from webmasters. Also, to maximize the chances that we find the markup on your video pages, you should make sure it appears in the HTML without the execution of JavaScript or Flash.

So, check out Facebook Share and RDFa and help Google find your videos!

Facebook Share:
<meta name="title" content="Baroo? - cute puppies" />
<meta name="description" content="The cutest canine head tilts on the Internet!" />
<link rel="image_src" href="http://example.com/thumbnail_preview.jpg" />
<link rel="video_src" href="http://example.com/video_object.swf?id=12345"/>
<meta name="video_height" content="296" />
<meta name="video_width" content="512" />
<meta name="video_type" content="application/x-shockwave-flash" />
RDFa (Yahoo! SearchMonkey):
<object width="512" height="296" rel="media:video"
resource="http://example.com/video_object.swf?id=12345"
xmlns:media="http://search.yahoo.com/searchmonkey/media/"
xmlns:dc="http://purl.org/dc/terms/">
<param name="movie" value="http://example.com/video_object.swf?id=12345" />
<embed src="http://example.com/video_object.swf?id=12345"
type="application/x-shockwave-flash" width="512" height="296"></embed>
<a rel="media:thumbnail" href="http://example.com/thumbnail_preview.jpg" />
<a rel="dc:license" href="http://example.com/terms_of_service.html" />
<span property="dc:description" content="Cute Overload defines Baroo? as: Dogspeak for 'Whut the...?'
Frequently accompanied by the Canine Tilt and/or wrinkled brow for enhanced effect." />
<span property="media:title" content="Baroo? - cute puppies" />
<span property="media:width" content="512" />
<span property="media:height" content="296" />
<span property="media:type" content="application/x-shockwave-flash" />
<span property="media:region" content="us" />
<span property="media:region" content="uk" />
<span property="media:duration" content="63" />
</object>

Friday, 12 September 2014

Demystifying the "duplicate content penalty"

Duplicate content. There's just something about it. We keep writing about it, and people keep asking about it. In particular, I still hear a lot of webmasters worrying about whether they may have a "duplicate content penalty."
Let's put this to bed once and for all, folks: There's no such thing as a "duplicate content penalty." At least, not in the way most people mean when they say that.
There are some penalties that are related to the idea of having the same content as another site—for example, if you're scraping content from other sites and republishing it, or if you republish content without adding any additional value. These tactics are clearly outlined (and discouraged) in our Webmaster Guidelines:
  • Don't create multiple pages, subdomains, or domains with substantially duplicate content.
  • Avoid... "cookie cutter" approaches such as affiliate programs with little or no original content.
  • If your site participates in an affiliate program, make sure that your site adds value. Provide unique and relevant content that gives users a reason to visit your site first.
(Note that while scraping content from others is discouraged, having others scrape you is a different story; check out this post if you're worried about being scraped.)
But most site owners whom I hear worrying about duplicate content aren't talking about scraping or domain farms; they're talking about things like having multiple URLs on the same domain that point to the same content. Like www.example.com/skates.asp?color=black&brand=riedell and www.example.com/skates.asp?brand=riedell&color=black. Having this type of duplicate content on your site can potentially affect your site's performance, but it doesn't cause penalties. From our article on duplicate content:
Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. If your site suffers from duplicate content issues, and you don't follow the advice listed above, we do a good job of choosing a version of the content to show in our search results.
This type of non-malicious duplication is fairly common, especially since many CMSs don't handle this well by default. So when people say that having this type of duplicate content can affect your site, it's not because you're likely to be penalized; it's simply due to the way that web sites and search engines work.
Most search engines strive for a certain level of variety; they want to show you ten different results on a search results page, not ten different URLs that all have the same content. To this end, Google tries to filter out duplicate documents so that users experience less redundancy. You can find details in this blog post, which states:
  1. When we detect duplicate content, such as through variations caused by URL parameters, we group the duplicate URLs into one cluster.
  2. We select what we think is the "best" URL to represent the cluster in search results.
  3. We then consolidate properties of the URLs in the cluster, such as link popularity, to the representative URL.
Here's how this could affect you as a webmaster:
  • In step 2, Google's idea of what the "best" URL is might not be the same as your idea. If you want to have control over whether www.example.com/skates.asp?color=black&brand=riedell or www.example.com/skates.asp?brand=riedell&color=black gets shown in our search results, you may want to take action to mitigate your duplication. One way of letting us know which URL you prefer is by including the preferred URL in your Sitemap (a brief sketch follows this list).
  • In step 3, if we aren't able to detect all the duplicates of a particular page, we won't be able to consolidate all of their properties. This may dilute the strength of that content's ranking signals by splitting them across multiple URLs.
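As a sketch of the first point, listing only the preferred variant in your Sitemap could look like this (the URL is the example from above; note that & must be escaped as &amp; in XML):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- List only the version of the URL you want to see in search results -->
  <url>
    <loc>http://www.example.com/skates.asp?color=black&amp;brand=riedell</loc>
  </url>
</urlset>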
In most cases Google does a good job of handling this type of duplication. However, you may also want to consider content that's being duplicated across domains. In particular, deciding to build a site whose purpose inherently involves content duplication is something you should think twice about if your business model is going to rely on search traffic, unless you can add a lot of additional value for users. For example, we sometimes hear from Amazon.com affiliates who are having a hard time ranking for content that originates solely from Amazon. Is this because Google wants to stop them from trying to sell Everyone Poops? No; it's because how the heck are they going to outrank Amazon if they're providing the exact same listing? Amazon has a lot of online business authority (most likely more than a typical Amazon affiliate site does), and the average Google search user probably wants the original information on Amazon, unless the affiliate site has added a significant amount of additional value.
Lastly, consider the effect that duplication can have on your site's bandwidth. Duplicated content can lead to inefficient crawling: when Googlebot discovers ten URLs on your site, it has to crawl each of those URLs before it knows whether they contain the same content (and thus before we can group them as described above). The more time and resources that Googlebot spends crawling duplicate content across multiple URLs, the less time it has to get to the rest of your content.
In summary: Having duplicate content can affect your site in a variety of ways; but unless you've been duplicating deliberately, it's unlikely that one of those ways will be a penalty. This means that:
  • You typically don't need to submit a reconsideration request when you're cleaning up innocently duplicated content.
  • If you're a webmaster of beginner-to-intermediate savviness, you probably don't need to put too much energy into worrying about duplicate content, since most search engines have ways of handling it.
  • You can help your fellow webmasters by not perpetuating the myth of duplicate content penalties! The remedies for duplicate content are entirely within your control. Here are some good places to start.

Huawei Ascend P7-L00 (Dual SIM with 4G/LTE support)

Introduction

Well, this time around we have a review of a device not based on a MediaTek platform. Huawei, the Chinese multinational networking and telecommunications equipment and services company, has recently released the dual SIM variant of its flagship, the Ascend P7-L00.

As a side note, Huawei reached third place in the smartphone sales rankings last year, with the top two spots occupied by Apple and Samsung. The company also faces competition from other Chinese manufacturers, including Lenovo, Xiaomi and ZTE.

Specifications

Chipset

Name: HiSilicon Kirin 910T
CPU: Quad-core 1.8 GHz ARM Cortex™-A9
GPU: ARM Mali-450 MP4
Instruction set: ARMv7

Software environment

Embedded OS: Android 4.4.2 (KitKat) with Emotion UI 2.3

Body

Dimensions (width x height x depth): 140 x 68.8 x 7.2 millimetres
Weight: 126 grams (battery included)
Colors: Black / White / Pink / Gold

Battery

Capacity: 2500 mAh

Memory

RAM capacity: 2 GB
ROM capacity: 16 GB
Expansion slot: microSD memory card, supporting up to 64 GB

Network support

Primary phone: GSM850, GSM900, GSM1800, GSM1900, UMTS850, UMTS900, UMTS1900, UMTS2100, FDD-LTE1800 (B3), FDD-LTE2100 (B1), TDD-LTE2600 (B41)
Secondary phone: GSM850, GSM900, GSM1800, GSM1900
Data links: GPRS, EDGE, UMTS, HSPA, HSPA+, FDD-LTE, TDD-LTE

Display

Type: IPS-LCD capacitive touchscreen
Size: 5.0 inches, FHD resolution (1080 x 1920 pixels)
Protection: Corning Gorilla Glass

Camera

Main (rear): 13.0 megapixels with autofocus and single LED flash
Secondary (front): 8.0 megapixels

Interfaces

Bluetooth (802.15): Bluetooth 4.0 + Enhanced Data Rate
Wireless LAN / Wi-Fi (802.11): IEEE 802.11b/g/n
USB: USB 2.0 Client, Hi-Speed (480 Mbit/s)
Connector: USB Series Micro-B (Micro-USB)

Satellite navigation

Built-in GPS module: Included
GPS antenna: Internal
Complementary GPS services: A-GPS (Assisted GPS), GLONASS

Additional features

Sensors: Gravity, Proximity, Light, Gyroscope and Magnetic field sensors
Analog radio: FM radio (87.5-108 MHz) with RDS radio receiver
Others: Dedicated LED for notification of missed calls / new messages; Dual SIM Full Active

Design and construction

The Huawei P7 follows the exact same design as its predecessor, the Huawei P6, apart from the screen dimensions. It's thin, with a metallic banded edge that's extremely reminiscent of the sort of design feature Apple introduced to the world with the iPhone 4.



Once you take the phone out of its high-end cardboard box, the wow factor is there. The back is definitely quite pretty. One piece of glass covers a spangly mesh-effect back, where a Huawei logo proudly sits.

However, the glass back is very slippery. Place the phone on any surface that's not totally flat and it will slide off, as there's no protruding plastic or rubber surround to give the glassy rear any grip.



The USB connector is placed at the bottom of the device, making it much easier to hold the thing in front of your face when it's plugged in and charging. The headphone socket sits right at the top of the phone.






There are no soft buttons on the bezel anymore; the phone takes advantage of Android's on-screen navigation bar. The volume rocker is placed on the right edge, right above the unusual power button, which is rounded.

On the back side, the 13-megapixel autofocus camera, powered by a sensor supplied by Sony, is placed right above a single LED flash.


There's a bespoke little pin in the box, for users to poke-eject the micro SIM and nano SIM / micro SD card trays from their docks.



If one values storage space over connectivity, this phone can be turned into a real single SIM device by placing a memory card in the second slot instead of the nano SIM card.

Key features

This phone sports a 5-inch display running at the full HD resolution of 1080 x 1920, with an in-house (non-Qualcomm) 1.8 GHz quad-core processor, 2 GB of RAM and 4G / LTE support for use with speedy SIM cards.

One of its major features is the unprecedented 8 MP front-facing camera, which is a huge selling point for those who enjoy taking selfies, or are just after a more impressive Skype video chat experience. The images taken by this secondary camera completely shame the image quality produced by the main cameras of plenty of other smartphones.

The 1080p screen is another good feature. At maximum brightness, it has more than enough contrast and brightness to make text readable outdoors, even in bright sunlight, and videos look great.

Functionality

The phone comes pre-installed with Android 4.4.2. On top of that, Huawei added its own customizations, sticking its Emotion UI, a system that removes the standard Android app drawer and replaces it with an iOS-style emphasis on the home screen.

Here are some screenshots and details of the most important features with a special detail of the dual SIM functionality.


At first, those who are used to the pure Android user interface may be shocked by the lack of the app drawer, but it only takes a few days to get used to it and to actually like it. A couple of folders can be used to hide unwanted and unremovable apps, plus it's quite nice to know there's only one place to look for all your stuff.


The pull-down notifications tab is the other core pillar of Android, and here Huawei has fiddled just a little. There is a collection of five toggles (which can be customized) that sits at the top when you first pull down the notifications blind, with another pull on the button area revealing more controls - and useful quick access to a screen brightness slider. A separate menu lets you rearrange the order in which these appear.


The user interface was modified by Huawei and made to be very iOS-like.



The dialer interface has two buttons for calling from either SIM1 or SIM2. It supports a smart dialing feature which works just great.



Under the dual SIM management menu, several configurations can be set. The user can enable / disable each SIM card, as shown in the notification bar (behind the network strength bars). In the same menu, the user can set the default SIM card for establishing data connections and for making calls / sending messages.



The phone obviously supports the tethering and portable hotspot feature, letting the user share the mobile data connection through USB or over a wireless (Wi-Fi or even Bluetooth) network.

For some unexplained reason, the phone doesn't include native support for storage encryption, though this can be worked around with third-party apps.

One last thing, very important and not to be left unmentioned, is that the phone has two separate radios, which means both SIM cards are always active, even during phone calls. Unlike other dual SIM smartphones, where one of the SIM cards becomes unreachable when the other has an established call, with this one you'll always be reachable.

Final thoughts

The Huawei Ascend P7 is thin, it's smooth and reliable for the most part, and it has a camera that's about as good as you can reasonably expect from a phone. If you're looking for a reliable dual SIM smartphone from a well-known brand, then go for this one.

So, last but not least, and in order to avoid questions regarding where to buy it, just visit the sponsored shop (etotalk.com). The Huawei Ascend P7-L00 is available for 389 USD.


Highs:
  • Incredibly slim and light
  • Fantastic front camera
Lows:
  • Lack of support for native storage encryption