Friday, 31 July 2015

Introducing the Structured Data Dashboard

Webmaster level: All

Structured data is becoming an increasingly important part of the web ecosystem. Google makes use of structured data in a number of ways including rich snippets which allow websites to highlight specific types of content in search results. Websites participate by marking up their content using industry-standard formats and schemas.

To provide webmasters with greater visibility into the structured data that Google knows about for their website, today we’re introducing a new feature in Webmaster Tools: the Structured Data Dashboard. The Structured Data Dashboard has three views: site-level, item type-level, and page-level.

Site-level view
At the top level, the Structured Data Dashboard, which is under Optimization, aggregates this data by root item type and vocabulary schema. A root item type is an item that is not an attribute of another item on the same page. For example, the site below has about 2 million schema.org annotations for Books (“http://schema.org/Book”).
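For illustration, here’s a minimal sketch of what one such Book annotation might look like in a page’s HTML, using schema.org microdata (the title and author are invented for this example):

```html
<div itemscope itemtype="http://schema.org/Book">
  <h2 itemprop="name">An Example Book Title</h2>
  <p>by <span itemprop="author">Jane Example</span></p>
</div>
```

Each element carrying an itemprop attribute becomes an attribute of the Book item that the dashboard counts and aggregates.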


Itemtype-level view
It also provides per-page details for each item type, as seen below:


Google parses and stores a fixed number of pages for each site and item type, along with all of their structured data markup. Pages are stored in reverse chronological order by crawl time, so the most recently crawled pages appear first. For certain item types we also provide specialized preview columns, as in the example below (“Name” is specific to schema.org Product).


The default sort order makes it easy to inspect the most recently added structured data.

Page-level view
Last but not least, we have a details page showing all attributes of every item type on the given page (as well as a link to the Rich Snippet testing tool for the page in question).


Webmasters can use the Structured Data Dashboard to verify that Google is picking up new markup and to detect problems with existing markup, for example by monitoring changes in instance counts during site redesigns.

Supplemental goes mainstream



When Google originally introduced Supplemental Results in 2003, our main web index had billions of web pages. The supplemental index made it possible to index even more web pages and, just like our main web index, make this content available when generating relevant search results for user queries. This was especially useful for queries that did not return many results from the main web index, since for these the supplemental index allowed us to query even more web pages. Because we're able to place fewer constraints on sites we crawl for the supplemental index, web pages that are not in the main web index can be included in the supplemental index. These are often pages with lower PageRank or those with more complex URLs. Thus the supplemental index (read more - and here's Matt's talk about it on video) serves a very important purpose: to index as much of the relevant content that we crawl as possible.

The changes we make must focus on improving the search experience for our users. Since 2006, we've completely overhauled the system that crawls and indexes supplemental results. The current system provides deeper and more continuous indexing. Additionally, we are indexing URLs with more parameters and are continuing to place fewer restrictions on the sites we crawl. As a result, Supplemental Results are fresher and more comprehensive than ever. We're also working towards showing more Supplemental Results by ensuring that every query is able to search the supplemental index, and expect to roll this out over the course of the summer.

The distinction between the main and the supplemental index is therefore continuing to narrow. Given all the progress that we've been able to make so far, and thinking ahead to future improvements, we've decided to stop labeling these URLs as "Supplemental Results." Of course, you will continue to benefit from Google's supplemental index being deeper and fresher.

Monday, 27 July 2015

New notifications about inbound links

Webmaster level: Advanced

Lots of site owners use our webmaster console to see how their site is doing in Google. Last week we began sending new messages to sites with a pattern of unnatural links pointing to them, and I wanted to give more context about these new messages.

Original Link Messages 

First, let's talk about the original link messages that we've been sending out for months. When we see unnatural links pointing to a site, there are different ways we can respond. In many severe cases, we reduce our trust in the entire site. For example, that can happen when we believe a site has been engaging in a pretty widespread pattern of link spam over a long period of time. If your site is notified for these unnatural links, we recommend removing as many of the spammy or low-quality links as you possibly can and then submitting a reconsideration request for your site.

In a few situations, we have heard about directories or blog networks that won't take links down. If a website tries to charge you to put links up and to take links down, feel free to let us know about that, either in your reconsideration request or by mentioning it on our webmaster forum or in a separate spam report. We have taken action on several such sites, because they often turn out to be doing link spamming themselves.

New Link Messages 

In less severe cases, we sometimes target specific spammy or artificial links created as part of a link scheme and distrust only those links, rather than taking action on a site’s overall ranking. The new messages make it clear that we are taking "targeted action on the unnatural links instead of your site as a whole." The new messages also lack the yellow exclamation mark that other messages have, which tries to convey that we're addressing a situation that is not as severe as the previous "we are losing trust in your entire site" messages.

How serious are these new link messages? 

These new messages are worth your attention. Fundamentally, they mean we're distrusting some links to your site. We often take this action when we see a site that is mostly good but might have some spammy or artificial links pointing to it (widgetbait, paid links, blog spam, guestbook spam, excessive article directory submissions, excessive link exchanges, other types of linkspam, etc.). So while the site's overall rankings might not drop directly, the site might not be able to rank as well for some phrases. I wouldn't classify these messages as purely advisory, as something to be ignored, or as relevant only to innocent sites.

On the other hand, I don't want site owners to panic. We do use this message some of the time for innocent sites where other people are pointing spammy, keyword-rich anchortext links at the site to try to make it rank for queries like [buy viagra].

Example scenario: widget links 

A fair number of site owners emailed me after receiving one of the new messages, and I think it might be helpful if I paraphrased some of their situations to give you an idea of what it might mean if you get one of these messages.

The first example is widget links. An otherwise white-hat site emailed me about the message. Here's what I wrote back, with the identifying details removed:

"Looking into the very specific action that we took, I think we did the right thing. Take URL1 and URL2 for example. These pages are using your EXAMPLE1 widgets, but the pages include keyword-rich anchortext pointing to your site's url. One widget has the link ANCHORTEXT1 and the other has ANCHORTEXT2. 

If you do a search for [widgetbait matt cutts] you'll find tons of stories where I discourage people from putting keyword-rich anchortext into their widgets; see http://www.stonetemple.com/articles/interview-matt-cutts-061608.shtml for example. So this message is a way to tell you that not only are those links in your widget not working, they're probably keeping that page from ranking for the phrases that you're using." 

Example scenario: paid links 

The next example is paid links. I wrote this email to someone:

"I wouldn't recommend that Company X ignore this message. For example, check out SPAMMY_BLOG_POST_URL. That's a link from a very spammy website, and it calls into question the linkbuilding techniques that Company X has been using (we also saw a bunch of links due to widgets). These sorts of links are not helping Company X, and it would be worth their time to review how and why they started gathering links like this." 

I also wrote to another link building SEO who got this message pointing out that the SEO was getting links from a directory that appeared to offer only paid links that pass PageRank, and so we weren't trusting links like that.

Here's a final example of paid links. I emailed about one company's situation as follows:

"Company Y is getting this message because we see a long record of buying paid links that pass PageRank. In particular, we see a lot of low-quality 'sponsored posts' with keyword-rich anchortext where the links pass PageRank. The net effect is that we distrust a lot of links to this site. Here are a couple examples: URL1 and URL2. Bear in mind that we have more examples of these paid posts, but these two examples give a flavor of the sort of thing that should really be resolved. My recommendation would be to get these sort of paid posts taken down, and then Company Y could submit a reconsideration request. Otherwise, we'll continue to distrust quite a few links to the site." 

Example scenario: reputation management 

In some cases we're ignoring links to a site where the site itself didn't violate our guidelines. A good example of that is reputation management. We had two groups write in; one was a large news website, while the other was a not-for-profit publisher. Both had gotten the new link message. In one case, it appeared that a "reputation management" firm was using spammy links to try to push up positive articles on the news site, and we were ignoring those links to the news site. In the other case, someone was trying to manipulate the search results for a person's name by buying links on a well-known paid text link ad network. Likewise, we were just ignoring those specific links, and the not-for-profit publisher didn't need to take any action.

What should I do if I get the new link message? 

We recently launched the ability to download backlinks to your site sorted by date. If you get this new link message, you may want to check your most recent links to spot anything unusual going on. If you discover that someone in your company has been doing widgetbait, paid links, or serious linkspam, it's worth cleaning that up and submitting a reconsideration request. We're also looking at some ways to provide more concrete examples to make these messages more actionable and to help narrow down where to look when you get one.

Just to give you some context, less than 20,000 domains received these new messages—that's less than one-tenth the number of messages we send in a typical month—and that's only because we sent out messages retroactively to any site where we had distrusted some of the site's backlinks. Going forward, based on our current level of action, on average only about 10 sites a day will receive this message.

Summing up 

I hope this post and some of the examples above will help to convey the nuances of this new message. If you get one of these new messages, it's not a cause for panic, but neither should you completely ignore it. The message says that the current incident isn't affecting our opinion of the entire website, but it is affecting our opinion of some links to the website, and the site might not rank as well for some phrases as a result.

This message reflects an issue of moderate severity, and we're trying to find the right way to alert people that their site may have a potential issue (and it's worth some investigation) without overly stressing out site owners either. But we wanted to take this extra step toward more transparency now so that we can let site owners know when they might want to take a closer look at their current links.

Sunday, 26 July 2015

New Message Center notifications for detecting an increase in Crawl Errors

Webmaster Level: All

When Googlebot crawls your site, it’s expected that most URLs will return a 200 response code, some will return a 404 response, some will be disallowed by robots.txt, and so on. Whenever we’re unable to reach your content, we show this information in the Crawl errors section of Webmaster Tools (even though it might be intentional and not actually an error). Continuing with our effort to provide useful and actionable information to webmasters, we're now sending SiteNotice messages when we detect a significant increase in the number of crawl errors impacting a specific site. These notifications are meant to alert you of potential crawl-related issues and provide a sample set of URLs for diagnosing and fixing them.

A SiteNotice for a spike in the number of unreachable URLs, for example, will look like this:


We hope you find SiteNotices helpful for discovering and dealing with issues that, if left unattended, could negatively affect your crawl coverage. You’ll only receive these notifications if you’ve verified your site in Webmaster Tools and we detect significant changes to the number of crawl errors we encounter on your site. And if you don't want to miss out on any of these important messages, you can use the email forwarding feature to receive these alerts in your inbox.

If you have any questions, please post them in our Webmaster Help Forum or leave your comments below.

Friday, 24 July 2015

Behold Google index secrets, revealed!

Webmaster level: All

Since Googlebot was born, webmasters around the world have been asking one question: Google, oh, Google, are my pages in the index? Now is the time to answer that question using the new Index Status feature in Webmaster Tools. Whether one or one million, Index Status will show you how many pages from your site have been included in Google’s index.

Index Status is under the Health menu. After clicking on it you’ll see a graph like the following:



It shows how many pages are currently indexed. The legend shows the latest count and the graph shows up to one year of data.

If you see a steadily increasing number of indexed pages, congratulations! This should be enough to confirm that new content on your site is being discovered, crawled and indexed by Google.

However, some of you may find issues that require looking a little bit deeper. That’s why we added an Advanced tab to the feature. You can access it by clicking on the button at the top, and it will look like this:



The advanced section will show not only totals of indexed pages, but also the cumulative number of pages crawled, the number of pages that we know about which are not crawled because they are blocked by robots.txt, and also the number of pages that were not selected for inclusion in our results.

Notice that the counts are always totals. So, for example, if on June 17th the count for indexed pages is 92, that means that there are a total of 92 pages indexed at this point in time, not that 92 pages were added to the index on that day only. In particular for sites with a long history, the count of pages crawled may be very big in comparison with the number of pages indexed.

All this data can be used to identify and debug a variety of indexing-related problems. For example, if some of your content no longer appears on Google and you notice a sudden drop in the graph of pages indexed, that may indicate that you introduced a site-wide error, such as a stray “noindex” robots meta tag, and now Google isn’t including your content in search results.

Another example: if you change the URL structure of your site and don’t follow our recommendations for moving your site, you may see a jump in the count of “Not selected”. Fixing the redirects or rel=”canonical” tags should help get better indexing coverage.
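To make the two scenarios above concrete, the markup involved typically looks something like this (the URL is a placeholder for illustration):

```html
<!-- A noindex tag accidentally left in a site-wide template will
     cause the indexed-page count to drop: -->
<meta name="robots" content="noindex">

<!-- After a URL change, pointing duplicate or moved pages at the
     preferred URL helps reduce the "Not selected" count: -->
<link rel="canonical" href="http://www.example.com/preferred-page">
```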

We hope that Index Status will bring more transparency into Google’s index selection process and help you identify and fix indexing problems with your sites. And if you have questions, don’t hesitate to ask in our Help Forum.



Saturday, 18 July 2015

Message Center: Let us communicate with you about your site


Today we're launching our Message Center, a new way for webmasters to receive personalized information from Google in our webmaster console. Should we need to contact you, you'll see a notification in your Webmaster Tools dashboard.


Initially the messages will refer to search quality issues, but over time we'll use the Message Center as a communication channel for more types of information. Here's an example: informing the site owner about hidden text, a violation in our webmaster guidelines.


For our webmasters outside the U.S., we’re also pleased to tell you that Message Center is capable of providing information in all supported Webmaster Tools languages (French, Italian, German, Spanish, Danish, Dutch, Swedish, Russian, Chinese-Simplified, Chinese-Traditional, Korean, Japanese, etc.), across all countries.

Right now the number of sites we’re contacting is small, but we hope to expand this program over time. We’re also really happy that the Message Center lets us communicate with webmasters in an authenticated way. As time goes on, we’ll keep looking for even more ways to improve communication with site owners, but right now, why not claim your site in our webmaster tools so that we can give you a heads-up of any issues that we see?

Thursday, 16 July 2015

On web semantics

Webmaster level: All
In a web development context, semantics refers to semantic markup: markup used according to its meaning and purpose.
Markup used according to its purpose means using heading elements (for instance, h1 to h6) to mark up headings, paragraph elements (p) for paragraphs, lists (ul, ol, dl, also datalist or menu) for lists, tables for data tables, and so on.
Stating the obvious became necessary in the old days, when the Web consisted of only a few web sites and authors used tables to code entire sites, table cells or paragraphs for headings, and thought about other creative ways to achieve the layout they wanted. (Admittedly, these authors had fewer instruments at their disposal than authors have today. There were times when coding a three column layout was literally impossible without using tables or images.)
Even today, though, authors are not always certain which HTML element to use for a given functional unit in their HTML page, and “living” specs like HTML 5 require authors to keep an eye on which elements will be available going forward to mark up what otherwise calls for “meaningless” fallback elements like div or span.
To know what elements HTML offers, and what meaning these elements have, it’s necessary to consult the HTML specs. There are indices—covering all HTML specs and elements—that make it a bit simpler to look up and find out the meaning of an element. However, in many cases it may be necessary to check what the HTML spec says.
For example, take the code element:
The code element represents a fragment of computer code. This could be an XML element name, a filename, a computer program, or any other string that a computer would recognize.
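Based on that definition, usage might look like the following (a made-up sketch, not an example from the spec itself):

```html
<p>Wrap each image caption in a <code>figcaption</code> element,
   then style it with the <code>.caption</code> class.</p>
```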

Author-controlled semantics

HTML elements carry meaning as defined by the HTML specs, yet ID and class names can bear meaning too. ID and class names, just like microdata, are typically under author control, the only exception being microformats. (We will not cover microdata or microformats in this article.)
ID and class names give authors a lot of freedom to work with HTML elements. A few basic rules of thumb, such as choosing functional rather than presentational names, help make sure this freedom doesn’t turn into problems.

Advantages of using semantic markup

Using markup according to how it’s meant to be used, as well as modest use of functional ID and class names, has several advantages:
  • It’s the professional thing to do.
  • It’s more accessible.
  • It’s more maintainable.

Special cases

“Neutral” elements, elements with ambiguous meaning, and presentational elements constitute special cases.
div and span offer a “generic mechanism for adding structure to documents.” They can be used whenever there is no other element available that matches what the contents in question represent.
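For instance, when no specific element matches the content, a grouping like the following is a reasonable use of div and span (the class names here are invented):

```html
<div class="teaser">
  <p>Read our <span class="highlight">latest</span> article on semantics.</p>
</div>
```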
In the past a lot of confusion was caused by the b, strong, i, and em elements. Authors cursed b and i for being presentational, and typically suggested a 1:1 replacement with strong and em. Not to stir up the past, here’s what HTML 5 says, granting all four elements a raison d’être:
b: “a span of text to be stylistically offset from the normal prose without conveying any extra importance, such as key words in a document abstract, product names in a review, or other spans of text whose typical typographic presentation is boldened.” Example: <p>The <b>frobonitor</b> and <b>barbinator</b> components are fried.</p>
strong: “strong importance for its contents.” Example: <p><strong>Warning.</strong> This dungeon is dangerous.</p>
i: “a span of text in an alternate voice or mood, or otherwise offset from the normal prose, such as a taxonomic designation, a technical term, an idiomatic phrase from another language, a thought, a ship name, or some other prose whose typical typographic presentation is italicized.” Example: <p>The term <i>prose content</i> is defined above.</p>
em: “stress emphasis of its contents.” Example: <p><em>Cats</em> are cute animals.</p>
Last but not least, there are truly presentational elements. User agents (browsers) will support these elements forever, but they shouldn’t be used anymore: presentational markup is not maintainable, and presentation should be handled by style sheets instead. Some popular ones are:
  • center
  • font
  • s
  • u

How to tell whether you’re on track

A quick and dirty way to check the semantics of your page and understand how it might be interpreted by a screen reader is to disable CSS, for example using the Web Developer Toolbar extension available for Chrome and Firefox. This only identifies issues around the use of CSS to convey meaning, but can still be helpful.
There are also tools like W3C’s semantic data extractor that provide cues on the meaningfulness of your HTML code.
Other methods range from peer reviews (coding best practices) to user testing (accessibility).

Do’s and Don’ts

Don’t: <p class="heading">foo</p>
Do: <h1>foo</h1>
Reason: For headings there are heading elements.

Don’t: <p><font size="2">bar</font></p>
Do: <p>bar</p>, with a style sheet rule such as p { font-size: 1em; }
Reason: Presentational markup is expensive to maintain.

Don’t:
<table>
  <tr>
    <td class="heading">baz</td>
  </tr>
  <tr>
    <td>scribble</td>
  </tr>
</table>
Do:
<h1>baz</h1>
<p>scribble</p>
Reason: Use table elements for tabular data, not for headings and paragraphs.

Don’t:
<div class="newrow">foo</div>
<div>1</div>
<div class="newrow">bar</div>
<div>2</div>
Do:
<table>
  <tr>
    <th>foo</th>
    <td>1</td>
  </tr>
  <tr>
    <th>bar</th>
    <td>2</td>
  </tr>
</table>
Reason: Use table elements for tabular data.

Don’t: foo bar.<br><br>baz scribble.
Do:
<p>foo bar.</p>
<p>baz scribble.</p>
Reason: Denote paragraphs with paragraph elements, not line breaks.

Video Sitemaps 101: Making your videos searchable

Webmaster Level: All

We know that some of you, or your clients or colleagues, may be new to online video publishing. To make it easier for everyone to understand video indexing and Video Sitemaps, we’ve created a video -- narrated by Nelson Lee, Video Search Product Manager -- that explains everything in basic terms:



Also, last month we wrote about some best practices for getting video content indexed on Google. Today, to help beginners better understand the whys and hows of implementing a Video Sitemap, we added a starting page to the information on Video Sitemaps in the Webmaster Help Center. Please take a look and share your thoughts.
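For readers who prefer to see markup right away, a minimal Video Sitemap entry generally looks like the sketch below; see the Help Center page mentioned above for the full set of required and optional tags (all URLs and text here are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>http://www.example.com/videos/grilling-steaks.html</loc>
    <video:video>
      <video:thumbnail_loc>http://www.example.com/thumbs/123.jpg</video:thumbnail_loc>
      <video:title>Grilling steaks for summer</video:title>
      <video:description>A short how-to on grilling steaks.</video:description>
      <video:content_loc>http://www.example.com/video123.flv</video:content_loc>
    </video:video>
  </url>
</urlset>
```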

Monday, 13 July 2015

New warnings feedback


Given helpful suggestions from our discussion group, we've improved feedback for sitemaps in Webmaster Tools. Now, minor problems in a sitemap will be reported as "warnings," and will appear instead of, or in addition to, more serious "errors." (Previously all problems were listed as errors.) Warnings allow us to provide feedback on portions of your sitemap that may be confusing or inaccurate, while saving the real "error" alarm for problems that make your sitemap completely unreadable. We hope the additional information makes it even easier to share your sitemaps with Google.

The new set of warnings includes many problems that we had previously classified as errors, including the "incorrect namespace" and "invalid date" examples shown in the screenshot above. We also crawl a sample of the URLs listed in your sitemap and report warnings if Googlebot runs into any trouble with them. These warnings might suggest a widespread problem with your site that warrants further investigation, such as a stale sitemap or a misconfigured robots.txt file.
Please let us know how you like this new feedback. Tell us what you think via the comments below, or in the discussion group. We also appreciate suggestions for additional warnings that you would find useful.

Sunday, 12 July 2015

A reminder about manipulative or deceptive behavior

Webmaster level: All

Our quality guidelines prohibit manipulative or deceptive behavior, and this stance has remained unchanged since the guidelines were first published over a decade ago. Recently, we’ve seen some user complaints about a deceptive technique which inserts new pages into users’ browsing histories. When users click the "back" button on their browser, they land on a new page that they've never visited before. Users coming from a search results page may think that they’re going back to their search results. Instead, they’re taken to a page that looks similar, but is actually entirely advertisements:

list of advertisements


To protect our users, we may take action on, including removal of, sites which violate our quality guidelines, including for inserting deceptive or manipulative pages into a user's browser history. As always, if you believe your site has been impacted by a manual spam action and is no longer violating our guidelines, you can let us know by requesting reconsideration.

New Crawl Error alerts from Webmaster Tools

Webmaster level: All

Today we’re rolling out Crawl Error alerts to help keep you informed of the state of your site.

Since Googlebot regularly visits your site, we know when your site exhibits connectivity issues or suddenly spikes in pages returning HTTP error response codes (e.g. 404 File Not Found, 403 Forbidden, 503 Service Unavailable, etc). If your site is timing out or is exhibiting systemic errors when accessed by Googlebot, other visitors to your site might be having the same problem!

When we see such errors, we may send alerts, in the form of messages in the Webmaster Tools Message Center, to let you know what we’ve detected. Hopefully, given this increased communication, you can fix potential issues that might otherwise impact your site’s visitors or your site’s presence in search.

As we discussed in our blog post announcing the new Webmaster Tools Crawl Errors feature, we divide crawl errors into two types: Site Errors and URL Errors.

Site Error alerts for major site-wide problems

Site Errors represent an inability to connect to your site: they are systemic issues rather than problems with specific pages. Here are some issues that might cause Site Errors:
  • Your DNS server is down or misconfigured.
  • Your web server itself is firewalled off.
  • Your web server is refusing connections from Googlebot.
  • Your web server is overloaded, or down.
  • Your site’s robots.txt is inaccessible.
These errors are global to a site, and in theory should never occur for a well-operating site (and don’t occur for the large majority of the sites we crawl). If Googlebot detects any appreciable number of these Site Errors, regardless of the size of your site, we’ll try to notify you in the form of a message in the Message Center:

Example of a Site Error alert
The alert provides the number of errors Googlebot encountered crawling your site, the overall crawl error connection rate for your site, a link to the appropriate section of Webmaster Tools to examine the data more closely, and suggestions as to how to fix the problem.

If your site shows a 100% error rate in one of these categories, it likely means that your site is either down or misconfigured in some way. If your site has an error rate less than 100% in any of these categories, it could just indicate a transient condition, but it could also mean that your site is overloaded or improperly configured. You may want to investigate these issues further, or ask about them on our forum.

We may alert you even if the overall error rate is very low — in our experience a well configured site shouldn’t have any errors in these categories.

URL Error anomaly alerts for potentially less critical issues

Whereas any appreciable number of Site Errors could indicate that your site is misconfigured, overloaded, or simply out of service, URL Errors (pages that return a non-200 HTTP code, or incorrectly return an HTTP 200 code in the case of soft 404 errors) may occur on any well-configured site. Because different sites have different numbers of pages and different numbers of external links, a count of errors that indicates a serious problem for a small site might be entirely normal for a large site.

That’s why for URL Errors we only send alerts when we detect a large spike in the number of errors for any of the five categories of errors (Server error, Soft 404, Access denied, Not found or Not followed). For example, if your site routinely has 100 pages with 404 errors, we won’t alert you if that number fluctuates minimally. However we might notify you when that count reaches a much higher number, say 500 or 1,000. Keep in mind that seeing 404 errors is not always bad, and can be a natural part of a healthy website (see our previous blog post: Do 404s hurt my site?).

A large spike in error count could be because something has changed on your site — perhaps a reconfiguration has changed the permissions for a section of your site, or a new version of a script is crashing regularly, or someone accidentally moved or deleted an entire directory, or a reorganization of your site causes external links to no longer work. It could also just be a transient spike, or could be because of external causes (someone has linked to non-existent pages), so there might not even be a problem; but when we see an unusually large number of errors for your site, we’ll let you know so you can investigate:

Example of a URL Error anomaly alert
The alert describes the category of web errors for which we’ve detected a spike, gives a link to the appropriate section of Webmaster Tools so that you can see what pages we think are problematic, and offers troubleshooting suggestions.

Enable Message forwarding to send alerts to your inbox

We know you’re busy, and that routinely checking Webmaster Tools just to check for new alerts might be something you forget to do. Consider turning on Message forwarding. We’ll send any Webmaster Tools messages to the email address of your choice.

Let us know what you think, and if you have any comments or suggestions on our new alerts please visit our forum.

Sunday, 5 July 2015

Best uses of Flash



We occasionally get questions on the Webmaster Help Group about how webmasters should work with Adobe Flash. I thought it would be worthwhile to write a few words about the search considerations designers should think about when building a Flash-heavy site.

As many of you already know, Flash is inherently a visual medium, and Googlebot doesn't have eyes. Googlebot can typically read Flash files and extract the text and links in them, but the structure and context are missing. Moreover, textual contents are sometimes stored in Flash as graphics, and since Googlebot doesn't currently have the algorithmic eyes needed to read these graphics, these important keywords can be missed entirely. All of this means that even if your Flash content is in our index, it might be missing some text, content, or links. Worse, while Googlebot can understand some Flash files, not all Internet spiders can.

So what's an honest web designer to do? The only hard and fast rule is to show Googlebot the exact same thing as your users. If you don't, your site risks appearing suspicious to our search algorithms. This simple rule covers a lot of cases including cloaking, JavaScript redirects, hidden text, and doorway pages. And our engineers have gathered a few more practical suggestions:

  1. Try to use Flash only where it is needed. Many rich media sites, such as Google's YouTube, use Flash for rich media but rely on HTML for content and navigation. You can too, by limiting Flash to on-page accents and rich media rather than content and navigation. In addition to making your site Googlebot-friendly, this makes your site accessible to a larger audience, including, for example, blind people using screen readers, users of old or non-standard browsers, and those on limited low-bandwidth connections such as a cell phone or PDA. As a bonus, your visitors can use bookmarks effectively and can email links to your pages to their friends.
  2. sIFR: Some websites use Flash to force the browser to display headers, pull quotes, or other textual elements in a font that the user may not have installed on their computer. A technique like sIFR still lets non-Flash readers read a page, since the content/navigation is actually in the HTML -- it's just displayed by an embedded Flash object.
  3. Non-Flash Versions: A common way that we see Flash used is as a front page "splash screen" where the root URL of a website has a Flash intro that links to HTML content deeper into the site. In this case, make sure there is a regular HTML link on that front page to a non-Flash page where a user can navigate throughout your site without the need for Flash.
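The pattern behind suggestions 2 and 3 is the same: keep the real text and links in plain HTML, and let Flash-capable browsers layer the movie on top. A standard way to do this is to put the HTML content inside the `<object>` element as fallback content. The helper below is a hypothetical sketch that builds such markup as a string; the function name and parameters are illustrative, not part of any real API.

```javascript
// Hypothetical helper: embed a Flash movie while keeping the real text
// and links in plain HTML as fallback content inside <object>. Browsers
// without Flash (and crawlers) see the HTML; others see the movie.
// Both audiences get the same content, which is the hard and fast rule.
function flashWithHtmlFallback(swfUrl, fallbackHtml, width, height) {
  return (
    `<object type="application/x-shockwave-flash" data="${swfUrl}" ` +
    `width="${width}" height="${height}">` +
    `<param name="movie" value="${swfUrl}">` +
    fallbackHtml +
    `</object>`
  );
}

// Example: a splash movie whose fallback carries the same heading and
// navigation link in regular HTML (URLs here are made up).
const markup = flashWithHtmlFallback(
  "/intro.swf",
  '<h1>Acme Widgets</h1><a href="/products">Browse our products</a>',
  640,
  480
);
```

Because the fallback lives inside the `<object>` element itself, there is no separate "crawler version" of the page, so there is nothing that could look like cloaking.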

If you have other ideas that don't violate these guidelines, feel free to ask about them in the Webmaster Help Group under Crawling, Indexing, and Ranking. The many knowledgeable webmasters there, along with me and a cadre of other Googlers, will do our best to clear up any confusion.

Update: See our additional blog posts about Flash Indexing at Google.

Thursday, 2 July 2015

How to create valuable startpages

In the Dutch market, the concept of so-called 'startpages' is hugely popular. In this article we will give some background information on them, and give those of you who may be startpage webmasters a few tips on how to create unique and informative startpages.

What's a startpage?

Basically, it's a webpage with a lot of links about a specific topic. The startpages are hosted on a startpage domain, and each separate startpage is maintained by an individual webmaster. The links on startpages are usually ordered by categories related to the topic of the page. Besides hyperlinks, startpages often contain text, animations and pictures. Startpages are largely unique to the Dutch market, offering novice users a simple interface for creating their own web portals and a distinctive approach to user-generated content.

The whole startpage concept began in September 1998 with the launch of Startpagina.nl, which was set up to be an online linkbook for the inexperienced Internet user. Since then, Startpagina.nl has become a huge success, mainly because an enormous number of volunteers created and maintained the different startpages covering lots of interesting and diverse topics. Since Startpagina.nl emerged, lots of other startpage domains have been created, and are still being created today. The fact that there are still new startpage domains appearing and that the number of individual startpages on these domains is still increasing shows the continued popularity of startpages in the Dutch market.

Creating useful startpages

As a search engine, we love to have useful and diverse pages showing up in the search results we present to our users. We thought it would be a good idea to highlight some of the best practices we've seen in creating value-added startpages.

  1. Create your startpage for users, and not for search engines. This involves making sure that all your text on the page is visible to users, and writing full sentences as descriptions instead of just keywords.
  2. Try to deliver unique, informative and on-topic content. The structure of startpages is pretty straightforward and does not leave much room for variation. However, you can make a difference. Try to find a topic you know a lot about that has not been fully covered yet. Create good categories related to your topic and give every category a relevant title. Then find links related to those categories and give every link relevant anchor text. For example, instead of naming your links 'link1', 'link2' et cetera, choose names that make clear where each link points. You can also write a short description for every category.
  3. Don't create startpages purely out of commercial intent or for the sole purpose of exchanging links. Of course there is nothing wrong with trying to monetize your startpage, but a page with only banners and affiliate links is not a good user experience and is therefore not recommended. The same goes for startpages created as part of a link network, for example pages whose links all point to a particular website and to other startpages that also point to that same website. These kinds of link schemes add no value for the user and go against the Google webmaster guidelines.
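As a small sketch of suggestion 2, the function below renders one startpage category as an HTML list, giving every link descriptive anchor text rather than generic labels like 'link1'. The function, the topic, and the URLs are all hypothetical, chosen only to illustrate the markup a well-labeled category might produce.

```javascript
// Illustrative sketch (hypothetical data): render one startpage category
// as an HTML fragment. Each link gets anchor text that describes its
// destination, and the category gets a title plus a short description.
function renderCategory(title, description, links) {
  const items = links
    .map((l) => `<li><a href="${l.url}">${l.text}</a></li>`)
    .join("");
  return `<h2>${title}</h2><p>${description}</p><ul>${items}</ul>`;
}

// Descriptive anchor text tells both users and search engines what each
// linked page is about; "link1" and "link2" tell them nothing.
const html = renderCategory(
  "Tulip growing",
  "Guides and suppliers for growing tulips at home.",
  [
    { url: "http://example.com/bulbs", text: "Where to buy tulip bulbs" },
    { url: "http://example.com/care", text: "Caring for tulips in winter" },
  ]
);
```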

With this post, we hope to have provided potential startpage webmasters with some helpful guidelines for creating the type of startpages that Dutch-speaking users love!

On a final note, we would like to encourage you to fill in a paid links form if you come across a startpage that is involved in buying and selling links for the purpose of search engine manipulation. To report other forms of bad behavior, you can send a spam report. We'll review each report we get and use this feedback to enhance our algorithms and improve our search results. As always, we really appreciate your feedback and your help to provide the best search experience.

Startpages

In the Dutch-language market, so-called startpages are especially popular. In this article, besides giving some background information about startpages, we want to give prospective startpage webmasters a number of tips for creating unique and informative startpages.

What is a startpage?

A startpage is a webpage with a collection of links related to a specific topic. Startpages are hosted on a startpage domain, and each individual startpage is maintained by a webmaster. The links on a startpage are usually divided into categories relevant to the page's specific topic. Besides hyperlinks, a startpage often contains text, animations and pictures. The startpage concept is fairly specific to the Dutch-language market and is rarely seen elsewhere. Startpages have a simple interface that makes it easy, even for inexperienced Internet users, to create a webpage of their own.

The startpage concept began in September 1998 with the launch of Startpagina.nl, which was set up as a kind of link book for the inexperienced Internet user. Startpagina.nl soon proved an enormous success, thanks mainly to the huge number of volunteers who helped create and maintain the various startpages. The fact that, almost nine years later, new startpage domains are still appearing and the number of individual startpages on those domains is still growing shows that startpages remain as popular as ever.

Creating a valuable startpage

As a search engine, we love to have valuable pages with unique content and diversity in our search results. So we thought it would be a good idea to share a few tips that can help in creating startpages with added value.

  1. Build your startpage for Internet users, not for search engines. Make sure all text on the page is visible and use full sentences instead of just a handful of keywords.
  2. Try to present your visitors with unique, informative content related to your topic. Although the layout of a standard startpage does not leave much room for variation, you as the maintainer can make the difference! Start by looking for a topic you know a lot about and which, in your view, is not yet sufficiently covered. Then create relevant categories related to the topic and give each category a relevant name. Next, find the links you want to place on your startpage and give each link anchor text that describes where it takes your visitor. Don't name your links link1, link2 and link3, but give them names relevant to the content of the pages they point to. As a finishing touch, you can add a short description to each category.
  3. Don't create startpages from a purely commercial point of view. There is nothing wrong with trying to earn something with your startpage, but remember that your visitors don't want a page containing nothing but banner ads and affiliate links. The same goes for startpages created solely as part of a link network, for example startpages on which all links point to one particular website and to other startpages that all point to that same website. These kinds of startpages have no value at all for your visitors and, moreover, go against the Google Webmaster Guidelines.

With this first Dutch-language post, we hope to have given potential startpage webmasters a number of useful tips so that they can create the kind of startpages our Dutch-speaking users love!

Finally, we'd like to encourage everyone to fill in a paid links form whenever you come across a startpage that buys and sells links to manipulate search engines. Other violations of the Google Webmaster Guidelines can be reported by sending a spam report. We review every report submitted and use this information to further improve our algorithms and search results. As always, your feedback and your help in providing our users with the most relevant search results are greatly appreciated!

Wednesday, 1 July 2015

Easier navigation without GPS

Webmaster level: All

Today we’re unveiling a shiny new navigation in Webmaster Tools. The update makes the features you already use easier to find, and adds some exciting new ones.

Navigation reflects how search works

We’ve organized the Webmaster Tools features in groups that match the stages of search:
  • Crawl: see information about how we discover and crawl your content. Here you will find crawl stats, crawl errors, any URLs you’ve blocked from crawling, Sitemaps, URL parameters, and the Fetch as Google feature.
  • Google Index: keep track of how many of your pages are in Google’s index and how we understand their content: you can monitor the overall indexed counts for your site (Index Status), see what keywords we’ve found on your pages (Content Keywords), or request to remove URLs from the search results.
  • Search Traffic: check how your pages are doing in the search results: how people find your site (Search Queries), who’s recommended your site (Links to Your Site), and how pages on your site link to one another (Internal Links).
  • Search Appearance: mark up your pages to help Google understand your content better during indexing and potentially influence how your pages appear in our search results. This includes the Structured Data dashboard, Data Highlighter, Sitelinks, and HTML Improvements.

Account-level administrative tasks now accessible from the Settings menu

Account-level admin tasks such as setting User permissions, Site Settings, and Change of Address are now grouped under the gear icon in the top right corner so they’re always accessible to you:


This is the list of items as visible to site owners; “full” or “restricted” users will see a subset of these options. For example, if you're a “restricted” user for a site, the "Users & Site Owners" menu item will not appear.

New Search Appearance pop-up

Beginner webmasters will appreciate the new Search Appearance pop-up, which can be used to visualize how your site may appear in search and learn more about the content or structure changes that may help to influence each element:


To access the pop-up window, click on the question mark icon next to the Search Appearance menu in the side navigation.

It includes the essential search result elements like title, snippet and URL, as well as optional elements such as sitelinks, breadcrumbs, search within a site, event and product rich snippets, and authorship information.

We hope the new navigation makes it easier for you to make the most of Webmaster Tools. As always, if you have additional questions, feel free to post in the Webmaster Help Forum.