Posted By: Wiep 277 days ago
Topic Type: News Story (Jump to http://www.blogstorm.co.uk)
Category: Google Other
2 Comments
Comments
Am I the only one who's a bit confused by the Google article URL number guideline? It says that "the URL for each article must contain a unique number consisting of at least three digits". The example given is http://www.google.com/news/article234.html.
Then it goes on to say that "If the only number in the article consists of an isolated four-digit number that resembles a year, such as http://www.google.com/news/article2006.html, we won't be able to crawl it."
So if your URL structure contains the year in a directory (news/2007/), you're required to have an additional 3-digit, 4-digit (not resembling a year), or longer number somewhere else in the URL? I might just be a bit slow today...
I can't speak for Google, but in some instances the unique 3-4 digit code may no longer be required, as they have improved their news crawling methods. Hover your mouse over some headlines on the homepage and you'll notice a number of standard /year/month/topic articles being indexed. That being said, many sites use a unique identifier of four or more digits for each post.
LA Times
/washington/2007/11/streisand-leans.html
Washington Post
/2007/11/27/author-lashes-out-at-romn_n_74378.html
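For anyone who wants to test their own URLs against the guideline as quoted in the first comment, here is a rough Python sketch of one way to read it. It is not Google's actual check: the function names are made up, and the 1900-2099 range used to decide what "resembles a year" is purely an assumption. It also only looks at a single URL, so it cannot tell whether the number is unique across a site.

```python
import re
from urllib.parse import urlsplit


def looks_like_year(digits: str) -> bool:
    # Assumption: "resembles a year" means a four-digit value from 1900 to 2099.
    return len(digits) == 4 and 1900 <= int(digits) <= 2099


def has_article_number(url: str) -> bool:
    """Return True if the URL path has a digit run that fits the quoted rule.

    A run qualifies when it has at least three digits and is not an
    isolated four-digit number that looks like a year.
    """
    path = urlsplit(url).path  # ignore any digits in the host name
    for run in re.findall(r"\d+", path):
        if len(run) >= 3 and not looks_like_year(run):
            return True
    return False


print(has_article_number("http://www.google.com/news/article234.html"))   # True
print(has_article_number("http://www.google.com/news/article2006.html"))  # False
print(has_article_number("/washington/2007/11/streisand-leans.html"))     # False
print(has_article_number("/2007/11/27/author-lashes-out-at-romn_n_74378.html"))  # True
```

Under this reading, a year directory alone (news/2007/) fails the check, which matches the first comment's interpretation. The second comment's point still stands: URLs that fail it were reportedly being indexed anyway.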