UPDATED
I remain an admirer of Google, but like many other people I'm worried that the company is getting too big for its virtual britches. As Jeff Jarvis and others have noted lately, there's a worrisome bent toward "trust us" in the operation of Google News, a site I like but find frequently frustrating.
Google News embarrassed itself by including a disgusting Nazi-ish site (I will not link to it myself) in its source-crawl. The company has removed the site, thankfully, but not before enduring well-deserved ridicule for having included the garbage in the first place.
The problem is, among other things, a lack of transparency. Why doesn't Google just post a list of the news sites it uses as sources? I can't see the harm in doing this, and can see a lot of value.
Update: A Google spokesperson says: "I believe we did not list all the sites for competitive reasons. But, I do hear what you're saying and can pass your feedback (which we take very seriously) to the News team."
So the answer is No, for now. Too bad.
Until Google does the right thing, we'll have to use a list being compiled via a programming script at the Private Radio blog. Some of the sites surprise me, and will probably surprise you, too. (How long do you expect it will take for Google to demand that the blogger stop?)
On a separate matter, the excellent Philly Future -- a combination blog, citizens news and photo sharing site about Philadelphia -- is appropriately miffed that Google has absolutely no listing for the site. (UPDATE: This is being fixed; see Matt Cutts' comment below. Thanks, Matt!)
Karl Martino, who runs the site (and is a former colleague of mine), wrote me:
I think Google has banned Philly Future from indexing.
If anyone from Google is reading this, how about fixing what is obviously a bad setting?
Over a year ago a porn redirector owned the phillyfuture.org domain. I think that is what caused Googlebot to stop visiting the domain.
I've been posting in various message boards trying to get help. I've used their online forms to submit the site and ask for help a few times - but I can't seem to get a real response from Google itself except to: "be assured that these changes are automated. It is certainly our intent to represent the content of the internet fairly and accurately." from an automated reply. Basically - I keep being told to sit tight.
There is a nice Google employee blogger who is attempting to help - but he can't seem to figure out what's wrong. I've done all the right things according to him.
It's been a year now that I've had the Phillyfuture.org domain up. Googlebot has not visited. Philly Future is getting traffic from MSN and Yahoo! - but not Google.
For me, this is one more reason to use the other search engines for at least some of my searches.
Update: Google has responded to Karl, and it looks like all will be well. I'm glad to hear it.
I agree that Google should make it easy for us to find out who they are crawling for news sources.
On the other hand, Jeff the BuzzMachine was typically lazy. Google makes it very easy to complain about the quality of their sources, and Jeff didn't bother with that. He didn't bother with, like, uh, calling them up and asking them for more information (that would have been a true "value add"); he just went direct to blog.
Jeff isn't so much worried as hysterical.
Posted by: jerry | March 24, 2005 at 05:03 PM
Dan, there are other areas where Google needs transparency as well. Starting with AdSense. Until just recently, and most likely due to VC Fred Wilson's publishing his Google AdSense financials and John Battelle blogging about how they should not have such a restrictive TOS on AdSense, you could not share how much money you made from Google's AdSense program.
Now Google has lifted the requirement that AdSense blogger partners keep quiet about how much they make, but they still refuse to disclose how much of every AdSense dollar goes to the blogger vs. Google. Is it a 50/50 split? Is it 90/10? Is it 98/2? I'd suspect that the non-evil (i.e. good) thing for Google to do would be to disclose the split and explain why it's fair.
Would you take a sales job where the company said trust us, we'll pay you a fair share of the revenue that you generate for our company? No. Typically sales people know exactly how much the company makes and what their split or cut of that is.
Google needs bloggers for AdSense to work and AdSense is a great program for bloggers seeking to eke out a living. Why though is Google so secretive about the split that they pay their blogging partners? Is it perhaps due to the negative PR that might arise if some perceived an "evil" split between Google and their bloggers? Are there those within Google that do not want to risk killing the goose that lays the golden eggs?
Transparency is not a bad thing and Google should spell out exactly what that split is. It would be the good (i.e. non-evil Sergey) thing for Google to do.
Posted by: Thomas Hawk | March 24, 2005 at 10:42 PM
The best advice I am getting from folks, it would seem, is to actually change the name of the site and move to a new domain name. To give up on getting Philly Future indexed.
I'm hoping it's not going to come to that though. It doesn't seem fair since I didn't do anything to get myself banned as far as anyone has been able to tell me.
But ya gotta do what ya gotta do. I hope it doesn't come to that though.
If anyone can help - please let me know. Google is too important a resource not to have it indexing us.
Posted by: Karl | March 25, 2005 at 03:39 AM
I've gotten some interesting feedback.
For the record - I have never employed tricks or SEO techniques to improve phillyfuture.org's standing in Google. Ever.
The domain was taken by a porn spammer for a time and that is why I believe it has gotten blacklisted.
I'll repeat what I said at Metafilter:
Google, in my experience, has never given me cause to complain. You post good content, you build a community, people link to you, you follow guidelines, don't try any evil tricks, and Google indexes you. Usually it's that simple. It's one of the reasons why Google has been so terrific.
Getting a new domain, because a previous owner abused it - seems to me not only a partial loss of identity (partial I realize - it's our community that counts - just like at Metafilter) - but an admission that Google is creaking, cracking and growing bureaucratic as it grows older. I'd like to think there is a solution that is less severe.
Posted by: Karl | March 25, 2005 at 09:43 AM
Hi all, I'm an engineer at Google. I just wrote to Karl directly, but it might help the conversation to share what I said:
"
Hi, my name is Matt Cutts and I'm a software engineer at Google. I
wanted to write to you about phillyfuture.org. As I'm sure you know,
phillyfuture.org was used by a pretty bad porn spammer for quite a
while. That spammer was bad enough that someone actually wrote an
article about them: http://cyber.law.harvard.edu/people/edelman/renewals/
I checked our user support queue, and it looks like you started writing
to us about phillyfuture.org around 3/2/2005, with a reinclusion
request on 3/18/2005. I wanted to let you know that someone here checked
out the site, verified it's good, and submitted the site for reinclusion
on 3/21/2005. It will take probably up to 2-4 weeks, but you should see
the site entering the Google index pretty soon. Feel free to email me if
you have any questions at all and I'll try to answer them.
Best wishes,
Matt Cutts
"
Karl did the right thing with a reinclusion request on 3/18/2005 to us, because that goes to someone who can verify that the site is now good. Because of the large volume of correspondence we get to user support, it really helps us to send the request with the right terms if you suspect a domain has been spamming in the past ("reinclusion request" in the subject is enough). After getting that reinclusion request it was approved in under 72 hours, but I understand that we could have done better in this case, because Karl first wrote us more than 2-3 weeks ago. Karl, I understand your frustration because you just bought a domain and wanted it to rank where it should. I'm sorry that the previous life of this domain as porn spam affected you.
Posted by: Matt Cutts | March 25, 2005 at 10:24 AM
Hi Matt,
This is pretty much a copy of what I am sending you in reply but I wanted to share it here:
Thank you very, very much. I figured it was that porn spammer that got the site blacklisted. Good to know it wasn't anything that we did at PF :)
I've been trying to get the site indexed again for around a year now, using the standard methods everyone uses. But the reinclusion request was recent - not many people know about it - and when I got the automated reply - I thought I was out of options. Hence my call for help.
I really appreciate the help and will send along the good word.
Thanks!
Posted by: Karl | March 25, 2005 at 10:35 AM
I think if enough developers (and non-developers too) ask Google to open up Google News to their API that we might actually see some really exciting apps.
Having a hole card on this for "competitive reasons" is absurd considering that Yahoo opens up their news via API.
More on this is available on my blog at the URL in my signature and I did trackback but Dan must be moderating his trackbacks because it doesn't yet show. Dan, if you got two trackbacks then please use the second one (#1620), as (#1619) is invalid. Thank you :)
Posted by: TDavid | March 26, 2005 at 08:15 AM