Web Watcher   (+3)
Collaborative Internet service that watches for changes

Problem: It takes time, effort, and bandwidth to check my favorite blogs and web groups for changes.

Solution: A small browser add-on could report web page information to a central server.

How it works: When a page finishes loading, a small piece of information is sent to a central location. This includes the web address and a hash code that identifies the page's content. The hash is compared with the last hash code saved on the server for that address, and if it differs, the new hash is stored as a change entry.
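
A minimal sketch of the comparison step described above, assuming SHA-256 as the hash and an in-memory dictionary standing in for the central store. The names `content_hash`, `HashStore`, and `report` are hypothetical, not part of any existing service.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return a hex digest identifying the page content (SHA-256 is an assumption)."""
    return hashlib.sha256(body).hexdigest()


class HashStore:
    """Hypothetical stand-in for the central server's storage."""

    def __init__(self):
        self.latest = {}   # url -> last hash seen for that address
        self.changes = []  # (url, new hash) entries, oldest first

    def report(self, url: str, digest: str) -> bool:
        """Record a visit; return True if the hash differs from the stored one.

        The first report for an address also counts as a change, since
        there is nothing stored to compare against.
        """
        if self.latest.get(url) == digest:
            return False
        self.latest[url] = digest
        self.changes.append((url, digest))
        return True
```

In this sketch the add-on would call `report(url, content_hash(page_bytes))` after each page load, and the watch page would be built from `changes`.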

How this helps: For my needs, I can have a single web page that is updated when anyone with this browser add-on visits any of my favorite blogs. This web page could tell me when one of these pages has changed. Or, I could subscribe to a service that e-mails me this information.

Added benefits: If this becomes widely adopted, every browser could check this server first to find out whether a web page has changed since the user last visited it. If it hasn't, and if the last time anyone checked was recent, the browser would just load its cached version.
-- Worldgineer, Feb 15 2005
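
The cache decision under "Added benefits" could look roughly like this. It is only a sketch: the 15-minute definition of "recent" and the names `MAX_AGE_SECONDS` and `can_use_cache` are assumptions, and the hash and timestamp are taken to be whatever the central server last recorded for the address.

```python
import time
from typing import Optional

MAX_AGE_SECONDS = 15 * 60  # hypothetical definition of "recent"


def can_use_cache(cached_hash: str,
                  server_hash: Optional[str],
                  last_checked: Optional[float],
                  now: Optional[float] = None) -> bool:
    """Return True if the browser may load its cached copy.

    cached_hash:  hash of the copy in the browser cache
    server_hash:  last hash the central server holds for this address
    last_checked: Unix time when any user last reported this address
    """
    if server_hash is None or last_checked is None:
        return False  # the server knows nothing about this page
    if now is None:
        now = time.time()
    recently_checked = (now - last_checked) <= MAX_AGE_SECONDS
    return recently_checked and server_hash == cached_hash
```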

Blogware My previous solution to this problem. Web Watcher may reduce internet traffic, rather than increase it. [Worldgineer, Feb 15 2005]

Bloglines http://www.bloglines.com/about/notifier
has a notifier too, so it tells you when to visit your panopticon of a page [neilp, Feb 15 2005]

Newsgator http://www.newsgator.com
this is the one which has the peer update model thing you want [neilp, Feb 15 2005]

Weblogs.com http://newhome.weblogs.com/faq
[krelnik, Feb 16 2005]

InfoMinder https://addons.upda...application=firefox
what [world] wanted. [neilp, May 02 2005]

Wikipedia ETag http://en.wikipedia.org/wiki/HTTP_ETag
Entity Tags tag/hash content with a 'version' or 'state' so differences can be detected [Improfane, Dec 30 2008]

This sounds like something you could do in Firefox if you had the right extension.
-- hazel, Feb 15 2005


That's what I'm picturing - start with extensible browsers, and grow support until MS gets jealous.
-- Worldgineer, Feb 15 2005


Wonder if it's already out there? [neilp] is quite au fait with Firefox extensions.
-- hazel, Feb 15 2005


interesting, you could definitely achieve that (relatively) easily, but I think there are a couple of services out there that would let you have a page showing which of your watched blogs have changed, but via a different mechanism. Yours has the advantage of working across sites other than blogs.

-- neilp, Feb 15 2005


(wandering off topic a bit) There's a specific issue that those miss - and that's annotations. Services like BlogRolling are updated using RSS-like data published from blogs. This data generally doesn't contain annotations, leaving me to click on every page I care about to find out if it's had new annos. See my blogware link for further details.
-- Worldgineer, Feb 15 2005


aaah, in that case, yup, think this could be a useful add-on. Anyone know a good hash code?
[World] you realise that you're gonna have to identify each specific URL you want to watch? That's a lot of URLs...

-- neilp, Feb 15 2005


True, but with a little bit of logic we can cut down on the storage space required. For instance, things that never seem to change inside a base URL could be treated differently than those that often change. Those that always change can be considered dynamic and all but erased from the service.
-- Worldgineer, Feb 15 2005
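
One way the "little bit of logic" above might be sketched is to classify each address by how often its hash changes between reports. The thresholds and the names `UrlStats` and `classify` are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass
class UrlStats:
    visits: int = 0   # reports received for this address
    changes: int = 0  # reports whose hash differed from the previous one


def classify(stats: UrlStats,
             min_visits: int = 10,
             static_max_rate: float = 0.05,
             dynamic_min_rate: float = 0.95) -> str:
    """Label an address by how often its content changes between reports."""
    if stats.visits < min_visits:
        return "unknown"   # too little data to judge
    rate = stats.changes / stats.visits
    if rate <= static_max_rate:
        return "static"    # almost never changes
    if rate >= dynamic_min_rate:
        return "dynamic"   # changes on nearly every load; a candidate to drop
    return "normal"
```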


Storage space needn't be a concern (if Google can index 8bn pages, then we can build something to store 8bn URLs and their hash codes). It's more the pounding the servers would take. If, say, 1 in 40 pages served was one from a Google server, then that's quite a lot of hits; however, if every single web page, from every single user, generated a post to your server, that would be quite a bit more traffic.
-- neilp, Feb 15 2005


<disclaimer>I don't really know how this works</disclaimer> this might be better achieved by getting a local client to ping each of the pages in question - just to get back the Last-Modified headers. Certainly works for static pages, but then relies on having compliant web server software that does the same for dynamic content.
-- neilp, Feb 15 2005
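
A rough sketch of the local-client approach suggested above, using a HEAD request so only headers come back. The function names are hypothetical, and servers that omit `Last-Modified` (common for dynamic content) simply yield `None`.

```python
import urllib.request
from email.utils import parsedate_to_datetime
from typing import Optional


def fetch_last_modified(url: str, timeout: float = 10.0) -> Optional[str]:
    """Send a HEAD request and return the Last-Modified header, if any."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Last-Modified")


def modified_since(previous: Optional[str], current: Optional[str]) -> Optional[bool]:
    """Compare two Last-Modified header values.

    Returns None when either value is missing, because then the
    server has told us nothing usable.
    """
    if previous is None or current is None:
        return None
    return parsedate_to_datetime(current) > parsedate_to_datetime(previous)
```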


I think this is baked in the form of WEBLOGS.COM. It does, however, require the cooperation of the target site to send it a "ping" when updates occur. See link.
-- krelnik, Feb 16 2005


I think newsgator works the other way round, with the clients updating the central server upon a change. With the pinging approach of weblogs.com, you'd be better off getting the cooperative site to make the specific channel users want (in [World]'s case a 'comments' feed).
-- neilp, Feb 16 2005


ask and it shall be baked (see link).
-- neilp, May 02 2005


Cool. Thanks [neil]. Their pricing structure is a bit high for what I need. (reads details) Ouch. To have it check more than once a day is $300 a year. Plus it's fundamentally different than my idea - you're effectively paying someone to surf for you, and notify you when pages change.
-- Worldgineer, May 02 2005


There is no need to hash the content yourself; many servers do it for you. These are called ETags, and browsers use them to see whether content is different or not.

After submitting it to a central location, surely it would still be a pull implementation? The computer has to consult it to check if there is a new version? Perhaps you could use XMPP or IRC to provide a notification to users from within the browser that the page was modified.

(The ETag comparison check only has to occur when the user submits it to the server.)
-- Improfane, Dec 30 2008
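
For illustration, the ETag check can be made by sending the previously seen tag in an `If-None-Match` header; a server that supports it answers 304 Not Modified when nothing changed. This is a sketch with hypothetical function names, and it assumes the server supports ETags at all.

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple


def build_conditional_request(url: str,
                              known_etag: Optional[str]) -> urllib.request.Request:
    """Build a HEAD request carrying If-None-Match when an ETag is known."""
    request = urllib.request.Request(url, method="HEAD")
    if known_etag:
        request.add_header("If-None-Match", known_etag)
    return request


def check_etag(url: str,
               known_etag: Optional[str],
               timeout: float = 10.0) -> Tuple[bool, Optional[str]]:
    """Return (changed, etag). A 304 reply means the known ETag still matches."""
    request = build_conditional_request(url, known_etag)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # Any successful reply means the tag did not match (or none was sent).
            return True, response.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 304:  # urllib reports 304 as an HTTPError
            return False, known_etag
        raise
```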


Well that's handy. We should just be able to do this mostly client-side. Have a list of links that updates every now and then by sending a request for an ETag to each server.
-- Worldgineer, Dec 30 2008
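
The mostly client-side version might be sketched as a single polling pass over the list of links. Here `fetch_etag` is a hypothetical callable supplied by the caller (for example, one that sends a HEAD request and reads the `ETag` header), and `poll_once` is an illustrative name.

```python
from typing import Callable, Dict, Iterable, List, Optional


def poll_once(urls: Iterable[str],
              known: Dict[str, str],
              fetch_etag: Callable[[str], Optional[str]]) -> List[str]:
    """Check each watched link once and return those whose ETag changed.

    `known` maps url -> last ETag seen and is updated in place. A link
    seen for the first time is recorded but not reported as changed.
    """
    changed = []
    for url in urls:
        etag = fetch_etag(url)
        if etag is None:
            continue  # no ETag from this server; nothing to compare
        if url in known and known[url] != etag:
            changed.append(url)
        known[url] = etag
    return changed
```

Calling `poll_once` on a timer gives the "updates every now and then" behaviour; how often to run it is left open here.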


