Web Page Crap Cropper

Just get the good stuff from web pages
The average content in a web page from a "content site" is shrinking to the size of a postage stamp. Banner ads and useless Java menus at the top, "related partner sites" (ads) on the sides, and "navigation" cruft on the bottom are crowding out the stuff you really came for, which is typically a few K of text. Though you can filter out the ad images with a proxy, you are still left with acres of useless screen space.

The Web Page Crap Cropper is a browser enhancement that lets you select the "active" (i.e. good) content in a web page by dragging a red rectangle around it. With a little bit of smarts, the selection can be automatically reverse-rendered and parsed to find the markers in the page source that bracket the good stuff. When the browser sees a similarly-formatted page from the same site, it will find the same markers and render only the good stuff between them, using the full width of your browser. An "uncrop" button renders the original page if you really want to see it.
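
A rough sketch of the marker step, in Python. It assumes the rectangle has already been reverse-rendered to the exact stretch of page source it covers, which is the hard part and is not shown. The names learn_markers, crop and MARKER_LEN are made up for illustration, and remembering a fixed number of characters on either side of the selection is only one guess at what a "marker" could be.

```python
from typing import Optional, Tuple

MARKER_LEN = 40  # hypothetical: how much surrounding markup to remember


def learn_markers(source: str, selected: str,
                  marker_len: int = MARKER_LEN) -> Tuple[str, str]:
    """Return the markup just before and just after the selected content."""
    start = source.find(selected)
    if start == -1:
        raise ValueError("selected content not found in page source")
    end = start + len(selected)
    before = source[max(0, start - marker_len):start]
    after = source[end:end + marker_len]
    return before, after


def crop(source: str, markers: Tuple[str, str]) -> Optional[str]:
    """Return the content between the markers, or None to show the full page."""
    before, after = markers
    if not before or not after:
        return None
    start = source.find(before)
    if start == -1:
        return None
    start += len(before)
    end = source.find(after, start)
    if end == -1:
        return None
    return source[start:end]


# Two made-up pages from the same imaginary site, sharing a template.
page_a = ('<html><div class="nav">menu</div><div id="story">'
          '<p>First article.</p></div><div class="ads">buy</div></html>')
page_b = ('<html><div class="nav">menu</div><div id="story">'
          '<p>Second article, longer.</p></div><div class="ads">buy</div></html>')

markers = learn_markers(page_a, "<p>First article.</p>", marker_len=16)
cropped = crop(page_b, markers)
print(cropped)
```

When the markers are not found, crop returns None, which is the cue to fall back to the original page (the "uncrop" case). A real version would want markers that survive small template changes, such as element ids or tag paths rather than raw character runs.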

rmutt, Dec 07 2000

Machete Browser http://www.halfbake...achete_27_20Browser
Related idea (oops -- must search before add!) [rmutt, Dec 07 2000, last modified Oct 04 2004]

Platypus http://platypus.mozdev.org/
Firefox extension that does this. Woot! [rmutt, Dec 15 2006]

templatemaker http://www.holovaty...ive/2007/07/06/0128
Yay! Adrian Holovaty made an open source package to do this automagically! [rmutt, Jul 06 2007]

Boilerpipe code library https://boilerpipe-web.appspot.com/
Unsupervised, and works surprisingly well using "shallow text features" [rmutt, Nov 23 2010]

Readability http://lab.arc90.co...iments/readability/
"Readability™ is a simple tool that makes reading on the Web more enjoyable by removing the clutter around what you're reading." [iaoth, Nov 23 2010]

print what you like http://www.printwhatyoulike.com/
I've tried it, it works. [Loris, Feb 22 2015]

Instapaper textize bookmarklet https://www.instapaper.com/save
[slater, Feb 22 2015]


       Related aside: When I need to print a web page, I routinely copy/paste the interesting stuff to Word and print from there. Captures the fonts, desired images, etc. and saves trees.
syost, Dec 07 2000

       [rmutt], you don't need to search before add, you could also delete (upper left corner, idea menu) after add. (Maybe after moving the new material to the other idea.)
jutta, Mar 02 2001

       With Opera, you can make a custom CSS file that filters out common ads, and even specific ones, by editing the file. The property used here is display:none, so it doesn't leave any obvious blankness. You can search their forums for that particular CSS: my.opera.com/forums
y4, Jan 15 2004

       Like the idea, rmutt. Good way to filter out the garbage and save time on loading. However, sometimes I like seeing some of the ads if they are of interest to me, so perhaps there is a way to filter out only the ads that are not relevant to you, by having a personal profile of your interests online, thus keeping out the bad ads.
Dmedia, Dec 15 2006

       Pastry. The strength in this idea is using a rectangle to graphically select the good content, and letting the browser figure out which elements belong and which don't.   

       The biggest problem I've found when trying to copy/paste *part* of a web page is that it is quite difficult to select *part* of a table. But with a graphical selection, the browser would understand what you really want.   

       The relevance of my problem to this idea is that graphical selection functionality could be made available for use in copy/paste operations, not just for content filtering.
ed, Dec 15 2006

       Just use Lynx.
nineteenthly, Dec 15 2006

       AdSubtract does this, but Trend Micro discontinued it a couple of months ago.
ldischler, Dec 16 2006

       I use Readability (see link) a lot. Not only does it remove clutter, it also reformats the text with a readable font and low-contrast colours, and fetches the rest of the article if it's split into several pages.
iaoth, Nov 23 2010

       On a lot of sites, the bit that loads last is the bit you want to see. Are there browsers that load backwards? Are there browsers that lie: "Oh yeah, I put all that spam on my user's screen. So now you can send the good stuff. Wink wink"?
popbottle, Feb 21 2015

       //Related aside: When I need to print a web page, I routinely copy/paste the interesting stuff to Word and print from there. Captures the fonts, desired images, etc. and saves trees.//   

       [syost], use printwhatyoulike (see link).
Loris, Feb 22 2015


