diff --git a/README.md b/README.md
index 4c3633a..780547f 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,108 @@
 # pup
 
+`pup` is a command line tool for processing HTML. It reads from stdin,
+prints to stdout, and allows the user to filter parts of the page using
+[CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp).
+
+Inspired by [`jq`](http://stedolan.github.io/jq/), `pup` aims to be a
+fast and flexible way of exploring HTML from the terminal.
+
 ## Install
 
     go get github.com/ericchiang/pup
 
+## Examples
+
+Download a webpage with `wget`. _Please exercise restraint when using any
+automated request tool._
+
+```bash
+$ wget http://en.wikipedia.org/wiki/Robots_exclusion_standard -O robots.html
+```
+
+### Clean and indent
+
+By default, `pup` will fill in missing tags and properly indent the page.
+
+```bash
+$ cat robots.html
+# nasty looking html
+$ cat robots.html | pup
+# cleaned and indented html
+```
+
+### Filter by tag
+```
+$ pup < robots.html title