mirror of https://github.com/ericchiang/pup synced 2024-11-24 08:58:08 +00:00

Update README.md

Eric Chiang 2014-09-01 20:56:02 -04:00
parent 84e54e1430
commit 6915c6abb9


@ -1,6 +1,6 @@
# pup # pup
`pup` is a command line tool for processing HTML. It read from stdin, `pup` is a command line tool for processing HTML. It reads from stdin,
prints to stdout, and allows the user to filter parts of the page using prints to stdout, and allows the user to filter parts of the page using
[CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp). [CSS selectors](http://www.w3schools.com/cssref/css_selectors.asp).
@ -11,6 +11,18 @@ fast and flexible way of exploring HTML from the terminal.
go get github.com/ericchiang/pup go get github.com/ericchiang/pup
## Basic Usage
```bash
$ cat index.html | pup [selectors and flags]
```
or
```bash
$ pup < index.html [selectors and flags]
```
## Examples ## Examples
Download a webpage with `wget`. Download a webpage with `wget`.
@ -21,7 +33,7 @@ $ wget http://en.wikipedia.org/wiki/Robots_exclusion_standard -O robots.html
###Clean and indent ###Clean and indent
By default, `pup` will fill in missing tags, and properly indent the page. By default `pup` will fill in missing tags and properly indent the page.
```bash ```bash
$ cat robots.html $ cat robots.html
@ -102,6 +114,38 @@ $ pup < robots.html table -l 2
</table> </table>
``` ```
## Implemented Selectors
For further examples of these selectors head over to [w3schools](
http://www.w3schools.com/cssref/css_selectors.asp).
```bash
cat index.html | pup .class
# '#' indicates comments at the command line so you have to escape it
cat index.html | pup \#id
cat index.html | pup element
cat index.html | pup [attribute]
cat index.html | pup [attribute=value]
```
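The escaping rule in the comment above comes from the shell, not from `pup`. As a minimal sketch, the hypothetical helper `show_args` (not part of `pup`) prints the arguments a command would receive. It suggests that single quotes work as well as a backslash, and that a `#` in the middle of a word needs neither.

```bash
# Hypothetical helper (not part of pup): print each argument on its own line,
# the way a command such as pup would receive them from the shell.
show_args() { printf '%s\n' "$@"; }

escaped=$(show_args \#id)       # backslash-escaped '#'
quoted=$(show_args '#id')       # single-quoted '#'
inline=$(show_args element#id)  # '#' in the middle of a word is not a comment

echo "$escaped" "$quoted" "$inline"
# prints: #id #id element#id
```

Under these assumptions, `pup '#id'` should receive the same selector as `pup \#id`.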
You can mix and match selectors as you wish.
```bash
cat index.html | pup element#id[attribute=value]
```
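One caveat when combining selectors: square brackets are also glob characters in the shell, so an unquoted `[attribute=value]` can be rewritten before `pup` ever sees it if a matching file name exists in the current directory. The sketch below illustrates this with the hypothetical helper `show_args` and a temporary file name chosen only to trigger the match; neither is part of `pup`.

```bash
# Hypothetical helper (not part of pup): print the arguments a command receives.
show_args() { printf '%s\n' "$@"; }

# Work in a scratch directory containing one file whose name happens to match
# the glob pattern element#id[attribute=value] (the bracket matches one 'a').
tmp=$(mktemp -d)
touch "$tmp/element#ida"
cd "$tmp" || exit 1

unquoted=$(show_args element#id[attribute=value])   # the shell expands the glob
quoted=$(show_args 'element#id[attribute=value]')   # quoting passes it through

cd - > /dev/null || exit 1
rm -rf "$tmp"

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

Quoting the whole selector, as in `pup 'element#id[attribute=value]'`, is a cheap way to avoid this kind of surprise.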
## Flags
```bash
-c --color print result with color
-f --file file to read from
-h --help display this help
-i --indent number of spaces to use for indent or character
-n --number print number of elements selected
-l --limit restrict number of levels printed
--version display version
```
## TODO: ## TODO:
* Print attribute value rather than html ({href}) * Print attribute value rather than html ({href})