Mirror of https://github.com/ericchiang/pup (synced 2024-11-24 08:58:08 +00:00)

Update README.md

parent 84e54e1430
commit 6915c6abb9
README.md (48 lines changed)
@@ -1,6 +1,6 @@
 # pup
 
-`pup` is a command line tool for processing HTML. It read from stdin,
+`pup` is a command line tool for processing HTML. It reads from stdin,
 prints to stdout, and allows the user to filter parts ot the page using
 [CCS selectors](http://www.w3schools.com/cssref/css_selectors.asp).
 
@@ -11,6 +11,18 @@ fast and flexible way of exploring HTML from the terminal.
 
 go get github.com/ericchiang/pup
 
+## Basic Usage
+
+```bash
+$ cat index.html | pup [selectors and flags]
+```
+
+or
+
+```bash
+$ pup < index.html [selectors and flags]
+```
+
 ## Examples
 
 Download a webpage with `wget`.
@@ -21,7 +33,7 @@ $ wget http://en.wikipedia.org/wiki/Robots_exclusion_standard -O robots.html
 
 ###Clean and indent
 
-By default, `pup` will fill in missing tags, and properly indent the page.
+By default `pup` will fill in missing tags and properly indent the page.
 
 ```bash
 $ cat robots.html
@@ -102,6 +114,38 @@ $ pup < robots.html table -l 2
 </table>
 ```
 
+## Implemented Selectors
+
+For further examples of these selectors head over to [w3schools](
+http://www.w3schools.com/cssref/css_selectors.asp).
+
+```bash
+cat index.html | pup .class
+# '#' indicates comments at the command line so you have to escape it
+cat index.html | pup \#id
+cat index.html | pup element
+cat index.html | pup [attribute]
+cat index.html | pup [attribute=value]
+```
+
+You can mix and match selectors as you wish.
+
+```bash
+cat index.html | pup element#id[attribute=value]
+```
+
+## Flags
+
+```bash
+-c --color     print result with color
+-f --file      file to read from
+-h --help      display this help
+-i --indent    number of spaces to use for indent or character
+-n --number    print number of elements selected
+-l --limit     restrict number of levels printed
+--version      display version
+```
+
 ## TODO:
 
 * Print attribute value rather than html ({href})
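
The added README text says that `#` has to be escaped on the command line. A minimal sketch of why, assuming a POSIX-style shell such as bash. `count_args` is a hypothetical stand-in for `pup`, not part of the tool; it only reports how many arguments the shell passed along:

```bash
# count_args is a hypothetical helper (not part of pup): it prints the
# number of arguments the shell actually handed to it.
count_args() { echo "$#"; }

count_args #id     # prints 0: the shell treats "#id" as a comment
count_args \#id    # prints 1: the backslash makes "#" a literal character
count_args '#id'   # prints 1: single quotes work as well
```

Quoting the whole selector, for example `pup '#id'`, is an equivalent alternative to the backslash form used in the README.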