@@ -281,18 +281,23 @@ $ cat robots.html | pup 'div#p-namespaces a'
 $ cat robots.html | pup 'div#p-namespaces a json{}'
 [
 {
+"attrs": {
 "accesskey": "c",
 "href": "/wiki/Robots_exclusion_standard",
-"tag": "a",
-"text": "Article",
 "title": "View the content page [c]"
 },
+"tag": "a",
+"text": "Article"
+},
 {
+"attrs": {
 "accesskey": "t",
 "href": "/wiki/Talk:Robots_exclusion_standard",
-"tag": "a",
-"text": "Talk",
+"rel": "discussion",
 "title": "Discussion about the content page [t]"
+},
+"tag": "a",
+"text": "Talk"
 }
 ]
 ```
@@ -303,33 +308,27 @@ Use the `-i` / `--indent` flag to control the intent level.
 $ cat robots.html | pup -i 4 'div#p-namespaces a json{}'
 [
 {
+"attrs": {
 "accesskey": "c",
 "href": "/wiki/Robots_exclusion_standard",
-"tag": "a",
-"text": "Article",
 "title": "View the content page [c]"
 },
+"tag": "a",
+"text": "Article"
+},
 {
+"attrs": {
 "accesskey": "t",
 "href": "/wiki/Talk:Robots_exclusion_standard",
-"tag": "a",
-"text": "Talk",
+"rel": "discussion",
 "title": "Discussion about the content page [t]"
+},
+"tag": "a",
+"text": "Talk"
 }
 ]
 ```
 
-If the selectors only return one element the results will be printed as a JSON
-object, not a list.
-
-```bash
-$ cat robots.html | pup --indent 4 'title json{}'
-{
-"tag": "title",
-"text": "Robots exclusion standard - Wikipedia, the free encyclopedia"
-}
-```
-
 Because there is no universal standard for converting HTML/XML to JSON, a
 method has been chosen which hopefully fits. The goal is simply to get the
 output of pup into a more consumable format.
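
As a rough illustration of what "more consumable" could mean for a downstream script, the sketch below parses output in the new shape, where element attributes are nested under `"attrs"`. The `SAMPLE` string is copied from the new-format output in this diff; the helper name `links` is hypothetical, and the fallback for a bare JSON object is an assumption added only so the helper also copes with the single-object form described in the removed text.

```python
import json

# Sample in the new shape from this diff: attributes live under "attrs".
SAMPLE = """
[
  {"attrs": {"accesskey": "c",
             "href": "/wiki/Robots_exclusion_standard",
             "title": "View the content page [c]"},
   "tag": "a", "text": "Article"},
  {"attrs": {"accesskey": "t",
             "href": "/wiki/Talk:Robots_exclusion_standard",
             "rel": "discussion",
             "title": "Discussion about the content page [t]"},
   "tag": "a", "text": "Talk"}
]
"""


def links(pup_json):
    """Return (text, href) pairs from json{} output (hypothetical helper)."""
    nodes = json.loads(pup_json)
    # Assumption: tolerate a bare object as well as a list of objects.
    if isinstance(nodes, dict):
        nodes = [nodes]
    # Missing "text" or "href" fields fall back to an empty string.
    return [
        (node.get("text", ""), node.get("attrs", {}).get("href", ""))
        for node in nodes
    ]


if __name__ == "__main__":
    for text, href in links(SAMPLE):
        print(f"{text}\t{href}")
```

In practice the string would come from a pipeline such as `pup 'div#p-namespaces a json{}'` read on standard input rather than from an embedded sample.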