# NAME
Web::Query - Yet another scraping library like jQuery
# SYNOPSIS
    use Web::Query;

    wq('http://www.w3.org/TR/html401/')
        ->find('div.head dt')
        ->each(sub {
            my $i = shift;
            printf("%d %s\n", $i+1, $_->text);
        });
# DESCRIPTION
Web::Query is yet another scraping framework with a jQuery-like interface.
Yes, I know about Ingy's [pQuery](https://metacpan.org/pod/pQuery), but it is only alpha quality and doesn't work.
Web::Query is built on top of the CPAN modules [HTML::TreeBuilder::XPath](https://metacpan.org/pod/HTML::TreeBuilder::XPath), [LWP::UserAgent](https://metacpan.org/pod/LWP::UserAgent), and [HTML::Selector::XPath](https://metacpan.org/pod/HTML::Selector::XPath).
Because it uses [HTML::Selector::XPath](https://metacpan.org/pod/HTML::Selector::XPath), it only supports the CSS 3 selectors supported by that module.
Web::Query doesn't support jQuery's extended queries (yet?). If a selector is
passed as a scalar ref, it is taken as a straight XPath expression.

    wq( '<div><p>hello</p><p>there</p></div>' )->find( 'p' );       # css selector
    wq( '<div><p>hello</p><p>there</p></div>' )->find( \'/div/p' ); # xpath selector

**THIS LIBRARY IS UNDER DEVELOPMENT. ANY API MAY CHANGE WITHOUT NOTICE**.
# FUNCTIONS
- `wq($stuff)`
This is a shortcut for `Web::Query->new($stuff)`. This function is exported by default.
# METHODS
## CONSTRUCTORS
- `my $q = Web::Query->new($stuff, \%options)`
Creates a new instance of Web::Query. The instance can be built from a URL string (http, https, or file scheme), an HTML string, a [URI](https://metacpan.org/pod/URI) object, or an instance of [HTML::Element](https://metacpan.org/pod/HTML::Element).
This method throws an exception if `$stuff` is of an unknown type.
It returns undef when a URL request does not produce a successful response.
Currently, the only two valid options are _indent_, which will be used as
the indentation string if the object is printed, and _no\_space\_compacting_,
which will prevent the compaction of whitespace characters in text blocks.
- `my $q = Web::Query->new_from_element($element: HTML::Element)`
Creates a new instance of Web::Query from an instance of [HTML::Element](https://metacpan.org/pod/HTML::Element).
- `my $q = Web::Query->new_from_html($html: Str)`
Creates a new instance of Web::Query from an HTML string.
- `my $q = Web::Query->new_from_url($url: Str)`
Creates a new instance of Web::Query from a URL.
If the response is not successful (that is, its status code does not match `/^20[0-9]$/`), this method returns undef.
The last response is available in `$Web::Query::RESPONSE`, or through `Web::Query->last_response()` as in the example below.
Recommended usage:

    my $url = 'http://example.com/';
    my $q = Web::Query->new_from_url($url)
        or die "Cannot get a resource from $url: " . Web::Query->last_response()->status_line;

- `my $q = Web::Query->new_from_file($file_name: Str)`
Creates a new instance of Web::Query from a file name.
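
As a small illustration of the constructors above, the following sketch builds a Web::Query object from the same HTML string in three ways. The markup is invented for the example, and the sketch assumes that `wq` and `new` hand an HTML string on to `new_from_html`, as the description of `new` suggests.

```perl
use strict;
use warnings;
use Web::Query;

# Hypothetical markup, used only for illustration.
my $html = '<div><h1>Title</h1><p>Body</p></div>';

# Three ways to build a Web::Query object from the same HTML string.
my @queries = (
    wq($html),
    Web::Query->new($html),
    Web::Query->new_from_html($html),
);

# Each object should find the same heading text.
my @titles = map { scalar $_->find('h1')->text } @queries;
print join(', ', @titles), "\n";
```

For URLs, `new_from_url` can return undef, so the `or die` pattern shown for that constructor is still the safer choice there.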
## TRAVERSING
### add
Returns a new object augmented with the new element(s).
- add($html)
An HTML fragment to add to the set of matched elements.
- add(@elements)
One or more @elements to add to the set of matched elements.
Elements that are already part of the set are not added a second time.

    my $group = $wq->find('#foo');         # collection has 1 element
    $group = $group->add( '#bar', $wq );   # 2 elements
    $group->add( '#foo', $wq );            # still 2 elements

- add($wq)
An existing Web::Query object to add to the set of matched elements.
- add($selector, $context)
$selector is a string representing a selector expression to find additional elements to add to the set of matched elements.
$context is the point in the document at which the selector should begin matching.
### contents
Get the immediate children of each element in the set of matched elements, including text and comment nodes.
### each
Visits each node. `$i` is a zero-based counter and `$elem` is the current item.
`$_` is localized to `$elem`.

    $q->each(sub { my ($i, $elem) = @_; ... })

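
A minimal sketch of `each`, assuming a small list whose markup is invented for the example. It uses the zero-based counter to number the items and reads each item's text through `$elem`.

```perl
use strict;
use warnings;
use Web::Query;

# Hypothetical markup, used only for illustration.
my $q = wq('<ul><li>apple</li><li>banana</li><li>cherry</li></ul>');

my @lines;
$q->find('li')->each(sub {
    my ($i, $elem) = @_;
    # $i starts at 0; $elem is a Web::Query object wrapping one <li>.
    push @lines, sprintf('%d: %s', $i + 1, scalar $elem->text);
});

print "$_\n" for @lines;
```

Inside the callback, `$_->text` would work the same way, since `$_` is localized to the current item.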
### end
Returns to the previous set of matched elements, like jQuery's `end()`.
### filter
Reduce the elements to those that pass the function's test.

    $q->filter(sub { my ($i, $elem) = @_; ... })

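
The sketch below shows two ways of narrowing a set with `filter`: a callback, as described above, and a CSS selector string, which is the form used in the `find` notes of this document. The markup and the helper `texts_of` are hypothetical and exist only for this example.

```perl
use strict;
use warnings;
use Web::Query;

# Hypothetical helper: collect the text of every element in a set.
sub texts_of {
    my ($set) = @_;
    my @texts;
    $set->each(sub { push @texts, scalar $_->text });
    return @texts;
}

# Hypothetical markup, used only for illustration.
my $q = wq('<ul><li class="fruit">apple</li><li>carrot</li><li class="fruit">cherry</li></ul>');
my $items = $q->find('li');

# Callback form: keep items whose text starts with "c".
my $starts_with_c = $items->filter(sub {
    my ($i, $elem) = @_;
    return scalar($elem->text) =~ /^c/;
});

# Selector form: keep items that themselves match the selector.
my $fruit = $items->filter('.fruit');

my @c_texts     = texts_of($starts_with_c);
my @fruit_texts = texts_of($fruit);
print "c: @c_texts\n";
print "fruit: @fruit_texts\n";
```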
### find
Get the descendants of each element in the current set of matched elements, filtered by a selector.

    my $q2 = $q->find($selector); # $selector is a CSS3 selector.

**NOTE** If you want to match the element itself, use ["filter"](#filter).
**INCOMPATIBLE CHANGE**
From v0.14 to v0.19 (inclusive) find() also matched the element itself, which is not jQuery compatible.
You can achieve that result using `filter()`, `add()` and `find()`:

    my $wq = wq('<div class="foo"><p>bar</p></div>'); # needed because we don't have a global document like jQuery does
    print $wq->filter('.foo')->add($wq->find('.foo'))->as_html; # <div class="foo"><p>bar</p></div>

### first
Return the first matching element.
This method constructs a new Web::Query object from the first matching element.
### last
Return the last matching element.
This method constructs a new Web::Query object from the last matching element.
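
A short sketch of `first` and `last`, assuming a three-item list whose markup is invented for the example. Both calls return Web::Query objects, so `text` can be chained directly.

```perl
use strict;
use warnings;
use Web::Query;

# Hypothetical markup, used only for illustration.
my $q     = wq('<ol><li>one</li><li>two</li><li>three</li></ol>');
my $items = $q->find('li');

# Each call wraps a single element in a new Web::Query object.
my $first_text = $items->first->text;
my $last_text  = $items->last->text;

print "$first_text .. $last_text\n";
```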
### not($selector)
Return all the elements not matching the `$selector`.

    # $do_for_love will be every thing, except #that
    my $do_for_love = $wq->find('thing')->not('#that');

### and\_back
Add the previous set of elements to the current one.

    # get the h1 plus everything until the next h1
    $wq->find('h1')->next_until('h1')->and_back;

### map
Creates a new array with the results of calling a provided function on every element.

    $q->map(sub { my ($i, $elem) = @_; ... })

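
The following sketch collects the `href` of every link with `map`. It assumes that `map` returns an array reference, which is how "a new array" is read here; if your installed version behaves differently, adjust the dereference accordingly. The markup is invented for the example.

```perl
use strict;
use warnings;
use Web::Query;

# Hypothetical markup, used only for illustration.
my $q = wq('<div><a href="/a">A</a><a href="/b">B</a></div>');

# Assumption: map returns an array reference with one result per element.
my $hrefs = $q->find('a')->map(sub {
    my ($i, $elem) = @_;
    return scalar $elem->attr('href');
});

print join(', ', @$hrefs), "\n";
```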
### parent
Get the parent of each element in the current set of matched elements.
### prev
Get the previous node of each element in the current set of matched elements.

    my $prev = $q->prev;

### next
Get the next node of each element in the current set of matched elements.

    my $next = $q->next;

### next\_until( $selector )
Get all subsequent siblings, up to (but not including) the next node that matches `$selector`.
## MANIPULATION
### add\_class
Adds the specified class(es) to each of the set of matched elements.

    # add class 'foo' to <p> elements
    wq('<div><p>foo</p><p>bar</p></div>')->find('p')->add_class('foo');

### after
Insert content, specified by the parameter, after each element in the set of matched elements.

    wq('<div><p>foo</p></div>')->find('p')
                               ->after('<b>bar</b>')
                               ->end
                               ->as_html; # <div><p>foo</p><b>bar</b></div>

The content can be anything accepted by ["new"](#new).
### append
Insert content, specified by the parameter, to the end of each element in the set of matched elements.

    wq('<div></div>')->append('<p>foo</p>')->as_html; # <div><p>foo</p></div>

The content can be anything accepted by ["new"](#new).
### as\_html
Return the elements associated with the object as strings.
If called in a scalar context, only return the string representation
of the first element.
### attr
Get or set an attribute value on the matched elements.

    my $attr = $q->attr($name);
    $q->attr($name, $val);

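
A minimal sketch of reading and then changing an attribute with `attr`. The markup and the attribute values are invented for the example.

```perl
use strict;
use warnings;
use Web::Query;

# Hypothetical markup, used only for illustration.
my $q    = wq('<div><a href="/old">link</a></div>');
my $link = $q->find('a');

my $before = $link->attr('href');   # one argument: read the value
$link->attr('href', '/new');        # two arguments: set the value
my $after  = $link->attr('href');

print "$before -> $after\n";
```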
### tagname
Get/Set the tag name of elements.

    my $name = $q->tagname;
    $q->tagname($new_name);

### before
Insert content, specified by the parameter, before each element in the set of matched elements.

    wq('<div><p>foo</p></div>')->find('p')
                               ->before('<b>bar</b>')
                               ->end
                               ->as_html; # <div><b>bar</b><p>foo</p></div>

The content can be anything accepted by ["new"](#new).
### clone
Create a deep copy of the set of matched elements.
### detach
Remove the set of matched elements from the DOM.
### has\_class
Determine whether any of the matched elements are assigned the given class.
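
To show `has_class` together with `add_class`, this sketch checks for a class before and after adding it. The markup and the class name `note` are invented for the example.

```perl
use strict;
use warnings;
use Web::Query;

# Hypothetical markup, used only for illustration.
my $q = wq('<div><p>hello</p></div>');
my $p = $q->find('p');

# has_class is used only as a true/false test here.
my $had_class = $p->has_class('note') ? 'yes' : 'no';
$p->add_class('note');
my $has_class = $p->has_class('note') ? 'yes' : 'no';

print "before: $had_class, after: $has_class\n";
```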
### html
Get/Set the innerHTML.

    my @html = $q->html();
    my $html = $q->html(); # 1st matching element only
    $q->html('<p>foo</p>');

### insert\_before
Insert every element in the set of matched elements before the target.
### insert\_after
Insert every element in the set of matched elements after the target.
### prepend
Insert content, specified by the parameter, to the beginning of each element in the set of matched elements.
### remove
Delete the elements associated with the object from the DOM.

    # remove all