botcity.plugins.crawler.plugin.BotCrawlerPlugin
javascript_enabled: bool
property
writable
Whether or not JavaScript should be enabled when making the request.
__init__(self, javascript_enabled=False)
special
BotCrawlerPlugin
Parameters:
Name | Type | Description | Default |
---|---|---|---|
javascript_enabled | bool | Whether or not JavaScript should be enabled when making requests. Defaults to False. | False |
request(self, url, wait_time=0)
Executes a request to the given URL.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
url | str | The desired URL. | required |
wait_time | int | The number of milliseconds to wait after initial render. | 0 |
Returns:
Type | Description |
---|---|
HTML | An HTML object which can be used to parse elements. See HTML. |
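A minimal sketch of the request flow described above, assuming the plugin is installed and importable from the module path shown on this page. The helper `fetch_html`, the `main` function, and the URL are illustrative names, not part of the plugin.

```python
def fetch_html(crawler, url, wait_time=0):
    """Request `url` through a BotCrawlerPlugin-like object.

    `wait_time` is in milliseconds, matching the parameter table above.
    """
    if wait_time < 0:
        raise ValueError("wait_time must be a non-negative number of milliseconds")
    return crawler.request(url, wait_time=wait_time)


def main():
    # Imported lazily so the helper above can be used without the plugin
    # installed. The module path is the one listed on this page.
    from botcity.plugins.crawler.plugin import BotCrawlerPlugin

    # Enable JavaScript when making the request.
    crawler = BotCrawlerPlugin(javascript_enabled=True)
    # Wait 2000 ms after the initial render; the URL is a placeholder.
    html = fetch_html(crawler, "https://example.com", wait_time=2000)
    print(len(html.elements()))
```

Calling `main()` performs a real network request, so it is left uncalled here.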
botcity.plugins.crawler.html.HTML
__init__(self, html, javascript_enabled=False)
special
HTML representation of a page.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
html | requests_html.HTML | the page html object from | required |
javascript_enabled | bool | Whether or not JavaScript was enabled for this request. Defaults to False. | False |
elements(self)
Returns all child elements.
Returns:
Type | Description |
---|---|
List | List of elements. |
execute_javascript(self, code)
Executes the specified JavaScript code within the page.
The usage would be similar to what can be achieved when executing JavaScript in the current page by entering "javascript:...some JS code..." in the URL field of a browser.
If JavaScript was not enabled on the Plugin before the request, calls to this method will be ignored.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
code | str | The JavaScript code to be executed. | required |
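Because calls to `execute_javascript` are ignored when JavaScript was not enabled, a caller may want to know whether a script was actually issued. The sketch below is one way to do that, assuming the plugin's `javascript_enabled` property still reflects the setting used for the request. `run_script` is a hypothetical helper, not part of the plugin.

```python
def run_script(crawler, html, code):
    """Run `code` in the page only if the plugin has JavaScript enabled.

    Returns True if execute_javascript was called, False if it was skipped.
    """
    if not crawler.javascript_enabled:
        # The call would be ignored anyway, so report that explicitly.
        return False
    html.execute_javascript(code)
    return True
```

A typical call would look like `run_script(crawler, html, "window.scrollTo(0, document.body.scrollHeight);")`.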
get_attribute(self, attribute)
Returns the value of the attribute in an element.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
attribute | str | The attribute name of the element. | required |
Exceptions:
Type | Description |
---|---|
RuntimeError | If the element has no attributes. |
KeyError | [description] |
Returns:
Type | Description |
---|---|
object | The attribute value. |
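Since `get_attribute` is documented to raise `RuntimeError` and `KeyError`, a caller that prefers a fallback value can wrap it. This is a minimal sketch. `attribute_or_default` is a hypothetical helper, and the reading that `KeyError` signals a missing attribute is an assumption, as the reference leaves that case undescribed.

```python
def attribute_or_default(html, name, default=None):
    """Return the attribute `name` of the current element, or `default`.

    RuntimeError: documented for elements with no attributes.
    KeyError: assumed here to mean the attribute is absent.
    """
    try:
        return html.get_attribute(name)
    except (RuntimeError, KeyError):
        return default
```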
get_element_by_id(self, id)
Searches the document for the element whose id matches the given id.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
id | str | Unique identifier of the element. | required |
query_selector(self, selectors, reset=False)
Searches for the first element within the document that matches the specified group of selectors.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
selectors | str | One or more selectors. | required |
reset | bool | Whether or not to reset the current element before the search. | False |
Returns:
Type | Description |
---|---|
HTML | This object. |
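Because `query_selector` returns the same `HTML` object, calls can be chained. The sketch below reads the value of the first match, passing `reset=True` so the search does not depend on an earlier query. `first_value` is a hypothetical helper name.

```python
def first_value(html, selectors):
    """Return the value of the first element matching `selectors`.

    reset=True resets the current element before the search, so the
    result does not depend on any previous query on `html`.
    """
    return html.query_selector(selectors, reset=True).value()
```

For example, `first_value(html, "h1")` would read the first `h1` on the page.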
query_selector_all(self, selectors, index, reset=False)
Searches for all elements within the document that match the specified group of selectors and returns the element at the specified index.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
selectors | str | One or more selectors. | required |
index | int | The index of the element in the list. | required |
reset | bool | Whether or not to reset the current element before the search. | False |
Returns:
Type | Description |
---|---|
HTML | This object. |
query_selector_all_size(self, selectors, reset=False)
Searches for all elements within the document that match the specified group of selectors and returns the number of elements.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
selectors | str | One or more selectors. | required |
reset | bool | Whether or not to reset the current element before the search. | False |
Returns:
Type | Description |
---|---|
int | Number of elements. |
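`query_selector_all_size` and `query_selector_all` can be combined to visit every match by index. This is a sketch under the assumption that `query_selector_all` moves the current element to the chosen match, which is why each call passes `reset=True` to search from the top of the page again. `collect_values` is a hypothetical helper.

```python
def collect_values(html, selectors):
    """Return the values of all elements matching `selectors`, in order."""
    count = html.query_selector_all_size(selectors, reset=True)
    values = []
    for index in range(count):
        # reset=True keeps each lookup independent of the previous one.
        element = html.query_selector_all(selectors, index, reset=True)
        values.append(element.value())
    return values
```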
query_selector_iter_all(self, selectors, reset=False)
Searches for all elements within the document that match the specified group of selectors and iterates over the results, setting the current element on each step.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
selectors | str | One or more selectors. | required |
reset | bool | Whether or not to reset the current element before the search. | False |
Returns:
Type | Description |
---|---|
HTML | This object. |
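A sketch of how the iteration might be used, assuming the return value of `query_selector_iter_all` can be looped over and that each step yields the `HTML` object with its current element set to the next match. The reference does not spell this out, so treat it as an assumption. `collect_attribute` is a hypothetical helper.

```python
def collect_attribute(html, selectors, attribute):
    """Collect `attribute` from every element matching `selectors`."""
    results = []
    # Assumed behaviour: each iteration yields the HTML object positioned
    # on the next matching element.
    for element in html.query_selector_iter_all(selectors, reset=True):
        results.append(element.get_attribute(attribute))
    return results
```

For instance, `collect_attribute(html, "a", "href")` would gather link targets, provided every matched element carries that attribute.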
reset(self)
Resets the current element to the top of the page.
value(self)
Returns the value of an element.
Returns:
Type | Description |
---|---|
str | The element value. |
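The `reset` parameter and the `reset` method suggest that searches run relative to the current element. Under that assumption, the sketch below narrows the search to a container, reads a value inside it, and always restores the top of the page afterwards. `value_within` is a hypothetical helper.

```python
def value_within(html, container_selector, inner_selector):
    """Return the value of `inner_selector` found inside `container_selector`.

    The current element is reset to the top of the page afterwards, even
    if one of the lookups raises.
    """
    try:
        # Start from the top of the page and move to the container.
        html.query_selector(container_selector, reset=True)
        # Assumed to search relative to the container selected above.
        return html.query_selector(inner_selector).value()
    finally:
        html.reset()
```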