
botcity.plugins.crawler.plugin.BotCrawlerPlugin

javascript_enabled: bool property writable

Whether or not JavaScript should be enabled when making the request.

__init__(self, javascript_enabled=False) special

Creates a BotCrawlerPlugin instance.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| javascript_enabled | bool | Whether or not JavaScript should be enabled when making requests. Defaults to False. | False |

request(self, url, wait_time=0)

Executes a request to the given URL.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| url | str | The desired URL. | required |
| wait_time | int | The number of milliseconds to wait after initial render. | 0 |

Returns:

| Type | Description |
| --- | --- |
| HTML | An HTML object that can be used to parse elements. See botcity.plugins.crawler.html.HTML. |
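As a rough illustration of how the two classes fit together, the sketch below wraps `request` in a small helper. The helper name `fetch_page`, the `wait_ms` argument, and the example URL are hypothetical; only `BotCrawlerPlugin`, `javascript_enabled`, `request`, `wait_time`, and `elements` come from this reference.

```python
def fetch_page(crawler, url, wait_ms=0):
    """Request `url` and return the resulting HTML object.

    `crawler` is expected to be a BotCrawlerPlugin, or any object exposing
    the same `request(url, wait_time=...)` method.
    """
    if wait_ms < 0:
        raise ValueError("wait_ms must be zero or positive")
    # wait_time is documented as milliseconds to wait after the initial render.
    return crawler.request(url, wait_time=wait_ms)


def main():
    # Imported lazily so the helper above can be reused without the plugin installed.
    from botcity.plugins.crawler.plugin import BotCrawlerPlugin

    crawler = BotCrawlerPlugin(javascript_enabled=True)
    html = fetch_page(crawler, "https://example.com", wait_ms=2000)  # hypothetical URL
    print(len(html.elements()))
```

Calling `main()` requires the plugin to be installed and network access; the helper itself only depends on the documented `request` signature.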

botcity.plugins.crawler.html.HTML

__init__(self, html, javascript_enabled=False) special

HTML representation of a page.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| html | requests_html.HTML | The page HTML object from requests_html. | required |
| javascript_enabled | bool | Whether or not JavaScript was enabled for this request. Defaults to False. | False |

elements(self)

Returns all child elements.

Returns:

| Type | Description |
| --- | --- |
| List | List of elements. |

execute_javascript(self, code)

Executes the specified JavaScript code within the page.

This is similar to running JavaScript in the current page by entering "javascript:...some JS code..." in the URL field of a browser.

If JavaScript was not enabled on the plugin before the request, calls to this method are ignored.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| code | str | The JavaScript code to be executed. | required |
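Because the call is silently ignored when JavaScript is disabled, it can help to make that case explicit. The following is a minimal sketch; the helper name `scroll_to_bottom` and the JavaScript snippet are illustrative, and it assumes the plugin's `javascript_enabled` flag has not been changed since the request was made.

```python
def scroll_to_bottom(crawler, html):
    """Ask the page to scroll to its end; return True if the call was attempted."""
    # The reference states that calls are ignored when JavaScript was not
    # enabled, so check the plugin flag first to make the no-op visible.
    if not crawler.javascript_enabled:
        return False
    html.execute_javascript("window.scrollTo(0, document.body.scrollHeight);")
    return True
```

Here `crawler` would be the `BotCrawlerPlugin` instance and `html` the object returned by its `request` method.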

get_attribute(self, attribute)

Returns the value of the given attribute of the current element.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| attribute | str | The name of the attribute. | required |

Exceptions:

| Type | Description |
| --- | --- |
| RuntimeError | If the element has no attributes. |
| KeyError | If the requested attribute is not present on the element. |

Returns:

| Type | Description |
| --- | --- |
| object | The attribute value. |
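Since `get_attribute` can raise, callers that only want a best-effort value may prefer a small wrapper. This is a sketch under assumptions: the helper name `attribute_or_default` is hypothetical, and the `KeyError` case is assumed to mean that the requested attribute is missing.

```python
def attribute_or_default(html, name, default=None):
    """Return the attribute value, or `default` if it cannot be read."""
    try:
        return html.get_attribute(name)
    except (RuntimeError, KeyError):
        # RuntimeError: the element has no attributes (documented).
        # KeyError: also listed in the reference; assumed here to mean the
        # requested attribute is missing from the element.
        return default
```

For example, `attribute_or_default(html, "href", default="")` would yield an empty string instead of an exception for elements without a link target.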

get_element_by_id(self, id)

Searches the document for the element that matches the given id.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| id | str | Unique identifier of the element. | required |

query_selector(self, selectors, reset=False)

Searches the document for the first element that matches the specified group of selectors.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| selectors | str | One or more selectors | required |
| reset | bool | Whether or not to reset the current element before the search. | False |

Returns:

| Type | Description |
| --- | --- |
| HTML | This object. |
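Because `query_selector` returns the same HTML object, calls can be chained. The sketch below assumes that, with `reset` left at False, a search is scoped to the element selected by the previous call; the helper name `first_link_href` and the selectors are illustrative only.

```python
def first_link_href(html, container_selector):
    """Return the href of the first <a> inside the first matching container."""
    # reset=True starts the search from the top of the page rather than from
    # whatever element a previous query left selected.
    html.query_selector(container_selector, reset=True)
    # Assumption: with reset left at False, this search is scoped to the
    # container selected on the previous line.
    return html.query_selector("a").get_attribute("href")
```

A call such as `first_link_href(html, "nav")` would then read the first link inside the page's navigation element, if that scoping assumption holds.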

query_selector_all(self, selectors, index, reset=False)

Searches the document for all elements that match the specified group of selectors and returns the one at the specified index.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| selectors | str | One or more selectors | required |
| index | int | The index of the element in the list of matches. | required |
| reset | bool | Whether or not to reset the current element before the search. | False |

Returns:

| Type | Description |
| --- | --- |
| HTML | This object. |

query_selector_all_size(self, selectors, reset=False)

Searches the document for all elements that match the specified group of selectors and returns the number of matching elements.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| selectors | str | One or more selectors | required |
| reset | bool | Whether or not to reset the current element before the search. | False |

Returns:

| Type | Description |
| --- | --- |
| int | Number of elements. |
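`query_selector_all_size` and `query_selector_all` can be combined to walk over every match by index. The following is a minimal sketch; the helper name `collect_values` is hypothetical, and passing `reset=True` on each call reflects the assumption that a search otherwise starts from the previously selected element.

```python
def collect_values(html, selectors):
    """Return value() for each element matching `selectors`, in index order."""
    count = html.query_selector_all_size(selectors, reset=True)
    values = []
    for index in range(count):
        # reset=True so each lookup is made against the whole page instead of
        # the element selected by the previous iteration (an assumption).
        values.append(html.query_selector_all(selectors, index, reset=True).value())
    return values
```

For instance, `collect_values(html, "li")` would gather the value of every list item on the page.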

query_selector_iter_all(self, selectors, reset=False)

Searches the document for all elements that match the specified group of selectors and iterates over the results, setting the current element at each step.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| selectors | str | One or more selectors | required |
| reset | bool | Whether or not to reset the current element before the search. | False |

Returns:

| Type | Description |
| --- | --- |
| HTML | This object. |
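The iterating variant can replace the size-and-index pattern. The sketch below assumes the method can be used directly in a `for` loop and that every step hands back the same HTML object with a different current element, so values should be read inside the loop rather than stored as objects. The helper name `iter_values` is hypothetical.

```python
def iter_values(html, selectors):
    """Yield value() for every element matching `selectors`."""
    for current in html.query_selector_iter_all(selectors, reset=True):
        # Read the value right away: each step moves the "current element",
        # and `current` is assumed to be the same HTML instance every time.
        yield current.value()
```

A caller could then write `titles = list(iter_values(html, "h2"))` to collect the values in one pass.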

reset(self)

Resets the current element to the top of the page.

value(self)

Returns the value of the current element.

Returns:

| Type | Description |
| --- | --- |
| str | The element value. |
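When several unrelated fields are read from one page, calling `reset` between lookups keeps each search independent of the previous one. This is a sketch under that reading of the reference; the helper name `read_fields` and the selectors in the usage note are illustrative.

```python
def read_fields(html, selectors_by_name):
    """Return {field name: value} for a mapping of field name -> selector."""
    result = {}
    for name, selector in selectors_by_name.items():
        # Go back to the top of the page so the previous lookup does not
        # influence this one.
        html.reset()
        result[name] = html.query_selector(selector).value()
    return result
```

For example, `read_fields(html, {"title": "h1", "price": ".price"})` would return a dictionary with one entry per field name.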