The crawler plugin allows you to crawl websites and extract data from them without using a browser.
```shell
pip install botcity-crawler-plugin
```
Linux System Dependencies
For Debian/Ubuntu, please run the following command:

```shell
apt install libxcomposite1 libxcursor1 libxdamage1 \
  libxfixes3 libxi6 libxtst6 libnss3 libnspr4 libcups2 \
  libdbus-1-3 libxrandr2 libasound2 libatk1.0-0 libatk-bridge2.0-0 \
  libgtk-3-0 libx11-xcb1 --no-install-recommends
```
Please make sure to install the equivalent libraries for your Linux distribution.
Importing the Plugin
After installing this package, the next step is to import it into your code and start using its functions.
```python
from botcity.plugins.crawler import BotCrawlerPlugin
```
Making the Request
To make the request, use the `request` method, which takes a URL as an argument.
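The core idea of the plugin, fetching a page's HTML over HTTP without launching a browser, can be sketched with only the Python standard library. This is an illustration of the concept, not the plugin's implementation; the `data:` URL below is a stand-in for a real website address.

```python
from urllib.request import urlopen

# Stand-in page; in practice this would be a real http(s) URL.
page = "data:text/html,<html><body><span%20id='subscriber-count'>42</span></body></html>"

# Fetch the raw HTML without launching a browser.
html_source = urlopen(page).read().decode()
print(html_source)
```

With the page source in hand as a plain string, any element can then be located and read by parsing the HTML.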
Locating an Element
Looking at the page source from the previous example, we can see that the element holding the subscriber information has the `id` attribute `subscriber-count`.
Here is how we can read the value of the element:
```python
# This sets the current element on the HTML object to the one found
html.get_element_by_id("subscriber-count")

# Read the value into the subscribers variable
subscribers = html.value()
```
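As a rough, dependency-free sketch of what "locate the element by id, then read its value" means, here is one way to do it with Python's built-in `html.parser`. The class name and page source below are illustrative only; this is not how the plugin is implemented internally.

```python
from html.parser import HTMLParser

class IdTextExtractor(HTMLParser):
    """Collects the text inside the first element with a given id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.capturing = False
        self.text = ""

    def handle_starttag(self, tag, attrs):
        # Start capturing text when the element with the target id opens.
        if dict(attrs).get("id") == self.target_id:
            self.capturing = True

    def handle_endtag(self, tag):
        # Naive: stops at the first closing tag, which is enough
        # for a flat element like the one in this example.
        self.capturing = False

    def handle_data(self, data):
        if self.capturing:
            self.text += data

# Illustrative page source standing in for a real crawl result.
source = "<html><body><span id='subscriber-count'>1.23M subscribers</span></body></html>"

parser = IdTextExtractor("subscriber-count")
parser.feed(source)
subscribers = parser.text
print(subscribers)  # 1.23M subscribers
```

The plugin's `get_element_by_id` and `value` calls play the same roles as locating the tag and accumulating its text here, just wrapped in a convenient API.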