Documentation - JavaScript Scenario
Interact with the webpage you want to scrape.
You can also discover this feature using our Postman collection, which covers all of ScrapingBee's features.
Basic usage
If you want to interact with a page before we return its HTML, you can add a JavaScript scenario to your API call.
For example, if you wish to click on a button, you will need to use this scenario:
{
"instructions": [
{"click": "#buttonId"}
]
}
Our scraper will then load the webpage, click on the button #buttonId, and return the HTML of the page.
Important: JavaScript scenarios are JSON-formatted. To pass one in a GET request, you need to stringify it, and URL-encode the resulting string if you build the query string by hand.
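The stringify step can be sketched in Python with only the standard library. This is a minimal illustration, not the official client: build_request_url is a hypothetical helper, and the endpoint and parameter names are the ones used in the examples on this page.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit

# Endpoint as it appears in the request examples on this page.
API_ENDPOINT = "https://app.scrapingbee.com/api/v1"


def build_request_url(api_key, target_url, scenario):
    """Return a GET URL with the scenario stringified and URL-encoded.

    Hypothetical helper for illustration only.
    """
    params = {
        "api_key": api_key,
        "url": target_url,
        # json.dumps turns the dict into a JSON string ("stringify").
        "js_scenario": json.dumps(scenario),
    }
    # urlencode percent-encodes every value, including the JSON string.
    return API_ENDPOINT + "?" + urlencode(params)


scenario = {"instructions": [{"click": "#buttonId"}]}
request_url = build_request_url(
    "YOUR-API-KEY", "https://www.scrapingbee.com/blog", scenario
)
print(request_url)

# Round-trip check: decoding the query string gives back the same scenario.
decoded = parse_qs(urlsplit(request_url).query)
assert json.loads(decoded["js_scenario"][0]) == scenario
```

Building the scenario as a dict and serializing it at the last moment avoids hand-written JSON strings, which are easy to get wrong.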
# Install the Python ScrapingBee library:
# pip install scrapingbee
from scrapingbee import ScrapingBeeClient
client = ScrapingBeeClient(api_key='YOUR-API-KEY')
response = client.get(
'https://www.scrapingbee.com/blog',
params={
'js_scenario': {"instructions": [{ "click": "#buttonId" }]},
},
)
print('Response HTTP Status Code: ', response.status_code)
print('Response HTTP Response Body: ', response.content)
// request Axios
const axios = require('axios');
axios.get('https://app.scrapingbee.com/api/v1', {
params: {
'api_key': 'YOUR-API-KEY',
'url': 'https://www.scrapingbee.com/blog',
'js_scenario': '{"instructions": [{"click": "#buttonId"}]}',
}
}).then(function (response) {
// handle success
console.log(response);
})
// Java: URL-encode parameter values before adding them to the query string
String encoded_url = URLEncoder.encode("YOUR URL", "UTF-8");
require 'net/http'
require 'net/https'
require 'uri'
# Classic (GET)
def send_request
# URL-encode the stringified scenario (URI::encode was removed in Ruby 3.0)
js_scenario = URI.encode_www_form_component('{"instructions": [{ "click": "#buttonId" }]}')
uri = URI('https://app.scrapingbee.com/api/v1/?api_key=YOUR-API-KEY&url=https://www.scrapingbee.com/blog&js_scenario=' + js_scenario)
# Create client
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
http.verify_mode = OpenSSL::SSL::VERIFY_PEER
# Create Request
req = Net::HTTP::Get.new(uri)
# Fetch Request
res = http.request(req)
puts "Response HTTP Status Code: #{ res.code }"
puts "Response HTTP Response Body: #{ res.body }"
rescue StandardError => e
puts "HTTP Request failed (#{ e.message })"
end
send_request()
<?php
// get cURL resource
$ch = curl_init();
// set url
$js_scenario = urlencode('{"instructions": [{ "click": "#buttonId" }]}');
curl_setopt($ch, CURLOPT_URL, 'https://app.scrapingbee.com/api/v1/?api_key=YOUR-API-KEY&url=https://www.scrapingbee.com/blog&js_scenario=' . $js_scenario);
// set method
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'GET');
// return the transfer as a string
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// send the request and save response to $response
$response = curl_exec($ch);
// stop if fails
if (!$response) {
die('Error: "' . curl_error($ch) . '" - Code: ' . curl_errno($ch));
}
echo 'HTTP Status Code: ' . curl_getinfo($ch, CURLINFO_HTTP_CODE) . PHP_EOL;
echo 'Response Body: ' . $response . PHP_EOL;
// close curl resource to free up system resources
curl_close($ch);
?>
package main
import (
"fmt"
"io/ioutil"
"net/http"
"net/url"
)
func sendClassic() {
// Create client
client := &http.Client{}
// URL-encode the stringified scenario
jsScenario := url.QueryEscape(`{"instructions": [{ "click": "#buttonId" }]}`)
// Create request
req, err := http.NewRequest("GET", "https://app.scrapingbee.com/api/v1/?api_key=YOUR-API-KEY&url=https://www.scrapingbee.com/blog&js_scenario=" + jsScenario, nil)
if err != nil {
fmt.Println("Failure : ", err)
return
}
// Fetch Request
resp, err := client.Do(req)
if err != nil {
fmt.Println("Failure : ", err)
// resp is nil on error, so stop here
return
}
defer resp.Body.Close()
// Read Response Body
respBody, _ := ioutil.ReadAll(resp.Body)
// Display Results
fmt.Println("response Status : ", resp.Status)
fmt.Println("response Headers : ", resp.Header)
fmt.Println("response Body : ", string(respBody))
}
func main() {
sendClassic()
}
You can add multiple instructions to a scenario. They are executed one by one, in order, on our end.
Below is a quick overview of all the different instructions you can use.
{"evaluate": "console.log('foo')"} # Run custom JavaScript
{"click": "#button_id"} # Click on an element
{"wait": 1000} # Wait for a fixed duration in ms
{"wait_for": "#slow_div"} # Wait for an element matching a CSS selector to appear
{"wait_for_and_click": "#slow_div"} # Wait for an element matching a CSS selector to appear, then click on it
{"scroll_x": 1000} # Scroll the screen in the horizontal axis, in px
{"scroll_y": 1000} # Scroll the screen in the vertical axis, in px
{"fill": ["#input_1", "value_1"]} # Fill some input
You can use them in any order, and you can use the same instruction multiple times in one scenario.
Here is an example of a scenario that waits for a button to appear, clicks on it, then scrolls, waits a bit, and scrolls again.
{
"instructions": [
{"wait_for_and_click": "#slow_button"},
{"scroll_x": 1000},
{"wait": 1000},
{"scroll_x": 1000},
{"wait": 1000}
]
}
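The same scenario can be assembled and sanity-checked in Python before it is sent. The sketch below is illustrative only: validate_scenario is a hypothetical helper, the set of names is copied from the overview list above, and the one-instruction-per-step rule is an assumption based on how every example on this page is written.

```python
import json

# Instruction names taken from the overview list above.
KNOWN_INSTRUCTIONS = {
    "click", "wait", "wait_for", "wait_for_and_click",
    "scroll_x", "scroll_y", "fill", "evaluate",
}


def validate_scenario(scenario):
    """Raise ValueError if a step is not a single known instruction.

    Hypothetical client-side check; the API performs its own validation.
    """
    for index, step in enumerate(scenario["instructions"]):
        if len(step) != 1:
            raise ValueError(f"step {index} must contain exactly one instruction")
        (name,) = step  # the single key of the step dict
        if name not in KNOWN_INSTRUCTIONS:
            raise ValueError(f"step {index} uses unknown instruction {name!r}")


scenario = {
    "instructions": [
        {"wait_for_and_click": "#slow_button"},
        {"scroll_x": 1000},
        {"wait": 1000},
        {"scroll_x": 1000},
        {"wait": 1000},
    ]
}
validate_scenario(scenario)

# json.dumps always emits valid JSON, so there is no risk of a stray trailing comma.
scenario_json = json.dumps(scenario)
print(scenario_json)
```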
Clicking on a button
click
CSS selector
To click on a button, use this instruction with the CSS selector of the button you want to click on.
If you want to click on the button whose id is secretButton, you need to use this JavaScript scenario:
{
"instructions": [
{"click": "#secretButton"}
]
}
Wait for a fixed amount of time
wait
duration in ms
To wait for a fixed amount of time, use this instruction with the duration, in ms, you want to wait for.
If you want to wait for 2 seconds, you need to use this JavaScript scenario:
{
"instructions": [
{"wait": 2000}
]
}
Wait for an element to appear
wait_for
CSS selector
To wait for a particular element to appear, use this instruction with the CSS selector of the element you want to wait for.
If you want to wait for the element whose class is slow_div to appear before getting some results, you need to use this JavaScript scenario:
{
"instructions": [
{"wait_for": ".slow_div"}
]
}
Wait for an element to appear and click
wait_for_and_click
CSS selector
To wait for a particular element to appear, and then click on it, use this instruction.
If you want to wait for the element whose class is slow_div to appear before clicking on it, you need to use this JavaScript scenario:
{
"instructions": [
{"wait_for_and_click": ".slow_div"}
]
}
Note: this is exactly the same as using:
{
"instructions": [
{"wait_for": ".slow_div"},
{"click": ".slow_div"}
]
}
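The equivalence stated in the note can be written down as a small transformation. This is a sketch only; expand_wait_for_and_click is a hypothetical helper that rewrites a list of instructions on the client side and does not call the API.

```python
def expand_wait_for_and_click(instructions):
    """Replace each wait_for_and_click step with a wait_for step and a click step."""
    expanded = []
    for step in instructions:
        if "wait_for_and_click" in step:
            selector = step["wait_for_and_click"]
            expanded.append({"wait_for": selector})
            expanded.append({"click": selector})
        else:
            # Every other instruction is kept unchanged.
            expanded.append(step)
    return expanded


short_form = [{"wait_for_and_click": ".slow_div"}, {"wait": 1000}]
long_form = expand_wait_for_and_click(short_form)
print(long_form)
```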
Scroll Horizontally
scroll_x
number of pixels
To scroll horizontally on a page, use this instruction with the number of pixels you want to scroll.
If you want to scroll 1000px horizontally, you need to use this JavaScript scenario:
{
"instructions": [
{"scroll_x": 1000}
]
}
Scroll Vertically
scroll_y
number of pixels
To scroll vertically on a page, use this instruction with the number of pixels you want to scroll.
If you want to scroll down 1000px, you need to use this JavaScript scenario:
{
"instructions": [
{"scroll_y": 1000}
]
}
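Pages that load more content as you scroll usually need several scroll steps with a pause between them. The sketch below generates such a sequence; scroll_steps is a hypothetical helper, and the 500 ms pause is an arbitrary value chosen for illustration, not a recommended setting.

```python
import json


def scroll_steps(times, pixels=1000, pause_ms=500):
    """Return `times` pairs of scroll_y and wait instructions."""
    steps = []
    for _ in range(times):
        steps.append({"scroll_y": pixels})  # scroll down by `pixels`
        steps.append({"wait": pause_ms})    # give new content time to load
    return steps


scenario = {"instructions": scroll_steps(3)}
print(json.dumps(scenario, indent=2))
```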
Filling form input
fill
[ CSS selector, value ]
To fill an input, use this instruction with the CSS selector of the input you want to fill and the value you want to fill it with.
If you want to fill an input whose CSS selector is #input_1 with the value value_1, you need to use this JavaScript scenario:
{
"instructions": [
{"fill": ["#input_1", "value_1"]}
]
}
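A typical use of fill is to complete several inputs and then submit the form. The following sketch builds such a scenario; fill_form_steps is a hypothetical helper, and the selectors #username, #password, and #submit are placeholders that do not refer to any real page.

```python
import json


def fill_form_steps(fields, submit_selector):
    """Return one fill step per field, followed by a click on the submit element."""
    # dicts preserve insertion order, so fields are filled in the order given.
    steps = [{"fill": [selector, value]} for selector, value in fields.items()]
    steps.append({"click": submit_selector})
    return steps


scenario = {
    "instructions": fill_form_steps(
        {"#username": "alice", "#password": "secret"},  # placeholder values
        "#submit",
    )
}
print(json.dumps(scenario))
```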
Executing custom JavaScript
evaluate
JavaScript code
If you need more flexibility, use this instruction to run custom JavaScript on the page.
If you want to run the code console.log('foo') on the webpage, you need to use this JavaScript scenario:
{
"instructions": [
{"evaluate": "console.log('foo')"}
]
}
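JavaScript snippets often contain double quotes, which must be escaped inside a JSON string. Letting a JSON serializer do the escaping is less error-prone than doing it by hand, as this sketch shows. The selector #load-more is a placeholder chosen for illustration.

```python
import json

# JavaScript containing double quotes, which are special characters in JSON.
js_code = 'document.querySelector("#load-more").click()'

scenario = {"instructions": [{"evaluate": js_code}]}

# json.dumps escapes the inner double quotes automatically.
scenario_json = json.dumps(scenario)
print(scenario_json)

# Parsing the JSON string returns the original JavaScript unchanged.
assert json.loads(scenario_json)["instructions"][0]["evaluate"] == js_code
```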
Timeout
Your whole scenario should not take more than 40 seconds to complete; otherwise, the API will time out.
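A rough client-side check against this limit can be sketched as follows. It only adds up the explicit wait steps, so the result is a lower bound: page loading, wait_for, clicks, and custom JavaScript all take additional time that cannot be known in advance. Both function names are hypothetical.

```python
SCENARIO_BUDGET_MS = 40_000  # the 40-second limit stated above


def fixed_wait_ms(instructions):
    """Sum the durations of all explicit wait steps, in milliseconds."""
    return sum(step["wait"] for step in instructions if "wait" in step)


def may_fit_budget(instructions):
    """Return False when the fixed waits alone already exceed the budget.

    True does not guarantee success, because other steps also take time.
    """
    return fixed_wait_ms(instructions) <= SCENARIO_BUDGET_MS


instructions = [
    {"wait_for_and_click": "#slow_button"},
    {"wait": 2000},
    {"scroll_y": 1000},
    {"wait": 1000},
]
print(fixed_wait_ms(instructions), may_fit_budget(instructions))
```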