chromiumoxide library to interact with these sites.
When to Use Browser Blocks
Use browser automation when:
- The site uses JavaScript to render content
- You need to solve CAPTCHAs visually
- The login flow requires multiple clicks/interactions
- Standard HTTP requests return Cloudflare challenges
Avoid browser automation when:
- Simple HTTP requests work fine (5-10x faster)
- You’re checking thousands of accounts (too slow)
- The site has a documented API
Browser Automation Blocks
| Block | Purpose |
|---|---|
| BrowserOpen | Launch headless Chromium |
| NavigateTo | Open a URL |
| ClickElement | Click a button/link |
| TypeText | Fill an input field |
| WaitForElement | Wait for element to appear |
| GetElementText | Extract text/attribute |
| Screenshot | Capture page/element image |
| ExecuteJs | Run JavaScript code |
Add BrowserOpen Block
Drag BrowserOpen from the palette to start your pipeline.
Configure:
Extra args hide automation flags to avoid detection.
Common args:
Navigate to Login Page
Add NavigateTo block:
After navigation:
- `data.SOURCE` contains the page HTML
- `data.ADDRESS` contains the current URL
Fill Login Form
Add TypeText blocks for username and password:
Username field:
Password field:
Selectors can be:
- CSS: `input[name="email"]`, `#login-btn`, `.submit-button`
- Text-based: `//button[contains(text(), 'Sign In')]`
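This page does not say how a block distinguishes the two selector kinds. As a rough sketch, assuming a simple prefix heuristic (the helper name `selector_kind` is hypothetical and not part of the tool), the text-based form shown above is an XPath expression and can be recognized by its leading characters:

```python
def selector_kind(selector: str) -> str:
    """Guess whether a selector string is XPath or CSS.

    Hypothetical helper for illustration only; the tool's real
    detection logic is not documented here and may differ.
    """
    s = selector.strip()
    # XPath expressions typically start with '/', './', '../' or '('.
    if s.startswith(("/", "./", "../", "(")):
        return "xpath"
    # Everything else is treated as a CSS selector.
    return "css"


if __name__ == "__main__":
    examples = [
        'input[name="email"]',
        "#login-btn",
        ".submit-button",
        "//button[contains(text(), 'Sign In')]",
    ]
    for sel in examples:
        print(f"{sel!r} -> {selector_kind(sel)}")
```

A class selector such as `.submit-button` starts with a dot but not with `./`, so it is still classified as CSS.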
Submit the Form
Add ClickElement block:
If the button triggers a page load, enable Wait For Navigation to wait for the response.
Wait for Result
Some sites show success/error messages after a delay.
Add WaitForElement:
If the element doesn’t appear within 10s, the block throws an error (caught by safe mode or triggers Retry).
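The wait-then-fail behavior described above can be pictured as a polling loop with a deadline. This is a minimal sketch of that general pattern, not the block's actual implementation; `wait_for`, its parameters, and the 0.25s polling interval are assumptions made for illustration:

```python
import time
from typing import Callable


def wait_for(condition: Callable[[], bool], timeout: float = 10.0,
             interval: float = 0.25) -> None:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Raises TimeoutError if the condition never becomes true in time.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout}s")
        time.sleep(interval)


if __name__ == "__main__":
    # Stand-in for "the element exists": becomes true on the third check.
    checks = {"count": 0}

    def element_present() -> bool:
        checks["count"] += 1
        return checks["count"] >= 3

    wait_for(element_present, timeout=2.0, interval=0.01)
    print("element appeared after", checks["count"], "checks")
```

Raising an exception on timeout lets the caller decide what happens next, which matches the description of the error being caught by safe mode or triggering a retry.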
Extract Data
Add GetElementText to capture account info:
Attributes:
- `innerText` - visible text
- `textContent` - all text (including hidden)
- `value` - input field value
- `href` - link URL
- `src` - image URL
- Any custom attribute: `data-user-id`
Check Success with KeyCheck
Add KeyCheck to classify the result:
Use `data.ADDRESS` to check URL changes (common pattern after login).
Optional: Take Screenshot
For debugging, capture a screenshot:
The screenshot is saved as base64 in the variable. Useful for:
- Manual review of failures
- Detecting CAPTCHAs visually
- Debugging layout issues
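To review a captured screenshot outside the tool, the base64 string has to be decoded back into image bytes. The sketch below assumes the variable holds plain base64 with no `data:` URI prefix; the image format is not stated on this page, and `save_screenshot` and the file name are hypothetical:

```python
import base64
from pathlib import Path


def save_screenshot(b64_data: str, path: str) -> int:
    """Decode a base64 string and write the raw bytes to `path`.

    Returns the number of bytes written. Assumes plain base64 input
    (no 'data:image/...;base64,' prefix).
    """
    raw = base64.b64decode(b64_data)
    Path(path).write_bytes(raw)
    return len(raw)


if __name__ == "__main__":
    # Stand-in payload: the 8-byte PNG file signature, base64-encoded.
    sample = base64.b64encode(b"\x89PNG\r\n\x1a\n").decode("ascii")
    written = save_screenshot(sample, "debug_screenshot.png")
    print("bytes written:", written)
```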
Test Your Pipeline
Press F5 to debug.
Watch the browser window (if headless: false):
- Does it find the login form?
- Are credentials typed correctly?
- Does the submit button click?
- `data.SOURCE` shows the final page HTML
- `data.ADDRESS` shows the final URL
- Variables show extracted data
Real-World Example: Cloudflare Bypass
Cloudflare’s challenge page uses JavaScript. Browser automation can solve it:
Using Proxies with Browser Blocks
Set the proxy in BrowserOpen:
Set `CURRENT_PROXY` in a startup block or Script block that fetches from your proxy pool.
Advanced: Execute JavaScript
Some interactions require JavaScript:
Then reference `<CAPTCHA_ANSWER>` in a TypeText block.
Common JS tasks:
- Extract data from `window.INITIAL_STATE`
- Trigger events: `document.querySelector('.btn').click()`
- Modify the page: `document.body.style.display = 'none'`
- Solve simple math CAPTCHAs
Implementation Details
Browser blocks are implemented in `src/pipeline/engine/browser.rs`:
- Uses `chromiumoxide` crate (CDP protocol)
- Spawns a background task for CDP event handling
- Browser handle persists across blocks in the same pipeline run
Performance Tips
- Use headless mode (headless: true) - 20-30% faster than windowed
- Disable images - add `--blink-settings=imagesEnabled=false` to Extra Args
- Lower thread count - browsers are memory-intensive (max 10-20 concurrent)
- Reuse browser instances - don’t open/close per account
- Combine with HTTP - use browser only for login, then switch to HTTP for data extraction
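The thread-count tip comes down to simple memory arithmetic. As a rough sizing sketch, using the 200-500MB per-browser figure this guide gives under troubleshooting (the function name and the 1024MB reserve for the OS and the tool itself are assumptions, not documented values):

```python
def max_concurrent_browsers(available_mb: int, per_browser_mb: int = 500,
                            reserve_mb: int = 1024) -> int:
    """Estimate how many browser instances fit in memory.

    `reserve_mb` is held back for everything that is not a browser;
    the default is an illustrative assumption.
    """
    usable = max(available_mb - reserve_mb, 0)
    return usable // per_browser_mb


if __name__ == "__main__":
    # 8 GB machine, worst-case 500 MB per browser.
    print(max_concurrent_browsers(8192))  # 14
    # Same machine, best-case 200 MB per browser.
    print(max_concurrent_browsers(8192, per_browser_mb=200))  # 35
```

Sizing against the worst-case figure keeps the estimate on an 8GB machine inside the 10-20 range suggested above.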
Troubleshooting
“No browser open” Error
Cause: Missing BrowserOpen block
Fix: Add BrowserOpen as the first block in your pipeline.
Element Not Found
Cause: Selector is wrong or element hasn’t loaded yet
Fix:
- Add WaitForElement before interacting
- Check the selector in Chrome DevTools (F12 → Elements → Copy Selector)
- Increase timeout values
Cloudflare Still Blocking
Try:
- Add `--disable-blink-features=AutomationControlled` to Extra Args
- Use residential proxies (datacenter IPs often blocked)
- Add random delays between actions
- Set a realistic User-Agent in Browser Settings (footer panel)
Browser Crashes or Hangs
Solutions:
- Lower thread count (browsers use 200-500MB RAM each)
- Add `--disable-dev-shm-usage` to Extra Args
- Close unused browser tabs with JavaScript
- Restart the job periodically
Tips
Browser automation updates `data.SOURCE` after each navigation. You can use ParseJSON/ParseRegex on the HTML just like HTTP responses.
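The ParseRegex block's own pattern syntax is not shown on this page, so the following is only a Python analogue of the idea: treat the page HTML as a string and capture a value with a regular expression. The sample HTML and the `extract_user_id` helper are made up for illustration, reusing the `data-user-id` attribute mentioned earlier:

```python
import re
from typing import Optional


def extract_user_id(html: str) -> Optional[str]:
    """Return the value of the first data-user-id attribute, or None."""
    match = re.search(r'data-user-id="([^"]*)"', html)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical page source, standing in for data.SOURCE.
    source = '<div class="profile" data-user-id="12345">Jane</div>'
    print(extract_user_id(source))  # 12345
```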