Browser-Use Guide: Build a Python AI Agent to Automate the Web (50 Lines)

Category: Black-Tech / AI-Agents
Tags: Python, Browser-Use, AI Agent, Automation, DeepSeek
Slug: browser-use-ai-agent-guide

Browser-Use Official Banner - Making Websites Accessible for AI Agents
Browser-Use: The interface between AI and the Web.

The Pain: We Waste Our Lives Filling Forms

Picture this scenario:

You need to log into 5 competitor websites every morning to copy prices into Excel. Or maybe you need to grab a train ticket and have to stare at the screen refreshing constantly. Or perhaps you want to submit 100 job applications, but every site demands you re-type your “Education History” manually.

This isn’t work. This is “Digital Imprisonment”.

In the past, we tried using Selenium or Puppeteer scripts. But the moment a website changed a single `div` ID, the script would break. You ended up spending more time fixing bugs than actually doing the work.

Times have changed. Today I’m introducing a tool called Browser-Use. It’s not a rigid script that clicks coordinates; it’s an AI Agent with “Vision” and a “Brain”.

The Logic: Seeing the Web Like a Human

Browser-Use connects LangChain and Playwright. When you give it a command like “Go to Amazon and send me the price of iPhone 16”:

Browser-Use Agent Automating a Job Application
Real-time Demo: An agent filling out a job application automatically.
  1. It launches a real Chrome browser.
  2. It “sees” the page via screenshots, identifying where the search box is and where the “Price” label is (even without specific IDs).
  3. It automatically plans actions: Click -> Type -> Scroll -> Extract.

It’s like an intern sitting at your computer. You give the order; it runs the errands.

🚀 Quick Start (Installation)

We stick to the Hise principle: No fluff. Just run it.

You need: Python 3.11+ and an LLM API Key (DeepSeek or OpenAI recommended).

1. Install Dependencies

# Install core library
pip install browser-use langchain-openai

# Install browser drivers (Mandatory)
playwright install

2. Code Your Clone (agent.py)

Here are the 50 lines of code worth a million bucks:

import os
import asyncio
from langchain_openai import ChatOpenAI
from browser_use import Agent

# Set your API Key (Example using DeepSeek, compatible with OpenAI format)
os.environ["OPENAI_API_KEY"] = "sk-your-key-here"
os.environ["OPENAI_API_BASE"] = "https://api.deepseek.com" # If using DeepSeek

async def main():
    # 1. Define the Brain
    llm = ChatOpenAI(model="deepseek-chat")

    # 2. Define the Task
    agent = Agent(
        task="Go to Google, search for 'Hise.lol', find the title of the first article and print it.",
        llm=llm,
    )

    # 3. Execute
    result = await agent.run()
    print(result)

if __name__ == "__main__":
    asyncio.run(main())

3. Witness the Magic

python agent.py

You will see a browser pop up automatically, typing, clicking, and closing like a ghost. Finally, your terminal will receive the report.

Advanced: What Can It Do?

Beyond simple searches, it can handle complex workflows:

  • E-commerce Monitoring: Log into backends daily, screenshot sales reports, and send them to your Telegram.
  • Social Media Ops: Log into X (Twitter), search for specific keywords, like posts, and draft replies.
  • Data Cleaning: Scrape data from a messy government public notice site and organize it into clean JSON.

How to Keep It Running 24/7?

Running this on your laptop is cool, but it stops when you close the lid. If you want a true “Passive Income” employee, you need to deploy it to the cloud.

But there’s a catch: Cloud servers (VPS) usually don’t have screens (Headless), which triggers many anti-bot mechanisms.

The Solution: Combined with our previous black-tech guide “ClawSimple: Solving the Python Persistence Problem”, you can easily deploy this Agent on a High-Performance Vultr VPS ($100 Free Credit) using Xvfb virtual screen technology for 24/7 unattended operation.

Conclusion

AI Agents are reshaping how we interact with the internet. In the future, you won’t “visit” the internet; you will “command” it.

Project Source: GitHub – browser-use

Leave a Comment

Your email address will not be published. Required fields are marked *

Scroll to Top