Shopee Scraper Toolkit Guide: 2025 Edition

When you search for “how to scrape Shopee,” you’ll be bombarded with commercial scraper API landing pages: “Shopee Scraper API,” “Shopee Data Scraper,” or generic eCommerce scraper pitches. These solutions weren’t built specifically for Shopee; they might work for generic online shops, but they fail on Shopee’s dynamic, JavaScript-heavy pages. They typically rely on a single request to an unofficial API endpoint (e.g. one request with URL parameters like limit and offset, often left at default values), and they can’t reliably return product listings, detailed product attributes, or popular-products data. They ignore the need for rotating residential or datacenter proxies, can’t navigate hierarchical category trees, and break when Shopee deploys new anti-bot protections. Much of the “how to scrape Shopee” content floating around (GitHub projects, blog posts, and similar write-ups) is outdated and no longer works.

We’ll be upfront: Shopee is a very difficult scraping target.

In this article, we share everything we know (our scraping process, what works, what doesn’t, and ideas for improvement) based on real experience and genuine research. Yes, we work with the Kameleo anti-detect browser, but our goal here is knowledge-sharing, not a pitch. In fact, one of our clients has successfully scraped millions of Shopee product pages, collecting product reviews, buyer ratings, and aggregated rating details, yet they guard their method as a valuable, hard-won secret.

The Challenge of Scraping Shopee

Scraping Shopee’s public product data is legally permissible - as long as you only collect non-PII, public information, respect robots.txt, and comply with Shopee’s data-scraping rules, Terms of Service, and relevant data-protection laws. However, Shopee employs sophisticated anti-scraping defenses: mandatory app-based logins, CAPTCHA challenges, aggressive IP rate limiting, fingerprint-based detection, and JavaScript-rendered content that defeats basic crawlers (e.g., BeautifulSoup or simple HTTP requests). To overcome these, you can use virtual-number services for OTP verification, CAPTCHA-solving APIs, rotating residential proxies, and an anti-detect browser like Kameleo - combined with Playwright - to maintain stealth, handle dynamic content, and persist login sessions programmatically.

What Is Shopee?

Shopee is a mobile-first e-commerce platform launched by Sea Limited in 2015, serving Southeast Asia, East Asia, and Central and South America with localized domains like shopee.sg, shopee.com.my, and shopee.com.br. It’s one of the leading e-commerce marketplaces, featuring thousands of category pages, flash sales, a search bar with autocompletion hints, and millions of product listings across official shops and third-party sellers.

Purpose and Value of Shopee Scraping

Automated scraping enables businesses to:

Monitor competitor strategies and pricing adjustments
Analyze market trends, market demand, and historical sales
Aggregate customer feedback: product reviews (including negative reviews) and star ratings
Optimize inventory: track available stock and stock levels per product variant
Obtain comprehensive shop details: seller info, official shops vs. third-party sellers, contact links
Conduct keyword research: search keyword data and popular keyword suggestions for market entry
Navigate category analytics: filter products by category URL, category breadcrumbs, category rankings
Extract detailed product data: product URLs, product mix, detailed product descriptions, and other inputs for an informed product strategy

Does Shopee Allow Web Scraping?

Yes. Shopee permits scraping of publicly available data so long as you:

1. Avoid PII: don’t collect personally identifiable information.

2. Respect robots.txt.

3. Comply with Terms: abide by Shopee’s Terms of Service, its anti-scraping requirements, and applicable data-protection laws.

Data Types Collected from Shopee

You can programmatically extract:

Product information: name, product details, descriptions, images (https://down-${country}.img.susercontent.com/file/${imageKey}), SKU, product options (variants)
Pricing data: actual price, discounts, promotions
Availability: available stock and stock levels per product variant
Reviews & Ratings: customer reviews, star ratings, aggregated rating details
Seller profiles: seller info, seller name, rating, number of listings, feedback
Shop & Sales: comprehensive shop details, flash sales, shop metrics
Search & Keywords: product filters, search keyword suggestions, autocompletion hints
Shipping details: available options, fees, estimated delivery times
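
The image URL template in the list above can be filled in with a small helper. This is a minimal sketch: the function name and the sample image key are hypothetical, and the lower-casing of the country code is an assumption based on domains such as shopee.sg.

```python
def build_image_url(country: str, image_key: str) -> str:
    """Fill in https://down-${country}.img.susercontent.com/file/${imageKey}.

    Assumption: `country` is a two-letter code such as "sg" or "br",
    written in lower case inside the hostname.
    """
    country = country.strip().lower()
    image_key = image_key.strip()
    if not country or not image_key:
        raise ValueError("country and image_key must be non-empty")
    return f"https://down-{country}.img.susercontent.com/file/{image_key}"


# Usage with a made-up image key:
example_url = build_image_url("SG", "example-image-key")
```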

What Definitely Doesn’t Work When Scraping Shopee

1. Basic HTTP Scraper (Python + BeautifulSoup)

Why it fails: Shopee requires login before serving search results, so you get empty or placeholder HTML with no product listings.

2. API-Based Scraping (Mobile App Unofficial API)

Why it fails: Even mobile-app endpoints authenticate first; without a valid login token you’ll get "is_login": false, so you can’t bypass the login wall by hitting a mobile API directly.

There are approaches that perform scraping not through the browser but via the API (in this case the API that their mobile app uses to communicate with their servers, where the data is stored).

For example, see this project:

I tried getting it up and running, but I ran into the same problem: the HTTP request isn’t authenticated. In the mobile app, pages only load when the user is logged in.

Concretely, I requested: https://shopee.com.br/api/v4/recommend/recommend?bundle=shop_page_product_tab_main&limit=999&offset=0&section=shop_page_product_tab_main_sec&shopid=409068735

The response contained "is_login": false, which means the request wasn’t authenticated.
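
A scraper can check for this flag before trusting a response. The sketch below is an assumption-laden illustration: only the "is_login" key comes from the test described above, the rest of the response layout is unknown, so the helper searches the whole JSON document for the key. The function names are hypothetical.

```python
import json
from typing import Any


def find_key(payload: Any, key: str) -> Any:
    """Depth-first search for `key` in nested dicts and lists.

    Returns the first value found, or None if the key is absent.
    """
    if isinstance(payload, dict):
        if key in payload:
            return payload[key]
        children = list(payload.values())
    elif isinstance(payload, list):
        children = payload
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def is_authenticated(response_text: str) -> bool:
    """True only if the body is JSON and carries "is_login": true."""
    try:
        payload = json.loads(response_text)
    except json.JSONDecodeError:
        # HTML placeholders and login pages are not valid JSON.
        return False
    return find_key(payload, "is_login") is True
```

Treating anything other than an explicit true as unauthenticated keeps the check conservative: a missing flag or an HTML page is handled the same way as "is_login": false.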

3. Commercial Scraping Services / Generic APIs

If you check their pricing, you’ll find they’re quite expensive. Even if they claim Shopee support, there’s no guarantee they’ll avoid Shopee’s ever-evolving anti-scraping defenses. Often these services treat Shopee as “just another site,” so when Shopee changes its DOM or adds new bot checks, those generic solutions break.

Many advertise “Scrape Shopee” or “eCommerce scraper” products but lack basic features (CSV or JSON export, configurable page size) and break on dynamic content or login-gated pages. They rely on basic scraping and offer no way to split requests across sessions.

Conclusion: You must bypass Shopee’s login wall to scrape meaningful data.

How Can I Extract Product Data from Shopee?

Extracting product data requires tools that handle both static and dynamic content. While libraries like requests + BeautifulSoup exist, the key is simulating real-user behavior. By using a browser automation solution in combination with anti-detect browsers, you can effectively gather data such as product titles, product prices, available stock, product reviews, and detailed product attributes - all while minimizing the risk of being detected or blocked by Shopee’s anti-bot systems.

Does Shopee Use Anti-Crawling Technology? How Can I Avoid Getting Blocked?

Yes. As described above, Shopee uses IP rate limiting, CAPTCHA challenges, fingerprint-based detection, and frequent DOM/API changes. To avoid blocks:

Simulate a real user environment: use anti-detect browsers (fingerprint spoofing of Canvas, WebGL, AudioContext, Navigator, timezones)
Rotate proxies: residential or datacenter proxies, with IP addresses located in the country of each Shopee domain
Throttle & split requests: ≤100 requests/minute per account, spread across separate sessions
Use custom headers: send a realistic Referer header that matches the page you are navigating from
Persist sessions: save cookies in a JSON file or save the full browser profile in the anti-detect browser to reuse authenticated sessions without repeated OTP.
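
The throttling rule in the list above can be enforced with a small pacing helper. This is a minimal sketch, assuming evenly spaced requests are an acceptable way to stay under the ~100 requests/minute rule of thumb; the class name is hypothetical, and the clock and sleep functions are injectable so the logic can be tested without waiting.

```python
import time


class Throttle:
    """Space calls evenly so at most `max_per_minute` happen per minute."""

    def __init__(self, max_per_minute=100, clock=time.monotonic, sleep=time.sleep):
        if max_per_minute <= 0:
            raise ValueError("max_per_minute must be positive")
        self.interval = 60.0 / max_per_minute  # seconds between requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self) -> float:
        """Block until the next request is allowed; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay
```

Call `throttle.wait()` immediately before each page load or request. Keep one `Throttle` instance per account, since the limit discussed here is per account.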

Community-Reported Scraping Challenges on Shopee

Mandatory App-Based Login & CAPTCHA: Shopee.sg forces all sessions through its mobile app or web login, presenting CAPTCHA or blocking access to unauthenticated bots.
JavaScript-Rendered Content (dynamic elements, infinite scroll): Product listings and prices load asynchronously, so HTTP-only scrapers (e.g., BeautifulSoup) retrieve empty HTML.
Aggressive IP Rate-Limiting & Bans: Excessive requests trigger instant IP blocks or additional security checks, necessitating a pool of rotating residential or mobile proxies.
Frequent DOM/API Changes: Shopee updates its CSS selectors and API signatures regularly, breaking hard-coded scrapers and requiring continuous maintenance.
Region-Specific Phone Verification: Local phone-number checks reject foreign numbers, blocking non-resident account registrations.
OAuth & Google SSO Blocks: Automating Google SSO often triggers “browser not secure” errors unless using advanced stealth techniques.

Our research outcome is clear: to access data on Shopee, you must be logged in - no “single request” workaround works. Although some online sources claim there are URLs that work without login, in our tests none of them reproduced successfully.

Even on Reddit over a year ago people stated you need to be signed in everywhere - and our practical experience confirms it.

Expert Solutions for Reliable Shopee Scraping

Virtual-Number OTP Automation: Integrate services like OnlineSim or Grizzly SMS to programmatically register and verify Shopee accounts in target regions.
CAPTCHA-Solving APIs: Use 2Captcha or Anti-Captcha to handle challenges without manual intervention.
Rotating Proxies: Distribute requests across a broad pool of residential or datacenter IPs to avoid rate limits and geo-blocks.
Session Persistence & Cookie Management: Perform an initial headful login on Shopee (for example with Google SSO) in a stealth browser, export the cookies to a JSON file or save the browser profile, and reuse them in subsequent automated sessions.
Use Anti-Detect Browser Profiles (Kameleo): Launch undetectable browser instances with randomized fingerprints, geolocation spoofing, and built-in proxy rotation to bypass advanced bot-detection systems.
Start by eliminating dead‐end methods: as shown, basic HTTP scrapers and unauthenticated mobile API calls fail, and commercial services are costly and unreliable.
Throttle and distribute requests to stay under Shopee’s undisclosed rate limits. A rule of thumb: no more than ~100 requests per minute per account and proxy.
Profile Management: reuse authenticated profiles to avoid repeated OTPs.
Automate SMS verification to reduce manual work and costs.
Align IP geographic location with the Shopee domain you’re scraping (e.g., Singapore IP for shopee.sg).
Regularly update selectors and monitor for DOM/API changes, since Shopee frequently tweaks its frontend.
Monitoring & Metrics: Regularly check throughput, error rates, and session health.
Ethical Compliance: Collect no PII, respect robots.txt, and comply with the Terms of Service.

Introduction to Kameleo Anti-Detect Browser

What Is Kameleo?

Kameleo is an anti-detect browser platform for creating virtual profiles that mimic real user environments through automated fingerprint randomization, geolocation, timezone, and proxy configuration. The platform is user-friendly for beginners and also suited to advanced users and large-scale scraping tasks.

Key Features

Fingerprint Spoofing: Randomizes Canvas, WebGL, AudioContext, and Navigator properties to elude fingerprinting.
Proxy Integration: Supports HTTP, SOCKS5, and SSH proxies per profile for IP rotation and geo-spoofing.
Session Persistence: save cookies, local storage, history; reload via JSON file or profile context
Parallel Profiles: Run unlimited isolated profiles concurrently for large-scale scraping of millions of product and shop instances.
Timezone & Language Spoofing: Aligns browser locale with target region to avoid mismatches.
Configurable Page Size: parameters_with_default_settings.extra_params, common parameters, additional parameter options
Data Export: CSV files, JSON file outputs
SDKs & API: Python, JavaScript, C#; WebSocket endpoint

Why Kameleo Improves Scraping Reliability

By combining realistic fingerprinting, proxy rotation, and session persistence, Kameleo yields:

Accurate Data Extraction: renders dynamic and JavaScript-heavy websites
Performance Metrics: track throughput, error rates
Valuable Insights: competitor strategies, consumer preferences, customer satisfaction metrics
Scalability: millions of product listings per day, API-first orchestration

Realistic Fingerprinting

Kameleo builds fingerprints from millions of data points to fool anti-bot scripts - avoiding fingerprint-based detection and ensuring consistent behavior across both Chroma and Junglefox kernels.

Persistent Browser Sessions

Persistent session resumption saves entire sessions (cookies, local storage, history) so you can reload them without re-authenticating, which reduces OTP costs.

IP Management and Rotation

Rotate or pin residential and datacenter proxies per profile. Align geolocation and timezone with each Shopee domain to avoid mismatches and bans.
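
One way to catch mismatches before launching a profile is a lookup table keyed by Shopee domain. The sketch below is only an illustration: the three domains come from this article, the timezones are the standard IANA names for those countries, and the helper names and return format are hypothetical.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class Region:
    country: str   # ISO 3166-1 alpha-2 code expected for the proxy exit IP
    timezone: str  # IANA timezone the browser profile should report


# Only the domains named in this article; extend as needed.
REGIONS = {
    "shopee.sg": Region("SG", "Asia/Singapore"),
    "shopee.com.my": Region("MY", "Asia/Kuala_Lumpur"),
    "shopee.com.br": Region("BR", "America/Sao_Paulo"),
}


def region_for(url_or_domain: str) -> Region:
    """Look up the expected region for a Shopee URL or bare domain."""
    target = url_or_domain if "://" in url_or_domain else "https://" + url_or_domain
    host = (urlparse(target).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host not in REGIONS:
        raise ValueError(f"no region configured for {host!r}")
    return REGIONS[host]


def mismatches(url_or_domain: str, proxy_country: str, timezone: str) -> list:
    """Return the names of the settings that do not match the target domain."""
    region = region_for(url_or_domain)
    problems = []
    if proxy_country.upper() != region.country:
        problems.append("proxy_country")
    if timezone != region.timezone:
        problems.append("timezone")
    return problems
```

An empty list from `mismatches(...)` means the proxy country and timezone line up with the domain you are about to scrape.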

Multikernel Architecture

Kameleo’s multikernel design dynamically selects the best matching Chroma (Chromium via the Chrome DevTools Protocol) or Junglefox version, ensuring minimal engine mismatches, rapid security patch updates, and consistent fingerprint alignment.

Scalable Web Scraping

Device-local execution, headless mode, and an API-first SDK let you orchestrate many browser instances in parallel: extract an array of product summaries (data/items/itemid and items/shopid), products by seller URL, or products across categories, and export the results to CSV or JSON files.

How to Overcome the Login Wall

Getting past Shopee’s login wall is the crux of scraping. Even once you have credentials, automating the login process is tricky because Shopee demands SMS verification, and foreign phone numbers won’t work. Let’s break down your options:

1. Email/Password Authentication + OTP

Quick profile primer: In anti-detect browsers, a “profile” bundles up everything - settings, extensions, local storage and, crucially, cookies—so each profile mimics a distinct real user. Kameleo automatically saves all of this for you, including the login cookies.

Launch Kameleo in headful mode and create (or load) a dedicated browser profile.
Route your traffic through a Singaporean proxy and go to https://shopee.sg.
Enter your email and password, solve any CAPTCHA, and complete the OTP.
Save the profile. After that, simply loading this profile in Kameleo will restore your logged-in session; no manual cookie export or additional login steps are needed.

2. Local Registration → SMS Verification

Shopee insists on local phone numbers for OTP SMS. Purchase virtual numbers from a reliable SMS API provider (e.g., OnlineSim, Grizzly SMS).
In your Playwright script, intercept the OTP input prompt, fetch the SMS code from your virtual-number service, and submit it.
After successful registration and login, export and save cookies for reuse or save the whole persistent browsing context, which includes the cookies. (By using an anti-detect browser and saving the browser profile, all your settings and cookies are preserved automatically.) This automation reduces manual overhead - once your profile is validated, you don’t need to re-SMS-verify every run.
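
The OTP step above boils down to polling the SMS provider and pulling the code out of the message text. The following sketch assumes a six-digit numeric code and takes the provider call as a plain callable, because each virtual-number service has its own API; `fetch_messages`, `extract_otp`, and `wait_for_otp` are hypothetical names.

```python
import re
import time
from typing import Callable, Iterable, Optional

# Assumption: the code is exactly six digits and not part of a longer number.
OTP_PATTERN = re.compile(r"(?<!\d)(\d{6})(?!\d)")


def extract_otp(sms_text: str) -> Optional[str]:
    """Return the first standalone six-digit code in the text, if any."""
    match = OTP_PATTERN.search(sms_text)
    return match.group(1) if match else None


def wait_for_otp(
    fetch_messages: Callable[[], Iterable[str]],
    attempts: int = 10,
    delay: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll the provider until a message containing a code arrives.

    `fetch_messages` stands in for your SMS provider's "list received
    messages" call and must return the message bodies as strings.
    """
    for _ in range(attempts):
        for text in fetch_messages():
            code = extract_otp(text)
            if code:
                return code
        sleep(delay)
    return None
```

The returned code can then be typed into the OTP field from your Playwright script. A `None` result means no code arrived within the polling window, which is usually a sign to request a new number.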

3. Google/Facebook SSO

Instead of logging in with email/password, you can log in via Google or Facebook. However, this still triggers anti-bot checks: farmed Google/Facebook accounts often need SMS verification on their own side too.

Perform a one-time headful login via Kameleo, choose “Continue with Google” (or Facebook), complete any SMS/2FA challenges.
Export cookies to cookies.json, then load them in Playwright. On subsequent runs, call await context.add_cookies() with the saved cookies to skip re-authentication.
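
A minimal sketch of the export and reload steps is shown below. It assumes Playwright's synchronous Python API, where `context.cookies()` and `context.add_cookies()` are called without `await`; with the async API the same two calls need `await`. The helper names are hypothetical.

```python
import json
from pathlib import Path


def save_cookies(context, path="cookies.json") -> int:
    """Write the cookies of a logged-in browser context to a JSON file."""
    cookies = context.cookies()
    Path(path).write_text(json.dumps(cookies, indent=2), encoding="utf-8")
    return len(cookies)


def load_cookies(context, path="cookies.json") -> bool:
    """Add saved cookies to a fresh context; False if nothing was loaded."""
    file = Path(path)
    if not file.exists():
        return False
    cookies = json.loads(file.read_text(encoding="utf-8"))
    if not cookies:
        return False
    context.add_cookies(cookies)
    return True
```

Call `save_cookies(context)` once after the manual login, and `load_cookies(context)` before the first navigation of every later run. If it returns False, fall back to a fresh login.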

Note: You can use other methods instead of exporting cookies, and they often work well, but the most foolproof solution is to run your session in an anti-detect browser and simply save the browser profile, since it already contains all the necessary cookies.

Key Point: Once you have a valid session saved, reuse it. Constantly creating new accounts or regenerating cookies is expensive (SMS costs add up) and increases the chance of detection. Profile management—storing and reloading authenticated sessions—minimizes SMS API fees and stabilizes your scraping pipeline.

Rate Limits & Multiple Accounts

Even after you successfully log in, Shopee imposes rate limits at both IP and account levels. If you scrape too aggressively from one account, you will hit API bans or be forced into CAPTCHA loops. Therefore:

Throttle to ~100 requests/minute per account (give or take). Exact thresholds aren’t published, but this is a safe ballpark.
Use one IP per account: If you switch the IP mid-session, Shopee may detect anomalous behavior (e.g., account “Alice” suddenly making 50 requests from a Singapore IP, then 50 from a U.S. IP). Keep a consistent proxy per profile.
Parallelize with multiple accounts & proxies: For large-scale scraping (e.g., tens of thousands of product pages daily), maintain dozens of accounts, each paired with a unique proxy. This spreads the load and reduces the risk of account bans.
Monitor session health: Regularly check if a session still returns valid data (not a login page or CAPTCHA). If it redirects you to a login or throws repeated errors, retire that profile and switch to a fresh one.
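
The health check in the last item can be reduced to two pieces: classifying where a navigation ended up, and counting consecutive failures per profile. The sketch below is an assumption-heavy illustration: the URL path prefixes are guesses that you should replace with whatever login and CAPTCHA URLs you actually observe, and the function and class names are hypothetical.

```python
from urllib.parse import urlparse

# Assumptions: adjust these prefixes to the redirects you see in practice.
LOGIN_PATH_HINTS = ("/buyer/login",)
CAPTCHA_PATH_HINTS = ("/verify/captcha", "/verify/traffic")


def classify_session(current_url: str) -> str:
    """Return "ok", "login", or "captcha" based on the final page URL."""
    path = urlparse(current_url).path.lower()
    if any(path.startswith(hint) for hint in CAPTCHA_PATH_HINTS):
        return "captcha"
    if any(path.startswith(hint) for hint in LOGIN_PATH_HINTS):
        return "login"
    return "ok"


class ProfileHealth:
    """Count consecutive bad responses and signal when to retire a profile."""

    def __init__(self, max_failures: int = 3):
        self.max_failures = max_failures
        self.failures = 0

    def record(self, status: str) -> bool:
        """Record one result; return True once the profile should be retired."""
        if status == "ok":
            self.failures = 0
        else:
            self.failures += 1
        return self.failures >= self.max_failures
```

After each navigation, pass `page.url` to `classify_session` and feed the result to the profile's `ProfileHealth` instance. Resetting the counter on every good response avoids retiring a profile over an isolated CAPTCHA.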

Practical Pipeline Overview

1. Initialize Kameleo & Create a Profile

Install prerequisites, start Kameleo on port 5050.
Create a “desktop Chrome” profile with the recommended fingerprint defaults.

2. Authenticate & Save Session

Log in via email/password + OTP or Google SSO (handle CAPTCHA).
Export and save the browser profile or cookies.json (for persistent session resumption).

3. Load Saved Browser Profiles on Future Runs

In your Playwright script, start the browser profile. This skips manual login.

4. Throttle & Distribute Requests

Keep each profile to ≤100 requests/minute; use residential or datacenter proxies located in the region of the Shopee domain you are scraping.
Split requests and use distinct requests for search, category pages, and product URLs.
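
Distributing work while keeping one fixed proxy per account can be sketched as follows. The account names, proxy addresses, and helper names are hypothetical, and round-robin assignment is just one simple way to spread the load.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import List


@dataclass
class Worker:
    account: str
    proxy: str  # pinned for the lifetime of the account
    urls: List[str] = field(default_factory=list)


def pair_accounts_with_proxies(accounts: List[str], proxies: List[str]) -> List[Worker]:
    """Give every account its own proxy; refuse to share one between accounts."""
    unique_proxies = list(dict.fromkeys(proxies))  # drop duplicates, keep order
    if len(unique_proxies) < len(accounts):
        raise ValueError("need at least one unique proxy per account")
    return [Worker(account, proxy) for account, proxy in zip(accounts, unique_proxies)]


def distribute(urls: List[str], workers: List[Worker]) -> List[Worker]:
    """Assign URLs to workers round-robin so no single account takes the full load."""
    if not workers:
        raise ValueError("at least one worker is required")
    for url, worker in zip(urls, cycle(workers)):
        worker.urls.append(url)
    return workers
```

Each worker then processes only its own `urls` list through its own profile and proxy, which keeps the account-to-IP mapping stable across the whole run.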

5. Scrape with Playwright + Kameleo

Navigate to search pages (https://shopee.sg/search?keyword=...), wait for the dynamic content to load, then extract the product listings (.shopee-search-item-result__item) and the detailed product fields: title, price, sales, rating, seller info, and stock levels.
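
As a rough sketch of this step, the helpers below build the search URL and read the text of each listing card. They assume Playwright's synchronous Python API and an already authenticated page; the `page` query parameter is an assumption about how Shopee paginates, the selector is the one quoted above and may have changed, and the function names are hypothetical.

```python
from urllib.parse import quote, urlencode

# Selector quoted in this article; verify it against the current DOM.
LISTING_SELECTOR = ".shopee-search-item-result__item"


def build_search_url(keyword: str, domain: str = "shopee.sg", page: int = 0) -> str:
    """Build https://<domain>/search?keyword=... with proper URL encoding."""
    params = {"keyword": keyword}
    if page:
        params["page"] = page  # assumption: pagination via a `page` parameter
    return f"https://{domain}/search?{urlencode(params, quote_via=quote)}"


def extract_listing_texts(page, timeout_ms: int = 15000) -> list:
    """Wait for listing cards to render, then return their visible text."""
    page.wait_for_selector(LISTING_SELECTOR, timeout=timeout_ms)
    texts = page.locator(LISTING_SELECTOR).all_inner_texts()
    return [text.strip() for text in texts if text.strip()]
```

In a real run you would call `page.goto(build_search_url("wireless earbuds"))` first and then `extract_listing_texts(page)`. Splitting the raw card text into title, price, and rating depends on the current markup, so it is left out here.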

6. Save Results & Monitor

Store scraped data in JSON or CSV files.
Periodically check performance metrics and session health; rotate or retire profiles as needed.
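
Saving results needs nothing beyond the standard library. The sketch below writes the same rows to both formats; the function name and the sample field names are hypothetical, and the CSV header is the union of all keys so rows with missing fields still line up.

```python
import csv
import json
from pathlib import Path
from typing import Dict, Iterable


def export_results(rows: Iterable[Dict[str, str]], json_path: str, csv_path: str) -> int:
    """Write scraped rows to a JSON file and a CSV file; return the row count."""
    rows = list(rows)
    Path(json_path).write_text(
        json.dumps(rows, ensure_ascii=False, indent=2), encoding="utf-8"
    )

    # Collect every field name in first-seen order for the CSV header.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

For example, `export_results([{"title": "A", "price": "1.00"}], "results.json", "results.csv")` writes one record to each file.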

Step-by-Step Guided Scraping Tutorial Using Playwright + Kameleo

We won’t provide a single “copy-paste” solution. Instead, we’ll explain what to do, what to watch out for, and how to piece together a working pipeline that handles product listings, product details, and dynamic content.

1. Environment Setup

1. Install Python (3.8+), Playwright, and the Kameleo Local API client.

This gives you Playwright browser automation plus the Kameleo Local API client for proxy configuration, session persistence, and fingerprint spoofing.

2. Launch Kameleo.

Ensure the Kameleo Local API is running on port 5050 so your scripts can reach it.

3. Create & Start a Kameleo Profile (Python)

We need a real-user fingerprint profile to get past Shopee’s anti-bot protections.

Refer to the Kameleo docs for more:

• Getting Started with Kameleo Automation

• API Examples

4. Connect Playwright to Kameleo (Python)

This script demonstrates how to integrate Playwright with Kameleo, handle dynamic content, rotate proxies, split requests if needed, and persist sessions so that your Shopee scraper runs reliably at scale.
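
As a minimal, non-authoritative sketch of the connection step only: the code below assumes a Chromium-based Kameleo profile has already been created and started through the Local API, and that the Local API exposes a Playwright WebSocket endpoint of the form ws://localhost:5050/playwright/<profile id>. Verify that endpoint format against the current Kameleo documentation. The function names and the profile ID are hypothetical, and `connect_over_cdp` is Playwright's standard call for attaching to a running Chromium instance.

```python
def kameleo_ws_endpoint(profile_id: str, port: int = 5050) -> str:
    """Build the WebSocket URL for a running profile (assumed format)."""
    return f"ws://localhost:{port}/playwright/{profile_id}"


def open_page_title(profile_id: str, url: str, port: int = 5050) -> str:
    """Attach Playwright to a running profile, load `url`, return its title."""
    # Imported lazily so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as playwright:
        browser = playwright.chromium.connect_over_cdp(
            kameleo_ws_endpoint(profile_id, port)
        )
        # Reuse the profile's existing context so its cookies and
        # fingerprint settings apply to the new page.
        context = browser.contexts[0] if browser.contexts else browser.new_context()
        page = context.new_page()
        page.goto(url, wait_until="domcontentloaded")
        return page.title()
```

Calling `open_page_title("your-profile-id", "https://shopee.sg")` should return the page title if the saved session is still valid. Proxy rotation, throttling, and extraction build on top of this connection.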

Conclusion

In summary, scraping Shopee is a multi-layered challenge:

1. Avoid basic HTTP scrapers, unauthenticated API calls, and generic commercial scraping services.

2. Authenticate on Shopee (email/OTP, local registration, Google/Facebook SSO), save sessions in cookies.json or browser profiles.

3. Throttle requests and rotate IP addresses (residential or datacenter proxies), split requests, and manage multiple accounts.

4. Use an anti-detect browser (we recommend Kameleo) with fingerprint spoofing, session persistence, and multikernel architecture.

5. Extract comprehensive product details (titles, descriptions, prices, stock levels), shop details, and the inputs you need for pricing and competitor analysis.

By combining Kameleo with Playwright (or Puppeteer), you can scrape a dynamic marketplace like Shopee reliably and at scale, giving you near-real-time market intelligence for informed business decisions. For enterprise support, contact Sales, or join our Discord community.

