What is Scrapling?
Scrapling is a comprehensive web scraping framework for Python that positions itself as an "adaptive" solution to modern web scraping challenges. Created by Karim Shoair and first released in October 2024, Scrapling has rapidly gained traction in the developer community with over 33,000 GitHub stars. The framework addresses one of the most persistent problems in web scraping: maintaining scrapers when websites change their structure.
What sets Scrapling apart is its adaptive parsing engine, which saves the properties of elements you select and uses them to relocate those elements after a page's structure changes. As a result, scrapers can often keep working after a website redesign, which reduces maintenance overhead. The framework also includes built-in stealth capabilities for getting past anti-bot systems such as Cloudflare Turnstile, which makes it particularly useful for scraping protected websites.
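To make the relocation idea concrete, here is a minimal sketch in plain Python. It is not Scrapling's actual algorithm or API: the fingerprint fields (tag, classes, text, path), the equal weighting, the 0.5 threshold, and the function names are all assumptions chosen for illustration. The sketch saves a small description of an element, then picks the most similar candidate after the page's class names have changed.

```python
from difflib import SequenceMatcher


def fingerprint_similarity(saved: dict, candidate: dict) -> float:
    """Score how closely a candidate resembles a saved fingerprint (0.0 to 1.0)."""
    tag_score = 1.0 if saved["tag"] == candidate["tag"] else 0.0

    # Jaccard overlap of the class names
    saved_classes = set(saved["classes"])
    candidate_classes = set(candidate["classes"])
    union = saved_classes | candidate_classes
    class_score = len(saved_classes & candidate_classes) / len(union) if union else 1.0

    # Fuzzy string similarity for the text content and the DOM path
    text_score = SequenceMatcher(None, saved["text"], candidate["text"]).ratio()
    path_score = SequenceMatcher(None, saved["path"], candidate["path"]).ratio()

    # Equal weighting is an arbitrary choice for this illustration
    return (tag_score + class_score + text_score + path_score) / 4


def relocate(saved: dict, candidates: list, threshold: float = 0.5):
    """Return the best-matching candidate, or None if nothing is similar enough."""
    scored = [(fingerprint_similarity(saved, c), c) for c in candidates]
    if not scored:
        return None
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best if best_score >= threshold else None


# Hypothetical fingerprint saved before the redesign
saved = {
    "tag": "div",
    "classes": ["product", "card"],
    "text": "Blue Widget $19.99",
    "path": "html/body/main/div",
}

# Hypothetical elements found after the redesign (class names have changed)
candidates = [
    {"tag": "nav", "classes": ["menu"], "text": "Home Shop About", "path": "html/body/nav"},
    {"tag": "div", "classes": ["item-card"], "text": "Blue Widget $19.99",
     "path": "html/body/main/section/div"},
]

match = relocate(saved, candidates)
print(match["classes"])  # prints ['item-card']
```

The renamed product card still wins because its tag, text, and position are close to the saved fingerprint even though none of its class names survived. A real implementation would likely weigh many more properties than this sketch does.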
Getting Started
Installing Scrapling is straightforward using pip. The framework requires Python 3.8 or higher and works across all major platforms.
pip install scrapling

For development or the latest features, you can install directly from GitHub:

pip install git+https://github.com/D4Vinci/Scrapling.git

Once installed, you can start with a simple scraping example:
from scrapling.fetchers import StealthyFetcher

# Enable adaptive mode for automatic element relocation
StealthyFetcher.adaptive = True

# Fetch a webpage with stealth capabilities
page = StealthyFetcher.fetch('https://example.com', headless=True)

# Extract data using CSS selectors
products = page.css('.product')
for product in products:
    title = product.css('h2::text').get()
    price = product.css('.price::text').get()
    print(f"{title}: {price}")

Usage & Practical Examples
Scrapling's strength lies in its flexibility and adaptability. Here are three practical scenarios that demonstrate its capabilities:
Basic Web Scraping with Stealth
For simple scraping tasks that need to avoid detection:
from scrapling.fetchers import StealthyFetcher

# Enable adaptive mode (as in the Getting Started example) before using auto_save
StealthyFetcher.adaptive = True

# Scrape a news website
page = StealthyFetcher.fetch(
    'https://news-site.com',
    headless=True,
    network_idle=True,
)

# Extract articles with automatic saving for future adaptability
articles = page.css('article.news-item', auto_save=True)
for article in articles:
    headline = article.css('h2.headline::text').get()
    summary = article.css('p.summary::text').get()
    link = article.css('a.read-more::attr(href)').get()
    print(f"Headline: {headline}")
    print(f"Summary: {summary}")
    print(f"Link: {link}\n")

Full-Scale Crawling with Spiders
For larger scraping projects that require crawling multiple pages:
from scrapling.spiders import Spider, Response


class EcommerceSpider(Spider):
    name = "ecommerce_scraper"
    start_urls = ["https://shop.example.com/products"]

    async def parse(self, response: Response):
        # Extract product links
        product_links = response.css('.product-card a::attr(href)').getall()
        for link in product_links:
            yield response.follow(link, self.parse_product)

        # Follow pagination
        next_page = response.css('.pagination .next::attr(href)').get()
        if next_page:
            yield response.follow(next_page, self.parse)

    async def parse_product(self, response: Response):
        # Yield one item per product page
        yield {
            'name': response.css('h1.product-title::text').get(),
            'price': response.css('.price::text').get(),
            'description': response.css('.description::text').get(),
            'rating': response.css('.rating::attr(data-rating)').get(),
            'url': response.url,
        }


# Run the spider
spider = EcommerceSpider()
spider.start()

Adaptive Scraping for Changing Websites
When websites frequently change their structure, Scrapling's adaptive features shine:
from scrapling.fetchers import StealthyFetcher

# Enable adaptive mode (as in the Getting Started example)
StealthyFetcher.adaptive = True

# Initial scraping with element learning
page = StealthyFetcher.fetch('https://dynamic-site.com')
products = page.css('.product-item', auto_save=True)  # Learn element structure

# Later, when the website changes its CSS classes,
# adaptive mode will try to find the elements using the learned patterns
page_updated = StealthyFetcher.fetch('https://dynamic-site.com')
products_adaptive = page_updated.css('.product-item', adaptive=True)
# Scrapling will attempt to locate the products even if the CSS classes changed
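
In practice it helps to know when a scraper has started relying on relocation, since that is a signal to update its selectors. The following is a minimal sketch of one way to surface that, assuming only the two call forms used above: page.css(selector) and page.css(selector, adaptive=True). The helper name select_with_fallback and the FakePage stand-in are hypothetical and exist only so the example runs offline. Whatever Scrapling does internally, this wrapper simply makes the fallback explicit and logs it.

```python
import logging

logger = logging.getLogger("scraper")


def select_with_fallback(page, selector: str):
    """Try the plain selector first; if nothing matches, retry in adaptive mode."""
    elements = page.css(selector)
    if elements:
        return elements
    # The selector no longer matches: record it so the selector can be reviewed
    logger.warning("Selector %r matched nothing; retrying with adaptive=True", selector)
    return page.css(selector, adaptive=True)


class FakePage:
    """Stand-in for a fetched page (hypothetical, for offline demonstration only)."""

    def __init__(self, plain_result, adaptive_result):
        self._plain = plain_result
        self._adaptive = adaptive_result

    def css(self, selector, adaptive=False):
        # Return a different canned result depending on the mode requested
        return self._adaptive if adaptive else self._plain


# The selector still works: the plain result is returned untouched
unchanged_site = FakePage(plain_result=["item-1"], adaptive_result=["item-1"])
print(select_with_fallback(unchanged_site, ".product-item"))

# The selector broke after a redesign: the adaptive result is used instead
changed_site = FakePage(plain_result=[], adaptive_result=["item-1"])
print(select_with_fallback(changed_site, ".product-item"))
```

With a real page object, the same helper would be called as select_with_fallback(page_updated, '.product-item'), and the warning in the log would indicate that the site's markup has drifted from the original selector.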


