A Rust-based web scraping tool that downloads complete web pages with all their assets (HTML, CSS, JavaScript, and images).
- Downloads HTML content from any URL
- Automatically extracts and downloads CSS stylesheets
- Downloads JavaScript files
- Downloads all images from the page
- Organizes downloaded content in a structured directory
- Handles relative and absolute URLs
- Uses async/await for efficient downloading
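
To illustrate what handling relative and absolute URLs involves, here is a minimal sketch that uses only the standard library. The function name `resolve_url` is hypothetical and is not taken from the tool's source. It covers absolute, protocol-relative, root-relative, and simple relative links, and it does not handle `../` segments or query strings in the base URL. A real implementation would more likely rely on a dedicated URL parser such as `Url::join` from the `url` crate.

```rust
/// Resolve `href` against `base`. Simplified sketch, not a full RFC 3986 resolver.
/// Returns None when `base` has no "scheme://" part.
fn resolve_url(base: &str, href: &str) -> Option<String> {
    // Already absolute: return it unchanged.
    if href.starts_with("http://") || href.starts_with("https://") {
        return Some(href.to_string());
    }

    // Split the base into its scheme and everything after "://".
    let (scheme, rest) = base.split_once("://")?;

    // Protocol-relative link ("//cdn.example.com/x.js"): reuse the base scheme.
    if let Some(stripped) = href.strip_prefix("//") {
        return Some(format!("{}://{}", scheme, stripped));
    }

    // The host runs up to the first '/', and the path is the remainder.
    let (host, path) = match rest.find('/') {
        Some(i) => (&rest[..i], &rest[i..]),
        None => (rest, "/"),
    };

    // Root-relative link ("/img/logo.png"): keep only scheme and host.
    if href.starts_with('/') {
        return Some(format!("{}://{}{}", scheme, host, href));
    }

    // Plain relative link: replace the last segment of the base path.
    let dir = &path[..=path.rfind('/').unwrap_or(0)];
    Some(format!("{}://{}{}{}", scheme, host, dir, href))
}

fn main() {
    let base = "https://example.com/blog/post.html";
    let links = [
        "styles.css",
        "/img/logo.png",
        "//cdn.example.com/app.js",
        "https://other.example/x.css",
    ];
    for href in links.iter() {
        println!("{} -> {:?}", href, resolve_url(base, href));
    }
}
```

Returning `Option<String>` lets the caller skip links that cannot be resolved instead of aborting the whole download.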
- Rust (latest stable version)
- Cargo package manager
- Clone or download the source code
- Fetch the dependencies and build the project:
cargo build
- Edit the `url` variable in the code to target your desired website
- Run the tool:
cargo run
The tool will:
- Create a `downloaded_content` directory
- Download the HTML page as `index.html`
- Download CSS files with a `css_` prefix
- Download JavaScript files with a `js_` prefix
- Download images with an `img_` prefix
Created directory: 'downloaded_content'
Downloading content from 'https://example.com/'...
Saved HTML content: downloaded_content/index.html
Downloading CSS: https://example.com/styles.css
Saved CSS: downloaded_content/css_styles.css
Downloading JavaScript: https://example.com/script.js
Saved JavaScript: downloaded_content/js_script.js
Downloading Image: https://example.com/logo.png
Saved image: downloaded_content/img_logo.png
Process completed! All files are in 'downloaded_content' directory.
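
The prefixed file names in the output above (for example `css_styles.css`) could be derived from an asset URL along the following lines. This is a sketch under assumptions and not the tool's actual code: the function name `local_file_name` and the `unnamed` fallback are hypothetical, and only the prefixes and the `downloaded_content` directory come from the description above.

```rust
use std::path::{Path, PathBuf};

/// Build a local file name such as "css_styles.css" from an asset URL.
/// `prefix` is expected to be "css", "js", or "img".
fn local_file_name(prefix: &str, url: &str) -> String {
    // Drop any query string or fragment ("logo.png?v=2" becomes "logo.png").
    let without_suffix = url
        .split(|c: char| c == '?' || c == '#')
        .next()
        .unwrap_or(url);

    // Take the last path segment as the base name.
    let last = without_suffix.rsplit('/').next().unwrap_or("");

    // Hypothetical fallback for URLs that end in '/' and have no file name.
    let name = if last.is_empty() { "unnamed" } else { last };

    format!("{}_{}", prefix, name)
}

/// Join the generated name onto the output directory.
fn local_path(out_dir: &str, prefix: &str, url: &str) -> PathBuf {
    Path::new(out_dir).join(local_file_name(prefix, url))
}

fn main() {
    let assets = [
        ("css", "https://example.com/styles.css"),
        ("js", "https://example.com/script.js"),
        ("img", "https://example.com/logo.png?v=2"),
    ];
    for (prefix, url) in assets.iter() {
        let path = local_path("downloaded_content", prefix, url);
        println!("{} -> {}", url, path.display());
    }
}
```

Two assets with the same base name in different remote directories would map to the same local file under this scheme, so a fuller version might add a counter or a hash of the URL to the name.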
Make sure you have permission to scrape the target website, and comply with its robots.txt and terms of service.