Scrape Instagram Reels data with ease—be it a single account or many in parallel—using Python, threading, robust logging, and optional database support.
Installation • Usage • Classes • Documentation • Contributing • License • Acknowledgments • Disclaimer
Requires Python 3.9+. Install directly from PyPI:
```bash
pip install reelscraper
```
Or clone from GitHub:
```bash
git clone https://github.com/andreaaazo/reelscraper.git
cd reelscraper
python -m pip install .
```
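To verify the installation, a quick sanity check is enough. This is a minimal sketch that only imports the two public classes used in the usage examples below:

```python
# Quick sanity check: import the classes used throughout this README
from reelscraper import ReelScraper, ReelMultiScraper

print("reelscraper installed:", ReelScraper.__name__, ReelMultiScraper.__name__)
```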
ReelScraper supports detailed logging and optional persistence via a database. You can either scrape a single Instagram account or handle multiple accounts concurrently.
Use `ReelScraper` to fetch Reels for a single account. Optionally pass a `LoggerManager` for retry logs and progress tracking.
```python
from reelscraper import ReelScraper
from reelscraper.utils import LoggerManager

# Optional logger setup
logger = LoggerManager()

# Initialize scraper with a 30-second timeout, no proxy, and logging
scraper = ReelScraper(timeout=30, proxy=None, logger_manager=logger)

# Fetch up to 10 reels for "someaccount"
reels_data = scraper.get_user_reels("someaccount", max_posts=10)
for reel in reels_data:
    print(reel)
```
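If you want to keep the results around, one option is to dump them to JSON. This is a minimal sketch; it assumes each item in `reels_data` is a JSON-serializable dict, as the example above suggests, and the output path is arbitrary:

```python
import json

# Persist the fetched reels for later analysis (assumes JSON-serializable dicts)
with open("someaccount_reels.json", "w", encoding="utf-8") as f:
    json.dump(reels_data, f, ensure_ascii=False, indent=2)
```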
Use `ReelMultiScraper` to process many accounts concurrently. Configure logging (`LoggerManager`) and database persistence (`DBManager`) if desired.
```python
from reelscraper import ReelScraper, ReelMultiScraper
from reelscraper.utils import LoggerManager
from reelscraper.utils.database import DBManager

# Configure logger and optional DB manager
logger = LoggerManager()
db_manager = DBManager(db_url="sqlite:///myreels.db")

# Create a single scraper instance
single_scraper = ReelScraper(timeout=30, proxy=None, logger_manager=logger)

# MultiScraper for concurrency, database integration, and auto-logging
multi_scraper = ReelMultiScraper(
    single_scraper,
    max_workers=5,
    db_manager=db_manager,
)

# File contains one username per line, e.g.:
# user1
# user2
accounts_file_path = "accounts.txt"

# Scrape accounts concurrently
# If DBManager is provided, results are stored in DB, and this method returns None
all_reels = multi_scraper.scrape_accounts(
    accounts_file=accounts_file_path,
    max_posts_per_profile=20,
    max_retires_per_profile=10,
)

if all_reels is not None:
    print(f"Total reels scraped: {len(all_reels)}")
else:
    print("All reels have been stored in the database.")
```
Note: If `DBManager` is set, scraped reels are saved to the database instead of being returned.
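When `DBManager` is given a SQLite URL like the one above, you can peek at the resulting file with Python's standard library. This sketch only lists the tables; the actual table names and columns depend on `DBManager`'s schema (see DOCS.md), and the file path is taken from the `db_url` in the example:

```python
import sqlite3

# Inspect the SQLite file created by DBManager (path taken from the db_url above)
conn = sqlite3.connect("myreels.db")
tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'"
).fetchall()
print("Tables:", [name for (name,) in tables])
conn.close()
```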
`ReelScraper`

- Purpose: Fetches Instagram Reels for a single user session.
- Key Components:
  - `InstagramAPI`: Manages HTTP requests and proxy usage.
  - `Extractor`: Structures raw reel data.
  - `LoggerManager` (optional): Logs retries and status events.
- Key Method:
  - `get_user_reels(username, max_posts=50, max_retries=10)`: Retrieves reels, handling pagination and retries.

`ReelMultiScraper`

- Purpose: Scrapes multiple accounts in parallel, powered by a single `ReelScraper` instance.
- Key Components:
  - `ThreadPoolExecutor`: Enables concurrent scraping.
  - `AccountManager`: Reads accounts from a local file.
  - `LoggerManager` (optional): Captures multi-account events.
  - `DBManager` (optional): Saves aggregated results to a database.
- Key Method:
  - `scrape_accounts(accounts_file, max_posts_per_profile, max_retires_per_profile)`: Concurrently processes all accounts found in the file, optionally storing results in a DB (see the sketch below for the underlying concurrency pattern).
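For readers curious about what `ReelMultiScraper` encapsulates, the sketch below shows the general `ThreadPoolExecutor` pattern: one task per account, each calling the documented `get_user_reels` method on a shared `ReelScraper`. This illustrates the pattern only, not the library's internal implementation; in practice you would just call `scrape_accounts` as shown above. The usernames are placeholders:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

from reelscraper import ReelScraper

scraper = ReelScraper(timeout=30, proxy=None)
usernames = ["user1", "user2"]  # placeholder accounts

# One task per account; results are collected as each future completes
with ThreadPoolExecutor(max_workers=5) as executor:
    futures = {
        executor.submit(scraper.get_user_reels, name, max_posts=20): name
        for name in usernames
    }
    for future in as_completed(futures):
        name = futures[future]
        try:
            reels = future.result()
            print(f"{name}: {len(reels)} reels")
        except Exception as exc:  # one failed account should not stop the others
            print(f"{name}: failed with {exc}")
```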
Find full usage details in the DOCS.md file.
We welcome PRs that enhance features, fix bugs, or improve docs.
- Fork the repo.
- Create a new branch.
- Commit code changes (add tests where possible).
- Open a pull request.
Your contributions are appreciated—happy coding!
Licensed under the MIT License. Feel free to modify and distribute, but please be mindful of best practices and ethical scraping.
- Python Community: For making concurrency and requests straightforward to implement.
- Instagram: For providing reel content that inspires creativity.
- Beverages: For fueling late-night debugging and coding sessions.
This software is for personal and educational purposes only. Use it in accordance with Instagram’s Terms of Service. We do not promote or condone large-scale commercial scraping or any violation of privacy/IP rights.
Enjoy scraping, and may your concurrency be swift!