Quickly crawl the information (e.g. followers, tags, etc.) of an Instagram profile. No login required!
Automation script for crawling information from an Instagram profile, such as the number of posts, the number of followers, and the tags of the posts.
Just do:
git clone https://github.com/timgrossmann/instagram-profilecrawl.git
It uses selenium and requests to get all the information, so install them with:
pip install -r requirements.txt
Install the proper chromedriver for your operating system. Once you [download it](https://sites.google.com/a/chromium.org/chromedriver/downloads), just drag and drop it into the instagram-profilecrawl/assets directory.
Now you can start using it following this example:
python3.5 crawl_profile.py username1 username2 ... usernameX
Settings: To limit the number of posts to be analyzed, change the variable limit_amount in settings.py. The default value is 12000.
If you want to access more features (such as private accounts that your own account follows), you must enter your username and password in settings.py. Remember, this is optional.
Here are the steps to do so:
- Open settings.py
- Search for login_username and login_password
- Put your information inside the quotation marks
Second option: just add the settings to your script:
Settings.login_username = 'my_insta_account'
Settings.login_password = 'my_password_xxx'
To run the crawler on Raspberry Pi with Firefox, follow these steps:
- Install Firefox: sudo apt-get install firefox-esr
- Get the geckodriver as described here
- Install pyvirtualdisplay: sudo pip3 install pyvirtualdisplay
- Run the script for RPi: python3 crawl_profile_pi.py username1 username2 ...
Collecting stats:
If you are interested in collecting and logging stats from a crawled profile, use the log_stats.py script after running crawl_profile.py (or crawl_profile_pi.py).
For example, on Raspberry Pi run:
- Run python3 crawl_profile_pi.py username
- Run python3 log_stats.py -u username for a specific user, or python3 log_stats.py for all users
This appends the collected profile info to stats.csv, which can be useful for monitoring the growth of an Instagram account over time.
The logged stats are: Time, username, total number of followers, following, posts, likes, and comments.
The two commands can simply be triggered using crontab (make sure to trigger log_stats.py several minutes after crawl_profile_pi.py).
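As a rough illustration of how the logged stats could be used to monitor growth, the sketch below reads rows shaped like stats.csv and reports the change in followers per user. The column names (`time`, `username`, `followers`), the presence of a header row, and the sample values are assumptions made for this example; check the actual layout of your stats.csv before relying on it.

```python
import csv
import io


def follower_growth(csv_file):
    """Return {username: last_followers - first_followers} from stats rows.

    Assumes a header row with 'username' and 'followers' columns
    (hypothetical names) and rows in chronological order.
    """
    first, last = {}, {}
    for row in csv.DictReader(csv_file):
        user = row["username"]
        followers = int(row["followers"])
        first.setdefault(user, followers)  # keep the earliest value seen
        last[user] = followers             # overwrite with the latest value
    return {user: last[user] - first[user] for user in first}


# Made-up sample data for illustration only.
sample = io.StringIO(
    "time,username,followers\n"
    "2018-04-25 10:00,grossertim,1940\n"
    "2018-04-26 10:00,grossertim,1950\n"
)
growth = follower_growth(sample)
print(growth)
```

With a real file, the same function could be called as `follower_growth(open("stats.csv", newline=""))`, provided the columns match the assumptions above.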
Settings:
Path to save the profile JSON files:
Settings.profile_location = os.path.join(BASE_DIR, 'profiles')
Whether the profile JSON file should get a timestamp:
Settings.profile_file_with_timestamp = True
Path to save the commenters:
Settings.profile_commentors_location = os.path.join(BASE_DIR, 'profiles')
Whether the commenters file should get a timestamp:
Settings.profile_commentors_file_with_timestamp = True
Scrape and save the posts JSON:
Settings.scrape_posts_infos = True
Maximum number of posts to scrape:
Settings.limit_amount = 12000
Whether the comments should also be saved in the JSON files:
Settings.output_comments = False
Whether the mentions in the post image should be saved in the JSON files:
Settings.mentions = True
Whether the users who liked the post should be saved in the JSON files. Attention: this takes a lot of time, because the script can only load 12 likes at once before it has to pause and load again:
Settings.scrape_posts_likers = True
Time between post scrolling (increase if you get errors):
Settings.sleep_time_between_post_scroll = 1.5
Time between comment scrolling (increase if you get errors):
Settings.sleep_time_between_comment_loading = 1.5
Output debug messages to the console:
Settings.log_output_toconsole = True
Path to the log file:
Settings.log_location = os.path.join(BASE_DIR, 'logs')
Output debug messages to a file:
Settings.log_output_tofile = True
New log file for every run:
Settings.log_file_per_run = False
Example of a file's data:
{
"alias": "Tim Gro\u00dfmann",
"username": "grossertim",
"num_of_posts": 127,
"posts": [
{
"caption": "It was a good day",
"location": {
"location_url": "https://www.instagram.com/explore/locations/345421482541133/caffe-fernet/",
"location_name": "Caffe Fernet",
"location_id": "345421482541133",
"latitude": 1.2839,
"longitude": 103.85333
},
"img": "https://scontent.cdninstagram.com/t51.2885-15/e15/p640x640/16585292_1355568261161749_3055111083476910080_n.jpg?ig_cache_key=MTQ0ODY3MjA3MTQyMDA3Njg4MA%3D%3D.2",
"date": "2018-04-26T15:07:32.000Z",
"tags": ["#fun", "#good", "#goodday", "#goodlife", "#happy", "#goodtime", "#funny", ...],
"likes": 284,
"comments": {
"count": 0,
"list": []
}
},
{
"caption": "Wild Rocket Salad with Japanese Sesame Sauce",
"location": {
"location_url": "https://www.instagram.com/explore/locations/318744905241462/junior-kuppanna-restaurant-singapore/",
"location_name": "Junior Kuppanna Restaurant, Singapore",
"location_id": "318744905241462",
"latitude": 1.31011,
"longitude": 103.85672
},
"img": "https://scontent.cdninstagram.com/t51.2885-15/e35/16122741_405776919775271_8171424637851271168_n.jpg?ig_cache_key=MTQ0Nzk0Nzg2NDI2ODc5MTYzNw%3D%3D.2",
"date": "2018-04-26T15:07:32.000Z",
"tags": ["#vegan", "#veganfood", "#vegansofig", "#veganfoodporn", "#vegansofig", ...],
"likes": 206,
"comments": {
"count": 1,
"list": [
{
"user": "pastaglueck",
"comment": "nice veganfood"
}
]
}
},
.
.
.
],
"prof_img": "https://scontent.cdninstagram.com/t51.2885-19/s320x320/14564896_1313394225351599_6953533639699202048_a.jpg",
"followers": 1950,
"following": 310
}
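To show how a file with this shape might be consumed, here is a minimal sketch that totals likes and comments and counts tag usage. The function name `summarize_profile` is hypothetical, and the inline data is a trimmed-down stand-in that uses only field names visible in the example (`posts`, `tags`, `likes`, `comments.count`).

```python
import json
from collections import Counter


def summarize_profile(profile):
    """Return total likes, total comments and the most used tags of a profile dict."""
    posts = profile.get("posts", [])
    tag_counts = Counter(tag for post in posts for tag in post.get("tags", []))
    return {
        "total_likes": sum(post.get("likes", 0) for post in posts),
        "total_comments": sum(
            post.get("comments", {}).get("count", 0) for post in posts
        ),
        "top_tags": tag_counts.most_common(3),
    }


# Trimmed-down stand-in for a crawled profile; real files contain more fields.
profile = json.loads("""
{
  "username": "grossertim",
  "posts": [
    {"tags": ["#fun", "#good"], "likes": 284, "comments": {"count": 0, "list": []}},
    {"tags": ["#vegan", "#fun"], "likes": 206, "comments": {"count": 1, "list": []}}
  ]
}
""")
summary = summarize_profile(profile)
print(summary)
```

For a real crawl result, the dictionary would instead come from `json.load` on a file in the directory that Settings.profile_location points to.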
The script also collects the usernames of users who commented on the posts and saves them in the ./profiles/{username}_commenters.txt file, sorted by comment frequency.
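The idea of sorting commenters by comment frequency can be sketched as follows. This is not necessarily how the script itself does it; `commenters_by_frequency` is a hypothetical helper, and the second post's comments are invented for the example.

```python
from collections import Counter


def commenters_by_frequency(posts):
    """Return usernames sorted by number of comments, most frequent first."""
    counts = Counter(
        comment["user"]
        for post in posts
        for comment in post.get("comments", {}).get("list", [])
    )
    return [user for user, _ in counts.most_common()]


posts = [
    {"comments": {"count": 1, "list": [
        {"user": "pastaglueck", "comment": "nice veganfood"},
    ]}},
    {"comments": {"count": 2, "list": [
        {"user": "example_user", "comment": "yum"},     # made-up commenter
        {"user": "pastaglueck", "comment": "again!"},   # made-up comment
    ]}},
]
ranking = commenters_by_frequency(posts)
print(ranking)
```

Using `Counter.most_common()` keeps the counting and the sorting in one step; users with the same count stay in the order in which they were first seen.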

