Crawling GitHub for Discord & Telegram Invites

By | August 6, 2019

My earlier crawler, described in Crawling GitHub for New Cryptonight Coins, used the GitHub API and Python. When I set out to find new Discord or Telegram invites that might be cryptocurrency related, I took a different path. This set of scripts was written in JavaScript and skipped the GitHub API, mostly out of curiosity and a desire to learn new things along the way.

The goal this time was a little different from the previous crawler, which focused on files and repositories. Discord and Telegram invites appeared in code, repository details, wiki pages, issues and even user profiles. I used Puppeteer to run Google Chrome in headless mode. Headless mode means that no browser window pops up, so the script can run from an SSH shell on a Linux machine without a desktop. On each run the script signed into GitHub and saved the session cookies for reuse. It then searched GitHub the way you would from any normal browser, saving data and paging through the search results. Every type of search object, under each of the different sort orders, was parsed on every run.
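As a rough sketch of what "every search type, every sort, every page" means in practice, the helper below builds the list of search-page URLs a single run might visit. It is not the original script: the function name, the set of types and the `s`/`o`/`p` query parameters are assumptions based on how GitHub's search page URLs looked in a browser, and the Puppeteer side (opening each URL and scraping the results) is left out.

```javascript
// Hypothetical helper, not the original code. The search types and sort
// values are assumptions taken from GitHub's search page query string.
const SEARCH_TYPES = {
  Repositories: ["", "updated"], // "" means the default "best match" order
  Code: ["", "indexed"],
  Issues: ["", "created", "updated"],
  Wikis: ["", "updated"],
  Users: ["", "joined"],
};

// Build one URL per (type, sort, page) combination for a search string.
function buildSearchUrls(query, pages = 1) {
  const urls = [];
  for (const [type, sorts] of Object.entries(SEARCH_TYPES)) {
    for (const sort of sorts) {
      for (let page = 1; page <= pages; page++) {
        const params = new URLSearchParams({ q: query, type });
        if (sort) {
          params.set("s", sort); // sort field
          params.set("o", "desc"); // newest or highest first
        }
        params.set("p", String(page)); // result page number
        urls.push(`https://github.com/search?${params}`);
      }
    }
  }
  return urls;
}
```

Calling `buildSearchUrls("discord.gg", 2)` yields the URLs for the first two result pages of every type and sort listed in the table, which a headless browser could then walk one by one.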

When searching for Discord invites I used either discord.gg or discordapp invite NOT oauth2, which together seemed to cover both invite link styles fairly well. Once all the searches had completed, any invite codes already in the database were filtered out. For each remaining code the script called Discord’s invite API to ask for information about that invite. All of that data, along with the source information, was saved in the database. Discord does ban IP addresses for excessive API usage, but the ban is temporary. I emailed them, but I never found out what the right requests-per-hour limit was. Also, fun fact: a free account can only join so many Discord servers (I can’t remember exactly, but I think it is 100) before it hits a limit. The bummer is that there isn’t an error for it; invites just stop working, and information about new Discord invites won’t show up either.
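The extract, filter and look-up steps can be sketched roughly as follows. This is a minimal illustration, not the original code: the function names and the regular expression are my own assumptions, and the lookup URL (host and the `with_counts` parameter) follows Discord's public API documentation rather than anything recorded in the scripts. The actual HTTP request, and any throttling to stay clear of an IP ban, is left out.

```javascript
// Matches both link styles: discord.gg/<code> and discordapp.com/invite/<code>.
// The allowed characters in a code are an assumption. oauth2 links are
// skipped because they have no /invite/ path segment.
const DISCORD_INVITE_RE =
  /(?:discord\.gg\/|discordapp\.com\/invite\/)([A-Za-z0-9-]+)/g;

// Pull the unique invite codes out of a blob of scraped page text.
function extractDiscordCodes(text) {
  const codes = new Set();
  for (const match of text.matchAll(DISCORD_INVITE_RE)) {
    codes.add(match[1]); // codes are case-sensitive, so keep them as found
  }
  return [...codes];
}

// Drop codes that an earlier run already stored (knownCodes stands in for
// whatever the database returns).
function filterNewCodes(codes, knownCodes) {
  const known = new Set(knownCodes);
  return codes.filter((code) => !known.has(code));
}

// Build the invite lookup URL for one code (assumed endpoint shape).
function inviteApiUrl(code) {
  return `https://discord.com/api/invites/${encodeURIComponent(code)}?with_counts=true`;
}
```

A run would then look something like `filterNewCodes(extractDiscordCodes(pageText), codesFromDb).map(inviteApiUrl)`, with each resulting URL fetched slowly enough to avoid the temporary ban.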

I spent less time on Telegram and thus didn’t go far in collecting Telegram invites. I didn’t bother trying to get any information about each Telegram invite; I just saved them off as I found them. Those search strings were t.me and telegram.me. I bet they paid a handsome sum for a single-letter domain, even if the TLD was me.
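Saving Telegram links as they turn up can be as simple as a pattern match over the scraped text. The sketch below is an assumption about how that might look, not the original script; the function name and the regular expression are hypothetical. One detail worth noting is that a bare search for t.me also hits any domain that merely ends in those characters, so the pattern insists that the host is not preceded by another letter, digit, dot or hyphen.

```javascript
// Matches t.me/<name>, t.me/joinchat/<hash> and the telegram.me equivalents.
// The lookbehind keeps hosts like "pivot.me" from matching as "t.me".
// The allowed characters are an assumption.
const TELEGRAM_LINK_RE =
  /(?<![A-Za-z0-9.-])(?:t\.me|telegram\.me)\/(joinchat\/[A-Za-z0-9_-]+|[A-Za-z0-9_]+)/g;

// Return the unique Telegram links in the text, normalized to https://t.me/.
function extractTelegramLinks(text) {
  const links = new Set();
  for (const match of text.matchAll(TELEGRAM_LINK_RE)) {
    links.add(`https://t.me/${match[1]}`);
  }
  return [...links];
}
```

Normalizing both hosts to one form means the same group found under t.me and telegram.me is stored only once.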

After collecting that information I could combine these invites and sources with the repositories I had found for Cryptonight coins. Some coins were better handled and launched than others, and these searches helped more than once.

This was part of a series: Ephemeral Projects as Performance Art
