If you aren’t already familiar with BitcoinTalk, it is a cryptocurrency-focused forum where the majority of coins make their launch announcements. Crawling a forum isn’t rocket science, to be sure, but the fun is in working out how to identify a new coin as a Cryptonight coin of interest.
You can crawl the forums with Python, and early versions of my scripts did just that, even using a GitHub repository I can no longer find. This time I chose NodeJS instead, using the Chrome Remote Interface and Puppeteer to handle the data collection.
In the first pass over the BitcoinTalk announcement forum, my script just saves off the index number and subject line of each announcement. This means the script doesn’t have to wait for each announcement page to load; it just leafs through the forum’s thread list. Before inserting an announcement into the database, it verifies that the index isn’t already there. After a successful insert, the new record waits patiently for the next script to come and give it some more attention.
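To make the check-before-insert step concrete, here is a minimal sketch rather than the actual script. It assumes the index number is the topic number in a thread link of the form `index.php?topic=1234567.0`. The `Map` standing in for the database, the function names, and the `checked` flag are all hypothetical.

```javascript
// Hypothetical stand-in for the real database: topic index -> record.
const announcements = new Map();

// Pull the numeric topic index out of a thread link such as
// "https://bitcointalk.org/index.php?topic=1234567.0" (assumed URL shape).
function topicIndexFromUrl(url) {
  const match = /[?&;]topic=(\d+)/.exec(url);
  return match ? Number(match[1]) : null;
}

// Store the announcement only if its index has not been seen before.
// Returns true when a new record was inserted, false otherwise.
function saveIfNew(store, url, subject) {
  const index = topicIndexFromUrl(url);
  if (index === null || store.has(index)) return false;
  // "checked: false" marks the record as waiting for the second script.
  store.set(index, { index, subject: subject.trim(), checked: false });
  return true;
}
```

Calling `saveIfNew(announcements, link, subject)` for every row in the thread list keeps the first pass cheap, since nothing beyond the list page needs to be loaded.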
This second script has the job of loading each announcement page and deciding whether it is a Cryptonight coin or not. It looks at the body of the first post, the subject, and whatever images it finds on the page. Some announcements are 100% images with no actual text. Each image is saved and parsed using Tesseract; then the combined image text, body text, and subject are checked to see if they contain one of these two strings: ptonote or onigh. As simple as that looks, it has let me accurately identify a large portion of announcements as Cryptonight coins.
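The substring check itself fits in a few lines. This is a sketch under assumptions, not the original code: the function name is hypothetical, the OCR text is assumed to arrive as an array of strings (one per image), and lowercasing everything before matching is my assumption so that spellings like "CryptoNote" and "CryptoNight" still hit.

```javascript
// The two substrings named in the post.
const MARKERS = ["ptonote", "onigh"];

// Combine subject, first-post body, and OCR text from every image, then
// report whether any marker appears anywhere in the combined text.
function looksLikeCryptonight(subject, bodyText, ocrTexts) {
  const combined = [subject, bodyText, ...ocrTexts].join("\n").toLowerCase();
  return MARKERS.some((marker) => combined.includes(marker));
}
```

Short fragments like these tolerate a mangled first letter or two from OCR. The trade-off is the occasional false positive, since an ordinary word such as "tonight" also contains "onigh".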
Then a message is sent out to my private and public Discord servers, telegraphing the newly found announcements and highlighting those which look like they could be Cryptonight coins. I toyed with ideas like collecting links or saving significant words, but they didn’t add any value. As of now I have around 27,000 announcements saved off, many of which have since been deleted.
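For the Discord step, one way to do the highlighting is to bold the likely Cryptonight lines. The sketch below only builds the message body and assumes it would be posted to a Discord webhook, which accepts a JSON payload with a `content` field; the post does not say whether a webhook or a bot is used. The function name and the record shape (`index`, `subject`, `cryptonight`) are hypothetical.

```javascript
// Build a webhook-style payload listing new announcements, one per line.
// Likely Cryptonight coins are wrapped in ** so Discord renders them bold.
function buildDiscordMessage(records) {
  const lines = records.map((record) => {
    const line = `${record.index}: ${record.subject}`;
    return record.cryptonight ? `**${line}** (possible Cryptonight)` : line;
  });
  return { content: lines.join("\n") };
}
```

Discord caps `content` at 2000 characters, so a real sender would need to split long batches across several messages.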
This was part of a series: Ephemeral Projects as Performance Art