Hunting YouTube Crypto Scams
Back in April 2022 I got annoyed by how prevalent cryptocurrency scams still were on YouTube, years after I had first seen them. I spent a few minutes going through the scams that I easily found with a search for live streams including either “ETH” or “BTC” and reporting them via the YouTube flag / report system. Many hours later they were eventually taken down, but not before more scam live streams were already running to take their place.
Really I wanted (and still want) YouTube to do a better job… They have all of the information that should make it possible to shut these down in the first seconds of them being live. But I figured I’d see how easy this would be to automate as a system using only the public APIs etc.
This post covers the initial prototype, followed by the scam-hunter web app which ran for some months before I sunset it last week. TLDR; lots of money was stolen while I was looking at these scam streams.
Example of the scam
When running, these streams are very easy to find by just searching for them (live streams that mention “BTC” or “ETH”). You’ll either end up with streams displaying charts of the values compared with other crypto assets, or scam streams.
The scam streams take a variety of different forms, but most of them make use of pre-recorded videos of conversations with folks such as Elon Musk talking about cryptocurrencies, while also promoting a website such as MuskLiveNow.Tech (I made this one up) which claims to be running a giveaway event.
The example above appears to have 14k+ viewers. I have no idea if these are real, or just view bots, but if you follow the trail from video, to website, to wallet addresses, you can see that people are sending funds to the addresses being peddled by the scam sites.
Domains are cheap, so the scams regularly rotate through new domains, and these all make their way into the videos, or YouTube pages one way or another. The sites are often reused, but there are some variations, perhaps from different groups running the same sort of scams.
Most of the domain registrations that I looked at by hand appeared to originate in Russia, and many sites were hosted via well-known hosts such as CloudFlare.
Back in April, before writing any code, there were already hundreds of thousands of dollars worth of cryptocurrencies in the few wallets that I had found.
I spent a few days throwing together some quickly written Node code, making use of ytsr to search for videos, ytdl-core (part of the slightly famous youtube downloader) to download short snippets of videos, and googleapis to generally integrate the YouTube APIs for other information.
I came up with a series of steps to the process of detection, data extraction and reporting.
- ytlive.js: Search YouTube for live streams with "eth" OR "btc", take a frame from the stream along with other general information. Then run that information through the “bad detector” and if bad was detected store the information on disk.
- domains.js: Look at the frame of video and try to extract any domains that can be found in text. If they are live, then also store them on disk.
- wallets.js: Look at the stored domains, and try to extract BTC or ETH wallets from the HTML, also storing them on disk.
- report.js: Report the live streams that had “bad detected” in them to YouTube via their API.
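The domain and wallet extraction steps can be sketched roughly as below. This is a minimal illustration, not the original domains.js or wallets.js: the function names and all three regular expressions are my own assumptions, and the address patterns only check the general shape of an address (they do not validate checksums).

```javascript
// Hypothetical helpers: names and patterns are assumptions for illustration.

// Loose domain matcher: dot-separated labels ending in a TLD of 2+ letters.
const DOMAIN_RE = /\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}\b/gi;

// Bitcoin address shapes: bech32 (bc1...) and legacy base58 (1... or 3...).
const BTC_RE = /\b(?:bc1[a-z0-9]{25,59}|[13][a-km-zA-HJ-NP-Z1-9]{25,34})\b/g;

// Ethereum address shape: 0x followed by 40 hex characters.
const ETH_RE = /\b0x[a-fA-F0-9]{40}\b/g;

// Return the unique, lower-cased domains found in OCR text.
function extractDomains(text) {
  const found = text.match(DOMAIN_RE) || [];
  return [...new Set(found.map((d) => d.toLowerCase()))];
}

// Return the unique candidate wallet addresses found in a page's HTML.
function extractWallets(html) {
  return {
    btc: [...new Set(html.match(BTC_RE) || [])],
    eth: [...new Set(html.match(ETH_RE) || [])],
  };
}
```

Anything matched this way is only a "possible" address, which is why the totals later in the post are described the same way.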
The search results were numerous, often with many apparent views, and making use of hijacked YouTube channels renamed, often to appear to represent well-known people or organizations.
The bad detector initially just ran OCR on the frame of video and tried to match a few regular expressions for phrases that frequently occur in these scams. The initial regular expressions looked something like this:
- - regex
  - - Mr\.? Beast Crypto Charity Stream
    - double\syour\scrypto\s+SCAN\sQR[\s-]CODE
    - \d+((\.|,)\d+)?\+? ?(ETH|BTC|SOL|ADA),? ?(to|you)? ?(get|to|get|receive|and),? ?\d+((\.|,)\d+)?\+? ?(ETH|BTC|SOL|ADA)?
Keeping things simple, I just ran these scripts every few hours to gather some data.
This method collected numerous videos and bad domains, plus (as of May 2nd) around 18 BTC wallets and 17 ETH wallets. You can find the exact lists of video IDs, domain names and wallet addresses in a YAML database on GitHub.
The prototype code was working, but I wanted it to run regularly by itself, and I also wanted to be able to share the information more easily with other folks and make it easier to navigate than a YAML file. I considered using GitHub Actions and Pages, but settled on giving Firebase a go (I had not used it in a project before).
And thus scam-hunter.web.app was born (now mostly sunset).
Much easier for non-technical people to consume!
This site (read more about how it was set up with Firebase below) checked for live streams on YouTube every hour or so, collected information for the video, analysed it, and if it was seen as “bad” displayed it and all information on the site.
Looking at the cmheaZk6GpY video seen in the screenshot above (no longer live, as it was “violating YouTube’s Terms of Service”), we can get an overview of all parts of the scam that the app collected.
- video which had been live streaming for 11 hours at the time of this screenshot
- details of the video were captured when detected and stored, including:
- Title: Michael Saylor - Why $120K Bitcoin Next Week?! BITCOIN Urgent News! BTC/ETH Price Prediction
- Channel: Katana Produções / UCYjNLv_zdWBMxUMuDoBG7nw which appears to be back under the correct ownership
- View count: 5832
- Description: MicroStrategy CEO Michael Saylor, whose software company owns about $6 billion worth of bitcoin, said the cryptocurrency doesn't need Warren Buffett's endorsement to be wildly successful.\nBuffett dismissed cryptocurrencies last year as basically worthless because they don't produce anything. Saylor highlighted the criticism from the famed investor and Berkshire Hathaway CEO during a presentation at MicroStrategy's virtual investor day this week.\nHe pointed out that other assets have performed extremely well without Buffett's backing, and suggested bitcoin could flourish if even 5% of institutional investors embrace it.\n\"Warren Buffett never bought Microsoft stock, and he was best friends with Bill Gates for nearly a generation,\" Saylor said, according to a transcript on Sentieo, a financial-research site.\n#bitcoin #ethereum #crypto #btc #eth #bitcoinnews #cryptocurrency #altcoins #bch #bsv #microstrategy #Stocks
- Report (what was detected):
\d+((\.|,)\d+)?\+? ?(ETH|BTC|SOL|ADA),? ?(to|you)? ?(get|to|get|receive|and),? ?\d+((\.|,)\d+)?\+? ?(ETH|BTC|SOL|ADA)?
- Snapshot (the frame that was inspected)
- Text (extracted from the frame using tesseract OCR)
- Vision Text (extracted from the frame using Google Vision API)
All of this information is retained for all matched videos. If anyone wants this sort of data set, please reach out.
I didn’t end up implementing automatic reporting, or a way for others to report videos by clicking a button, as the YouTube API makes this quite difficult, mainly because of issue 230865663, “Spam or misleading > Scams or fraud” reporting reason not available in the v3 API.
When using a script to report videos in bulk I would use the categories “Violent, hateful, or dangerous > Digital security”.
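A bulk-reporting script needs to turn those category labels into the IDs that the API expects. The sketch below builds a request body of the kind the v3 videos.reportAbuse method takes (videoId, reasonId, secondaryReasonId, comments), looking the IDs up from a list shaped like the items returned by videoAbuseReportReasons.list. The helper name and the example IDs are hypothetical, and the labels returned by the real API may not match the wording of the YouTube UI.

```javascript
// Hypothetical helper: resolve reason labels to IDs and build a report body.
// `reasons` is assumed to look like videoAbuseReportReasons.list items:
//   { id, snippet: { label, secondaryReasons: [{ id, label }] } }
function buildAbuseReport(reasons, videoId, reasonLabel, secondaryLabel, comments) {
  const reason = reasons.find((r) => r.snippet.label === reasonLabel);
  if (!reason) throw new Error(`Unknown reason: ${reasonLabel}`);

  const secondary = (reason.snippet.secondaryReasons || []).find(
    (s) => s.label === secondaryLabel
  );
  if (!secondary) throw new Error(`Unknown secondary reason: ${secondaryLabel}`);

  return { videoId, reasonId: reason.id, secondaryReasonId: secondary.id, comments };
}

// Example data with made-up IDs, purely for illustration.
const exampleReasons = [
  {
    id: "X",
    snippet: {
      label: "Violent, hateful, or dangerous",
      secondaryReasons: [{ id: "Y", label: "Digital security" }],
    },
  },
];

console.log(
  buildAbuseReport(
    exampleReasons,
    "cmheaZk6GpY",
    "Violent, hateful, or dangerous",
    "Digital security",
    "Crypto giveaway scam live stream"
  )
);
```

Failing loudly on an unknown label is deliberate here: if a reason is renamed or removed, a bulk script should stop rather than file reports under the wrong category.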
You can find your own report history on the report history page.
You’ll notice that despite some of my reporting efforts at least 1 recent video is still “Live” on YouTube. Live in this context means viewable, however, the live stream ended quite some time ago now.
The stream that is still live mainly appears to show a chart of Dogecoin vs Tether, however at around 1:50:00 the stream switches to something a little more suspicious, and by 2:21:00 the scam is running. It remains like this for the last ~10 hours of the stream. 4 months on, it is still live on YouTube despite the report.
So, does reporting actually guarantee that a human will look at it and make the right decision? Well, no…
Over 4 months (May – August), 2821 videos were detected, 680 domains, 403 possible BTC addresses and 314 possible ETH addresses.
That’s ~23 videos, ~5 domains and ~3 BTC addresses a day.
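The daily figures are just the totals divided by the length of the collection period. A quick check, assuming the period covers all of May through August (123 days), which is my assumption since the exact start and end dates are not given:

```javascript
// Days in May, June, July and August (assumed full months).
const DAYS = 31 + 30 + 31 + 31;

// Totals quoted in the post.
const totals = { videos: 2821, domains: 680, btc: 403, eth: 314 };

// Average per day, rounded to one decimal place.
function perDay(total, days) {
  return Math.round((total / days) * 10) / 10;
}

for (const [name, total] of Object.entries(totals)) {
  console.log(`${name}: ~${perDay(total, DAYS)} per day`);
}
```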
I don’t currently have a public list of all video IDs, screen grabs, descriptions etc. But if you do want this please reach out to me and I’ll try to make it public in a useful format.
As the application continued to evolve, and the scams changed, some videos were incorrectly flagged as bad when they were probably fine. This was due to the list of bad domains being automatically generated from previously matched videos, so once a single incorrect domain name got into the list, more false positives would occur. Somehow youtube.com ended up on the list of bad domains (whoops)… This was one of the reasons I decided to sunset the project this week (rather than try to maintain it while sailing).
I also haven’t added up how much money flowed through these wallets, but I’m sure it is probably millions of USD worth at this point.
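One way to break this kind of feedback loop is to check every candidate against an allowlist of well-known domains before it can join the bad list. The project did not do this; the sketch below, including the allowlist contents and the function names, is purely an assumption about how such a guard could look.

```javascript
// Hypothetical allowlist of domains that should never be treated as bad.
const ALLOWLIST = new Set(["youtube.com", "youtu.be", "google.com"]);

// True if the domain is an allowlisted domain or a subdomain of one.
function isAllowlisted(domain) {
  const d = domain.toLowerCase().replace(/\.$/, "");
  for (const allowed of ALLOWLIST) {
    if (d === allowed || d.endsWith("." + allowed)) return true;
  }
  return false;
}

// Add candidates to the bad-domain set, skipping anything allowlisted.
function addBadDomains(badDomains, candidates) {
  for (const candidate of candidates) {
    if (!isAllowlisted(candidate)) badDomains.add(candidate.toLowerCase());
  }
  return badDomains;
}

const bad = addBadDomains(new Set(), ["YouTube.com", "www.youtube.com", "musklivenow.tech"]);
console.log([...bad]);
```

The subdomain check matches on a leading dot, so a lookalike such as notyoutube.com is still treated as a candidate.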
Behind the scenes (Firebase)
The automated Firebase process used very similar code to the prototype, just shifted around into a collection of Firebase functions all triggered by 2 scheduled tasks. Everything was persisted in Cloud Firestore and Cloud Storage, then served to users via more Firebase functions.
Initial OCR detection was done with tesseract OCR in a Firebase function to keep the cost down, but when anything “bad” was detected the Google Vision API was then used for a more accurate OCR extraction from the frame to match domain names etc.
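The cheap-then-accurate flow can be sketched as below. The OCR and detection steps are passed in as functions so the control flow stands on its own; cheapOcr, accurateOcr and detectBad are placeholder names of mine, not the actual tesseract or Google Vision calls used in the project.

```javascript
// Two-stage analysis: run the cheap OCR on every frame, and only pay for
// the accurate OCR when the cheap pass already looks bad.
// All three injected functions are hypothetical placeholders.
async function analyseFrame(frame, { cheapOcr, accurateOcr, detectBad }) {
  const text = await cheapOcr(frame);
  const matches = detectBad(text);

  if (matches.length === 0) {
    // Nothing suspicious: skip the more expensive OCR call entirely.
    return { bad: false, matches, text, visionText: null };
  }

  // Something matched: get a cleaner extraction for domain matching etc.
  const visionText = await accurateOcr(frame);
  return { bad: true, matches, text, visionText };
}
```

With this shape the expensive call scales with the number of suspicious frames rather than the number of streams checked, which is where the cost saving comes from.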
You can see how much better the Google Vision API was for OCR than my poorly configured tesseract by comparing some extracted text from each. Tesseract seemed to match far more random artefacts around the video frame.
What do I think?
Generally, YouTube should be doing a better job. The time from reporting to taking videos down is far too long. The fact that some scams don’t even get taken offline is pretty bad, and the fact that these are not already caught by some filter given the engineering might at Google is also a poor show.
A quick YouTube search in the past week shows a few streams that have ended that are using the same or similar live stream talking about cryptocurrencies (#1, #2, #3). These don’t appear to have any websites to visit in the video content themselves, but I have seen the websites appear in descriptions and as chat messages in the past, which may have since been removed.
The crypto scams on YouTube go far beyond the live streams that I have focused on in this post, and it would be great to have fewer of them on the platform…