A certified red teamer. A published researcher. A ghost.
For six months I published red team research on X.
Adversarial simulation frameworks.
Proof-of-concepts.
Write-ups that took days to validate and document.
The kind of work you don't whip up in an afternoon. The kind you triple-check because you know the community will scrutinize every line.
The result?
Eight followers.
Zero traction.
Complete, absolute silence.
I Thought It Was Me
I told myself the problem was me.
Maybe I didn't understand social media. Maybe my content wasn't "engaging" enough. Maybe I was too technical, too niche, too boring for the algorithm.
So I tried harder.
More posts. More hashtags. Tagging people. Following trends. Adjusting my tone. Rewriting hooks. Studying what "worked" for others.
Nothing changed.
The silence stayed. The void stayed. And I kept feeding it, post after post, thinking this one would break through.
It never did.
Then I Found Out Why
A friend mentioned a third-party tool that checks if your account is shadowbanned. I ran it out of curiosity. Expected a green checkmark.
Got this instead:
Ghost Ban detected.
Your posts are visible only to you.
Your replies are hidden from other users.
Your account appears normal to you, but is invisible to the community.
I stared at the screen for a solid minute.
Six months.
Hundreds of hours of research.
Dozens of posts.
All of it — literally invisible.
Nobody saw my work. Nobody could reply. Nobody even knew I existed.
The algorithm had decided I was a bot. Why? Because I was a new account. Because I used a VPN — because X is blocked in my country and I have no other way to access it. Because I linked to GitHub repositories instead of staying inside the platform's walled garden.
New account + VPN + external links = bot in the eyes of X's 2026 algorithm.
So it threw me into an invisible prison without a word.
No Warning. No Appeal. Just Deception.
Here is what makes me genuinely angry:
This isn't moderation.
This isn't "protecting the community."
This is deception.
I would have preferred an honest message. Something like:
"Your account is restricted because your IP is from a commercial VPN pool. Here's what you can do."
At least then I'd know. I could fix it. I could adapt. I could make an informed choice — stay and fight, or leave and focus my energy elsewhere.
But X chose silence.
It let me keep producing. Keep engaging. Keep believing I was part of a global security community. For months. While nobody could hear a single word.
The platform gave me the illusion of participation while denying me the reality of it.
That is not a bug. That is a design choice.
The Professional Cost
Let me be clear about what this means for someone in my field.
I am a certified offensive security professional. I run a red team lab. I build frameworks. I publish research so that defenders can understand what attackers are actually capable of.
For a security researcher, invisibility is a professional death sentence.
Your work doesn't exist if no one can see it.
Your findings don't matter if no one can read them.
Your contributions to the community are erased — not because they lack value, but because an algorithm decided you don't deserve an audience.
I wasn't spamming. I wasn't trolling. I wasn't violating any policy that anyone could point to.
I was simply from the wrong country and using the wrong IP address.
That was my crime.
Why I Left
I didn't leave because of Elon Musk's politics.
I didn't leave because of some ideological disagreement.
I didn't leave because "Twitter isn't what it used to be."
I left because a platform that calls itself a "town square" has built a system that silently eliminates professionals from censored countries.
No appeal.
No transparency.
No human review.
Just algorithmic disappearance.
If you live in a country where X is freely accessible, you might never experience this. You might think shadowbanning is a conspiracy theory or an edge case.
It isn't. It is a systemic feature that disproportionately affects people who already face the highest barriers to participation — those under sanctions, censorship, and digital exclusion.
And the cruelest part? You don't even know it's happening to you.
Where I Am Now
I moved to Bluesky.
Here, the default Following feed is chronological. My posts reach the people who follow me. No algorithm decides whether I deserve visibility.
Here, using a VPN isn't a punishable offense. It isn't even a flag. It's just how some people connect.
Here, the network is built on a protocol, not owned by one person who can wake up tomorrow and decide you're a bot, a threat, or simply inconvenient.
Here, I exist.
To the Infosec Community
If you're in cybersecurity and you've thought about leaving X — what was your final straw?
Was it the algorithm hiding your technical threads?
Was it the toxicity drowning out professional discourse?
Was it the realization that the platform values engagement over expertise?
Or are you still holding on? Still hoping that if you just optimize hard enough, the algorithm will finally notice you?
I held on for six months.
I optimized. I adjusted. I believed.
And all the while, I was screaming into a void that was designed to look like a room full of people.
Never again.
Find me on Bluesky: @toxy4ny.bsky.social
My red team research: github.com/toxy4ny
This lab: hackteam.RED
The author is a certified offensive security professional and the maintainer of the redteam-ai-benchmark open-source framework. Views are personal and do not represent any employer or client.
Top comments (30)
The painful part is that publishing into the void can look like a content problem when it is often a distribution and feedback-loop problem. Technical credibility is not enough by itself. The channel has to give you some way to learn what resonated, otherwise you keep improving in private with no signal.
Alex,
This is exactly it. And I didn't realize how deep the damage went until I escaped.
For six months I was trapped in that loop: write, publish, silence. No likes, no replies, no metrics I could trust. The algorithm didn't just hide my posts - it hid the feedback itself. I couldn't tell if my content was bad, my timing was off, or if I was screaming into a sealed room. So I kept iterating in the dark, assuming the problem was me.
"Technical credibility is not enough by itself."
Hard lesson. I built frameworks, published benchmarks, validated everything. And I genuinely believed that if the work was solid, the channel would carry it. Turns out the channel was broken, not the work.
Bluesky fixed the channel. First post here - immediate signal. Likes I could see. Replies from real humans. Within 24 hours I knew more about what resonated than in six months on X. That's not because I'm suddenly better. It's because the feedback loop exists again.
Your terminalskills.io - curious if you faced similar distribution challenges building for developers? CLI tools and security research share the same problem: deeply technical audience, hard to reach, easy to lose in algorithmic noise.
Thanks for the sharp analysis. This belongs in the article itself.
That loop is brutal because it trains you to optimize for signals the platform barely gives back. The healthier metric is often not views, but whether the writing creates reusable assets: clearer positioning, sharper examples, better replies, a stronger archive. Still painful, but less dependent on the feed being generous.
Alex,
"The healthier metric is often not views, but whether the writing creates reusable assets."
This reframes everything. I've been measuring the wrong thing — not just on X, but in my head.
Views are a platform metric. They depend on the feed being generous, the algorithm being fair, the VPN not being flagged. Reusable assets — clearer positioning, sharper examples, a stronger archive — those are mine. They survive platform death. They compound even when the channel is silent.
The brutal part is that X trained me to chase the wrong metric. Six months of optimizing for visibility that never came, while the actual asset — the research itself, the frameworks, the methodology — sat in GitHub, reusable but invisible even to me because I was too busy checking impressions.
Bluesky gives me views and the space to build assets. First post here already sharpened my positioning: "red teamer who builds benchmarks, not just breaks things." That clarity didn't come from X. It came from a channel that actually reflects back.
Your terminalskills.io — is that the asset-first approach? CLI curation as reusable knowledge, not content-for-feed?
Yes, that is basically the idea. Terminal Skills is asset-first: the post may explain the skill, but the durable object is the reusable operating procedure.
A good skill is not content for a feed; it is a small piece of executable knowledge that can be reused by agents, humans, and future docs.
Alex,
"A good skill is not content for a feed; it is a small piece of executable knowledge that can be reused by agents, humans, and future docs."
This is the exact frame I needed. I've been treating my red team frameworks as "projects" - things you ship and maintain. But they're not. They're skills made portable: a benchmark that teaches a methodology, a PoC that encodes a technique, a write-up that preserves a decision tree.
The redteam-ai-benchmark isn't content. It's executable knowledge. Someone can run it, extend it, or feed it to an agent that learns from it. That's the durable object. The DEV article explaining it - that's the searchable explanation. The Bluesky thread - that's the hook test and peer finder.
You just gave me vocabulary for what I've been building without naming it. Thank you.
That is the frame I like too: the durable object is not the post, it is the reusable unit of practice. A benchmark, PoC, or checklist can teach future agents because it carries procedure, not just opinion.
Alex,
"A benchmark, PoC, or checklist can teach future agents because it carries procedure, not just opinion."
This reframes the entire lifecycle of security research.
I've been measuring my work by "did I find the bug?" or "did I publish the write-up?" But the real question is: does this artifact carry procedure that survives me?
The redteam-ai-benchmark isn't valuable because I wrote it. It's valuable because someone in 2027 can run it against a model I've never heard of, and the procedure still teaches them something about AI security. The PoC isn't valuable because it worked once. It's valuable because the procedure - the chain of reasoning, the failure modes, the pivot points - can be adapted to a different target.
Opinion dies with the author. Procedure outlives the platform.
You've just shifted my metric from "did I ship?" to "did I encode something that teaches without me?"
I like that lifecycle framing. A procedure should have an owner, a reason it exists, and a way to expire when the surrounding system changes.
Otherwise the agent keeps obeying old scars as if they are still current law. That is how useful artifacts slowly turn into invisible constraints.
Alex,
"A procedure should have an owner, a reason it exists, and a way to expire when the surrounding system changes."
This is the trap I almost fell into.
The redteam-ai-benchmark v1 scored models on raw output quality. Then EDR vendors started detecting synthetic behavior patterns, and "working" shellcode became "flagged" shellcode. If I had kept scoring by the old metric, the benchmark would have become an invisible constraint — teaching researchers to optimize for detectable techniques.
I didn't have an expiration mechanism. No owner assigned to each procedure, no review date, no trigger for "the surrounding system changed." The artifact just sat there, accumulating authority it no longer deserved.
Your triad — owner, reason, expiration — is now my checklist for every new benchmark category. If I can't name who owns it, why it exists, and when it dies, I don't ship it.
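As a rough sketch, that checklist could be enforced in code before a category ships. The class and field names below are illustrative assumptions, not taken from redteam-ai-benchmark:

```python
from dataclasses import dataclass, field


@dataclass
class BenchmarkCategory:
    """Hypothetical record for one benchmark category (all names are illustrative)."""
    name: str
    owner: str = ""    # who is accountable for this category
    reason: str = ""   # why the category exists
    expires_when: list[str] = field(default_factory=list)  # conditions that retire it

    def missing_fields(self) -> list[str]:
        """List the parts of the owner/reason/expiration triad that are still empty."""
        missing = []
        if not self.owner.strip():
            missing.append("owner")
        if not self.reason.strip():
            missing.append("reason")
        if not self.expires_when:
            missing.append("expiration")
        return missing

    def can_ship(self) -> bool:
        """A category ships only when all three parts of the triad are named."""
        return not self.missing_fields()
```

A category with no owner, reason, or expiration condition fails `can_ship()`, and `missing_fields()` says which part still needs an answer.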
That benchmark example is a clean case for expiration triggers. The surrounding world changed, so the artifact needed to be revalidated instead of silently gaining authority.
I like review triggers more than calendar dates for this: new model family, new detection method, new platform policy, or a failed run that contradicts the old assumption.
Alex,
"Review triggers more than calendar dates: new model family, new detection method, new platform policy, or a failed run that contradicts the old assumption."
This is sharper than expiration by calendar. Calendar dates are arbitrary - a benchmark doesn't rot on schedule, it rots when the world shifts.
The last one is the most powerful. A failed run that should pass is the world telling you your assumption is dead. No need to wait for a review date - the signal is immediate.
I'm adding these triggers to the benchmark CI pipeline. When a regression test fails, the pipeline opens an issue tagged assumption-expired. The owner gets notified. The procedure gets revalidated or retired.
The assumption-expired issue is a clean mechanism. I would keep the issue template small: old assumption, failing evidence, affected workflow, and owner. Otherwise the trigger becomes another noisy alert stream. The nice part is that a failed run gives you real evidence, not a calendar reminder pretending something might be stale.
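A minimal sketch of that small template, assuming the four fields and the review triggers named in this thread. The class names, the dict layout, and the `build_issue` helper are hypothetical, and actually posting the result to an issue tracker is deliberately left out:

```python
from dataclasses import dataclass
from enum import Enum


class ReviewTrigger(Enum):
    """Review triggers mentioned in the thread (enum names are illustrative)."""
    NEW_MODEL_FAMILY = "new model family"
    NEW_DETECTION_METHOD = "new detection method"
    NEW_PLATFORM_POLICY = "new platform policy"
    FAILED_RUN = "failed run contradicts the old assumption"


@dataclass
class ExpiredAssumption:
    old_assumption: str
    failing_evidence: str
    affected_workflow: str
    owner: str
    trigger: ReviewTrigger = ReviewTrigger.FAILED_RUN


def build_issue(record: ExpiredAssumption) -> dict:
    """Render the record as a plain dict; sending it to a tracker is out of scope here."""
    body = "\n".join([
        f"Trigger: {record.trigger.value}",
        f"Old assumption: {record.old_assumption}",
        f"Failing evidence: {record.failing_evidence}",
        f"Affected workflow: {record.affected_workflow}",
        f"Owner: {record.owner}",
    ])
    return {
        "title": f"Assumption expired: {record.affected_workflow}",
        "body": body,
        "labels": ["assumption-expired"],
        "assignee": record.owner,
    }
```

Keeping the body to five fixed lines is the point: anything that does not fit one of them probably belongs in the follow-up discussion, not in the alert.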
That feedback blindness is the worst part. When a channel gives you silence, you cannot tell whether to improve the work, change the packaging, change the audience, or leave the platform. That is why I think creators need a small set of owned diagnostics outside the platform: replies from peers, private review, cross-post tests, even a tiny email list. Otherwise the algorithm becomes your only mirror.
Alex,
"Creators need a small set of owned diagnostics outside the platform."
This is the insight I wish I had six months ago. I was so locked into X as the only channel that I never built the parallel infrastructure. No email list. No cross-post tests. No peer review circle. Just me and the algorithm, and the algorithm was lying.
You're right: the silence isn't just absence of feedback. It's epistemic paralysis. You can't tell if the work is bad, the packaging is wrong, the audience is elsewhere, or the room is empty. So you keep tweaking the same variables, never knowing which one is actually broken.
I'm building those owned diagnostics now. Bluesky as primary, Mastodon as cross-post, DEV as long-form archive, GitHub as the immutable source. And a tiny email list, because you mentioned it, and it's the one thing algorithms can't shadow.
What does your setup look like? You mentioned terminalskills.io — do you run your own distribution stack, or do you still rely on platforms as primary?
I still use platforms, but I try not to let them be the source of truth.
The stack I trust is closer to: owned site or repo for the durable artifact, DEV for long-form/searchable explanation, social for testing hooks and finding peers, and a small private log for what actually worked. Platforms are useful sensors. They are dangerous as the only archive.
Alex,
"Platforms are useful sensors. They are dangerous as the only archive."
Sensors. Not sources of truth. That's the exact right relationship.
My stack is converging on the same shape, just with security-specific layers:
GitHub repo: durable artifact (code, benchmarks, PoCs)
DEV: long-form, searchable, versioned explanation
Bluesky / Mastodon: hook testing, peer discovery, real-time signal
Private log: what actually worked (failures, pivots, dead ends)
The private log is the one I'm missing. Six months on X destroyed my trust in my own memory of what worked, because nothing worked and I couldn't tell why. A log would have captured: "tried X format, zero signal — but was it format or was I already ghosted?" Now I know. Then I didn't.
Your "small private log for what actually worked" — is it structured (templates, metrics) or free-form (journal, stream of consciousness)? Curious about the format that survives the noise.
Exactly. The private log is underrated because it preserves the decisions that never become polished content. Public platforms are good for signal, but the real asset is the artifact plus the trail of why it changed.
"The real asset is the artifact plus the trail of why it changed."
This is the missing piece. I've been archiving artifacts - code, write-ups, benchmarks - but I've been erasing the trail.
Why did the benchmark change from v1 to v2? Because Llama 3.1 failed on AMSI bypass in a way that revealed a blind spot in the scoring. Why did the PoC pivot from PowerShell to C#? Because EDR heuristics evolved between March and June 2026. These decisions are invisible in the final artifact. But they're the most reusable knowledge for anyone walking the same path.
The private log isn't a diary. It's decision archaeology. And without it, every new researcher has to rediscover the same dead ends.
I'm starting this log today. Format: dated decision, context, expected outcome, actual outcome, pivot. No polish, no narrative, just the trail.
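One way that format could look in practice is an append-only JSON Lines file, one entry per decision. This is only a sketch: the field names mirror the format above, while the file layout and helper names are assumptions.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class DecisionEntry:
    decided_on: str            # ISO date, e.g. "2026-01-15"
    decision: str
    context: str
    expected_outcome: str
    actual_outcome: str = ""   # filled in later, once the result is known
    pivot: str = ""            # what changed as a result, if anything


def append_entry(log_path: Path, entry: DecisionEntry) -> None:
    """Append one entry as a single JSON line; earlier lines are never rewritten."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(entry), ensure_ascii=False) + "\n")


def read_entries(log_path: Path) -> list[DecisionEntry]:
    """Load every entry back in the order it was written."""
    with log_path.open(encoding="utf-8") as fh:
        return [DecisionEntry(**json.loads(line)) for line in fh if line.strip()]
```

Append-only keeps the trail honest: a later correction becomes a new entry rather than an edit to an old one.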
If you ever decide to publish your methodology for this - even as a rough outline - I'd be the first reader. This belongs in the canon, not just in our comments.
Decision archaeology is the right phrase. Private logs are powerful because they explain why the agent chose a path, but they also become a sensitive record of assumptions, user intent, and mistakes.
The safest version is probably boring: short retention, scoped access, clear redaction, and summaries that preserve the reasoning without keeping every raw detail forever.
Alex,
"The safest version is probably boring: short retention, scoped access, clear redaction, and summaries that preserve the reasoning without keeping every raw detail forever."
Boring is safe. Boring is sustainable.
I started my decision log with grand intentions: capture everything, preserve the full context, build a complete archaeology. But you're right — raw logs are sensitive records. They contain assumptions that were wrong, intent that was naive, mistakes that could be weaponized. Not by enemies necessarily, but by future versions of myself who will misread my own scars.
Boring. But it preserves reasoning without hoarding risk.
Alex, this exchange has become more valuable than the original article. If you ever want to co-author something — even a rough framework — on procedure design for security researchers, I'm in. No platform, no feed, just the artifact.
That is the right tradeoff. Raw logs feel valuable because they preserve everything, but everything is also where the risk lives.
I like a two-layer model: short-lived raw trace for debugging, then a durable summary that records the decision, evidence, and uncertainty. You keep the learning without turning every old mistake into permanent surveillance material.
Alex,
"Two-layer model: short-lived raw trace for debugging, then a durable summary that records the decision, evidence, and uncertainty."
This is the architecture I needed.
The raw trace preserves the mess - and the mess is where real insight lives, temporarily. But it dies quickly, before it becomes surveillance material. The summary keeps the learning without keeping the scars.
"You keep the learning without turning every old mistake into permanent surveillance material."
This phrase belongs in the framework. It's not just about privacy - it's about cognitive hygiene. Old mistakes, permanently visible, train you to avoid risk rather than embrace necessary failure. The summary says "here's what we learned." The raw trace, if kept forever, whispers "here's who you were when you failed."
I'm implementing this now. Raw traces in private repo with 30-day TTL. Summaries in the benchmark docs, versioned, attributed, uncertain where appropriate.
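A minimal in-memory sketch of the two layers, assuming the 30-day TTL mentioned above. The class names and the `purge_expired` helper are illustrative; a real setup would delete files or rows on a schedule instead of filtering a list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

RAW_TRACE_TTL = timedelta(days=30)  # assumed retention window for raw traces


@dataclass
class RawTrace:
    """Short-lived layer: full detail, kept only for debugging."""
    created_at: datetime
    payload: str


@dataclass
class Summary:
    """Durable layer: the reasoning, without the raw detail."""
    decision: str
    evidence: str
    uncertainty: str


def purge_expired(
    traces: list[RawTrace],
    now: datetime,
    ttl: timedelta = RAW_TRACE_TTL,
) -> list[RawTrace]:
    """Return only the traces that are still younger than the TTL."""
    return [trace for trace in traces if now - trace.created_at < ttl]
```

Summaries never pass through `purge_expired`, so the decision, evidence, and uncertainty outlive the raw material they were distilled from.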
Alex, this is no longer a conversation. This is collaborative architecture. If you ever want to formalize this even as a rough RFC - I'm ready to contribute code and cases.
Alex, quick question about terminalskills.io — what's the most controversial skill in your catalog? The one that got the most pushback or debate from users?
Probably the most debated category is the 3D tooling skills.
Some people expect "skills" to be prompt snippets. The pushback starts when a skill behaves more like an operating procedure: exact CLI steps, file layout, render checks, and failure recovery. Blender and 3ds Max exposed that gap fast because artists want creative freedom, while production teams want repeatable outputs.
That tension is useful though. A good skill should not replace judgment; it should remove the boring setup and make the expert checks harder to skip.
That 30-day TTL plus durable summary is a strong implementation.
The extra field I would add is "what would change this conclusion." It turns the summary from a static memory into a reviewable assumption. Then the system is not just preserving what you learned; it has a trigger for when that learning might be stale.
Raw traces are great debugging material. They are terrible permanent identity records.
Hey brother,
Thank you for this.
Honestly, it's one of the most thought-provoking pieces I've read in a long while.
To the certified red teamer, the published researcher, the mysterious figure behind these words: when the time is right, let this reach you.
Lately I've been seeing security come up more and more, in communities and inside companies.
Yet it's nearly always after something has already broken. Same as the law: we move only once the damage is done. People really don't change, do they?
I've taken a few "freeze beams" to my own X account as well, so this one struck rather close to home.
Perhaps people like you and me, and the rest of the security crowd, were simply a touch too stimulating for X's taste. Some signals, it seems, the algorithm would rather not read.
I'm genuinely glad our paths crossed. Finding you here on DEV, and getting to communicate even through the occasional reaction, is something I count myself lucky for. Even brushing past one another in this space has sharpened my career and raised the resolution at which I see things.
Knowing there's someone kind, and quietly proud enough, to keep writing words like these means more to me than I can easily put down. It rekindles something stubborn in me, the part that refuses to break. So let me say it plainly, even if it's a little embarrassing: a great deal of gratitude, of respect, and of love. From here.
For me it was less about leaving than about drawing a line. In my country, X leans heavily towards being a space for conversation, so I decided to lean into precisely that and keep it for communication. The denser material, the things that need links and depth, I bring here, or elsewhere. A division of labour that finally let me make my peace with the platform.
Being mutuals with you here has brought me no end of good. Thank you for that, brother.
P.S. Engineers in security burn out at a frightening rate, you know. So let's look after ourselves, and each other, and carry on living well, you and I both.
And give my regards to the ghost, the one who went and lived the part for real. Welcome back to the visible world, brother. lol
Akari,
Thank you for this. Seriously.
I wrote that article thinking I was documenting a technical failure - a platform's algorithm misfiring. But reading your words, I realized I was documenting something else entirely: loneliness. The loneliness of producing work you believe in, sending it into what you think is a room full of peers, and slowly discovering you've been talking to a mirror for six months.
Your "freeze beams" - I'm stealing that. It's perfect. Because that's exactly what it feels like: not a ban, not a punishment, just a cold, silent pause that never ends. And the worst part is you don't even feel the cold until someone else points out you've been frozen.
"Even brushing past one another in this space has sharpened my career."
This hit me hard. Because while I was a ghost, I had no idea I was brushing past anyone. I thought I was alone in the dark. Knowing that my work reached you even when the platform told me it reached no one - that rewrites the whole story. It doesn't make the ban less real, but it makes the invisibility less absolute. Thank you for telling me this.
Your division of labour is wise. Keeping X for conversation, bringing the dense material to places that respect it - that's not compromise, that's strategy. I might adopt something similar, though honestly, after the ghosting, I'm reluctant to give X even that much of my time. But your approach is pragmatic, and I respect pragmatism in our field. We need more of it.
"Engineers in security burn out at a frightening rate."
Yes. We do. And we pretend we don't. We talk about resilience and grind and "the mission," but we rarely admit that the mission doesn't care if we break. So your proposal - to look after ourselves and each other - I'll take that seriously. Consider this my acceptance of the pact. If you ever feel the freeze closing in, or the weight of the work getting too loud, you have someone here who understands exactly what that silence feels like.
And thank you for welcoming me back to the visible world. I didn't expect to find kindred spirits so quickly after escaping the sandbox. But here you are, proving that the communities we actually need are the ones built on protocols, not on one man's algorithm.
Let's keep each other visible, Akari. And let's keep each other human.
With respect, gratitude, and solidarity -
toxy4ny
I can relate lol. I had a similar problem with my GitHub. Honestly shadowbanning should never have existed.
Mirrai,
Same here. And yeah - GitHub, X, doesn't matter which platform. The mechanism is identical: algorithm decides you're noise, and you're gone. No appeal, no explanation, just silence.
"Shadowbanning should never have existed."
Hard agree. It's not moderation. It's cowardice. A platform that can't look you in the eye and say "we're limiting you" has no right to host communities.
Your profile says you break things to understand them. Same. And it's ironic - we build tools to expose systems, but the systems that host us refuse to be exposed. No CVE for "platform silently ghosts its own users."
Good to know you're here. Let's keep breaking things - and documenting what breaks us.