A new class of “Model Denial of Service” (MDoS) attacks is targeting companies with public-facing LLM APIs. Attackers send specially crafted, computationally expensive prompts (such as complex recursive logic puzzles) that force the model to consume large amounts of GPU time, driving up cloud bills and causing service outages.
Business Impact
This is “financial DDoS.” Instead of crashing a server with traffic volume, attackers are exhausting the *budget* and *compute capacity* of the target. One startup reported a $200,000 cloud bill spike in 24 hours due to a sustained MDoS attack.
Why It Happened
Standard rate limiting (requests per minute) fails to stop MDoS because a single complex prompt counts as one request but can tie up a GPU for minutes. Attackers are exploiting the gap between API metering and actual compute cost.
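To make the gap concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it is a hypothetical assumption chosen for illustration (the 60 requests-per-minute limit, the 0.5 GPU-seconds for an ordinary prompt, the 120 GPU-seconds for a crafted one); none comes from a real incident.

```python
# All figures below are hypothetical, for illustration only.
REQUESTS_PER_MINUTE = 60      # assumed per-client request limit
TYPICAL_GPU_SECONDS = 0.5     # assumed cost of an ordinary prompt
CRAFTED_GPU_SECONDS = 120.0   # assumed cost of a crafted prompt (minutes of GPU time)


def gpu_seconds_admitted(requests: int, gpu_seconds_each: float) -> float:
    """GPU time a request-count limiter lets through in one window."""
    return requests * gpu_seconds_each


typical = gpu_seconds_admitted(REQUESTS_PER_MINUTE, TYPICAL_GPU_SECONDS)
crafted = gpu_seconds_admitted(REQUESTS_PER_MINUTE, CRAFTED_GPU_SECONDS)

# Same request count, very different compute bill.
amplification = crafted / typical
print(f"typical: {typical} GPU-s, crafted: {crafted} GPU-s, x{amplification:.0f}")
```

Under these assumed numbers the limiter sees two identical clients, yet one of them consumes 240 times more GPU time per minute than the other.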
Recommended Executive Action
Update your AI API rate limits to meter compute cost, measured as “token complexity” (tokens processed) or compute time, not just request count. Implement “early exit” mechanisms that terminate processing once a prompt exceeds a set computational threshold, such as a cap on output tokens or on wall-clock time.
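For engineering teams, both recommendations can be sketched in a few lines of Python. This is a minimal illustration, not a production design: `ComputeBudget` and `generate_with_early_exit` are hypothetical names, the cost unit (tokens or GPU-milliseconds) is an assumption you would pick for your own stack, and the token stream stands in for whatever streaming interface your model server exposes.

```python
import time


class ComputeBudget:
    """Token bucket that meters cost units (e.g. tokens or GPU-ms), not requests."""

    def __init__(self, capacity, refill_per_sec, now=None):
        self.capacity = capacity              # max units a client can hold
        self.refill_per_sec = refill_per_sec  # units restored per second
        self.level = capacity
        self.updated = time.monotonic() if now is None else now

    def try_charge(self, cost, now=None):
        """Deduct `cost` units if available; return False to reject the request."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.level = min(self.capacity, self.level + elapsed * self.refill_per_sec)
        self.updated = now
        if cost > self.level:
            return False
        self.level -= cost
        return True


def generate_with_early_exit(token_stream, max_tokens, max_seconds,
                             clock=time.monotonic):
    """Consume a token iterator, stopping before more work is requested
    once either the token cap or the wall-clock cap is reached."""
    start = clock()
    out = []
    it = iter(token_stream)
    while True:
        if len(out) >= max_tokens:
            return out, "token_limit"
        if clock() - start >= max_seconds:
            return out, "time_limit"
        try:
            tok = next(it)
        except StopIteration:
            return out, "completed"
        out.append(tok)


# Usage: charge the estimated cost up front, then cap the generation itself.
budget = ComputeBudget(capacity=1000, refill_per_sec=10)
if budget.try_charge(cost=800):
    tokens, reason = generate_with_early_exit(range(10**9), max_tokens=5,
                                              max_seconds=30)
```

The two pieces cover different failure modes: the budget stops a client from submitting many expensive prompts, while the early exit bounds the damage any single prompt can do even when its cost was underestimated.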
Hashtags: #AI #LLM #DDoS #CloudSecurity #FinOps #CyberAttack #MachineLearning #InfoSec
