Emerging | 1d ago | SafeClawBench: New Benchmark Separates Semantic Acceptance from Actual Harm in LLM Agent Security | Technology | Science | 66%