Security
Last updated March 26, 2026
Our approach
Security is central to what DeepGym does. We execute untrusted code generated by AI models, which means isolation and containment are not afterthoughts. They are the product.
Sandbox isolation
All code execution in production mode runs inside Daytona containers with full OS-level isolation. Each execution gets its own container with restricted network access; resource limits on CPU, memory, and disk; and no access to the host filesystem. Containers are destroyed after each execution.
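The per-execution lifecycle described above (provision an isolated container with limits, run, then destroy it unconditionally) can be sketched in Python. This is an illustrative sketch only; the names and the dict-based sandbox are hypothetical and are not the Daytona SDK.

```python
import uuid
from contextlib import contextmanager

@contextmanager
def ephemeral_sandbox(cpu_cores: int, memory_mb: int, disk_mb: int):
    """Illustrative lifecycle only -- not the Daytona SDK.

    A fresh sandbox is provisioned per execution and is always
    torn down afterwards, even if the execution raises.
    """
    sandbox = {
        "id": uuid.uuid4().hex,  # unique per execution, never reused
        "limits": {"cpu": cpu_cores, "memory_mb": memory_mb, "disk_mb": disk_mb},
        "network": "restricted",
        "destroyed": False,
    }
    try:
        yield sandbox
    finally:
        sandbox["destroyed"] = True  # container destroyed unconditionally

# Usage: the sandbox never outlives the block that created it.
with ephemeral_sandbox(cpu_cores=1, memory_mb=512, disk_mb=1024) as sb:
    run_id = sb["id"]
```

The context-manager shape captures the key guarantee: teardown happens in `finally`, so a crashing execution cannot leave a live container behind.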
Local execution mode
The library supports a local execution mode for development and testing. This mode runs code as a subprocess on your machine without container isolation. It is intended for trusted code only and is clearly marked as such in the documentation and CLI output.
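As a rough sketch of what "subprocess without container isolation" means, the following runs Python source in a child process with only a timeout as a safeguard. The function name `run_local` is hypothetical; the point is that nothing here restricts filesystem or network access, which is why this mode is for trusted code only.

```python
import subprocess
import sys

def run_local(code: str, timeout: float = 10.0) -> str:
    """Run Python source as a subprocess. NO isolation: trusted code only."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,  # kills runaway code, but does NOT sandbox it
    )
    return result.stdout

output = run_local("print(2 + 2)")
```

Unlike the container path, the child process inherits your user's permissions, so it can read and write anything you can.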
Adversarial testing
DeepGym includes built-in adversarial testing capabilities that probe environments for reward hacking vulnerabilities. This includes detection of empty solutions, hardcoded outputs, pattern exploits, and RL-based exploit discovery. We use these tools on our own built-in environments.
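Two of the simpler checks mentioned above can be sketched directly. These functions are illustrative, not the library's API: an empty-solution check looks for source with no executable statements, and a hardcoding check flags a solution whose output never varies across distinct probe inputs.

```python
def is_empty_solution(source: str) -> bool:
    """Flag a submission containing only blank lines and comments."""
    lines = [line.strip() for line in source.splitlines()]
    return not any(line and not line.startswith("#") for line in lines)

def looks_hardcoded(solve, probe_inputs) -> bool:
    """Flag a solution that returns the same output for every probe input."""
    outputs = {repr(solve(x)) for x in probe_inputs}
    return len(probe_inputs) > 1 and len(outputs) == 1
```

A constant function like `lambda x: 42` trips the hardcoding check, while a genuine mapping such as `lambda x: x * 2` does not. Real pattern exploits and RL-based discovery are far more involved; these sketches only show the shape of the cheapest checks.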
API authentication
The DeepGym API server requires API key authentication in production. Keys are passed via the DEEPGYM_API_KEY environment variable. Development mode can optionally disable authentication for local testing.
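Reading the key from the environment keeps it out of source code and shell history. A minimal sketch of a client-side helper follows; the DEEPGYM_API_KEY variable name comes from the text above, but the `auth_headers` function and the Bearer scheme are assumptions for illustration, not documented API details.

```python
import os

def auth_headers() -> dict:
    """Build request headers from the DEEPGYM_API_KEY environment variable.

    NOTE: the Bearer scheme shown here is an assumption; check the API
    docs for the actual header format.
    """
    key = os.environ.get("DEEPGYM_API_KEY")
    if not key:
        raise RuntimeError("DEEPGYM_API_KEY is not set")
    return {"Authorization": f"Bearer {key}"}
```

Such a headers dict could be passed to an httpx client (httpx is already a core dependency) when calling the API server.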
Dependencies
We keep our dependency surface small. The core library requires only pydantic and httpx. Optional dependencies for specific integrations are clearly separated. We monitor for known vulnerabilities in our dependency tree.
Reporting vulnerabilities
If you discover a security vulnerability in DeepGym, please report it responsibly by emailing contact@deepgym.io with the subject line "Security". We will acknowledge your report within 48 hours and work with you to understand and address the issue. Please do not open a public GitHub issue for security vulnerabilities.
Scope
This security page covers the DeepGym library and the deepgym.io website. Third-party services we integrate with, such as HuggingFace Hub, Daytona, and lm-eval, have their own security practices and policies.