Running Claude Code dangerously (safely)

(blog.emilburzo.com)

155 points | by emilburzo 3 hours ago

43 comments

  • snowmobile 10 minutes ago
    Bit of a wider discussion, but how do you all feel about the fact that you're letting a program use your computer to do whatever it wants without you knowing? I know right now LLMs aren't overly capable, but if you'd apply this same mindset to an AGI, you'd probably very quickly have some paperclip-maximizing issues where it starts hacking into other systems or similar. It's sort of akin to running experiments on contagious bacteria in your backyard, not really something your neighbors would appreciate.
    • devolving-dev 1 minute ago
      Don't you have the same issue when you hire an employee and give them access to your systems? If the AI seems capable of avoiding harm and motivated to avoid harm, then the risk of giving it access is probably not greater than the expected benefit. Employees are also trying to maximize paperclips in a sense, they want to make as much money as possible. So in that sense it seems that AI is actually more aligned with my goals than a potential employee.
    • deegles 3 minutes ago
      I run mine in a docker container and they get read only access to most things.
  • lucasluitjes 18 minutes ago
    > What you’re NOT protecting against:

    > a malicious AI trying to escape the VM (VM escape vulnerabilities exist, but they’re rare and require deliberate exploitation)

    No VM escape vulns necessary. A malicious AI could just add arbitrary code to your Vagrantfile and get host access the first time you run a vagrant command.

    If you're only worried about mistakes, Claude could decide to fix/improve something by adding a commit hook. If that contains a mistake, the mistake gets executed on your host the first time you git commit/push.

    (Yes, it's unpleasantly difficult to truly isolate dev environments without inconveniencing yourself.)
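
    One way to act on this is to review, from the host, every file in the shared directory that host-side tools will execute. Below is a minimal Python sketch. The function name `host_executed_files` is hypothetical, and it only covers the two cases named above (the Vagrantfile and Git hooks); it does not account for a custom `core.hooksPath` or other tools that run project files.

```python
from pathlib import Path


def host_executed_files(repo_root):
    """Return files in a shared repo that host-side tools may execute.

    Covers only the two cases discussed above: the Vagrantfile (evaluated
    by every `vagrant` command) and active Git hooks under .git/hooks.
    """
    root = Path(repo_root)
    found = []

    vagrantfile = root / "Vagrantfile"
    if vagrantfile.is_file():
        found.append(vagrantfile)

    hooks_dir = root / ".git" / "hooks"
    if hooks_dir.is_dir():
        for hook in sorted(hooks_dir.iterdir()):
            # Git ships inactive templates with a .sample suffix; skip those.
            if hook.is_file() and hook.suffix != ".sample":
                found.append(hook)
    return found
```

    Diffing that list (and the files' contents) against a known-good state before running `vagrant` or `git commit` on the host is the point; the listing alone proves nothing.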

  • nunez 20 minutes ago
    Vagrant is great for Claude!

    You can also use Lima, a lightweight VM control plane, as it natively works with qemu and Virtualization.Framework. (I think Vagrant does too; it's been a minute since I've tried.) This has traditionally been used for running container engines, but it's great for narrowly-scoped use cases like this.

    Just need to be careful about how the directory Claude is working with is shared. I copy my Git repo to a container volume to use with Claude (DinD is an issue unless you do something like what Kind did) and rsync my changes back and verify before pushing. This way, I don't have to worry if Claude decides to rewind the reflog or something.
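
    The "verify before pushing" step can be sketched in Python with the standard library's `filecmp.dircmp`. The function name `changed_paths` and the two directory arguments are assumptions for illustration, not part of the workflow described above. Note that `dircmp` ignores `.git` by default, so the sketch passes `ignore=[]` to make hook changes visible.

```python
import filecmp
from pathlib import Path


def changed_paths(trusted_dir, agent_dir):
    """List (kind, relative_path) pairs that differ between two trees."""
    changes = []

    def walk(cmp, prefix):
        for name in cmp.left_only:
            changes.append(("removed", (prefix / name).as_posix()))
        for name in cmp.right_only:
            changes.append(("added", (prefix / name).as_posix()))
        # dircmp compares shallowly: files with identical size and mtime
        # are treated as equal without reading their contents.
        for name in cmp.diff_files:
            changes.append(("modified", (prefix / name).as_posix()))
        for name, sub in cmp.subdirs.items():
            walk(sub, prefix / name)

    # ignore=[] overrides the default ignore list, which includes ".git".
    walk(filecmp.dircmp(trusted_dir, agent_dir, ignore=[]), Path("."))
    return sorted(changes)
```

    A directory that exists on only one side is reported once, as a single entry, rather than file by file.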

    • bonsai_spool 18 minutes ago
      How are you configuring Lima? Do you have any scripts you use to set up the environments or is this done ad hoc?
  • corv 1 hour ago
    I'm pursuing a different approach: instead of isolating where Claude runs, intercept what it wants to do.

    Shannot[0] captures intent before execution. Scripts run in a PyPy sandbox that intercepts all system calls - commands and file writes get logged but don't happen. You review in a TUI, approve what's safe, then it actually executes.

    The trade-off vs VMs: VMs let Claude do anything in isolation, Shannot lets Claude propose changes to your real system with human approval. Different use cases - VMs for agentic coding, whereas this is for "fix my server" tasks where you want the changes applied but reviewed first.

    There's MCP integration for Claude, remote execution via SSH, checkpoint/rollback for undoing mistakes.

    Feedback greatly appreciated!

    [0] https://github.com/corv89/shannot

    • horsawlarway 59 minutes ago
      I'm struggling to see how this resolves the problem the author has. I still think there's value in this approach, but it feels to be in the same thrust as the built in controls that already exist in claude code.

      The problem with this approach (unless I'm misunderstanding - entirely possible!) is that it still blocks the agent on the first need for approval.

      What I think most folks actually want (or at least what I want) is to allow the agent to explore a space, including exploring possible dead ends that require permissions/access, without stopping until the task is finished.

      So if the agent is trying to "fix a server" it might suggest installing or removing a package. That suggestion blocks future progress.

      Until a human comes in and says "yes - do it" or "no - try X instead" it will sit there doing nothing.

      If instead it can just proceed, observe that the package doesn't resolve the issue, and continue exploring other solutions immediately, you save a whole lot of time.

      • corv 38 minutes ago
        You're right that blocking on every operation would defeat the purpose! Shannot is able to auto-approve safe operations for this reason (e.g. read-only, immutable)

        So the agent can freely explore, check logs, list files, inspect service status. It only blocks when it wants to change something (install a package, write a config, restart a service).

        Also worth noting: Shannot operates on entire scripts, not individual commands. The agent writes a complete program, the sandbox captures everything it wants to do during a dry run, then you review the whole batch at once. Claude Code's built-in controls interrupt at each command whereas Shannot interrupts once per script with a full picture of intent.

        That said, you're pointing at a real limitation: if the fix genuinely requires a write to test a hypothesis, you're back to blocking. The agent can't speculatively install a package, observe it didn't help, and roll back autonomously.

        For that use case, the OP's VM approach is probably better. Shannot is more suited to cases where you want changes applied to the real system but reviewed first.

        Definitely food for thought though. A combined approach might be the right answer. VM/scratch space where the agent can freely test hypotheses, then human-in-the-loop to apply those conclusions to production systems.
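
        The review flow described here (read-only operations pass through, changes queue up for a single batch approval) can be illustrated with a toy Python model. This is not Shannot's API: the class `DryRun`, the operation names, and the `READ_ONLY` set are all hypothetical, and nothing is actually executed.

```python
from dataclasses import dataclass, field

# Hypothetical operation names that are considered safe to auto-approve.
READ_ONLY = {"read_file", "list_dir", "service_status"}


@dataclass
class DryRun:
    executed: list = field(default_factory=list)  # operations let through
    pending: list = field(default_factory=list)   # changes awaiting review

    def request(self, op, target):
        """Record an operation; only read-only ones pass immediately."""
        if op in READ_ONLY:
            self.executed.append((op, target))
        else:
            self.pending.append((op, target))

    def approve_all(self):
        """Release the whole queued batch after a human has reviewed it."""
        self.executed.extend(self.pending)
        self.pending.clear()
```

        The limitation raised above shows up directly in the model: anything the agent learns only after a write has to wait for `approve_all`.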

    • Retr0id 1 hour ago
      Very clever name!
      • corv 1 hour ago
        Thank you, good to know it landed :)
  • raesene9 2 hours ago
    Of course it depends on exactly what you're using Claude Code for, but if your use-case involves cloning repos and then running Claude Code on that repo, I would definitely recommend isolating it (same with other similar tools).

    There's a load of ways that a repository owner can get an LLM agent to execute code on users' machines, so it's not a good plan to let them run on your main laptop/desktop.

    Personally my approach has been to put all my agents in a dedicated VM and then provide them a scratch test server with nothing on it, for when they need to do something that requires bare metal.

    • intrasight 1 hour ago
      In what situations would it require bare metal?
      • raesene9 1 hour ago
        In my case I was using Claude Code to build a PoC of a firecracker backed virtualization solution, so bare metal was needed for nested virtualization support.
  • azuanrb 1 hour ago
    I just learned that you can run `claude setup-token` to generate a long-lived token. Then you can set it via `CLAUDE_CODE_OAUTH_TOKEN` as a reusable token. Pretty useful when I'm running it in an isolated environment.
  • marcelcor 7 minutes ago
    I'm a fan of https://e2b.dev/
  • replete 1 hour ago
    It's a practical approach, I used vagrant many years ago mostly successfully. I also explored the docker-in-docker situation recently while working on my own agentic devcontainer[0]- the tradeoffs are quite serious if you are building a secure sandbox! Data exfil is what worries me most, so I spent quite some time figuring out a decent self-contained interactive firewall. From a DX perspective, devcontainer-integrated IDEs are quite a convenient workflow, though docker has its frustrating behaviours

    [0]: https://github.com/replete/agentic-devcontainer

  • kernc 43 minutes ago
    Since everyone tends to present their own solution, I bid you mine:

        sandbox-run npx @anthropic-ai/claude-code
    
    This runs npx (...) transparently inside a Bubblewrap sandbox, exposing only the $PWD. Contrary to many other solutions, it is a few lines of pure POSIX shell.

    https://github.com/sandbox-utils/sandbox-run

    • corv 35 minutes ago
      I like the bubblewrap approach, it just happens to be Linux-only unfortunately. And once privileges are dropped for a process it doesn't appear to be possible to reinstate them.
      • kernc 20 minutes ago
        > Linux-only

        What other dev OSs are there?

        > once privileges are dropped [...] it doesn't appear to be possible to reinstate them

        I don't understand. If unprivileged code could easily re-elevate itself, privilege dropping would be meaningless ... If you need to communicate with the outside, you can do so via sockets (such as the bind-mounted X11 socket in one of the readme Examples).

  • mavam 2 hours ago
    For deploying Claude Code as an agent, Cloudflare is also an interesting option.

    I needed a way to run Claude marketplace agents via Discord. Problem: agents can execute code, hit APIs, touch the filesystem—the dangerous stuff. Can't do that in a Worker's 30s timeout.

    Solution: Worker handles Discord protocol (signature verification, deferred response) and queues the task. Cloudflare Sandbox picks it up with a 15min timeout and runs claude --agent plugin:agent in an isolated container. Discord threads store history, so everything stays stateless. Hono for routing.

    This was surprisingly little glue. And the Cloudflare MCP made it a breeze to debug (instead of headbanging against the dashboard). Still working on getting E2E latency down.

  • samlinnfer 2 hours ago
    Here is what I do: run a container in a folder that has my entire dev environment installed. No VMs needed.

    The only access the container has are the folders that are bind mounted from the host’s filesystem. The container gets network access from a transparent proxy.

    https://github.com/dogestreet/dev-container

    Much more usable than setting up a VM and you can share the same desktop environment as the host.

    • phrotoma 2 hours ago
      This works great for naked code, but it kinda becomes a PITA if you want to develop a containerized application. As soon as you ask your agent to start hacking on a dockerfile or some compose files you start needing a bunch of cockeyed hacks to do containers-in-containers. I found it to be much less complicated to just stuff the agent in a full fledged VM with nerdctl and let it rip.
    • sampullman 2 hours ago
      I did this for a while, it's pretty good but I occasionally came across dependencies that were difficult to install in containers, and other minor inconveniences.

      I ended up getting a mini-PC solely dedicated toward running agents in dangerous mode, it's refreshing to not have to think too much about sandboxing.

      • laborcontract 2 hours ago
        I totally agree with you. Running a cheapo mac mini with full permissions with fully tracked code and no other files of importance is so liberating. Pair that with tailscale, and being able to ssh/screen control at any time, as well as access my dev deployments remotely. :chefs kiss:
  • bob1029 1 hour ago
    My approach to safety at the moment is to mostly lean on alignment of the base model. At some point I hope we realize that the effectiveness of an agent is roughly proportional to how much damage it could cause.

    I currently apply the same strategy we use in case of the senior developer or CTO going off the deep end. Snapshots of VMs, PITR for databases and file shares, locked down master branches, etc.

    I wouldn't spend a bunch of energy inventing an entirely new kind of prison for these agents. I would focus on the same mitigation strategies that could address a malicious human developer. Virtual box on a sensitive host another human is using is not how you'd go about it. Giving the developer a cheap cloud VM or physical host they can completely own is more typical. Locking down at the network is one of the simplest and most effective methods.

  • smallerfish 1 hour ago
    I've been working on a TUI to make bubblewrap more convenient to use: https://github.com/reubenfirmin/bubblewrap-tui

    I'm working on targeting both the curl|bash pattern and coding agents with this (via smart out of the box profiles). Early stages but functional. Feedback and bug reports would be appreciated.

  • crabmusket 2 hours ago
    What is the consensus on Claude Code's built-in sandboxing?

    https://code.claude.com/docs/en/sandboxing#sandboxing

    > Claude Code includes an intentional escape hatch mechanism that allows commands to run outside the sandbox when necessary. When a command fails due to sandbox restrictions (such as network connectivity issues or incompatible tools), Claude is prompted to analyze the failure and may retry the command with the dangerouslyDisableSandbox parameter.

    The ability for the agent itself to decide to disable the sandbox seems like a flaw. But do I understand correctly that this would cause a pause to ask for the user's approval?

  • fwystup 55 minutes ago
    I'm currently building a Docker dev environment for VSCode (github.com/dg1001/xaresaicoder) usable in a browser and hit the same issue. Without docker-in-docker it works well - I was even able to add a transparent proxy in the Docker network to restrict outbound traffic and log all LLM calls (pretty nice in order to document your project). For docker-in-docker development and better security isolation, I'm considering Kata Containers instead of Vagrant, which would give me real VM-level isolation with minimal perf overhead while still being able to use my docker stuff. Still on my TODO list though. Has anyone actually run Kata with vs code server? Curious about real-world quirks - I've read that storage snapshot performance can be rough.
  • Strongbad536 45 minutes ago
    i've low-key been running claude in dangerously skip permissions mode for at least like 4 months now and have yet to be bitten by a truly destructive action. YMMV but i think as long as you're guiding/prompting correctly, and don't just allow write access to your prod account DBs willy nilly, it's mostly fine. just keep an eye on it :shrug:
    • nonethewiser 40 minutes ago
      Also something to note, this mode simply adds a new mode alongside accept edits, plan, nothing, dangerously skip permissions. You can choose when to use it or not, which is not something I initially realized.
  • riadsila 2 hours ago
    Koyeb has great resources about running Claude Code in sandboxes: https://www.koyeb.com/tutorials/use-claude-agent-sdk-with-ko...
    • mavam 1 hour ago
      What's the startup latency? How long do I have to wait until Claude is operational?
  • sandGorgon 1 hour ago
    Or...use wsl2 in windows. does the same thing - much much faster.

    Windows is the best (sandboxed) linux

    • strickjb9 1 hour ago
      Real question - are you not worried about access to /mnt/c ?
      • kachapopopow 47 minutes ago
        sudo chmod 700 /mnt/

        sudo chown $UID /mnt/<project_path>

        ...done?

  • loloquwowndueo 2 hours ago
    Shellbox.dev and sprites.dev were discussed recently on hacker news, they give you a sandbox machine where it’s likely safe to run coding agents in dangerous mode. Filesystem checkpoint and restore make it easy to recover from even catastrophic mistakes.
    • thruflo 31 minutes ago
      I made a little tool for Ralphing on Sprites: https://github.com/thruflo/wisp

      I’ve found the sprites just work for claude. Pull down a repo (or repos) and run dangerously.

    • gcr 2 hours ago
      What about API calls? What about GitHub trusted CI deploys?

      One frustrating thing about these solutions is that they’re great to prevent Claude from breaking a machine, but there’s no pervasive sandbox for third party services

      • jermaustin1 1 hour ago
        Rollback? Its the same as all dev work. Use a dev endpoint for APIs, and thankfully git is a great tool to undo fuckups.
      • loloquwowndueo 2 hours ago
        What about them?
  • FourSigma 1 hour ago
    I've been exploring this space. There are some use cases where I'd love to run an isolated Claude agent asynchronously. I think running Docker in rootless mode might solve some of the OP's concerns; I believe Podman does this implicitly. Also, there are tools like Kaniko that do not need Docker to create container images. You can also try changing the underlying container runtime to something like gVisor if you want more security.

    Does anybody have experience using microVMs (Firecracker, Kata Containers, etc.) for this use case? Would love to hear your thoughts.

  • danmaz74 58 minutes ago
    I'm using devcontainers for this, and I'm finding that a very good solution (coupled with VSCode).
  • clbrmbr 2 hours ago
    I have been running two or three Claudes on bare metal with dangerously skip permissions all day every day for two months now. It’s absolutely liberating.
    • Gazoche 2 hours ago
      Until it decides to delete your home directory: https://old.reddit.com/r/ClaudeAI/comments/1pgxckk/claude_cl...
      • giancarlostoro 51 minutes ago
        This could be avoided by aliasing rm to something else that stops you from deleting stupid things like your entire home directory / partition root.
        • icedchai 1 minute ago
          What if the LLM detects this, and chooses to run /bin/rm directly? Or worse, writes a program that calls unlink.
      • pixl97 2 hours ago
        You're not running it on a filesystem that takes snapshots and is easily reversible?
        • giancarlostoro 54 minutes ago
          Many moons ago, I accidentally rm -rf'd the wrong directory with all my code inside: poof, gone. I still had PyCharm open, so I checked its built-in version tracker and lo and behold, there was my code as it was before I rm -rf'ed it. I believe Claude has ways to undo file changes, but something like rm is just outside of its scope.
        • coldtea 2 hours ago
          All 1 of them?
      • esperent 2 hours ago
        You can use the /hookify plugin to add hooks for preventing dangerous commands like this.
        • Gazoche 2 hours ago
          https://github.com/anthropics/claude-code/tree/main/plugins/...

          So it's basically adding "don't delete my files pretty please" to the prompt?

          EDIT: I misread, the natural language description of the rule is just a shortcut to generate the actual rule which is based on regexp patterns.

          Still, it only protects you against very specific commands. Won't help you if the LLM decides to fill your disk with `cat /dev/urandom > foo` for example.
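
          The gap is easy to see with a toy regex denylist in Python. This is not hookify's rule format; the pattern and the function name `is_blocked` are hypothetical, chosen only to show which commands a pattern-based rule catches and which it misses.

```python
import re

# Hypothetical rule: block `rm` invoked with a short flag group containing "r".
DENY_PATTERNS = [re.compile(r"\brm\s+-[A-Za-z]*r")]


def is_blocked(command):
    """Return True if any deny pattern matches the command string."""
    return any(p.search(command) for p in DENY_PATTERNS)


for cmd in [
    "rm -rf ~",                    # caught
    "/bin/rm -fr /tmp/x",          # caught: the pattern ignores the path prefix
    "rm --recursive --force x",    # missed: long options
    "find . -delete",              # missed: different tool
    "cat /dev/urandom > foo",      # missed: fills the disk, deletes nothing
]:
    print(f"{'BLOCK' if is_blocked(cmd) else 'allow'}  {cmd}")
```

          Each miss can be patched with another pattern, but the list of destructive commands is open-ended, which is why a denylist complements isolation rather than replacing it.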

          • simianwords 1 hour ago
            it may not protect against an adversarial llm
    • croes 2 hours ago
      I have been driving without a seat belt for two months now. It’s absolutely liberating.
      • InsideOutSanta 4 minutes ago
        I have been skydiving without a parachute for 23 seconds now. It's absolutely liberating.
    • coldtea 2 hours ago
      And that's as a dev. Then we expect users to know better than e.g. to trust links to .sh style installers some FOSS suggests...
    • sixhobbits 2 hours ago
      same, it's made a couple of damaging mistakes but so far it has a better track record than me in terms of fat-fingering `rm` commands or what have you
  • woof 45 minutes ago
    sandbox-exec on MacOS (ie. https://github.com/neko-kai/claude-code-sandbox) seems like the perfect solution to me.

    Missing FreeBSD jails in 2026 is kind of weird (hello 1999)...

  • jackcarter 37 minutes ago
    "At some point I realized that rather than do something else until it finishes, I would constantly check on it to see if it was asking for yet another permission, which felt like it was missing the point of having an agent do stuff"

    Why don't Claude Code & other AI agents offer an option to make a sound or trigger a system notification whenever they prompt for approval? I've looked into setting this up, and it seems like I'd have to wire up a script that scrapes terminal output for an approval request. Codex has had a feature request open for a while: https://github.com/openai/codex/issues/3052

    • AndroidKitKat 32 minutes ago
      When using Claude Code in Ghostty on macOS, I get notifications if it is waiting on my input (accept changes, questionnaire, run bash command). Dunno what combination (if any) of my setup is needed for this to happen, but I certainly didn't configure anything special. Maybe I'm giving CC too much free rein to do things.
  • denysvitali 2 hours ago
    Here's what I do (shameless plug): https://blog.denv.it/posts/im-happy-engineer-now/

    This allows you to use Claude Code from your mobile device, in a safe environment (restricted Kubernetes pod)

    • jeffrallen 2 hours ago
      Here's what I do (shameless plug, not an employee, just a satisfied user): https://exe.dev
      • denysvitali 2 hours ago
        Yes, this approach also looked nice! Maybe you can pair both (happy + exe.dev) for best results
  • csantini 1 hour ago
    Just create a new user and setup pip/npm to install locally.

    And setup an .env for the project with user/password to access only a dev database.

  • frankc 2 hours ago
    I think this makes sense but I wonder if firecracker would work better than vagrant for this? I haven't used it before, though. I guess it might if you are trying to run gas town level orchestration.
    • raesene9 2 hours ago
      Firecracker can solve the kind of problems where you want more isolation than Docker provides, and it's pretty performant.

      There's not a tonne of tooling for that use case now, although it's not too hard to put together. I vibe-coded something that works for my use case fairly quickly (CC + Opus 4.5 seemed to understand what's needed).

  • letmetweakit 3 hours ago
    I run Claude in a Proxmox VM, generally the experience has been great. In my experience it also behaves better than gemini cli, that likes to create files all over the place if set loose (lesson learned to add that requirement to the relevant .md files)
    • chrisss395 2 minutes ago
      I too use this solution, using both Ubuntu LXCs and full-fledged VMs. Only issue I've struggled with has been losing SSH connection on the LXC, and tmux and session both seem to mess up the terminal formatting in CC.

      I do agree with the security / cautionary comments and wouldn't leverage this setup outside a hacked together homelab.

    • vidarh 3 hours ago
      Something that contains Claude even more in this respect is if you explicitly give it a directory that you tell it is entirely under its control, and tell it to write md files and other intermediate work products there (and this seems to work better than telling it where it isn't allowed to leave things).
      • onionisafruit 1 hour ago
        That sounds like a good idea. When I have a one-off need for misc files I tell it to put them in the project’s ./tmp because that’s already in my global gitignore. That generally works, but I still run into surprise files it leaves in source dirs like a puppy leaves turds on a rug. I’ll try adding that to my instructions instead of doing it one-off.
      • jermaustin1 1 hour ago
        I've often found that LLMs don't listen to "Don't do" commands with anywhere near the same gusto as "Do" commands.
        • NitpickLawyer 54 minutes ago
          People don't usually think about pink elephants, unless you ask them not to think about pink elephants :)
    • emilburzo 2 hours ago
      This was also the direction I was initially headed, but then I realized I wanted one-VM-per-project so it can really do anything it wants on the complete VM. So the blast-from-the-past-Vagrant won because of the Vagrantfile + `vagrant up` easiness.
      • letmetweakit 2 hours ago
        I use Proxmox snapshots to get back to a clean state. I’ll take a look at Vagrant too though.
    • scalemaxx 2 hours ago
      I installed Gemini as an extension in VS Code and it kept wanting to index all my files. Still trying to figure out what it was doing outside of the VS Code folder I had set it to work on.
  • RobinL 2 hours ago
    Does anyone have direct experience with Claude making damaging mistakes in dangerously skip permissions mode? It'd be great to have a sense of what the real world risk is.
    • prodigycorp 2 hours ago
      Claude is very happy to wipe remote dbs, particularly if you're using something like supabase's mcp server. Sometimes it goes down rabbitholes and tries to clean itself up with `rm -rf`.

      There is definitely a real world risk. You should browse the ai coding subreddits. The regularity of `rm -rf` disasters is, sadly, a great source of entertainment for me.

      I once was playing around, having Claude Code (Agent A) control another instance of Claude Code (Agent B) within a tmux session using tmux's scripting. Within that session, I messed around with Agent B to make it output text that made Agent A think Agent B had rm -rf'd the entire codebase. It was such a stupid "prank", but seeing Agent A's frantic and worried reaction to Agent B's mistake was the loudest and only time I've laughed because of an LLM.

      • gregoriol 2 hours ago
        Why in the hell would it be able to access a _remote_ database?! In no acceptable dev environment would someone be able to access that.
        • heartbreak 1 hour ago
          Everywhere I’ve ever worked, there was always some way to access a production system even if it required multiple approvals and short-lived credentials for something like AWS SSM. If the user has access, the agent has access, no matter how briefly.
          • gregoriol 1 hour ago
            Not if you require auth with a Yubikey, not if you run the LLM client inside a VM which doesn't have your private ssh key, ...
        • prodigycorp 1 hour ago
          Supabase virtually encouraged it last year haha. I tried using it once and noped out after using it for an hour, when claude tried to do a bunch of migrations on prod instead of dev.

          https://web.archive.org/web/20250622161053/https://supabase....

          Now, there are some actual warnings. https://supabase.com/docs/guides/getting-started/mcp#securit...

        • kaydub 21 minutes ago
          I think LLMs are exposing how slapdash many people work when building software.
    • kaydub 23 minutes ago
      It feels like most people are exposing how wild west their environments are.
    • azuanrb 2 hours ago
      One recent example. For some reason, Claude recently prefers to write scripts in the root /tmp folder. I don't like this behavior at all. It's nothing destructive, but it should be out of scope by default. I notice they keep adding more safeguards, which is great, e.g. asking for permissions, but it seems to be case by case.
      • giancarlostoro 1 hour ago
        If you're not using .claude/instructions.md yet, I highly recommend it; for moments like this one you can tell it where to shove scripts. The tricky part with the instructions file is that Claude only reads it during a new prompt, so any time you update it, or Claude "forgets" instructions, ask it to re-read it. That usually does the trick for me.
        • mythical_39 35 minutes ago
          Claude, I noticed you rm -rf my entire system. Your .instructions.md file specifically prohibits this. Please re-read your .instructions.md file and comply with it for all further work
    • coldtea 2 hours ago
    • ra120271 2 hours ago
      When approving actions "for this project" I actively monitor .claude\settings.local.json

      as

      "Bash(az resource:)",

      is much more permissive than

      "Bash(az resource show:)",

      It mostly gets it right but I instantly fix the file with the "readonly" version when it gets it too open.
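
      Why the shorter rule is more permissive can be shown with a toy prefix matcher in Python. This assumes the text before the colon is treated as a command prefix; the function `allowed` is a hypothetical illustration, not Claude Code's actual matching logic.

```python
def allowed(command, prefixes):
    """Return True if the command starts with any allowed word prefix."""
    words = command.split()
    for prefix in prefixes:
        prefix_words = prefix.split()
        # Compare whole words so "az resource" does not match "az resourcegroup".
        if words[:len(prefix_words)] == prefix_words:
            return True
    return False


broad = ["az resource"]        # also permits delete, update, move, ...
narrow = ["az resource show"]  # read-only subcommand only

print(allowed("az resource delete --ids X", broad))   # True
print(allowed("az resource delete --ids X", narrow))  # False
print(allowed("az resource show --ids X", narrow))    # True
```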

    • MattGaiser 2 hours ago
      Claude has twice now thought that deleting the database is the right thing to do. It didn't matter as it was local and one created with fixtures in the Docker container (in anticipation of such a scenario), but it was an inappropriate way of handling Django migration issues.
  • skybrian 3 hours ago
    I'm doing this with a remote VM on exe.dev and it's quite nice. Well, actually with their own coding agent but they have Claude Code preinstalled too.

    Syncthing works well for getting a local copy of a directory from the VM.

  • mhb 1 hour ago
    Forgive a naive question, but why not run it on an AWS (or equivalent) instance?
  • tobyhinloopen 3 hours ago
    How about running Claude as a different user with very limited permissions?
    • gregoriol 3 hours ago
      This breaks the non-interactive mode the post wants to achieve. Claude will not be able to install some things and will require user action, which is not desired here.
      • progval 2 hours ago
        Like what? It can already use npm/pip/etc. And if it needs a new APT package or config in /etc/ then you would want to know because you need to document it.
        • gregoriol 2 hours ago
          If you make claude work with c/c++, it may need apt for libraries or build tools.

          Even with npm/pip, these may not be available on a base linux box.

          Even then, some complex projects may need other tools that are not part of a base system (command line tools, redis, ...).

    • emilburzo 2 hours ago
      I tried this approach for a while, but I really wanted it to be able to do anything (install system packages, build/run Docker containers, the works).

      With these powers there's a lot less back-and-forth with me running commands, copying the output, pasting it to Claude, etc.

      I'm sure you've had the case where you had to instruct someone to do something (e.g. playing tech support with family, helping another engineer, etc). While it helps the other person learn, it feels soooo slow vs just doing it yourself :) And since I don't have to teach the agent, I think this approach makes sense.

    • delaminator 3 hours ago
      I run it with sudo enabled - true story

      just give it its own machine and let it check out any code

      I PXE boot it from a known image when I feel the need

      • zh3 39 minutes ago
        Same solution here - keep a base diskless image on the server, copy it to the diskless area, pxeboot the machine. Works for Windows too (iscsi).

        Could do the same thing on EC2 of course.

      • tobyhinloopen 2 hours ago
        Running it remotely on a VM seems like a very sensible option. Just don't give it permission to nuke the remote repository hah (EG don't allow force-push, use protected branches, only allow write access to branches it created)
  • cyberpunk 1 hour ago
    docker sandbox run claude? seems to work for me…
  • Retr0id 2 hours ago
    > VirtualBox 7.2.4 shipped with a regression that causes high CPU usage on idle guests. What are the odds.

    I have such a love/hate relationship with VirtualBox. It's so useful but so buggy. My current installation has a bug that causes high network latency, but I'm afraid to upgrade in case it introduces new, worse bugs.

    VMware is a million times better, but it is also Proprietary™

    • intrasight 1 hour ago
      VMware Workstation is now free on Linux and Windows, and allows you to create and roll back snapshots. Why not use it even if it's proprietary?
      • Retr0id 1 hour ago
        It's a good question and I'm pretty on the fence about it, and next time I'm reinstalling things I might switch.

        I do believe in the whole RMS "respects the user's freedoms" spiel, so all things being equal I prefer FOSS, even if it's worse - but there are limits.

  • szmarczak 1 hour ago
    What about Docker rootless?
  • firasd 2 hours ago
    I noticed something in Claude across all product surfaces

    There's a bug in that it can't output smart quotes “like this”

    Sonnet, Opus et al think they output it but something in the pipeline is rewriting it

    https://github.com/firasd/vibesbench/blob/main/docs/2026/A/t...

    Try it in Claude Code and you'll see what I mean! Very weird

  • supermatt 1 hour ago
    > now you need Docker-in-Docker

    Or you can just mount the socket and call docker from within docker.

    • emilburzo 49 minutes ago
      Correct, which I wanted to avoid because:

      > Mounting the Docker socket grants the agent full access to your Docker daemon, which has root-level privileges on your system. The agent can start or stop any container, access volumes, and potentially escape the sandbox. Only use this option when you fully trust the code the agent is working with.

      https://docs.docker.com/ai/sandboxes/advanced-config/#giving...

  • oofbey 12 minutes ago
    There are two spheres of influence you need to consider: the local machine/VM/container that the agent is running in, and the effect the agent can have on the outside world using the auth tokens, SSH keys, or APIs that it has access to. This article largely deals with the first problem and ignores the second.

    You can have the local environment completely isolated with Vagrant. But if you’re not careful with auth tokens, it can (and eventually will, when it gets confused) go wipe the shared dev database or the GitHub repo. The author kinda acknowledges this, but it’s glossing over a big chunk of the problem. If it can push to GitHub, then unless you’ve set up your tokens carefully it can delete things too. Having a local isolated test database separate from the shared infrastructure is a matter of a mature dev environment, which is a completely separate thing from how you run Claude. Two of the three examples cited as “no, no, no” are not protected by Vagrant or Docker or even EC2. What matters is which tokens the agent has and needs.
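
    One way to make the second sphere visible is to check which credentials the agent's environment actually exposes before starting it. The sketch below is a rough heuristic, not a complete audit: the function name `list_credential_vars` and the name patterns are assumptions, and it only looks at environment variable names, not at SSH keys or config files on disk.

    ```shell
    # Hypothetical helper: print names of environment variables that look like credentials.
    # Heuristic only: matches on the variable name, never prints the value.
    list_credential_vars() {
      env | cut -s -d= -f1 | grep -Ei '(token|secret|passw|api_?key|credential)' | sort
    }
    ```

    Running `list_credential_vars` inside the VM or container, as the user the agent runs as, shows what it would inherit; an empty result does not prove nothing is reachable.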

  • athrowaway3z 2 hours ago
    `useradd claude`
  • ompogUe 6 days ago
    Keep in mind with Vagrant: if you use a synced_folder on your host as the source folder in the VM, changes made inside the VM also modify those files on the host.
    • gregoriol 3 hours ago
      If the folder is versioned and committed regularly, there is no problem. It also allows you to open the files in your IDE and do some other tasks or fixes for Claude. It prevents Claude from accessing any other folder, which is the idea of the post.
      • gcr 2 hours ago
        I’ve seen Claude rm .git on rare occasions to “fix rebase hiccups”

        Version control ain’t a match for a good backup

        • gregoriol 2 hours ago
          So? if it removes .git, just clone the project again and you are ok
      • fragmede 44 minutes ago
        Until Claude nukes .git, assuming you're using git as the version/commit store. The solution is easy: just push to a remote on a reasonable cadence (one that you can run reflog on, so a force push won't eat your data either). Git isn't backup though, it's a VCS, and those are two different things, even if they are somewhat alike.
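
        A minimal sketch of the "push to a remote on a reasonable cadence" suggestion, assuming a remote named `backup` has already been added to the repository. The function name `push_backup` and the remote name are hypothetical.

        ```shell
        # Hypothetical helper: push every local branch to a backup remote.
        # Intended to be run on a schedule (cron, a systemd timer, or by hand).
        push_backup() {
          repo_dir="$1"
          backup_remote="${2:-backup}"
          git -C "$repo_dir" push --quiet "$backup_remote" --all
        }
        ```

        If `.git` is later deleted, cloning the backup remote restores everything that was pushed; uncommitted work is still lost. A bare remote keeps no reflog by default, so setting `core.logAllRefUpdates` to `true` on it is needed before a reflog can help after a force push.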
    • emilburzo 6 days ago
      Good point. For me, that was intentional, since all my projects are in git I don't care if it messes something up. Then you get the benefit of being able to use your regular git tooling/flows/whatever, without having to add credentials to the VM.

      But if you need something more strict, 'config.vm.synced_folder' also supports 'type: "rsync"', which copies the source folder to the VM at startup, but then it's on you to sync changes back or whatever.

      • ompogUe 6 days ago
        I like this workflow a lot, actually. Docker is great and all, but depending on the project, Vagrant helps "keep it simple".

        Thanks
