Why Ctrl+V won't paste images in Claude Code on WSL, with a fix

(rajveerbachkaniwala.com)

54 points | by rajveerb 3 days ago

17 comments

  • albert_e 18 hours ago
    Why are we emotionally tied to command line interfaces

    Desktop apps are a second class citizen that do not get feature parity

    Lot of actions on Claude Code seem much more suited for a thoughtfully designed GUI

    Even the chat responses and links therein can benefit from judicious use of rich text and formatting and real hyperlinks to other parts of the UI or elsewhere

    Favourite Skills can be toolbar buttons or menus if user so wishes.

    • davkan 13 hours ago
      For one I’m not sure if I’ve ever gotten an ad in my terminal or in a tui. That alone is probably enough. And it’s so much harder to ruin the terminal experience compared to a desktop app. They don’t get pointless redesigns. I can customize them however I want. The terminal is like an oasis in the current climate. And that’s before getting into utility.
      • albert_e 11 hours ago
        "Claude agents" cli launched recently

        It was very poor UX (from the demos i saw) -- and i tried it myself in Windows native command terminal and the rendering was horribly broken (windows is a second class OS for dev tools).

        A light weight GUI with keyboard shortcuts that mimic CLI experience would have been far better without taking anything away from power users of the terminal.

      • minetest2048 13 hours ago
        You can have ads in a terminal / CLI, for example Ubuntu put an ad for their Ubuntu Pro service when you run sudo apt upgrade: https://linuxiac.com/ubuntu-once-again-angered-users-by-plac... . But the worst thing they can do is to show a plain text
      • snitch182 8 hours ago
        Well powershell is a pointless and optical nausea redesign of a shell. I would not call it design per se .. or maybe it was. ?
        • thewebguyd 7 hours ago
          PowerShell was great, if a little verbose. PowerShell's real strength though isn't as an interactive shell, it's as a scripting language. You pass objects back and forth, not text. It's basically an interactive API explorer.

          > ps aux | awk '{if ($6 > 102400) print $11, $6}'

          compared to

          > Get-Process | Where WorkingSet -gt 100MB | Select Name, WorkingSet

          doesn't matter how the output is formatted, you aren't manipulating text directly you're working with .NET objects.

          Anyway, PowerShell is the way it is out of necessity of the OS. Microsoft did try to just port ksh to Windows at one point; it obviously failed because Windows isn't text based: system state is stored in WMI, COM, etc., not text files.

          Snover talked about the creation of PowerShell on the Corecursive podcast a couple years ago, well worth a listen: https://corecursive.com/building-powershell-with-jeffrey-sno...
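
A small sketch of the text-versus-objects contrast described above, written in Python rather than awk or PowerShell. The sample `ps`-style text, the `Proc` record, and both helper functions are made up for illustration; the point is only that the text version depends on column positions while the object version filters on a named field.

```python
from dataclasses import dataclass

# Made-up sample output in the column layout of `ps aux`.
PS_TEXT = """\
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.1 1000 2048 ? Ss 10:00 0:01 /sbin/init
alice 42 1.0 5.0 900000 204800 ? Sl 10:01 0:30 /usr/bin/firefox
"""


@dataclass
class Proc:
    """Hypothetical structured record, standing in for a process object."""
    name: str
    working_set: int  # bytes


def from_text(text, min_kb):
    """Text approach: split each line and index columns by position."""
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        rss_kb = int(fields[5])  # 6th column, like awk's $6
        if rss_kb > min_kb:
            rows.append((fields[10], rss_kb))  # 11th column, like awk's $11
    return rows


def from_objects(procs, min_bytes):
    """Object approach: filter on a named attribute, no parsing involved."""
    return [(p.name, p.working_set) for p in procs if p.working_set > min_bytes]
```

If the column order of the text ever changed, `from_text` would silently read the wrong field, while `from_objects` would keep working as long as the attribute names stayed the same.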

    • giancarlostoro 3 hours ago
      The reason for the CLI is because most editors will support CLIs natively, so it's very easy to pop in Claude Code either via your IDE or your preferred terminal in your project directory. It's lowkey the most genius design choice by Anthropic. Why they haven't made a real GUI since? No idea.
    • Jblx2 2 hours ago
    • pjmlp 17 hours ago
      I really don't get it.

      Started using computers when that was the only affordable way to use computers.

      For some reason, some people really love to live in 1970's with their expensive HiDPI monitors.

      • hvb2 16 hours ago
        Because it's all keyboard based. Depending on your field of work a good UI can be very different.

        A few years ago I watched an accountant work through my company's numbers using their accounting software. Its entry method? A Windows Commander-like tool. The menu options like add expense etc were all numbered. So he never left the numeric part of his keyboard.

        The tool looked super old and obsolete but as soon as you see a power user use it, you see why.

        • pjmlp 16 hours ago
          GUIs can also be keyboard based, and there are language REPLs as well.
      • benjamincburns 16 hours ago
        I wasn't alive in the 70s, but I still prefer a terminal.

        I can string together a complex series of text-related tasks far more effectively as a shell pipeline than I can by pointing and clicking in a UI. I can scale that sequence of tasks out to operate on every file on the filesystem if I want, or down to a single character in a single file.

        Claude Code being a full-featured TUI is also helpful because I can quickly/easily use it remotely via SSH without having to deal with setting up X forwarding, VNC, Parsec, etc. The remote host doesn't even need to have a window manager. Sure, it'd be nice if it also had an elegant multi-page GUI so I could more easily drill into the actions its performing and make better use of my large screen to watch it do multi-agent things, but if I have to choose between the two, I prefer the TUI.

        That said, I'd much rather use a GUI to do things that are actually visual/spatial in nature.

        • madarco 16 hours ago
          can't agree more, I now run my agents in parallel with "agentbox claude", "agentbox opencode" and it teleports my project and settings to a Hetzner VPS

          For those interested: https://github.com/madarco/agentbox

        • pjmlp 16 hours ago
          I can do the same in a language REPL, with the advantage it doesn't need to emulate a teletype.
          • 1718627440 13 hours ago
            The terminal is a language REPL, with the advantage that it is the environment the whole OS runs on.
            • pjmlp 10 hours ago
              It is a handicapped REPL, not really the same.

              Such statements can only be voiced by those that never experienced what using a proper REPL actually feels like.

              Not something that emulates a vt100 teletype.

              • 1718627440 9 hours ago
                The "vt100 teletype emulation" only concerns the protocol in which the program describes what to draw on the display. It might be inconvenient for the programmer, but for the user it is kinda irrelevant.

                > Such statements can only be voiced by those that never experienced what using a proper REPL actually feels like.

                Maybe. What do you mean with proper REPL? SBCL just runs fine in my terminal emulator. As do all other kind of programming languages, like Prolog.

                When I interface with the OS, I want to start programs, control the processes, setup communication channels (pipes) and tell the computer to combine multiple programs to filter, combine, split stuff and redirect it to files and other programs. These should also be able to run several times with slight differences. All of that works just fine in my shell. What are you missing?

                • pjmlp 8 hours ago
                  SBCL REPL cannot do this on xterm, it needs a proper hosting REPL environment like SLIME, which is no wonder, given how Emacs came to be and the interaction with genera.

                  Inline graphics from 1981,

                  https://youtu.be/o4-YnLpLgtk?t=376

                  Or using S-PACKAGE used to develop Nintendo 64.

                  https://www.youtube.com/watch?v=gV5obrYaogU

                  You are missing integration with live debugging, calling anything on the OS during a REPL session, e.g. OS APIs, calling into automation points of the OS,

                  Can try out directly on the browser, courtesy of WebAssembly and recovery of Xerox PARC software, https://interlisp.org.

                  Or get either Squeak or Pharo, and see how using the Transcript integrates with the whole platform

                  Windows, with either PowerShell ISE, or its new VSCode integration, are the closest to these kind of experiences.

                  To finalise with Xerox PARC view on UNIX, from 1989

                  "UNIX Needs A True Integrated Environment: CASE Closed"

                  https://www.bitsavers.org/pdf/xerox/parc/techReports/CSL-89-...

      • canpan 16 hours ago
        It might be overcompensation. I think UI, UX and GUIs got better up until the 90s, and early 2000s, but then somewhere GUIs suddenly got a lot worse. So a modern CLI is better and more standardized than a modern GUI.
        • thewebguyd 7 hours ago
          > then somewhere GUIs suddenly got a lot worse

          Electron is the reason, and the elusive dream of "write once, run anywhere" that got us cross platform UIs that are bespoke and don't follow native OS conventions (or keyboard navigation), plus once marketing got involved and GUIs started needing to be branded instead of just fitting in with every other native app on that OS.

          • rafterydj 4 hours ago
            I see arguments like this particularly against Electron and the web development sphere in general and I think it's more nuanced than either programmers or "marketing" (read: anyone not a programmer) gives credit towards.

            The "elusive dream" of 'write once, run anywhere' is realistically just people wanting to write software with direct product or service use in mind. Native OS conventions are subject to the middlemen of OS vendors, whereas the web (while basically subject to the same vendors) makes a substantial attempt at bridging the gap of writing software for your own purposes without native OS problems. This is a symptom of OSes catering/selling to developers as a platform and hooking them in the 1990s and 2000s.

            This attitude that wanting to just make useful code for people and not worry about a windows 11 update breaking everything because they are irresponsible - to think that is not a valid desire is IMO a big problem.

            On the other hand, you have a point that it quickly gets out of hand in terms of standards and accessibility and performance bottlenecks. WebAssembly and the WASI are so slow to come out and will by design always be slower than native performance. This doesn't and shouldn't stop us from having decently performant and decently usable program experiences, but it is a prerequisite to care about those things, and the other inheritors of the web development sphere clearly do not want to develop things properly if they take longer than the next fiscal quarter.

            There is 100% good Electron code out there, just as 100% there is bad native OS code. The problem isn't inherently the goals of the 'write once, run anywhere' idea; it's more the casualty of other interests pulling away from what developers actually want.

        • pjmlp 16 hours ago
          A modern CLI would be a REPL.
      • munk-a 16 hours ago
        I don't mind a GUI (as long as it isn't an obnoxiously large ribbon or anything) - but if I'm doing work my input device is the keyboard. I don't want to interface with software through moving a mouse pointer when I can just tell it what to do with a few keystrokes.
        • pjmlp 16 hours ago
          Xerox PARC REPL style, and better, you can get the inline graphics for free, as there is no need to emulate a teletype.
      • aninteger 17 hours ago
        And some of us love to live in the 1970s with cheap non-HiDPI monitors (or maybe it's just me).
    • hombre_fatal 17 hours ago
      Why did you lambast it as an emotional attachment instead of a practical preference?

      People prefer terminal apps because they run inside our terminal app environments (kitty, zellij, tmux), tend to be keyboard driven, tend to be more lightweight than GUIs, tend to be scriptable, and can be run remotely over a standard ssh session.

      A conventional GUI is a nonstarter in comparison.

    • kstenerud 17 hours ago
      I do love GUIs, and use them for most of my workflow. But for Claude, I definitely prefer the CLI.

      Since it's a CLI app, I can wrap it in yoloAI for the sandbox protection, and also use VS Code's tunneling feature to reach that sandboxed workdir (with permissions safely bypassed) through my GUI.

      https://freeimage.host/i/screenshot-2026-05-19-at-141349.ByS...

      • benjamincburns 17 hours ago
        But can you paste an image into it?

        I have a similar setup, but I access it directly via iTerm2 instead of VS Code's terminal. I've figured out the right terminal settings to get copying/pasting text to work (including with neovim's + register), but not images. Would be nice to paste images, though. Currently I have to SCP them over.

        • kstenerud 16 hours ago
          I've actually never tried it before. I just ran some tests now on a mac:

          If I copy a file in Finder and paste it into a claude session, it shows in the TUI as [Image #1].

          If I do the same, but paste into a claude session running over SSH, it pastes the path to the file, not the data.

          If I open the image in Preview, copy the pixels (CMD-A, CMD-C), pasting that into a terminal does nothing.

          So it looks like CC just puts UI sugar over top of the image path when it has file access to it? That's not really image pasting, though...

          • benjamincburns 16 hours ago
            I suspect the first case worked as intended, and VS Code is greasing the wheels. I'm sure there's a way to get it working in iTerm 2, though I wouldn't be surprised if the solution was some Goldbergian chain of forwarded unix sockets and a helper daemon living inside the sandbox.

            Thanks for mentioning yoloAI, though. I started off sandboxing via devcontainers using kata & cloud hypervisor set up as a custom docker runtime. It worked well enough, but nested docker was super slow due to virtio-fs limitations. I recently moved to sysbox and it's a bit quicker. It's probably not as airtight as kata/chv, but good enough to keep Claude from writing a security test that deletes my whole filesystem [1].

            1: https://github.com/anthropics/claude-code/issues/28521

            • kstenerud 14 hours ago
              Haha yup. yoloAI is to scratch my own itch. I'm getting close to taking it out of beta, but first I'm putting it through a significant architectural overhaul in a feature branch. Normally I'd balk at doing something so heavy, but AI makes it so damn easy to do major mechanical changes (provided you guide it properly and have good tests). So far, so good! And it feels nice to fix the architectural warts before I lock in the interface.
    • patates 17 hours ago
      Composability (piping to other programs, or calling them via scripts), reachability (through ssh, for example), focus (not being distracted by all options being present) and universality (cli is more or less the same interface everywhere) are my reasons.

      I still use GUI apps too, and actually find claude code to be closer to a GUI app than a cli.

    • kordlessagain 8 hours ago
      Why are we emotionally tied to punctuation?
    • pdantix 15 hours ago
      i would gladly use claude code via the desktop app, but it lags behind the cli in terms of supported features, so i just don't bother. last i tried, it didn't support executing CC within WSL while desktop is running in windows.
    • mmh0000 17 hours ago
      Use Claude Desktop? (https://claude.com/download)

      Personally, I much prefer the CLI. The CLI is a tool that has been refined for over 50 years to excel at text input and output. Once you learn it, it can feel like an extension of your brain.

    • oneneptune 17 hours ago
      idk I just like running 6/8 terminal panes and organizing my workflows / projects in an exact space. I even tweaked my theme. and seeing them all on my side portrait monitor.
    • samlinnfer 17 hours ago
      The Unix philosophy is not emotional.
    • bashtoni 17 hours ago
      A text based interface is perfect for interacting with a large language model, and it seems unsurprising to me that it's the most popular way to work with them.

      Frankly, the idea of having to decipher what a picture is supposed to represent to use a skill fills me with horror.

    • DeathArrow 17 hours ago
      >Why are we emotionally tied to command line interfaces

      Being a power user, having used computers for more than 30 years, I usually prefer GUI because that's an evolution over CLI.

      Going from the BASIC interpreter on ZX Spectrum to the command line in MS DOS had me mesmerized. Going from the DOS CLI to Windows 95 GUI had me mesmerized, too.

      I think people in general consider themselves more pro and "hackers" if they use CLI and editors like Vi and Emacs.

      There are bonus points for memorizing hundreds of different keyboard shortcuts and not using the mouse at all.

      If they absolutely have to use GUI, they do not use a desktop environment in Linux but a stacking window manager.

      • pjmlp 16 hours ago
        Which is a pity, a real hacker uses a graphical environment inspired from Lisp Machines, Interlisp-D, Smalltalk, selecting code in the REPL with "do it", fixing it on the fly in the debugger and "redo it", changing the work environment in the flow.

        Unfortunately they hear that and only understand Emacs.

      • ta8903 16 hours ago
        Would you prefer if those people used a mouse and desktop environments?
        • DeathArrow 13 hours ago
          I am not trying to diminish anyone and I do not have a preference for how other people use computers. I was merely trying to explain why the CLI gets so much attention.
    • cookiengineer 14 hours ago
      Claude Code can't slopcode working GUIs

      That's the real reason.

      If you don't believe me, take a look at the leaked codebase from a couple weeks ago. It's the stuff of nightmares, because too many junior devs slopcoded in all places without any plan or understanding of software architecture patterns. They never actually take the time to refactor, there's dozens of outdated redundancies and orphaned modules all over the place.

      Without good architecture patterns, there can be no good GUI nor good UX.

  • benjaminl 19 hours ago
    Ctrl+V paste works for me on WSL. My secret is that I have given up on WSLg and use a standalone X Windows server. Specifically, the X410 X Server. This removes a whole lot of weird behavior including the ones described by the article.
    • cheema33 18 hours ago
      I have not tried this mostly because I figured it would be a resource hog and clunky. Can you describe your experience with X410 on WSL in some more detail? What are the downsides?
    • sterlind 18 hours ago
      you do you, but I've had only good luck with WSLg. my main gripe with it is that it could be doing more. internally (part of?) WSLg uses the RDP protocol, which natively supports audio forwarding, USB passthru and smart cards. yet none of it's wired up.

      (disclaimer: I work at MS, not on WSL)

  • etbebl 4 hours ago
    Why would anyone assume that pasting an image into a CLI interface would work? The fact that this is framed as a bug is wild to me (but very cool that there is a way to do it now)
    • cstrahan 4 hours ago
      IIUC

      1) It isn’t a matter of literally “pasting into a terminal” (with the terminal emulator shoveling bytes into the TUI’s stdin), rather it’s “a TUI key-binding tells TUI app to read system clipboard”. No different than any other app.

      2) This works on macOS, Linux and Windows, but not WSL. Sounds fair game to call this a bug, or at the very least a feature disparity.
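
A minimal sketch of point 1, assuming the terminal is in raw mode and delivers Ctrl+V to the app as the single control byte 0x16. `handle_input` and the injected `read_clipboard` callable are hypothetical names, not Claude Code's actual implementation; a real app would call a platform clipboard API at that spot.

```python
# Ctrl+V as a raw control byte: ord("V") & 0x1F == 0x16.
CTRL_V = b"\x16"


def handle_input(data, read_clipboard):
    """Dispatch one chunk of terminal input.

    On the key binding, the app itself asks for the clipboard contents;
    the image bytes never travel through the terminal's input stream.
    """
    if data == CTRL_V:
        return ("attach", read_clipboard())
    return ("text", data.decode("utf-8", errors="replace"))
```

This also shows why the binding fails when the terminal emulator intercepts the keystroke first: the app never receives the byte, so the clipboard read is never triggered.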

  • thehours 16 hours ago
    Only tangentially related, but does anyone know if it is possible to ‘paste’ images to an agent harness running inside a docker container?

    My current workaround is to paste it inside the working directory on the host machine, then @ reference it, but would be nice to streamline that workflow.

  • mdrzn 15 hours ago
    My main issue was the ability to paste images when using ClaudeCode via ssh on a remote machine, so I solved it by having Claude write a quick bridge that fetches the image in your clipboard, rsync it to the server and paste the correct image path in your clipboard: https://github.com/mdrzn/claude-screenshot-uploader
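
A rough sketch of the upload-then-paste-path idea described above, not the code from the linked repository. `plan_upload`, the host string, and the directory names are hypothetical; the only real tool assumed is `rsync` with its usual `SRC HOST:DEST` argument form.

```python
import posixpath


def plan_upload(local_path, host, remote_dir):
    """Return the rsync argv and the remote path to paste afterwards."""
    # Take the file name from the local path (accepts / or \ separators).
    filename = local_path.replace("\\", "/").rsplit("/", 1)[-1]
    remote_path = posixpath.join(remote_dir, filename)
    argv = ["rsync", local_path, f"{host}:{remote_path}"]
    return argv, remote_path
```

The returned `argv` could be passed to `subprocess.run`, and `remote_path` is the string that would go onto the clipboard in place of the image.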
  • dested 18 hours ago
    Unrelated but I have a similar problem with speech to text apps on windows, where due to the funkiness of Claude Code's (necessary) implementation, it doesn't send the keybindings correctly.

    I sure wish it didn't have to be a console app

  • moontear 12 hours ago
    This is awesome! Thank you rajveerb. Here is to hoping issues from 2022 will be fixed ;-)
  • bombcar 18 hours ago
    I have the opposite problem; pasting anything moderately substantial into VSClaude ends up sending an image.
  • hboon 17 hours ago
    If it's not working, does pasting the absolute path work? Both work on macOS.
    • oezi 16 hours ago
      Well that means saving the clipboard first.
  • oezi 16 hours ago
    Codex CLI is doing this fine. Maybe copy a page from their book.
  • AgentMasterRace 17 hours ago
    Or just paste the path.
  • DeathArrow 17 hours ago
    This is still better than trying to paste text, files or images in Linux. In latest Pop!_OS I have to keep the app I copy from open until I paste. To add insult to the injury, pasting in terminal produces weird characters.
    • thewebguyd 7 hours ago
      That sounds like a Pop!_OS specific problem, I wonder if Cosmic doesn't have a clipboard manager?
  • behnamoh 18 hours ago
    It doesn't work for me on macOS + ghostty either. IDK what's the cause.
  • rajveerb 3 days ago
    tl;dr Use Claude Code in WSL inside Windows Terminal? Copying an image in Windows and pressing Ctrl+V in Claude Code doesn't work. Three things break: (1) WSL only hands Windows images to the Linux side in an old BMP format Claude Code can't read; (2) WSL also keeps quietly overwriting your fixes a moment later; (3) Windows Terminal grabs Ctrl+V before Claude Code can see it. The fix is a small Windows program that converts the image to PNG, a Linux script that puts it on the Linux clipboard (and re-asserts once after WSL overwrites it), and one extra keybinding for Claude Code so the keystroke actually reaches the program.

    Code: https://github.com/rajveerb/wsl-clip-bridge
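
To illustrate problem (1), here is a minimal sketch of how a program could tell a BMP payload from a PNG one by its leading bytes and decide whether a conversion step is needed. It is not taken from wsl-clip-bridge; `sniff_image_format` and `needs_conversion` are hypothetical helper names, and the actual BMP-to-PNG conversion is left out.

```python
# File signatures: PNG starts with an 8-byte magic, BMP files with "BM".
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
BMP_MAGIC = b"BM"


def sniff_image_format(data: bytes) -> str:
    """Guess the image format from the first bytes of the payload."""
    if data.startswith(PNG_MAGIC):
        return "png"
    if data.startswith(BMP_MAGIC):
        return "bmp"
    return "unknown"


def needs_conversion(data: bytes) -> bool:
    """True when the payload looks like BMP and should be re-encoded as PNG."""
    return sniff_image_format(data) == "bmp"
```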

  • jadar 19 hours ago
    The last "When this stops being needed" needs one amendment: "Or stop using Windows."
    • pjmlp 16 hours ago
      Valve could actually create a need for native Windows games.

      As it stands the only reason I have to pain myself back into using Linux on bare metal laptops, instead of VMs, is independence from US technologies in European soil, which also implies not having to care about Claude.

    • stronglikedan 19 hours ago
      > Or stop using Windows

      I'd rather continue to be as productive as possible.

      • z3c0 19 hours ago
        Not even getting into the semantics of what one could mean by "productive", that sounds like a bleak existence.
        • thewebguyd 17 hours ago
          Not everyone here is *nix-pilled (WSL aside). Despite W11's missteps, Windows isn't a completely terrible OS to work on and has some of the best window management outside of a full tiling WM.
          • 1718627440 13 hours ago
            I doubt it, Window management has been where MS Windows has been lacking for decades, at most they have caught up. Can you do "Always on top", "Always on the bottom" yet? That's a very basic and easy to implement feature.
            • thewebguyd 7 hours ago
              Always on top, yes (with PowerToys). But yeah you can't do "always on bottom" yet, though it's doable with AutoHotKey. It's still lightyears ahead of how macOS manages windows OOTB
    • TZubiri 19 hours ago
      Not a bug, pasting images into the terminal is not supported, do not do this, that's not what the terminal is for or how it is used. The standard way is to pass the path of a file to the program as a runtime parameter or in some config file.

      Terminals are not alternative web browsers/graphical application sandboxes.

      • fragmede 18 hours ago
        Sixel came out in the 80's as a way to print on dot matrix printers. If your terminal doesn't support that yet, you might want to look into updating your software.
      • TurdF3rguson 19 hours ago
        So basically don't use Claude Code is your suggestion. Not very helpful, guy.
        • recursive 18 hours ago
          I think you've misunderstood, guy.
        • TZubiri 18 hours ago
          pass the url (local or otherwise) of the image to Claude code. Otherwise it's not the terminal's problem, please don't pressure Microsoft to introduce an attack vector to wsl for slop's sake.
          • TurdF3rguson 18 hours ago
            The image in my clipboard doesn't have an url.
            • recursive 6 hours ago
              It could if it was in a file, which is a thing that you control. Take charge of your destiny.
              • TurdF3rguson 1 hour ago
                Right, I could paste my clipboard which takes 1 second and zero cognitive load or I can do your suggestion 100x per day and drive myself crazy. No thanks.
                • recursive 3 minutes ago
                  Just have Claude vibe you up a GUI that does it in one click with only 0.1 cognitive load.