Researchers have found that large language models (LLMs) tend to parrot buggy code when tasked with completing flawed snippets.

That is to say, when shown a snippet of shoddy code and asked to fill in the blanks, AI models are just as likely to repeat the mistake as to fix it.

  • @LovableSidekick@lemmy.world
    0 · edit-2 · 14 days ago

    As a software developer I’ve never used AI to write code, but several of my friends use it daily and they say it really helps them in their jobs. To explain this to non-programmers, they don’t tell it “Write some code” and then watch TV while it does their job. Coding involves a lot of very routine busy work that’s little more than typing. AI can generate approximately what they want, which they then edit, and according to them this helps them work a lot faster.

    A hammer is a useful tool, even though it can’t build a building by itself and is really shitty as a drill. I look at AI the same way.

    • @bpev@lemmy.world
      0 · 13 days ago

      100%. As a solo dev who used to work corporate, I compare it to having a jr engineer who completes every task instantly. If you give it something well-documented and not too complex, it’ll be perfect. If you give it something more complex or newer tech, it could work, but may have some mistakes or unadvised shortcuts.

      I’ve also found it pretty good for when a dependency I’m evaluating has shit documentation. Not always correct, but sometimes it’ll spit out some apis I didn’t notice.

      • @Reliant1087@lemmy.world
        0 · 13 days ago

        I’ve found it okay for getting a general feel for stuff, but I’ve also been given insidiously bad code: functions and data structures that look similar enough to the real thing but are deeply wrong or non-existent.

    • @sugar_in_your_tea@sh.itjust.works
      0 · 14 days ago

      Exactly. I have a coworker who uses it effectively.

      Personally, I’ve been around the block so it’s usually faster for me to just do the busy work myself. I have lots of tricks for manipulating text quickly (I’m quite proficient with vim), so it’s not a big deal to automate turning JSON into a serializer class or copy and modify a function a bunch of times to build out a bunch of controllers or something. What takes others on my team 30 min I can sometimes get done in 5 through the power of regex or macros.

      But at the end of the day, it doesn’t really matter what tools you use because you’re not being paid for your typing speed or ability to do mundane work quickly, you’re being paid to design and support complex software.
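The kind of mechanical transformation described above ("turning JSON into a serializer class") can be sketched in a few lines of Python. This is purely illustrative, not anyone's actual workflow from the thread; the `serializers.*` field names mimic Django REST Framework naming, but the generator itself is just string manipulation:

```python
import json
import re

def json_to_serializer(sample: str, class_name: str = "ItemSerializer") -> str:
    """Generate a DRF-style serializer class body from a JSON sample.

    Purely mechanical busywork: each key becomes a field declaration,
    with the field type guessed from the sample value. The same sort of
    transformation the comment describes doing with vim regexes/macros.
    """
    type_map = {str: "CharField", int: "IntegerField",
                float: "FloatField", bool: "BooleanField"}
    fields = []
    for key, value in json.loads(sample).items():
        field = type_map.get(type(value), "JSONField")
        # normalize camelCase keys like "userName" -> "user_name"
        name = re.sub(r"(?<!^)(?=[A-Z])", "_", key).lower()
        fields.append(f"    {name} = serializers.{field}()")
    return f"class {class_name}(serializers.Serializer):\n" + "\n".join(fields)

print(json_to_serializer('{"userName": "ada", "age": 36, "active": true}'))
# prints:
# class ItemSerializer(serializers.Serializer):
#     user_name = serializers.CharField()
#     age = serializers.IntegerField()
#     active = serializers.BooleanField()
```

A regex-driven generator like this is the template-or-macro alternative the later replies allude to: deterministic, instant, and it never hallucinates a field.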

    • Lemminary
      0 · 14 days ago

      Coding involves a lot of very routine busy work that’s little more than typing.

      That’s right. You watch it type it out, and right where it gets to the important part you realize that’s not what you meant at all, so you hit the stop button. Then you modify the prompt and repeat that one more time. That’s when you realize there are so many things it’s not even considering, which gives you the satisfaction that your job is still secure. Then you write a more focused prompt for one aspect of the problem and take whatever good-enough bullshit it spewed as a starting point for you to do the manual work. Rinse and repeat.

      • @Excrubulent@slrpnk.net
        0 · 13 days ago

        That sounds exhausting to me.

        Like seriously, what busywork is so routine and so basic that you need an AI to do it but couldn’t make a template for it? And how is it less work to read what it gave you to check for errors? That’s always the harder part of coding in my experience.

        I would love to know the specifics of where this supposedly saves time.

        I suspect the energy you’re putting into learning this tool could go into becoming a better typist, and you wouldn’t need to cook the planet to do it.

    • @IphtashuFitz@lemmy.world
      0 · 14 days ago

      We have a handful of Python tools that we require to adhere to PEP 8 formatting, and we have Jenkins pipeline jobs that validate it and block merge requests if any of the code isn’t properly formatted. I haven’t personally tried it yet, but I wonder whether these AIs might be good at fixing up this sort of formatting lint.
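A CI gate like the one described can be approximated in a few lines. A real pipeline would presumably call pycodestyle or flake8 rather than roll its own; this toy checker handles just two PEP 8 rules (E501 line length, W291 trailing whitespace) to show the shape of the check that blocks the merge:

```python
def pep8_violations(source: str, max_len: int = 79) -> list[str]:
    """Tiny stand-in for a PEP 8 gate: flags overlong lines and trailing
    whitespace, the way a CI job might fail a merge request."""
    problems = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if len(line) > max_len:
            problems.append(f"{lineno}: E501 line too long ({len(line)} > {max_len})")
        if line != line.rstrip():
            problems.append(f"{lineno}: W291 trailing whitespace")
    return problems

snippet = "x = 1   \ny = " + "0" * 90 + "\n"
for p in pep8_violations(snippet):
    print(p)
# prints:
# 1: W291 trailing whitespace
# 2: E501 line too long (94 > 79)
```

In a pipeline, a non-empty result would simply exit non-zero so the merge request is blocked; mechanical formatting fixes like these are also exactly the class of edit where deterministic tools (autopep8, black) need no model at all.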

  • @masterspace@lemmy.ca
    0 · 15 days ago

    What a waste of time. Both the article and the researchers.

    Literally by the time their research was published, it was based on already-irrelevant models. On top of that: yeah, that’s how LLMs work. That would be obvious after five minutes of using them.

  • @nectar45@lemmy.zip
    0 · 14 days ago

    If you ask the LLM for code, it will often give you buggy code, but if you run it, get an error, and then tell the AI what error you had, it will often fix the error, so that is cool.

    Won’t always work though…

    • @Thorry84@feddit.nl
      0 · 14 days ago

      In my experience it will write three paragraphs about the mistake, what went wrong and how to fix it. Only to then output the exact same code, or very close to it, with the same bug. And once you get into that infinite loop, it’s basically impossible to get out of it.

      • @spooky2092@lemmy.blahaj.zone
        0 · 14 days ago

        And once you get into that infinite loop, it’s basically impossible to get out of it.

        The easiest way I found to get out of that loop, is to get mad at the AI so it hangs up on you.

      • @nectar45@lemmy.zip
        0 · 14 days ago

        I ran into that problem too, but you can at least occasionally tell it to “change the code further” and that can work.

        Often YOU have to try and fix its code yourself, though; it’s far from perfect…

  • @taladar@sh.itjust.works
    0 · 15 days ago

    I don’t see why anyone would expect anything else out of a “what is the most likely way to continue this” algorithm.
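That framing is easy to demonstrate: even the crudest "most likely way to continue this" model, a greedy bigram table, will happily reproduce a bug when the buggy pattern dominates its training data. A toy Python sketch (not how real LLMs are built, but the same continuation principle):

```python
from collections import Counter, defaultdict

def train_bigrams(tokens):
    """Count which token most often follows each token."""
    follows = defaultdict(Counter)
    for a, b in zip(tokens, tokens[1:]):
        follows[a][b] += 1
    return follows

def continue_greedy(follows, prompt, n=4):
    """'Most likely continuation' in its crudest form: always emit the
    single most frequent next token seen in training."""
    out = list(prompt)
    for _ in range(n):
        nxt = follows.get(out[-1])
        if not nxt:
            break
        out.append(nxt.most_common(1)[0][0])
    return out

# toy "training set" where a buggy comparison (= instead of ==) dominates
corpus = "if x = 1 : fail ; if x = 1 : fail ; if x == 2 : ok".split()
model = train_bigrams(corpus)
print(continue_greedy(model, ["if", "x"]))
# → ['if', 'x', '=', '1', ':', 'fail']  (the buggy pattern wins)
```

The model has seen the correct `==` form, but the buggy form is more frequent, so "most likely continuation" reproduces the bug, which is the paper's finding in miniature.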

    • @xthexder@l.sw0.com
      0 · 14 days ago

      It doesn’t help that the AI also has no ability to go backwards or edit code; it can only append. The best it can do is write it all out again with changes made, but even then, the chance of it losing the plot while doing that is pretty high.

      • Lemminary
        0 · 14 days ago

        Yeah, that’s what the canvas feature is for with ChatGPT. And you guessed it, it’s behind a paywall. :)

    • @skip0110@lemm.ee
      0 · 15 days ago

      To be fair, if you give me a shit code base and expect me to add features with no time to fix the existing ones, I will also just add more shit on the pile. Because obviously that’s how you want your codebase to look.

      • @sugar_in_your_tea@sh.itjust.works
        0 · 14 days ago

        And if you do that without saying you want to refactor, I likely won’t stand up for you in the next round of layoffs. If I wanted to make the codebase worse, I’d use AI.

        • @skip0110@lemm.ee
          0 · 14 days ago

          I’ve been in this scenario and I didn’t wait for layoffs. I left and applied my skills where shit code is not tolerated, and quality is rewarded.

          But in this hypothetical, we didn’t get this shit code from management encouraging the right behavior and giving people time to make it right. They’re going to keep the yes men and fire the “unproductive” ones (and I know full well that adding to the pile is not productive in the long run, but what does the management overseeing this mess think?).

          • @sugar_in_your_tea@sh.itjust.works
            0 · 14 days ago

            Fair.

            That said, we have a lot of awful code at my org, yet we also have time to fix it. Most of the crap came from the “move fast and break things” period, but now we have the room to push back a bit.

            There’s obviously a balance, and as a lead, I’m looking for my devs to push back and make the case for why we need the extra time. If you convince me, I’ll back you up and push for it, and we’ll probably get the go-ahead. I’m not going to approve everything though because we can’t fix everything at once. But if you ignore the problems and trudge along anyway, I’ll be disappointed.

    • @Goun@lemmy.ml
      0 · 14 days ago

      This is what I was thinking, if you give the code to a person and ask them to finish it, they would do the same.

      If you instead ask the LLM to give some insights about the code, it might tell you what’s wrong with it.

  • @Not_mikey@lemmy.dbzer0.com
    0 · 14 days ago

    I guess that’s one advantage of Stack Overflow: sometimes you need a guy to tell you the entire basis of your question is dumb and wrong.

    • @DragonTypeWyvern@midwest.social
      0 · 14 days ago

      o7

      Thank you for your service, toxic-ass Stack Overflow commenters who are often wrong themselves and are then corrected by other, more toxic commenters.

  • @Damage@feddit.it
    0 · 14 days ago

    Let’s train an LLM exclusively on the Windows XP source code and contemporary Microsoft apps