• SkunkWorkz@lemmy.world · 1 day ago

    Yeah, fake. No way you can get 90%+ using ChatGPT without understanding code. LLMs barf out so much nonsense when it comes to code. You have to correct it frequently to make it spit out working code.

    • AeonFelis@lemmy.world · 7 hours ago

      1. Ask ChatGPT for a solution.
      2. Try to run the solution. It doesn’t work.
      3. Post the solution online as something you wrote all on your own, and ask people what’s wrong with it.
      4. Copy-paste the fixed-by-actual-human solution from the replies.
    • Artyom@lemm.ee · 17 hours ago

      If we’re talking about freshman CS 101, where every assignment is the same year-over-year and it’s all machine graded, yes, 90% is definitely possible, because an LLM can essentially act as a database of all problems and all solutions. A grad-student TA could probably see through his “explanations”, but they’re tired from an endless stack of grading, so why bother?

      If we’re talking about a 400-level CS class, this kid’s screwed: even someone who’s mastered the fundamentals will struggle with advanced algorithms and with reconciling math ideas with hands-on-keyboard software.

    • threeduck@aussie.zone · 22 hours ago

      Are you guys just generating insanely difficult code? I feel like 90% of my code generation with o1 works the first time? And if it doesn’t, I just let GPT know and it fixes it right then and there?

      • KillingTimeItself@lemmy.dbzer0.com · 18 hours ago

        The problem is more complex than it first appears, for a few reasons.

        One, the user is not very good at prompting, and will often fight with the prompt to get what they want.

        Two, often the user has a very specific vision in mind, which the AI obviously doesn’t know, so the user ends up fighting that.

        Three, the AI is not omniscient, and just fucks shit up, makes goofy mistakes sometimes. Version assumptions, code-compat errors, just weird implementations of shit: the kind of stuff you would expect AI to do that’s going to make it harder to manage code after the fact.

        Unless you’re using AI strictly to write isolated scripts in one particular language, AI is going to fight you at least some of the time.

        • sugar_in_your_tea@sh.itjust.works · 18 hours ago

          I asked an LLM to generate tests for a 10-line function with two arguments, no if branches, and only one library function call. It’s just a for loop and some math. Somehow it invented arguments, and the ones that actually ran didn’t even pass. It made like 5 test functions, spat out paragraphs explaining nonsense, and it still didn’t work.
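
          For scale, here’s a hypothetical function of roughly the shape described (the actual code isn’t in this thread; the name and the math are invented for illustration), plus the kind of one-assert test that should have been easy to produce:

          ```python
          # Hypothetical stand-in: two arguments, a for loop, some math,
          # one library call (math.sqrt), and no if branches.
          import math

          def rms_over_windows(values, window):
              """Root-mean-square of each consecutive window of `values`."""
              results = []
              for i in range(len(values) - window + 1):
                  total = sum(v * v for v in values[i:i + window])
                  results.append(math.sqrt(total / window))
              return results

          def test_rms_over_windows():
              # One window over [3, 4]: sqrt((9 + 16) / 2) = sqrt(12.5)
              assert rms_over_windows([3, 4], 2) == [math.sqrt(12.5)]
          ```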

          This was one of the smaller DeepSeek models, so perhaps a fancier model would do better.

          I’m still messing with it, so maybe I’ll find some tasks it’s good at.

          • KillingTimeItself@lemmy.dbzer0.com · 18 hours ago

            From what I understand, the “preview” models are quite handicapped; that’s usually why the benchmark is the full-fat model. The recent OpenAI one (they have stupid names, idk what is what anymore) had a similar problem.

            If it’s not a preview model, it’s possible a bigger model would help, but usually prompt engineering is going to be more useful. AI is really quick to get confused sometimes.

            • sugar_in_your_tea@sh.itjust.works · 18 hours ago

              It might be, idk, my coworker set it up. It’s definitely a distilled model, though. I did hope it would do a better job on such a small input.

      • nimbledaemon@lemmy.world · 2 hours ago

        I just generated an entire Angular component (table with filters, data services, using in-house software patterns and components, based off of existing work) using Copilot for work yesterday. It didn’t work at first, but I’m a good enough software engineer that I iterated on the issues, discarding bad edits and referencing specific examples from the extant codebase, and got Copilot to fix it. 3-4 days of work (if you were already familiar with the existing way of doing things) done in about 3-4 hours.

        But if you didn’t know what was going on and how to fix it, you’d end up with an unmaintainable, non-functional mess, full of bugs we have specific fixes in place to avoid but Copilot doesn’t care about, because it doesn’t have an idea of how software actually works, just what it should look like.

        So for anything novel or complex you have to feed it an example, then verify it didn’t skip steps or forget to include something it didn’t understand/predict, or make up a library/function call. You have to know enough about the software you’re making to point that stuff out, because just feeding whatever error pops out of your compiler back into the AI may get you to working code, but it won’t ensure quality code, maintainability, or intelligibility.

      • JustAnotherKay@lemmy.world · 20 hours ago

        My first attempt at coding with ChatGPT was asking about saving information to a file with Python. I wanted to know what libraries were available and the syntax to use them.

        It gave me a three-page write-up about how to write a library myself, in Python. Only it had an error on damn near every line, so I still had to go Google the actual libraries and their syntax and slog through documentation.
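
        For reference, the direct answer fits in a handful of lines; a minimal sketch of the usual standard-library options (no third-party libraries needed):

        ```python
        # Saving data to a file with Python's standard library alone.
        import csv
        import json
        import pickle

        data = {"name": "example", "values": [1, 2, 3]}

        # JSON: human-readable structured data
        with open("out.json", "w") as f:
            json.dump(data, f)

        # CSV: tabular rows
        with open("out.csv", "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["name", "values"])
            writer.writerow([data["name"], data["values"]])

        # pickle: arbitrary Python objects (binary, Python-only)
        with open("out.pkl", "wb") as f:
            pickle.dump(data, f)
        ```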

      • surph_ninja@lemmy.world · 21 hours ago

        A lot of people assume their not knowing how to prompt is a failure of the AI. Or they tried it years ago, and assume it’s still as bad as it was.

      • Eheran@lemmy.world · 24 hours ago

        You mean o3-mini? Wasn’t it supposed to be on the level of o1, just much faster and cheaper? I noticed no increase in code quality, perhaps even a decrease. For example, it forgets things far more often, like variables that have a different name. It also easily ignores a bunch of my very specific and enumerated requests.

        • xor@lemmy.dbzer0.com · 24 hours ago

          o3 something… I think the bigger version….
          But I saw a video where it wrote a working game of Snake, and then wrote an AI training algorithm to make an AI that could play Snake… all of the code ran on the first try….
          Could be a lie though, I dunno….

          • Bronzebeard@lemm.ee · 21 hours ago

            Asking it to write a program that already exists in its entirety, with source code publicly posted, and having that work is not impressive.

            That’s just copy-pasting.

            • xor@lemmy.dbzer0.com · 16 hours ago

              He asked it by describing the rules of the game, and then asked it to write an AI to learn the game….
              It’s still basic, but not copy-pasta.

              • Bronzebeard@lemm.ee · 8 hours ago

                These things work by modeling how likely words are to appear next to certain other words. Do you know how many tutorials on how to code those exact rules it must have scanned?
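
                A toy sketch of that idea (real models learn probabilities over tokens with a neural network rather than counting words, but “what tends to follow what” is the core of it):

                ```python
                # Bigram counts: which word tends to follow which.
                from collections import Counter, defaultdict

                corpus = "move the snake eat the food grow the snake".split()

                following = defaultdict(Counter)
                for prev, nxt in zip(corpus, corpus[1:]):
                    following[prev][nxt] += 1

                # The most likely word after "the" is the one seen there most often.
                print(following["the"].most_common(1))  # [('snake', 2)]
                ```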