• SkunkWorkz@lemmy.world · 1 day ago

    Yeah, fake. No way you can get 90%+ using ChatGPT without understanding code. LLMs barf out so much nonsense when it comes to code. You have to correct it frequently to make it spit out working code.

    • AeonFelis@lemmy.world · 7 hours ago

      1. Ask ChatGPT for a solution.
      2. Try to run the solution. It doesn’t work.
      3. Post the solution online as something you wrote all on your own, and ask people what’s wrong with it.
      4. Copy-paste the fixed-by-actual-human solution from the replies.
    • Artyom@lemm.ee · 17 hours ago

      If we’re talking about freshman CS 101, where every assignment is the same year-over-year and it’s all machine graded, yes, 90% is definitely possible, because an LLM can essentially act as a database of all problems and all solutions. A grad-student TA could probably see through his “explanations”, but they’re tired from an endless stack of grading, so why bother?

      If we’re talking about a 400-level CS class, this kid’s screwed: even someone who’s mastered the fundamentals will struggle with advanced algorithms and with reconciling math ideas with hands-on-keyboard software.

    • threeduck@aussie.zone · 22 hours ago

      Are you guys just generating insanely difficult code? I feel like 90% of my code generation with o1 works the first time? And if it doesn’t, I just let GPT know and it fixes it right then and there?

      • KillingTimeItself@lemmy.dbzer0.com · 18 hours ago

        The problem is more complex than it first appears, for a few reasons.

        One, the user is not very good at prompting, and will often fight with the prompt to get what they want.

        Two, often the user has a very specific vision in mind, which the AI obviously doesn’t know, so the user ends up fighting that.

        Three, the AI is not omniscient, and just fucks shit up, makes goofy mistakes sometimes. Version assumptions, code-compat errors, just weird implementations of shit: the kind of stuff you would expect AI to do that’s going to make it harder to manage code after the fact.

        Unless you’re using AI strictly to write isolated scripts in one particular language, AI is going to fight you at least some of the time.

        • sugar_in_your_tea@sh.itjust.works · 18 hours ago

          I asked an LLM to generate tests for a 10-line function with two arguments, no if branches, and only one library function call. It’s just a for loop and some math. Somehow it invented arguments, and the ones that actually ran didn’t even pass. It made like 5 test functions, spat out paragraphs explaining nonsense, and it still didn’t work.
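
          For scale, here’s a hypothetical function of roughly the shape described (the actual code isn’t in this thread; the name and the math are invented for illustration), plus the kind of one-assert test that should have been easy to produce:

          ```python
          # Hypothetical stand-in: two arguments, a for loop, some math,
          # one library call (math.sqrt), and no if branches.
          import math

          def rms_over_windows(values, window):
              """Root-mean-square of each consecutive window of `values`."""
              results = []
              for i in range(len(values) - window + 1):
                  total = sum(v * v for v in values[i:i + window])
                  results.append(math.sqrt(total / window))
              return results

          def test_rms_over_windows():
              # One window over [3, 4]: sqrt((9 + 16) / 2) = sqrt(12.5)
              assert rms_over_windows([3, 4], 2) == [math.sqrt(12.5)]
          ```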

          This was one of the smaller DeepSeek models, so perhaps a fancier model would do better.

          I’m still messing with it, so maybe I’ll find some tasks it’s good at.

          • KillingTimeItself@lemmy.dbzer0.com · 18 hours ago

            From what I understand, the “preview” models are quite handicapped; that’s usually why the benchmark is the full-fat model. The recent OpenAI one (they have stupid names, idk what is what anymore) had a similar problem.

            If it’s not a preview model, it’s possible a bigger model would help, but usually prompt engineering is going to be more useful. AI is really quick to get confused sometimes.

            • sugar_in_your_tea@sh.itjust.works · 18 hours ago

              It might be, idk, my coworker set it up. It’s definitely a distilled model, though. I did hope it would do a better job on such a small input.

      • nimbledaemon@lemmy.world · 2 hours ago

        I just generated an entire Angular component (table with filters, data services, using in-house software patterns and components, based off of existing work) using Copilot for work yesterday. It didn’t work at first, but I’m a good enough software engineer that I iterated on the issues, discarding bad edits and referencing specific examples from the extant codebase, and got Copilot to fix it. 3-4 days of work (if you were already familiar with the existing way of doing things) done in about 3-4 hours.

        But if you didn’t know what was going on and how to fix it, you’d end up with an unmaintainable, non-functional mess, full of bugs we have specific fixes in place to avoid but Copilot doesn’t care about, because it doesn’t have an idea of how software actually works, just what it should look like.

        So for anything novel or complex you have to feed it an example, then verify it didn’t skip steps or forget to include something it didn’t understand/predict, or make up a library/function call. You have to know enough about the software you’re making to point that stuff out, because just feeding whatever error pops out of your compiler back into the AI may get you to working code, but it won’t ensure quality code, maintainability, or intelligibility.

      • JustAnotherKay@lemmy.world · 20 hours ago

        My first attempt at coding with ChatGPT was asking about saving information to a file with Python. I wanted to know what libraries were available and the syntax to use them.

        It gave me a three-page write-up about how to write a library myself, in Python. Only it had an error on damn near every line, so I still had to go Google the actual libraries and their syntax and slog through documentation.
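
        For reference, the direct answer fits in a handful of lines; a minimal sketch of the usual standard-library options (no third-party libraries needed):

        ```python
        # Saving data to a file with Python's standard library alone.
        import csv
        import json
        import pickle

        data = {"name": "example", "values": [1, 2, 3]}

        # JSON: human-readable structured data
        with open("out.json", "w") as f:
            json.dump(data, f)

        # CSV: tabular rows
        with open("out.csv", "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["name", "values"])
            writer.writerow([data["name"], data["values"]])

        # pickle: arbitrary Python objects (binary, Python-only)
        with open("out.pkl", "wb") as f:
            pickle.dump(data, f)
        ```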

      • surph_ninja@lemmy.world · 21 hours ago

        A lot of people assume their not knowing how to prompt is a failure of the AI. Or they tried it years ago, and assume it’s still as bad as it was.

      • Eheran@lemmy.world · 24 hours ago

        You mean o3-mini? Wasn’t it supposed to be on the level of o1, just much faster and cheaper? I noticed no increase in code quality, perhaps even a decrease. For example, it forgets things far more often, like variables that have a different name. It also easily ignores a bunch of my very specific and enumerated requests.

        • xor@lemmy.dbzer0.com · 24 hours ago

          o3 something… I think the bigger version….
          But I saw a video where it wrote a working game of Snake, and then wrote an AI training algorithm to make an AI that could play Snake… all of the code ran on the first try….
          Could be a lie though, I dunno….

          • Bronzebeard@lemm.ee · 21 hours ago

            Asking it to write a program that already exists in its entirety, with source code publicly posted, and having that work is not impressive.

            That’s just copy-pasting.

            • xor@lemmy.dbzer0.com · 16 hours ago

              He asked it by describing the rules of the game, and then asked it to write an AI to learn the game….
              It’s still basic, but not copy-pasta.

              • Bronzebeard@lemm.ee · 8 hours ago

                These things work by modeling how likely words are to appear next to certain other words. Do you know how many tutorials on how to code those exact rules it must have scanned?
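
                A toy sketch of that idea (real models learn probabilities over tokens with a neural network rather than counting words, but “what tends to follow what” is the core of it):

                ```python
                # Bigram counts: which word tends to follow which.
                from collections import Counter, defaultdict

                corpus = "move the snake eat the food grow the snake".split()

                following = defaultdict(Counter)
                for prev, nxt in zip(corpus, corpus[1:]):
                    following[prev][nxt] += 1

                # The most likely word after "the" is the one seen there most often.
                print(following["the"].most_common(1))  # [('snake', 2)]
                ```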