COBOLEval: How Well Can Large Language Models (LLMs) Write Code?

(4G)

Stream: Virtual Room 4
Time: 09:00 - 10:00

Presentation

LLMs are rapidly changing the way we write software. Over a million developers now pay for GitHub Copilot, and recent breakthroughs in LLM reasoning have brought the dream of a fully autonomous AI software engineer closer to reality. But while it’s not hard to find a demo of an LLM coding a website or a clone of Flappy Bird, little is known about their ability to write COBOL.

We've developed a new benchmark to evaluate the ability of LLMs to write COBOL. This presentation walks through the rationale, our design decisions, and the results for several popular LLMs.

Speakers

Gabriel Gordon-Hall at bloop AI Limited

Cofounder & CTO @ bloop

Email: gabriel@bloop.ai
