  Benchmarking large language models for bio-image analysis code generation

Haase, R., Tischer, C., & Scherf, N. (2024). Benchmarking large language models for bio-image analysis code generation. bioRxiv. doi:10.1101/2024.04.19.590278.


Files

Haase_pre_v2.pdf (Preprint), 3MB
Name: Haase_pre_v2.pdf
Description: -
OA status: Green
Visibility: Public
MIME type / checksum: application/pdf / [MD5]
Technical metadata:
Copyright date: -
Copyright info: -

Creators

Creators:
Haase, Robert, Author
Tischer, Christian, Author
Scherf, Nico [1], Author
Affiliations:
[1] Method and Development Group Neural Data Science and Statistical Computing, MPI for Human Cognitive and Brain Sciences, Max Planck Society, ou_3282987

Content

Keywords: -
Abstract: In the computational age, life scientists often have to write Python code to solve bio-image analysis (BIA) problems. Many of them, however, have not been formally trained in programming. Code generation, or coding assistance in general, with Large Language Models (LLMs) can have a clear impact on BIA. To the best of our knowledge, the quality of the generated code in this domain has not been studied. We present a quantitative benchmark to estimate the capability of LLMs to generate code for solving common BIA tasks. Our benchmark currently consists of 57 human-written prompts with corresponding reference solutions in Python, and unit tests to evaluate the functional correctness of potential solutions. We demonstrate our benchmark here and compare 6 state-of-the-art LLMs. To ensure that the benchmark covers most of our community's needs, we also outline mid- and long-term strategies for the BIA open-source community to maintain and extend it. This work should support users in choosing an LLM and also guide LLM developers in improving the capabilities of LLMs in the BIA domain.
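
The abstract describes each benchmark entry as a human-written prompt, a Python reference solution, and unit tests that judge functional correctness. The following is a minimal sketch of what such an entry and its evaluation could look like. It is not taken from the preprint: the task, the names `BenchmarkCase`, `passes`, and `pass_rate`, and the plain nested-list "label image" are all hypothetical and chosen only to keep the example self-contained.

```python
# Hypothetical illustration of a prompt / reference solution / unit-test triple.
# None of these names or the task itself come from the preprint.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BenchmarkCase:
    prompt: str               # function signature plus docstring shown to the LLM
    reference_solution: str   # human-written function body
    entry_point: str          # name of the function the unit test calls
    check: Callable[[Callable], None]  # unit test; raises on failure


def _check_count_labels(candidate: Callable) -> None:
    # Unit test: three distinct non-zero labels, then an empty image.
    assert candidate([[0, 1, 1], [0, 0, 2], [3, 0, 2]]) == 3
    assert candidate([[0, 0], [0, 0]]) == 0


CASE = BenchmarkCase(
    prompt=(
        "def count_labels(label_image):\n"
        '    """Return the number of distinct non-zero labels in a 2D label image."""\n'
    ),
    reference_solution=(
        "    return len({v for row in label_image for v in row if v != 0})\n"
    ),
    entry_point="count_labels",
    check=_check_count_labels,
)


def passes(case: BenchmarkCase, completion: str) -> bool:
    """Execute prompt + completion and report whether the unit test passes."""
    namespace: dict = {}
    try:
        # Untrusted generated code should be sandboxed in practice;
        # a bare exec is used here only to keep the sketch short.
        exec(case.prompt + completion, namespace)
        case.check(namespace[case.entry_point])
        return True
    except Exception:
        return False


def pass_rate(case: BenchmarkCase, completions: List[str]) -> float:
    """Fraction of sampled completions that pass the unit test."""
    return sum(passes(case, c) for c in completions) / len(completions)
```

Under these assumptions, running `passes(CASE, CASE.reference_solution)` verifies the reference solution against its own unit test, and `pass_rate` over several sampled completions gives a simple per-task score that could be averaged across tasks to compare models.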

Details

Language(s): eng - English
Date: 2024-04-25
Publication status: Published online
Pages: -
Place, publisher, edition: -
Table of contents: -
Review type: -
Identifiers: DOI: 10.1101/2024.04.19.590278
Degree type: -

Source 1

Title: bioRxiv
Source genre: Website
Creators:
Affiliations:
Place, publisher, edition: -
Pages: - Volume / Issue: - Article number: - Start / End page: - Identifier: -