variables: 736547
Data license: CC-BY
id | name | unit | description | createdAt | updatedAt | code | coverage | timespan | datasetId | sourceId | shortUnit | display | columnOrder | originalMetadata | grapherConfigAdmin | shortName | catalogPath | dimensions | schemaVersion | processingLevel | processingLog | titlePublic | titleVariant | attributionShort | attribution | descriptionShort | descriptionFromProducer | descriptionKey | descriptionProcessing | licenses | license | grapherConfigETL | type | sort | dataChecksum | metadataChecksum |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
736547 | Coding performance on interviews - state of the art | % | This benchmark assesses the accuracy of models in coding interviews based on the APPS benchmark. The APPS benchmark focuses on coding ability and problem-solving in a natural language context, simulating the evaluation process employed during human programmer interviews. It presents coding problems in unrestricted natural language and evaluates the correctness of solutions. The coding tasks within this benchmark are sourced from open-access coding websites such as Codeforces and Kattis. These tasks cover a spectrum of difficulty levels, ranging from introductory to collegiate competition level. The benchmark measures the accuracy of models in solving programming tasks specifically tailored for coding interviews. | 2023-07-03 14:54:56 | 2024-07-08 16:38:16 | 6103 | 29583 | % | { "name": "Coding interviews", "unit": "%", "zeroDay": "2019-01-01", "shortUnit": "%", "yearIsDay": true, "numDecimalPlaces": 1 } |
0 | performance_code_any_interview_state_of_the_art | grapher/artificial_intelligence/2023-06-14/papers_with_code_benchmarks_state_of_the_art/papers_with_code_benchmarks_state_of_the_art#performance_code_any_interview_state_of_the_art | 1 | [] |
float | [] |
44c47793e87683e34b61c0e2880952b6 | b72a908de22a8f69842216484807e768 |
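
The `display` value in the row above sets `yearIsDay` to true with a `zeroDay` of 2019-01-01 and `numDecimalPlaces` of 1. The following is a minimal sketch of how such a config might be read. It assumes, which this record does not state, that time values are stored as integer day offsets from `zeroDay`. The helper names `offset_to_date` and `format_value` and the sample inputs are hypothetical and are not taken from the dataset.

```python
import json
from datetime import date, timedelta

# The display config, copied verbatim from the row above.
display = json.loads(
    '{ "name": "Coding interviews", "unit": "%", "zeroDay": "2019-01-01", '
    '"shortUnit": "%", "yearIsDay": true, "numDecimalPlaces": 1 }'
)


def offset_to_date(offset: int, config: dict) -> date:
    """Convert a day offset to a calendar date.

    Assumption: when yearIsDay is true, the time value counts days
    elapsed since zeroDay.
    """
    if not config.get("yearIsDay"):
        raise ValueError("config does not use day offsets")
    return date.fromisoformat(config["zeroDay"]) + timedelta(days=offset)


def format_value(value: float, config: dict) -> str:
    """Format a value with the configured decimal places and short unit."""
    places = config["numDecimalPlaces"]
    return f"{value:.{places}f}{config['shortUnit']}"


# Hypothetical inputs for illustration only, not values from the dataset.
print(offset_to_date(365, display))
print(format_value(50.0, display))
```

Keeping the conversion in a small function that reads `zeroDay` from the config, rather than hard-coding the date, lets the same code handle other variables that use a different reference day.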