variables: 736546
| Field | Value |
|---|---|
| id | 736546 |
| name | Coding performance on competitions - state of the art |
| unit | % |
| description | This benchmark measures the accuracy of models in coding competitions based on the APPS benchmark. The APPS benchmark focuses on coding ability and problem-solving in a natural language context. It aims to replicate the evaluation process used for human programmers by presenting coding problems in unrestricted natural language and assessing the correctness of solutions. The coding tasks included in this benchmark are sourced from open-access coding websites such as Codeforces and Kattis. These tasks span a range of difficulty levels, from introductory to collegiate competition level. The benchmark evaluates the accuracy of models in solving programming tasks specifically designed for coding competitions. |
| createdAt | 2023-07-03 14:54:56 |
| updatedAt | 2024-07-08 16:38:15 |
| code | |
| coverage | |
| timespan | |
| datasetId | 6103 |
| sourceId | 29583 |
| shortUnit | % |
| display | { "name": "Coding competitions", "unit": "%", "zeroDay": "2019-01-01", "shortUnit": "%", "yearIsDay": true, "numDecimalPlaces": 1 } |
| columnOrder | 0 |
| originalMetadata | |
| grapherConfigAdmin | |
| shortName | performance_code_any_competition_state_of_the_art |
| catalogPath | grapher/artificial_intelligence/2023-06-14/papers_with_code_benchmarks_state_of_the_art/papers_with_code_benchmarks_state_of_the_art#performance_code_any_competition_state_of_the_art |
| dimensions | |
| schemaVersion | 1 |
| processingLevel | |
| processingLog | |
| titlePublic | |
| titleVariant | |
| attributionShort | |
| attribution | |
| descriptionShort | |
| descriptionFromProducer | |
| descriptionKey | [] |
| descriptionProcessing | |
| licenses | |
| license | |
| grapherConfigETL | |
| type | float |
| sort | [] |
| dataChecksum | e38c6810553c8c10b11ca0cd40cb53f6 |
| metadataChecksum | c1218e88d9542613ef75a81d5a877570 |
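The `display` config sets `"yearIsDay": true` with `"zeroDay": "2019-01-01"`, which in the grapher convention means the variable's "year" column stores day offsets from that zero day rather than calendar years. A minimal sketch of decoding such an offset (the function name and example offsets are illustrative, not part of the record):

```python
from datetime import date, timedelta

# With "yearIsDay": true, stored "year" values are day offsets
# counted from the configured "zeroDay" (here 2019-01-01).
ZERO_DAY = date(2019, 1, 1)

def offset_to_date(day_offset: int) -> date:
    """Convert a stored day offset into the calendar date it represents."""
    return ZERO_DAY + timedelta(days=day_offset)

print(offset_to_date(0))    # 2019-01-01
print(offset_to_date(365))  # 2020-01-01 (2019 has 365 days)
```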