variables: 736527
Data license: CC-BY
Column | Value
---|---
id | 736527
name | Performance on math and problem-solving tasks
unit | %
description | This benchmark assesses the accuracy of models on math and problem-solving tasks, based on the MATH benchmark. The MATH benchmark consists of 12,500 challenging competition mathematics problems. Each problem has a full step-by-step solution, which can be used to teach models to generate answer derivations and explanations.
createdAt | 2023-07-03 14:54:48
updatedAt | 2024-07-08 15:20:30
datasetId | 6102
sourceId | 29574
shortUnit | %
display | { "unit": "%", "zeroDay": "2019-01-01", "shortUnit": "%", "yearIsDay": true, "numDecimalPlaces": 1 }
columnOrder | 0
shortName | performance_math
catalogPath | grapher/artificial_intelligence/2023-06-14/papers_with_code_benchmarks/papers_with_code_benchmarks#performance_math
schemaVersion | 1
descriptionKey | []
type | float
sort | []
dataChecksum | a4f2238d9420d82691a35fdb80133198
metadataChecksum | 4c3dffb26fcfc27eaf7a989b1a2fe3ba
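For illustration, a minimal sketch of how the `display` settings above can be read, assuming the grapher convention that when `yearIsDay` is true the time column stores day offsets from `zeroDay` rather than calendar years (the helper name here is hypothetical, not part of the record):

```python
from datetime import date, timedelta

def day_offset_to_date(offset: int, zero_day: str = "2019-01-01") -> date:
    """Convert a time value to a calendar date under yearIsDay=true.

    `zero_day` defaults to the zeroDay value from this variable's
    display config; a time value of 0 maps to that date.
    """
    y, m, d = map(int, zero_day.split("-"))
    return date(y, m, d) + timedelta(days=offset)

# A time value of 31 would correspond to 2019-02-01 under this reading.
print(day_offset_to_date(31))
```

Under this reading, a chart using this variable would format each point's date from the offset and round displayed values to one decimal place (`numDecimalPlaces`), suffixed with `%`.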