variables: 736529
Data license: CC-BY
id | name | unit | description | createdAt | updatedAt | code | coverage | timespan | datasetId | sourceId | shortUnit | display | columnOrder | originalMetadata | grapherConfigAdmin | shortName | catalogPath | dimensions | schemaVersion | processingLevel | processingLog | titlePublic | titleVariant | attributionShort | attribution | descriptionShort | descriptionFromProducer | descriptionKey | descriptionProcessing | licenses | license | grapherConfigETL | type | sort | dataChecksum | metadataChecksum |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
736529 | Accuracy on other knowledge tests | % | This benchmark assesses the average accuracy of models across subjects other than STEM, humanities, and social sciences, based on the MMLU benchmark. The MMLU benchmark covers 57 subjects, including STEM, humanities, social sciences, and more. The subjects vary in difficulty, from elementary concepts to advanced professional topics. The benchmark assesses not only world knowledge but also problem-solving ability. | 2023-07-03 14:54:48 | 2024-07-08 15:20:31 | 6102 | 29574 | % | { "unit": "%", "zeroDay": "2019-01-01", "shortUnit": "%", "yearIsDay": true, "numDecimalPlaces": 1 } |
0 | performance_other | grapher/artificial_intelligence/2023-06-14/papers_with_code_benchmarks/papers_with_code_benchmarks#performance_other | 1 | [] |
float | [] |
774883813cce879e7ac5d7a67fdc1df0 | 6432122c3d15ef28b79e639d497dbf8c |