variables: 935635
Data license: CC-BY
This data is also available as JSON.
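Datasette serves any row as JSON when `.json` is appended to the row's URL. A minimal fetch sketch under that assumption; the host below is a placeholder, so substitute the Datasette instance that actually serves this table:

```python
import json
import urllib.request

# Hypothetical Datasette host; substitute the instance serving this table.
url = "https://datasette.example.org/owid/variables/935635.json"

with urllib.request.urlopen(url) as resp:
    payload = json.load(resp)

# Datasette row pages return the matching row(s) under a "rows" key.
print(payload["rows"][0]["name"])  # -> "Training dataset size"
```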
| Field | Value |
| --- | --- |
| id | 935635 |
| name | Training dataset size |
| unit | datapoints |
| description | |
| createdAt | 2024-06-19 14:36:00 |
| updatedAt | 2024-07-08 16:38:43 |
| code | |
| coverage | |
| timespan | |
| datasetId | 6571 |
| sourceId | |
| shortUnit | |
| display | `{"unit": "datapoints", "zeroDay": "1949-01-01", "yearIsDay": true, "numDecimalPlaces": 0}` |
| columnOrder | 0 |
| originalMetadata | |
| grapherConfigAdmin | |
| shortName | training_dataset_size__datapoints |
| catalogPath | grapher/artificial_intelligence/2024-06-19/epoch_compute_intensive/epoch_compute_intensive#training_dataset_size__datapoints |
| dimensions | |
| schemaVersion | 2 |
| processingLevel | major |
| processingLog | |
| titlePublic | |
| titleVariant | |
| attributionShort | |
| attribution | |
| descriptionShort | The number of examples provided to train an AI model. Typically, more data results in a more comprehensive understanding by the model. |
| descriptionFromProducer | |
| descriptionKey | (see list below) |
| descriptionProcessing | |
| licenses | |
| license | |
| grapherConfigETL | `{"note": "Confirmed large-scale AI models are those where the training compute exceeds 10²³ floating-point operations (FLOP)."}` |
| type | int |
| sort | [] |
| dataChecksum | ce9794088bd67cebb772ba5bc0e2c603 |
| metadataChecksum | f5704ce08886df6236f8023974c06d70 |

descriptionKey:

- "Training data size refers to the volume of data employed to train an artificial intelligence (AI) model effectively. It's a representation of the number of examples that the model learns from during its training process. It is a fundamental measure of the scope of the data used in the model's learning phase."
- "To grasp the concept of training data size, imagine teaching a friend the art of distinguishing different types of birds. In this analogy, each bird picture presented to your friend corresponds to an individual piece of training data. If you showed them 100 unique bird photos, then the training data size in this scenario would be quantified as 100."
- "Training data size is an essential indicator in AI and machine learning. First and foremost, it directly impacts the depth of learning achieved by the model. The more extensive the dataset, the more profound and comprehensive the model's understanding of the subject matter becomes. Additionally, a large training data size contributes significantly to improved recognition capabilities. By exposing the model to a diverse array of examples, it becomes adept at identifying subtle nuances, much like how it becomes skilled at distinguishing various bird species through exposure to a large variety of bird images."
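The `display` config follows the grapher convention that when `yearIsDay` is true, the integer stored in the year column is a day offset from `zeroDay` rather than a calendar year. A minimal decoding sketch under that assumption; `year_to_date` and the sample offset 27000 are illustrative, not part of this row:

```python
from datetime import date, timedelta

# display config copied from this variable's row above
display = {
    "unit": "datapoints",
    "zeroDay": "1949-01-01",
    "yearIsDay": True,
    "numDecimalPlaces": 0,
}

def year_to_date(year_value: int, display: dict) -> date:
    """Interpret a stored 'year' value as a day offset from zeroDay
    when yearIsDay is set, per the grapher display convention."""
    if display.get("yearIsDay"):
        zero_day = date.fromisoformat(display["zeroDay"])
        return zero_day + timedelta(days=year_value)
    raise ValueError("yearIsDay is not set; the value is an ordinary year")

# e.g. a data point stored with year=27000 corresponds to:
print(year_to_date(27000, display))  # 2022-12-04 (27000 days after 1949-01-01)
```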