variables
Data license: CC-BY
6 rows where datasetId = 6571 sorted by id descending
id: 935637
name: Training computation (petaFLOP)
unit: petaFLOP
createdAt: 2024-06-19 14:36:00
updatedAt: 2024-07-08 16:38:43
datasetId: 6571 (Tracking Compute-Intensive AI Models)
display: { "unit": "petaFLOP", "zeroDay": "1949-01-01", "yearIsDay": true, "numDecimalPlaces": 0 }
columnOrder: 0
shortName: training_computation_petaflop
catalogPath: grapher/artificial_intelligence/2024-06-19/epoch_compute_intensive/epoch_compute_intensive#training_computation_petaflop
schemaVersion: 2
processingLevel: major
descriptionShort: Computation is measured in total petaFLOP, which is 10¹⁵ [floating-point operations](#dod:flop), estimated from the AI literature with some uncertainty.
descriptionKey:
- In artificial intelligence (AI), training computation is predominantly measured in floating-point operations, or "FLOP". One FLOP represents a single arithmetic operation on floating-point numbers, such as an addition, subtraction, multiplication, or division. Because AI systems have vast computational demands, the unit petaFLOP, equal to one quadrillion (10¹⁵) FLOP, is commonly used.
- Modern AI systems are rooted in machine learning and deep learning techniques, which are computationally intensive: during training, a model processes large volumes of data while continuously adjusting its parameters to optimize performance.
- Many factors influence the magnitude of training computation. The size of the training dataset matters, since larger datasets require more processing. The complexity of the model's architecture also plays a pivotal role, as more intricate models require more computation, and parallel processing across multiple processors has a substantial effect. Specific design choices and other variables contribute further.
descriptionProcessing: Training computation was converted from its original measurement in FLOP to the more manageable unit of petaFLOP by dividing the original value by 1e15 (one quadrillion, 10¹⁵). Expressing training computation in petaFLOP gives a more readable sense of the scale of the computational resources required to train these systems, especially for large datasets and complex architectures.
grapherConfigETL: { "note": "Confirmed large-scale AI models are those where the training compute exceeds 10²³ floating-point operations (FLOP).", "title": "Training computation" }
type: float
sort: []
dataChecksum: cc4703e34ef1b579bb10f61ae22f936b
metadataChecksum: db56ac11e8f4cd8f006ac9ad4afe4f22

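The display config and the descriptionProcessing note above together determine how a stored value is rendered: the time dimension is a count of days since zeroDay (1949-01-01, with yearIsDay true), and raw FLOP counts were divided by 1e15 upstream. A minimal SQLite sketch of both transformations, using made-up literal values (3e23 FLOP, and day offset 27479, the upper bound of the timespan on the publication-date variable below):

-- Illustrative literals only; the real conversion happens in the ETL.
SELECT
  3.0e23 / 1e15 AS training_computation_petaflop,    -- 3e23 FLOP -> 3e8 petaFLOP
  date('1949-01-01', '+27479 days') AS decoded_day;  -- yearIsDay offset -> 2024-03-27
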
id: 935636
name: Number of parameters
createdAt: 2024-06-19 14:36:00
updatedAt: 2024-07-08 16:38:43
datasetId: 6571 (Tracking Compute-Intensive AI Models)
display: { "zeroDay": "1949-01-01", "yearIsDay": true, "numDecimalPlaces": 0 }
columnOrder: 0
shortName: parameters
catalogPath: grapher/artificial_intelligence/2024-06-19/epoch_compute_intensive/epoch_compute_intensive#parameters
schemaVersion: 2
processingLevel: major
descriptionShort: Total number of learnable variables or weights that the model contains. Parameters are adjusted during the training process to optimize the model's performance.
descriptionKey:
- Parameters are internal variables that machine learning models adjust during training to improve their ability to make accurate predictions. They act as the model's "knobs", fine-tuned based on the provided data. In deep learning, a subset of artificial intelligence (AI), parameters consist primarily of the weights assigned to the connections between the small processing units called neurons. Picture a vast network of interconnected neurons where the strength of each connection is a parameter.
- The total number of parameters in a model is influenced by various factors. The model's structure and its number of "layers" of neurons play a significant role; more complex models with additional layers tend to have more parameters. Special components of specific deep learning architectures can further add to the overall parameter count.
- Understanding the number of parameters in a model is crucial to designing effective models. More parameters can help the model capture complex data patterns, potentially leading to higher accuracy. However, there is a balance to strike: a model with too many parameters risks memorizing the specific examples in its training data rather than learning their underlying patterns, and may then perform poorly on new, unseen data. Achieving the right balance of parameters is a critical consideration in model development.
- Recently, the AI community has seen the emergence of "giant models" with billions or even trillions of parameters. While these models have achieved remarkable performance, they come at a significant computational cost, and effectively managing and training them has become a prominent and active area of research within the AI field.
grapherConfigETL: { "note": "Confirmed large-scale AI models are those where the training compute exceeds 10²³ floating-point operations (FLOP)." }
type: int
sort: []
dataChecksum: e27c885c33f029aed4d2574721d24ca1
metadataChecksum: c63ec5c1d3a64f1a07290a47176f0ca0

id: 935635
name: Training dataset size
unit: datapoints
createdAt: 2024-06-19 14:36:00
updatedAt: 2024-07-08 16:38:43
datasetId: 6571 (Tracking Compute-Intensive AI Models)
display: { "unit": "datapoints", "zeroDay": "1949-01-01", "yearIsDay": true, "numDecimalPlaces": 0 }
columnOrder: 0
shortName: training_dataset_size__datapoints
catalogPath: grapher/artificial_intelligence/2024-06-19/epoch_compute_intensive/epoch_compute_intensive#training_dataset_size__datapoints
schemaVersion: 2
processingLevel: major
descriptionShort: The number of examples provided to train an AI model. Typically, more data results in a more comprehensive understanding by the model.
descriptionKey:
- Training data size refers to the volume of data used to train an artificial intelligence (AI) model: the number of examples the model learns from during training. It is a fundamental measure of the scope of the data used in the model's learning phase.
- To grasp the concept, imagine teaching a friend to distinguish different types of birds. Each bird picture shown to your friend corresponds to one piece of training data. If you showed them 100 unique bird photos, the training data size would be 100.
- Training data size is an essential indicator in AI and machine learning. It directly affects the depth of learning: the more extensive the dataset, the more comprehensive the model's understanding of the subject matter becomes. A large training data size also contributes to improved recognition capabilities, since exposure to a diverse array of examples helps the model identify subtle nuances, much as exposure to a wide variety of bird images builds skill at distinguishing species.
grapherConfigETL: { "note": "Confirmed large-scale AI models are those where the training compute exceeds 10²³ floating-point operations (FLOP)." }
type: int
sort: []
dataChecksum: ce9794088bd67cebb772ba5bc0e2c603
metadataChecksum: f5704ce08886df6236f8023974c06d70

id: 935634
name: Researcher affiliation
createdAt: 2024-06-19 14:36:00
updatedAt: 2024-07-08 16:38:43
timespan: 24837-27479
datasetId: 6571 (Tracking Compute-Intensive AI Models)
display: {}
columnOrder: 0
shortName: organization_categorization
catalogPath: grapher/artificial_intelligence/2024-06-19/epoch_compute_intensive/epoch_compute_intensive#organization_categorization
schemaVersion: 2
processingLevel: major
descriptionShort: Describes the sector where the authors of an AI system have their primary affiliations.
descriptionFromProducer: Systems are categorized as “Industry” if their authors are affiliated with private sector organizations, “Academia” if the authors are affiliated with universities or academic institutions, or “Industry - Academia Collaboration” when at least 30% of the authors are from each.
descriptionKey: []
grapherConfigETL: { "note": "Confirmed large-scale AI models are those where the training compute exceeds 10²³ floating-point operations (FLOP)." }
type: string
sort: []
dataChecksum: 3712ab8e4d2e4d16c27da72f30f92397
metadataChecksum: 7f17f649b0679eedc1e6c6c6f7ee64ca

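The producer's categorization rule above is mechanical enough to express as a query. A sketch in SQLite, where the models table and its industry_authors/academia_authors counts are hypothetical names invented for illustration, not part of this schema; the majority tie-break below the 30% threshold is likewise an assumption, not stated by the producer:

-- Hypothetical input: models(model, industry_authors, academia_authors).
SELECT
  model,
  CASE
    -- collaboration: at least 30% of authors from each sector
    WHEN 1.0 * industry_authors / (industry_authors + academia_authors) >= 0.3
     AND 1.0 * academia_authors / (industry_authors + academia_authors) >= 0.3
      THEN 'Industry - Academia Collaboration'
    -- otherwise assign the majority sector (assumed tie-break)
    WHEN industry_authors >= academia_authors THEN 'Industry'
    ELSE 'Academia'
  END AS organization_categorization
FROM models;
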
id: 935633
name: Publication date
createdAt: 2024-06-19 14:36:00
updatedAt: 2024-07-08 16:38:43
timespan: 24837-27479
datasetId: 6571 (Tracking Compute-Intensive AI Models)
display: {}
columnOrder: 0
shortName: publication_date
catalogPath: grapher/artificial_intelligence/2024-06-19/epoch_compute_intensive/epoch_compute_intensive#publication_date
schemaVersion: 2
processingLevel: major
descriptionShort: The date when the AI system was first published.
descriptionFromProducer: The publication, announcement, or release date of the model, in YYYY-MM-DD format. If the year and month are known but the day is unknown, the day is filled in as YYYY-MM-15. If the year is known but the month and day are unknown, the month and day are filled in as YYYY-07-01.
descriptionKey: []
grapherConfigETL: { "note": "Confirmed large-scale AI models are those where the training compute exceeds 10²³ floating-point operations (FLOP)." }
type: string
sort: []
dataChecksum: e098e2e893a42bd0b2d27ab103b4faa1
metadataChecksum: 0794c867d328a876630b11c527d6e873

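The producer's date-imputation rule translates directly into a CASE expression. A sketch in SQLite, assuming a hypothetical raw table with nullable pub_year/pub_month/pub_day columns (invented names for illustration):

-- Hypothetical input: raw_models(pub_year, pub_month, pub_day), NULL where unknown.
SELECT
  CASE
    WHEN pub_month IS NULL THEN printf('%04d-07-01', pub_year)             -- only the year is known
    WHEN pub_day IS NULL THEN printf('%04d-%02d-15', pub_year, pub_month)  -- day unknown
    ELSE printf('%04d-%02d-%02d', pub_year, pub_month, pub_day)
  END AS publication_date
FROM raw_models;
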
id: 935632
name: Domain
createdAt: 2024-06-19 14:36:00
updatedAt: 2024-07-08 16:38:43
datasetId: 6571 (Tracking Compute-Intensive AI Models)
display: { "zeroDay": "1949-01-01", "yearIsDay": true }
columnOrder: 0
shortName: domain
catalogPath: grapher/artificial_intelligence/2024-06-19/epoch_compute_intensive/epoch_compute_intensive#domain
schemaVersion: 2
processingLevel: major
descriptionShort: Refers to the specific area, application, or field in which an AI system is designed to operate.
descriptionKey: []
grapherConfigETL: { "note": "Confirmed large-scale AI models are those where the training compute exceeds 10²³ floating-point operations (FLOP)." }
type: string
sort: []
dataChecksum: 579d460f3bba0e80e5f863e98ba1f96b
metadataChecksum: 9fd0e8f4b98697edf573f23e7c129222

CREATE TABLE "variables" (
  "id" INTEGER PRIMARY KEY AUTOINCREMENT,
  "name" VARCHAR(750) NULL,
  "unit" VARCHAR(255) NOT NULL,
  "description" TEXT NULL,
  "createdAt" DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
  "updatedAt" DATETIME NULL,
  "code" VARCHAR(255) NULL,
  "coverage" VARCHAR(255) NOT NULL,
  "timespan" VARCHAR(255) NOT NULL,
  "datasetId" INTEGER NOT NULL,
  "sourceId" INTEGER NULL,
  "shortUnit" VARCHAR(255) NULL,
  "display" TEXT NOT NULL,
  "columnOrder" INTEGER NOT NULL DEFAULT '0',
  "originalMetadata" TEXT NULL,
  "grapherConfigAdmin" TEXT NULL,
  "shortName" VARCHAR(255) NULL,
  "catalogPath" VARCHAR(767) NULL,
  "dimensions" TEXT NULL,
  "schemaVersion" INTEGER NOT NULL DEFAULT '1',
  "processingLevel" VARCHAR(30) NULL,
  "processingLog" TEXT NULL,
  "titlePublic" VARCHAR(512) NULL,
  "titleVariant" VARCHAR(255) NULL,
  "attributionShort" VARCHAR(512) NULL,
  "attribution" TEXT NULL,
  "descriptionShort" TEXT NULL,
  "descriptionFromProducer" TEXT NULL,
  "descriptionKey" TEXT NULL,
  "descriptionProcessing" TEXT NULL,
  "licenses" TEXT NULL,
  "license" TEXT NULL,
  "grapherConfigETL" TEXT NULL,
  "type" TEXT NULL,
  "sort" TEXT NULL,
  "dataChecksum" VARCHAR(64) NULL,
  "metadataChecksum" VARCHAR(64) NULL,
  FOREIGN KEY("datasetId") REFERENCES "datasets" ("id") ON UPDATE RESTRICT ON DELETE RESTRICT,
  FOREIGN KEY("sourceId") REFERENCES "sources" ("id") ON UPDATE RESTRICT ON DELETE RESTRICT
);
CREATE UNIQUE INDEX "idx_catalogPath" ON "variables" ("catalogPath");
CREATE UNIQUE INDEX "unique_short_name_per_dataset" ON "variables" ("shortName", "datasetId");
CREATE UNIQUE INDEX "variables_code_fk_dst_id_7bde8c2a_uniq" ON "variables" ("code", "datasetId");
CREATE INDEX "variables_datasetId_50a98bfd_fk_datasets_id" ON "variables" ("datasetId");
CREATE UNIQUE INDEX "variables_name_fk_dst_id_f7453c33_uniq" ON "variables" ("name", "datasetId");
CREATE INDEX "variables_sourceId_31fce80a_fk_sources_id" ON "variables" ("sourceId");
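
For reference, the view on this page ("6 rows where datasetId = 6571 sorted by id descending") corresponds to:

SELECT *
FROM variables
WHERE datasetId = 6571
ORDER BY id DESC;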