A (Very) Brief History of AI
Pre-Dartmouth
As early as the mid-19th century, Charles Babbage designed the Analytical Engine, a mechanical general-purpose computer, and Ada Lovelace wrote the most detailed early account of its capabilities. Lovelace is often credited with the idea of a machine that could manipulate symbols in accordance with rules and act upon things other than just numbers, touching upon concepts central to AI.
In 1943, Warren McCulloch and Walter Pitts publish their paper “A Logical Calculus of the Ideas Immanent in Nervous Activity”, proposing the first mathematical model of a neural network. Their work combines principles of logic and biology to conceptualize how neurons in the brain might work and lays the foundation for future research in neural networks.
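To make the idea concrete, here is a minimal sketch in Python of a McCulloch-Pitts-style threshold neuron. The weights and thresholds below are illustrative choices, not values taken from the 1943 paper.

```python
# A minimal sketch of a McCulloch-Pitts-style threshold neuron.
# Inputs and output are binary; the neuron "fires" (outputs 1) only when
# the weighted sum of its inputs reaches a fixed threshold.
# Weights and thresholds are illustrative, not from the original paper.

def mcculloch_pitts_neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of binary inputs meets the threshold."""
    activation = sum(i * w for i, w in zip(inputs, weights))
    return 1 if activation >= threshold else 0

# With unit weights and a threshold of 2, the unit behaves like a logical AND:
print(mcculloch_pitts_neuron([1, 1], [1, 1], threshold=2))  # 1 (both inputs active)
print(mcculloch_pitts_neuron([1, 0], [1, 1], threshold=2))  # 0 (only one input active)

# Lowering the threshold to 1 turns the same unit into a logical OR:
print(mcculloch_pitts_neuron([1, 0], [1, 1], threshold=1))  # 1
```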
Five years later, Norbert Wiener’s book “Cybernetics” introduces the study of control and communication in the animal and the machine, which is closely related to AI. His work is influential in the development of robotics and the understanding of complex systems.
Then, in 1950, one of the fathers of modern computer science, Alan Turing, presents a seminal paper, “Computing Machinery and Intelligence”, asking the question: “Can machines think?” He proposes what is now known as the Turing Test, a criterion for establishing intelligence in a machine. Turing’s ideas about machine learning, artificial intelligence, and the nature of consciousness are foundational to the field.
In the late 1940s and early 1950s, the development of the first electronic computers provides the necessary hardware basis for AI research. The creation of the first computer programs that can perform tasks such as playing checkers or solving logic problems lays the groundwork for AI.
Dartmouth Conference (1956): The birth of AI
At Dartmouth College in Hanover, New Hampshire, a conference is organized by John McCarthy, Marvin Minsky, Nathaniel Rochester, and Claude Shannon, some of the leading figures in the field of computer science. Their objective is to explore how machines could be made to simulate aspects of human intelligence, a groundbreaking concept at the time: the idea that learning and other features of intelligence could be described so precisely that a machine could be made to simulate them. The term “artificial intelligence” is coined, and the gathering comes to be seen as the official genesis of research in the field.
1950s-1960s
Developed in 1956, Logic Theorist, often cited as the first AI program, is able to prove mathematical theorems. In 1957, Frank Rosenblatt develops the perceptron, an early neural network later implemented in hardware as the Mark I Perceptron; it can perform simple pattern recognition tasks and learn from data.
In 1966, ELIZA, an early natural language processing program, is created by Joseph Weizenbaum. An ancestor of ChatGPT, the program can mimic human conversation. Nearly 60 years later, it will outperform OpenAI’s GPT-3.5 in a Turing Test study.
First AI Winter (1974-1980)
The field experiences its first major setback due to inflated expectations and subsequent disappointment in AI capabilities, leading to reduced funding and interest.
1980s
The 1980s sees the revival and rise of machine learning and a shift from rule-based to learning systems. Researchers start focusing more on creating algorithms that can learn from data rather than relying solely on hardcoded rules. Other methods, such as decision trees and reinforcement learning, are developed and refined during this period.
There is a renewed interest in neural networks, particularly with the advent of the backpropagation algorithm which enables more effective training of multi-layer networks. This is a precursor to the deep learning revolution to come later.
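As a rough illustration of what backpropagation does, the sketch below trains a tiny two-layer network on the XOR problem, a task a single-layer perceptron cannot solve. The architecture, learning rate, and epoch count are arbitrary illustrative choices, not a reconstruction of any particular 1980s system.

```python
# A minimal sketch of backpropagation: a 2-4-1 sigmoid network learning XOR.
import numpy as np

rng = np.random.default_rng(0)

# XOR inputs and targets
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Randomly initialized weights for a 2-4-1 network
W1 = rng.normal(size=(2, 4))
b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1))
b2 = np.zeros((1, 1))
lr = 0.5

for epoch in range(10000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)      # hidden layer activations
    out = sigmoid(h @ W2 + b2)    # network output

    # Backward pass: propagate the output error back through the layers
    d_out = (out - y) * out * (1 - out)     # gradient at the output
    d_h = (d_out @ W2.T) * h * (1 - h)      # gradient at the hidden layer

    # Gradient-descent weight updates
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(out.round(3))  # after training, outputs are typically close to [[0], [1], [1], [0]]
```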
The Second AI Winter (late 1980s-1990s)
By the late 1980s, the limitations of existing AI technologies, particularly expert systems, become apparent. They are brittle, expensive, and unable to handle complex reasoning or generalize beyond their narrow domain of expertise.
Disillusionment with the field’s limited progress and the failure of major initiatives like Japan’s Fifth Generation Computer Systems project lead to a reduction in government and industry funding, and a general decline in interest in AI research.
1990s
The 1990s sees a resurgence of interest in AI and a ramp-up in investment by tech firms seeking to leverage a number of positive trends:
- The development of improved machine learning algorithms, particularly in the field of neural networks.
- Rapid advances in computational power, later amplified by the development and availability of Graphics Processing Units (GPUs), which dramatically increase the capacity to process large datasets and run complex algorithms.
- An explosion of data thanks to the growth of the internet and the digitalization of many aspects of life. As has become increasingly clear, large datasets are crucial for training more sophisticated AI models, particularly in areas like natural language processing.
- AI re-enters the public imagination, fueled by popular culture and 1997’s highly publicized defeat of world chess champion Garry Kasparov by IBM’s Deep Blue. This is a watershed moment, proving that computers can outperform humans in specific tasks.
AI’s Renaissance: 2000s onwards
The 21st century sees an acceleration of AI development and output. Researchers like Geoffrey Hinton, Yoshua Bengio, and Yann LeCun lead breakthroughs in deep learning. The development of Convolutional Neural Networks (CNNs) for image processing and Recurrent Neural Networks (RNNs) for sequence analysis revolutionizes AI capabilities, particularly in vision and language processing.
The explosion of big data, combined with significant increases in computational power, enables the training of large, complex AI models, making tasks like image and speech recognition, and natural language processing, more accurate and efficient.
A leap forward occurs in 2011 with IBM’s Watson winning Jeopardy! This is an important victory, demonstrating Watson’s prowess not just in computational skills, as in chess, but also in understanding and processing natural language.
Generative Adversarial Networks (GANs) are introduced by Ian Goodfellow and his colleagues in 2014. The fundamental innovation of GANs lies in their architecture, which consists of two neural networks: a generator and a discriminator. The two networks engage in a continuous adversarial process in which the generator creates data and the discriminator evaluates it. The technology is a game-changer for AI’s ability to generate realistic and creative content, laying the foundation for DALL-E, Midjourney, and other visual content generation apps. It also opens the gateway to deepfakes.
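The adversarial setup can be sketched in a few lines. The toy example below, which assumes PyTorch is available, trains a generator to mimic a simple 1-D Gaussian while a discriminator tries to tell real samples from generated ones; the network sizes, learning rates, and data distribution are illustrative choices only, not the original paper’s setup.

```python
# A minimal sketch of the GAN idea: generator vs. discriminator on toy 1-D data.
import torch
import torch.nn as nn

torch.manual_seed(0)

generator = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))
discriminator = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())

g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
loss_fn = nn.BCELoss()

for step in range(2000):
    # "Real" data: samples from a Gaussian with mean 4.
    real = torch.randn(64, 1) + 4.0
    noise = torch.randn(64, 8)
    fake = generator(noise)

    # Discriminator update: label real samples 1, generated samples 0.
    d_loss = loss_fn(discriminator(real), torch.ones(64, 1)) + \
             loss_fn(discriminator(fake.detach()), torch.zeros(64, 1))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator update: try to make the discriminator label fakes as real.
    g_loss = loss_fn(discriminator(fake), torch.ones(64, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

# The mean of generated samples drifts toward the real mean (~4) as training progresses.
print(generator(torch.randn(1000, 8)).mean().item())
```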
In 2015, Google’s DeepDream utilizes neural networks to produce dream-like images by amplifying patterns in pictures.
Also in 2015, Google DeepMind’s AlphaGo, which combines deep reinforcement learning with Monte Carlo tree search, defeats European champion Fan Hui; the following year it beats world champion Lee Sedol. Go is a complex game with a vast number of possible positions, requiring a more nuanced strategy than chess, and AlphaGo’s victories demonstrate the potential of neural networks and machine learning.
2017: Google introduces a novel approach to natural language processing (NLP) with transformers, a type of neural network architecture that significantly improves the efficiency and effectiveness of learning patterns in sequences of data, particularly language.
This innovation lays the groundwork for OpenAI’s GPT-1, released in 2018.
OpenAI unveils GPT-2 in 2019. This enhanced version is capable of generating coherent and contextually relevant text over longer passages. Its release is initially staggered due to concerns about potential misuse for generating misleading information.
GPT-3’s release in 2020 marks another significant advancement. Its scale is unprecedented, and with 175 billion parameters it demonstrates a remarkable ability to generate human-like text. This is a major leap in the size and complexity of language models.
In 2021, OpenAI releases DALL-E, which uses a modified version of GPT-3 to generate highly creative and often whimsical images from textual descriptions. This is another significant advancement in the field of AI-driven art and image synthesis.
The launch of ChatGPT in late 2022, built on the GPT-3.5 model, revolutionizes the field. ChatGPT, focusing on conversational tasks, gains immense popularity due to its accessibility, affordability (free of charge), and user-friendly interface.
In March 2023, GPT-4 is released, representing the most advanced public-facing large language model (LLM) developed to date.
For 30+ years, I've been committed to protecting people, businesses, and the environment from the physical harm caused by cyber-kinetic threats, blending cybersecurity strategies with resilience and safety measures. Lately, my worries have grown due to the rapid, complex advancements in Artificial Intelligence (AI). Having observed AI's progression for two decades and penned a book on its future, I see it as a unique and escalating threat, especially when applied to military systems or disinformation, or integrated into critical infrastructure like 5G networks or smart grids. More about me, and about Defence.AI.
Luka Ivezic
Luka Ivezic is the Lead Cybersecurity Consultant for Europe at the Information Security Forum (ISF), a leading global, independent, and not-for-profit organisation dedicated to cybersecurity and risk management. Before joining ISF, Luka served as a cybersecurity consultant and manager at PwC and Deloitte. His journey in the field began as an independent researcher focused on the cyber and geopolitical implications of emerging technologies such as AI, IoT, and 5G. He co-authored the book "The Future of Leadership in the Age of AI" with Marin. Luka holds a Master's degree from King's College London's Department of War Studies, where he specialized in the disinformation risks posed by AI.