Foundation Models: Opportunities and Risks

Table of Contents

  1. What are Foundation Models?
  2. Emergence and Homogenization
  3. Criticisms of Foundation Models
  4. The Need for Scientific Thinking
  5. Brilliant's Course on Scientific Thinking
  6. Conclusion
  7. FAQ

What are Foundation Models?

In recent years, machine learning models such as GPT-3 and BERT have emerged as powerhouses within the field. These models are trained on broad data at scale and can be adapted or fine-tuned to a wide range of downstream tasks. Researchers at Stanford have created a center to better understand the strengths and potential harms of these models, which they call "foundation models" in their new paper.

Foundation models are large pre-trained models that can be used as a foundation for more specific applications via methods like fine-tuning and transfer learning. Unlike smaller models or models trained on more niche data sets, their power comes in part from their size: they can be trained on very large data sets and learn an encoding of the complex relationships among the broad range of topics those data sets cover.
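To make the fine-tuning idea concrete, here is a minimal sketch of adapting a pre-trained model to a downstream task, assuming the Hugging Face transformers library and PyTorch are installed. The sentiment-classification setup and the single training example are my own illustration, not something from the paper.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Load BERT's pre-trained weights and attach a fresh 2-class head on top.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# One hypothetical labeled example; real fine-tuning iterates over a dataset.
batch = tokenizer(["This movie was great."], return_tensors="pt")
labels = torch.tensor([1])

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
outputs = model(**batch, labels=labels)  # forward pass also computes the loss
outputs.loss.backward()                  # gradients flow into all layers
optimizer.step()                         # one fine-tuning update
```

The key point is that the expensive pre-training is done once, upstream; the downstream task only pays for the comparatively cheap adaptation step.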

Emergence and Homogenization

Foundation models are significant for two reasons: emergence and homogenization. Emergence refers to the phenomenon where a model's behavior is implicitly induced rather than explicitly encoded. This is a source of scientific excitement, because these models turn out to be more powerful than we may have initially realized, but also of considerable concern, because when things go wrong, we don't necessarily know why.
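In-context learning in GPT-3 is a frequently cited example of emergence: the model was only ever trained to predict the next token, yet it can pick up a task from a few examples placed in the prompt. The sketch below is purely illustrative; it assumes the pre-1.0 openai Python client, and the engine name and prompt are my own examples.

```python
import openai

# A few-shot prompt: the task is demonstrated, never explicitly trained.
prompt = (
    "English: cheese\nFrench: fromage\n"
    "English: bread\nFrench: pain\n"
    "English: apple\nFrench:"
)

response = openai.Completion.create(
    engine="davinci",   # a GPT-3 model; the name here is an assumption
    prompt=prompt,
    max_tokens=5,
    temperature=0.0,
)
# Nothing in training explicitly encoded "translate English to French";
# the behavior is implicitly induced by the pattern in the prompt.
print(response.choices[0].text)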

Homogenization, on the other hand, is the consolidation of the models we use for a wide range of applications. That's good in that the same model architecture can be adapted to a wide variety of tasks, but bad in that it creates essentially a single point of failure: if anything is wrong with that model, its use is already quite widespread.
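As a toy illustration of what homogenization looks like in code, the sketch below shows several task-specific heads sharing one pre-trained backbone. The module names, tasks, and dimensions are all made up for illustration.

```python
import torch
import torch.nn as nn

class SharedBackbone(nn.Module):
    """Stand-in for a large pre-trained foundation model."""
    def __init__(self, dim=768):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())

    def forward(self, x):
        return self.encoder(x)

backbone = SharedBackbone()

# Each application only adds a small task-specific head on top.
sentiment_head = nn.Linear(768, 2)   # e.g., positive/negative
topic_head = nn.Linear(768, 10)      # e.g., ten topic classes
toxicity_head = nn.Linear(768, 1)    # e.g., a toxicity score

x = torch.randn(4, 768)              # a dummy batch of inputs
features = backbone(x)               # the single point of failure:
                                     # every task inherits its flaws and biases
print(sentiment_head(features).shape, topic_head(features).shape)
```

Any defect in `backbone` flows straight into every downstream application, which is exactly the one-point-of-failure worry.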

Criticisms of Foundation Models

One criticism of the idea of foundation models is that the term is an odd one for something we already had a name for: pre-trained large models. Additionally, the examples provided in the paper are fairly limited, tending to focus on models trained by major tech companies. Other models would likely fall under their rubric of foundation models, such as ResNet50, which is used everywhere and was trained on massive data sets, but these aren't really discussed at all.
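ResNet50 fits the pattern well: adapting it to a new task is the same pre-train-then-adapt recipe. Here is a minimal transfer-learning sketch assuming torchvision; the five-class output head is a hypothetical placeholder for whatever the downstream task needs.

```python
import torch.nn as nn
from torchvision import models

# Weights pre-trained on ImageNet (newer torchvision versions prefer a
# `weights=` argument, but this classic form illustrates the idea).
model = models.resnet50(pretrained=True)

# Freeze the pre-trained backbone so only the new head learns.
for param in model.parameters():
    param.requires_grad = False

# Replace the final ImageNet classifier with a head for the new task.
model.fc = nn.Linear(model.fc.in_features, 5)
```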

Another criticism is that foundation models are supposed to be large, but the paper doesn't really talk about scale, so it's unclear where the cutoff for a foundation model sits in the authors' heads. Specifying that would be helpful: it tells you a lot about which kinds of models we might be looking at and what already exists in the field, and also what we might not be looking at, such as smaller or more niche models. Companies that can't afford to run something like GPT-3 may use those in place of larger language models, and they might have their own faults that we don't know much about.
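One concrete way to ground the scale question is parameter count. The sketch below assumes torchvision for the ResNet50 side; the BERT and GPT-3 figures in the comment are widely reported numbers, not computed here.

```python
from torchvision import models

def count_params(model):
    """Total number of parameters in a PyTorch model."""
    return sum(p.numel() for p in model.parameters())

resnet50 = models.resnet50()
print(f"ResNet50: {count_params(resnet50) / 1e6:.0f}M parameters")
# Roughly 25M, versus ~110M for BERT-base and 175B for GPT-3, which is
# why it's unclear where the paper's implicit size cutoff sits.
```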

The Need for Scientific Thinking

Regardless of how you want to approach your interest in foundation models, you'll need a solid foundation in scientific thinking to get started. Brilliant's newly updated course on scientific thinking is a great place to start. It's full of interactive exercises that let you experience the principles of science firsthand. To really learn something, it's not enough to just watch someone else do it. You have to actually do it yourself.

As someone who personally learns better through visual and physical intuition than through rote memorization, I really appreciate Brilliant's interactive approach to teaching the major pillars of STEM. If you're interested in spending more time on the programming side of foundation models, Brilliant's Python programming course can help you learn how to program without digging through the weeds of coding syntax, using fun interactive challenges. You just shift around blocks of pseudocode and get immediate feedback on your results. It's a good way to understand how computer algorithms work, and once you have that down, the coding syntax becomes a lot less intimidating.

Brilliant's Course on Scientific Thinking

Brilliant is not about memorizing or regurgitating facts for a test. You can just pick a course you're interested in and get started, and if you're feeling stuck or have made a mistake, you can read the explanations to find out more and learn at your own pace. If you'd like to try out Brilliant for free and get 20% off a year of STEM learning, you can click on the link in the description or go to brilliant.org/jordan to sign up for free.

Conclusion

In conclusion, I think it's true that large pre-trained models should be studied to better understand their strengths and harms, especially since many of these models have been used in public-facing machine learning systems for a while now. Still, I find the term "foundation models" a little arbitrary, considering we already had a name for these models and there has been ongoing research about them. On the other hand, this paper seems to be a way of introducing the Center for Research on Foundation Models at Stanford to the larger machine learning community, and it's probably a good idea to have a center focused on understanding how these models work, especially if they end up expanding the scope of the models they look at.

FAQ

Q: What are foundation models? A: Foundation models are large pre-trained models that can be used as a foundation for more specific applications via methods like fine-tuning and transfer learning.

Q: What is the significance of foundation models? A: Foundation models are significant for two reasons: emergence and homogenization. Emergence refers to the phenomenon where a model's behavior is implicitly induced rather than explicitly encoded. Homogenization is the consolidation of the models we use for a wide range of applications.

Q: What are some criticisms of foundation models? A: One criticism is that the term is an odd one for something we already had a name for: pre-trained large models. Another is that the examples provided in the paper are fairly limited, tending to focus on models trained by major tech companies.

Q: What is Brilliant? A: Brilliant is a website and app based on the principle of active problem-solving. It offers interactive courses in STEM subjects, including a course on scientific thinking.

Q: How can Brilliant help with learning about foundation models? A: Brilliant's interactive approach to teaching the major pillars of STEM can help you learn how to program without digging through the weeds of coding syntax, via the fun interactive challenges in its Python programming course.
