Nathan Strauss, an Amazon spokesman, said the company is closely reviewing the index. “Titan Text is still in private preview, and it would be premature to judge the transparency of a foundation model before it is ready for general availability,” he said. Meta declined to comment on the Stanford report, and OpenAI did not respond to a request for comment.
Rishi Bommasani, a graduate student at Stanford who worked on the study, says this reflects the fact that AI is becoming increasingly opaque even as it becomes more influential. This is in stark contrast to the last major AI boom, when openness helped enable major advances in capabilities such as speech and image recognition. “By the end of the 2010s, companies were more transparent about their research and published a lot more,” says Bommasani. “That’s why we’ve had the success of deep learning.”
The Stanford report also suggests that models do not need to be kept so secret for competitive reasons. Kevin Klyman, a policy researcher at Stanford University, says the fact that a number of leading models perform relatively well on different measures of transparency suggests that they could all become more open without losing out to competitors.
As AI experts try to figure out where the recent flowering of certain AI approaches will lead, some say the secrecy risks making the field less a scientific discipline and more a profit-driven enterprise.
“This is a pivotal time in the history of AI,” says Jesse Dodge, a research scientist at the Allen Institute for AI, or AI2. “The most influential players developing generative AI systems today are increasingly secretive and not sharing important details of their data and processes.”
AI2 is trying to develop a much more transparent AI language model called OLMo. It is being trained on a collection of data drawn from the web, scientific publications, code, books, and encyclopedias. That data set, called Dolma, has been released under AI2’s ImpACT license. When OLMo is finished, AI2 plans to release the working AI system along with the code behind it, so that others can build on the project.
Dodge says it is especially important to expand access to the data behind powerful AI models. Without direct access, it is generally impossible to know why or how a model can do what it does. “To advance science, reproducibility is necessary,” he says. “Without open access to these critical building blocks of model creation, we will remain in a ‘closed,’ stagnant, and proprietary situation.”
Given how widely used AI models are — and how dangerous some experts warn they could be — a little more openness could go a long way.