Meta Platforms (NASDAQ:META) said on Friday it was releasing some AI models from its research division, including a "Self-Taught Evaluator" which could provide a path to less human involvement in the AI development process.
The Facebook owner said the "Self-Taught Evaluator" approach generates contrasting model outputs and trains a large language model, used as an LLM-as-a-Judge, to produce reasoning traces for evaluation and final judgments, with an iterative* self-improvement scheme.
The Self-Taught Evaluator is a new method for generating synthetic preference data to train reward models without relying on human annotations.
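To make the loop described above more concrete, here is a toy sketch in Python. It is not Meta's code and calls no real model: every name in it (make_pair, judge, self_taught_loop, the skill number) is hypothetical, the judge is simulated with a random draw, and "training" is just a number going up. It also assumes that only judgments agreeing with the synthetic label are kept for training, which the article does not spell out.

```python
import random

def make_pair(instruction):
    """Build a contrasting pair: a response to the real instruction (treated as
    better) and a response to a different question (treated as worse)."""
    good = f"answer to: {instruction}"
    bad = f"answer to: a different question than {instruction}"
    return good, bad

def judge(skill, rng):
    """Stand-in for an LLM-as-a-Judge. Returns (reasoning trace, verdict).
    With probability `skill` it prefers response 'A', the better one."""
    verdict = "A" if rng.random() < skill else "B"
    trace = f"Compared both responses; preferred {verdict}."
    return trace, verdict

def self_taught_loop(instructions, iterations=3, samples=4, seed=0):
    rng = random.Random(seed)
    skill = 0.6  # hypothetical starting accuracy of the judge
    history = []
    for _ in range(iterations):
        kept = []
        for inst in instructions:
            good, bad = make_pair(inst)
            # Sample a few judgments; keep the first one that agrees with the
            # synthetic label. No human annotation is involved.
            for _ in range(samples):
                trace, verdict = judge(skill, rng)
                if verdict == "A":
                    kept.append((inst, good, bad, trace))
                    break
        # Stand-in for fine-tuning the judge on the kept reasoning traces.
        skill = min(0.99, skill + 0.1 * len(kept) / len(instructions))
        history.append((len(kept), round(skill, 3)))
    return history

if __name__ == "__main__":
    for n_kept, skill in self_taught_loop(["q1", "q2", "q3"]):
        print(f"kept {n_kept} traces, judge skill now {skill}")
```

Each pass through the outer loop is one round of the "iterative self-improvement scheme": the current judge labels synthetic pairs, the usable judgments become training data, and the updated judge is used for the next round.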
The release follows Meta's introduction of the product in a paper in August, which showed how it depends on the same "chain of thought" technique used by OpenAI's o1 models to get it to make reliable judgments about models' responses, according to a report from Reuters.
Last month, Microsoft-backed (MSFT) OpenAI released AI models called o1 and o1-mini, which can reason through complex tasks and solve harder problems than previous models in science, coding, and math.
The ability to use AI to review AI reliably offers a glimpse at a potential pathway to building autonomous AI agents which can learn from their own mistakes, the report added, citing two Meta researchers behind the project.
Self-improving models could remove the requirement for an often expensive and inefficient process currently used, called Reinforcement Learning from Human Feedback, which requires input from human annotators who should have specialized expertise to label data accurately and verify that answers to complex math and writing questions are correct, the report added.
"The idea of being self-taught and able to self-evaluate is basically crucial to the idea of getting to this sort of super-human level of AI," said Jason Weston, one of the researchers, the report noted.
The social media giant also released Meta Segment Anything 2.1 (SAM 2.1), an update to its Segment Anything Model 2 for images and videos. SAM 2.1 includes a new developer suite with the code for model training and the web demo.
* "iterative self-improvement scheme"
What is meant by "iterative process"?
The iterative process involves a continuous cycle of planning, analysis, implementation, and evaluation. Each cycle produces a segment of development that forms the basis for the next cycle of iterative improvement.
Re: A.I.
Posted: Mon Oct 21, 2024 7:33 am
by pdub
We are going to destroy humanity.
Re: A.I.
Posted: Mon Oct 21, 2024 8:01 am
by KUTradition
yup
Re: A.I.
Posted: Mon Oct 21, 2024 8:38 am
by TDub
2 things I've been consistently and adamantly against....NIL/turn college into NBA....and all things AI related.
So far nothing has changed my mind on either, but rather, fortified my stances.
I hate it all
Re: A.I.
Posted: Mon Oct 21, 2024 9:10 am
by pdub
Yes.
Both came with, 'well give it a chance, you may like it'.
I was pretty confident I wouldn't and to be honest, both are worse than I imagined.
AI in particular is so shitty because now I have to use it, whether it's because it is baked into the software I'm using ( Adobe ) or because AI killed the active communities where I can find help with specific coding questions not answered within an API's/codebase's documentation ( bring back stackoverflow ).
Re: A.I.
Posted: Mon Oct 21, 2024 9:21 am
by TDub
I'm an old curmudgeon, stuck in my ways, appreciator of the old ways and simpler times and I've long been resistant to most things tech related.
can't work on your own trucks barely anymore, streaming TV makes cable look affordable and efficient and when we only had 3 channels I had a better chance of actually finding the show I wanted l...