Dr. Alan D. Thompson (AI expert): OpenAI o1 shows above-human "theory of mind," along with self-awareness and self-reasoning ability

September 29, 2024 · about 7 min read

Overview from the video's opening

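OpenAI's newest model, o1, announced just last week, has broken through nearly every ceiling test and benchmark of intelligence run so far, outperforms PhD-level experts across the board, and now scores above humans on psychological tests of self-awareness and reasoning. This is cutting-edge technology, and when it comes to artificial intelligence, most people have no idea how fast things are moving or how far they have already come. Today, though, I want to help you catch up with that pace.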

FasterWhisper AI(large-v2 model) + DeepL(2024-07 model)

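And in your recent memo you also mention that it has increased self-awareness. The increased self-awareness is really interesting. When they benchmark these models and safety-test them internally, they run them through all sorts of benchmarks similar to those used on humans. (0:23:10)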

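We saw GPT-4 hit 100% on theory of mind, a psychology test about putting yourself in other people's shoes and a great test of empathy. GPT-4 hit 100%. Even humans don't reach 100%; I think we're at about 80%. That was pretty astonishing to see. The tests had to be rewritten so that they would have a new ceiling. (0:23:30)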

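The same thing is happening with the o1 model, but on a bigger scale. Apollo Research, a new group of researchers, tested this concept of theory of mind again, along with the concept of self-awareness. They found improved self-knowledge and improved self-reasoning, meaning applied self-awareness in an agent setting, as well as applied theory of mind. (0:24:02)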

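These results are really unusual, particularly for those who argue that there is no intelligence here. They will definitely be confronted with the fact that this might be something akin to the idea of sentience. (0:24:12)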

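I don't say that word lightly, and I'm not saying that o1 is sentient, but we now have measurements showing that it has self-awareness, self-knowledge, and self-reasoning. That is amazing. And alongside this new ability to, quote-unquote, think and reason, its increased self-awareness can, as you said, be gauged by testing it with theory of mind: the various tests we would apply to a person to see their level of empathy, their ability to understand someone else and put themselves in another person's shoes. (0:24:46)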

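It is even outperforming humans on that. If we say that people are self-aware, and that we know this because they can perform at this level on tests, then o1 is outperforming them and exhibiting even more self-awareness, at least based on these tests.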

▼ Original transcript

And you even talk about in your recent memo that it has increased self-awareness. The increased self-awareness is really interesting. So, when they do benchmark these models, when they do safety test these models internally, they run it through all sorts of benchmarks that are similar to humans. (0:23:10)

We saw GPT-4 hit 100% in the theory of mind, which is a test in psychology where we play around with putting ourselves in other people's shoes. Great test for empathy. GPT-4 hit 100%. Humans don't even hit 100%. I think we're at about 80%. And that was pretty crazy to see. We had to rewrite those tests to be able to have a new ceiling. (0:23:30)

The same thing is happening, but in a bigger way for the O1 model. So, Apollo Research, which is a new group of researchers, tested this concept of theory of mind again, and then this concept of self-awareness. They found that it had improved self-knowledge, improved self-reasoning, so that's applied self-awareness when it's in some sort of agent setting, and applied theory of mind. (0:24:02)

These are really unusual, particularly for those who might argue that there's no intelligence. They're definitely going to be confronted by the fact that this might have something akin to the idea of sentience. (0:24:12)

Now, I don't say that word lightly, and I'm not saying that O1 is sentient, but we now have measurements that it has self-awareness, self-knowledge, and self-reasoning. That is amazing. And along with this now ability to quote-unquote think, reason, increase self-awareness, like you said, by testing it with theory of mind, all the various tests that we would apply to a person to see what is their level of empathy, ability to understand someone else, put themselves in another person's shoes, for example. (0:24:46)

It's even outperforming humans on that. If we were to say, oh, people are self-aware, and we know this because they're able to perform at this level on tests, well, O1 is outperforming and even exhibiting more self-awareness, at least based on these tests.

Video (38:00)

Interview about AI - Dr Alan D. Thompson on OpenAI's New o1 Model Is a Really Big Deal (Sep/2024)

Video description

4,200 views Sep 29, 2024

The memo under discussion

OpenAI’s o1 Model: A Quantum Leap in AI Intelligence – Insights from Dr Alan D. Thompson | Financial Sense https://www.financialsense.com/blog/21037/openais-o1-model-quantum-leap-ai-intelligence-insights-dr-alan-d-thompson

(2024-09-29)
