

The vitality of super intelligence awakens! But with the arrival of self-updating AI, mothers no longer have to worry about data bottlenecks

Apr 29, 2024 pm 06:55 PM
Tags: data, train

It's enough to make you cry: the whole world is racing to build large models, and the data on the Internet simply isn't enough. Not even close.

Training a model has come to resemble "The Hunger Games," and AI researchers around the world are worrying about how to feed these data-hungry giants.

This problem is particularly prominent in multi-modal tasks.

Just when everyone was at a loss, a start-up team out of Renmin University of China used its own new model to become the first in China to turn "model-generated data feeding itself" into reality.

Moreover, it takes a two-pronged approach: both the understanding side and the generation side can produce high-quality, multi-modal new data and feed it back to the model itself.

What is the model?

Awaker 1.0, the large multi-modal model that just debuted at the Zhongguancun Forum. And who is the team behind it?

Sophon Engine. The company was founded by Gao Yizhao, a doctoral student at the Hillhouse School of Artificial Intelligence at Renmin University of China, with Professor Lu Zhiwu of the same school serving as an advisor. Founded in 2021, it entered the "no man's land" of multi-modality early.

MoE architecture: resolving the conflicts of multi-modal, multi-task training

This is not the first time that Sophon Engine has released a model.

On March 8 last year, after two years of R&D, the team released its first self-developed multi-modal model, the ChatImg series with tens of billions of parameters, and on that basis launched the publicly testable multi-modal conversation application ChatImg (元乘象). ChatImg has continued to iterate since then, while development of the new Awaker model advanced in parallel; the latter inherits the base capabilities of its predecessor.

Compared with the previous-generation ChatImg series, Awaker 1.0 adopts an MoE (Mixture-of-Experts) model architecture, chosen to resolve the severe conflicts that arise in multi-modal, multi-task training.

With the MoE architecture, Awaker 1.0 can better learn both general multi-modal capabilities and the specific capabilities each task requires, further improving its performance across multiple tasks.
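The article does not disclose Awaker 1.0's internals, but the core MoE idea, routing each token to a specialized expert so that task-specific and shared capabilities do not overwrite each other, can be sketched in a few lines. Everything below (sizes, top-1 routing, the names) is a hypothetical illustration, not Sophon Engine's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

class MoELayer:
    """Minimal top-1 mixture-of-experts layer (illustrative only)."""

    def __init__(self, d_model=8, n_experts=4):
        self.router = rng.normal(size=(d_model, n_experts))   # gating weights
        self.experts = [rng.normal(size=(d_model, d_model))   # one FFN-like matrix per expert
                        for _ in range(n_experts)]

    def __call__(self, x):
        # x: (n_tokens, d_model). Route each token to its best-scoring expert.
        logits = x @ self.router                 # (n_tokens, n_experts)
        choice = logits.argmax(axis=1)           # top-1 routing decision per token
        out = np.empty_like(x)
        for i, e in enumerate(choice):
            out[i] = x[i] @ self.experts[e]      # expert-specific transform
        return out, choice

layer = MoELayer()
tokens = rng.normal(size=(5, 8))                 # e.g. 5 multi-modal tokens
out, choice = layer(tokens)
print(out.shape, choice)                         # (5, 8) plus each token's expert id
```

In a real multi-modal model the experts would be full feed-forward networks inside each Transformer block, and the router is typically trained with a load-balancing loss so that no single expert dominates.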

Data is worth a thousand words:

In view of the evaluation-data leakage that plagues mainstream multi-modal leaderboards, the Sophon team strictly constructed its own evaluation set, with most test images drawn from personal mobile-phone photo albums.

As the table shows, the team evaluated Awaker 1.0 against the three most advanced multi-modal large models at home and abroad.

One more note: since GPT-4V and Intern-VL do not directly support detection tasks, their detection results were obtained by asking each model to describe the location of the object in language.

In visual question answering and business-application tasks, the base model of Awaker 1.0 outperforms GPT-4V, Qwen-VL-Max, and Intern-VL.

The base model of Awaker 1.0 achieved the second-best results on description, reasoning, and detection tasks.

Finally, looking at the average score, Awaker 1.0 has the highest value among them.

The results above thus confirm the effectiveness of using an MoE architecture for a multi-task, multi-modal model.

Benchmark numbers are one thing; the real-world effect needs hands-on testing.

Here the hands-on comparison with other large models focuses on Chinese OCR (recognizing text in images), counting, and detailed-description tasks.

First, counting: Awaker 1.0 gives the correct answer, while the other three models all answer incorrectly.

Next, Chinese OCR: the models that answer correctly are Qwen-VL-Max and Awaker 1.0.

The last question tests understanding of image content. GPT-4V and Awaker 1.0 can not only describe the picture in detail but also accurately identify details within it, such as the Coca-Cola shown in the image.

It is worth mentioning that Awaker 1.0 inherits some of the Sophon team's earlier research results that attracted wide attention.

That would be the generation side of Awaker 1.0: VDT (Video Diffusion Transformer), a Sora-like video-generation base independently developed by Sophon Engine. VDT's academic paper preceded the release of OpenAI's Sora (it appeared last May) and has been accepted at the top conference ICLR 2024.


VDT's innovations stand out in two main respects.

First, its technical architecture adopts a Diffusion Transformer, demonstrating the great potential of Transformers in video generation before OpenAI did.

Its advantage lies in excellent temporal-dependency capture: it can generate temporally coherent video frames, including simulating the physical dynamics of 3D objects over time.

Second, it proposes a unified spatio-temporal mask modeling mechanism that lets VDT handle a variety of video-generation tasks.

VDT handles conditional information flexibly, for example through simple concatenation in token space, effectively unifying information of different lengths and modalities.

Combined with the spatio-temporal mask modeling mechanism proposed in the same work, VDT becomes a general-purpose video diffusion tool that, without any change to the model structure, can be applied to unconditional generation, future-frame prediction, frame interpolation, image-to-video generation, video completion, and other video-generation tasks.
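The paper's exact interface is not reproduced here, but the unifying idea, that many video-generation tasks differ only in which frames are given as conditions, can be illustrated with a simple frame-level mask. This is a hypothetical sketch; VDT actually operates on spatio-temporal tokens, not whole frames only:

```python
import numpy as np

def task_mask(n_frames, task):
    """Binary mask over frames: 1 = conditioning frame, 0 = frame to generate.
    Different video tasks differ only in this mask (illustrative)."""
    m = np.zeros(n_frames, dtype=int)
    if task == "unconditional":
        pass                          # nothing observed; generate everything
    elif task == "future_prediction":
        m[: n_frames // 2] = 1        # first half observed, predict the rest
    elif task == "interpolation":
        m[0] = m[-1] = 1              # endpoints observed, fill in between
    elif task == "image_to_video":
        m[0] = 1                      # a single first frame observed
    elif task == "completion":
        m[::2] = 1                    # every other frame observed
    return m

# A (hypothetical) diffusion step would denoise only the masked-out frames,
# while conditioning frames are concatenated in token space unchanged.
for task in ["unconditional", "future_prediction", "interpolation",
             "image_to_video", "completion"]:
    print(task, task_mask(8, task))
```

The design point is that the backbone never changes; swapping the mask is what turns one trained model into five different tools.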

Reportedly, the Sophon Engine team has not only explored VDT's simulation of simple physical laws, finding that it can model physical processes, but has also explored ultra-realistic portrait video generation in depth.

Because the human eye is extremely sensitive to faces and human motion, this task places very high demands on video-generation quality. Even so, Sophon Engine has already cracked most of the key technologies for ultra-realistic portrait video generation and is not afraid of a comparison with Sora.

Talk is cheap, so here is the result after Sophon Engine combined VDT with controllable generation to improve portrait-video quality:

Sophon Engine reportedly plans to keep optimizing its controllable person-generation algorithms and to actively pursue commercialization.

Generating an endless stream of new interaction data

Even more noteworthy, the Sophon Engine team emphasizes:

Awaker 1.0 is the world's first multi-modal large model capable of autonomous updating.

In other words, Awaker 1.0 is "alive": its parameters can be updated continuously in real time, which sets it apart from all other multi-modal large models.

Awaker 1.0's autonomous-update mechanism comprises three key technologies:

  • Active data generation
  • Model reflection and evaluation
  • Continuous model updating

These three technologies give Awaker 1.0 the ability to learn autonomously, reflect automatically, and update itself, allowing it to explore the world freely and even interact with humans.
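The article names the three technologies without detailing them. Purely as a hypothetical sketch of how such a cycle could fit together, with every component a toy stand-in rather than anything Sophon Engine has described:

```python
import random

random.seed(0)

def self_update_loop(model, environment, steps=3):
    """Toy sketch of the three-part autonomous-update cycle:
    active data generation -> reflection/evaluation -> continuous update."""
    log = []
    for step in range(steps):
        # 1. Active data generation: the model proposes new interaction data.
        sample = model["generate"](environment)
        # 2. Model reflection: score the sample; keep only high-quality data.
        score = model["evaluate"](sample)
        if score >= 0.5:
            # 3. Continuous update: fold accepted data back into the model.
            model["memory"].append(sample)
        log.append((step, sample, round(score, 2)))
    return log

# Toy stand-ins for the real components (all hypothetical):
model = {
    "memory": [],
    "generate": lambda env: f"observation-{random.randint(0, 9)} from {env}",
    "evaluate": lambda s: random.random(),
}
log = self_update_loop(model, "smart-device feed")
print(len(model["memory"]), "samples retained out of", len(log))
```

The real system would replace the lambdas with the model's own generation and self-evaluation passes, and step 3 with an actual parameter update rather than an append to a list.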

On this basis, Awaker 1.0 can generate an endless stream of new interaction data on both the understanding side and the generation side.

How?

On the understanding side, Awaker 1.0 interacts with the digital world and the real world.

While executing tasks, it feeds scene and behavior data back to the model for continuous updating and training.

On the generation side, Awaker 1.0 can produce high-quality multi-modal content, providing more training data for the understanding-side model.

Through these two loops, Awaker 1.0 effectively fuses visual understanding with visual generation.

Bear in mind that ever since Sora's debut, more and more voices have argued that the road to AGI requires "a grand unification of understanding and generation."


Take new-knowledge injection as an example; here is a concrete, end-to-end demonstration.

Awaker 1.0 can continuously learn real-time news from the Internet and draw on the newly learned news to answer all kinds of complex questions.

This differs from the two current mainstream approaches, RAG and traditional long context: Awaker 1.0 genuinely "memorizes" new knowledge in its own model parameters.

As the results show, over three consecutive days of self-updating, Awaker 1.0 learned each day's news and accurately reproduced the corresponding information in its descriptions.

And although it keeps learning, Awaker 1.0 does not lose one thing while gaining another: it does not quickly forget what it has already learned.

For example, knowledge about the Zhijie S7 learned on April 16 was still remembered and understood by Awaker 1.0 two days later.

So, in this era where data is gold, stop lamenting that "there isn't enough data."

For teams facing a data bottleneck, hasn't Awaker 1.0 just delivered a feasible, usable new option?

A "living" brain for embodied intelligence

That said, precisely because it fuses visual understanding with visual generation, Awaker 1.0's strengths become unmistakable when it comes to adapting multi-modal large models to embodied intelligence.

Here is the situation:

The visual understanding capability of multi-modal large models like Awaker 1.0 can naturally serve as the "eyes" of embodied intelligence.

Mainstream opinion also holds that "multi-modal large model + embodied intelligence" could substantially improve the adaptability and creativity of embodied agents, and may even be a viable path to AGI.

The reasons boil down to two points.

First, people expect embodied agents to be adaptive, meaning an agent should adapt to a constantly changing environment through continuous learning.

That way, an embodied agent can keep improving on known multi-modal tasks while also quickly adapting to unknown ones.

Second, people expect embodied agents to be genuinely creative: through autonomous exploration of the environment, they should discover new strategies and solutions and probe the boundaries of AI's capabilities.

But adapting the two is not as simple as bolting a body onto a multi-modal large model, or plugging a brain into an embodied agent.

Take multi-modal large models: at least two obvious problems stand in the way.

First, the model's iteration cycle is long and demands heavy human effort.

Second, the model's training data all comes from existing data, so the model cannot continuously acquire large amounts of new knowledge. Although RAG and extended context windows can inject newly emerging knowledge, the model does not actually remember it, and these stopgaps bring problems of their own.

In short, today's multi-modal large models lack real adaptability in practical scenarios, let alone creativity, which is why industry deployments keep running into difficulties of every kind.

Neat: recall from above that Awaker 1.0 can not only learn new knowledge but also remember it, and that this learning is daily, continuous, and timely.


As the framework diagram shows, Awaker 1.0 can pair with all kinds of smart devices, observe the world through them, generate action intentions, and automatically construct commands that direct the devices to perform various actions.

After the actions are completed, the smart devices automatically generate feedback, and Awaker 1.0 extracts effective training data from these actions and feedback for continuous self-updating, steadily strengthening its capabilities.

This effectively gives embodied intelligence a living brain.

Who could see this and not be impressed? (doge)

Especially important: because it can update itself, Awaker 1.0 is not only a fit for embodied intelligence; it also applies to broader industry scenarios and can tackle more complex real-world tasks.

For example, Awaker 1.0 can pair with various smart devices to enable cloud-edge collaboration.

In that setup, Awaker 1.0 acts as the "brain" deployed in the cloud, observing, directing, and controlling edge-side smart devices as they carry out tasks.

The feedback those edge devices gather while performing tasks flows back to Awaker 1.0 in a steady stream, giving it continuous training data for ongoing self-updating.

Nor is this armchair theorizing: the cloud-edge collaboration between Awaker 1.0 and smart devices has already been applied in scenarios such as intelligent power-grid inspection and smart cities, achieving recognition results far better than traditional small models.


Multi-modal large models can listen, see, and speak, showing enormous potential and application value across speech recognition, image processing, natural language understanding, and more; they seem almost omnipotent.

But their headache is obvious: how can they keep absorbing new knowledge and adapting to new changes?

You could say that cultivating inner strength and sharpening skills has become a pressing challenge for multi-modal large models.

The debut of Sophon Engine's Awaker 1.0 offers a key to this self-transcendence.

It seems to have mastered the "Star-Absorbing Technique" of martial-arts lore: through its autonomous-update mechanism it breaks the data-shortage bottleneck, opening the door to continuous learning and self-evolution for multi-modal large models; and through cloud-edge collaboration it pushes boldly into concrete application scenarios such as embodied intelligence and other smart devices.

This may be just a small step toward AGI, but it is also the beginning of multi-modal large models' journey of self-transcendence.

That long and arduous journey needs teams like Sophon Engine climbing steadily toward the summit of technology.
