arXiv:2401.03568v2 [cs.AI] 25 Jan 2024

AGENT AI: SURVEYING THE HORIZONS OF MULTIMODAL INTERACTION

Zane Durante1†, Qiuyuan Huang2‡∗, Naoki Wake2∗, Ran Gong3†, Jae Sung Park4†, Bidipta Sarkar1†, Rohan Taori1†, Yusuke Noda5, Demetri Terzopoulos3, Yejin Choi4, Katsushi Ikeuchi2, Hoi Vo5, Li Fei-Fei1, Jianfeng Gao2

1 Stanford University; 2 Microsoft Research, Redmond; 3 University of California, Los Angeles; 4 University of Washington; 5 Microsoft Gaming

∗ Equal Contribution. ‡ Project Lead. † Work done while interning at Microsoft Research, Redmond.

Figure 1: Overview of an Agent AI system that can perceive and act in different domains and applications. Agent AI is emerging as a promising avenue toward Artificial General Intelligence (AGI). Agent AI training has demonstrated the capacity for multi-modal understanding in the physical world. It provides a framework for reality-agnostic training by leveraging generative AI alongside multiple independent data sources. Large foundation models trained for agent and action-related tasks can be applied to physical and virtual worlds when trained on cross-reality data. We present the general overview of an Agent AI system that can perceive and act in many different domains and applications, possibly serving as a route towards AGI using an agent paradigm.

ABSTRACT

Multi-modal AI systems will likely become a ubiquitous presence in our everyday lives. A promising approach to making these systems more interactive is t...