Nexa AI – Functional Tokens for On-device Multimodal Models
Date Presented: 7/12/2024
Speakers: Alex Chen + Zack Li
Abstract: Tokenizing corpora into semantic tokens has proven effective for large language models. However, this approach encounters challenges when applied to function calls, leading to inaccuracies and hallucinations. To address this issue, we have pioneered a new training methodology using functional tokens, transforming complex function-calling tasks into language-completion tasks. We also released the Octopus series of models built on functional tokens, achieving GPT-4-level function-calling accuracy at a 2B parameter size. Our Octopus-V2 model achieved 35 times faster inference and 70 times greater energy efficiency than a RAG-plus-Llama3 solution, and is four times faster than OpenAI's GPT-4o. The functional-token approach is then applied to Octopus-V3, a sub-billion-parameter multimodal model adept at both text and images and fluent in English and Mandarin. Furthermore, Octopus-V4 extends these capabilities into a graph network structure, with Octopus-V2 as the master node and other open-source models integrated as worker nodes; Octopus-V4 achieved 74.8 on MMLU, outperforming GPT-3.5, and is applied to cloud-edge collaboration. Nexa's Octopus-V2 models ranked 2nd among half a million models on Hugging Face between Apr 2 and Apr 15, surpassing xAI's Grok and Databricks' DBRX during that period, and were mentioned by the Google Gemma team during 2024 Google I/O. Nexa's Octopus models have also attracted industrial collaboration interest from AWS, Google, Volkswagen US, Qualcomm, ByteDance, Stellantis, Zoom, and more.
Speakers' Bios: Alex Chen is the CEO and founder of Nexa AI, with a PhD in Mechanics and Computation from Stanford University. His research interests lie in AI agent development empowered by large language models. He is a serial entrepreneur and previously served as President of the Chinese Entrepreneur Organization. He is also a gold medalist in the Mathematics Olympiad. Zack Li is the CTO and co-founder of Nexa AI. Before this, he accumulated four years of industry experience in on-device AI at Google and Amazon Lab126, focusing on model deployment, performance optimization, and edge-cloud collaboration. He received an MS in Operations Research from Stanford University. Alex and Zack are the founders of Nexa AI and have authored the Octopus series of models. Nexa AI builds lightweight but powerful multimodal models for AI agents and provides an on-device SDK and infrastructure to make models run fast and energy-efficiently. For more information, visit https://www.nexa4ai.com/
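The functional-token idea described in the abstract can be sketched in miniature: each device function is assigned a dedicated vocabulary token, so the model selects a function by simply emitting that token as part of an ordinary completion, and a thin dispatcher maps the token back to executable code. The token names (`<nexa_0>`, `<nexa_1>`), the completion format, and the argument parsing below are illustrative assumptions, not Nexa's actual implementation.

```python
# Minimal sketch of functional-token dispatch, assuming a completion format
# like "<nexa_1>(minutes=5)". Token names and parsing are illustrative only.
import ast
import re

# Hypothetical on-device functions the model can invoke.
def take_photo(camera: str) -> str:
    return f"photo taken with {camera} camera"

def set_timer(minutes: int) -> str:
    return f"timer set for {minutes} minutes"

# One dedicated token per function; during fine-tuning, the model's vocabulary
# would be extended with these tokens so selection is next-token prediction.
FUNCTIONAL_TOKENS = {
    "<nexa_0>": take_photo,
    "<nexa_1>": set_timer,
}

def dispatch(model_output: str) -> str:
    """Parse a completion like "<nexa_1>(minutes=5)" and call the function."""
    match = re.match(r"(<nexa_\d+>)\((.*)\)", model_output.strip())
    if not match:
        raise ValueError(f"no functional token found in: {model_output!r}")
    token, raw_args = match.groups()
    func = FUNCTIONAL_TOKENS[token]
    # Demo-grade keyword-argument parsing: key=value pairs, literals only.
    kwargs = {}
    for pair in raw_args.split(","):
        key, value = pair.split("=")
        kwargs[key.strip()] = ast.literal_eval(value.strip())
    return func(**kwargs)

print(dispatch("<nexa_1>(minutes=5)"))       # timer set for 5 minutes
print(dispatch("<nexa_0>(camera='front')"))  # photo taken with front camera
```

Because the token itself encodes the function choice, the model never has to reproduce a long function name or schema verbatim, which is the failure mode the abstract attributes to conventional semantic tokenization.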
Social Media Listening
Revolutionize on-device AI in VR: Nexa AI's Octopus model demo 🐙 🥽
Nexa AI's Octopus model transforms VR/AR experiences with on-device AI. Take a look at our demo below! Highlights:
⚙️ Compatibility: Smooth operation on VR headsets like Meta Quest 2.
📡 Offline: Octopus runs entirely on-device, no internet needed.
⚡ Rapid Inference: Lightning-fast and efficient performance.
🔄 Automation: Streamlines workflows and enhances VR/AR interactions.
Stay updated with us at nexa4ai.com. We're empowering billions of devices with fast, secure, and energy-efficient AI. #NexaAI #AI #OnDeviceAI #VRAR