Overview

This Arm Kleidi learning path shows how to use a single AWS Graviton instance – powered by an Arm Neoverse CPU – to build a simple “Token as a Service” server, used below to provide a chat-bot to serve a small number of concurrent users.

This architecture would be suitable for businesses looking to deploy the latest Generative AI technologies using their existing CPU compute capacity and deployment pipelines. The demo uses the open source llama.cpp framework, which Arm has enhanced by contributing the latest Arm Kleidi Technologies. Further optimizations are achieved by using the smaller 8 billion parameter Llama 3.1 model, which has been quantized to optimize memory usage.

Chat with the Llama-3.1-8B LLM below to see the performance for yourself, then follow the learning path to build your own Generative AI service on Arm Neoverse.

Running the Demo

  1. Type & send a message to the chatbot.
  2. Receive the chatbot’s reply.
  3. View stats showing how well AWS Graviton runs LLMs.
summary picture
Blown Up Diagram

Arm KleidiAI Demo

Pinging LLM server
Reset chat
TERMS OF SERVICE FOR USE OF CHATBOT DEMO PLEASE READ THESE TERMS OF SERVICE CAREFULLY. BY USING THE SERVICE, YOU HEREBY ACKNOWLEDGE AND AGREE THAT YOU HAVE READ, FULLY UNDERSTAND, AND AGREE TO BE BOUND BY THESE TERMS OF SERVICE. IF YOU ARE ENTERING INTO THESE TERMS OF SERVICE ON BEHALF OF A COMPANY OR OTHER ENTITY, YOU REPRESENT THAT YOU HAVE THE AUTHORITY TO BIND SUCH COMPANY OR OTHER ENTITY TO THESE TERMS OF SERVICE. THESE TERMS OF SERVICE (“TERMS”) ARE BETWEEN YOU, OR IF ACCEPTING ON BEHALF OF AN ENTITY, SUCH ENTITY (“YOU” OR “YOUR”) AND ARM LIMITED (“ARM”, “WE”, OR “US”). ARM MAY REVISE THESE TERMS AT ANY TIME. “Input” means any prompt provided by You to the Service via the chatbot user interface. “Service” means the demo made available by Arm at /learning-paths/servers-and-cloud-computing/llama-cpu/_demo/, which comprises of a chatbot application utilizing the Arm Kleidi Libraries software interfacing with the Llama3b artificial intelligence large language model, executing on an Arm-based server instance. “Service Output” means the output generated by Your use of the Service which includes responses to Your Service Input. 1. YOUR USE OF THE SERVICE 1.1 Please be aware that the Service comprises of a chatbot powered by artificial intelligence, and Service Output is generated by the Service without any human review or intervention. Responses may not be fully accurate, comprehensive or up-to-date and the Service lacks real-world awareness and cannot interpret complex situations as a human professional would. Your attention is drawn to Clauses 2.6, 4 and 5 of these Terms with regards to any reliance on the Service. 1.2 Therefore, you may use the Service solely for Your own internal, non-commercial purposes to demonstrate the performance of the Service. 2. RESTRICTIONS ON USE 2.1 You may only submit Input to the Service that You have created or that is owned by You. 2.2 Input to the Service provided by You shall comply with all relevant laws, including respecting the privacy rights of individuals, abiding by regulations governing specific activities, and refraining from any illegal actions. You shall not submit Input to the Service that is personal data. 2.3 You may not use the Service for the purposes of inflicting harm upon oneself or others, this includes using the Service for promotion of self-harm, suicide, the development or use of weaponry, causing injury to others, damaging property, or compromising the security of any service or system are strictly prohibited. 2.4 You shall not repurpose or disseminate Service Output with the intent to cause harm or spread hatred. 2.5 Arm will not use the Input for any purpose other than the facilitation of the Service to generate Service Output. 2.6 In accordance with Clause 1.1, you expressly acknowledge and agree that the Service is provided by Arm solely for the purposes of demonstrating the performance of an artificial intelligence large language model utilizing the Arm Kleidi Libraries software running on an Arm-based server, with the sole intention of showing the speed and efficiency of the underlying software and hardware which is running the Chatbot. The Service and Service Output is not intended to be relied upon in any way. 3. DATA RETENTION 3.1 The Input and related Service Output is available only during an active session. At the end of each session the Input and related Service Output is deleted. Arm does not store the Input and related Service Output beyond the duration of an active session. 4. WARRANTIES 4.1 You agree that the Service and any Service Output is provided “as is” without any warranties. 4.2 Arm and its licensors expressly disclaim all representations, warranties, conditions or other terms, express or implied or statutory, including, without limitation, freedom from errors, defects, and the implied warranties of merchantability, non-infringement, satisfactory quality, and fitness for a particular purpose. 5. LIMITATION OF LIABILITY 5.1 TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, IN NO EVENT SHALL ARM BE LIABLE FOR ANY INDIRECT, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES WHETHER SUCH DAMAGES ARE ALLEGED AS A RESULT OF TORTIOUS CONDUCT (INCLUDING NEGLIGENCE) OR BREACH OF CONTRACT OR OTHERWISE EVEN IF THE OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES (SUCH DAMAGES SHALL INCLUDE BUT SHALL NOT BE LIMITED TO THE COST OF REMOVAL AND REINSTALLATION OF GOODS, LOSS OF GOODWILL, LOSS OF PROFITS, LOSS OR USE OF DATA, INTERRUPTION OF BUSINESS OR OTHER ECONOMIC LOSS). 5.2 NOTWITHSTANDING ANYTHING TO THE CONTRARY CONTAINED IN THESE TERMS, THE MAXIMUM LIABILITY OF ARM TO YOU IN AGGREGATE FOR ALL CLAIMS MADE AGAINST ARM UNDER THESE TERMS, FOR BREACH OF CONTRACT, IN TORT OR OTHERWISE UNDER OR IN CONNECTION WITH THE SUBJECT MATTER OF THESE TERMS SHALL NOT EXCEED THE GREATER OF: (I) THE TOTAL SUMS PAID BY YOU TO ARM (IF ANY) IN RESPECT OF THE SERVICE; AND (II) TEN US DOLLARS ($10.00). THE EXISTENCE OF MORE THAN ONE CLAIM OR SUIT WILL NOT ENLARGE OR EXTEND THE LIMIT. YOU RELEASE ARM FROM ALL OBLIGATIONS, LIABILITY, CLAIMS OR DEMANDS IN EXCESS OF THIS LIMITATION. 5.3 NOTHING IN THESE TERMS SHALL OPERATE TO LIMIT OR EXCLUDE LIABILITY FOR DEATH OR PERSONAL INJURY ARISING FROM EITHER PARTY’S NEGLIGENCE OR FRAUD. 6. GENERAL 6.1 These Terms constitute the entire agreement between the parties and supersedes and extinguishes all prior and contemporaneous understandings, promises, assurances and agreements, written or oral, regarding its subject matter. 6.2 Nothing in these Terms represents a restriction upon further dissemination of the Service or the Service Output. 6.3 These Terms and any dispute or claim (including non-contractual disputes or claims) arising out of or in connection with it or its subject matter or formation shall be governed by and construed in accordance with the law of England and Wales. Terms of Use for Chatbot Demo Version 1.0 September-2024
Your use of this demo is subject to the Terms of Use.

Stats

Type a message to the chatbot to view metrics.