Tags: edge computing* + llm*

6 bookmarks

  1. PrismML, a venture originating from Caltech, has introduced Bonsai 8B, a 1-bit large language model designed to significantly improve AI efficiency on edge hardware. The architecture represents each weight using only its sign plus a shared scale factor, shrinking the memory footprint to just 1.15 GB. Compared to full-precision models, Bonsai 8B is 14 times smaller, 8 times faster, and 5 times more energy-efficient, while maintaining competitive performance. By drastically reducing memory and power requirements, PrismML aims to enable advanced AI applications on mobile devices, real-time robotics, and secure enterprise systems, moving capable language models out of massive cloud datacenters and onto local hardware. (A rough sketch of the sign-plus-scale representation follows this list.)
  2. NVIDIA has announced support for the Gemma 4 model family, which is designed to run efficiently across a wide range of hardware, from data centers to edge devices such as Jetson. The new generation includes the first Gemma MoE model and supports over 140 languages, enabling advanced capabilities like reasoning, code generation, and multimodal input.
    Developers can fine-tune and deploy Gemma 4 using tools like NeMo Automodel and NVIDIA NIM, with commercial licensing available. The models are optimized for local deployment with frameworks such as vLLM, Ollama, and llama.cpp, offering flexibility for various use cases, including robotics, smart machines, and secure on-premise applications. (A minimal local-inference sketch appears after this list.)
    2026-04-03, by klotz
  3. Orange Pi has announced the Orange Pi AI Station, a compact edge computing platform featuring the Ascend 310 processor, offering up to 176 TOPS of AI compute performance with options for up to 96GB of LPDDR4X memory and NVMe storage.
  4. This article details how to build a fast, offline AI chatbot from a Raspberry Pi 5 and an RLM AA50 accelerator card, covering optimization techniques for the speech recognition, natural language processing, and text-to-speech stages. (A skeleton of that pipeline follows this list.)
  5. This paper proposes SkyMemory, a key-value cache (KVC) hosted on a LEO satellite constellation to accelerate transformer-based inference, particularly for large language models (LLMs). It explores different chunk-to-server mapping strategies (rotation-aware, hop-aware, and combined) and presents simulation results and a proof-of-concept implementation demonstrating performance improvements. (A toy illustration of chunk mapping follows this list.)
  6. The article introduces the concept of Federated Language Models, which pair edge-based Small Language Models (SLMs) with cloud-based Large Language Models (LLMs) to improve privacy and performance in AI applications. (A minimal routing sketch follows this list.)
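
For item 1, a minimal sketch of the sign-plus-shared-scale weight representation described for Bonsai 8B, written in NumPy. The per-tensor scale used here (the mean absolute value of the weights) is an assumption; the bookmark does not say how PrismML computes it.

    import numpy as np

    def quantize_1bit(w: np.ndarray):
        """Binarize a weight tensor to {-1, +1} plus one shared scale factor."""
        scale = float(np.abs(w).mean())          # shared scale (assumed: mean |w|)
        signs = np.where(w >= 0, 1, -1).astype(np.int8)
        return signs, scale

    def dequantize_1bit(signs: np.ndarray, scale: float) -> np.ndarray:
        """Reconstruct an approximate full-precision tensor from signs and scale."""
        return signs.astype(np.float32) * scale

    w = np.random.randn(4, 4).astype(np.float32)
    signs, scale = quantize_1bit(w)
    w_hat = dequantize_1bit(signs, scale)
    print("mean absolute reconstruction error:", np.abs(w - w_hat).mean())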
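
For item 2, a minimal sketch of querying a locally served model through Ollama's HTTP API (the default endpoint on port 11434). The model tag "gemma4" is an assumed placeholder for illustration; substitute whatever tag the release actually ships under.

    import requests

    OLLAMA_URL = "http://localhost:11434/api/generate"  # default local Ollama endpoint

    def ask_local_model(prompt: str, model: str = "gemma4") -> str:
        """Send a prompt to a locally running Ollama server and return the reply."""
        payload = {"model": model, "prompt": prompt, "stream": False}
        resp = requests.post(OLLAMA_URL, json=payload, timeout=120)
        resp.raise_for_status()
        return resp.json()["response"]

    print(ask_local_model("Why does on-device inference help with data privacy?"))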
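
For item 4, a skeleton of the offline listen-generate-speak loop such a build typically uses; the component choices here (faster-whisper for speech recognition, a llama.cpp-hosted model for generation, pyttsx3 for speech output) are assumptions rather than the article's exact stack, and accelerator-specific optimizations are omitted.

    from faster_whisper import WhisperModel   # offline speech recognition
    from llama_cpp import Llama               # local LLM inference via llama.cpp
    import pyttsx3                            # offline text-to-speech

    asr = WhisperModel("tiny.en")                      # a small model suits a Pi-class board
    llm = Llama(model_path="model.gguf", n_ctx=2048)   # path to a locally stored quantized model
    tts = pyttsx3.init()

    def chatbot_turn(wav_path: str) -> None:
        """One offline turn: transcribe the audio, generate a reply, speak it aloud."""
        segments, _ = asr.transcribe(wav_path)
        question = " ".join(seg.text for seg in segments)
        reply = llm(f"User: {question}\nAssistant:", max_tokens=128)["choices"][0]["text"]
        tts.say(reply)
        tts.runAndWait()

    chatbot_turn("question.wav")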
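
For item 5, a toy illustration of chunk-to-server mapping. The paper's rotation-aware and hop-aware strategies are only paraphrased here as "shift assignments with the constellation's rotation" and "prefer nearby, lightly loaded satellites"; the details are assumptions, not the authors' algorithms.

    def rotation_aware_map(num_chunks: int, num_sats: int, rotation_step: int) -> dict:
        """Assign KV-cache chunks to satellites, shifting the assignment as the
        constellation rotates so a chunk stays reachable from the same region."""
        return {c: (c + rotation_step) % num_sats for c in range(num_chunks)}

    def hop_aware_map(num_chunks: int, hop_counts: list) -> dict:
        """Greedily place each chunk on the least-loaded of the closest satellites
        (fewest inter-satellite hops from the requesting node)."""
        load = [0] * len(hop_counts)
        mapping = {}
        for c in range(num_chunks):
            best = min(range(len(hop_counts)), key=lambda s: (hop_counts[s], load[s]))
            mapping[c] = best
            load[best] += 1
        return mapping

    print(rotation_aware_map(num_chunks=8, num_sats=5, rotation_step=2))
    print(hop_aware_map(num_chunks=8, hop_counts=[1, 3, 2, 1, 4]))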
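
For item 6, a minimal sketch of the federated pattern: answer on-device with a small model when it is confident, and escalate to a cloud LLM otherwise. The confidence threshold and both query_* helpers are hypothetical placeholders; the article describes the architecture, not an API.

    from dataclasses import dataclass

    @dataclass
    class Answer:
        text: str
        confidence: float   # 0.0-1.0, as reported by the local model (assumed)

    def query_edge_slm(prompt: str) -> Answer:
        """Placeholder for an on-device small language model call."""
        return Answer(text="local draft answer", confidence=0.62)

    def query_cloud_llm(prompt: str) -> str:
        """Placeholder for a cloud LLM call; reached only when escalation is needed."""
        return "cloud answer"

    def federated_answer(prompt: str, threshold: float = 0.8) -> str:
        """Keep the prompt on-device when the SLM is confident; otherwise escalate.
        The prompt could be redacted or summarized before it leaves the device."""
        local = query_edge_slm(prompt)
        if local.confidence >= threshold:
            return local.text
        return query_cloud_llm(prompt)

    print(federated_answer("What does my calendar look like tomorrow?"))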
