SemanticScuttle - klotz.me » klotz: llm+machine learning

klotz: llm* + machine learning*

Zhipu AI Releases GLM-4.7-Flash: A 30B-A3B MoE Model for Efficient Local Coding and Agents

Zhipu AI has released GLM-4.7-Flash, a 30B-A3B MoE model designed for efficient local coding and agent applications. It offers strong coding and reasoning performance with a 128k token context length and supports English and Chinese.

2026-01-22 Tags: llm, glm-4.7-flash, zhipu ai, moe, coding, agents, machine learning, deep learning, local deployment by klotz

The Physics of mHC: Why Deep Learning Needs Energy Conservation

This article presents a compelling argument that the Manifold-Constrained Hyper-Connections (mHC) method in deep learning isn't just a mathematical trick, but a fundamentally physics-inspired approach rooted in the principle of energy conservation.

The author argues that standard neural networks act as "active amplifiers," injecting energy and potentially leading to instability. mHC, conversely, aims to create "passive systems" that route information without creating or destroying it. This is achieved by enforcing constraints on the weight matrices, specifically requiring them to be doubly stochastic.

The derivation of these constraints is presented from a "first principles" physics perspective:

* **Conservation of Signal Mass:** Ensures the total input signal equals the total output signal (Column Sums = 1).
* **Bounding Signal Energy:** Prevents energy from exploding by ensuring the output is a convex combination of inputs (non-negative weights).
* **Time Symmetry:** Guarantees energy conservation during backpropagation (Row Sums = 1).

The article also draws a parallel to Information Theory. Under the Data Processing Inequality, processing can only preserve or lose information, never add it, so mHC is framed as a way to keep that loss small: it preserves information through "soft routing", akin to a permutation, rather than lossy compression.

Finally, it explains how the Sinkhorn-Knopp algorithm is used to enforce these constraints, effectively projecting the network's weights onto the Birkhoff polytope (the set of doubly stochastic matrices). The article argues that this ensures stability and keeps the network consistent with thermodynamic principles. The core idea is that a stable deep network should behave like a system of pipes and valves, routing information without amplifying it.
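The three constraints listed above can be illustrated with a minimal pure-Python sketch of Sinkhorn-Knopp scaling. This is not the article's or the mHC authors' implementation: the function name `sinkhorn_knopp`, the `raw_scores` matrix, the iteration count, and the use of `exp` to obtain positive entries are all assumptions made for illustration. Alternating row and column normalization converges toward a doubly stochastic matrix, which is an approximate scaling rather than an exact Euclidean projection.

```python
import math

def sinkhorn_knopp(scores, n_iters=200):
    """Scale a square matrix of raw scores toward a doubly stochastic matrix."""
    n = len(scores)
    # Exponentiate so every entry is strictly positive (non-negative weights).
    m = [[math.exp(x) for x in row] for row in scores]
    for _ in range(n_iters):
        # Row step: rescale each row so it sums to 1.
        rows = []
        for row in m:
            s = sum(row)
            rows.append([x / s for x in row])
        m = rows
        # Column step: rescale each column so it sums to 1.
        col = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col[j] for j in range(n)] for i in range(n)]
    return m

# Hypothetical unconstrained scores, chosen only for this example.
raw_scores = [[0.5, -1.0, 2.0], [1.5, 0.3, -0.7], [0.0, 0.8, 0.1]]
P = sinkhorn_knopp(raw_scores)

row_sums = [sum(row) for row in P]
col_sums = [sum(P[i][j] for i in range(3)) for j in range(3)]

# Route a signal through P: y = P x.
x = [3.0, -1.0, 4.0]
y = [sum(P[i][j] * x[j] for j in range(3)) for i in range(3)]
```

Because the columns of `P` sum to 1, `sum(y)` matches `sum(x)` up to rounding (the "signal mass" constraint). Because each row is non-negative and sums to 1, every entry of `y` is a convex combination of the entries of `x` and so stays between `min(x)` and `max(x)`.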

2026-01-14 Tags: mhc, deep learning, physics, energy conservation, doubly stochastic matrices, sinkhorn-knopp algorithm, information theory, neural networks, deep seek, llm by klotz

Run LLMs on Intel® GPUs Using llama.cpp

This article details how to run Large Language Models (LLMs) on Intel GPUs using the llama.cpp framework and its new SYCL backend, offering performance improvements and broader hardware support.

2026-01-07 Tags: llama.cpp, llm, intel gpu, sycl, oneapi, inference, machine learning, linux by klotz

GenAI_Agents

This repository provides tutorials and implementations for various Generative AI Agent techniques, from basic to advanced. It serves as a comprehensive guide for building intelligent, interactive AI systems.

2026-01-02 Tags: agents, nlp, llm, machine learning, natural language processing by klotz

The Optimal Architecture for Small Language Models

This article details research into finding the optimal architecture for small language models (70M parameters), exploring depth-width tradeoffs, comparing different architectures, and introducing Dhara-70M, a diffusion model offering 3.8x faster throughput with improved factuality.

2025-12-27 Tags: llm, nlp, small language models, architecture, diffusion, llama, gemma, deep learning by klotz

Towards Better Search with Domain-Aware Text Embeddings for C2C Marketplaces

This paper reports on an experiment to build a domain-aware Japanese text-embedding approach to improve the quality of search at Mercari, Japan's largest C2C marketplace.

2025-12-25 Tags: text embeddings, information retrieval, search, machine learning, llm, fine tuning by klotz

Choosing the Right Chunking Strategy: A Comprehensive Guide to RAG Optimization

This article explores different chunking strategies for Retrieval-Augmented Generation (RAG) systems, comparing nine approaches using the agenticmemory library to improve retrieval accuracy and reduce hallucinations.

2025-12-22 Tags: llm, performance, rag, chunking, embedding, vector database, rag optimization by klotz

Portable AI Agent with UNIHIKER K10

This project guides you through building a portable AI agent using the UNIHIKER K10, Xiaozhi AI firmware, and a custom 3D-printed case. It covers hardware overview, firmware flashing, console setup, and 3D printing services.

2025-12-10 Tags: llm, iot, machine learning, unihiker k10, esp32-s3, xiaozhi ai, 3d printing, edge ai by klotz

Cisco Released Cisco Time Series Model: Their First Open-Weights Foundation Model based on Decoder-only Transformer Architecture

Cisco and Splunk have introduced the Cisco Time Series Model, a univariate zero-shot time series foundation model designed for observability and security metrics. It is released as an open-weight checkpoint on Hugging Face.

* **Multiresolution data is common:** The model handles data where fine-grained (e.g., 1-minute) and coarse-grained (e.g., hourly) data coexist, a typical pattern in observability platforms where older data is often aggregated.
* **Long context windows are needed:** It's built to leverage longer historical data (up to 16384 points) than many existing time series models, improving forecasting accuracy.
* **Zero-shot forecasting is desired:** The model aims to provide accurate forecasts *without* requiring task-specific fine-tuning, making it readily applicable to a variety of time series datasets.
* **Quantile forecasting is important:** It predicts not just the mean forecast but also a range of quantiles (0.1 to 0.9), providing a measure of uncertainty.
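For readers unfamiliar with quantile forecasts, the sketch below shows the pinball (quantile) loss, a common way to score a predicted quantile against an observed value. It is offered as background only: the summary does not say how the Cisco model is trained or evaluated, and the function name `pinball_loss` and the sample numbers are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Pinball loss for one observation and one predicted quantile level q in (0, 1)."""
    diff = y_true - y_pred
    # Under-prediction (diff > 0) is weighted by q, over-prediction by (1 - q).
    return max(q * diff, (q - 1.0) * diff)

# Hypothetical observed metric value and two candidate 0.9-quantile forecasts.
observed = 10.0
loss_too_low = pinball_loss(observed, 8.0, 0.9)    # forecast below the observation
loss_too_high = pinball_loss(observed, 12.0, 0.9)  # forecast above the observation

# Average loss across the quantile levels 0.1 to 0.9 for a single flat forecast.
levels = [k / 10 for k in range(1, 10)]
mean_loss = sum(pinball_loss(observed, 9.0, q) for q in levels) / len(levels)
```

At a high quantile level such as 0.9, a forecast that falls below the observation is penalized more than one that overshoots by the same amount, which is what pushes a well-fit 0.9-quantile forecast toward the upper end of the plausible range.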

2025-12-09 Tags: time series, foundation model, transformer, cisco, splunk, observability, metrics, machine learning, llm by klotz

How to Turn Your LLM Prototype Into a Production-Ready System

This article details the steps to move a Large Language Model (LLM) from a prototype to a production-ready system, covering aspects like observability, evaluation, cost management, and scalability.

2025-12-07 Tags: llm, production, deployment, observability, evaluation, cost management, scalability, machine learning by klotz
