A unified memory stack that functions as both a memristor and a ferroelectric capacitor is reported, enabling energy-efficient inference and learning at the edge.
   
This paper proposes SkyMemory, a key-value cache (KVC) hosted on a LEO satellite constellation to accelerate transformer-based inference, particularly for large language models (LLMs). It explores different chunk-to-server mapping strategies (rotation-aware, hop-aware, and combined) and presents simulation results and a proof-of-concept implementation demonstrating performance improvements.
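The paper's concrete mapping algorithms are not reproduced here; as a rough Python sketch of what rotation-aware, hop-aware, and combined chunk-to-server placement could look like, the snippet below assumes a hypothetical per-satellite view of remaining visibility time and hop distance. The names `rotation_cost`, `hop_cost`, and `place_chunk` are illustrative, not the paper's API.

```python
from dataclasses import dataclass

@dataclass
class Satellite:
    """Hypothetical per-satellite state; fields are illustrative, not from the paper."""
    sat_id: int
    visible_seconds: float   # how long this satellite stays in view before rotating away
    hops_from_client: int    # inter-satellite hops between the client's gateway and this satellite

def rotation_cost(sat: Satellite, needed_seconds: float) -> float:
    """Penalize satellites that will rotate out of view before the cached chunk is likely reused."""
    return 0.0 if sat.visible_seconds >= needed_seconds else needed_seconds - sat.visible_seconds

def hop_cost(sat: Satellite) -> float:
    """Penalize satellites that are many hops away from the requesting client."""
    return float(sat.hops_from_client)

def place_chunk(sats: list[Satellite], needed_seconds: float,
                w_rotation: float = 1.0, w_hop: float = 1.0) -> Satellite:
    """Combined strategy: pick the satellite minimizing a weighted sum of both costs.
    Setting w_hop=0 degenerates to a rotation-aware-only policy, w_rotation=0 to a hop-aware-only one."""
    return min(sats, key=lambda s: w_rotation * rotation_cost(s, needed_seconds) + w_hop * hop_cost(s))

if __name__ == "__main__":
    constellation = [
        Satellite(sat_id=0, visible_seconds=120.0, hops_from_client=3),
        Satellite(sat_id=1, visible_seconds=40.0, hops_from_client=1),
        Satellite(sat_id=2, visible_seconds=300.0, hops_from_client=5),
    ]
    chosen = place_chunk(constellation, needed_seconds=90.0)
    print(f"chunk placed on satellite {chosen.sat_id}")
```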
   
   The article introduces the concept of Federated Language Models, combining edge-based Small Language Models (SLMs) with cloud-based Large Language Models (LLMs) for enhanced privacy and performance in AI applications.
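The article describes the split at a conceptual level; as a minimal Python sketch of one way the edge/cloud division could be wired, the snippet below assumes a hypothetical privacy check that keeps sensitive prompts on the on-device SLM and routes everything else to the cloud LLM. The routing rule, regexes, and function names are illustrative assumptions, not the article's design.

```python
import re
from typing import Callable

# Hypothetical model handles: any callables mapping a prompt string to a completion string.
EdgeModel = Callable[[str], str]
CloudModel = Callable[[str], str]

SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),      # SSN-like numbers (illustrative)
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),    # email addresses (illustrative)
]

def contains_sensitive_data(prompt: str) -> bool:
    """Crude privacy check; a real system would use a proper classifier or policy engine."""
    return any(p.search(prompt) for p in SENSITIVE_PATTERNS)

def federated_complete(prompt: str, edge_slm: EdgeModel, cloud_llm: CloudModel) -> str:
    """Route privacy-sensitive prompts to the on-device SLM, everything else to the cloud LLM."""
    if contains_sensitive_data(prompt):
        return edge_slm(prompt)   # sensitive data never leaves the device
    return cloud_llm(prompt)      # leverage the larger model for general queries

if __name__ == "__main__":
    # Stub models so the sketch runs standalone.
    edge = lambda p: f"[edge SLM] handled locally: {len(p)} chars"
    cloud = lambda p: f"[cloud LLM] handled remotely: {len(p)} chars"
    print(federated_complete("Summarize my note to jane.doe@example.com", edge, cloud))
    print(federated_complete("Explain transformers in one paragraph", edge, cloud))
```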