Tags: computer science* + machine learning*

  1. Personal website of Alex L. Zhang, a PhD student at MIT CSAIL focusing on the efficiency and utilization of language models. His research spans ML systems, language model benchmarks, and specialized model development.
    Key areas of work include:
    - Recursive Language Models (RLMs) and Project Popcorn
    - GPU programming competitions via KernelBot and GPU MODE
    - Benchmarking capabilities through VideoGameBench and KernelBench
    - Development of models like Neo-1 and KernelLLM-8B
  2. This is an open, unconventional textbook covering mathematics, computing, and artificial intelligence from foundational principles. It's designed for practitioners seeking a deep understanding, moving beyond exam preparation and focusing on real-world application. The author, drawing from years of experience in AI/ML, has compiled notes that prioritize intuition, context, and clear explanations, avoiding dense notation and outdated material.
    The compendium covers a broad range of topics, from vectors and matrices to machine learning, computer vision, and multimodal learning, with future chapters planned for areas like data structures and AI inference.
  3. In cellular automata, simple rules create elaborate structures. Now researchers can start with the structures and reverse-engineer the rules.
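    The forward direction is easy to demonstrate: a one-dimensional automaton whose next state depends only on a three-cell neighborhood already produces intricate global patterns. Below is a minimal Python sketch using Wolfram's Rule 110; the rule number, grid width, and number of steps are arbitrary illustrative choices, not taken from the article.

      # Minimal elementary cellular automaton (Wolfram Rule 110): a simple
      # local rule producing elaborate global structure. Rule number and
      # grid size are arbitrary choices for illustration only.
      def step(cells, rule=110):
          # Each cell's next state depends only on itself and its two neighbors.
          table = [(rule >> i) & 1 for i in range(8)]
          n = len(cells)
          return [table[(cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n]]
                  for i in range(n)]

      cells = [0] * 40 + [1] + [0] * 40          # single live cell in the middle
      for _ in range(40):
          print("".join("#" if c else "." for c in cells))
          cells = step(cells)

    Reverse-engineering runs the other way: given observed configurations, infer which of the 256 possible elementary rule tables could have generated them.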
  4. A new study by MIT CSAIL researchers maps the challenges of AI in software development, identifying bottlenecks and highlighting research directions to move the field forward, aiming to allow humans to focus on high-level design while automating routine tasks.
  5. "We present a systematic review of some of the popular machine learning based email spam filtering approaches."

    "Our review covers survey of the important concepts, attempts, efficiency, and the research trend in spam filtering."
  6. This paper introduces Cross-Layer Attention (CLA), an extension of Multi-Query Attention (MQA) and Grouped-Query Attention (GQA) for reducing the size of the key-value cache in transformer-based autoregressive large language models (LLMs). The authors demonstrate that CLA can reduce the cache size by another 2x while maintaining nearly the same accuracy as unmodified MQA, enabling inference with longer sequence lengths and larger batch sizes.
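    The core idea is that adjacent layers share a single key/value cache rather than each layer storing its own, so the per-token cache footprint shrinks by the sharing factor. Below is a minimal NumPy sketch of that sharing pattern with a factor of 2; the shapes, the dictionary-based cache, and the use of raw hidden states in place of learned key/value projections are simplifying assumptions for illustration, not the paper's implementation.

      # Sketch of the idea behind Cross-Layer Attention (CLA): consecutive
      # layers share one key/value cache instead of each keeping its own,
      # roughly halving KV-cache memory. Shapes, the sharing factor of 2,
      # and the dict-based "cache" are illustrative assumptions only.
      import numpy as np

      n_layers, d = 8, 64
      kv_cache = {}                              # maps a layer group -> (K, V)

      def attend(q, k, v):
          # Plain scaled dot-product attention over the cached keys/values.
          scores = q @ k.T / np.sqrt(d)
          weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
          weights /= weights.sum(axis=-1, keepdims=True)
          return weights @ v

      def decode_step(x_per_layer):
          # x_per_layer[i]: this step's hidden state entering layer i (toy stand-in).
          for layer in range(n_layers):
              group = layer // 2                 # layers 0-1 share, 2-3 share, ...
              if layer % 2 == 0:
                  # Only the first layer of each pair projects and caches new K/V.
                  k_new = x_per_layer[layer].reshape(1, d)   # stand-in for W_k @ x
                  v_new = x_per_layer[layer].reshape(1, d)   # stand-in for W_v @ x
                  K, V = kv_cache.get(group, (np.empty((0, d)), np.empty((0, d))))
                  kv_cache[group] = (np.vstack([K, k_new]), np.vstack([V, v_new]))
              K, V = kv_cache[group]             # the odd layer reuses its pair's cache
              _ = attend(x_per_layer[layer].reshape(1, d), K, V)

      decode_step([np.random.randn(d) for _ in range(n_layers)])
      print({g: K.shape for g, (K, _) in kv_cache.items()})   # 4 caches for 8 layers

    In the sketch only the even-numbered layer of each pair writes new keys and values; its partner attends over the same cache, which is why eight layers end up with only four caches.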
