klotz: machine learning*


"Machine learning is a subset of artificial intelligence in the field of computer science that often uses statistical techniques to give computers the ability to 'learn' (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed."



  1. Learn about the new Amazon time series model, which you can use to forecast energy usage, traffic congestion, and weather.
    2024-04-10, by klotz
  2. Learn about the importance of evaluating classification models and how to use the confusion matrix and ROC curves to assess model performance. This post covers the basics of both methods, their components, calculations, and how to visualize the results using Python.
  3. This GitHub repository contains a course on Large Language Models (LLMs) with roadmaps and Colab notebooks. The course is divided into three parts: LLM Fundamentals, The LLM Scientist, and The LLM Engineer. Each part covers various topics, including mathematics, Python, neural networks, instruction datasets, pre-training, supervised fine-tuning, reinforcement learning from human feedback, evaluation, quantization, new trends, running LLMs, building a vector storage, retrieval augmented generation, advanced RAG, inference optimization, and deployment.
    2024-04-08, by klotz
  4. This article explores how to boost the performance of small language models by using supervision from larger ones through knowledge distillation. It provides a step-by-step guide to distilling knowledge from a teacher model (Llama 2 70B) into a student model (TinyLlama) using unlabeled in-domain data and targeted prompting.
  5. Quivr is an open-source RAG framework and a robust AI assistant that helps you manage and interact with information, reducing the burden of information overload. It integrates with all your files and programs, making it easy to find and analyze your data in one place.
  6. A discussion of the efficiency of Random Forest algorithms for PCA and feature importance. By Christopher Karg for Towards Data Science.
  7. The paper proposes a two-phase framework called TnT-LLM to automate the process of end-to-end label generation and assignment for text mining using large language models, where LLMs produce and refine a label taxonomy iteratively using a zero-shot, multi-stage reasoning approach, and are used as data labelers to yield training samples for lightweight supervised classifiers. The framework is applied to the analysis of user intent and conversational domain for Bing Copilot, achieving accurate and relevant label taxonomies and a favorable balance between accuracy and efficiency for classification at scale.
  8. - Provides a list of 55 categorical encoders and explains how to use the code as a supplement to the Category Encoders Python module.
    - Categorizes the encoders into families and explains how to reuse code from the benchmark to include your own encoder or dataset in the comparison.
  9. ColBERT is a new way of scoring passage relevance using a BERT language model that substantially solves the problems with dense passage retrieval.
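
The confusion matrix and ROC calculations mentioned in item 2 can be sketched in a few lines of plain Python. This is a minimal illustration and not the code from the bookmarked post: the function names (`confusion_matrix`, `roc_points`, `auc`) and the toy labels and scores are assumptions chosen for the example, and it assumes binary labels 0/1 with at least one positive and one negative sample.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels encoded as 0 and 1."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1
        elif pred == 1 and truth == 0:
            fp += 1
        elif pred == 0 and truth == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def roc_points(y_true, scores):
    """Return ROC points (false positive rate, true positive rate).

    Each distinct score is used as a threshold, from highest to lowest;
    a sample is predicted positive when its score is >= the threshold.
    Assumes y_true contains at least one 0 and at least one 1.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        y_pred = [1 if s >= threshold else 0 for s in scores]
        tp, fp, _tn, _fn = confusion_matrix(y_true, y_pred)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the bookmarked post.
    y_true = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]

    # Confusion matrix at a fixed decision threshold of 0.5.
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print(confusion_matrix(y_true, y_pred))  # (tp, fp, tn, fn)

    points = roc_points(y_true, scores)
    print(points)
    print(auc(points))  # 0.75 for this toy data
```

In practice a library such as scikit-learn would usually replace these hand-written helpers; the sketch only makes the underlying counts and rates explicit.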
