This article explores various metrics used to evaluate the performance of classification machine learning models, including precision, recall, F1-score, accuracy, and alert rate. It explains how these metrics are calculated and provides insights into their application in real-world scenarios, particularly in fraud detection.
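Those metrics are straightforward to compute from raw confusion-matrix counts. Below is a minimal sketch; it assumes "alert rate" means the fraction of all cases the model flags as positive (i.e. that would raise an alert in a fraud pipeline), which matches its common usage in fraud detection but is an assumption, not a definition from the article.

```python
# Sketch: classification metrics from confusion-matrix counts.
# tp/fp/fn/tn = true positives, false positives, false negatives, true negatives.
# "alert_rate" is assumed to mean the share of all cases flagged positive.

def classification_metrics(tp, fp, fn, tn):
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # of flagged cases, how many were fraud
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # of actual fraud, how much was caught
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)            # harmonic mean of precision and recall
    accuracy = (tp + tn) / total                       # overall fraction correct
    alert_rate = (tp + fp) / total                     # fraction of cases that raise an alert
    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy, "alert_rate": alert_rate}

# Hypothetical example: a fraud model scored on 1,000 transactions
m = classification_metrics(tp=40, fp=10, fn=20, tn=930)
# precision = 0.80, recall ≈ 0.667, accuracy = 0.97, alert_rate = 0.05
```

Note that with heavily imbalanced fraud data, accuracy alone is misleading (a model that never alerts can still score 0.97 here), which is why precision, recall, and alert rate are reported alongside it.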
This guide demonstrates how to execute end-to-end LLM workflows for developing and productionizing LLMs at scale. It covers data preprocessing, fine-tuning, evaluation, and serving.
Discusses trends in Large Language Model (LLM) architecture, including the shift toward more GPUs, more weights, and more tokens; energy-efficient implementations; the role of LLM routers; and the need for better evaluation metrics, faster fine-tuning, and self-tuning.
Langfuse is an open-source LLM engineering platform that offers tracing, prompt management, evaluation, datasets, metrics, and a playground for debugging and improving LLM applications. It is backed by several renowned companies and has won multiple awards. Langfuse is built with security in mind, holding SOC 2 Type II and ISO 27001 certifications and maintaining GDPR compliance.
Discover how to build custom LLM evaluators for specific real-world needs
Why evaluating LLM apps matters and how to get started