Tags: metrics* + llm*


  1. Cisco and Splunk have introduced the Cisco Time Series Model, a univariate, zero-shot time series foundation model designed for observability and security metrics. It is released as an open-weight checkpoint on Hugging Face.

    * **Multiresolution data is common:** The model handles data where fine-grained (e.g., 1-minute) and coarse-grained (e.g., hourly) data coexist, a typical pattern in observability platforms where older data is often aggregated.
    * **Long context windows are needed:** It is built to use a longer history (up to 16,384 points) than many existing time series models, with the aim of improving forecasting accuracy.
    * **Zero-shot forecasting is desired:** The model aims to provide accurate forecasts *without* requiring task-specific fine-tuning, making it readily applicable to a variety of time series datasets.
    * **Quantile forecasting is important:** It predicts not just the mean forecast but also a range of quantiles (0.1 to 0.9), providing a measure of uncertainty.
  2. This article explores various metrics used to evaluate the performance of classification machine learning models, including precision, recall, F1-score, accuracy, and alert rate. It explains how these metrics are calculated and provides insights into their application in real-world scenarios, particularly in fraud detection.
  3. A discussion post on Reddit's LocalLLaMA subreddit about logging the output of running models and monitoring their performance, specifically for debugging errors and warnings and for performance analysis. The post also asks for flags that write logs to flat files and for GPU metrics (GPU utilization, RAM usage, TensorCore usage, etc.) to support troubleshooting and analytics.
  4. Langfuse is an open-source LLM engineering platform that offers tracing, prompt management, evaluation, datasets, metrics, and a playground for debugging and improving LLM applications. It is backed by several renowned companies and has won multiple awards. Langfuse is built with security in mind: it holds SOC 2 Type II and ISO 27001 certifications and is GDPR compliant.
  5. Why evaluating LLM apps matters and how to get started
    2023-11-10, by klotz
  6. 2023-10-13, by klotz
  7. 2023-10-13, by klotz
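
The classification metrics named in bookmark 2 (precision, recall, F1-score, accuracy, and alert rate) can be computed from the four confusion-matrix counts. The following is a minimal sketch, not code from the linked article. The names `ClassificationMetrics` and `evaluate` are hypothetical, and alert rate is assumed here to mean the share of all cases the model flags as positive, which is a common reading in fraud detection but may differ from the article's definition.

```python
from dataclasses import dataclass


@dataclass
class ClassificationMetrics:
    precision: float
    recall: float
    f1: float
    accuracy: float
    alert_rate: float


def _safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true: list[int], y_pred: list[int]) -> ClassificationMetrics:
    """Compute binary classification metrics; 1 = positive (e.g. fraud)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = _safe_div(tp, tp + fp)   # flagged cases that were truly positive
    recall = _safe_div(tp, tp + fn)      # true positives that were caught
    f1 = _safe_div(2 * precision * recall, precision + recall)
    accuracy = _safe_div(tp + tn, len(pairs))
    # Assumption: alert rate = fraction of all cases flagged as positive.
    alert_rate = _safe_div(tp + fp, len(pairs))
    return ClassificationMetrics(precision, recall, f1, accuracy, alert_rate)


if __name__ == "__main__":
    labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
    preds = [1, 0, 1, 0, 0, 0, 0, 0, 0, 1]
    print(evaluate(labels, preds))
```

On imbalanced data such as fraud, accuracy can stay high even when recall is poor, which is why precision, recall, and alert rate are usually read together.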

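The multiresolution pattern described in bookmark 1, where recent data stays at 1-minute resolution while older data is aggregated to a coarser one, can be illustrated with a small sketch. This is not the Cisco Time Series Model's API or its actual preprocessing. The function `build_multiresolution_context` and its parameters `fine_len` and `factor` are hypothetical, and mean aggregation is an assumption.

```python
from statistics import fmean


def build_multiresolution_context(
    minute_values: list[float], *, fine_len: int, factor: int
) -> tuple[list[float], list[float]]:
    """Split a 1-minute series into a coarse history and a fine recent window.

    The last `fine_len` points are kept at full resolution. Everything older
    is averaged in buckets of `factor` points (e.g. factor=60 for hourly).
    """
    if fine_len < 0 or factor <= 0:
        raise ValueError("fine_len must be >= 0 and factor must be > 0")

    split = max(len(minute_values) - fine_len, 0)
    older, recent = minute_values[:split], minute_values[split:]

    # Drop the oldest partial bucket so every coarse point covers exactly
    # `factor` fine points and the coarse history ends where the fine
    # window begins.
    older = older[len(older) % factor:]
    coarse = [fmean(older[i:i + factor]) for i in range(0, len(older), factor)]
    return coarse, list(recent)


if __name__ == "__main__":
    series = [float(v) for v in range(11)]
    coarse, fine = build_multiresolution_context(series, fine_len=4, factor=3)
    print(coarse)  # [2.0, 5.0]
    print(fine)    # [7.0, 8.0, 9.0, 10.0]
```

With `factor=60`, each coarse point would summarize one hour of 1-minute data, which mirrors how observability platforms often roll up older metrics.
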