klotz: machine learning*

"Machine learning is a subset of artificial intelligence in the field of computer science that often uses statistical techniques to give computers the ability to "learn" (i.e., progressively improve performance on a specific task) with data, without being explicitly programmed.

https://en.wikipedia.org/wiki/Machine_learning

0 bookmark(s) - Sort by: Date ↓ / Title / - Bookmarks from other users for this tag

  1. This article details how to accelerate deep learning and LLM inference using Apache Spark, focusing on distributed inference strategies. It covers basic deployment with `predict_batch_udf`, advanced deployment with inference servers like NVIDIA Triton and vLLM, and deployment on cloud platforms like Databricks and Dataproc. It also provides guidance on resource management and configuration for optimal performance.
  2. A post with pithy observations and clear conclusions from building complex LLM workflows, covering topics like prompt chaining, data structuring, model limitations, and fine-tuning strategies.
  3. This article details the often overlooked cost of storing embeddings for RAG systems, and how quantization techniques (int8 and binary) can significantly reduce storage requirements and improve retrieval speed without substantial accuracy loss.
  4. Running GenAI models is easy. Scaling them to thousands of users, not so much. This guide details avenues for scaling AI workloads from proofs of concept to production-ready deployments, covering API integration, on-prem deployment considerations, hardware requirements, and tools like vLLM and Nvidia NIMs.
  5. PaperCoder is a multi-agent LLM system that transforms scientific papers into code repositories through a three-stage pipeline: planning, analysis, and code generation. It aims to create faithful, high-quality implementations.
  6. The article showcases concise Python code snippets (one-liners) for common machine learning tasks like data splitting, standardization, model training (linear regression, logistic regression, decision tree, random forest), and prediction, leveraging libraries such as scikit-learn.

    | **#** | **One-Liner** | **Description** | **Library** | **Use Case** |
    |-----|-----------------------------------------------------|-------------------------------------------------------------------------------------|-------------------|-------------------------------------------------|
    | 1 | `from sklearn.datasets import load_iris; X, y = load_iris(return_X_y=True)` | Loads the Iris dataset, a classic for classification. | scikit-learn | Loading a standard dataset. |
    | 2 | `from sklearn.model_selection import train_test_split; X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)` | Splits the dataset into training and testing sets. | scikit-learn | Preparing data for model training & evaluation.|
    | 3 | `from sklearn.linear_model import LogisticRegression; model = LogisticRegression(random_state=1)` | Creates a Logistic Regression model. | scikit-learn | Binary Classification. |
    | 4 | `model.fit(X_train, y_train)` | Trains the Logistic Regression model. | scikit-learn | Model training. |
    | 5 | `y_pred = model.predict(X_test)` | Predicts labels for the test dataset. | scikit-learn | Making predictions. |
    | 6 | `from sklearn.metrics import accuracy_score; accuracy = accuracy_score(y_test, y_pred)` | Calculates the accuracy of the model. | scikit-learn | Evaluating model performance. |
    | 7 | `import pandas as pd; df = pd.DataFrame(X, columns=iris.feature_names)` | Creates a Pandas DataFrame from the Iris dataset features. | Pandas | Data manipulation and analysis. |
    | 8 | `df 'target' » = y` | Adds the target variable to the DataFrame. | Pandas | Combining features and labels. |
    | 9 | `df.head()` | Displays the first few rows of the DataFrame. | Pandas | Inspecting the data. |
    | 10 | `df.describe()` | Generates descriptive statistics of the DataFrame. | Pandas | Understanding data distribution. |
  7. >"TL;DR: We unify over 23 methods in contrastive learning, dimensionality reduction, spectral clustering, and supervised learning with a single equation."

    >"As the field of representation learning grows, there has been a proliferation of different loss functions to solve different classes of problems. We introduce a single information-theoretic equation that generalizes a large collection of mod- ern loss functions in machine learning. In particular, we introduce a framework that shows that several broad classes of machine learning methods are precisely minimizing an integrated KL divergence between two conditional distributions: the supervisory and learned representations. This viewpoint exposes a hidden information geometry underlying clustering, spectral methods, dimensionality re- duction, contrastive learning, and supervised learning. This framework enables the development of new loss functions by combining successful techniques from across the literature. We not only present a wide array of proofs, connecting over 23 different approaches, but we also leverage these theoretical results to create state-of-the-art unsupervised image classifiers that achieve a +8% improvement over the prior state-of-the-art on unsupervised classification on ImageNet-1K. We also demonstrate that I-Con can be used to derive principled debiasing methods which improve contrastive representation learners."
  8. Optuna is an open-source hyperparameter optimization framework designed to automate the hyperparameter search process for machine learning models. It supports various frameworks like TensorFlow, Keras, Scikit-Learn, XGBoost, and LightGBM, offering features like eager search spaces, state-of-the-art algorithms, and easy parallelization.
  9. This example demonstrates Density-Based Spatial Clustering of Applications with Noise (DBSCAN) using scikit-learn, showing how to generate synthetic clusters, compute DBSCAN clustering, and visualize the results, including core and non-core samples.
  10. DeepMind researchers propose a new 'streams' approach to AI development, focusing on experiential learning and autonomous interaction with the world, moving beyond the limitations of current large language models and potentially surpassing human intelligence.

Top of the page

First / Previous / Next / Last / Page 1 of 0 SemanticScuttle - klotz.me: Tags: machine learning

About - Propulsed by SemanticScuttle