Tags: data*

  1. A deep dive into the structure and performance benefits of Parquet files, including columnar storage, partitioning strategies, and row groups.

    2025-03-14 by klotz
  2. The article discusses how AI agents are transforming the way data is organized and utilized, moving away from rigid tabular structures to more flexible, interconnected data models. This shift is driven by the need for applications to be more intelligent and context-aware, requiring vast and complex datasets. The database layer is becoming increasingly critical, with AI agents necessitating richly structured data to enable near-human levels of logic and intuition.

    2025-03-07 by klotz
  3. LlamaExtract is a powerful, easy-to-use tool that allows users to extract structured data from unstructured documents with minimal effort, available through LlamaCloud’s web UI and Python SDK.

  4. The article discusses the security risks and challenges associated with the increasing use of AI agents in enterprise workflows. It highlights concerns about data access, privacy, and the potential for new vulnerabilities in multi-agent systems. Experts emphasize the need for careful management of agent identities and access permissions to mitigate risks.

  5. OpenSanctions helps investigators find leads, allows companies to manage risk, and enables technologists to build data-driven products by providing a clean, de-duplicated dataset from 276 global sources.

    2025-02-17 by klotz
  6. This article introduces Streamlit, a Python library for building data dashboards, as a way for Python programmers to create graphical front-ends without needing to delve into CSS, HTML, or JavaScript. The author, a seasoned data engineer, explains how Streamlit and similar tools enable the creation of attractive dashboards, marking a shift from traditional tools like Tableau or QuickSight. This piece is the first in a series focusing on Streamlit, with future articles planned on Gradio and Taipy. The author aims to replicate similar layouts and functionality across dashboards using consistent data.

  7. Breser stands for Business Rules & Expression Syntax for Easy Retrieval. It is a powerful and flexible query language designed for efficient log processing and structured data filtering.

  8. An exploration of AG Grid, a JavaScript data grid library used to build interactive and advanced data tables or grids in web applications, highlighting its features, performance, and how it compares to other solutions.

  9. ArchiveBox is a powerful, open-source self-hosted internet archiving solution that lets organizations and individuals archive both public and private web content, retaining control over their data. It supports a variety of input and output formats and can be installed via Docker, pip, or other package managers.

  10. A visual representation of papers on arXiv using UMAP and nomic-embed.

    2024-10-12 by klotz

SemanticScuttle - klotz.me: tagged with "data"