Tags: data engineering* + gcp*


  1. Spotify, the music streaming service, has been a data-driven company since day one, using data for purposes that include payments and experimentation. Managing that volume of data required a more streamlined approach, which led to the development of its internal data platform.

    **Event Delivery System:**
    - **On-Premises Setup:** Initially, Spotify used on-premises solutions like Kafka and HDFS. Event data from clients was captured, timestamped, and routed to a central Hadoop cluster.
    - **Google Cloud Transition:** In 2015, Spotify moved to Google Cloud Platform (GCP) for better scalability and reliability. Key components include File Tailer, Event Delivery Service, Reliable Persistent Queue, and ETL jobs using Dataflow and BigQuery.
  2. Notebooks are not enough for ML at scale


