A Docker container for quickly standing up a Splunk instance, complete with Eventgen and Splunk's Machine Learning app for testing and training purposes.
This article demonstrates how basic statistics and techniques like PCA can be used to analyze tabular datasets, highlighting the importance of data preprocessing, statistical tests, and handling multicollinearity.
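As a rough illustration of the multicollinearity point, the sketch below flags pairs of columns whose Pearson correlation is very high. It is a simple screening step and not necessarily the article's method. The column names, the sample values, and the 0.9 threshold are all assumptions made for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

def collinear_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for column pairs with |r| >= threshold."""
    names = list(columns)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(columns[a], columns[b])
            if abs(r) >= threshold:
                flagged.append((a, b, r))
    return flagged

# Hypothetical table: the same height recorded in two units, plus a weight column.
table = {
    "height_cm": [150, 160, 170, 180, 190],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "weight_kg": [65, 58, 80, 70, 90],
}
flagged = collinear_pairs(table)
print([(a, b) for a, b, _ in flagged])
```

Here only the two height columns are flagged, so one of them could be dropped (or the pair combined) before fitting a model.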
This article introduces interpretable clustering, a field that aims to provide insights into the characteristics of clusters formed by clustering algorithms. It discusses the limitations of traditional clustering methods and highlights the benefits of interpretable clustering in understanding data patterns.
Stumpy is a Python library designed for efficient analysis of large time series data. It uses matrix profile computation to identify patterns, anomalies, and shapelets. Stumpy leverages optimized algorithms, parallel processing, and early termination to significantly reduce computational overhead.
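To make the matrix profile idea concrete, here is a naive brute-force sketch in pure Python. It is not Stumpy's optimized algorithm and does not use the Stumpy API. For every window of length `m` it records the z-normalized Euclidean distance to its nearest non-overlapping neighbor, so repeated patterns score near zero and anomalies score high. The function names, the sample series, and the exclusion-zone size are assumptions chosen for illustration.

```python
import math

def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / len(window))
    if std == 0:
        return [0.0] * len(window)
    return [(x - mean) / std for x in window]

def naive_matrix_profile(series, m):
    """O(n^2 * m) matrix profile: nearest-neighbor distance for each window."""
    n_sub = len(series) - m + 1
    windows = [znorm(series[i:i + m]) for i in range(n_sub)]
    exclusion = math.ceil(m / 4)  # assumed zone that skips trivial self-matches
    profile = []
    for i in range(n_sub):
        best = math.inf
        for j in range(n_sub):
            if abs(i - j) <= exclusion:
                continue
            best = min(best, math.dist(windows[i], windows[j]))
        profile.append(best)
    return profile

# Hypothetical series: a repeating pattern with one corrupted point at index 13.
series = [0, 1, 2, 1] * 6
series[13] = 9
profile = naive_matrix_profile(series, m=4)
anomaly_start = max(range(len(profile)), key=profile.__getitem__)
print(anomaly_start)
```

The window with the largest profile value starts somewhere in positions 10 to 13, which are exactly the windows that contain the corrupted point.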
Outlier treatment is a necessary step in data analysis. This article, part 3 of a four-part series, walks through effective methods and tools for detecting outliers.
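One widely used detection rule is the interquartile range (IQR) fence, sketched below with only the standard library. The summary does not say which methods the article covers, so treat this as a generic example. The function name, the sample values, and the conventional multiplier `k=1.5` are assumptions.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Return values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical measurements with one obvious outlier.
readings = [10, 12, 11, 13, 12, 11, 95]
print(iqr_outliers(readings))  # prints [95]
```

Whether a flagged value should be removed, capped, or kept is a separate decision that depends on the data and the analysis.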
Quadratic is a modern spreadsheet that combines the familiarity of a spreadsheet with the power of code, allowing you to work with data and code collaboratively in real time. It supports popular programming languages like Python, SQL, and JavaScript, and offers features such as dynamic charts, APIs, multi-line formulas, and AI integration.
This article discusses causal inference, an emerging field in machine learning that goes beyond predicting what could happen to focus on understanding the cause-and-effect relationships in data. The author explains how to detect and fix errors in a directed acyclic graph (DAG) to make it a valid representation of the underlying data.
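Before a causal graph can be compared against data, it has to be acyclic in the first place. The sketch below covers only that structural prerequisite, using Python's standard `graphlib` module. It is not the article's procedure for testing a DAG against data, and the function name and the variable names in the example edges are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

def find_cycle(edges):
    """Return the nodes of one cycle in a cause -> effect edge list, or None."""
    sorter = TopologicalSorter()
    for cause, effect in edges:
        sorter.add(effect, cause)  # an effect depends on its cause
    try:
        sorter.prepare()  # raises CycleError if the graph is not a DAG
    except CycleError as err:
        return list(err.args[1])  # first and last node are the same
    return None

# Hypothetical causal graph, then the same graph with a feedback edge added.
valid_edges = [("smoking", "tar"), ("tar", "cancer")]
broken_edges = valid_edges + [("cancer", "smoking")]

print(find_cycle(valid_edges))
print(find_cycle(broken_edges))
```

If a cycle is reported, at least one edge along it has to be removed or reversed before the graph can serve as a DAG.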