Beyond Leaderboards: Why Ontology-Based Benchmarks Give a More Accurate Picture of LLM Reasoning
Everyone agrees that Large Language Models should be evaluated rigorously. Dozens of benchmarks exist — MMLU, HellaSwag, BIG-Bench, GSM8K, and many more. Leaderboards are updated weekly, and new models claim state-of-the-art performance almost daily.
