🤖

Data Science & AI/ML

ML model design, data pipelines, AI evaluation, and analytics. Optimized for 2026-era models such as Claude 4 and GPT-5.

Claude · Advanced

ML Project Design Document

Use Case: Machine learning product development

You are a Staff Machine Learning Engineer. Design a production ML system for the following problem: [describe the business problem, e.g., "predict customer churn 30 days in advance"]. Deliverables:
1) Problem Formulation: reframe the business problem as an ML problem (classification, regression, ranking, or generation) and define the prediction target precisely.
2) Data Requirements: what data is needed, where it comes from, and what quality issues to expect.
3) Feature Engineering Plan: 10 candidate features with rationale; identify target leakage risks.
4) Model Selection: evaluate 3 candidate algorithms; recommend one with justification.
5) Training Infrastructure: compute requirements, training frequency, retraining triggers.
6) Evaluation Framework: the right metric for this problem (not just accuracy), offline vs online evaluation, a baseline to beat.
7) Deployment Architecture: batch vs real-time serving, A/B test design for model rollout.
8) Monitoring Plan: data drift, model drift, business metric correlation.
9) Failure Modes: what goes wrong when the model is confidently wrong?
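As one illustration of what the data drift part of the monitoring plan might produce, here is a minimal sketch of the Population Stability Index (PSI) for a single numeric feature. The function name, the equal-width binning, and the sample data are assumptions made for this example, and the 0.1 / 0.25 cut-offs mentioned in the comments are a common rule of thumb rather than a standard.

```python
import math
from typing import Sequence


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    n_bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a reference sample and a live sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    if hi == lo:
        raise ValueError("reference sample has no spread; PSI is undefined")
    width = (hi - lo) / n_bins  # equal-width bins fitted on the reference data

    def bin_shares(values: Sequence[float]) -> list:
        counts = [0] * n_bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or a division by zero.
        return [max(c / len(values), eps) for c in counts]

    e, a = bin_shares(expected), bin_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical data: a training-time sample and a live sample with a shifted mean.
reference = [i / 100 for i in range(1000)]
shifted = [i / 100 + 3.0 for i in range(1000)]

psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, shifted)

# Rule of thumb (an assumption, tune per feature): below 0.1 is stable,
# above 0.25 is a shift worth investigating or a possible retraining trigger.
print(f"identical samples: {psi_same:.3f}, shifted sample: {psi_shifted:.3f}")
```

A check like this would typically run per feature on a schedule, with the reference sample frozen at training time.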
ChatGPT · Intermediate

Exploratory Data Analysis Plan

Use Case: Data exploration and analysis

You are a senior data scientist. I have a dataset described as: [describe columns, row count, time range, source, and the business question]. Create a structured EDA (Exploratory Data Analysis) plan. The plan should include:
1) Data Quality Checklist: what to check first (missing values, duplicates, data type issues, outlier detection approach).
2) Univariate Analysis: which variables to examine first and why.
3) Bivariate Analysis: the 5 most important variable relationships to investigate given the business question.
4) Segmentation Analysis: are there natural groups in this data that should be analyzed separately?
5) Hypothesis List: 5 hypotheses to test before building any model.
6) Visualization Plan: for each analysis, the best chart type and what a "surprising" vs "expected" finding would look like.
Also provide the Python or R code structure for steps 1-3.
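For a sense of the kind of code the data quality step might yield, the sketch below counts missing values and exact duplicate rows and flags outliers with Tukey's 1.5 × IQR rule. It uses only the standard library so it runs anywhere. The function name, the column names, and the sample rows are hypothetical; in practice a response would more likely use pandas.

```python
from collections import Counter
from statistics import quantiles


def data_quality_report(rows, numeric_column):
    """Summarize missing values, exact duplicate rows, and IQR outliers."""
    columns = list(rows[0].keys())

    # Treat None and empty strings as missing.
    missing = {c: sum(1 for r in rows if r.get(c) in (None, "")) for c in columns}

    # Count rows that repeat an earlier row exactly.
    seen = Counter(tuple(r.get(c) for c in columns) for r in rows)
    duplicates = sum(n - 1 for n in seen.values())

    # Flag values more than 1.5 * IQR outside the quartiles (Tukey's rule).
    values = [r[numeric_column] for r in rows if r.get(numeric_column) not in (None, "")]
    q1, _, q3 = quantiles(values, n=4)
    spread = 1.5 * (q3 - q1)
    outliers = [v for v in values if v < q1 - spread or v > q3 + spread]

    return {"missing": missing, "duplicates": duplicates, "outliers": outliers}


# Hypothetical sample data for illustration only.
rows = [
    {"user_id": 1, "plan": "free", "spend": 10.0},
    {"user_id": 2, "plan": "pro", "spend": 12.0},
    {"user_id": 2, "plan": "pro", "spend": 12.0},    # exact duplicate
    {"user_id": 3, "plan": None, "spend": 11.0},     # missing plan
    {"user_id": 4, "plan": "free", "spend": 9.0},
    {"user_id": 5, "plan": "pro", "spend": 13.0},
    {"user_id": 6, "plan": "free", "spend": 500.0},  # extreme value
]

report = data_quality_report(rows, "spend")
print(report)
```

Running the quality checks before any univariate plots keeps duplicates and extreme values from distorting the distributions examined in later steps.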
Claude · Advanced

LLM Evaluation Framework

Use Case: AI product quality assurance

You are an AI evaluation researcher. Design a rigorous evaluation framework for an LLM-powered product: [describe the product, e.g., "an AI customer support agent"]. Framework sections:
1) Evaluation Taxonomy: categorize what needs to be evaluated (Task Performance, Safety, Robustness, User Experience, Cost Efficiency).
2) Metrics per Category: for each category, specific metrics, measurement methodology (human eval vs automated vs hybrid), and scoring rubric.
3) Golden Dataset Design: how to build a ground truth evaluation set of [N] examples covering diverse scenarios, including adversarial cases.
4) Regression Testing Protocol: how to ensure new model versions don't break existing capabilities.
5) Latency and Cost SLAs: acceptable p50/p95/p99 latency and cost per call.
6) Red-Teaming Plan: the 10 most important adversarial prompts to test for this product.
7) Human Eval Interface Design: what annotators see and how to ensure inter-rater reliability.
Also recommend an evaluation framework or platform (e.g., OpenAI Evals, RAGAS, LangSmith) suited for this use case, noting whether it is open source or hosted.
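Inter-rater reliability in the human eval step is often reported as Cohen's kappa, which corrects raw agreement between two annotators for the agreement expected by chance. The following is a minimal sketch under that assumption; the function name and the pass/fail labels are made up for the example, and other statistics (for instance Krippendorff's alpha for more than two raters) may suit a given setup better.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items where both raters chose the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used one identical label for every item
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of eight model responses.
annotator_1 = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "pass"]
annotator_2 = ["pass", "fail", "fail", "pass", "fail", "pass", "pass", "pass"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

Here the annotators agree on 6 of 8 items (0.75), chance agreement is 34/64, and kappa works out to 7/15, roughly 0.47, which is noticeably lower than the raw agreement suggests.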
Claude · Advanced

Data Pipeline Architecture

Use Case: Data infrastructure and engineering

You are a data engineering architect. Design a modern data pipeline for [use case, e.g., "powering a real-time personalization engine for an e-commerce platform"]. Scale: [data volume, e.g., 10M events/day]. Tech stack preferences: [e.g., cloud provider, existing tools]. Design the following layers:
1) Ingestion: sources, ingestion patterns (CDC, streaming, batch), latency requirements.
2) Storage: bronze (raw), silver, and gold layer design (lakehouse medallion pattern); storage format choices (Parquet/Delta/Iceberg) with justification.
3) Transformation: orchestration tool (Airflow/Prefect/Dagster), transformation framework (dbt/Spark), scheduling and dependency management.
4) Serving: OLAP query layer, caching strategy, API design for downstream consumers.
5) Observability: data quality checks, lineage tracking, freshness SLAs and alerting.
6) Cost Optimization: estimated cost and 3 ways to reduce it.
Draw the architecture as an ASCII or Mermaid diagram.
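A freshness SLA from the observability layer reduces to comparing each table's latest load time with an allowed lag. The sketch below shows one way to express that check; the table names, SLA values, and function name are hypothetical, and in a real pipeline the load timestamps would come from the orchestrator or a metadata table rather than a hard-coded dictionary.

```python
from datetime import datetime, timedelta, timezone


def stale_tables(last_loaded, slas, now):
    """Return {table: lateness} for tables whose newest load breaches its SLA."""
    breaches = {}
    for table, sla in slas.items():
        loaded_at = last_loaded.get(table)
        # A table with no recorded load is treated as maximally late.
        lateness = timedelta.max if loaded_at is None else (now - loaded_at) - sla
        if lateness > timedelta(0):
            breaches[table] = lateness
    return breaches


# Hypothetical tables and SLAs; a fixed "now" keeps the example deterministic.
now = datetime(2026, 1, 15, 12, 0, tzinfo=timezone.utc)
last_loaded = {
    "bronze.events": now - timedelta(minutes=3),
    "gold.user_features": now - timedelta(hours=5),
}
slas = {
    "bronze.events": timedelta(minutes=5),
    "silver.sessions": timedelta(hours=1),   # never loaded in this example
    "gold.user_features": timedelta(hours=4),
}

breaches = stale_tables(last_loaded, slas, now)
for table, lateness in breaches.items():
    print(f"ALERT {table}: late by {lateness}")
```

Keeping the SLA per table, rather than one global threshold, lets streaming layers alert within minutes while batch aggregates tolerate hours.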
ChatGPT · Advanced

A/B Test Statistical Design

Use Case: Statistical experimentation

You are a statistician and experimentation platform expert. Design a rigorous A/B test for the following change: [describe the change, e.g., "a new checkout flow"].
Step 1: Hypothesis formulation. Write the null and alternative hypotheses formally.
Step 2: Metric selection. Define the primary metric (must be measurable, attributable, and sensitive) and guardrail metrics.
Step 3: Sample size calculation. Given baseline conversion rate [X%], desired minimum detectable effect [Y%, stated as absolute or relative], significance level [α = 0.05], and power [1-β = 0.80], calculate the required sample size per variant. Show the formula.
Step 4: Test duration. Given [Z daily visitors], how long should the test run? Account for weekly seasonality.
Step 5: Segmentation plan. Should this be run on all users or a segment? What are the risks of each?
Step 6: Analysis plan. When and how to analyze (avoid peeking), how to handle outliers and novelty effects, when to call the test.
Step 7: Decision criteria. What result justifies shipping vs rolling back?
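For reference, the sample size and duration steps can be checked with a few lines of Python. This sketch assumes a two-sided test on two proportions with the unpooled normal approximation, n = (z_{1-α/2} + z_{1-β})² · [p₁(1-p₁) + p₂(1-p₂)] / (p₂ - p₁)², and an absolute minimum detectable effect. The function names and the example inputs are illustrative, and other formulas (pooled variance, continuity correction) give slightly different numbers.

```python
import math
from statistics import NormalDist


def sample_size_per_variant(baseline, mde_abs, alpha=0.05, power=0.80):
    """Users needed in each variant to detect an absolute lift of mde_abs."""
    p1, p2 = baseline, baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)       # unpooled variance term
    return math.ceil((z_alpha + z_beta) ** 2 * variance / mde_abs ** 2)


def duration_days(n_per_variant, daily_visitors, variants=2):
    """Days to collect the sample, rounded up to whole weeks for seasonality."""
    days = math.ceil(n_per_variant * variants / daily_visitors)
    return math.ceil(days / 7) * 7


# Hypothetical inputs: 10% baseline, +2 percentage points, 2,000 visitors per day.
n = sample_size_per_variant(baseline=0.10, mde_abs=0.02)
days = duration_days(n, daily_visitors=2000)
print(f"{n} users per variant, run for {days} days")
```

Rounding the duration up to full weeks means every weekday appears equally often in both variants, which is one simple way to handle the weekly seasonality the prompt asks about.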
Claude · Advanced

RAG System Architecture Design

Use Case: RAG and knowledge base AI systems

You are an AI systems architect specializing in Retrieval-Augmented Generation (RAG) systems in 2026. Design a production RAG system for: [use case, e.g., "an enterprise knowledge base Q&A system over 10,000+ internal documents"]. Architecture decisions to cover:
1) Document Processing Pipeline: chunking strategy (fixed/semantic/hierarchical), metadata extraction, pre-processing for different document types (PDF/HTML/Markdown).
2) Embedding Strategy: model selection for this domain, batch processing, versioning and re-embedding strategy.
3) Vector Database Selection: compare Pinecone/Weaviate/Qdrant/pgvector for this use case; recommend one.
4) Retrieval Strategy: dense vs sparse vs hybrid retrieval, re-ranking, query expansion, HyDE.
5) Context Window Management: how to fit retrieved chunks plus conversation history into the context.
6) Generation: system prompt design, citation handling, hallucination mitigation.
7) Evaluation: the 3 key RAG metrics (context relevance, faithfulness to the retrieved context, also called groundedness, and answer relevance) and how to measure them.
Diagramming: draw the full pipeline in Mermaid.
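Hybrid retrieval needs a way to merge the dense and sparse result lists, and one widely used option is reciprocal rank fusion (RRF), which scores each document by the sum of 1 / (k + rank) over the lists it appears in. The sketch below assumes that technique; the function name and document IDs are hypothetical, and k = 60 is the commonly cited default rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one fused ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list contribute the most.
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a dense (embedding) and a sparse (keyword) retriever.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Because RRF uses only ranks, it avoids having to normalize cosine similarities against keyword scores, which sit on different scales. A re-ranker can then be applied to the top of the fused list.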