ChatGPT Prompts for Data Science
Repository · Free
A repository of useful data science prompts for ChatGPT.
Capabilities (12 decomposed)
role-based prompt templating for data science tasks
Medium confidence
Provides a structured prompt template in which ChatGPT assumes a specific data science role (data scientist, ML engineer, SQL expert, statistician) to deliver specialized expertise. The template has three parts: role specification ('I want you to act as [role]'), task description ('[specific task]'), and input placeholders ('[user context]'). The role assumption steers ChatGPT's responses toward domain-specific terminology, methodologies, and best practices without the user having to spell out each of those expectations in the prompt.
Combines an explicit role-specification pattern ('I want you to act as [role]') with a task description and input placeholders, creating a reusable template framework that maps to 11 distinct data science workflow stages (including data acquisition, exploration, modeling, optimization, and deployment). The same three-part structure is applied consistently across 50+ prompts instead of relying on ad-hoc prompt engineering.
More structured and reusable than generic ChatGPT prompting because it codifies role-assumption as a first-class pattern, enabling non-experts to generate domain-appropriate responses without deep prompt engineering knowledge.
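The three-part structure described above can be sketched as a small helper. This is a minimal illustration, not code from the repository: the function name `build_prompt` and the example role, task, and context strings are assumptions made for this sketch.

```python
def build_prompt(role: str, task: str, context: str) -> str:
    """Assemble a prompt from the three parts: role, task, user context."""
    return (
        f"I want you to act as {role}. "   # part 1: role specification
        f"{task.strip()}\n\n"              # part 2: task description
        f"{context.strip()}"               # part 3: user-supplied input
    )


# Hypothetical example values, chosen only to show the shape of the output.
prompt = build_prompt(
    role="a data scientist",
    task="Suggest an approach for predicting customer churn.",
    context="Dataset: 10,000 rows with columns tenure, monthly_charges, churned.",
)
print(prompt)
```

Keeping the role, task, and context as separate arguments makes it easy to reuse one role across many tasks, which is the reuse the template pattern is aiming for.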
python code generation with data science context
Medium confidence
Generates Python code for data science tasks (model training, data manipulation, visualization) from prompts that supply ChatGPT with a dataset description, a target variable, and a desired outcome. The templates steer generation toward specific libraries (pandas, scikit-learn, matplotlib) and patterns (train-test splits, hyperparameter tuning, feature engineering). The output is intended to be a complete snippet that can be pasted into a Jupyter notebook or script.
Provides 11+ specialized Python code prompts mapped to specific data science workflow stages (model training, feature engineering, hyperparameter tuning, optimization) rather than generic code generation. Each prompt includes role-assumption ('act as data scientist') combined with task-specific context (dataset type, target variable, desired output format).
More targeted than Copilot for data science because prompts are pre-crafted for common ML workflows and include explicit context about dataset structure and modeling goals, reducing the need for iterative refinement.
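As a rough sketch of how the dataset context might be packaged into a code-generation prompt, the example below renders a dataset description, target variable, and goal into one string. The `DatasetSpec` class, the `code_generation_prompt` function, and the churn dataset are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class DatasetSpec:
    """Hypothetical container for the context a code-generation prompt needs."""
    name: str
    columns: dict  # column name -> short type description
    target: str


def code_generation_prompt(spec, goal, libraries):
    """Render dataset description, target variable, and goal into one prompt."""
    if spec.target not in spec.columns:
        raise ValueError(f"target {spec.target!r} is not a listed column")
    column_lines = "\n".join(f"- {col}: {kind}" for col, kind in spec.columns.items())
    return (
        "I want you to act as a data scientist and write Python code.\n"
        f"Dataset: {spec.name}\n"
        f"Columns:\n{column_lines}\n"
        f"Target variable: {spec.target}\n"
        f"Goal: {goal}\n"
        f"Use these libraries: {', '.join(libraries)}."
    )


# Made-up dataset description used only to demonstrate the rendering.
spec = DatasetSpec(
    name="customer_churn.csv",
    columns={"tenure": "int", "monthly_charges": "float", "churned": "0/1 label"},
    target="churned",
)
prompt = code_generation_prompt(
    spec,
    goal="Train a baseline classifier with a train-test split.",
    libraries=["pandas", "scikit-learn"],
)
print(prompt)
```

Checking that the target is one of the listed columns is a cheap guard against the vague or inconsistent dataset descriptions that tend to produce generic code.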
career development and resource recommendation
Medium confidence
Offers career guidance and learning-resource recommendations for data scientists. The user supplies career goals, current skills, and interests in a career-focused prompt ('act as career advisor'), which guides ChatGPT to suggest skill development paths, recommend learning resources, and propose portfolio project ideas. Output includes both the recommendations and the rationale behind them.
Provides dedicated prompts for career guidance as a distinct workflow stage with role-assumption ('act as career advisor') and guidance on recommending skill development paths and portfolio projects. Treats career development as a structured, prompt-driven process.
More personalized than generic career advice because prompts guide ChatGPT to consider specific data science career paths and provide actionable recommendations for skill development and portfolio building.
prompt engineering and optimization techniques
Medium confidence
Documents prompt design patterns, best practices, and optimization techniques for ChatGPT. A dedicated prompt engineering section explains how to structure prompts for clarity, specificity, and effectiveness. This meta-capability helps users improve their own prompts and understand why the provided templates work.
Provides meta-level guidance on prompt engineering as a distinct section within the repository, explaining the principles behind the provided templates (role-assumption, task description, input placeholders). Treats prompt engineering as a learnable skill rather than an art.
More educational than other prompt repositories because it explicitly documents prompt design principles and best practices, enabling users to understand and improve prompts rather than just copy-pasting templates.
code explanation and documentation generation
Medium confidence
Generates natural-language explanations of existing Python or SQL code. The user pastes a code snippet into a role-assumption prompt ('act as code explainer'), which guides ChatGPT to break down the logic, explain library usage, describe data transformations, and flag potential issues. Output is formatted as readable documentation suitable for code comments, docstrings, or knowledge base entries.
Provides dedicated prompts for code explanation as a distinct workflow stage, treating explanation as a first-class task rather than a side effect of code generation. Includes role-assumption ('act as code explainer') combined with guidance on explanation depth and target audience.
More focused than generic ChatGPT explanation because prompts are pre-optimized for data science code patterns (pandas operations, scikit-learn pipelines, SQL queries) and include role-assumption to ensure domain-appropriate terminology.
code optimization and performance improvement suggestions
Medium confidence
Analyzes existing Python or SQL code and generates optimization suggestions. The user pastes a code snippet into an optimization-focused prompt ('act as performance engineer'), which guides ChatGPT to identify bottlenecks, suggest faster algorithms, recommend library-specific optimizations (pandas vectorization, NumPy broadcasting), and provide refactored code. Output includes both the rationale for each optimization and the improved code.
Provides dedicated optimization prompts as a distinct workflow stage, with role-assumption ('act as performance engineer') and guidance on optimization techniques specific to data science libraries (pandas vectorization, numpy broadcasting, SQL query optimization). Includes 5+ optimization-focused prompts covering different code types.
More specialized than generic code optimization tools because prompts are tailored to data science libraries and include role-assumption to ensure recommendations align with data science best practices rather than general software engineering.
sql query generation and optimization
Medium confidence
Generates SQL queries for data extraction, transformation, and analysis from prompts that supply ChatGPT with a database schema description, the desired output, and any optimization requirements. The templates cover common data science query patterns (aggregation, joins, window functions, CTEs), with separate prompts for generating queries and for optimizing them for readability and performance. Output is SQL intended to run directly against the database.
Provides dedicated SQL prompts as a distinct workflow category with role-assumption ('act as SQL expert') and guidance on query patterns specific to data science (feature extraction, aggregation, window functions). Includes separate prompts for query generation vs. optimization.
More focused than generic SQL generation because prompts are pre-optimized for data science use cases (feature engineering, data extraction) and include role-assumption to ensure queries follow data science best practices.
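To make the query patterns concrete, here is a sketch of the kind of aggregation query with a CTE that such a prompt might ask for, run against an in-memory SQLite database. The `orders` table, its rows, and the spend threshold are invented for this example and do not come from the repository.

```python
import sqlite3

# In-memory database with a small, made-up orders table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (customer_id INTEGER, amount REAL);
    INSERT INTO orders VALUES
        (1, 20.0), (1, 30.0), (2, 5.0), (3, 100.0), (3, 50.0);
    """
)

# A CTE computes per-customer features; the outer query filters and sorts them.
query = """
WITH customer_totals AS (
    SELECT customer_id,
           COUNT(*)    AS n_orders,
           SUM(amount) AS total_spend
    FROM orders
    GROUP BY customer_id
)
SELECT customer_id, n_orders, total_spend
FROM customer_totals
WHERE total_spend >= 50
ORDER BY total_spend DESC;
"""

rows = conn.execute(query).fetchall()
for customer_id, n_orders, total_spend in rows:
    print(customer_id, n_orders, total_spend)
conn.close()
```

Running a generated query against a tiny fixture like this is one way to check it before pointing it at a real database, since the prompt itself never sees the actual schema.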
code translation and language conversion
Medium confidence
Translates code between languages and idioms (Python to R, SQL to pandas, etc.). The user supplies the source code and the target language in a translation-focused prompt ('act as code translator'), which guides ChatGPT to preserve the logic while adapting to the target language's idioms and libraries. Output is code in the target language that is intended to be functionally equivalent to the source.
Provides dedicated translation prompts as a distinct workflow stage with role-assumption ('act as code translator') and guidance on maintaining logic equivalence across language boundaries. Treats translation as a first-class task rather than a side effect of code generation.
More reliable than manual translation because prompts guide ChatGPT to consider language-specific idioms and library ecosystems, reducing the risk of logic errors or non-idiomatic code in the target language.
data science concept explanation and learning
Medium confidence
Explains data science concepts, algorithms, and methodologies. The user supplies a concept name or question in an explanation-focused prompt ('act as data science educator'), which guides ChatGPT to pitch the explanation at the right audience level, include practical examples, and connect the concept to real-world applications. Output is formatted as educational content suitable for learning materials or documentation.
Provides dedicated prompts for concept explanation as a distinct workflow stage with role-assumption ('act as data science educator') and guidance on explanation depth and audience level. Treats education as a first-class task within the data science workflow.
More pedagogically sound than generic ChatGPT explanations because prompts guide ChatGPT to consider audience level, provide practical examples, and connect concepts to real-world applications rather than providing purely theoretical explanations.
feature engineering and model improvement suggestions
Medium confidence
Generates feature engineering ideas and model improvement suggestions. The user supplies a dataset description, current model performance, and the target variable in an ideation-focused prompt ('act as ML engineer'), which guides ChatGPT to suggest new features, identify potential data quality issues, recommend feature selection techniques, and propose model architecture changes. Output includes both the feature ideas and the rationale for why they might improve model performance.
Provides dedicated prompts for feature engineering ideation as a distinct workflow stage with role-assumption ('act as ML engineer') and guidance on suggesting features that align with model objectives. Treats feature engineering as a systematic, prompt-driven process rather than ad-hoc exploration.
More structured than manual brainstorming because prompts guide ChatGPT to consider multiple feature engineering techniques (domain-specific features, statistical transformations, interaction terms) and provide rationale for suggestions.
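Two of the techniques named above, statistical transformations and interaction terms, can be illustrated with a short standard-library sketch. The column names and values are made up, and whether either feature helps a given model is something only evaluation on real data can show.

```python
import math

# Made-up records standing in for rows of a dataset.
rows = [
    {"income": 40000.0, "tenure_months": 12},
    {"income": 85000.0, "tenure_months": 48},
]


def add_features(row):
    """Return a copy of the row with two derived features added."""
    out = dict(row)
    # Statistical transformation: log1p compresses a right-skewed value.
    out["log_income"] = math.log1p(row["income"])
    # Interaction term: product of two existing features.
    out["income_x_tenure"] = row["income"] * row["tenure_months"]
    return out


features = [add_features(r) for r in rows]
for f in features:
    print(f)
```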
troubleshooting and debugging assistance
Medium confidence
Offers debugging and troubleshooting guidance for data science code. The user supplies the error message, the code snippet, and surrounding context in a debugging-focused prompt ('act as debugging expert'), which guides ChatGPT to identify root causes, suggest fixes, and explain why the error occurred. Output includes both a diagnosis and corrected code or configuration.
Provides dedicated debugging prompts as a distinct workflow stage with role-assumption ('act as debugging expert') and guidance on systematic error diagnosis. Treats debugging as a structured process guided by prompts rather than ad-hoc problem-solving.
More systematic than generic ChatGPT debugging because prompts guide ChatGPT to consider common error patterns in data science code (library version mismatches, data type issues, memory constraints) and provide structured diagnosis.
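A debugging prompt needs the error message and the offending code side by side. The sketch below captures a data type error from a deliberately broken snippet and assembles both into a prompt. The function name `debugging_prompt` and the prompt wording after the role phrase are assumptions for illustration.

```python
import traceback

# Deliberately broken snippet: a string mixed into a numeric list.
snippet = 'values = ["3", 4, 5]\ntotal = sum(values)'


def debugging_prompt(code, exc):
    """Combine the failing code and its error message into one prompt."""
    error_text = "".join(traceback.format_exception_only(type(exc), exc)).strip()
    return (
        "I want you to act as a debugging expert. "
        "Identify the root cause, suggest a fix, and explain why the error occurred.\n\n"
        f"Code:\n{code}\n\n"
        f"Error:\n{error_text}"
    )


prompt = None
try:
    exec(snippet, {})  # run the snippet in an isolated namespace
except Exception as exc:
    prompt = debugging_prompt(snippet, exc)

print(prompt)
```

Including the exact exception text, not a paraphrase, gives the model the same evidence a human reviewer would start from.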
statistical analysis and experimental design guidance
Medium confidence
Offers guidance on statistical analysis and experimental design. The user supplies a research question, a data description, and constraints in a statistics-focused prompt ('act as statistician'), which guides ChatGPT to recommend appropriate statistical tests, suggest experimental designs (A/B tests, multivariate tests), and explain statistical assumptions. Output includes both the recommendations and the rationale for each methodological choice.
Provides dedicated prompts for statistical guidance as a distinct workflow stage with role-assumption ('act as statistician') and guidance on recommending appropriate tests and designs. Treats statistical methodology as a systematic, prompt-driven process.
More accessible than statistical textbooks because prompts guide ChatGPT to provide practical recommendations with clear rationale, making statistical methodology more approachable for practitioners without deep statistical training.
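For an A/B test on conversion rates, one test a statistician might recommend is the two-proportion z-test. The sketch below implements it with the standard library, assuming samples large enough for the normal approximation to hold. The conversion counts are invented, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates (pooled variance)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)          # rate under the null
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))      # two-sided p-value
    return z, p_value


# Made-up counts: variant A converts 200 of 2000, variant B 250 of 2000.
z, p_value = two_proportion_z_test(200, 2000, 250, 2000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```

The test assumes independent observations and a fixed sample size decided in advance, which are the kinds of assumptions the statistician prompts are meant to surface.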
Capabilities are decomposed by AI analysis. Each maps to specific user intents and improves with match feedback.
Related Artifacts (sharing capabilities)
Artifacts that share capabilities with ChatGPT Prompts for Data Science, ranked by overlap. Discovered automatically through the match graph.
ParallelGPT
Bulk processing ChatGPT on...
marvin
a simple and powerful tool to get things done with AI
ai-collab-playbook
Practical AI collaboration playbook for research, writing, reading, and coding: article, prompts, agent rules, and reusable skills.
twinny
The most no-nonsense, locally or API-hosted AI code completion plugin for Visual Studio Code - like GitHub Copilot but 100% free.
Promptitude.io
Harness AI to streamline content creation and workflow...
BambooAI
Data exploration and analysis for non-programmers
Best For
- ✓ data scientists and ML engineers seeking faster problem-solving workflows
- ✓ teams standardizing ChatGPT interactions across data science projects
- ✓ individual contributors building personal productivity systems with LLMs
- ✓ junior data scientists learning common patterns
- ✓ experienced practitioners seeking rapid prototyping
- ✓ teams standardizing code generation for reproducibility
- ✓ data scientists planning career progression
- ✓ teams mentoring junior data scientists
Known Limitations
- ⚠ Role assumption is stateless: each prompt must re-specify the role, and no context persists across conversations
- ⚠ No validation that ChatGPT actually maintains role consistency; this depends entirely on model behavior
- ⚠ Template placeholders are unstructured text; no schema validation for input quality
- ⚠ Generated code quality depends on prompt specificity; vague dataset descriptions produce generic, potentially incorrect code
- ⚠ No static analysis or linting of generated code; requires manual review for production use
- ⚠ No integration with actual data; code is generated blind, without schema validation
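Because generated code arrives without any static analysis, a cheap first gate is to confirm that it at least parses. The sketch below uses Python's `ast` module for that. It catches syntax errors only, not logic errors or wrong library calls, so it complements manual review and does not replace it. The function name `syntax_check` and both sample snippets are illustrative.

```python
import ast


def syntax_check(generated_code):
    """Return (ok, message) after attempting to parse the code without running it."""
    try:
        ast.parse(generated_code)
    except SyntaxError as err:
        return False, f"line {err.lineno}: {err.msg}"
    return True, "ok"


# Hypothetical model outputs: one valid, one with an unclosed parenthesis.
good = "import math\nprint(math.sqrt(16))"
bad = "def train(model:\n    return model"

print(syntax_check(good))
print(syntax_check(bad))
```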
UnfragileRank
UnfragileRank is computed from adoption signals, documentation quality, ecosystem connectivity, match graph feedback, and freshness. No artifact can pay for a higher rank.