Files
ai7-m3/ai_prompt.txt
2026-02-05 11:24:31 +03:00

361 lines
11 KiB
Plaintext

I need to analyze CSV files containing school scheduling data and create a complete Python database system. Here's what I need:
## **PROJECT OVERVIEW**
I'm building a school scheduling system that migrates from spreadsheets to a database. I have CSV files with school data that needs to be analyzed, transformed, and loaded into a relational database, then queried through a Python application.
## **DATA ANALYSIS PHASE**
**1. CSV Structure Analysis:**
- Analyze these sample CSV files for structure and content:
- `students.csv` (student_id, name, grade, class_group)
- `teachers.csv` (teacher_id, name, subject, department)
- `schedule.csv` (entry_id, student_id, teacher_id, room, subject, day, time_slot, group_type)
- `subjects.csv` (subject_id, subject_name, level, requirements)
- `rooms.csv` (room_id, room_number, capacity, equipment)
**2. Data Quality Check:**
- Identify missing values, duplicates, and inconsistencies
- Check data types and formatting consistency
- Identify relationships between different CSV files
- Note any data normalization needs
**3. Schema Design:**
- Design a normalized SQL database schema based on CSV analysis
- Create ERD showing relationships
- Specify primary keys, foreign keys, and constraints
## **DATA MIGRATION PHASE**
**4. Python Migration Script:**
Create a Python script that:
- Reads CSV files using pandas
- Cleans and transforms data (handle missing values, format dates, normalize text)
- Creates SQLite database with proper schema
- Inserts data with error handling and validation
- Creates indexes for performance
- Generates migration logs and reports
**5. Migration Features:**
- Incremental data loading
- Rollback capability
- Data validation rules
- Performance optimization
- Error logging and reporting
## **DATABASE APPLICATION PHASE**
**6. Python Database Application:**
Build a Python application with these features:
**A. Core Query Functions:**
1. Student queries:
- Find student by name/id
- Get student schedule
- Find student location at specific time
- Get student's teachers and subjects
2. Teacher queries:
- Find teacher schedule
- Get teacher's students
- Find available rooms for teacher
3. Schedule queries:
- Room availability by time
- Schedule conflicts detection
- Daily/weekly schedule views
**B. Application Structure:**
- Database connection manager
- Query builder/ORM layer
- Business logic layer
- CLI interface
- API endpoints (optional)
- Error handling and logging
**C. Advanced Features:**
- Search functionality (full-text search)
- Export to CSV/Excel
- Data validation before insertion
- Audit logging
- Performance metrics
## **IMPLEMENTATION DETAILS**
**7. Technology Stack:**
- Python 3.8+
- SQLite (or PostgreSQL for production)
- pandas for data processing
- SQLAlchemy for ORM (optional)
- Flask/FastAPI for web interface (optional)
- pytest for testing
**8. File Structure:**
```
school_scheduler/
├── data/
│ ├── raw_csv/ # Original CSV files
│ ├── processed/ # Cleaned CSV files
│ └── database/ # SQLite database
├── src/
│ ├── migration/
│ │ ├── analyzer.py # CSV analysis
│ │ ├── cleaner.py # Data cleaning
│ │ └── loader.py # Database loading
│ ├── database/
│ │ ├── models.py # SQLAlchemy models
│ │ ├── connection.py # DB connection
│ │ └── queries.py # SQL queries
│ ├── app/
│ │ ├── cli.py # Command line interface
│ │ ├── api.py # REST API (optional)
│ │ └── scheduler.py # Business logic
│ └── utils/
│ ├── validators.py # Data validation
│ └── logger.py # Logging setup
├── tests/
│ ├── test_migration.py
│ ├── test_queries.py
│ └── test_app.py
├── requirements.txt
├── config.yaml # Configuration
└── README.md
```
**9. Step-by-Step Implementation:**
**Phase 1: Analysis & Design**
1. Analyze CSV file structures
2. Design database schema
3. Create ER diagram
4. Plan data transformation rules
**Phase 2: Migration Development**
1. Write CSV analyzer
2. Implement data cleaning functions
3. Create database schema
4. Build data loader with error handling
5. Add validation and logging
**Phase 3: Application Development**
1. Set up database connection
2. Implement core query functions
3. Create CLI interface
4. Add search and export features
5. Implement error handling
**Phase 4: Testing & Optimization**
1. Write unit tests
2. Test with sample data
3. Optimize queries
4. Add performance monitoring
## **SAMPLE DATA EXPECTATIONS**
CSV files may contain:
- Student data with multiple group affiliations
- Complex scheduling with time slots
- Room assignments with capacity constraints
- Teacher-subject relationships
- Overlapping
Perfect! Here's the optimized AI prompt for your specific setup:
## **Optimized AI Prompt for SQLite Database Migration**
```text
I'm building a school scheduling system migrating from CSV files to SQLite database. Here's my exact setup:
## **CURRENT SETUP**
- **Folder Structure**: `sample_data/` folder containing CSV files
- **Database Tool**: DB Browser for SQLite
- **Target Database**: SQLite database file (.db or .sqlite)
## **PHASE 1: CSV ANALYSIS & DATABASE DESIGN**
**1. Analyze CSV files in `sample_data/` folder:**
Please examine these CSV files (provide actual file names if known):
- `students.csv` - Student information
- `teachers.csv` - Teacher information
- `schedule.csv` - Scheduling data
- `subjects.csv` - Subject catalog
- `rooms.csv` - Room information
- Any other CSV files present
**2. For each CSV file, analyze:**
- Column names and data types
- Primary key candidates
- Data quality issues (nulls, duplicates, inconsistencies)
- Foreign key relationships to other files
- Sample data patterns
**3. Design SQLite database schema:**
- Create normalized tables based on CSV analysis
- Specify SQLite-compatible data types (TEXT, INTEGER, REAL, BLOB, NULL)
- Design primary keys (INTEGER PRIMARY KEY for autoincrement)
- Define foreign key constraints
- Create indexes for frequent queries
**4. Generate SQL schema for DB Browser SQLite:**
Provide CREATE TABLE statements that work in DB Browser SQLite.
## **PHASE 2: PYTHON MIGRATION SCRIPT**
**5. Create `migrate_to_sqlite.py` script that:**
- Reads all CSV files from `sample_data/` folder
- Uses pandas for data analysis and cleaning
- Creates SQLite database file
- Implements this workflow:
```python
# Workflow:
1. Connect to SQLite database
2. Create tables from schema
3. For each CSV file:
- Read with pandas
- Clean data (handle nulls, format, validate)
- Insert into corresponding table
- Log errors/success
4. Create indexes
5. Generate migration report
```
**6. Specific requirements for SQLite:**
- Use `sqlite3` Python module
- Handle SQLite constraints (NOT NULL, UNIQUE, CHECK)
- Use parameterized queries to prevent SQL injection
- Implement transaction rollback on errors
- Generate CSV import summary
**7. Data cleaning for SQLite compatibility:**
- Convert dates to TEXT in ISO format (YYYY-MM-DD HH:MM:SS)
- Handle empty strings vs NULL
- Ensure integer fields are numeric
- Trim whitespace from TEXT fields
## **PHASE 3: PYTHON QUERY APPLICATION**
**8. Create `school_scheduler.py` application with:**
- SQLite database connection manager
- Query functions for common operations:
```python
# Essential Queries:
1. Find student: "Where is [student] right now?"
2. Student schedule: "Show [student]'s daily schedule"
3. Teacher schedule: "Show [teacher]'s classes"
4. Room availability: "Is [room] free at [time]?"
5. Schedule conflicts: "Find overlapping schedules"
```
**9. Application features:**
- Command-line interface (CLI) with menu
- Interactive query builder
- Export results to CSV
- Error handling for SQLite errors
- Connection management (open/close)
**10. SQLite-specific optimizations:**
- Use connection pooling
- Implement query caching if needed
- Add EXPLAIN QUERY PLAN for optimization
- Create views for complex queries
## **PHASE 4: DB BROWSER SQLITE INTEGRATION**
**11. Generate files for DB Browser SQLite:**
- SQL schema file (`.sql`) to import into DB Browser
- Sample queries for DB Browser's "Execute SQL" tab
- Instructions for using DB Browser with the database
**12. Create `db_browser_setup.sql` with:**
- All CREATE TABLE statements
- Sample INSERT statements
- Useful SELECT queries
- Index creation statements
## **DELIVERABLES REQUESTED**
Please provide:
**1. Database Schema:**
- Complete SQL CREATE TABLE statements for SQLite
- ER diagram showing relationships
- Index recommendations
**2. Python Scripts:**
- `analyze_csv.py` - Analyze sample_data CSV files
- `migrate_to_sqlite.py` - Migration script
- `school_scheduler.py` - Query application
- `requirements.txt` - Dependencies
**3. DB Browser Files:**
- `schema.sql` - Import into DB Browser
- `sample_queries.sql` - Pre-written queries
**4. Documentation:**
- README with setup instructions
- Sample queries and expected results
- Troubleshooting guide for SQLite issues
## **SAMPLE WORKFLOW**
```bash
# Expected workflow:
1. python analyze_csv.py # Analyze CSV files
2. python migrate_to_sqlite.py # Create and populate database
3. Open school_schedule.db in DB Browser # Visual verification
4. python school_scheduler.py # Run query application
```
## **ADDITIONAL REQUIREMENTS**
**For SQLite Database:**
- Database file: `school_schedule.db`
- Enable foreign keys: `PRAGMA foreign_keys = ON`
- Use WAL journal mode for better concurrency
- Implement backup mechanism
**For CSV Handling:**
- Detect CSV encoding automatically
- Handle different delimiters (comma, semicolon, tab)
- Skip malformed rows with logging
- Generate data quality report
**For Python Application:**
- Use argparse for CLI arguments
- Implement --help documentation
- Add verbose mode for debugging
- Include unit tests for key functions
Please analyze the CSV structure first, then provide the complete implementation.
```
## **Quick Version (If you just want the code):**
```text
Build a Python system that:
1. Analyzes CSV files in `sample_data/` folder
2. Creates SQLite database (`school_schedule.db`)
3. Migrates CSV data to SQLite with cleaning
4. Provides query application for school scheduling
Key queries needed:
- Find student location by time
- Show student/teacher schedules
- Check room availability
- Detect schedule conflicts
Use: Python, sqlite3, pandas, DB Browser SQLite compatible.
```
## **Best Practices to Include in Your Prompt:**
1. **Be specific about folder paths**: `sample_data/`
2. **Mention your tools**: DB Browser for SQLite
3. **Specify file formats**: `.db` or `.sqlite` extension
4. **List expected CSV files** if you know their names
5. **Mention specific queries** you need
6. **Include error handling requirements**
7. **Ask for DB Browser integration files**
This prompt will give you a complete, production-ready solution tailored to your exact setup with SQLite and DB Browser.