Added

BIN scheduler_bots/.DS_Store vendored
Binary file not shown.

31 scheduler_bots/README_HTML_combination.md Normal file
@@ -0,0 +1,31 @@
# HTML Files Combination Documentation

## Overview

This document describes the combination of multiple HTML files into the main thesis document.

## What Was Done

Content from two additional HTML files was inserted into the main thesis document:

- `/Thesis materials/deepseek_html_20260128_0dc71d.html`
- `/Thesis materials/deepseek_html_20260128_15ee7a.html`

These were inserted into:

- `/Thesis materials/Thesis_ Intelligent School Schedule Management System.html`

## Files Combined

1. **Main File**: `Thesis_ Intelligent School Schedule Management System.html` (original)
2. **Added Content 1**: `deepseek_html_20260128_0dc71d.html` (added as "Additional Content Section 1")
3. **Added Content 2**: `deepseek_html_20260128_15ee7a.html` (added as "Additional Content Section 2")

## How It Was Done

- Removed the HTML document structure (doctype, `html`, `head`, and `body` tags) from the additional files
- Inserted both files' contents into the main file just before the closing `</body>` tag (see the sketch after this list)
- Wrapped each addition in a styled section with a descriptive heading
- Applied consistent styling to match the main document theme
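
A minimal sketch of that insertion step (the regexes mirror the ones used in `combine_thesis_html.py` below; the file handling and styling are simplified, and it assumes the main file has a closing `</body>` tag):

```python
import re

def extract_body(html: str) -> str:
    """Keep only the body content of an HTML document (sketch)."""
    html = re.sub(r'<!DOCTYPE[^>]*>', '', html, flags=re.IGNORECASE)
    html = re.sub(r'<head[^>]*>.*?</head>', '', html, flags=re.DOTALL | re.IGNORECASE)
    html = re.sub(r'</?(?:html|body)[^>]*>', '', html, flags=re.IGNORECASE)
    return html

def insert_before_body_close(main_html: str, fragment: str, title: str) -> str:
    """Splice a wrapped fragment in just before the closing </body> tag."""
    section = (
        '<div style="margin: 30px 0; padding: 20px; border: 1px solid #ddd; '
        f'border-radius: 8px;"><h3>{title}</h3>{fragment}</div>'
    )
    pos = main_html.rfind('</body>')
    return main_html[:pos] + section + main_html[pos:]
```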

## Styling Added

- Each section has a gray border with rounded corners
- Distinct headings with a blue underline for visual separation
- Appropriate margins and padding for readability

## Result

The main HTML document now contains all three files' content in a unified format, with clear visual separation between the original content and the added sections.
42 scheduler_bots/README_consolidated_csv.md Normal file
@@ -0,0 +1,42 @@
# Consolidated CSV Data Documentation

## Overview

This directory contains a consolidated CSV file (`consolidated_data_simple.csv`) that combines data from multiple individual CSV files in the `sample_data` directory. Each original CSV file is identified by a sheet number and a filename prefix in the consolidated file.

## File Structure

### `consolidated_data_simple.csv`

- **Columns**: `[Sheet_Number, Original_File, Original_Column_1, Original_Column_2, ...]`
- **Sheet Numbers**:
  1. `8-Table 1.csv`
  2. `1-Table 1.csv`
  3. `10-Table 1.csv`
  4. `Лист3-Table 1.csv`
  5. `7-Table 1.csv`
  6. `Реестр заявлений на перевод 252-Table 1.csv`
  7. `ТАЙМПАД-Table 1.csv`
  8. `6 -Table 1.csv`
  9. `11-Table 1.csv`
  10. `4 -Table 1.csv`
  11. `3 -Table 1.csv`
  12. `2 -Table 1.csv`
  13. `5 -Table 1.csv`
  14. `9-Table 1.csv`
  15. `АНГЛ-Table 1.csv`

## Format Details

- Column 1: `Sheet_Number` - the numeric identifier of the original CSV file
- Column 2: `Original_File` - the filename of the original CSV file
- Columns 3+: the original data columns from each CSV file (see the reading sketch below)
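
A minimal sketch of how downstream code might split the consolidated file back into per-sheet datasets, using only the two identifier columns documented above (standard-library `csv` only; output formatting is illustrative):

```python
import csv
from collections import defaultdict

# Group the rows of the consolidated file by their originating sheet.
sheets = defaultdict(list)
with open("consolidated_data_simple.csv", encoding="utf-8") as f:
    for row in csv.reader(f):
        if not row:
            continue
        sheet_number, original_file, *data = row
        sheets[(sheet_number, original_file)].append(data)

for (num, name), rows in sheets.items():
    print(f"Sheet {num} ({name}): {len(rows)} rows")
```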

## Purpose

This consolidated file is designed for AI/ML analysis: each original CSV sheet can be identified by its sheet number, allowing algorithms to treat each original dataset separately while analyzing the combined data.

## Total Records

- Total rows in consolidated file: 3283
- Number of original CSV files consolidated: 15

## Notes

- All files were encoded in UTF-8 to preserve Cyrillic characters
- Some original files may have been skipped if they did not contain student data (e.g., notification texts)
- The consolidation preserves the original row and column structure of each source file
42 scheduler_bots/README_consolidated_theses.md Normal file
@@ -0,0 +1,42 @@
# Consolidated HTML Theses Documentation

## Overview

This directory contains a consolidated HTML file (`consolidated_theses.html`) that combines multiple thesis documents into a single, organized HTML document. Each original document is clearly separated with headers and navigation links.

## File Structure

### `consolidated_theses.html`

A single HTML file containing all thesis documents, with:

- A table of contents with links to each document
- Clear visual separation between documents
- Document headers with titles and source file names
- Responsive styling for easy reading

## Included Documents

1. Lesson_ SQLite Database Implementation.html
2. Presentaion_School Schedule Assistant Bot _ Student Project.html
3. Professional_Thesis_Scheduler_Bot.html
4. Scheduler Bot_ Telegram & CSV Database.html
5. Student Database Search System _ Beginner's Guide.html
6. Thesis_ Intelligent School Schedule Management System_23_Jan_2026.html
7. Thesis_AI7_Building_A_ Scheduler_Bot_A Student Project.html

## Features

- **Navigation**: clickable table of contents linking to each document
- **Visual Separation**: each document is set off with a distinct header
- **Responsive Design**: optimized for both desktop and mobile viewing
- **Self-contained**: all CSS styling is included within the HTML file
- **Easy Sharing**: a single file containing all thesis documents

## File Size

- `consolidated_theses.html`: ~129 KB (129,814 bytes)

## Purpose

This consolidated file is designed for:

- Easy navigation between multiple thesis documents
- Simplified sharing and distribution
- Streamlined review and analysis of related documents
- Preservation of all content in a single file format

## Access

Simply open `consolidated_theses.html` in any modern web browser to access all documents.
211 scheduler_bots/README_v2.md Normal file
@@ -0,0 +1,211 @@
# Implementing SQLite Database in Telegram Scheduler Bot

This document explains how to enhance your existing Telegram Scheduler Bot (`telegram_scheduler_v2.py`) to include SQLite database functionality, resulting in `telegram_scheduler_v3.py`.

## Overview

The transition from `telegram_scheduler_v2.py` to `telegram_scheduler_v3.py` introduces persistent storage using SQLite, allowing the bot to store, retrieve, and manage schedule data beyond a single run.

## What is SQLite?

SQLite is a lightweight, serverless, self-contained SQL database engine. It stores the entire database in a single file, making it ideal for applications that need local data persistence without setting up a separate database server.

## How the Database File (`schedule.db`) Appears

The `schedule.db` file is created automatically when:

1. The bot runs for the first time after the SQLite functionality is implemented
2. The `init_db()` function executes, which creates the database file if it doesn't exist
3. The first database operation occurs (such as adding a record)

The file appears in the same directory as your Python script and persists between program runs.
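
You can verify this after the first run with a quick check from a Python shell (not part of the bot itself):

```python
import os

# True once init_db() has run and created the file
print(os.path.exists("schedule.db"))
```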

## Step-by-Step Implementation Guide

### 1. Import Required Libraries

Add the sqlite3 import to your existing imports:

```python
import sqlite3
```

### 2. Database Initialization Function

Create a function to initialize your database:

```python
def init_db():
    """Initialize the SQLite database and create tables if they don't exist."""
    conn = sqlite3.connect(DATABASE_NAME)
    cursor = conn.cursor()

    # Create table for schedule entries
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS schedule (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            day TEXT NOT NULL,
            period INTEGER NOT NULL,
            subject TEXT NOT NULL,
            class_name TEXT NOT NULL,
            room TEXT NOT NULL,
            UNIQUE(day, period)
        )
    ''')

    conn.commit()
    conn.close()
```

### 3. Database Connection Setup

Define the database name and initialize it:

```python
# Database setup
DATABASE_NAME = "schedule.db"

# Initialize the database
init_db()
```

### 4. Data Manipulation Functions

Add functions to interact with the database:

```python
def add_schedule_entry(day, period, subject, class_name, room):
    """Add a new schedule entry to the database."""
    conn = sqlite3.connect(DATABASE_NAME)
    cursor = conn.cursor()

    try:
        cursor.execute('''
            INSERT OR REPLACE INTO schedule (day, period, subject, class_name, room)
            VALUES (?, ?, ?, ?, ?)
        ''', (day, period, subject, class_name, room))

        conn.commit()
        conn.close()
        return True
    except sqlite3.Error as e:
        print(f"Database error: {e}")
        conn.close()
        return False


def load_schedule_from_db():
    """Load the schedule from the SQLite database."""
    conn = sqlite3.connect(DATABASE_NAME)
    cursor = conn.cursor()

    cursor.execute("SELECT day, period, subject, class_name, room FROM schedule ORDER BY day, period")
    rows = cursor.fetchall()

    conn.close()

    # Group by day
    schedule = {}
    for day, period, subject, class_name, room in rows:
        if day not in schedule:
            schedule[day] = []

        class_info = f"Subject: {subject} Class: {class_name} Room: {room}"
        schedule[day].append((str(period), class_info))

    return schedule
```
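
A quick usage example of these helpers, assuming `init_db()` has already run (the day, subject, class, and room values are illustrative):

```python
add_schedule_entry("Monday", 1, "Maths", "7A", "204")

schedule = load_schedule_from_db()
print(schedule["Monday"])
# -> [('1', 'Subject: Maths Class: 7A Room: 204')]
```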

### 5. Update Existing Functions to Use the Database

Modify your schedule-retrieving functions to use the database instead of the CSV file:

```python
async def where_am_i(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Tell the user where they should be right now."""
    # Reload the schedule from the DB to ensure the latest data
    schedule = load_schedule_from_db()
    # ... rest of the function remains similar but uses 'schedule' from the DB
```
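
A fuller sketch of what that function body might look like, assuming the bot's day names match `datetime`'s English weekday names (the "first period of the day" rule here is illustrative; a real implementation would compare the current time against period start/end times):

```python
from datetime import datetime

async def where_am_i(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Tell the user where they should be right now (illustrative sketch)."""
    schedule = load_schedule_from_db()
    today = datetime.now().strftime("%A")  # e.g., "Monday"

    entries = schedule.get(today)
    if not entries:
        await update.message.reply_text(f"No classes scheduled for {today}.")
        return

    # Illustrative: report the first period of the day
    period, class_info = entries[0]
    await update.message.reply_text(f"Period {period}: {class_info}")
```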

### 6. Add Conversation State Management

To handle multi-step interactions such as the `/add` command:

```python
# User states for tracking conversations, e.g.
# user_states[user_id] = {"step": "waiting_day", "day": "Monday", ...}
user_states = {}  # Stores each user's conversation state
```

### 7. Implement the New `/add` Command

Create an interactive command that collects data from the user:

```python
async def add(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Start the process of adding a new schedule entry."""
    user_id = update.effective_user.id
    user_states[user_id] = {"step": "waiting_day"}

    await update.message.reply_text(
        "📅 Adding a new class to the schedule.\n"
        "Please enter the day of the week (e.g., Monday, Tuesday, etc.):"
    )
```

### 8. Handle Messages During Conversations

Add a general message handler for interactive flows (a sketch of one possible implementation follows the block):

```python
async def handle_message(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Handle user messages during the add process."""
    # Processes user input during multi-step conversations, walking the
    # day -> period -> subject -> class -> room sequence.
    ...
```
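
A minimal sketch of that state machine, assuming the `user_states` dictionary and the `add_schedule_entry()` helper from the earlier steps (the prompts and validation rules are illustrative):

```python
async def handle_message(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Walk a user through the day -> period -> subject -> class -> room sequence."""
    user_id = update.effective_user.id
    state = user_states.get(user_id)
    if state is None:
        return  # Not currently in an /add conversation

    text = update.message.text.strip()
    step = state["step"]

    if step == "waiting_day":
        state["day"] = text
        state["step"] = "waiting_period"
        await update.message.reply_text("Enter the period number (e.g., 1-8):")
    elif step == "waiting_period":
        if not text.isdigit():
            await update.message.reply_text("Please enter a number for the period:")
            return
        state["period"] = int(text)
        state["step"] = "waiting_subject"
        await update.message.reply_text("Enter the subject name:")
    elif step == "waiting_subject":
        state["subject"] = text
        state["step"] = "waiting_class"
        await update.message.reply_text("Enter the class name:")
    elif step == "waiting_class":
        state["class_name"] = text
        state["step"] = "waiting_room"
        await update.message.reply_text("Enter the room:")
    elif step == "waiting_room":
        ok = add_schedule_entry(state["day"], state["period"],
                                state["subject"], state["class_name"], text)
        del user_states[user_id]  # Conversation finished
        await update.message.reply_text(
            "✅ Class added to the schedule." if ok else "❌ Database error, please try again."
        )
```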

### 9. Register New Handlers

Add the new handlers to your main function:

```python
def main():
    # Create the Application
    application = Application.builder().token(BOT_TOKEN).build()

    # Add command handlers
    application.add_handler(CommandHandler("start", start))
    application.add_handler(CommandHandler("whereami", where_am_i))
    application.add_handler(CommandHandler("schedule", schedule))
    application.add_handler(CommandHandler("tomorrow", tomorrow))
    application.add_handler(CommandHandler("add", add))  # New command
    application.add_handler(CommandHandler("help", help_command))

    # Add a message handler for the conversation flow
    application.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, handle_message))

    # Start the bot
    application.run_polling()
```

## Key Changes Summary

| Aspect | telegram_scheduler_v2.py | telegram_scheduler_v3.py |
|--------|--------------------------|--------------------------|
| Data Storage | CSV file | SQLite database |
| Persistence | Runtime changes lost when the program ends | Persists between runs |
| New Classes | Cannot be added dynamically | Interactive `/add` command |
| Data Updates | Require manual CSV editing | Real-time updates via the bot |

## Benefits of Using SQLite

1. **Persistence**: Data survives bot restarts
2. **Dynamic Updates**: Users can add new classes without editing files
3. **Data Integrity**: Built-in constraints prevent duplicates
4. **Scalability**: Easy to extend with additional tables and fields
5. **Performance**: Fast queries for schedule lookups

## Security Note

The `schedule.db` file contains your schedule data and should be protected accordingly. In production environments, consider access controls and backups.

## Troubleshooting

- If the database isn't being created, ensure your application has write permissions in the directory
- Check the logs for SQLite error messages if operations fail
- The database file will grow over time as more entries are added
BIN scheduler_bots/__pycache__/database.cpython-314.pyc Normal file
Binary file not shown.
BIN scheduler_bots/__pycache__/database_fresh.cpython-314.pyc Normal file
Binary file not shown.
182 scheduler_bots/combine_thesis_html.py Normal file
@@ -0,0 +1,182 @@
#!/usr/bin/env python
"""
combine_thesis_html.py - Combines all HTML files in the Thesis materials directory
into the main thesis document
"""

import os
import re

try:
    from bs4 import BeautifulSoup
except ImportError:
    # Deferred: __main__ falls back to a regex-based combination without bs4
    BeautifulSoup = None


def combine_html_files():
    # Directory containing the HTML files
    thesis_dir = "/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/scheduler_bots/Thesis materials"

    # Main file to append content to
    main_file = "Thesis_ Intelligent School Schedule Management System.html"
    main_file_path = os.path.join(thesis_dir, main_file)

    # Get all HTML files in the directory
    html_files = [f for f in os.listdir(thesis_dir) if f.endswith('.html')]

    print(f"Found {len(html_files)} HTML files:")
    for i, f in enumerate(html_files, 1):
        print(f"  {i}. {f}")

    # Read the main file content
    with open(main_file_path, 'r', encoding='utf-8') as f:
        main_content = f.read()

    # Parse the main file with BeautifulSoup
    soup_main = BeautifulSoup(main_content, 'html.parser')

    # Find the body element in the main file
    main_body = soup_main.find('body')
    if not main_body:
        # If there is no body tag, create one
        main_body = soup_main.new_tag('body')
        if soup_main.html:
            soup_main.html.insert(0, main_body)
        else:
            soup_main.insert(0, main_body)

    # Add a separator before adding new content
    separator = soup_main.new_tag('hr')
    separator['style'] = 'margin: 40px 0; border: 2px solid #4a6fa5;'
    main_body.append(separator)

    # Add a heading for the appended content
    appendix_heading = soup_main.new_tag('h2')
    appendix_heading.string = 'Additional Thesis Materials'
    appendix_heading['style'] = 'color: #2c3e50; margin-top: 40px; border-bottom: 2px solid #4a6fa5; padding-bottom: 10px;'
    main_body.append(appendix_heading)

    # Process each additional HTML file
    for filename in html_files:
        if filename == main_file:  # Skip the main file
            continue

        print(f"Processing {filename}...")

        file_path = os.path.join(thesis_dir, filename)

        # Read the additional file content
        with open(file_path, 'r', encoding='utf-8') as f:
            additional_content = f.read()

        # Parse the additional file
        soup_additional = BeautifulSoup(additional_content, 'html.parser')

        # Create a section for this file
        section_div = soup_main.new_tag('div')
        section_div['class'] = 'additional-section'
        section_div['style'] = 'margin: 30px 0; padding: 20px; border: 1px solid #ddd; border-radius: 8px; background-color: #fafafa;'

        # Add a heading for this section
        section_heading = soup_main.new_tag('h3')
        section_heading.string = f'Content from: {filename}'
        section_heading['style'] = 'color: #4a6fa5; margin-top: 0;'
        section_div.append(section_heading)

        # Get body content from the additional file
        additional_body = soup_additional.find('body')
        if additional_body:
            # Move child elements from the additional body into our section
            # (snapshot the children first, since extract() mutates the tree)
            for child in list(additional_body.children):
                if child.name:  # Only copy actual elements, not text nodes
                    section_div.append(child.extract())
        else:
            # If there is no body tag, add the whole content
            section_div.append(soup_additional)

        # Append the section to the main body
        main_body.append(section_div)

    # Write the combined content back to the main file
    with open(main_file_path, 'w', encoding='utf-8') as f:
        f.write(soup_main.prettify())

    print(f"All HTML files have been combined into {main_file}")
    print(f"Combined file saved at: {main_file_path}")


if __name__ == "__main__":
    # Check if BeautifulSoup is available
    if BeautifulSoup is not None:
        combine_html_files()
    else:
        print("The beautifulsoup4 library is required for this script.")
        print("Install it with: pip install beautifulsoup4")

        # Fall back to a simple combination without BeautifulSoup
        print("Creating a basic combination without BeautifulSoup...")

        thesis_dir = "/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/scheduler_bots/Thesis materials"
        main_file = "Thesis_ Intelligent School Schedule Management System.html"
        main_file_path = os.path.join(thesis_dir, main_file)

        # Get all HTML files in the directory
        html_files = [f for f in os.listdir(thesis_dir) if f.endswith('.html')]

        # Read the main file content
        with open(main_file_path, 'r', encoding='utf-8') as f:
            main_content = f.read()

        # Find the closing body tag to insert the additional content before
        body_close_pos = main_content.rfind('</body>')
        if body_close_pos == -1:
            # If there is no closing body tag, find the closing html tag
            html_close_pos = main_content.rfind('</html>')
            if html_close_pos != -1:
                insert_pos = html_close_pos
            else:
                # If there is no closing html tag either, append at the end
                insert_pos = len(main_content)
        else:
            insert_pos = body_close_pos

        # Prepare the additional content
        additional_content = '\n\n<!-- Additional Thesis Materials -->\n<hr style="margin: 40px 0; border: 2px solid #4a6fa5;">\n<h2 style="color: #2c3e50; margin-top: 40px; border-bottom: 2px solid #4a6fa5; padding-bottom: 10px;">Additional Thesis Materials</h2>\n\n'

        # Process each additional HTML file
        for filename in html_files:
            if filename == main_file:  # Skip the main file
                continue

            print(f"Processing {filename}...")

            file_path = os.path.join(thesis_dir, filename)

            # Read the additional file content
            with open(file_path, 'r', encoding='utf-8') as f:
                content = f.read()

            # Strip the document structure, keeping only the body content
            # Remove the doctype
            content = re.sub(r'<!DOCTYPE[^>]*>', '', content, flags=re.IGNORECASE)

            # Remove html tags
            content = re.sub(r'<html[^>]*>|</html>', '', content, flags=re.IGNORECASE)

            # Remove the head section
            content = re.sub(r'<head[^>]*>.*?</head>', '', content, flags=re.DOTALL | re.IGNORECASE)

            # Remove the opening and closing body tags
            content = re.sub(r'<body[^>]*>|</body>', '', content, flags=re.IGNORECASE)

            # Add a section wrapper
            section_content = '\n<div class="additional-section" style="margin: 30px 0; padding: 20px; border: 1px solid #ddd; border-radius: 8px; background-color: #fafafa;">\n'
            section_content += f'<h3 style="color: #4a6fa5; margin-top: 0;">Content from: {filename}</h3>\n'
            section_content += content
            section_content += '\n</div>\n'

            additional_content += section_content

        # Insert the additional content
        combined_content = main_content[:insert_pos] + additional_content + main_content[insert_pos:]

        # Write the combined content back to the main file
        with open(main_file_path, 'w', encoding='utf-8') as f:
            f.write(combined_content)

        print(f"All HTML files have been combined into {main_file}")
        print(f"Combined file saved at: {main_file_path}")
117 scheduler_bots/consolidate_csv.py Normal file
@@ -0,0 +1,117 @@
#!/usr/bin/env python
"""
consolidate_csv.py - Consolidates all CSV files in the sample_data directory into a single CSV
with sheet identifiers to distinguish between the different original files
"""

import csv
import os


def consolidate_csv_files():
    sample_data_dir = "sample_data"
    output_file = "consolidated_data.csv"

    if not os.path.exists(sample_data_dir):
        print(f"Directory '{sample_data_dir}' not found.")
        return

    # Get all CSV files and filter out the schedule template and sheet files
    all_csv_files = [f for f in os.listdir(sample_data_dir) if f.endswith('.csv')]

    # Keep only the actual student distribution files (not the sheets)
    csv_files = []
    for filename in all_csv_files:
        if 'first_sheet' not in filename and 'last_sheet' not in filename and 'template' not in filename:
            csv_files.append(filename)

    if not csv_files:
        print(f"No student data CSV files found in '{sample_data_dir}' directory.")
        return

    print(f"Found {len(csv_files)} student data CSV file(s):")
    for i, filename in enumerate(csv_files, 1):
        print(f"  {i}. {filename}")

    consolidated_rows = []

    for sheet_num, filename in enumerate(csv_files, 1):
        csv_path = os.path.join(sample_data_dir, filename)

        print(f"Processing {csv_path}...")

        with open(csv_path, 'r', encoding='utf-8') as file:
            reader = csv.reader(file)
            rows = list(reader)

        # Add columns to indicate which sheet this data came from
        for row in rows:
            # Create a new row with the sheet number and filename as the first columns
            new_row = [sheet_num, filename] + row
            consolidated_rows.append(new_row)

    # Write the consolidated data to a new CSV file
    with open(output_file, 'w', newline='', encoding='utf-8') as csvfile:
        writer = csv.writer(csvfile)
        # Write a header; the original files' columns vary in number,
        # so only the two fixed identifier columns are named explicitly
        writer.writerow(['Sheet_Number', 'Original_File', 'Data_Columns'])
        # Write all rows
        for row in consolidated_rows:
            writer.writerow(row)

    print(f"Consolidated data written to {output_file}")
    print(f"Total rows in consolidated file: {len(consolidated_rows)}")


def consolidate_csv_files_simple():
    """
    Creates a simpler consolidated CSV file with Sheet_Number and Original_File columns
    """
    sample_data_dir = "sample_data"
    output_file = "consolidated_data_simple.csv"

    if not os.path.exists(sample_data_dir):
        print(f"Directory '{sample_data_dir}' not found.")
        return

    # Get all CSV files
    all_csv_files = [f for f in os.listdir(sample_data_dir) if f.endswith('.csv')]

    # Keep only the actual student distribution files (not the sheets)
    csv_files = []
    for filename in all_csv_files:
        if 'first_sheet' not in filename and 'last_sheet' not in filename and 'template' not in filename:
            csv_files.append(filename)

    if not csv_files:
        print(f"No student data CSV files found in '{sample_data_dir}' directory.")
        return

    print(f"Found {len(csv_files)} student data CSV file(s):")
    for i, filename in enumerate(csv_files, 1):
        print(f"  {i}. {filename}")

    with open(output_file, 'w', newline='', encoding='utf-8') as outfile:
        writer = csv.writer(outfile)

        # Process each file
        for sheet_num, filename in enumerate(csv_files, 1):
            csv_path = os.path.join(sample_data_dir, filename)

            print(f"Processing {csv_path}...")

            with open(csv_path, 'r', encoding='utf-8') as infile:
                reader = csv.reader(infile)

                for row in reader:
                    # Add the sheet number and filename as the first two columns
                    new_row = [sheet_num, filename] + row
                    writer.writerow(new_row)

    print(f"Simple consolidated data written to {output_file}")


if __name__ == "__main__":
    print("Creating consolidated CSV with sheet identifiers...")
    consolidate_csv_files_simple()
    print("Done! You can now upload the consolidated_data_simple.csv file for AI/ML analysis.")
202 scheduler_bots/consolidate_theses.py Normal file
@@ -0,0 +1,202 @@
#!/usr/bin/env python
"""
consolidate_theses.py - Consolidates all HTML thesis files into a single HTML file
with clear separation between the different documents
"""

import os
import re


def consolidate_html_theses():
    # Define the parent directory containing the HTML files
    parent_dir = "/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3"

    # List of HTML thesis files to consolidate
    html_files = [
        "Lesson_ SQLite Database Implementation.html",
        "Presentaion_School Schedule Assistant Bot _ Student Project.html",
        "Professional_Thesis_Scheduler_Bot.html",
        "Scheduler Bot_ Telegram & CSV Database.html",
        "Student Database Search System _ Beginner's Guide.html",
        "Thesis_ Intelligent School Schedule Management System_23_Jan_2026.html",
        "Thesis_AI7_Building_A_ Scheduler_Bot_A Student Project.html"
    ]

    # Output file
    output_file = "consolidated_theses.html"

    # Start building the consolidated HTML
    consolidated_html = """<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Consolidated Thesis Documents</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            margin: 20px;
            background-color: #f9f9f9;
            line-height: 1.6;
        }
        .document-separator {
            page-break-before: always;
            border-top: 3px solid #333;
            margin: 30px 0;
        }
        .document-header {
            background-color: #e9ecef;
            padding: 15px;
            border-radius: 5px;
            margin-bottom: 20px;
            border-left: 4px solid #007bff;
        }
        .document-title {
            color: #2c3e50;
            font-size: 24px;
            margin: 0;
        }
        .document-source {
            color: #6c757d;
            font-size: 14px;
            margin-top: 5px;
        }
        .document-content {
            background-color: white;
            padding: 20px;
            border-radius: 5px;
            box-shadow: 0 2px 5px rgba(0,0,0,0.1);
            margin-bottom: 30px;
        }
        .toc {
            background-color: #f8f9fa;
            padding: 20px;
            border-radius: 5px;
            margin-bottom: 30px;
            border-left: 4px solid #28a745;
        }
        .toc h2 {
            color: #2c3e50;
            margin-top: 0;
        }
        .toc ul {
            list-style-type: decimal;
            padding-left: 20px;
        }
        .toc li {
            margin-bottom: 8px;
        }
        .toc a {
            text-decoration: none;
            color: #007bff;
        }
        .toc a:hover {
            text-decoration: underline;
        }
        h1 {
            color: #343a40;
            border-bottom: 2px solid #007bff;
            padding-bottom: 10px;
        }
        .footer {
            text-align: center;
            margin-top: 30px;
            padding: 15px;
            color: #6c757d;
            font-size: 12px;
            border-top: 1px solid #dee2e6;
        }
    </style>
</head>
<body>
    <h1>Consolidated Thesis Collection</h1>
    <div class="toc">
        <h2>Table of Contents</h2>
        <ul>
"""

    # Add links to each document in the TOC
    for i, filename in enumerate(html_files, 1):
        doc_title = os.path.splitext(filename)[0].replace('_', ' ')
        consolidated_html += f'            <li><a href="#doc-{i}">{i}. {doc_title}</a></li>\n'

    # Close the TOC section
    consolidated_html += """        </ul>
    </div>
"""

    # Process each HTML file
    for i, filename in enumerate(html_files, 1):
        filepath = os.path.join(parent_dir, filename)

        if not os.path.exists(filepath):
            print(f"File not found: {filename}")
            continue

        print(f"Processing {filename}...")

        # Add a document separator and header
        doc_title = os.path.splitext(filename)[0].replace('_', ' ')
        consolidated_html += f"""    <div class="document-separator" id="doc-{i}"></div>
    <div class="document-header">
        <h2 class="document-title">{i}. {doc_title}</h2>
        <div class="document-source">Source file: {filename}</div>
    </div>
    <div class="document-content">
"""

        # Read the HTML file and extract its content
        try:
            with open(filepath, 'r', encoding='utf-8') as f:
                content = f.read()

            # Remove the HTML and HEAD tags, keeping only the BODY content.
            # First remove the DOCTYPE declaration if present
            content = re.sub(r'<!DOCTYPE[^>]*>', '', content, flags=re.IGNORECASE)

            # Keep only what sits between the body tags
            body_end = -1  # So the fallback below works when no <body> is found
            body_start = content.find('<body')
            if body_start != -1:
                body_start = content.find('>', body_start) + 1
                body_end = content.rfind('</body>')
                if body_end != -1:
                    content = content[body_start:body_end]

            # If no body tags were found, try to remove the head section
            if body_start == -1 or body_end == -1:
                head_match = re.search(r'<head[^>]*>.*?</head>', content, re.DOTALL | re.IGNORECASE)
                if head_match:
                    content = content.replace(head_match.group(0), '')

            # Remove html tags if present
            content = re.sub(r'<html[^>]*>|</html>', '', content, flags=re.IGNORECASE)

            # Add the content to the consolidated HTML
            consolidated_html += content

        except Exception as e:
            print(f"Error processing {filename}: {str(e)}")
            consolidated_html += f"<p><em>Error reading this document: {str(e)}</em></p>"

        # Close the document content div
        consolidated_html += """    </div>
"""

    # Add the footer
    consolidated_html += """    <div class="footer">
        <p>Consolidated from multiple thesis documents | Generated automatically</p>
    </div>
</body>
</html>"""

    # Write the consolidated HTML to the output file
    with open(output_file, 'w', encoding='utf-8') as f:
        f.write(consolidated_html)

    print(f"Consolidated HTML thesis document created: {output_file}")
    print(f"Included {len([f for f in html_files if os.path.exists(os.path.join(parent_dir, f))])} documents in the consolidated file")


if __name__ == "__main__":
    consolidate_html_theses()
3473 scheduler_bots/consolidated_data_simple.csv Normal file
File diff suppressed because it is too large
2746 scheduler_bots/consolidated_theses.html Normal file
File diff suppressed because it is too large
87 scheduler_bots/convert_dfd_to_png.py Normal file
@@ -0,0 +1,87 @@
#!/usr/bin/env python
"""
convert_dfd_to_png.py - Converts DFD.html to a PNG image file
"""

import os


def convert_html_to_png():
    # Define the input and output file paths
    input_file = "/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/Thesis materials/DFD.html"
    output_file = "/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/Thesis materials/DFD.png"

    # Check that the input file exists
    if not os.path.exists(input_file):
        print(f"Input file does not exist: {input_file}")
        return

    print("Converting DFD.html to DFD.png...")

    # Converting HTML to PNG requires a browser-based tool;
    # first, check whether the required libraries are available
    try:
        from selenium import webdriver
        from selenium.webdriver.chrome.options import Options

        # Set up Chrome options for headless browsing
        chrome_options = Options()
        chrome_options.add_argument("--headless")
        chrome_options.add_argument("--no-sandbox")
        chrome_options.add_argument("--disable-dev-shm-usage")

        # Launch a headless Chrome browser
        driver = webdriver.Chrome(options=chrome_options)

        # Load the HTML file
        file_url = f"file://{os.path.abspath(input_file)}"
        driver.get(file_url)

        # Take a screenshot of the page
        driver.set_window_size(1200, 800)  # Set the window size
        driver.save_screenshot(output_file)

        driver.quit()

        print("Successfully converted DFD.html to DFD.png")
        print(f"Output file: {output_file}")

    except ImportError:
        # If selenium is not available, try using playwright
        try:
            from playwright.sync_api import sync_playwright

            with sync_playwright() as p:
                browser = p.chromium.launch(headless=True)
                page = browser.new_page()

                # Load the HTML file
                file_path = os.path.abspath(input_file)
                page.goto(f"file://{file_path}")

                # Set the viewport size
                page.set_viewport_size({"width": 1200, "height": 800})

                # Take a full-page screenshot
                page.screenshot(path=output_file, full_page=True)

                browser.close()

            print("Successfully converted DFD.html to DFD.png using Playwright")
            print(f"Output file: {output_file}")

        except ImportError:
            # If neither selenium nor playwright is available, inform the user
            print("Required libraries not available for HTML to PNG conversion.")
            print("To convert HTML to PNG, you need to install one of these packages:")
            print("    pip install selenium")
            print("  OR")
            print("    pip install playwright")
            print("  OR")
            print("  Use a web browser to manually export the HTML as PDF/PNG")
            return


if __name__ == "__main__":
    convert_html_to_png()
70 scheduler_bots/convert_dfd_to_png_alt.py Normal file
@@ -0,0 +1,70 @@
#!/usr/bin/env python
"""
convert_dfd_to_png_alt.py - Alternative method to convert DFD.html to a PNG image file
"""

import os
import subprocess


def convert_html_to_png():
    # Define the input and output file paths
    input_file = "/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/Thesis materials/DFD.html"
    output_file = "/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/Thesis materials/DFD.png"

    # Check that the input file exists
    if not os.path.exists(input_file):
        print(f"Input file does not exist: {input_file}")
        return

    print("Attempting to convert DFD.html to DFD.png...")

    # Method 1: Use wkhtmltoimage if available
    try:
        subprocess.run([
            "wkhtmltoimage",
            "--width", "1200",
            "--height", "800",
            input_file,
            output_file
        ], check=True)

        print("Successfully converted DFD.html to DFD.png using wkhtmltoimage")
        print(f"Output file: {output_file}")
        return
    except FileNotFoundError:
        print("wkhtmltoimage not found. Trying an alternative method...")
    except subprocess.CalledProcessError as e:
        print(f"Error using wkhtmltoimage: {e}")

    # Method 2: Use weasyprint if available
    try:
        import weasyprint

        # Convert the HTML to a PDF in memory first
        html_doc = weasyprint.HTML(input_file)
        pdf_bytes = html_doc.write_pdf()

        # Rasterizing that PDF to PNG needs an extra conversion step,
        # which is not implemented here
        print("The WeasyPrint method requires additional image conversion tools.")

    except ImportError:
        print("WeasyPrint not available. Trying a simpler approach...")

    # Method 3: Provide instructions for manual conversion
    print("\nHTML to PNG conversion requires specialized tools.")
    print("You can manually convert the file using one of these methods:")
    print("1. Open the HTML file in a browser, take a screenshot, and save it as PNG")
    print("2. Install wkhtmltopdf/wkhtmltoimage: brew install wkhtmltopdf (on macOS)")
    print("3. Use online converters that support HTML to PNG conversion")
    print(f"\nHTML file location: {input_file}")

    # No output file is produced in this case
    print("\nNote: the conversion needs to be done manually or with the proper tools installed.")


if __name__ == "__main__":
    convert_html_to_png()
961 scheduler_bots/database.py Normal file
@@ -0,0 +1,961 @@
|
||||
#!/usr/bin/env python
|
||||
"""
|
||||
database.py - School schedule database (normalized version)
|
||||
Creates normalized tables and extracts from CSV with proper relationships
|
||||
"""
|
||||
|
||||
import sqlite3
|
||||
import csv
|
||||
import os
|
||||
import sys
|
||||
import re
|
||||
|
||||
class SchoolScheduleDB:
|
||||
def __init__(self, db_name='school_schedule.db'):
|
||||
self.conn = sqlite3.connect(db_name)
|
||||
self.cursor = self.conn.cursor()
|
||||
# Initialize database tables
|
||||
self.create_tables()
|
||||
|
||||
def normalize_class_name(self, class_name):
|
||||
"""Normalize class names to handle Cyrillic/Latin character differences"""
|
||||
if not class_name:
|
||||
return class_name
|
||||
|
||||
# Replace Cyrillic characters with Latin equivalents in class names
|
||||
# Specifically: replace Cyrillic А (U+0410) with Latin A (U+0041)
|
||||
normalized = class_name.replace('А', 'A').replace('В', 'B').replace('С', 'C')
|
||||
return normalized
|
||||
|
||||
def create_tables(self):
|
||||
"""Create normalized tables with proper relationships"""
|
||||
# Teachers table
|
||||
self.cursor.execute("""
|
||||
CREATE TABLE IF NOT EXISTS teachers (
|
||||
teacher_id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
name TEXT UNIQUE NOT NULL,
|
||||
email TEXT,
|
||||
phone TEXT
|
||||
)
|
||||
""")
|
||||
|
||||
# Subjects table
|
||||
self.cursor.execute("""
|
||||
CREATE TABLE IF NOT EXISTS subjects (
|
||||
subject_id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
name TEXT UNIQUE NOT NULL,
|
||||
description TEXT
|
||||
)
|
||||
""")
|
||||
|
||||
# Days table
|
||||
self.cursor.execute("""
|
||||
CREATE TABLE IF NOT EXISTS days (
|
||||
day_id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
name TEXT UNIQUE NOT NULL -- e.g., Monday, Tuesday, etc.
|
||||
)
|
||||
""")
|
||||
|
||||
# Periods table - with proper unique constraint
|
||||
self.cursor.execute("""
|
||||
CREATE TABLE IF NOT EXISTS periods (
|
||||
period_id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
period_number INTEGER,
|
||||
start_time TEXT,
|
||||
end_time TEXT,
|
||||
UNIQUE(period_number, start_time, end_time)
|
||||
)
|
||||
""")
|
||||
|
||||
# Groups table
|
||||
self.cursor.execute("""
|
||||
CREATE TABLE IF NOT EXISTS groups (
|
||||
group_id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
name TEXT UNIQUE NOT NULL,
|
||||
description TEXT,
|
||||
class_name TEXT
|
||||
)
|
||||
""")
|
||||
|
||||
# Students table
|
||||
self.cursor.execute("""
|
||||
CREATE TABLE IF NOT EXISTS students (
|
||||
student_id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
class_name TEXT,
|
||||
full_name TEXT NOT NULL,
|
||||
UNIQUE(full_name, class_name) -- Prevent duplicate student entries
|
||||
)
|
||||
""")
|
||||
|
||||
# Homeroom teachers table
|
||||
self.cursor.execute("""
|
||||
CREATE TABLE IF NOT EXISTS homeroom_teachers (
|
||||
homeroom_id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
class_name TEXT UNIQUE,
|
||||
teacher_name TEXT,
|
||||
classroom TEXT,
|
||||
parent_meeting_room TEXT,
|
||||
internal_number TEXT,
|
||||
mobile_number TEXT
|
||||
)
|
||||
""")
|
||||
|
||||
# Schedule table with foreign key relationships
|
||||
self.cursor.execute("""
|
||||
CREATE TABLE IF NOT EXISTS schedule (
|
||||
entry_id INTEGER PRIMARY KEY AUTOINCREMENT,
|
||||
student_id INTEGER,
|
||||
subject_id INTEGER,
|
||||
teacher_id INTEGER,
|
||||
day_id INTEGER,
|
||||
period_id INTEGER,
|
||||
group_id INTEGER,
|
||||
FOREIGN KEY (student_id) REFERENCES students(student_id),
|
||||
FOREIGN KEY (subject_id) REFERENCES subjects(subject_id),
|
||||
FOREIGN KEY (teacher_id) REFERENCES teachers(teacher_id),
|
||||
FOREIGN KEY (day_id) REFERENCES days(day_id),
|
||||
FOREIGN KEY (period_id) REFERENCES periods(period_id),
|
||||
FOREIGN KEY (group_id) REFERENCES groups(group_id)
|
||||
)
|
||||
""")
|
||||
|
||||
self.conn.commit()
|
||||
|
||||
def populate_periods_table(self):
|
||||
"""Populate the periods table with standard school periods"""
|
||||
period_times = {
|
||||
'1': ('09:00', '09:40'),
|
||||
'2': ('10:00', '10:40'),
|
||||
'3': ('11:00', '11:40'),
|
||||
'4': ('11:50', '12:30'),
|
||||
'5': ('12:40', '13:20'),
|
||||
'6': ('13:30', '14:10'),
|
||||
'7': ('14:20', '15:00'),
|
||||
'8': ('15:20', '16:00'),
|
||||
'9': ('16:15', '16:55'),
|
||||
'10': ('17:05', '17:45'),
|
||||
'11': ('17:55', '18:35'),
|
||||
'12': ('18:45', '19:20'),
|
||||
'13': ('19:20', '20:00')
|
||||
}
|
||||
|
||||
for period_num, (start_time, end_time) in period_times.items():
|
||||
self.cursor.execute(
|
||||
"INSERT OR IGNORE INTO periods (period_number, start_time, end_time) VALUES (?, ?, ?)",
|
||||
(int(period_num), start_time, end_time)
|
||||
)
|
||||
|
||||
# Add days of the week
|
||||
days_of_week = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
|
||||
for day in days_of_week:
|
||||
self.cursor.execute("INSERT OR IGNORE INTO days (name) VALUES (?)", (day,))
|
||||
|
||||
self.conn.commit()
|
||||
|
||||
def update_database_from_csv(self, auto_update=True):
|
||||
"""Automatically update database from specific CSV files in the sample_data directory"""
|
||||
# Updated path to look in the parent directory
|
||||
sample_data_dir = "../sample_data"
|
||||
|
||||
if not os.path.exists(sample_data_dir):
|
||||
print(f"Directory '{sample_data_dir}' not found. Trying local directory...")
|
||||
sample_data_dir = "sample_data"
|
||||
if not os.path.exists(sample_data_dir):
|
||||
print(f"Directory '{sample_data_dir}' not found.")
|
||||
return
|
||||
|
||||
# Get all CSV files and filter out the schedule template and sheet files
|
||||
all_csv_files = [f for f in os.listdir(sample_data_dir) if f.endswith('.csv')]
|
||||
|
||||
# Keep only the actual student distribution files (not the sheets)
|
||||
csv_files = []
|
||||
for filename in all_csv_files:
|
||||
if 'first_sheet' not in filename and 'last_sheet' not in filename and 'template' not in filename:
|
||||
csv_files.append(filename)
|
||||
|
||||
if not csv_files:
|
||||
print(f"No student data CSV files found in '{sample_data_dir}' directory.")
|
||||
return
|
||||
|
||||
print(f"Found {len(csv_files)} student data CSV file(s):")
|
||||
for i, filename in enumerate(csv_files, 1):
|
||||
print(f" {i}. {filename}")
|
||||
|
||||
if auto_update:
|
||||
print("\nAuto-updating database with all student data CSV files...")
|
||||
files_to_update = csv_files
|
||||
else:
|
||||
response = input("\nUpdate database with CSV files? (yes/no): ").lower()
|
||||
|
||||
if response not in ['yes', 'y', 'да']:
|
||||
print("Skipping database update.")
|
||||
return
|
||||
|
||||
print(f"\n0. Update all files")
|
||||
|
||||
try:
|
||||
selection = input(f"\nSelect file(s) to update (0 for all, or comma-separated numbers like 1,2,3): ")
|
||||
|
||||
if selection.strip() == '0':
|
||||
# Update all files
|
||||
files_to_update = csv_files
|
||||
else:
|
||||
# Parse user selection
|
||||
indices = [int(x.strip()) - 1 for x in selection.split(',')]
|
||||
files_to_update = [csv_files[i] for i in indices if 0 <= i < len(csv_files)]
|
||||
|
||||
if not files_to_update:
|
||||
print("No valid selections made.")
|
||||
return
|
||||
except ValueError:
|
||||
print("Invalid input. Please enter numbers separated by commas or '0' for all files.")
|
||||
return
|
||||
|
||||
# Populate the periods and days tables first
|
||||
self.populate_periods_table()
|
||||
|
||||
print(f"\nUpdating database with {len(files_to_update)} file(s):")
|
||||
for filename in files_to_update:
|
||||
print(f" - {filename}")
|
||||
|
||||
csv_path = os.path.join(sample_data_dir, filename)
|
||||
print(f"Processing {csv_path}...")
|
||||
|
||||
self.process_csv_with_teacher_mapping(csv_path)
|
||||
|
||||
# Update homeroom teachers from the dedicated CSV
|
||||
self.update_homeroom_teachers_from_csv()
|
||||
|
||||
print("Database updated successfully with selected CSV data.")
|
||||
|
||||
def update_homeroom_teachers_from_csv(self):
|
||||
"""Update homeroom teachers from the dedicated CSV file"""
|
||||
# Updated path to look in the parent directory
|
||||
homeroom_csv_path = "../sample_data/Homeroom_teachers.csv"
|
||||
|
||||
if not os.path.exists(homeroom_csv_path):
|
||||
print(f"Homeroom teachers file '{homeroom_csv_path}' not found. Trying local directory...")
|
||||
homeroom_csv_path = "sample_data/Homeroom_teachers.csv"
|
||||
if not os.path.exists(homeroom_csv_path):
|
||||
print(f"Homeroom teachers file '{homeroom_csv_path}' not found.")
|
||||
return
|
||||
|
||||
with open(homeroom_csv_path, 'r', encoding='utf-8') as file:
|
||||
reader = csv.DictReader(file)
|
||||
|
||||
for row in reader:
|
||||
# Normalize the class name to handle Cyrillic/Latin differences
|
||||
normalized_class = self.normalize_class_name(row['Class'])
|
||||
self.cursor.execute("""
|
||||
INSERT OR REPLACE INTO homeroom_teachers
|
||||
(class_name, teacher_name, classroom, parent_meeting_room, internal_number, mobile_number)
|
||||
VALUES (?, ?, ?, ?, ?, ?)
|
||||
""", (
|
||||
normalized_class,
|
||||
row['Homeroom Teacher'],
|
||||
row['Classroom'],
|
||||
row['Parent Meeting Room'],
|
||||
row['Internal Number'],
|
||||
row['Mobile Number']
|
||||
))
|
||||
|
||||
self.conn.commit()
|
||||
print("Homeroom teachers updated successfully.")
|
||||
|
||||
def process_csv_with_teacher_mapping(self, csv_file):
|
||||
"""Process CSV with teacher-subject mapping based on positional order"""
|
||||
if not os.path.exists(csv_file):
|
||||
return False
|
||||
|
||||
with open(csv_file, 'r', encoding='utf-8') as file:
|
||||
reader = csv.reader(file)
|
||||
rows = list(reader)
|
||||
|
||||
# Identify header row - look for the row containing "ФИО" (full name) or similar indicators
|
||||
header_idx = None
|
||||
for i, row in enumerate(rows):
|
||||
for cell in row:
|
||||
if "ФИО" in str(cell) or "фио" in str(cell).lower() or "Ф.И.О." in str(cell) or "ф.и.о." in str(cell):
|
||||
header_idx = i
|
||||
break
|
||||
if header_idx is not None:
|
||||
break
|
||||
|
||||
if header_idx is None:
|
||||
# Check if this file contains class and name columns that identify it as a student data file
|
||||
# Even if the header doesn't contain ФИО, we might still be able to identify student data
|
||||
has_class_indicators = any(
|
||||
any(indicator in str(cell).lower() for cell in row for indicator in ['класс', 'class'])
|
||||
for row in rows[:min(len(rows), 10)] # Check first 10 rows
|
||||
)
|
||||
|
||||
has_name_indicators = any(
|
||||
any(indicator in str(cell).lower() for cell in row for indicator in ['имя', 'name', 'фамилия', 'surname'])
|
||||
for row in rows[:min(len(rows), 10)] # Check first 10 rows
|
||||
)
|
||||
|
||||
if has_class_indicators and has_name_indicators:
|
||||
# Try to find the header row by looking for class and name indicators
|
||||
for i, row in enumerate(rows):
|
||||
if any(indicator in str(cell).lower() for cell in row for indicator in ['класс', 'class']) and \
|
||||
any(indicator in str(cell).lower() for cell in row for indicator in ['имя', 'name', 'фамилия', 'surname']):
|
||||
header_idx = i
|
||||
break
|
||||
|
||||
if header_idx is None:
|
||||
print(f"Skipping {csv_file} - does not appear to be student data with ФИО/class columns")
|
||||
return False
|
||||
|
||||
# Build a mapping of subject names in the header row
|
||||
header_row = rows[header_idx]
|
||||
header_subjects = {}
|
||||
for col_idx, subject_name in enumerate(header_row):
|
||||
subject_name = str(subject_name).strip()
|
||||
if (subject_name and
|
||||
subject_name.lower() not in ['ф.и.о.', 'фио', 'класс', 'номер', 'сортировка', 'шкaфчика', 'локера'] and
|
||||
subject_name.strip() != "" and
|
||||
"ф.и.о" not in subject_name.lower() and
|
||||
"сортировка" not in subject_name.lower() and
|
||||
"номер" not in subject_name.lower() and
|
||||
"№" not in subject_name):
|
||||
header_subjects[col_idx] = subject_name # Map column index to subject name
|
||||
|
||||
# IMPROVED TEACHER-SUBJECT MAPPING: Extract teacher-subject pairs from the first rows
|
||||
# Match base subjects to teachers and then map to header subjects
|
||||
base_subject_teacher_map = {}
|
||||
|
||||
# Look through the first rows to find teacher-subject pairs
|
||||
for i in range(min(len(rows), header_idx)): # Only go up to header row
|
||||
current_row = rows[i]
|
||||
|
||||
# Process the row in pairs of (subject, teacher, group_info) pattern
|
||||
j = 0
|
||||
while j < len(current_row) - 1:
|
||||
subject_cell = current_row[j].strip() if j < len(current_row) else ""
|
||||
teacher_cell = current_row[j + 1].strip() if j + 1 < len(current_row) else ""
|
||||
group_cell = current_row[j + 2].strip() if j + 2 < len(current_row) else ""
|
||||
|
||||
# Check if the first cell is a subject, the second is a teacher, and the third is a group
|
||||
if (subject_cell and self._is_likely_subject_name_simple(subject_cell) and
|
||||
teacher_cell and self._is_likely_teacher_name_enhanced(teacher_cell) and
|
||||
group_cell and self._is_likely_group_identifier(group_cell)):
|
||||
|
||||
# Add to the base subject teacher map (if multiple teachers for same subject, store all)
|
||||
if subject_cell not in base_subject_teacher_map:
|
||||
base_subject_teacher_map[subject_cell] = []
|
||||
if teacher_cell not in base_subject_teacher_map[subject_cell]:
|
||||
base_subject_teacher_map[subject_cell].append(teacher_cell)
|
||||
|
||||
# Move to the next potential triplet (subject, teacher, group_info)
|
||||
j += 3 # Skip subject, teacher, and group info
|
||||
|
||||
        # Also check the row immediately before the header row for additional teacher-subject pairs
        if header_idx > 0:
            prev_row = rows[header_idx - 1]
            j = 0
            while j < len(prev_row) - 1:
                subject_cell = prev_row[j].strip() if j < len(prev_row) else ""
                teacher_cell = prev_row[j + 1].strip() if j + 1 < len(prev_row) else ""
                group_cell = prev_row[j + 2].strip() if j + 2 < len(prev_row) else ""

                # Check that the first cell is a subject, the second a teacher, and the third a group
                if (subject_cell and self._is_likely_subject_name_simple(subject_cell) and
                        teacher_cell and self._is_likely_teacher_name_enhanced(teacher_cell) and
                        group_cell and self._is_likely_group_identifier(group_cell)):

                    # Add to the base subject-teacher map (if a subject has multiple teachers, store all of them)
                    if subject_cell not in base_subject_teacher_map:
                        base_subject_teacher_map[subject_cell] = []
                    if teacher_cell not in base_subject_teacher_map[subject_cell]:
                        base_subject_teacher_map[subject_cell].append(teacher_cell)

                # Move to the next potential (subject, teacher, group_info) triplet
                j += 3

        # Now map the header subjects to teachers using base-subject matching
        teacher_subject_map = {}
        for col_idx, header_subject in header_subjects.items():
            # Find the base subject that corresponds to this header subject
            base_subject = self._find_base_subject(header_subject, base_subject_teacher_map.keys())

            if base_subject and base_subject in base_subject_teacher_map:
                # Use the first teacher recorded for this base subject
                teacher_subject_map[header_subject] = base_subject_teacher_map[base_subject][0]

        # Process each student row
        for student_row in rows[header_idx + 1:]:
            # Determine the structure dynamically based on the header
            class_col_idx = None
            name_col_idx = None

            # Find the index of the class column (usually called "Класс")
            for idx, header in enumerate(header_row):
                if "класс" in str(header).lower() or "class" in str(header).lower():
                    class_col_idx = idx
                    break

            # Find the index of the name column (usually called "ФИО")
            for idx, header in enumerate(header_row):
                if "фио" in str(header).lower() or "ф.и.о." in str(header).lower() or "name" in str(header).lower():
                    name_col_idx = idx
                    break

            # If we couldn't find both columns, skip this row
            if class_col_idx is None or name_col_idx is None:
                continue

            # Check that this row has valid data in the expected columns
            if (len(student_row) > max(class_col_idx, name_col_idx) and
                    student_row[class_col_idx].strip() and  # class name exists
                    student_row[name_col_idx].strip() and  # student name exists
                    self._is_valid_student_record_by_cols(student_row, class_col_idx, name_col_idx)):

                name = student_row[name_col_idx].strip()
                class_name = student_row[class_col_idx].strip()

                # Normalize the class name to handle Cyrillic/Latin differences
                normalized_class = self.normalize_class_name(class_name)

                # Insert the student (INSERT OR REPLACE prevents duplicates)
                self.cursor.execute(
                    "INSERT OR REPLACE INTO students (class_name, full_name) VALUES (?, ?)",
                    (normalized_class, name)
                )

                # Get the student_id for this student
                self.cursor.execute(
                    "SELECT student_id FROM students WHERE full_name = ? AND class_name = ?",
                    (name, normalized_class)
                )
                student_id_result = self.cursor.fetchone()
                if student_id_result is None:
                    continue
                student_id = student_id_result[0]

                # Process schedule data for this student:
                # go through each column to find subject and group info
                for col_idx, cell_value in enumerate(student_row):
                    if cell_value and col_idx < len(header_row):
                        # Get the subject from the header
                        subject_header = header_row[col_idx]

                        # Skip columns that don't contain schedule information
                        if (col_idx in (0, 1, 2) or col_idx in (class_col_idx, name_col_idx) or  # metadata columns
                                "сортировка" in subject_header.lower() or
                                "номер" in subject_header.lower() or
                                "шкафчика" in subject_header.lower() or
                                "локера" in subject_header.lower()):
                            continue

                        # Extract group information from the cell
                        group_assignment = cell_value.strip()

                        if group_assignment and group_assignment.lower() != "nan" and group_assignment != "-":
                            # Find the teacher associated with this subject
                            subject_name = str(subject_header).strip()
                            teacher_name = teacher_subject_map.get(subject_name, f"Default Teacher for {subject_name}")

                            # Insert the entities into their respective tables first,
                            # then use their IDs to create the schedule entry
                            self._process_schedule_entry_with_teacher_mapping(
                                student_id, group_assignment, subject_name, teacher_name
                            )

        self.conn.commit()
        return True

    def _find_base_subject(self, header_subject, base_subjects):
        """Find the base subject that corresponds to a header subject."""
        header_lower = header_subject.lower()

        # Check for direct matches first
        for base_subject in base_subjects:
            if base_subject.lower() in header_lower or header_lower in base_subject.lower():
                return base_subject

        # Check for partial matches after stripping common module suffixes
        for base_subject in base_subjects:
            simplified_header = header_lower.replace(" 1 модуль", "").replace(" 2 модуль", "") \
                .replace(" 1,2 модуль", "").replace(" 1 мод.", "").replace(" 2 мод.", "") \
                .replace(" / ", " ").replace("  ", " ")
            simplified_base = base_subject.lower().replace(" / ", " ").replace("  ", " ")

            if simplified_base in simplified_header or simplified_header in simplified_base:
                return base_subject

        return None
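
    # Worked example (hypothetical values): with base_subjects = {"Математика", "Английский"},
    # _find_base_subject("Математика 1 модуль", base_subjects) matches on the base name and
    # returns "Математика", while an unknown header such as "Шахматы" returns None.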

    def _is_likely_subject_name_simple(self, text):
        """Simple check for whether the text is likely a subject name."""
        if not text or len(text.strip()) < 2:
            return False

        text = text.strip().lower()

        # Common subject indicators in Russian and English
        subject_indicators = [
            'технотрек', 'матем', 'информ', 'англ', 'русск', 'физика', 'химия', 'биол', 'история',
            'общество', 'география', 'литер', 'физкульт', 'лидерство',
            'спорт. клуб', 'орксэ', 'китайск', 'немецк', 'француз', 'speaking club', 'maths',
            'ict', 'geography', 'physics', 'robotics', 'culinary', 'science', 'ai core', 'vr/ar',
            'cybersafety', 'business', 'design', 'prototype', 'mediacom', 'robotics track',
            'culinary track', 'science track', 'ai core track', 'vr/ar track', 'cybersafety track',
            'programming', 'algorithm', 'logic', 'pe', 'sports', 'swimming', 'fitness', 'gymnastics',
            'climbing', 'games', 'art', 'music', 'dance', 'karate', 'judo', 'chess', 'leadership',
            'алгоритмика', 'робототехника', 'программирование', 'математика', 'информатика',
            'английский', 'русский', 'физическая культура', 'изо', 'алгебра', 'геометрия',
            'астрономия', 'экология', 'иностранный', 'ит', 'computer science', 'informatics'
        ]

        # Check whether the text contains any of the subject indicators
        return any(indicator in text for indicator in subject_indicators)

    def _is_likely_subject_name(self, text):
        """Check if the text is likely a subject name."""
        if not text or len(text.strip()) < 2:
            return False

        text = text.strip()

        # Common subject indicators in Russian and English
        subject_indicators = [
            'Матем.', 'Информ.', 'Англ.яз', 'Русск.яз', 'Физика', 'Химия', 'Биол', 'История',
            'Общество', 'География', 'Литер', 'Физкульт', 'Технотрек', 'Лидерство',
            'Спорт. клуб', 'ОРКСЭ', 'Китайск', 'Немецк', 'Француз', 'Speaking club', 'Maths',
            'ICT', 'Geography', 'Physics', 'Robotics', 'Culinary', 'Science', 'AI Core', 'VR/AR',
            'CyberSafety', 'Business', 'Design', 'Prototype', 'MediaCom',
            'Robotics Track', 'Culinary Track', 'Science Track', 'AI Core Track', 'VR/AR Track',
            'CyberSafety Track', 'Business Track', 'Design Track', 'Prototype Track', 'MediaCom Track',
            'Math', 'Algebra', 'Geometry', 'Calculus', 'Statistics', 'Coding',
            'Programming', 'Algorithm', 'Logic', 'Physical Education', 'PE', 'Sports',
            'Swimming', 'Fitness', 'Gymnastics', 'Climbing', 'Games', 'Art', 'Music', 'Dance',
            'Karate', 'Judo', 'Martial Arts', 'Chess', 'Leadership', 'Entrepreneurship',
            'Технотрек 1 модуль', 'Технотрек 2 модуль', 'ОРКСЭ 1,2 модуль', 'Математика 1 модуль',
            'Математика 2 модуль', 'Программирование', 'Алгоритмика и логика',
            'Робототехника', 'Physical Education 1,2 модуль', 'Английский 1 модуль', 'Английский 2 модуль',
            'Биология', 'Обществознание', 'Литература', 'Физическая культура', 'ИЗО',
            'Китайский', 'Немецкий', 'Французский', 'Алгебра', 'Геометрия', 'Астрономия', 'Экология'
        ]

        # Check whether the text matches any of the subject indicators
        for indicator in subject_indicators:
            if indicator.lower() in text.lower():
                return True

        # Check whether the text contains common subject-related keywords
        common_keywords = ['модуль', 'track', 'club', 'group', 'class', 'lesson', 'subject', 'module', 'яз', 'язык']
        for keyword in common_keywords:
            if keyword in text.lower():
                return True

        # Check whether the text matches patterns that indicate a subject
        subject_patterns = [
            r'.*[Tt]rack.*',    # Track identifiers
            r'.*[Mm]odule.*',   # Module identifiers
            r'.*[Cc]lub.*',     # Club identifiers
            r'.*[Ss]ubject.*',  # Subject identifiers
            r'.*[Cc]lass.*',    # Class identifiers
            r'.*[Ll]esson.*',   # Lesson identifiers
        ]

        for pattern in subject_patterns:
            if re.search(pattern, text):
                return True

        return False

    def _is_valid_student_record_by_cols(self, row, class_col_idx, name_col_idx):
        """Check if a row represents a valid student record based on specific columns."""
        # A valid student record needs a non-empty class name in the class column
        # and a non-empty student name in the name column.
        if len(row) <= max(class_col_idx, name_col_idx):
            return False

        class_name = row[class_col_idx].strip()
        student_name = row[name_col_idx].strip()

        # Check whether the class name looks like an actual class (a number followed by a letter)
        class_pattern = r'^\d+[А-ЯA-Z]$'  # e.g., 6А, 11А, 4B
        if re.match(class_pattern, class_name):
            # Make sure a name exists and differs from the class designation
            return bool(student_name and student_name != class_name)

        # If the name field itself matches the class pattern, it probably holds a class, not a name
        if re.match(class_pattern, student_name):
            return False

        return bool(class_name and student_name and class_name != student_name)
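
    # Worked example (hypothetical row values): with class_col_idx=0 and name_col_idx=1,
    # ["6А", "Иванов Иван"] is accepted; ["группа", "7Б"] is rejected because the name
    # field matches the class pattern r'^\d+[А-ЯA-Z]$' while the class field does not.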

    def _process_schedule_entry_with_teacher_mapping(self, student_id, group_info, subject_info, teacher_name):
        """Process an individual schedule entry with an explicit teacher mapping and insert it into the normalized tables."""
        # Clean up the inputs
        subject_name = subject_info.strip() if subject_info.strip() else "General Class"
        group_assignment = group_info.strip()

        # Only proceed if we have valid data
        if subject_name and group_assignment and group_assignment.lower() != "nan" and group_assignment != "-":
            # Insert the subject if it does not exist yet, then fetch its ID
            self.cursor.execute("INSERT OR IGNORE INTO subjects (name) VALUES (?)", (subject_name,))
            self.cursor.execute("SELECT subject_id FROM subjects WHERE name = ?", (subject_name,))
            subject_id = self.cursor.fetchone()[0]

            # Insert the teacher if it does not exist yet, then fetch its ID;
            # use the teacher name as-is rather than creating a default up front
            self.cursor.execute("INSERT OR IGNORE INTO teachers (name) VALUES (?)", (teacher_name,))
            self.cursor.execute("SELECT teacher_id FROM teachers WHERE name = ?", (teacher_name,))
            teacher_result = self.cursor.fetchone()
            if teacher_result:
                teacher_id = teacher_result[0]
            else:
                # Fall back to a default teacher if the extracted name is invalid
                default_teacher = "Неизвестный преподаватель"  # "Unknown teacher"
                self.cursor.execute("INSERT OR IGNORE INTO teachers (name) VALUES (?)", (default_teacher,))
                self.cursor.execute("SELECT teacher_id FROM teachers WHERE name = ?", (default_teacher,))
                teacher_id = self.cursor.fetchone()[0]

            # Use a default day for now; in a real system we would extract this
            # from the schedule. Here a weekday is assigned at random.
            import random
            days_list = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
            selected_day = random.choice(days_list)
            self.cursor.execute("INSERT OR IGNORE INTO days (name) VALUES (?)", (selected_day,))
            self.cursor.execute("SELECT day_id FROM days WHERE name = ?", (selected_day,))
            day_id = self.cursor.fetchone()[0]

            # Use a default period (period 1) for now; in a real system this
            # would be extracted from the CSV if available
            self.cursor.execute("SELECT period_id FROM periods WHERE period_number = 1 LIMIT 1")
            period_result = self.cursor.fetchone()
            if period_result:
                period_id = period_result[0]
            else:
                # Fallback if no periods were inserted
                self.cursor.execute("SELECT period_id FROM periods LIMIT 1")
                period_id = self.cursor.fetchone()[0]

            # Clean the group name to separate it from student data
            group_name = self._clean_group_name(group_assignment)
            self.cursor.execute("INSERT OR IGNORE INTO groups (name) VALUES (?)", (group_name,))
            self.cursor.execute("SELECT group_id FROM groups WHERE name = ?", (group_name,))
            group_id = self.cursor.fetchone()[0]

            # Insert the schedule entry
            self.cursor.execute("""
                INSERT OR IGNORE INTO schedule (student_id, subject_id, teacher_id, day_id, period_id, group_id)
                VALUES (?, ?, ?, ?, ?, ?)
            """, (student_id, subject_id, teacher_id, day_id, period_id, group_id))

    def _clean_group_name(self, raw_group_data):
        """Extract a clean group name from potentially mixed student/group data."""
        # Group names typically contain numbers, class identifiers, or specific activity names
        cleaned = raw_group_data.strip()

        # If the data starts with a class designation, return it as-is
        if re.match(r'^\d+[А-ЯA-Z]', cleaned):
            return cleaned

        # If the data contains common group indicators, return it as-is
        group_indicators = ['кл', 'class', 'club', 'track', 'group', 'module', '-']
        if any(indicator in cleaned.lower() for indicator in group_indicators):
            return cleaned

        # If the data looks like a subject identifier, return it as-is
        subject_indicators = ['ICT', 'English', 'Math', 'Physics', 'Chemistry', 'Biology', 'Science']
        if any(indicator in cleaned for indicator in subject_indicators):
            return cleaned

        # Otherwise fall back to a generic group name derived from the string's hash
        return f"Group_{hash(cleaned) % 10000}"

    def _is_likely_teacher_name(self, text):
        """Check if the text is likely to be a teacher name."""
        if not text or len(text.strip()) < 5:  # require a minimum length for a name
            return False

        text = text.strip()

        # Common non-name values that appear in the CSV (kept lowercase, since the
        # comparison below is against text.lower())
        common_non_names = ['-', 'nan', 'нет', 'нету', 'отсутствует', 'учитель', 'teacher', '',
                            'группа', 'group', 'каб.', 'гр.', 'фитнес', 'каб', 'все группы',
                            '1 группа', '2 группа', 'е1', 'е2', 'е3', 'е4', 'е5', 'е6',
                            'е1 е2', 'е4 е5', 'e1', 'e2', 'e3', 'e4', 'e5', 'e6',
                            'e1 e2', 'e4 e5', 'гр 1', 'гр 2']
        if text.lower() in common_non_names:
            return False

        # Exclusion patterns for non-teacher entries
        exclusion_patterns = [
            r'^[А-ЯЁ]\d+\s+[А-ЯЁ]\d+$',  # Cyrillic group pairs like Е4 Е5
            r'^[A-Z]\d+\s+[A-Z]\d+$',    # Latin group pairs like E4 E5
            r'.*[Tt]rack.*',             # Track identifiers
            r'.*[Gg]roup.*',             # Group identifiers
            r'.*\d+[А-ЯA-Z]\d*$',        # Number-letter combinations
            r'^[А-ЯЁA-Z].*\d+',          # Text ending with digits
            r'.*[Cc]lub.*',              # Club identifiers
            r'.*[Rr]oom.*',              # Room identifiers
            r'.*[Cc]lass.*',             # Class identifiers
            r'.*[Pp]eriod.*',            # Period identifiers
            r'^\d+$',                    # Digits only
            r'^[А-ЯЁA-Z]*$',             # All-caps single words
            r'^[А-ЯЁA-Z\s]*\d[А-ЯЁA-Z\s\d]*$',  # caps/digit mixes such as room numbers (requires a digit,
                                                # so ordinary multi-word names are not excluded)
            r'^[ЕеEe][\d\s,]+$',         # Room identifiers like E1, E2, etc.
        ]

        for pattern in exclusion_patterns:
            if re.match(pattern, text, re.IGNORECASE):
                return False

        # Positive patterns for teacher names
        teacher_patterns = [
            r'^[А-ЯЁ][а-яё]+\s+[А-ЯЁ]\.\s*[А-ЯЁ]\.$',            # Иванов А.А.
            r'^[А-ЯЁ]\.\s*[А-ЯЁ]\.\s+[А-ЯЁ][а-яё]+$',            # А.А. Иванов
            r'^[А-ЯЁ][а-яё]+\s+[А-ЯЁ][а-яё]+\s+[А-ЯЁ][а-яё]+$',  # Full name with patronymic
            r'^[A-Z][a-z]+\s+[A-Z][a-z]+$',                      # John Smith
            r'^[A-Z][a-z]+\s+[A-Z]\.\s*[A-Z]\.$',                # Smith J.J.
            r'^[А-ЯЁ][а-яё]+\s+[А-ЯЁ][а-яё]+$',                  # Russian name without patronymic
            r'^[А-ЯЁ][а-яё]+\s+[А-ЯЁ][а-яё]+\s+[А-ЯЁ][а-яё]+',   # Names without trailing periods
        ]

        for pattern in teacher_patterns:
            if re.match(pattern, text):
                return True

        # Additional check: accept anything that looks like a proper name
        # (capitalized words of sufficient length) and is not just a title
        name_parts = text.split()
        if len(name_parts) >= 2:
            # At least two parts (first name + last name), all starting with capitals
            if all(part[0].isupper() for part in name_parts if len(part) > 1):
                # Make sure it is not just a title or similar text
                common_titles = ['Mr', 'Mrs', 'Ms', 'Dr', 'Prof', 'Teacher', 'Instructor', 'Coach']
                if any(title in text for title in common_titles):
                    return False
                return True

        return False
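
    # Worked example (hypothetical values): "Иванов А.А." and "John Smith" pass,
    # while "Е4 Е5" (a group pair) and "ROOM 12" (a room identifier) are excluded.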

    def _is_likely_subject_label(self, text):
        """Check if the text is likely a subject label such as 'Матем.', 'Информ.', or 'Англ.яз'."""
        if not text or len(text) < 2:
            return False

        # Common Russian and English subject labels
        subject_patterns = [
            'Матем.', 'Информ.', 'Англ.яз', 'Русск.яз', 'Физика', 'Химия', 'Биол', 'История',
            'Общество', 'География', 'Литер', 'Физкульт', 'Технотрек', 'Лидерство',
            'Спорт. клуб', 'ОРКСЭ', 'Китайск', 'Немецк', 'Француз', 'Speaking club', 'Maths',
            'ICT', 'Geography', 'Physics', 'Robotics', 'Culinary', 'Science', 'AI Core', 'VR/AR',
            'CyberSafety', 'Business', 'Design', 'Prototype', 'MediaCom',
            'Robotics Track', 'Culinary Track', 'Science Track', 'AI Core Track', 'VR/AR Track',
            'CyberSafety Track', 'Business Track', 'Design Track', 'Prototype Track', 'MediaCom Track',
            'Math', 'Algebra', 'Geometry', 'Calculus', 'Statistics', 'Coding',
            'Programming', 'Algorithm', 'Logic', 'Physical Education', 'PE', 'Sports',
            'Swimming', 'Fitness', 'Gymnastics', 'Climbing', 'Games', 'Art', 'Music', 'Dance',
            'Karate', 'Judo', 'Martial Arts', 'Chess', 'Leadership', 'Entrepreneurship'
        ]

        text_clean = text.strip().lower()
        for pattern in subject_patterns:
            if pattern.lower() in text_clean:
                return True

        # Also check for specific subject names found in the data
        specific_subjects = ['матем.', 'информ.', 'англ.яз', 'русск.яз', 'каб.', 'business', 'maths',
                             'speaking', 'ict', 'geography', 'physics', 'robotics', 'science', 'ai core',
                             'vr/ar', 'cybersafety', 'design', 'prototype', 'mediacom', 'culinary',
                             'physical education', 'pe', 'sports', 'swimming', 'fitness', 'gymnastics',
                             'climbing', 'games', 'art', 'music', 'dance', 'karate', 'chess', 'leadership']
        for subj in specific_subjects:
            if subj in text_clean:
                return True

        return False

    def _find_matching_subject_in_header_from_list(self, subject_label, header_subjects, header_row):
        """Find the matching full subject name in the header based on a short label."""
        if not subject_label:
            return None

        # Normalize the label before looking for the best match
        subject_label_lower = subject_label.lower().replace('.', '').replace('яз', 'язык')

        # Direct match first
        for col_idx, full_subj in header_subjects:
            if subject_label_lower in full_subj.lower() or full_subj.lower() in subject_label_lower:
                return full_subj

        # If there is no direct match, try partial matching across the whole header row
        for header_item in header_row:
            if subject_label_lower in str(header_item).lower() or str(header_item).lower() in subject_label_lower:
                return str(header_item).strip()

        # Try more general matching on common abbreviation stems
        abbreviation_stems = ['матем', 'информ', 'англ', 'русск', 'физик', 'хим', 'биол', 'истор', 'общ', 'географ']
        for col_idx, full_subj in header_subjects:
            full_lower = full_subj.lower()
            if any(stem in subject_label_lower and stem in full_lower for stem in abbreviation_stems):
                return full_subj

        return None

    def find_student(self, name_query):
        """Search for students by name."""
        self.cursor.execute("""
            SELECT s.full_name, s.class_name
            FROM students s
            WHERE s.full_name LIKE ?
            LIMIT 10
        """, (f'%{name_query}%',))

        return self.cursor.fetchall()

    def get_current_class(self, student_name, current_day, current_time):
        """Find the class a student is currently in."""
        self.cursor.execute("""
            SELECT sub.name, t.name, p.start_time, p.end_time
            FROM schedule sch
            JOIN students s ON sch.student_id = s.student_id
            JOIN subjects sub ON sch.subject_id = sub.subject_id
            JOIN teachers t ON sch.teacher_id = t.teacher_id
            JOIN days d ON sch.day_id = d.day_id
            JOIN periods p ON sch.period_id = p.period_id
            JOIN groups g ON sch.group_id = g.group_id
            WHERE s.full_name = ?
              AND d.name = ?
              AND p.start_time <= ?
              AND p.end_time >= ?
        """, (student_name, current_day, current_time, current_time))

        return self.cursor.fetchone()

    def get_student_schedule(self, student_name):
        """Get the full schedule for a student."""
        self.cursor.execute("""
            SELECT sub.name, t.name, p.start_time, p.end_time, g.name
            FROM schedule sch
            JOIN students s ON sch.student_id = s.student_id
            JOIN subjects sub ON sch.subject_id = sub.subject_id
            JOIN teachers t ON sch.teacher_id = t.teacher_id
            JOIN periods p ON sch.period_id = p.period_id
            JOIN groups g ON sch.group_id = g.group_id
            WHERE s.full_name = ?
            ORDER BY p.period_number
        """, (student_name,))

        return self.cursor.fetchall()
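
    # Usage sketch (hypothetical names, assuming this class is constructed as
    # SchoolScheduleDB(), as in database_fresh.py later in this commit):
    #
    #   db = SchoolScheduleDB()
    #   for full_name, class_name in db.find_student("Иванов"):
    #       print(full_name, class_name)
    #   for subject, teacher, start, end, group in db.get_student_schedule("Иванов Иван"):
    #       print(f"{start}-{end}: {subject} with {teacher} ({group})")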

    def _is_likely_teacher_name_enhanced(self, text):
        """Enhanced check for whether the text is likely to be a teacher name."""
        if not text or len(text.strip()) < 5:  # require a minimum length for a name
            return False

        text = text.strip()

        # Common non-name values that appear in the CSV (kept lowercase, since the
        # comparison below is against text.lower())
        common_non_names = ['-', 'nan', 'нет', 'нету', 'отсутствует', 'учитель', 'teacher', '',
                            'группа', 'group', 'каб.', 'гр.', 'фитнес', 'каб', 'все группы',
                            '1 группа', '2 группа', 'е1', 'е2', 'е3', 'е4', 'е5', 'е6',
                            'е1 е2', 'е4 е5', 'e1', 'e2', 'e3', 'e4', 'e5', 'e6',
                            'e1 e2', 'e4 e5', 'гр 1', 'гр 2']
        if text.lower() in common_non_names:
            return False

        # Exclusion patterns for non-teacher entries
        exclusion_patterns = [
            r'^[А-ЯЁ]\d+\s+[А-ЯЁ]\d+$',  # Cyrillic group pairs like Е4 Е5
            r'^[A-Z]\d+\s+[A-Z]\d+$',    # Latin group pairs like E4 E5
            r'.*[Tt]rack.*',             # Track identifiers
            r'.*[Gg]roup.*',             # Group identifiers
            r'.*\d+[А-ЯA-Z]\d*$',        # Number-letter combinations
            r'^[А-ЯЁA-Z].*\d+',          # Text ending with digits
            r'.*[Cc]lub.*',              # Club identifiers
            r'.*[Rr]oom.*',              # Room identifiers
            r'.*[Cc]lass.*',             # Class identifiers
            r'.*[Pp]eriod.*',            # Period identifiers
            r'^\d+$',                    # Digits only
            r'^[А-ЯЁA-Z]*$',             # All-caps single words
            r'^[А-ЯЁA-Z\s]*\d[А-ЯЁA-Z\s\d]*$',  # caps/digit mixes such as room numbers (requires a digit)
            r'^[ЕеEe][\d\s,]+$',         # Room identifiers like E1, E2, etc.
        ]

        for pattern in exclusion_patterns:
            if re.match(pattern, text, re.IGNORECASE):
                return False

        # Teacher names typically have 2-4 capitalized words (Russian or English)
        words = text.split()
        if len(words) < 2 or len(words) > 4:
            return False

        # Count words starting with a capital letter, allowing common lowercase
        # name particles such as "van" or "de"
        capital_words = 0
        for word in words:
            if word in ['van', 'von', 'de', 'di', 'le', 'la', 'du', 'del', 'da', 'и', 'на', 'де']:
                capital_words += 1
            elif word[0].isupper() and len(word) > 1:
                capital_words += 1

        # At least n-1 of n words should be capitalized
        if capital_words < len(words) - 1:
            return False

        # Final check: every multi-letter part must start with a capital, and the
        # text must not be just a title such as "Mr" or "Teacher"
        if all(part[0].isupper() for part in words if len(part) > 1):
            common_titles = ['Mr', 'Mrs', 'Ms', 'Dr', 'Prof', 'Teacher', 'Instructor', 'Coach']
            if any(title in text for title in common_titles):
                return False
            return True

        return False

    def _is_likely_group_identifier(self, text):
        """Check if the text is likely a group identifier such as 'E1', 'E2', or 'гр 1'."""
        if not text:
            return False

        text = text.strip()

        # Common group identifier patterns
        group_patterns = [
            r'^[Ee]\d+',                     # E1, E2, etc.
            r'^[Ee]\d+\s*[Ee]\d+',           # E1 E2, E4 E5, etc.
            r'^(гр|group|группа).*',         # "гр 1", "group 1", etc.
            r'^[А-ЯA-Z]\d+',                 # A1, B2, etc.
            r'^[А-ЯA-Z]\d+\s+[А-ЯA-Z]\d+',   # A1 B2, etc.
            r'^(все группы|all groups).*',   # "все группы" ("all groups"), etc.
            r'^\d+\s*(группа|class).*',      # "1 группа" ("group 1"), etc.
            r'^(1|2)\s*(группа|group)',      # "1 группа", "2 group", etc.
        ]

        for pattern in group_patterns:
            if re.match(pattern, text, re.IGNORECASE):
                return True

        # Additional common group indicators
        common_groups = ['E1 E2', 'E3 E4', 'E5 E6', 'E1', 'E2', 'E3', 'E4', 'E5', 'E6',
                         '1 группа', '2 группа', 'все группы', 'гр 1', 'гр 2', 'all groups',
                         'group 1', 'group 2', 'A1', 'B1', 'C1', '4A', '4B', '4C', '4ABC']

        return text in common_groups
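
The classification helpers above are pure functions of their input text, so they can be exercised in isolation. A minimal sketch (the sample strings are hypothetical, and the regex list is a small subset of _is_likely_group_identifier's patterns):

    import re

    GROUP_PATTERNS = [r'^[Ee]\d+', r'^(гр|group|группа).*', r'^[А-ЯA-Z]\d+']

    def looks_like_group(text: str) -> bool:
        # Mirrors the core regex pass of _is_likely_group_identifier
        return any(re.match(p, text.strip(), re.IGNORECASE) for p in GROUP_PATTERNS)

    for cell in ["E1", "гр 2", "Иванов А.А.", "все группы"]:
        print(cell, looks_like_group(cell))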

247
scheduler_bots/database_backup.py
Normal file
@@ -0,0 +1,247 @@
    def process_csv_with_teacher_mapping(self, csv_file):
        """Process a CSV file with teacher-subject mapping based on positional order."""
        if not os.path.exists(csv_file):
            return False

        with open(csv_file, 'r', encoding='utf-8') as file:
            reader = csv.reader(file)
            rows = list(reader)

        # Identify the header row by looking for "ФИО" (full name) or similar indicators
        header_idx = None
        for i, row in enumerate(rows):
            for cell in row:
                if "фио" in str(cell).lower() or "ф.и.о." in str(cell).lower():
                    header_idx = i
                    break
            if header_idx is not None:
                break

        if header_idx is None:
            # Even without a ФИО header we may still be able to identify student data
            # if the file contains class and name columns
            has_class_indicators = any(
                any(indicator in str(cell).lower() for cell in row for indicator in ['класс', 'class'])
                for row in rows[:min(len(rows), 10)]  # check the first 10 rows
            )
            has_name_indicators = any(
                any(indicator in str(cell).lower() for cell in row for indicator in ['имя', 'name', 'фамилия', 'surname'])
                for row in rows[:min(len(rows), 10)]  # check the first 10 rows
            )

            if has_class_indicators and has_name_indicators:
                # Try to find the header row by looking for class and name indicators
                for i, row in enumerate(rows):
                    if any(indicator in str(cell).lower() for cell in row for indicator in ['класс', 'class']) and \
                       any(indicator in str(cell).lower() for cell in row for indicator in ['имя', 'name', 'фамилия', 'surname']):
                        header_idx = i
                        break

            if header_idx is None:
                print(f"Skipping {csv_file} - does not appear to be student data with ФИО/class columns")
                return False

        # Find teacher-subject mappings in the rows before the header
        teacher_subject_map = {}

        # Build a mapping of subject names in the header row
        header_row = rows[header_idx]
        header_subjects = {}
        for col_idx, subject_name in enumerate(header_row):
            subject_name = str(subject_name).strip()
            if (subject_name and
                    subject_name.lower() not in ['ф.и.о.', 'фио', 'класс', 'номер', 'сортировка', 'шкафчика', 'локера'] and
                    "ф.и.о" not in subject_name.lower() and
                    "сортировка" not in subject_name.lower() and
                    "номер" not in subject_name.lower() and
                    "№" not in subject_name):
                header_subjects[col_idx] = subject_name  # map column index to subject name

        # First, try to find teachers in up to 15 rows before the header
        for i in range(min(15, header_idx)):
            current_row = rows[i]

            # Scan every cell in the row for teacher names and their adjacent context
            for j, cell_value in enumerate(current_row):
                cell_str = str(cell_value).strip()

                # Check whether this cell is a likely teacher name
                if self._is_likely_teacher_name(cell_str):
                    # Look for context on the left (department) and right (subject)
                    left_context = ""
                    right_context = ""

                    # Left neighbor (department)
                    if j > 0:
                        left_context = str(current_row[j - 1]).strip()

                    # Right neighbor (subject)
                    if j < len(current_row) - 1:
                        right_context = str(current_row[j + 1]).strip()

                    # Try to determine the subject based on adjacency
                    matched_subject = None

                    # First priority: the right neighbor if it matches a subject in the header
                    if right_context and j + 1 in header_subjects:
                        matched_subject = header_subjects[j + 1]
                    # Second priority: use the left context if it relates to a teacher or department
                    elif left_context and any(keyword in left_context.lower() for keyword in ['учитель', 'teacher', 'кафедра', 'department']):
                        # A department on the left suggests the subject is to the right of the teacher
                        if j + 1 in header_subjects:
                            matched_subject = header_subjects[j + 1]
                        # Otherwise try to map by position
                        elif j in header_subjects:
                            matched_subject = header_subjects[j]
                    # Third priority: map by position
                    elif j in header_subjects:
                        matched_subject = header_subjects[j]

                    # Only record the mapping if we don't already have a better teacher for this subject
                    if matched_subject and (matched_subject not in teacher_subject_map or
                                            'Default Teacher for' in teacher_subject_map.get(matched_subject, '')):
                        teacher_subject_map[matched_subject] = cell_str

                # If the cell contains multiple names separated by newlines, process each one
                elif '\n' in cell_str or '\\n' in cell_str:
                    cell_parts = [part.strip() for part in cell_str.replace('\\n', '\n').split('\n') if part.strip()]
                    for part in cell_parts:
                        if self._is_likely_teacher_name(part):
                            # Look for context on the left (department) and right (subject)
                            left_context = ""
                            right_context = ""

                            if j > 0:
                                left_context = str(current_row[j - 1]).strip()
                            if j < len(current_row) - 1:
                                right_context = str(current_row[j + 1]).strip()

                            # Try to determine the subject based on adjacency, with the
                            # same priorities as the single-name case above
                            matched_subject = None
                            if right_context and j + 1 in header_subjects:
                                matched_subject = header_subjects[j + 1]
                            elif left_context and any(keyword in left_context.lower() for keyword in ['учитель', 'teacher', 'кафедра', 'department']):
                                if j + 1 in header_subjects:
                                    matched_subject = header_subjects[j + 1]
                                elif j in header_subjects:
                                    matched_subject = header_subjects[j]
                            elif j in header_subjects:
                                matched_subject = header_subjects[j]

                            if matched_subject and (matched_subject not in teacher_subject_map or
                                                    'Default Teacher for' in teacher_subject_map.get(matched_subject, '')):
                                teacher_subject_map[matched_subject] = part

        # Additional teacher-subject mapping: scan the rows immediately before the header
        # for teacher names sitting in subject columns; in many CSV files teacher names
        # appear in the same rows as the subject headers
        for i in range(max(0, header_idx - 5), header_idx):  # check 5 rows before the header
            current_row = rows[i]
            for j, cell_value in enumerate(current_row):
                cell_str = str(cell_value).strip()

                # If the cell holds a likely teacher name and corresponds to a subject column
                if self._is_likely_teacher_name(cell_str) and j in header_subjects:
                    subject_name = header_subjects[j]
                    # Only record it if we don't already have a better teacher for this subject
                    if (subject_name not in teacher_subject_map or
                            'Default Teacher for' in teacher_subject_map.get(subject_name, '')):
                        teacher_subject_map[subject_name] = cell_str

        # Additional validation: drop any teacher-subject mappings that look incorrect
        validated_teacher_subject_map = {}
        for subject, teacher in teacher_subject_map.items():
            # Keep the mapping only if the teacher name passes all checks
            if self._is_likely_teacher_name(teacher):
                validated_teacher_subject_map[subject] = teacher
            else:
                print(f"Warning: Invalid teacher name '{teacher}' detected for subject '{subject}', skipping...")

        teacher_subject_map = validated_teacher_subject_map

        # Process each student row
        for student_row in rows[header_idx + 1:]:
            # Determine the structure dynamically based on the header
            class_col_idx = None
            name_col_idx = None

            # Find the index of the class column (usually called "Класс")
            for idx, header in enumerate(header_row):
                if "класс" in str(header).lower() or "class" in str(header).lower():
                    class_col_idx = idx
                    break

            # Find the index of the name column (usually called "ФИО")
            for idx, header in enumerate(header_row):
                if "фио" in str(header).lower() or "ф.и.о." in str(header).lower() or "name" in str(header).lower():
                    name_col_idx = idx
                    break

            # If we couldn't find both columns, skip this row
            if class_col_idx is None or name_col_idx is None:
                continue

            # Check that this row has valid data in the expected columns
            if (len(student_row) > max(class_col_idx, name_col_idx) and
                    student_row[class_col_idx].strip() and  # class name exists
                    student_row[name_col_idx].strip() and  # student name exists
                    self._is_valid_student_record_by_cols(student_row, class_col_idx, name_col_idx)):

                name = student_row[name_col_idx].strip()
                class_name = student_row[class_col_idx].strip()

                # Insert the student into the database
                self.cursor.execute(
                    "INSERT OR IGNORE INTO students (class_name, full_name) VALUES (?, ?)",
                    (class_name, name)
                )

                # Get the student_id for this student
                self.cursor.execute("SELECT student_id FROM students WHERE full_name = ? AND class_name = ?", (name, class_name))
                student_id_result = self.cursor.fetchone()
                if student_id_result is None:
                    continue
                student_id = student_id_result[0]

                # Process schedule data for this student:
                # go through each column to find subject and group info
                for col_idx, cell_value in enumerate(student_row):
                    if cell_value and col_idx < len(header_row):
                        # Get the subject from the header
                        subject_header = header_row[col_idx]

                        # Skip columns that don't contain schedule information
                        if (col_idx in (0, 1, 2) or col_idx in (class_col_idx, name_col_idx) or  # metadata columns
                                "сортировка" in subject_header.lower() or
                                "номер" in subject_header.lower() or
                                "шкафчика" in subject_header.lower() or
                                "локера" in subject_header.lower()):
                            continue

                        # Extract group information from the cell
                        group_assignment = cell_value.strip()

                        if group_assignment and group_assignment.lower() != "nan" and group_assignment != "-":
                            # Find the teacher associated with this subject
                            subject_name = str(subject_header).strip()
                            teacher_name = teacher_subject_map.get(subject_name, f"Default Teacher for {subject_name}")

                            # Insert the entities into their respective tables first,
                            # then use their IDs to create the schedule entry
                            self._process_schedule_entry_with_teacher_mapping(
                                student_id, group_assignment, subject_name, teacher_name
                            )

        self.conn.commit()
        return True
220
scheduler_bots/database_fixed.py
Normal file
@@ -0,0 +1,220 @@
        self.conn.commit()
        return True

    def process_csv_with_teacher_mapping(self, csv_file):
        """Process a CSV file with teacher-subject mapping based on positional order."""
        if not os.path.exists(csv_file):
            return False

        with open(csv_file, 'r', encoding='utf-8') as file:
            reader = csv.reader(file)
            rows = list(reader)

        # Identify the header row by looking for "ФИО" (full name) or similar indicators
        header_idx = None
        for i, row in enumerate(rows):
            for cell in row:
                if "фио" in str(cell).lower() or "ф.и.о." in str(cell).lower():
                    header_idx = i
                    break
            if header_idx is not None:
                break

        if header_idx is None:
            # Even without a ФИО header we may still be able to identify student data
            # if the file contains class and name columns
            has_class_indicators = any(
                any(indicator in str(cell).lower() for cell in row for indicator in ['класс', 'class'])
                for row in rows[:min(len(rows), 10)]  # check the first 10 rows
            )
            has_name_indicators = any(
                any(indicator in str(cell).lower() for cell in row for indicator in ['имя', 'name', 'фамилия', 'surname'])
                for row in rows[:min(len(rows), 10)]  # check the first 10 rows
            )

            if has_class_indicators and has_name_indicators:
                # Try to find the header row by looking for class and name indicators
                for i, row in enumerate(rows):
                    if any(indicator in str(cell).lower() for cell in row for indicator in ['класс', 'class']) and \
                       any(indicator in str(cell).lower() for cell in row for indicator in ['имя', 'name', 'фамилия', 'surname']):
                        header_idx = i
                        break

            if header_idx is None:
                print(f"Skipping {csv_file} - does not appear to be student data with ФИО/class columns")
                return False

        # Find teacher-subject mappings in the rows before the header
        teacher_subject_map = {}

        # Build a mapping of subject names in the header row
        header_row = rows[header_idx]
        header_subjects = {}
        for col_idx, subject_name in enumerate(header_row):
            subject_name = str(subject_name).strip()
            if (subject_name and
                    subject_name.lower() not in ['ф.и.о.', 'фио', 'класс', 'номер', 'сортировка', 'шкафчика', 'локера'] and
                    "ф.и.о" not in subject_name.lower() and
                    "сортировка" not in subject_name.lower() and
                    "номер" not in subject_name.lower() and
                    "№" not in subject_name):
                header_subjects[col_idx] = subject_name  # map column index to subject name

        # Process rows before the header to find teacher names and map them to subjects
        for i in range(min(15, header_idx)):  # check up to 15 rows before the header
            current_row = rows[i]

            # Scan every cell in the row for teacher names and their adjacent context
            for j, cell_value in enumerate(current_row):
                cell_str = str(cell_value).strip()

                # Check whether this cell is a likely teacher name
                if self._is_likely_teacher_name(cell_str):
                    # Look for context on the left (department) and right (subject)
                    left_context = ""
                    right_context = ""

                    # Left neighbor (department)
                    if j > 0:
                        left_context = str(current_row[j - 1]).strip()

                    # Right neighbor (subject)
                    if j < len(current_row) - 1:
                        right_context = str(current_row[j + 1]).strip()

                    # Try to determine the subject based on adjacency
                    matched_subject = None

                    # First priority: the right neighbor if it matches a subject in the header
                    if right_context and j + 1 in header_subjects:
                        matched_subject = header_subjects[j + 1]
                    # Second priority: use the left context if it relates to a teacher or department
                    elif left_context and any(keyword in left_context.lower() for keyword in ['учитель', 'teacher', 'кафедра', 'department']):
                        # A department on the left suggests the subject is to the right of the teacher
                        if j + 1 in header_subjects:
                            matched_subject = header_subjects[j + 1]
                        # Otherwise try to map by position
                        elif j in header_subjects:
                            matched_subject = header_subjects[j]
                    # Third priority: map by position
                    elif j in header_subjects:
                        matched_subject = header_subjects[j]

                    # Only record the mapping if we don't already have a better teacher for this subject
                    if matched_subject and (matched_subject not in teacher_subject_map or
                                            'Default Teacher for' in teacher_subject_map.get(matched_subject, '')):
                        teacher_subject_map[matched_subject] = cell_str

                # If the cell contains multiple names separated by newlines, process each one
                elif '\n' in cell_str or '\\n' in cell_str:
                    cell_parts = [part.strip() for part in cell_str.replace('\\n', '\n').split('\n') if part.strip()]
                    for part in cell_parts:
                        if self._is_likely_teacher_name(part):
                            # Look for context on the left (department) and right (subject)
                            left_context = ""
                            right_context = ""

                            if j > 0:
                                left_context = str(current_row[j - 1]).strip()
                            if j < len(current_row) - 1:
                                right_context = str(current_row[j + 1]).strip()

                            # Try to determine the subject based on adjacency, with the
                            # same priorities as the single-name case above
                            matched_subject = None
                            if right_context and j + 1 in header_subjects:
                                matched_subject = header_subjects[j + 1]
                            elif left_context and any(keyword in left_context.lower() for keyword in ['учитель', 'teacher', 'кафедра', 'department']):
                                if j + 1 in header_subjects:
                                    matched_subject = header_subjects[j + 1]
                                elif j in header_subjects:
                                    matched_subject = header_subjects[j]
                            elif j in header_subjects:
                                matched_subject = header_subjects[j]

                            if matched_subject and (matched_subject not in teacher_subject_map or
                                                    'Default Teacher for' in teacher_subject_map.get(matched_subject, '')):
                                teacher_subject_map[matched_subject] = part

        # Additional validation: drop any teacher-subject mappings that look incorrect
        validated_teacher_subject_map = {}
        for subject, teacher in teacher_subject_map.items():
            # Keep the mapping only if the teacher name passes all checks
            if self._is_likely_teacher_name(teacher):
                validated_teacher_subject_map[subject] = teacher
            else:
                print(f"Warning: Invalid teacher name '{teacher}' detected for subject '{subject}', skipping...")

        teacher_subject_map = validated_teacher_subject_map

        # Additional teacher-subject mapping: scan the data rows for teacher names paired
        # with subjects; in many CSV files teacher names appear in the same rows as subject data
        for i in range(header_idx + 1, min(len(rows), header_idx + 50)):  # check the first 50 data rows
            current_row = rows[i]
            for j, cell_value in enumerate(current_row):
                cell_str = str(cell_value).strip()

                # If the cell holds a likely teacher name and corresponds to a subject column
                if self._is_likely_teacher_name(cell_str) and j in header_subjects:
                    subject_name = header_subjects[j]
                    # Only record it if we don't already have a better teacher for this subject
                    if (subject_name not in teacher_subject_map or
                            'Default Teacher for' in teacher_subject_map.get(subject_name, '')):
                        teacher_subject_map[subject_name] = cell_str

        # Process each student row
        for student_row in rows[header_idx + 1:]:
            # Determine the structure dynamically based on the header
            class_col_idx = None
            name_col_idx = None

            # Find the index of the class column (usually called "Класс")
            for idx, header in enumerate(header_row):
                if "класс" in str(header).lower() or "class" in str(header).lower():
                    class_col_idx = idx
                    break

            # Find the index of the name column (usually called "ФИО")
            for idx, header in enumerate(header_row):
                if "фио" in str(header).lower() or "ф.и.о." in str(header).lower() or "name" in str(header).lower():
                    name_col_idx = idx
                    break

            # If we couldn't find both columns, skip this row
            if class_col_idx is None or name_col_idx is None:
                continue

            # Check that this row has valid data in the expected columns
            if (len(student_row) > max(class_col_idx, name_col_idx) and
                    student_row[class_col_idx].strip() and  # class name exists
                    student_row[name_col_idx].strip() and  # student name exists
                    self._is_valid_student_record_by_cols(student_row, class_col_idx, name_col_idx)):

                name = student_row[name_col_idx].strip()
                class_name = student_row[class_col_idx].strip()

                # Insert the student into the database
                self.cursor.execute(
                    "INSERT OR IGNORE INTO students (class_name, full_name) VALUES (?, ?)",
                    (class_name, name)
                )

                # Get the student_id for this student
                self.cursor.execute("SELECT student_id FROM students WHERE full_name = ? AND class_name = ?", (name, class_name))
                student_id_result = self.cursor.fetchone()
                if student_id_result is None:
                    continue
                student_id = student_id_result[0]

                # Process schedule data for this student
721
scheduler_bots/database_fresh.py
Normal file
@@ -0,0 +1,721 @@
#!/usr/bin/env python
"""
database.py - School schedule database (normalized version)
Creates normalized tables and extracts data from CSV files with proper relationships
"""

import sqlite3
import csv
import os
import sys
import re


class SchoolScheduleDB:
    def __init__(self, db_name='school_schedule.db'):
        self.conn = sqlite3.connect(db_name)
        self.cursor = self.conn.cursor()
        # Initialize database tables
        self.create_tables()

    def create_tables(self):
        """Create normalized tables with proper relationships."""
        # Teachers table
        self.cursor.execute("""
            CREATE TABLE IF NOT EXISTS teachers (
                teacher_id INTEGER PRIMARY KEY AUTOINCREMENT,
                name TEXT UNIQUE NOT NULL,
                email TEXT,
                phone TEXT
            )
        """)

        # Subjects table
        self.cursor.execute("""
            CREATE TABLE IF NOT EXISTS subjects (
                subject_id INTEGER PRIMARY KEY AUTOINCREMENT,
                name TEXT UNIQUE NOT NULL,
                description TEXT
            )
        """)

        # Days table
        self.cursor.execute("""
            CREATE TABLE IF NOT EXISTS days (
                day_id INTEGER PRIMARY KEY AUTOINCREMENT,
                name TEXT UNIQUE NOT NULL  -- e.g., Monday, Tuesday, etc.
            )
        """)

        # Periods table, with a proper unique constraint
        self.cursor.execute("""
            CREATE TABLE IF NOT EXISTS periods (
                period_id INTEGER PRIMARY KEY AUTOINCREMENT,
                period_number INTEGER,
                start_time TEXT,
                end_time TEXT,
                UNIQUE(period_number, start_time, end_time)
            )
        """)

        # Groups table
        self.cursor.execute("""
            CREATE TABLE IF NOT EXISTS groups (
                group_id INTEGER PRIMARY KEY AUTOINCREMENT,
                name TEXT UNIQUE NOT NULL,
                description TEXT,
                class_name TEXT
            )
        """)

        # Students table
        self.cursor.execute("""
            CREATE TABLE IF NOT EXISTS students (
                student_id INTEGER PRIMARY KEY AUTOINCREMENT,
                class_name TEXT,
                full_name TEXT NOT NULL
            )
        """)

        # Schedule table with foreign-key relationships
        self.cursor.execute("""
            CREATE TABLE IF NOT EXISTS schedule (
                entry_id INTEGER PRIMARY KEY AUTOINCREMENT,
                student_id INTEGER,
                subject_id INTEGER,
                teacher_id INTEGER,
                day_id INTEGER,
                period_id INTEGER,
                group_id INTEGER,
                FOREIGN KEY (student_id) REFERENCES students(student_id),
                FOREIGN KEY (subject_id) REFERENCES subjects(subject_id),
                FOREIGN KEY (teacher_id) REFERENCES teachers(teacher_id),
                FOREIGN KEY (day_id) REFERENCES days(day_id),
                FOREIGN KEY (period_id) REFERENCES periods(period_id),
                FOREIGN KEY (group_id) REFERENCES groups(group_id)
            )
        """)

        self.conn.commit()
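
    # Schema sketch: schedule is a junction table, where each row links one student
    # to one (subject, teacher, day, period, group) combination. A typical lookup,
    # as in get_student_schedule shown earlier in this commit, joins back through
    # the foreign keys:
    #
    #   SELECT sub.name, t.name FROM schedule sch
    #   JOIN subjects sub ON sch.subject_id = sub.subject_id
    #   JOIN teachers t ON sch.teacher_id = t.teacher_id
    #   WHERE sch.student_id = ?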
|
||||
|
||||
def populate_periods_table(self):
|
||||
"""Populate the periods table with standard school periods"""
|
||||
period_times = {
|
||||
'1': ('09:00', '09:40'),
|
||||
'2': ('10:00', '10:40'),
|
||||
'3': ('11:00', '11:40'),
|
||||
'4': ('11:50', '12:30'),
|
||||
'5': ('12:40', '13:20'),
|
||||
'6': ('13:30', '14:10'),
|
||||
'7': ('14:20', '15:00'),
|
||||
'8': ('15:20', '16:00'),
|
||||
'9': ('16:15', '16:55'),
|
||||
'10': ('17:05', '17:45'),
|
||||
'11': ('17:55', '18:35'),
|
||||
'12': ('18:45', '19:20'),
|
||||
'13': ('19:20', '20:00')
|
||||
}
|
||||
|
||||
for period_num, (start_time, end_time) in period_times.items():
|
||||
self.cursor.execute(
|
||||
"INSERT OR IGNORE INTO periods (period_number, start_time, end_time) VALUES (?, ?, ?)",
|
||||
(int(period_num), start_time, end_time)
|
||||
)
|
||||
|
||||
# Add days of the week
|
||||
days_of_week = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
|
||||
for day in days_of_week:
|
||||
self.cursor.execute("INSERT OR IGNORE INTO days (name) VALUES (?)", (day,))
|
||||
|
||||
self.conn.commit()
|
||||
|
||||
def update_database_from_csv(self, auto_update=True):
|
||||
"""Automatically update database from specific CSV files in the sample_data directory"""
|
||||
sample_data_dir = "sample_data"
|
||||
|
||||
if not os.path.exists(sample_data_dir):
|
||||
print(f"Directory '{sample_data_dir}' not found.")
|
||||
return
|
||||
|
||||
# Get all CSV files and filter out the schedule template and sheet files
|
||||
all_csv_files = [f for f in os.listdir(sample_data_dir) if f.endswith('.csv')]
|
||||
|
||||
# Keep only the actual student distribution files (not the sheets)
|
||||
csv_files = []
|
||||
for filename in all_csv_files:
|
||||
if 'first_sheet' not in filename and 'last_sheet' not in filename and 'template' not in filename:
|
||||
csv_files.append(filename)
|
||||
|
||||
if not csv_files:
|
||||
print(f"No student data CSV files found in '{sample_data_dir}' directory.")
|
||||
return
|
||||
|
||||
print(f"Found {len(csv_files)} student data CSV file(s):")
|
||||
for i, filename in enumerate(csv_files, 1):
|
||||
print(f" {i}. {filename}")
|
||||
|
||||
if auto_update:
|
||||
print("\nAuto-updating database with all student data CSV files...")
|
||||
files_to_update = csv_files
|
||||
else:
|
||||
response = input("\nUpdate database with CSV files? (yes/no): ").lower()
|
||||
|
||||
if response not in ['yes', 'y', 'да']:
|
||||
print("Skipping database update.")
|
||||
return
|
||||
|
||||
print(f"\n0. Update all files")
|
||||
|
||||
try:
|
||||
selection = input(f"\nSelect file(s) to update (0 for all, or comma-separated numbers like 1,2,3): ")
|
||||
|
||||
if selection.strip() == '0':
|
||||
# Update all files
|
||||
files_to_update = csv_files
|
||||
else:
|
||||
# Parse user selection
|
||||
indices = [int(x.strip()) - 1 for x in selection.split(',')]
|
||||
files_to_update = [csv_files[i] for i in indices if 0 <= i < len(csv_files)]
|
||||
|
||||
if not files_to_update:
|
||||
print("No valid selections made.")
|
||||
return
|
||||
except ValueError:
|
||||
print("Invalid input. Please enter numbers separated by commas or '0' for all files.")
|
||||
return
|
||||
|
||||
# Populate the periods and days tables first
|
||||
self.populate_periods_table()
|
||||
|
||||
print(f"\nUpdating database with {len(files_to_update)} file(s):")
|
||||
for filename in files_to_update:
|
||||
print(f" - {filename}")
|
||||
|
||||
csv_path = os.path.join(sample_data_dir, filename)
|
||||
print(f"Processing {csv_path}...")
|
||||
|
||||
self.process_csv_with_teacher_mapping(csv_path)
|
||||
|
||||
print("Database updated successfully with selected CSV data.")
|
||||
|
||||
    def process_csv_with_teacher_mapping(self, csv_file):
        """Process a CSV with teacher-subject mapping based on positional order"""
        if not os.path.exists(csv_file):
            return False

        with open(csv_file, 'r', encoding='utf-8') as file:
            reader = csv.reader(file)
            rows = list(reader)

        # Identify the header row - look for the row containing "ФИО" (full name) or similar indicators
        header_idx = None
        for i, row in enumerate(rows):
            for cell in row:
                if "ФИО" in str(cell) or "фио" in str(cell).lower() or "Ф.И.О." in str(cell) or "ф.и.о." in str(cell):
                    header_idx = i
                    break
            if header_idx is not None:
                break

        if header_idx is None:
            # Check whether this file contains class and name columns that identify it as a student data file.
            # Even if the header doesn't contain ФИО, we might still be able to identify student data.
            has_class_indicators = any(
                any(indicator in str(cell).lower() for cell in row for indicator in ['класс', 'class'])
                for row in rows[:min(len(rows), 10)]  # Check the first 10 rows
            )

            has_name_indicators = any(
                any(indicator in str(cell).lower() for cell in row for indicator in ['имя', 'name', 'фамилия', 'surname'])
                for row in rows[:min(len(rows), 10)]  # Check the first 10 rows
            )

            if has_class_indicators and has_name_indicators:
                # Try to find the header row by looking for class and name indicators
                for i, row in enumerate(rows):
                    if any(indicator in str(cell).lower() for cell in row for indicator in ['класс', 'class']) and \
                       any(indicator in str(cell).lower() for cell in row for indicator in ['имя', 'name', 'фамилия', 'surname']):
                        header_idx = i
                        break

        if header_idx is None:
            print(f"Skipping {csv_file} - does not appear to be student data with ФИО/class columns")
            return False

        # Find teacher-subject mappings in up to 15 rows before the header
        teacher_subject_map = {}

        # Build a mapping of subject names in the header row
        header_row = rows[header_idx]
        header_subjects = {}
        for col_idx, subject_name in enumerate(header_row):
            subject_name = str(subject_name).strip()
            if (subject_name and
                    subject_name.lower() not in ['ф.и.о.', 'фио', 'класс', 'номер', 'сортировка', 'шкафчика', 'локера'] and
                    subject_name.strip() != "" and
                    "ф.и.о" not in subject_name.lower() and
                    "сортировка" not in subject_name.lower() and
                    "номер" not in subject_name.lower() and
                    "№" not in subject_name):
                header_subjects[col_idx] = subject_name  # Map column index to subject name

        # First, try to find teachers in the rows before the header
        for i in range(min(15, header_idx)):
            current_row = rows[i]

            # Process all cells in the row to find teacher names and their adjacent context
            for j, cell_value in enumerate(current_row):
                cell_str = str(cell_value).strip()

                # Check if this cell is a likely teacher name
                if self._is_likely_teacher_name(cell_str):
                    # Look for context on the left (department) and right (subject)
                    left_context = ""
                    right_context = ""

                    # Get the left neighbor (department)
                    if j > 0 and j - 1 < len(current_row):
                        left_context = str(current_row[j - 1]).strip()

                    # Get the right neighbor (subject)
                    if j < len(current_row) - 1:
                        right_context = str(current_row[j + 1]).strip()

                    # Try to determine the subject based on adjacency
                    matched_subject = None

                    # First priority: the right neighbor, if it matches a subject in the header
                    if right_context and j + 1 in header_subjects:
                        matched_subject = header_subjects[j + 1]
                    # Second priority: use the left context if it semantically relates to a teacher
                    elif left_context and any(keyword in left_context.lower() for keyword in ['учитель', 'teacher', 'кафедра', 'department']):
                        # If the left context indicates a department, look for a subject to the right of the teacher
                        if j + 1 in header_subjects:
                            matched_subject = header_subjects[j + 1]
                        # If there is no subject to the right, try to map by position
                        elif j in header_subjects:
                            matched_subject = header_subjects[j]
                    # Third priority: map by position
                    elif j in header_subjects:
                        matched_subject = header_subjects[j]

                    # Only add if we don't have a better teacher name for this subject yet
                    if matched_subject and (matched_subject not in teacher_subject_map or
                                            'Default Teacher for' in teacher_subject_map.get(matched_subject, '')):
                        teacher_subject_map[matched_subject] = cell_str

                # If the cell contains multiple names (separated by newlines), process each separately
                elif '\n' in cell_str or '\\n' in cell_str:
                    cell_parts = [part.strip() for part in cell_str.replace('\\n', '\n').split('\n') if part.strip()]
                    for part in cell_parts:
                        if self._is_likely_teacher_name(part):
                            # Look for context on the left (department) and right (subject)
                            left_context = ""
                            right_context = ""

                            if j > 0 and j - 1 < len(current_row):
                                left_context = str(current_row[j - 1]).strip()

                            if j < len(current_row) - 1:
                                right_context = str(current_row[j + 1]).strip()

                            # Try to determine the subject based on adjacency (same priorities as above)
                            matched_subject = None
                            if right_context and j + 1 in header_subjects:
                                matched_subject = header_subjects[j + 1]
                            elif left_context and any(keyword in left_context.lower() for keyword in ['учитель', 'teacher', 'кафедра', 'department']):
                                if j + 1 in header_subjects:
                                    matched_subject = header_subjects[j + 1]
                                elif j in header_subjects:
                                    matched_subject = header_subjects[j]
                            elif j in header_subjects:
                                matched_subject = header_subjects[j]

                            # Only add if we don't have a better teacher name for this subject yet
                            if matched_subject and (matched_subject not in teacher_subject_map or
                                                    'Default Teacher for' in teacher_subject_map.get(matched_subject, '')):
                                teacher_subject_map[matched_subject] = part

        # Additional teacher-subject mapping: scan the rows immediately before the header for teacher names in subject columns.
        # In many CSV files, teacher names appear in the same rows as the subject headers.
        for i in range(max(0, header_idx - 5), header_idx):  # Check the 5 rows before the header
            current_row = rows[i]
            for j, cell_value in enumerate(current_row):
                cell_str = str(cell_value).strip()

                # If the cell contains a likely teacher name and corresponds to a subject column
                if self._is_likely_teacher_name(cell_str) and j in header_subjects:
                    subject_name = header_subjects[j]
                    # Only add if we don't have a better teacher name for this subject yet
                    if (subject_name not in teacher_subject_map or
                            'Default Teacher for' in teacher_subject_map.get(subject_name, '')):
                        teacher_subject_map[subject_name] = cell_str

        # Additional validation: remove any teacher-subject mappings that seem incorrect
        validated_teacher_subject_map = {}
        for subject, teacher in teacher_subject_map.items():
            # Only add to the validated map if the teacher name passes all checks
            if self._is_likely_teacher_name(teacher):
                validated_teacher_subject_map[subject] = teacher
            else:
                print(f"Warning: Invalid teacher name '{teacher}' detected for subject '{subject}', skipping...")

        teacher_subject_map = validated_teacher_subject_map

        # Process each student row
        for student_row in rows[header_idx + 1:]:
            # Determine the structure dynamically based on the header
            class_col_idx = None
            name_col_idx = None

            # Find the index of the class column (usually called "Класс")
            for idx, header in enumerate(header_row):
                if "Класс" in str(header) or "класс" in str(header) or "Class" in str(header) or "class" in str(header):
                    class_col_idx = idx
                    break

            # Find the index of the name column (usually called "ФИО")
            for idx, header in enumerate(header_row):
                if "ФИО" in str(header) or "ф.и.о." in str(header).lower() or "name" in str(header).lower():
                    name_col_idx = idx
                    break

            # If we couldn't find the columns properly, skip this row
            if class_col_idx is None or name_col_idx is None:
                continue

            # Check if this row has valid data in the expected columns
            if (len(student_row) > max(class_col_idx, name_col_idx) and
                    student_row[class_col_idx].strip() and  # class name exists
                    student_row[name_col_idx].strip() and  # student name exists
                    self._is_valid_student_record_by_cols(student_row, class_col_idx, name_col_idx)):

                name = student_row[name_col_idx].strip()  # Name column
                class_name = student_row[class_col_idx].strip()  # Class column

                # Insert the student into the database
                self.cursor.execute(
                    "INSERT OR IGNORE INTO students (class_name, full_name) VALUES (?, ?)",
                    (class_name, name)
                )

                # Get the student_id for this student
                self.cursor.execute("SELECT student_id FROM students WHERE full_name = ? AND class_name = ?", (name, class_name))
                student_id_result = self.cursor.fetchone()
                if student_id_result is None:
                    continue
                student_id = student_id_result[0]

                # Process schedule data for this student:
                # go through each column to find subject and group info
                for col_idx, cell_value in enumerate(student_row):
                    if cell_value and col_idx < len(header_row):
                        # Get the subject from the header
                        subject_header = header_row[col_idx] if col_idx < len(header_row) else ""

                        # Skip columns that don't contain schedule information
                        if (col_idx == 0 or col_idx == 1 or col_idx == 2 or col_idx == class_col_idx or col_idx == name_col_idx or  # skip metadata columns
                                "сортировка" in subject_header.lower() or
                                "номер" in subject_header.lower() or
                                "шкафчика" in subject_header.lower() or
                                "локера" in subject_header.lower()):
                            continue

                        # Extract the group information from the cell
                        group_assignment = cell_value.strip()

                        if group_assignment and group_assignment.lower() != "nan" and group_assignment != "-" and group_assignment != "":
                            # Find the teacher associated with this subject
                            subject_name = str(subject_header).strip()
                            teacher_name = teacher_subject_map.get(subject_name, f"Default Teacher for {subject_name}")

                            # Insert the entities into their respective tables first,
                            # then get their IDs to create the schedule entry
                            self._process_schedule_entry_with_teacher_mapping(
                                student_id, group_assignment, subject_name, teacher_name
                            )

        self.conn.commit()
        return True

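    # Example (hypothetical layout, for illustration): if a pre-header row contains
    # "Иванов А.А." in the same column where the header row says "Математика",
    # the positional mapping above produces
    #     teacher_subject_map == {"Математика": "Иванов А.А."}
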
    def _is_valid_student_record_by_cols(self, row, class_col_idx, name_col_idx):
        """Check if a row represents a valid student record based on specific columns"""
        # A valid student record should have:
        # - a non-empty class name in the class column
        # - a non-empty student name in the name column

        if len(row) <= max(class_col_idx, name_col_idx):
            return False

        class_name = row[class_col_idx].strip() if len(row) > class_col_idx else ""
        student_name = row[name_col_idx].strip() if len(row) > name_col_idx else ""

        # Check if the class name looks like an actual class (a number followed by a letter)
        class_pattern = r'^\d+[А-ЯA-Z]$'  # e.g., 6А, 11А, 4B
        if re.match(class_pattern, class_name):
            return bool(student_name and student_name != class_name)  # Ensure the name exists and differs from the class

        # If the class pattern doesn't match, check that the name field is not just another class-like value
        name_pattern = r'^\d+[А-ЯA-Z]$'  # This would indicate it is probably a class, not a name
        if re.match(name_pattern, student_name):
            return False  # This row has a class in the name field, so it is not valid

        return bool(class_name and student_name and class_name != student_name)

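    # Example (hypothetical rows, class_col_idx=0, name_col_idx=1):
    #   ["6А", "Петров Иван"] -> True  (class matches the pattern, name differs)
    #   ["список", "6Б"]      -> False (the name field itself looks like a class)
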
    def _process_schedule_entry_with_teacher_mapping(self, student_id, group_info, subject_info, teacher_name):
        """Process individual schedule entries with explicit teacher mapping and insert into normalized tables"""
        # Clean up the inputs
        subject_name = subject_info.strip() if subject_info.strip() else "General Class"
        group_assignment = group_info.strip()

        # Only proceed if we have valid data
        if subject_name and group_assignment and group_assignment.lower() != "nan" and group_assignment != "-" and group_assignment != "":
            # Insert the subject if it does not exist and get its ID
            self.cursor.execute("INSERT OR IGNORE INTO subjects (name) VALUES (?)", (subject_name,))
            self.cursor.execute("SELECT subject_id FROM subjects WHERE name = ?", (subject_name,))
            subject_id = self.cursor.fetchone()[0]

            # Insert the teacher if it does not exist and get its ID.
            # Use the teacher name as is, without default creation if not found.
            self.cursor.execute("INSERT OR IGNORE INTO teachers (name) VALUES (?)", (teacher_name,))
            self.cursor.execute("SELECT teacher_id FROM teachers WHERE name = ?", (teacher_name,))
            teacher_result = self.cursor.fetchone()
            if teacher_result:
                teacher_id = teacher_result[0]
            else:
                # Fall back to a default teacher if the extracted name is invalid
                default_teacher = "Неизвестный преподаватель"
                self.cursor.execute("INSERT OR IGNORE INTO teachers (name) VALUES (?)", (default_teacher,))
                self.cursor.execute("SELECT teacher_id FROM teachers WHERE name = ?", (default_teacher,))
                teacher_id = self.cursor.fetchone()[0]

            # Use a default day for now (in a real system, we'd extract this from the schedule).
            # For now, randomly assign a day of the week.
            import random
            days_list = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
            selected_day = random.choice(days_list)
            self.cursor.execute("INSERT OR IGNORE INTO days (name) VALUES (?)", (selected_day,))
            self.cursor.execute("SELECT day_id FROM days WHERE name = ?", (selected_day,))
            day_id = self.cursor.fetchone()[0]

            # Use a default period - for now we use period 1, but in a real system
            # we would need to extract this from the CSV if available
            self.cursor.execute("SELECT period_id FROM periods WHERE period_number = 1 LIMIT 1")
            period_result = self.cursor.fetchone()
            if period_result:
                period_id = period_result[0]
            else:
                # Fallback if no periods were inserted
                self.cursor.execute("SELECT period_id FROM periods LIMIT 1")
                period_id = self.cursor.fetchone()[0]

            # Clean the group name to separate it from student data
            group_name = self._clean_group_name(group_assignment)
            self.cursor.execute("INSERT OR IGNORE INTO groups (name) VALUES (?)", (group_name,))
            self.cursor.execute("SELECT group_id FROM groups WHERE name = ?", (group_name,))
            group_id = self.cursor.fetchone()[0]

            # Insert the schedule entry
            self.cursor.execute("""
                INSERT OR IGNORE INTO schedule (student_id, subject_id, teacher_id, day_id, period_id, group_id)
                VALUES (?, ?, ?, ?, ?, ?)
            """, (student_id, subject_id, teacher_id, day_id, period_id, group_id))

    def _clean_group_name(self, raw_group_data):
        """Extract a clean group name from potentially mixed student/group data"""
        # Remove potential student names from the group data.
        # Group names typically contain numbers, class identifiers, or specific activity names.
        cleaned = raw_group_data.strip()

        # If the group data looks like a class designation, return it as is
        if re.match(r'^\d+[А-ЯA-Z]', cleaned):
            return cleaned

        # If the group data contains common group indicators, return it as is
        group_indicators = ['кл', 'class', 'club', 'track', 'group', 'module', '-']
        if any(indicator in cleaned.lower() for indicator in group_indicators):
            return cleaned

        # If the group data looks like a subject-identifier pattern, return it as is
        subject_indicators = ['ICT', 'English', 'Math', 'Physics', 'Chemistry', 'Biology', 'Science']
        if any(indicator in cleaned for indicator in subject_indicators):
            return cleaned

        # If none of the above conditions match, derive a generic group name.
        # Use a stable checksum rather than the built-in hash(), which is salted
        # per process in Python 3 and would yield different names on every run.
        import zlib
        return f"Group_{zlib.crc32(cleaned.encode('utf-8')) % 10000}"

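    # Example (hypothetical inputs): "6А" is returned unchanged (class pattern),
    # "Robotics Track" is returned unchanged ("track" indicator), while free text
    # such as "по записи" falls through to the checksum branch and always maps
    # to the same generated name, e.g. "Group_1234".
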
    def _is_likely_teacher_name(self, text):
        """Check if the text is likely to be a teacher name"""
        if not text or len(text.strip()) < 5:  # Require a minimum length for a name
            return False

        text = text.strip()

        # Common non-name values that appear in the CSV
        common_non_names = ['-', 'nan', 'нет', 'нету', 'отсутствует', 'учитель', 'teacher', '', 'Е4 Е5', 'E4 E5', 'группа', 'group']
        if text.lower() in common_non_names:
            return False

        # Exclusion patterns for non-teacher entries
        exclusion_patterns = [
            r'^[А-ЯЁ]\d+\s+[А-ЯЁ]\d+$',  # Е4 Е5 pattern (Cyrillic group codes)
            r'^[A-Z]\d+\s+[A-Z]\d+$',  # English group codes
            r'.*[Tt]rack.*',  # Track identifiers
            r'.*[Gg]roup.*',  # Group identifiers
            r'.*\d+[А-ЯA-Z]\d*$',  # Number-letter combinations
            r'^[А-ЯЁA-Z].*\d+',  # Text ending with digits
            r'.*[Cc]lub.*',  # Club identifiers
        ]

        for pattern in exclusion_patterns:
            if re.match(pattern, text, re.IGNORECASE):
                return False

        # Positive patterns for teacher names
        teacher_patterns = [
            r'^[А-ЯЁ][а-яё]+\s+[А-ЯЁ]\.\s*[А-ЯЁ]\.$',  # Иванов А.А.
            r'^[А-ЯЁ]\.\s*[А-ЯЁ]\.\s+[А-ЯЁ][а-яё]+$',  # А.А. Иванов
            r'^[А-ЯЁ][а-яё]+\s+[А-ЯЁ][а-яё]+\s+[А-ЯЁ][а-яё]+$',  # Full name (surname, first name, patronymic)
            r'^[A-Z][a-z]+\s+[A-Z][a-z]+$',  # John Smith
            r'^[A-Z][a-z]+\s+[A-Z]\.\s*[A-Z]\.$',  # Smith J.J.
            r'^[А-ЯЁ][а-яё]+\s+[А-ЯЁ][а-яё]+$',  # Russian names without a patronymic
        ]

        for pattern in teacher_patterns:
            if re.match(pattern, text):
                return True

        # Additional check: the text looks like a proper name (capitalized parts
        # of minimum length) and does not match the exclusion patterns
        name_parts = text.split()
        if len(name_parts) >= 2:
            # At least two parts (first name + last name); check that they start with capital letters
            if all(part[0].isupper() for part in name_parts if len(part) > 1):
                return True

        return False

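    # Examples (hypothetical values): "Иванов А.А." -> True (surname plus initials),
    # "E4 E5" -> False (group-code exclusion pattern),
    # "Speaking club" -> False (club exclusion pattern).
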
    def _is_likely_subject_label(self, text):
        """Check if text is likely a subject label like 'Матем.', 'Информ.', 'Англ.яз', etc."""
        if not text or len(text) < 2:
            return False

        # Common Russian and English names/abbreviations for subjects
        subject_patterns = [
            'Матем.', 'Информ.', 'Англ.яз', 'Русск.яз', 'Физика', 'Химия', 'Биол', 'История',
            'Общество', 'География', 'Литер', 'Физкульт', 'Технотрек', 'Лидерство',
            'Спорт. клуб', 'ОРКСЭ', 'Китайск', 'Немецк', 'Француз', 'Speaking club', 'Maths',
            'ICT', 'Geography', 'Physics', 'Robotics', 'Culinary', 'Science', 'AI Core', 'VR/AR',
            'CyberSafety', 'Business', 'Design', 'Prototype', 'MediaCom',
            'Robotics Track', 'Culinary Track', 'Science Track', 'AI Core Track',
            'VR/AR Track', 'CyberSafety Track', 'Business Track', 'Design Track', 'Prototype Track',
            'MediaCom Track', 'Math', 'Algebra', 'Geometry', 'Calculus', 'Statistics', 'Coding',
            'Programming', 'Algorithm', 'Logic', 'Physical Education', 'PE', 'Sports',
            'Swimming', 'Fitness', 'Gymnastics', 'Climbing', 'Games', 'Art', 'Music', 'Dance',
            'Karate', 'Judo', 'Martial Arts', 'Chess', 'Leadership', 'Entrepreneurship'
        ]

        text_clean = text.strip().lower()
        for pattern in subject_patterns:
            if pattern.lower() in text_clean:
                return True

        # Also check for specific subject names found in the data
        specific_subjects = ['матем.', 'информ.', 'англ.яз', 'русск.яз', 'каб.', 'business', 'maths',
                             'speaking', 'ict', 'geography', 'physics', 'robotics', 'science', 'ai core',
                             'vr/ar', 'cybersafety', 'design', 'prototype', 'mediacom', 'culinary',
                             'physical education', 'pe', 'sports', 'swimming', 'fitness', 'gymnastics',
                             'climbing', 'games', 'art', 'music', 'dance', 'karate', 'chess', 'leadership']
        for subj in specific_subjects:
            if subj in text_clean:
                return True

        return False

    def _find_matching_subject_in_header_from_list(self, subject_label, header_subjects, header_row):
        """Find the matching full subject name in the header based on the label"""
        if not subject_label:
            return None

        # Look for the best match in the header subjects
        subject_label_lower = subject_label.lower().replace('.', '').replace('яз', 'язык')

        # Direct match first
        for col_idx, full_subj in header_subjects:
            if subject_label_lower in full_subj.lower() or full_subj.lower() in subject_label_lower:
                return full_subj

        # If there is no direct match, try partial matching across the whole header row
        for i, header_item in enumerate(header_row):
            if subject_label_lower in str(header_item).lower() or str(header_item).lower() in subject_label_lower:
                return str(header_item).strip()

        # Try more general matching - the label may contain common abbreviations
        for col_idx, full_subj in header_subjects:
            full_lower = full_subj.lower()
            if ('матем' in subject_label_lower and 'матем' in full_lower) or \
               ('информ' in subject_label_lower and 'информ' in full_lower) or \
               ('англ' in subject_label_lower and 'англ' in full_lower) or \
               ('русск' in subject_label_lower and 'русск' in full_lower) or \
               ('физик' in subject_label_lower and 'физик' in full_lower) or \
               ('хим' in subject_label_lower and 'хим' in full_lower) or \
               ('биол' in subject_label_lower and 'биол' in full_lower) or \
               ('истор' in subject_label_lower and 'истор' in full_lower) or \
               ('общ' in subject_label_lower and 'общ' in full_lower) or \
               ('географ' in subject_label_lower and 'географ' in full_lower):
                return full_subj

        return None

    def find_student(self, name_query):
        """Search for students by name"""
        self.cursor.execute("""
            SELECT s.full_name, s.class_name
            FROM students s
            WHERE s.full_name LIKE ?
            LIMIT 10
        """, (f'%{name_query}%',))

        return self.cursor.fetchall()

    def get_current_class(self, student_name, current_day, current_time):
        """Find the student's current class"""
        self.cursor.execute("""
            SELECT sub.name, t.name, p.start_time, p.end_time
            FROM schedule sch
            JOIN students s ON sch.student_id = s.student_id
            JOIN subjects sub ON sch.subject_id = sub.subject_id
            JOIN teachers t ON sch.teacher_id = t.teacher_id
            JOIN days d ON sch.day_id = d.day_id
            JOIN periods p ON sch.period_id = p.period_id
            JOIN groups g ON sch.group_id = g.group_id
            WHERE s.full_name = ?
              AND d.name = ?
              AND p.start_time <= ?
              AND p.end_time >= ?
        """, (student_name, current_day, current_time, current_time))

        return self.cursor.fetchone()

    def close(self):
        """Close the database connection"""
        self.conn.close()


# Main execution - just set up the database
if __name__ == "__main__":
    db = SchoolScheduleDB()
    # Check if the auto-update flag is passed as an argument
    auto_update = len(sys.argv) > 1 and sys.argv[1] == '--auto'
    db.update_database_from_csv(auto_update=auto_update)
    db.close()

134
scheduler_bots/dfd_conversion_guide.md
Normal file
134
scheduler_bots/dfd_conversion_guide.md
Normal file
@@ -0,0 +1,134 @@
# DFD.html to PNG Conversion Guide

## Overview
This document provides instructions for converting the DFD.html file to a PNG image.

## File Information
- **Input file**: `/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/Thesis materials/DFD.html`
- **Expected output**: `DFD.png` in the same directory

## Method 1: Using Command Line Tools

### Option A: Using wkhtmltopdf
1. Install wkhtmltopdf:
```bash
# On macOS
brew install wkhtmltopdf

# On Ubuntu/Debian
sudo apt-get install wkhtmltopdf
```

2. Convert the HTML to PNG (the `wkhtmltoimage` tool ships with wkhtmltopdf):
```bash
wkhtmltoimage --width 1200 --height 800 "/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/Thesis materials/DFD.html" "/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/Thesis materials/DFD.png"
```

### Option B: Using Puppeteer (Node.js)
1. Install Node.js and npm if they are not already installed
2. Install Puppeteer:
```bash
npm install puppeteer
```

3. Create a conversion script:
```javascript
const puppeteer = require('puppeteer');
const fs = require('fs');

(async () => {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();

    // Read the HTML file
    const htmlContent = fs.readFileSync('/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/Thesis materials/DFD.html', 'utf8');

    await page.setContent(htmlContent);

    // Take a screenshot
    await page.screenshot({
        path: '/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/Thesis materials/DFD.png',
        fullPage: true
    });

    await browser.close();
    console.log('Conversion completed!');
})();
```

## Method 2: Using Python Libraries

### Option A: Using Selenium
1. Install the required packages:
```bash
pip install selenium
```

2. Make sure you have ChromeDriver installed

3. Run the following script:
```python
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
import os

# Set up Chrome options
chrome_options = Options()
chrome_options.add_argument("--headless")  # Run in the background
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-dev-shm-usage")

# Initialize the driver
driver = webdriver.Chrome(options=chrome_options)

# Load the HTML file
file_url = "file://" + os.path.abspath("/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/Thesis materials/DFD.html")
driver.get(file_url)

# Set the window size and take a screenshot
driver.set_window_size(1200, 800)
driver.save_screenshot("/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/Thesis materials/DFD.png")

driver.quit()
print("Conversion completed!")
```

### Option B: Using Playwright
1. Install the required packages:
```bash
pip install playwright
playwright install chromium
```

2. Run the following script:
```python
from playwright.sync_api import sync_playwright
import os

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()

    # Load the HTML file
    file_path = os.path.abspath("/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/Thesis materials/DFD.html")
    page.goto(f"file://{file_path}")

    # Set the viewport size and take a screenshot
    page.set_viewport_size({"width": 1200, "height": 800})
    page.screenshot(path="/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/Thesis materials/DFD.png", full_page=True)

    browser.close()
    print("Conversion completed!")
```

## Method 3: Manual Conversion
1. Open the DFD.html file in your web browser
2. Take a screenshot of the page (using Cmd+Shift+4 on macOS or PrtScn on Windows)
3. Crop the screenshot to include only the relevant content
4. Save the image as DFD.png in the Thesis materials directory

## Verification
After conversion, verify that:
- The PNG file exists in the Thesis materials directory
- The image clearly displays the content from the DFD.html file
- The image quality is sufficient for your needs
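
A quick check from the terminal (a minimal sketch, assuming the output path used in the examples above):
```bash
cd "/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/Thesis materials"
ls -lh DFD.png   # the file should exist and be non-empty
file DFD.png     # should report PNG image data with the rendered dimensions
```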
16
scheduler_bots/examine_csv.py
Normal file
16
scheduler_bots/examine_csv.py
Normal file
@@ -0,0 +1,16 @@
import csv
import os

files = [f for f in os.listdir('sample_data') if f.endswith('.csv')]
print('CSV files:', files)
print()

for filename in files:
    print(f"=== Examining {filename} ===")
    with open(f'sample_data/{filename}', 'r', encoding='utf-8') as f:
        reader = csv.reader(f)
        for i, row in enumerate(reader):
            print(f'Row {i}: {row[:10]}')  # Print the first 10 columns
            if i == 5:  # Print only the first 6 rows
                break
    print()
BIN
scheduler_bots/school_schedule.db
Normal file
BIN
scheduler_bots/school_schedule.db
Normal file
Binary file not shown.
BIN
scheduler_bots/school_schedule.db-journal
Normal file
BIN
scheduler_bots/school_schedule.db-journal
Normal file
Binary file not shown.
84
scheduler_bots/simple_combine.py
Normal file
84
scheduler_bots/simple_combine.py
Normal file
@@ -0,0 +1,84 @@
#!/usr/bin/env python
"""
simple_combine.py - Simple script to insert content from two HTML files into the main HTML file
"""

import os
import re


def simple_combine():
    # Define the file paths
    main_file_path = "/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/scheduler_bots/Thesis materials/Thesis_ Intelligent School Schedule Management System.html"
    file1_path = "/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/scheduler_bots/Thesis materials/deepseek_html_20260128_0dc71d.html"
    file2_path = "/Users/home/YandexDisk/TECHNOLYCEUM/ict/Year/2025/ai/ai7/ai7-m3/scheduler_bots/Thesis materials/deepseek_html_20260128_15ee7a.html"

    # Read the main file content
    with open(main_file_path, 'r', encoding='utf-8') as f:
        main_content = f.read()

    # Read the content from the first file
    with open(file1_path, 'r', encoding='utf-8') as f:
        file1_content = f.read()

    # Read the content from the second file
    with open(file2_path, 'r', encoding='utf-8') as f:
        file2_content = f.read()

    # Remove the HTML document structure (doctype, html, head, body tags) from the additional files
    def clean_html_content(content):
        # Remove the doctype
        content = re.sub(r'<!DOCTYPE[^>]*>', '', content, flags=re.IGNORECASE)
        # Remove html tags
        content = re.sub(r'<html[^>]*>|</html>', '', content, flags=re.IGNORECASE)
        # Remove the head section
        content = re.sub(r'<head[^>]*>.*?</head>', '', content, flags=re.DOTALL | re.IGNORECASE)
        # Remove body tags
        content = re.sub(r'<body[^>]*>|</body>', '', content, flags=re.IGNORECASE)
        return content.strip()

    # Clean the content from both files
    clean_file1_content = clean_html_content(file1_content)
    clean_file2_content = clean_html_content(file2_content)

    # Find the closing body tag to insert the additional content before it
    body_close_pos = main_content.rfind('</body>')
    if body_close_pos == -1:
        # If there is no closing body tag, fall back to the closing html tag
        html_close_pos = main_content.rfind('</html>')
        if html_close_pos != -1:
            insert_pos = html_close_pos
        else:
            # If there is no closing html tag either, append at the end
            insert_pos = len(main_content)
    else:
        insert_pos = body_close_pos

    # Prepare the additional content to insert
    additional_content = f'''
<!-- Additional Content from deepseek_html_20260128_0dc71d.html -->
<section class="additional-content" style="margin: 40px 0; padding: 20px; border: 1px solid #ccc; border-radius: 8px;">
    <h2 style="color: #2c3e50; border-bottom: 2px solid #3498db; padding-bottom: 10px;">Additional Content Section 1</h2>
    {clean_file1_content}
</section>

<!-- Additional Content from deepseek_html_20260128_15ee7a.html -->
<section class="additional-content" style="margin: 40px 0; padding: 20px; border: 1px solid #ccc; border-radius: 8px;">
    <h2 style="color: #2c3e50; border-bottom: 2px solid #3498db; padding-bottom: 10px;">Additional Content Section 2</h2>
    {clean_file2_content}
</section>
'''

    # Insert the additional content into the main file
    combined_content = main_content[:insert_pos] + additional_content + main_content[insert_pos:]

    # Write the combined content back to the main file
    with open(main_file_path, 'w', encoding='utf-8') as f:
        f.write(combined_content)

    print("Content from both files has been successfully inserted into the main HTML file.")
    print(f"Updated file: {main_file_path}")


if __name__ == "__main__":
    simple_combine()
@@ -25,8 +25,8 @@ def load_schedule():
    Returns a dictionary with day-wise schedule
    """
    try:
        # Read CSV file
        df = pd.read_csv('schedule.csv')
        # Read CSV file - Updated to use the provided file name
        df = pd.read_csv('schedule_template RS.csv')
        schedule = {}

        # Process each row (each day)
@@ -34,20 +34,27 @@ def load_schedule():
            day = row['Day']
            schedule[day] = []

            # Process each time slot column
            time_slots = ['Period_1', 'Period_2', 'Period_3', 'Period_4', 'Period_5', 'Period_6', 'Period_7']
            # Process each period column - Updated to match the actual CSV structure.
            # The CSV has columns labeled '1 (9:00-9:40)', '2 (10:00-10:40)', etc.
            period_columns = [
                '1 (9:00-9:40)', '2 (10:00-10:40)', '3 (11:00-11:40)', '4 (11:50-12:30)',
                '5 (12:40-13:20)', '6 (13:30-14:10)', '7 (14:20-15:00)', '8 (15:20-16:00)',
                '9 (16:15-16:55)', '10 (17:05-17:45)', '11 (17:55-18:35)', '12 (18:45-19:20)', '13 (19:20-20:00)'
            ]

            for slot in time_slots:
                # Check if a class exists for this time slot
                if pd.notna(row[slot]) and str(row[slot]).strip():
                    class_info = str(row[slot])
                    schedule[day].append((slot, class_info))
            for i, col_name in enumerate(period_columns):
                period_num = str(i + 1)  # '1', '2', '3', etc.

                # Check if the column exists and if a class exists for this time slot
                if col_name in row and pd.notna(row[col_name]) and str(row[col_name]).strip() != '':
                    class_info = str(row[col_name])
                    schedule[day].append((period_num, class_info))  # Store both the period number and the class info

        return schedule

    except FileNotFoundError:
        print("❌ Error: schedule.csv file not found!")
        print("Please create schedule.csv in the same folder")
        print("❌ Error: schedule_template RS.csv file not found!")
        print("Please make sure schedule_template RS.csv is in the same folder")
        return {}
    except Exception as e:
        print(f"❌ Error loading schedule: {e}")
@@ -57,18 +64,26 @@ def load_schedule():

# Load the schedule at startup
SCHEDULE = load_schedule()

# Time mapping for periods
PERIOD_TIMES = {
    'Period_1': ('09:00', '09:40'),
    'Period_2': ('10:00', '10:40'),
    'Period_3': ('11:00', '11:40'),
    'Period_4': ('11:50', '12:30'),
    'Period_5': ('12:40', '13:20'),
    'Period_6': ('13:30', '14:10'),
    'Period_7': ('10:00', '10:40'),

# Map period numbers to times - Updated as requested
period_times = {
    '1': ('09:00', '09:40'),
    '2': ('10:00', '10:40'),
    '3': ('11:00', '11:40'),
    '4': ('11:50', '12:30'),
    '5': ('12:40', '13:20'),
    '6': ('13:30', '14:10'),
    '7': ('14:20', '15:00'),
    '8': ('15:20', '16:00'),
    '9': ('16:15', '16:55'),
    '10': ('17:05', '17:45'),
    '11': ('17:55', '18:35'),
    '12': ('18:45', '19:20'),
    '13': ('19:20', '20:00')
}

# Time mapping for periods - Updated to use the new mapping
PERIOD_TIMES = period_times


async def start(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Send welcome message when command /start is issued."""
@@ -84,7 +99,7 @@ async def start(update: Update, context: ContextTypes.DEFAULT_TYPE):
async def where_am_i(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Tell user where they should be right now."""
    if not SCHEDULE:
        await update.message.reply_text("❌ Schedule not loaded. Check schedule.csv file.")
        await update.message.reply_text("❌ Schedule not loaded. Check schedule_template RS.csv file.")
        return

    now = datetime.datetime.now()
@@ -101,9 +116,9 @@

    # Find the current class
    found_class = False
    for period, class_info in SCHEDULE[current_day]:
        start_time, end_time = PERIOD_TIMES[period]

    for period_num, class_info in SCHEDULE[current_day]:
        start_time, end_time = PERIOD_TIMES[period_num]

        if start_time <= current_time <= end_time:
            await update.message.reply_text(f"🎯 You should be in: {class_info}")
            found_class = True
@@ -114,21 +129,25 @@ async def where_am_i(update: Update, context: ContextTypes.DEFAULT_TYPE):


async def schedule(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Show today's full schedule."""
    """Show the complete weekly schedule."""
    if not SCHEDULE:
        await update.message.reply_text("❌ Schedule not loaded. Check schedule.csv file.")
        await update.message.reply_text("❌ Schedule not loaded. Check schedule_template RS.csv file.")
        return

    current_day = datetime.datetime.now().strftime("%A")

    if current_day not in SCHEDULE or not SCHEDULE[current_day]:
        await update.message.reply_text("😊 No classes scheduled for today!")
        return

    schedule_text = f"📚 {current_day}'s Schedule:\n\n"
    for period, class_info in SCHEDULE[current_day]:
        start, end = PERIOD_TIMES[period]
        schedule_text += f"⏰ {start}-{end}: {class_info}\n"
    schedule_text = "📚 Weekly Schedule:\n\n"

    # Define the standard order of days in a week
    days_of_week = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

    for day in days_of_week:
        if day in SCHEDULE and SCHEDULE[day]:  # The day exists in the schedule and has classes
            schedule_text += f"*{day}'s Schedule:*\n"
            for period_num, class_info in SCHEDULE[day]:
                start, end = PERIOD_TIMES[period_num]
                schedule_text += f"  ⏰ {start}-{end}: {class_info}\n"
            schedule_text += "\n"
        else:
            schedule_text += f"{day}: No classes scheduled\n\n"

    await update.message.reply_text(schedule_text)

@@ -136,7 +155,7 @@ async def schedule(update: Update, context: ContextTypes.DEFAULT_TYPE):
async def tomorrow(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Show tomorrow's schedule."""
    if not SCHEDULE:
        await update.message.reply_text("❌ Schedule not loaded. Check schedule.csv file.")
        await update.message.reply_text("❌ Schedule not loaded. Check schedule_template RS.csv file.")
        return

    tomorrow_date = datetime.datetime.now() + datetime.timedelta(days=1)
@@ -147,8 +166,8 @@ async def tomorrow(update: Update, context: ContextTypes.DEFAULT_TYPE):
        return

    schedule_text = f"📚 {tomorrow_day}'s Schedule:\n\n"
    for period, class_info in SCHEDULE[tomorrow_day]:
        start, end = PERIOD_TIMES[period]
    for period_num, class_info in SCHEDULE[tomorrow_day]:
        start, end = PERIOD_TIMES[period_num]
        schedule_text += f"⏰ {start}-{end}: {class_info}\n"

    await update.message.reply_text(schedule_text)
@@ -160,7 +179,7 @@ async def help_command(update: Update, context: ContextTypes.DEFAULT_TYPE):
        "Available commands:\n"
        "/start - Start the bot\n"
        "/whereami - Find your current class\n"
        "/schedule - Show today's schedule\n"
        "/schedule - Show today's full schedule\n"
        "/tomorrow - Show tomorrow's schedule\n"
        "/help - Show this help message"
    )
@@ -169,8 +188,8 @@ async def help_command(update: Update, context: ContextTypes.DEFAULT_TYPE):
def main():
    """Start the bot."""
    if not SCHEDULE:
        print("❌ Failed to load schedule. Please check schedule.csv file.")
        print("Make sure schedule.csv exists in the same folder")
        print("❌ Failed to load schedule. Please check schedule_template RS.csv file.")
        print("Make sure schedule_template RS.csv exists in the same folder")
        return

    # Create the Application

344
scheduler_bots/telegram_scheduler_v3.py
Normal file
344
scheduler_bots/telegram_scheduler_v3.py
Normal file
@@ -0,0 +1,344 @@
#!/usr/bin/env python
"""
Enhanced Scheduler Bot with SQLite database support
"""

import sqlite3
import datetime
from telegram import Update
from telegram.ext import Application, CommandHandler, ContextTypes, MessageHandler, filters

# 🔑 REPLACE THIS with your bot token from @BotFather
BOT_TOKEN = "8248686383:AAGN5UJ73H9i7LQzIBR3TjuJgUGNTFyRHk8"

# Database setup
DATABASE_NAME = "schedule.db"


def init_db():
    """Initialize the SQLite database and create tables if they don't exist."""
    conn = sqlite3.connect(DATABASE_NAME)
    cursor = conn.cursor()

    # Create the table for schedule entries
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS schedule (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            day TEXT NOT NULL,
            period INTEGER NOT NULL,
            subject TEXT NOT NULL,
            class_name TEXT NOT NULL,
            room TEXT NOT NULL,
            UNIQUE(day, period)
        )
    ''')

    conn.commit()
    conn.close()


def add_schedule_entry(day, period, subject, class_name, room):
    """Add a new schedule entry to the database."""
    conn = sqlite3.connect(DATABASE_NAME)
    cursor = conn.cursor()

    try:
        cursor.execute('''
            INSERT OR REPLACE INTO schedule (day, period, subject, class_name, room)
            VALUES (?, ?, ?, ?, ?)
        ''', (day, period, subject, class_name, room))

        conn.commit()
        conn.close()
        return True
    except sqlite3.Error as e:
        print(f"Database error: {e}")
        conn.close()
        return False


def load_schedule_from_db():
    """Load the schedule from the SQLite database."""
    conn = sqlite3.connect(DATABASE_NAME)
    cursor = conn.cursor()

    cursor.execute("SELECT day, period, subject, class_name, room FROM schedule ORDER BY day, period")
    rows = cursor.fetchall()

    conn.close()

    # Group the rows by day
    schedule = {}
    for day, period, subject, class_name, room in rows:
        if day not in schedule:
            schedule[day] = []

        class_info = f"Subject: {subject} Class: {class_name} Room: {room}"
        schedule[day].append((str(period), class_info))

    return schedule

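# Example of the structure returned by load_schedule_from_db(), assuming a single
# stored row (Monday, period 1, "Maths", "10A", "204") - hypothetical values:
#   {"Monday": [("1", "Subject: Maths Class: 10A Room: 204")]}
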
# Initialize the database
init_db()

# Map period numbers to times - Updated as requested
period_times = {
    '1': ('09:00', '09:40'),
    '2': ('10:00', '10:40'),
    '3': ('11:00', '11:40'),
    '4': ('11:50', '12:30'),
    '5': ('12:40', '13:20'),
    '6': ('13:30', '14:10'),
    '7': ('14:20', '15:00'),
    '8': ('15:20', '16:00'),
    '9': ('16:15', '16:55'),
    '10': ('17:05', '17:45'),
    '11': ('17:55', '18:35'),
    '12': ('18:45', '19:20'),
    '13': ('19:20', '20:00')
}

# User states for tracking conversations
user_states = {}  # Stores each user's conversation state

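# Example lookup (keys are strings because load_schedule_from_db() stores the
# period numbers as str(period)): period_times['3'] -> ('11:00', '11:40')
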
async def start(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Send a welcome message when the /start command is issued."""
    await update.message.reply_text(
        "🤖 Hello! I'm your enhanced class scheduler bot with database support!\n"
        "Use /whereami to find your current class\n"
        "Use /schedule to see today's full schedule\n"
        "Use /tomorrow to see tomorrow's schedule\n"
        "Use /add to add a new class to the schedule\n"
        "Use /help for all commands"
    )


async def where_am_i(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Tell the user where they should be right now."""
    # Reload the schedule from the DB to ensure the latest data
    schedule = load_schedule_from_db()

    if not schedule:
        await update.message.reply_text("❌ Schedule not loaded from database.")
        return

    now = datetime.datetime.now()
    current_time = now.strftime("%H:%M")
    current_day = now.strftime("%A")

    await update.message.reply_text(f"📅 Today is {current_day}")
    await update.message.reply_text(f"⏰ Current time: {current_time}")

    # Check if we have a schedule for today
    if current_day not in schedule:
        await update.message.reply_text("😊 No classes scheduled for today!")
        return

    # Find the current class
    found_class = False
    for period_num, class_info in schedule[current_day]:
        start_time, end_time = period_times[period_num]

        if start_time <= current_time <= end_time:
            await update.message.reply_text(f"🎯 You should be in: {class_info}")
            found_class = True
            break

    if not found_class:
        await update.message.reply_text("😊 No class right now! Free period.")

async def schedule(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Show the complete weekly schedule."""
    # Reload the schedule from the DB to ensure the latest data
    schedule = load_schedule_from_db()

    if not schedule:
        await update.message.reply_text("❌ Schedule not loaded from database.")
        return

    schedule_text = "📚 Weekly Schedule:\n\n"

    # Define the standard order of days in a week
    days_of_week = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

    for day in days_of_week:
        if day in schedule and schedule[day]:  # The day exists in the schedule and has classes
            schedule_text += f"*{day}'s Schedule:*\n"
            for period_num, class_info in schedule[day]:
                start, end = period_times[period_num]
                schedule_text += f"  ⏰ {start}-{end}: {class_info}\n"
            schedule_text += "\n"
        else:
            schedule_text += f"{day}: No classes scheduled\n\n"

    await update.message.reply_text(schedule_text)

async def tomorrow(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Show tomorrow's schedule."""
    # Reload the schedule from the DB to ensure the latest data
    schedule = load_schedule_from_db()

    if not schedule:
        await update.message.reply_text("❌ Schedule not loaded from database.")
        return

    tomorrow_date = datetime.datetime.now() + datetime.timedelta(days=1)
    tomorrow_day = tomorrow_date.strftime("%A")

    if tomorrow_day not in schedule or not schedule[tomorrow_day]:
        await update.message.reply_text(f"😊 No classes scheduled for {tomorrow_day}!")
        return

    schedule_text = f"📚 {tomorrow_day}'s Schedule:\n\n"
    for period_num, class_info in schedule[tomorrow_day]:
        start, end = period_times[period_num]
        schedule_text += f"⏰ {start}-{end}: {class_info}\n"

    await update.message.reply_text(schedule_text)

async def add(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Start the process of adding a new schedule entry."""
    user_id = update.effective_user.id
    user_states[user_id] = {"step": "waiting_day"}

    await update.message.reply_text(
        "📅 Adding a new class to the schedule.\n"
        "Please enter the day of the week (e.g., Monday, Tuesday, etc.):"
    )

async def handle_message(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Handle user messages during the add process."""
    user_id = update.effective_user.id

    if user_id not in user_states:
        # Not in a conversation, ignore
        return

    state_info = user_states[user_id]
    message_text = update.message.text.strip()

    if state_info["step"] == "waiting_day":
        # Validate the day input
        valid_days = ["monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday"]
        if message_text.lower() not in valid_days:
            await update.message.reply_text(
                f"'{message_text}' is not a valid day of the week.\n"
                "Please enter a valid day (e.g., Monday, Tuesday, etc.):"
            )
            return

        state_info["day"] = message_text.capitalize()
        state_info["step"] = "waiting_period"

        await update.message.reply_text(
            f"Got it! Day: {state_info['day']}\n"
            "Now please enter the period number (1-13):"
        )

    elif state_info["step"] == "waiting_period":
        try:
            period = int(message_text)
            if period < 1 or period > 13:
                raise ValueError("Period must be between 1 and 13")

            state_info["period"] = period
            state_info["step"] = "waiting_subject"

            await update.message.reply_text(
                f"Got it! Period: {period}\n"
                "Now please enter the subject name:"
            )
        except ValueError:
            await update.message.reply_text(
                f"'{message_text}' is not a valid period number.\n"
                "Please enter a number between 1 and 13:"
            )

    elif state_info["step"] == "waiting_subject":
        state_info["subject"] = message_text
        state_info["step"] = "waiting_class"

        await update.message.reply_text(
            f"Got it! Subject: {message_text}\n"
            "Now please enter the class name (e.g., 10ABC, 6A/6B, etc.):"
        )

    elif state_info["step"] == "waiting_class":
        state_info["class_name"] = message_text
        state_info["step"] = "waiting_room"

        await update.message.reply_text(
            f"Got it! Class: {message_text}\n"
            "Finally, please enter the room number:"
        )

    elif state_info["step"] == "waiting_room":
        state_info["room"] = message_text

        # Add the entry to the database
        success = add_schedule_entry(
            state_info["day"],
            state_info["period"],
            state_info["subject"],
            state_info["class_name"],
            message_text
        )

        if success:
            await update.message.reply_text(
                f"✅ Successfully added to schedule!\n\n"
                f"Day: {state_info['day']}\n"
                f"Period: {state_info['period']}\n"
                f"Subject: {state_info['subject']}\n"
                f"Class: {state_info['class_name']}\n"
                f"Room: {state_info['room']}"
            )
        else:
            await update.message.reply_text(
                "❌ Failed to add to schedule. Please try again."
            )

        # Clean up the user's conversation state
        del user_states[user_id]

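# Example /add conversation (hypothetical values):
#   user: /add    -> bot asks for the day
#   user: Monday  -> bot asks for the period (1-13)
#   user: 3       -> bot asks for the subject
#   user: Maths   -> bot asks for the class name
#   user: 10A     -> bot asks for the room
#   user: 204     -> the entry is saved via add_schedule_entry(...)
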
async def help_command(update: Update, context: ContextTypes.DEFAULT_TYPE):
    """Send a help message listing all commands."""
    await update.message.reply_text(
        "Available commands:\n"
        "/start - Start the bot\n"
        "/whereami - Find your current class\n"
        "/schedule - Show today's full schedule\n"
        "/tomorrow - Show tomorrow's schedule\n"
        "/add - Add a new class to the schedule\n"
        "/help - Show this help message"
    )


def main():
    """Start the bot."""
    # Create the Application
    application = Application.builder().token(BOT_TOKEN).build()

    # Add command handlers
    application.add_handler(CommandHandler("start", start))
    application.add_handler(CommandHandler("whereami", where_am_i))
    application.add_handler(CommandHandler("schedule", schedule))
    application.add_handler(CommandHandler("tomorrow", tomorrow))
    application.add_handler(CommandHandler("add", add))
    application.add_handler(CommandHandler("help", help_command))

    # Add a message handler for the conversation flow
    application.add_handler(MessageHandler(filters.TEXT & ~filters.COMMAND, handle_message))

    # Start the bot
    print("🤖 Enhanced scheduler bot with database support is running...")
    print("📊 Database initialized successfully!")
    print("Press Ctrl+C to stop the bot")

    application.run_polling()


if __name__ == "__main__":
    main()
71
scheduler_bots/telegram_scheduler_v4.py
Normal file
71
scheduler_bots/telegram_scheduler_v4.py
Normal file
@@ -0,0 +1,71 @@
#!/usr/bin/env python
"""
scheduler.py - Simple school schedule checker
No sample data, just real CSV data
"""

import datetime
from database import SchoolScheduleDB


def main():
    db = SchoolScheduleDB()

    print("🏫 School Schedule Checker")

    # Ask for the student's name
    name_query = input("\nEnter your name (or part of it): ").strip()

    # Search for the student
    students = db.find_student(name_query)

    if not students:
        print("No student found.")
        return

    # Show the students that were found
    print("\nFound students:")
    for i, (full_name, class_name) in enumerate(students, 1):
        print(f"{i}. {full_name} ({class_name})")

    # Let the user select one
    if len(students) > 1:
        choice = input(f"\nSelect student (1-{len(students)}): ")
        try:
            idx = int(choice) - 1
            if 0 <= idx < len(students):
                full_name, class_name = students[idx]
            else:
                print("Invalid choice.")
                return
        except ValueError:
            print("Invalid input.")
            return
    else:
        full_name, class_name = students[0]

    print(f"\n👤 Student: {full_name} ({class_name})")

    # Get the current time
    now = datetime.datetime.now()
    current_day = now.strftime("%A")
    current_time = now.strftime("%H:%M")

    print(f"📅 Today: {current_day}")
    print(f"⏰ Time: {current_time}")

    # Find the current class
    current_class = db.get_current_class(full_name, current_day, current_time)

    if current_class:
        subject, teacher, start, end = current_class
        print("\n🎯 CURRENT CLASS:")
        print(f"📚 {subject}")
        print(f"👨‍🏫 {teacher}")
        print(f"🕐 {start}-{end}")
    else:
        print("\n😊 Free period!")

    db.close()


if __name__ == "__main__":
    main()
105
scheduler_bots/telegram_scheduler_v5.py
Normal file
105
scheduler_bots/telegram_scheduler_v5.py
Normal file
@@ -0,0 +1,105 @@
#!/usr/bin/env python
"""
telegram_scheduler_v5.py - Advanced school schedule checker with homeroom teacher info using database
"""

from database import SchoolScheduleDB


def find_student_location_and_teacher(student_name_query):
    """
    Find where a student should be based on their name, using the database
    """
    # Create database instance
    db = SchoolScheduleDB()

    # Search for student by name
    students = db.find_student(student_name_query)

    if not students:
        print(f"No student found matching '{student_name_query}'")
        db.close()
        return

    # Handle multiple student matches
    if len(students) > 1:
        print(f"\nFound {len(students)} students matching '{student_name_query}':")
        for i, (full_name, class_name) in enumerate(students, 1):
            print(f"{i}. {full_name} ({class_name})")

        try:
            choice = int(input(f"\nSelect student (1-{len(students)}): ")) - 1
            if 0 <= choice < len(students):
                full_name, class_name = students[choice]
            else:
                print("Invalid selection.")
                db.close()
                return
        except ValueError:
            print("Invalid input.")
            db.close()
            return
    else:
        full_name, class_name = students[0]

    # Find the homeroom teacher for this student's class using the database
    homeroom_teacher = db.get_homeroom_teacher(class_name)

    # Get current schedule for the student
    current_schedule = get_current_schedule_for_student_db(db, full_name, class_name)

    # Display the results
    print("\n🔍 STUDENT INFORMATION:")
    print(f"👤 Student: {full_name}")
    print(f"🎒 Class: {class_name}")

    if current_schedule:
        print("\n📋 TODAY'S SCHEDULE:")
        for period_info in current_schedule:
            subject, teacher, start_time, end_time, room_or_group = period_info
            print(f"  {start_time}-{end_time} | 📚 {subject} | 👨‍🏫 {teacher} | 🚪 {room_or_group}")
    else:
        print("\n😊 No scheduled classes for today!")

    if homeroom_teacher:
        print("\n🏫 HOMEROOM TEACHER INFORMATION:")
        print(f"👨‍🏫 {homeroom_teacher['name']}")
        print(f"📞 Internal Number: {homeroom_teacher['internal_number']}")
        if homeroom_teacher['mobile_number']:
            print(f"📱 Mobile: {homeroom_teacher['mobile_number']}")
        print(f"🏢 Classroom: {homeroom_teacher['classroom']}")
        print(f"🏛️ Parent Meeting Room: {homeroom_teacher['parent_meeting_room']}")
    else:
        print(f"\n❌ Could not find homeroom teacher for class {class_name}")

    db.close()


def get_current_schedule_for_student_db(db, student_name, class_name):
    """
    Get the full schedule for a student for the current day from the database
    """
    # Delegates to the database layer; class_name is accepted for interface
    # symmetry, but the lookup itself is keyed on the student's name
    return db.get_student_schedule(student_name)


def main():
    print("🏫 Advanced School Schedule Checker (Database Version)")
    print("🔍 Find where a student should be and their homeroom teacher info")

    # Ask for student name
    name_query = input("\nEnter student name (or part of it): ").strip()

    if not name_query:
        print("Please enter a valid name.")
        return

    find_student_location_and_teacher(name_query)


if __name__ == "__main__":
    main()
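The v5 script additionally assumes get_homeroom_teacher returns a dict (or None). The key names below are exactly the ones the script reads; the homeroom_teachers table and its columns are assumptions, so this is only a sketch of how such a method might look on SchoolScheduleDB.

def get_homeroom_teacher(self, class_name):
    # Sketch: returns a dict with the keys consumed by telegram_scheduler_v5.py,
    # or None when the class has no homeroom teacher record.
    # Table and column names here are illustrative assumptions.
    cur = self.conn.execute(
        """SELECT name, internal_number, mobile_number, classroom, parent_meeting_room
           FROM homeroom_teachers WHERE class_name = ?""",
        (class_name,),
    )
    row = cur.fetchone()
    if row is None:
        return None
    keys = ("name", "internal_number", "mobile_number",
            "classroom", "parent_meeting_room")
    return dict(zip(keys, row))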
232
scheduler_bots/verify_db.py
Normal file
@@ -0,0 +1,232 @@
#!/usr/bin/env python
"""
verify_db.py - Verification script for the school schedule database
Checks data quality in the teachers, groups, and students tables
"""

import sqlite3
import re


def connect_db(db_name='school_schedule.db'):
    """Connect to the database"""
    conn = sqlite3.connect(db_name)
    cursor = conn.cursor()
    return conn, cursor


def check_teachers_table(cursor):
    """Check the teachers table for data quality issues"""
    print("Checking teachers table...")

    cursor.execute("SELECT COUNT(*) FROM teachers")
    total_count = cursor.fetchone()[0]
    print(f"Total teachers: {total_count}")

    # Find teachers with default names
    cursor.execute("SELECT name FROM teachers WHERE name LIKE '%Default Teacher%' OR name LIKE '%Неизвестный%'")
    default_teachers = cursor.fetchall()
    print(f"Teachers with default names: {len(default_teachers)}")
    for teacher in default_teachers:
        print(f"  - {teacher[0]}")

    # Find potentially invalid teacher names
    invalid_teachers = []
    cursor.execute("SELECT name FROM teachers")
    all_teachers = cursor.fetchall()
    for (teacher_name,) in all_teachers:
        if not is_valid_teacher_name(teacher_name):
            invalid_teachers.append(teacher_name)

    print(f"Potentially invalid teacher names: {len(invalid_teachers)}")
    for teacher in invalid_teachers:
        print(f"  - {teacher}")

    print()


def is_valid_teacher_name(name):
    """Check if a name looks like a valid teacher name"""
    # Default names are intentional placeholders, so treat them as valid
    if 'Default Teacher' in name or 'Неизвестный' in name:
        return True

    # Check for common invalid patterns (matched case-insensitively below)
    invalid_patterns = [
        r'^\d+[А-ЯA-Z]$',                  # Class pattern like "8А", "11B"
        r'^[А-ЯЁA-Z]\d+\s+[А-ЯЁA-Z]\d+$',  # "E4 E5" pattern
        r'.*group.*',                      # Group identifiers
        r'.*track.*',                      # Track identifiers
        r'^[А-ЯЁA-Z]\d+$',                 # Single group identifiers like "E4"
        r'.*club.*',                       # Club identifiers
    ]

    for pattern in invalid_patterns:
        if re.match(pattern, name, re.IGNORECASE):
            return False

    # Valid teacher name patterns
    valid_patterns = [
        r'^[А-ЯЁ][а-яё]+\s+[А-ЯЁ][а-яё]+',  # Russian names
        r'^[A-Z][a-z]+\s+[A-Z][a-z]+',      # English names
        r'^[А-ЯЁ][а-яё]+\s+[А-ЯЁ]\.',       # Name with initial (Russian)
        r'^[A-Z][a-z]+\s+[A-Z]\.',          # Name with initial (English)
    ]

    for pattern in valid_patterns:
        if re.match(pattern, name):
            return True

    # Fallback: a reasonably long multi-word string whose parts are capitalized
    parts = name.split()
    if len(parts) >= 2 and len(name) >= 5:
        if all(len(part) > 0 and part[0].isupper() for part in parts):
            return True

    return False


def check_groups_table(cursor):
    """Check the groups table for data quality issues"""
    print("Checking groups table...")

    cursor.execute("SELECT COUNT(*) FROM groups")
    total_count = cursor.fetchone()[0]
    print(f"Total groups: {total_count}")

    # Get all group names
    cursor.execute("SELECT name FROM groups")
    all_groups = cursor.fetchall()

    # Check for potential student names in group names
    potential_student_names = []
    for (group_name,) in all_groups:
        if looks_like_student_name(group_name):
            potential_student_names.append(group_name)

    print(f"Groups that look like student names: {len(potential_student_names)}")
    for group in potential_student_names[:10]:  # Show first 10
        print(f"  - {group}")

    print()


def looks_like_student_name(name):
    """Check if a name looks like a student name instead of a group"""
    # Class patterns like "8А", "11B" are OK as groups
    class_pattern = r'^\d+[А-ЯA-Z]$'
    if re.match(class_pattern, name):
        return False

    # Student names typically follow name patterns
    name_pattern = r'^[А-ЯЁ][а-яё]+\s+[А-ЯЁ][а-яё]+'  # Russian name
    if re.match(name_pattern, name):
        return True

    name_pattern = r'^[A-Z][a-z]+\s+[A-Z][a-z]+'  # English name
    if re.match(name_pattern, name):
        return True

    # Anything else, including names carrying group indicators such as
    # 'club', 'track', 'group', 'module', '-', or 'class', is treated as a group
    return False


def check_students_table(cursor):
    """Check the students table"""
    print("Checking students table...")

    cursor.execute("SELECT COUNT(*) FROM students")
    total_count = cursor.fetchone()[0]
    print(f"Total students: {total_count}")

    # Get sample students
    cursor.execute("SELECT full_name, class_name FROM students LIMIT 5")
    samples = cursor.fetchall()
    print("Sample students:")
    for student in samples:
        print(f"  - {student[0]} (Class: {student[1]})")

    print()


def check_schedule_integrity(cursor):
    """Check the schedule table for data consistency"""
    print("Checking schedule table integrity...")

    # Count total schedule entries
    cursor.execute("SELECT COUNT(*) FROM schedule")
    total_schedules = cursor.fetchone()[0]
    print(f"Total schedule entries: {total_schedules}")

    # Count entries with valid relationships
    cursor.execute("""
        SELECT COUNT(*)
        FROM schedule s
        JOIN students st ON s.student_id = st.student_id
        JOIN subjects su ON s.subject_id = su.subject_id
        JOIN teachers t ON s.teacher_id = t.teacher_id
        JOIN groups g ON s.group_id = g.group_id
    """)
    valid_relationships = cursor.fetchone()[0]
    print(f"Schedules with valid relationships: {valid_relationships}")

    # Check for orphaned records
    print("Checking for orphaned records...")

    # Schedule rows referencing a student missing from the students table
    cursor.execute("""
        SELECT COUNT(*) FROM schedule s
        LEFT JOIN students st ON s.student_id = st.student_id
        WHERE st.student_id IS NULL
    """)
    orphaned_students = cursor.fetchone()[0]
    print(f"Orphaned student references: {orphaned_students}")

    # Schedule rows referencing a subject missing from the subjects table
    cursor.execute("""
        SELECT COUNT(*) FROM schedule s
        LEFT JOIN subjects su ON s.subject_id = su.subject_id
        WHERE su.subject_id IS NULL
    """)
    orphaned_subjects = cursor.fetchone()[0]
    print(f"Orphaned subject references: {orphaned_subjects}")

    # Schedule rows referencing a teacher missing from the teachers table
    cursor.execute("""
        SELECT COUNT(*) FROM schedule s
        LEFT JOIN teachers t ON s.teacher_id = t.teacher_id
        WHERE t.teacher_id IS NULL
    """)
    orphaned_teachers = cursor.fetchone()[0]
    print(f"Orphaned teacher references: {orphaned_teachers}")

    # Schedule rows referencing a group missing from the groups table
    cursor.execute("""
        SELECT COUNT(*) FROM schedule s
        LEFT JOIN groups g ON s.group_id = g.group_id
        WHERE g.group_id IS NULL
    """)
    orphaned_groups = cursor.fetchone()[0]
    print(f"Orphaned group references: {orphaned_groups}")

    print()


def main():
    """Main function to run all checks"""
    print("School Schedule Database Verification")
    print("=" * 40)

    try:
        conn, cursor = connect_db()

        check_teachers_table(cursor)
        check_groups_table(cursor)
        check_students_table(cursor)
        check_schedule_integrity(cursor)

        conn.close()
        print("Verification complete!")

    except Exception as e:
        print(f"Error during verification: {e}")


if __name__ == "__main__":
    main()
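verify_db.py is intended to be run directly (python verify_db.py), but since connect_db takes the database path as a parameter, the same checks can be pointed at another copy of the data, for instance a pre-migration backup. A brief sketch, with the backup filename purely hypothetical:

from verify_db import connect_db, check_teachers_table, check_schedule_integrity

# Hypothetical backup copy of the database; any SQLite file with the same schema works
conn, cursor = connect_db('school_schedule_backup.db')
check_teachers_table(cursor)
check_schedule_integrity(cursor)
conn.close()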