PivotRL is currently in progress. A full write-up will be published once experiments stabilize.


Work in progress on distillation-aware reinforcement learning workflows for compact policy transfer.

PivotRL


# How I turned competitive Pokemon into an RL environment for LLMs

## Why I picked Pokemon

Most LLM environments are either too toy-like or too clean. The model gets a neatly packaged state, the reward is obvious, and there is no real adversary trying to exploit mistakes. I wanted something harder: hidden information, delayed rewards, legal action constraints, and another agent actively pushing back.

Competitive Pokemon turned out to be a much better fit than I expected.

On the surface, Pokemon does not sound like a serious benchmark for language-model reasoning. But competitive Pokemon is really about uncertainty, tempo, resource preservation, and long-term consequences. Rock-paper-scissors gives a tiny example of cyclic matchups where the right action depends on what the other side is likely to do. Pokemon scales that idea up dramatically: type matchups, switching, partial information, setup turns, status effects, and positions where the move that looks best right now is exactly the move that loses the game later.

That makes it a rich environment for long-horizon planning and situational awareness.

## Turning Showdown into an environment

WolfeClick wraps Pokemon Showdown as an OpenEnv-compatible environment. Instead of treating the simulator like an external tool that an LLM pokes through brittle prompting, I wanted a proper RL loop:

1. The environment emits an observation.
2. The model chooses one action.
3. The simulator advances one step.
4. The environment returns the next observation and reward.

Under the hood:

- **Showdown** is the battle engine.
- **poke-env** manages interaction with Showdown.
- **WolfeClick environment wrapper** transforms battle state into model-usable observations, enforces legality, tracks revealed opponent information, and computes per-step rewards.

This makes the system feel less like a one-off demo and more like a reusable training environment.
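The four-step loop above can be sketched with a toy stand-in environment. Everything in this block is hypothetical: `ToyBattleEnv`, `StepResult`, the `reset`/`step` method names, and the placeholder rewards are illustrative only and are not the WolfeClick, OpenEnv, or poke-env API.

```python
import random
from dataclasses import dataclass


@dataclass
class StepResult:
    observation: str
    reward: float
    done: bool
    legal_actions: list


class ToyBattleEnv:
    """Hypothetical stand-in for a battle environment (not the real wrapper)."""

    def __init__(self, max_turns=5, seed=0):
        self.max_turns = max_turns
        self.rng = random.Random(seed)
        self.turn = 0

    def _legal_actions(self):
        return ["move:Tackle", "move:Protect", "switch:Bench"]

    def reset(self):
        self.turn = 0
        return StepResult("turn 0", 0.0, False, self._legal_actions())

    def step(self, action):
        if action not in self._legal_actions():
            # Assumption: an illegal action ends the toy episode with a large penalty
            return StepResult(f"turn {self.turn}", -10.0, True, [])
        self.turn += 1
        reward = self.rng.choice([-1.0, 0.0, 1.0])  # placeholder per-step reward
        done = self.turn >= self.max_turns
        return StepResult(f"turn {self.turn}", reward, done, self._legal_actions())


def run_episode(env, policy):
    """Observation -> action -> simulator step -> next observation and reward."""
    result = env.reset()
    trajectory = []
    while not result.done:
        action = policy(result.observation, result.legal_actions)
        next_result = env.step(action)
        trajectory.append((result.observation, action, next_result.reward))
        result = next_result
    return trajectory
```

Calling `run_episode(ToyBattleEnv(), lambda obs, legal: legal[0])` returns one `(observation, action, reward)` tuple per turn, which is the shape of data a trajectory recorder would need.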

## What the model sees

One of the key design choices was deciding what information the model should receive. I did not want to leak hidden state, but I also did not want observations so sparse they became unusable.

The final state representation is a structured text view containing:

- Active field state
- Full information for the model's own team
- Opponent information revealed so far
- Exact legal actions on the current turn

In practice, this includes active Pokemon, HP, status, item/ability if known, available moves, switch options, and an evolving memory of opponent reveals. That memory matters because Pokemon is partially observable: good play depends on updating beliefs from revealed evidence, not only reacting to the current frame.

The legal-action list also makes the task crisp. The model is not asked to write an essay. It is asked to choose one valid decision.

## Action format

To constrain the action space, the model outputs exactly one JSON object.

Early on, I learned this could not be left to prompting alone, so I did a short SFT warmup to make schema adherence reliable. Once format stability improved, RL could focus on strategy rather than malformed outputs.

```json
{"action": "move" | "switch", "choice": "Exact Name of Move or Pokemon"}
```

This matters for two reasons:

1. It forces concrete decisions instead of vague reasoning.
2. It makes validation immediate for malformed, hallucinated, or illegal actions.

That turns the setup into a verifiable reward problem.
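A minimal sketch of that validation step, assuming the legal moves and switches for the turn are available as lists of names. The function name `parse_action` and its arguments are illustrative, not taken from the WolfeClick code.

```python
import json


def parse_action(raw_output, legal_moves, legal_switches):
    """Return (action, choice) if the output is a legal action, else None."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return None  # malformed JSON
    if not isinstance(data, dict) or set(data) != {"action", "choice"}:
        return None  # wrong shape, or missing/extra keys
    action, choice = data["action"], data["choice"]
    if not isinstance(choice, str):
        return None  # the choice must be an exact name
    if action == "move" and choice in legal_moves:
        return action, choice
    if action == "switch" and choice in legal_switches:
        return action, choice
    return None  # hallucinated name or illegal action type
```

A `None` result is what would trigger the illegal-action penalty; anything else can be passed straight to the simulator.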

## Reward design

This is where the environment becomes genuinely strategic instead of a formatting task.

A pure win/loss reward is too sparse for short runs. If useful signal only arrives at battle end, training is slow and unstable while the model is still learning legality and basic action quality. I shaped reward around intermediate battle events while keeping it tied to winning actual games.

Reward structure:

- Team HP changes by 10%: `-1.0` per 10% HP lost, `+1.0` per 10% HP removed from opponent
- Pokemon faints: `-3.0` if your Pokemon faints, `+3.0` if opponent faints
- Super-effective move: `+0.5`
- Move has no effect (immunity/no effect): `-1.0`
- Move misses: `-0.25`
- Healing: `+1.0` per 10% healed, capped at `+3.0` per battle
- Team status cured: `+1.0`
- Setup boosts for your active Pokemon: `+0.5` per stage, capped at `+2.0` per Pokemon, only if above 50% HP
- Opponent setup boosts: `-0.5` per stage gained by opponent
- Passive damage on opponent: `0.01 × cumulative passive hits`
- Your team gets burned/poisoned/badly poisoned: `-0.5`
- Your team gets paralyzed/frozen/asleep/confused: `-1.0`
- Illegal action output: `-10.0`

Some details are important:

- Damage and knockouts are symmetric and interpretable, so they anchor the signal.
- Healing/setup are capped to prevent reward farming loops.
- Setup is rewarded only above half HP to discourage reckless greed.
- Passive damage is incremental, acknowledging strategic pressure without overwhelming the objective.

The result is denser but still strategy-aligned: deal meaningful damage, preserve resources, pressure the opponent, and avoid illegal or low-value actions.
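To make the arithmetic concrete, here is a sketch that combines a subset of the terms listed above (HP changes, faints, super-effective hits, immunities, misses, illegal actions) plus a generic cap helper. The `TurnEvents` fields and both function names are hypothetical, and two behaviors are assumptions: HP reward is prorated linearly rather than in discrete 10% steps, and an illegal action returns only the penalty.

```python
from dataclasses import dataclass


@dataclass
class TurnEvents:
    """Hypothetical per-turn summary; field names are illustrative."""
    own_hp_lost_pct: float = 0.0   # team HP lost this turn, in percent
    opp_hp_lost_pct: float = 0.0   # opponent HP removed this turn, in percent
    own_faints: int = 0
    opp_faints: int = 0
    super_effective: bool = False
    no_effect: bool = False
    missed: bool = False
    illegal_action: bool = False


def step_reward(ev):
    """Combine a subset of the shaping terms into one scalar."""
    if ev.illegal_action:
        return -10.0  # assumption: no other terms apply on an illegal action
    reward = 0.0
    reward += ev.opp_hp_lost_pct / 10.0   # +1.0 per 10% dealt (prorated: assumption)
    reward -= ev.own_hp_lost_pct / 10.0   # -1.0 per 10% lost (prorated: assumption)
    reward += 3.0 * ev.opp_faints
    reward -= 3.0 * ev.own_faints
    if ev.super_effective:
        reward += 0.5
    if ev.no_effect:
        reward -= 1.0
    if ev.missed:
        reward -= 0.25
    return reward


def capped_bonus(raw_bonus, awarded_so_far, cap):
    """Grant as much of raw_bonus as the remaining cap allows (never negative)."""
    return max(0.0, min(raw_bonus, cap - awarded_so_far))
```

For example, a super-effective hit that removes 30% of the opponent's HP scores `3.0 + 0.5 = 3.5`, and `capped_bonus(2.0, 2.0, 3.0)` grants only `1.0` because the healing cap of `3.0` is nearly used up.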

## Training loop

Once the environment was stable, the training loop was straightforward:

1. The model plays live Showdown battles.
2. I record trajectories from those battles.
3. I train a LoRA adapter on those trajectories with GRPO.

I used `Qwen3-4B-Instruct` as the base model and trained LoRA adapters (not full checkpoints) for faster iteration.

What mattered most to me was grounding. The model does not learn from static preference pairs about Pokemon. It learns from consequences inside the environment. Bad switch decisions are punished by state transitions and reward. Useful aggressive lines and resource-preserving play show up in trajectories as positive signal.
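GRPO scores each sampled completion relative to the other completions drawn for the same prompt, instead of relying on a learned value function. The sketch below shows only that normalization step. It assumes several candidate actions are sampled for the same observation and uses the population standard deviation; how rollouts are actually grouped in this project, and which variance estimator the training library uses, are not specified here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division safe when every reward in the group is identical
    return [(r - mu) / (sigma + eps) for r in rewards]
```

With rewards `[1.0, 2.0, 3.0]`, the middle sample gets an advantage of zero, the worst sample a negative one, and the best a positive one. A group where every sample earned the same reward yields all zeros and therefore no gradient signal.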

## What was harder than expected

The hardest part was observability, not raw modeling.

If you only inspect aggregate reward, it is hard to know what is actually happening:

- Did the model improve, or just get lucky?
- Is it respecting legal action space, or being silently corrected?
- Is reward rising from strategy, or from artifacts?

So I invested heavily in logging and replay tooling:

- Recorded battle logs with detailed per-turn information
- Converted logs into replay-friendly JSON
- Built a viewer to inspect turn-by-turn observations, legal actions, chosen actions, and reward changes

This made behavior legible rather than opaque.

## What worked (and what still does not)

What works:

- The full end-to-end loop runs.
- A relatively small LLM can operate in a live multi-agent environment.
- Strict action constraints and real rollouts are practical.

What still does not:

- Rollout collection is slower than training.
- Reward is still noisy.
- Random legal-action opponents are only a weak baseline.

The next real step is not claiming "the model is good at Pokemon." It is benchmarking against stronger heuristic/non-random opponents and tightening reward to better reflect true strategic quality.

## Why this matters

For me, this project is bigger than Pokemon.

Pokemon is the concrete world I used to package an agent problem that appears everywhere: act under uncertainty, respect hard constraints, update beliefs from partial information, and trade short-term gain for long-term outcomes.

Competitive Pokemon bundles all of that into a compact, testable, and unforgiving environment. That is exactly why it works so well as a benchmark and training ground for LLM agents.

## Links

- Replay viewer: [WolfeClick Space](https://huggingface.co/spaces/Atharva2099/WolfeClick)
- Code: [OpenEnv-WolfeClick](https://github.com/Atharva2099/OpenEnv-WolfeClick)
- Model weights: [openenv-smogon-rl](https://huggingface.co/Atharva2099/openenv-smogon-rl)


WolfeClick wraps Pokemon Showdown as an OpenEnv-compatible environment so LLMs can learn legal action selection and long-horizon strategy from live battles.

How I Turned Competitive Pokemon Into an RL Environment for LLMs


## Overview
Trip AI harnesses the power of Large Language Models to transform your travel planning experience. Whether you're planning a weekend getaway or a two-week adventure, Trip AI helps you create detailed, personalized itineraries with just a few clicks. In this guide, we'll explore how to use Trip AI and make the most of its features.

## Problem and Motivation
Traditional travel planning often involves juggling multiple tabs, cross-referencing reviews, and manually organizing schedules. Trip AI streamlines this process by leveraging LLMs to generate comprehensive itineraries while giving you full control over the details. The best part? You can try it right now at [Trip AI](https://atharva2099.github.io/Trip.AI/).

## Getting Started

### 1. Setting Up Access
There are two ways to get started with Trip AI:

#### Direct Usage
1. Visit [Trip AI](https://atharva2099.github.io/Trip.AI/)
2. Get a free Groq API key from [Groq Console](https://console.groq.com/keys)
3. Enter your API key when prompted (stored locally in your browser)

#### Local Development
```bash

# Clone the repository
git clone https://github.com/Atharva2099/Trip.AI.git
cd Trip.AI

# Install dependencies
npm install

# Add your Groq API key to .env file
echo "REACT_APP_GROQ_API_KEY=your_key_here" > .env

# Start the development server
npm start

```

### 2. Creating Your First Itinerary
The process is straightforward:

1. Enter your destination
2. Select travel dates
3. Set your budget
4. Specify number of travelers
5. Add interests (optional)
6. Click "Generate Itinerary"

### 3. Smart Features

#### Real-Time Fact Checking
Trip AI validates every location and activity it suggests:
- Confirms actual existence of locations
- Verifies distances between activities
- Checks opening hours and accessibility
- Validates price ranges against real-world data

#### Interactive Customization
Don't like a suggested activity? No problem! Every element of your itinerary is customizable:
- Click on any activity or meal to open the modification chat
- Ask for alternatives, adjust timing, or request different price points
- The LLM ensures all changes maintain consistency with your overall plan
- Get instant suggestions that account for location, budget, and timing constraints

#### Smart Map Integration
- Interactive map shows your daily route
- Click markers for activity details
- Get real-time directions to any location
- Visualize travel times between activities

## Pro Tips

1. **Budget Optimization**
   - Start with a slightly lower budget than your maximum
   - Use the modification feature to upgrade specific activities you care about
   - The cost breakdown helps you track spending across categories

2. **Customization Tricks**
   ```text
   
   Some effective modification requests:
   - "Find a cheaper alternative to this activity"
   - "Suggest a more local restaurant instead"
   - "Move this activity to earlier in the day"
   - "Find something more kid-friendly"
   
   ```

3. **Location Management**
   - Trip AI automatically optimizes routes
   - Use the map view to ensure distances are comfortable
   - Request changes if locations seem too far apart

## Future Improvements
We're constantly working to enhance Trip AI with features like:
- Multi-city trip planning
- Integrated travel booking
- Group collaboration tools
- Local events integration
- Offline mode support

## Conclusion
Trip AI demonstrates how LLMs can transform travel planning from a chore into an enjoyable experience. Whether you use the hosted version or run it locally, the combination of intelligent suggestions and real-time customization helps create the perfect itinerary for your needs.

Want to contribute or suggest features? Check out our [GitHub repository](https://github.com/Atharva2099/Trip.AI)!

---
GitHub: [@FullMLAlchemist](https://github.com/Atharva2099)
Twitter: [@Attharave](https://x.com/attharave)


A practical setup and usage guide for Trip AI, including itinerary generation, customization, and local development tips.

Getting Started with Trip AI Your LLM Powered Travel Companion


## Introduction
The N-Queens problem is a classic chess puzzle where we need to place N queens on an N×N chessboard such that no two queens threaten each other. A Monte Carlo simulation can help us estimate the complexity of solving this problem using backtracking.

## Concept: Monte Carlo Method
Monte Carlo methods use random sampling to obtain numerical results. In our case, we:
1. Randomly explore paths in the state space tree
2. Count nodes visited and promising positions
3. Estimate total complexity through multiple trials

## Setup and Dependencies
```python

import random
import time
from statistics import mean, stdev
import numpy as np

```

## Core Function: Checking Promising Positions
We need to determine whether a queen placement is valid (promising) by checking the queens already placed in earlier rows:
- No queen in the same column
- No queen on the same diagonal

```python

def promising(i, j, col):
    """Check if placing a queen at position (i,j) is promising"""
    for k in range(i):
        if (col[k] == j or abs(col[k] - j) == abs(k - i)):
            return False
    return True

```

## Monte Carlo Estimation
The estimation process:
1. Start from root node
2. At each level:
   - Count total nodes
   - Find promising positions
   - Randomly select one promising child
3. Continue until no promising children or board is full

```python

def monte_carlo_estimate(n):
    """
    Perform one Monte Carlo estimation for n-Queens problem
    Returns tuple of (total_nodes, promising_nodes)
    """
    col = [-1] * n
    total_nodes = 1    # Root node
    promising_nodes = 1  # Root is promising
    m = 1
    mprod = 1
    i = 0
    
    while m != 0 and i != n:
        mprod = mprod * m
        current_level_nodes = mprod * n
        total_nodes += current_level_nodes
        
        # Find promising children at current level
        m = 0
        prom_children = []
        for j in range(n):
            if promising(i, j, col):
                m += 1
                prom_children.append(j)
        
        promising_nodes += m * mprod
        
        if m != 0:
            j = random.choice(prom_children)
            col[i] = j
            i += 1
    
    return (total_nodes, promising_nodes)

```
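For small boards the estimate can be sanity-checked against an exact count. The sketch below walks the whole pruned backtracking tree and uses the same counting convention as `monte_carlo_estimate`. The function name `exact_counts` is an addition for illustration, and `promising` is repeated only so the block runs on its own.

```python
def promising(i, j, col):
    """Same check as above: can a queen go in row i, column j?"""
    for k in range(i):
        if col[k] == j or abs(col[k] - j) == abs(k - i):
            return False
    return True


def exact_counts(n):
    """Walk the full pruned backtracking tree and count nodes exactly.

    Returns (total_nodes, promising_nodes). The root counts as one node,
    and every promising node above the last row contributes n children
    to the total, matching the Monte Carlo version.
    """
    col = [-1] * n
    total_nodes = 1
    promising_nodes = 1

    def expand(i):
        nonlocal total_nodes, promising_nodes
        if i == n:
            return  # full board: nothing left to place
        total_nodes += n  # all n children of this node get checked
        for j in range(n):
            if promising(i, j, col):
                promising_nodes += 1
                col[i] = j
                expand(i + 1)
                col[i] = -1

    expand(0)
    return total_nodes, promising_nodes
```

For n = 4 this returns 61 total nodes and 17 promising nodes, so the mean of many `monte_carlo_estimate(4)` trials should settle near those values. The exact walk becomes expensive as n grows, which is the reason for sampling in the first place.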

## Running Multiple Trials
To get reliable estimates, we run multiple trials and collect statistics:
- Mean values
- Standard deviation
- Min/Max values
- Execution time

```python

def run_monte_carlo_simulation(n, num_trials=100):
    """Run multiple Monte Carlo simulations and analyze results"""
    total_estimates = []
    promising_estimates = []
    start_time = time.time()
    
    for _ in range(num_trials):
        estimate_result = monte_carlo_estimate(n)
        total_estimates.append(estimate_result[0])
        promising_estimates.append(estimate_result[1])
    
    execution_time = time.time() - start_time
    
    return {
        'total_nodes': {
            'mean': mean(total_estimates),
            'std_dev': stdev(total_estimates),
            'min': min(total_estimates),
            'max': max(total_estimates)
        },
        'promising_nodes': {
            'mean': mean(promising_estimates),
            'std_dev': stdev(promising_estimates),
            'min': min(promising_estimates),
            'max': max(promising_estimates)
        },
        'execution_time': execution_time,
        'num_trials': num_trials,
        'raw_promising': promising_estimates
    }

```

## Main Execution and Analysis
Here we:
1. Run simulations with different trial sizes
2. Collect and display statistics
3. Compare with professor's values
4. Calculate overall averages

```python

def main():
    n = 12  # Board size
    num_trials = [100, 500, 1000]
    random.seed(123)  # For reproducibility
    
    print(f"\nMonte Carlo Simulation for {n}-Queens Problem")
    print("=" * 60)
    
    all_promising_values = []
    
    for trials in num_trials:
        results = run_monte_carlo_simulation(n, trials)
        print(f"\nResults for {trials} trials:")
        print("\nTotal Nodes:")
        print(f"Average: {results['total_nodes']['mean']:.2f}")
        print(f"Standard deviation: {results['total_nodes']['std_dev']:.2f}")
        print(f"Min: {results['total_nodes']['min']:.2f}")
        print(f"Max: {results['total_nodes']['max']:.2f}")
        
        print("\nPromising Nodes:")
        print(f"Average: {results['promising_nodes']['mean']:.2f}")
        print(f"Standard deviation: {results['promising_nodes']['std_dev']:.2f}")
        print(f"Min: {results['promising_nodes']['min']:.2f}")
        print(f"Max: {results['promising_nodes']['max']:.2f}")
        
        all_promising_values.extend(results['raw_promising'])
    
    # Overall statistics
    total_runs = sum(num_trials)
    overall_mean = mean(all_promising_values)
    overall_std = stdev(all_promising_values)
    
    print("\nOverall Statistics:")
    print(f"Total runs: {total_runs}")
    print(f"Overall mean promising nodes: {overall_mean:.2f}")
    print(f"Overall standard deviation: {overall_std:.2f}")
    
    # Compare with professor's value
    professors_value = 856000
    percentage_diff = ((overall_mean - professors_value) / professors_value) * 100
    print(f"\nPercentage difference from professor's value: {percentage_diff:.2f}%")

if __name__ == "__main__":
    main()

```

## Initial Simulation Results
![Simulation Results](/images/n-simulation-result.png)


## Results Analysis
When we run this simulation for n=12:
1. Our estimates are close to the professor's values:
   - Professor's value: 8.56 × 10^5
   - Our estimated value: ~8.70 × 10^5 (within 2% difference)
2. The standard deviation shows the variability of the Monte Carlo method
3. Larger numbers of trials generally give more stable results
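The third point can be made quantitative with the standard error of the mean, which shrinks in proportion to the square root of the number of trials. A small helper, with an illustrative name that is not part of the notebook:

```python
import math
from statistics import mean, stdev


def mean_with_standard_error(samples):
    """Return (mean, standard error of the mean) for a list of trial results."""
    return mean(samples), stdev(samples) / math.sqrt(len(samples))
```

Passing the `raw_promising` list from `run_monte_carlo_simulation` gives an estimate together with its uncertainty. Quadrupling the number of trials roughly halves the standard error, even though the per-trial standard deviation stays about the same.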

## Conclusion
The Monte Carlo simulation effectively estimates the complexity of the N-Queens problem:
- Provides good approximations of node counts
- Much faster than exhaustive counting
- Helps understand the scale of the problem
- Results align well with theoretical expectations

## Plots for the 4-, 8-, 12-, and 14-Queens Problems on a Log Scale

Analyzing the data:

```python
def analyze_complexity(n_values, trials_per_n=1000):
    """Analyze time complexity for different values of n"""
    total_nodes_avg = []
    promising_nodes_avg = []
    execution_times = []

    for n in n_values:
        start_time = time.time()
        trial_totals = []
        trial_promising = []

        for _ in range(trials_per_n):
            # Named promising_count so it does not shadow the promising() function
            total, promising_count = monte_carlo_estimate(n)
            trial_totals.append(total)
            trial_promising.append(promising_count)

        exec_time = time.time() - start_time
        total_nodes_avg.append(mean(trial_totals))
        promising_nodes_avg.append(mean(trial_promising))
        execution_times.append(exec_time)

        print(f"\nResults for n={n}:")
        print(f"Average Total Nodes: {mean(trial_totals):,.2f}")
        print(f"Average Promising Nodes: {mean(trial_promising):,.2f}")
        print(f"Execution Time: {exec_time:.4f} seconds")

    return total_nodes_avg, promising_nodes_avg, execution_times
```
	  
Plotting the graphs:

```python
import matplotlib.pyplot as plt  # not part of the earlier setup imports


def plot_complexity_analysis(n_values, total_nodes, promising_nodes, times):
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))

    # Plot 1: Nodes vs N (log scale)
    ax1.plot(n_values, total_nodes, 'b-o', label='Total Nodes')
    ax1.plot(n_values, promising_nodes, 'g-o', label='Promising Nodes')
    ax1.set_yscale('log')
    ax1.set_title('Growth of Nodes with N\n(Log Scale)', fontsize=12)
    ax1.set_xlabel('N (Board Size)', fontsize=10)
    ax1.set_ylabel('Number of Nodes (log scale)', fontsize=10)
    ax1.grid(True, alpha=0.3)
    ax1.legend()

    # Plot 2: Execution Time vs N
    ax2.plot(n_values, times, 'r-o', label='Execution Time')
    ax2.set_title('Execution Time vs N', fontsize=12)
    ax2.set_xlabel('N (Board Size)', fontsize=10)
    ax2.set_ylabel('Time (seconds)', fontsize=10)
    ax2.grid(True, alpha=0.3)

    plt.tight_layout()
    plt.show()
```

Main function:

```python
def main():
    random.seed(123)
    n_values = [4, 8, 12, 14]  # Different board sizes

    print("Analyzing time complexity for N-Queens problem")
    print("=" * 60)

    total_nodes, promising_nodes, exec_times = analyze_complexity(n_values)
    plot_complexity_analysis(n_values, total_nodes, promising_nodes, exec_times)


if __name__ == "__main__":
    main()
```

## Complexity Analysis Results

![Complexity Analysis](/images/n-queen-plot.png)

## Code
Find the complete implementation on [GitHub](https://github.com/Atharva2099/AssignmentsButFun/blob/main/CSC510/Monte%20Carlo%20Simulation%20for%20N-Queens%20Using%20Backtracking%20and%20pruning.ipynb).

---
GitHub: [@FullMLAlchemist](https://github.com/Atharva2099)
Twitter: [@Attharave](https://x.com/attharave)


An experiment-driven explanation of estimating N-Queens search complexity with Monte Carlo sampling and backtracking.

Monte Carlo Simulation For N Queens Problem


# Automating Study Material Creation with Retrieval-Augmented Generation (RAG)

### Overview
Creating effective study materials from lengthy notes or research papers can be time-consuming. As someone passionate about machine learning and productivity tools, I developed **EasyQuizzes**, an application that uses Retrieval-Augmented Generation (RAG) to turn notes or PDFs into custom flashcards. This blog walks you through how I implemented RAG and structured the app to automate the process.

### Problem and Motivation
Creating flashcards manually from notes often requires significant time and effort. Inspired by my experience with Quizlet, I wanted to streamline this task by building a tool that uses AI to extract key information and generate flashcards instantly.

### Solution Architecture

**1. Setting Up RAG with ChromaDB and LLaMA3**
To implement RAG:
- **LLM Selection**: I used **Meta's LLaMA3.2-3B-preview** via Groq API to ensure low latency and efficient processing.
- **Embedding and Retrieval**: Using **ChromaDB**, I created vector embeddings for notes, allowing quick retrieval of relevant chunks.

**2. Core Components**

**Document Parsing and Chunking:**
Using **PyPDF2** and **OCR**, my app converts handwritten or scanned notes into text. Each document is chunked into manageable sections for better LLM performance and stored as vectors in ChromaDB.
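As an illustration of the chunking step, here is a minimal sketch that splits text into fixed-size character windows with a small overlap, so a sentence cut at a boundary still appears whole in one of the neighboring chunks. The function name, the character-based splitting, and the default sizes are assumptions for this example; EasyQuizzes may chunk differently.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character windows that overlap slightly."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk.strip():  # skip windows that contain only whitespace
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

Each returned chunk can then be embedded and stored as one entry in the vector database. Larger chunks keep more context per entry, while smaller ones make retrieval more precise.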

**RAG Process**

The RAG approach includes:
1. **Retrieving** relevant text chunks based on a prompt.
2. **Generating** responses by feeding retrieved information to the LLaMA3 model.

**3. Key Code Snippets**

**Document Parsing and Chunk Storage**

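```python
import PyPDF2

def parse_pdf(file_path):
    """Extract the text of every page in a PDF and return it as one string."""
    text_content = ""
    # PdfReader replaces PdfFileReader, which was removed in PyPDF2 3.0
    with open(file_path, "rb") as pdf_file:
        pdf_reader = PyPDF2.PdfReader(pdf_file)
        for page in pdf_reader.pages:
            # Image-only pages may yield no text, so fall back to ""
            text_content += page.extract_text() or ""
    return text_content
```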

**Flashcard Generation**
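```python
def generate_flashcards(topic, collection, llm, n_results=5):
    """Build flashcards from the stored chunks most relevant to `topic`.

    `collection` is the ChromaDB collection holding the document chunks;
    `llm.generate_flashcard` stands in for the app's wrapper around the model call.
    """
    # ChromaDB returns one list of documents per query text
    results = collection.query(query_texts=[topic], n_results=n_results)
    retrieved_chunks = results["documents"][0]
    flashcards = []
    for chunk in retrieved_chunks:
        question, answer = llm.generate_flashcard(chunk)
        flashcards.append((question, answer))
    return flashcards
```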

**4. Key Features and Benefits**
- **Custom Flashcards in Seconds**: Reduces study prep time by automating question generation.
- **Accuracy with RAG**: By using retrieval, flashcards focus on relevant information, increasing the quality of study material.
- **Local Storage and Privacy**: Flashcards and data are stored locally for student privacy and offline access.

## Challenges and Learnings
1. **Embedding Quality**: Choosing the right embeddings is crucial for retrieving relevant text chunks accurately.
2. **Latency**: Using the Groq API and an optimized Conda environment helped manage model inference times effectively.
3. **Chunk Size**: Balancing chunk size was essential to avoid missing context while keeping retrieval efficient.

## Future Improvements
1. **Multi-language Support**: Given my background in Indic languages, I plan to extend this tool to support content in Hindi and Marathi.
2. **Mobile App Version**: Allowing flashcard creation on the go for a more seamless study experience.

## Conclusion
EasyQuizzes demonstrates how retrieval-augmented generation can save time and enhance learning. With AI, students can now spend more time understanding concepts rather than preparing notes.

Check out EasyQuizzes on GitHub: [EasyQuizzes](https://github.com/Atharva2099/EasyQuizzes)

---
GitHub: [@FullMLAlchemist](https://github.com/Atharva2099)
Twitter: [@Attharave](https://x.com/attharave)



How EasyQuizzes combines retrieval-augmented generation and vector search to generate study flashcards from notes and PDFs.

EasyQuizzes AI Powered Flashcard Generator


# Understanding the Sudoku Board Validator - LeetCode Solution

## Problem Overview
LeetCode problem #36 asks us to validate a 9x9 Sudoku board. A valid Sudoku board must satisfy three conditions:
1. Each row must contain digits 1-9 without repetition
2. Each column must contain digits 1-9 without repetition
3. Each 3x3 sub-box must contain digits 1-9 without repetition

Note that empty cells (marked as ".") are allowed and don't affect validity.

## Solution Approach
Let's break down the solution into digestible pieces to understand how it efficiently validates a Sudoku board in a single pass.

```python
import collections
from typing import List

def isValidSudoku(self, board: List[List[str]]) -> bool:
    rows = collections.defaultdict(set)
    cols = collections.defaultdict(set)
    sqrs = collections.defaultdict(set)
```

### Data Structures
We use three `defaultdict(set)` instances to track the numbers seen so far in:
- `rows`: Each row of the board
- `cols`: Each column of the board
- `sqrs`: Each 3x3 square

Using `defaultdict(set)` is clever because:
- It automatically creates an empty set when we access a new key
- Sets provide O(1) lookup and insertion
- We don't need to initialize anything manually

### The Main Algorithm
```python

for r in range(9):
    for c in range(9):
        if board[r][c] == ".":
            continue
```
We iterate through each cell. If it's empty ("."), we skip it.

```python

if (board[r][c] in rows[r] or
    board[r][c] in cols[c] or 
    board[r][c] in sqrs[(r//3,c//3)]):
    return False
```

For each number, we check three conditions:
1. Is it already in the current row?
2. Is it already in the current column?
3. Is it already in the current 3x3 square?

The expression `(r//3,c//3)` is particularly clever:
- It maps the 9x9 grid coordinates to 3x3 square coordinates
- For example:
  - Cell (0,0) maps to square (0,0)
  - Cell (1,1) maps to square (0,0)
  - Cell (3,3) maps to square (1,1)

```python

rows[r].add(board[r][c])
cols[c].add(board[r][c])
sqrs[(r//3,c//3)].add(board[r][c])
```

If all checks pass, we:
1. Add the number to its row set
2. Add it to its column set
3. Add it to its 3x3 square set

If we complete the entire board without finding any duplicates, return `True`.
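Putting the fragments above together gives the complete routine. It is written here as a standalone function named `is_valid_sudoku` (an illustrative name) rather than the LeetCode method form, so it can be run directly.

```python
import collections
from typing import List


def is_valid_sudoku(board: List[List[str]]) -> bool:
    rows = collections.defaultdict(set)
    cols = collections.defaultdict(set)
    sqrs = collections.defaultdict(set)

    for r in range(9):
        for c in range(9):
            value = board[r][c]
            if value == ".":
                continue  # empty cells never cause a conflict
            box = (r // 3, c // 3)  # which 3x3 square this cell belongs to
            if value in rows[r] or value in cols[c] or value in sqrs[box]:
                return False  # duplicate found: exit early
            rows[r].add(value)
            cols[c].add(value)
            sqrs[box].add(value)
    return True
```

To try it, build a board as a list of nine lists of single-character strings, for example `[list(row) for row in ["53..7....", ...]]` with one nine-character string per row.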

## Time and Space Complexity
- Time Complexity: O(1)
  - We always process exactly 81 cells (9x9 board)
  - Each cell operation is O(1) due to set operations
- Space Complexity: O(1)
  - We store at most 9 numbers in each set
  - We have a fixed number of sets (9 rows + 9 columns + 9 squares)

## Advantages of this Solution
1. **Single Pass**: We only need to traverse the board once
2. **Early Exit**: Returns `False` as soon as an invalid state is detected
3. **Clean Code**: Using `defaultdict(set)` makes the code concise and readable
4. **Efficient Lookups**: Set operations are O(1)

## Common Pitfalls to Avoid
1. Don't forget to check empty cells (".") and skip them
2. Remember that valid numbers are strings ("1" to "9"), not integers
3. The 3x3 square calculation `(r//3,c//3)` must use integer division

## Conclusion
This solution demonstrates how proper data structure choice (using sets) and clever coordinate mapping (for 3x3 squares) can lead to a clean and efficient solution. While there are other ways to solve this problem, this approach provides an excellent balance of readability and performance.

---
GitHub: [@FullMLAlchemist](https://github.com/Atharva2099)
Twitter: [@Attharave](https://x.com/attharave)

A clean single-pass Sudoku validator using hash sets to track rows, columns, and 3x3 boxes efficiently.

Leetcode No.36 Valid Sudoku


# Building a shell with pipes in C 

Command-line shells are an essential part of operating systems, allowing users to interact with the system via commands. While modern operating systems have powerful, feature-rich shells, understanding how they work at a low level provides crucial insight into process management, input/output operations, and more. In this post, we’ll take a closer look at building a simple shell in C, particularly focusing on three key phases: **Read**, **Parse**, and **Execute**.

<div class="video-container">
  <iframe
    width="100%"
    height="500"
    src="https://www.youtube.com/embed/2J7g3KcZJ3I"
    title="Building a Simple Shell in C"
    frameborder="0"
    allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
    allowfullscreen>
  </iframe>
</div>

### 1. The Shell Process Lifecycle

The shell can be broken down into three distinct phases:

- **Read Phase**: The shell reads the user’s command.

- **Parse Phase**: The shell breaks the command into individual tokens (like command and arguments) and prepares for execution.

- **Execute Phase**: The shell forks a new process to execute the command.

Let’s explore each phase in more detail.

-----------
### 2. **Read Phase**

In the **Read Phase**, the shell waits for user input. This input could be a simple command like `ls` or something more complex, involving pipes (`|`) and redirection (`>`, `<`). The input is typically read as a single string, which the shell will process further.

In C, this can be achieved using functions like `fgets()` or `read()` to capture input from the user.

#### Example:

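```c
// Requires <stdio.h>
char input[1024];

printf("shell> ");
fflush(stdout);  // show the prompt before blocking on input

fgets(input, sizeof(input), stdin);  // reads at most 1023 chars plus '\0'
```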

This code snippet captures one line of user input of up to 1023 characters, because the 1024-byte buffer reserves one byte for the terminating null character. Once the input is read, the shell proceeds to the **Parse Phase**.

-----
### 3. **Parse Phase**

The **Parse Phase** involves breaking down the user’s input into smaller components, known as **tokens**. These tokens are typically the command (like `ls`), its arguments (like `-l`), and operators such as the pipe symbol (`|`).

In C, functions like `strtok()` are useful for breaking the input string into tokens based on delimiters (like spaces or pipe symbols).

#### Example:

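```c
// Requires <stdio.h> and <string.h>
char *token;

// strtok() modifies `input` in place, splitting on spaces and newlines
token = strtok(input, " \n");
while (token != NULL) {
    printf("%s\n", token);  // Process each token
    token = strtok(NULL, " \n");
}
```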

This will break the user’s input into individual tokens. For example, if the user types `ls -l | grep .c`, the tokens will be `ls`, `-l`, `|`, `grep`, and `.c`, because every space is a delimiter.

The shell needs to handle each of these tokens appropriately, determining what the user wants to achieve (e.g., if there’s a pipe, we need to split the command into two processes).

--------
### 4. **Execute Phase**

Finally, the **Execute Phase** is where the shell runs the command. In most cases, this involves creating a new child process using `fork()`, and then using `execvp()` or a similar system call to replace the child process’s program image with the requested command.

For commands involving pipes or redirection, this phase becomes slightly more complex, as the shell needs to manage file descriptors and direct output from one process to another.

#### Example:

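```c
// Requires <stdio.h>, <stdlib.h>, <unistd.h>, and <sys/wait.h>
// `command` is a NULL-terminated argument array, e.g. {"ls", "-l", NULL}
pid_t pid = fork();

if (pid == 0) {
    // Child process: replace this process with the requested program
    execvp(command[0], command);
    perror("execvp");  // only reached if execvp() fails
    exit(EXIT_FAILURE);
} else if (pid > 0) {
    // Parent process: block until the child exits
    wait(NULL);
} else {
    perror("fork");
    exit(EXIT_FAILURE);
}
```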

This simple fork/exec pattern allows the shell to run commands. The parent process waits for the child to complete using `wait()`, ensuring that commands are executed sequentially unless the user requests background execution.

---

### 5. **Advanced Features**

Building a fully functional shell also requires adding advanced features like:

- **Piping**: Sending the output of one command as input to another (e.g., `ls | grep .c`).

- **Redirection**: Redirecting output to a file or input from a file (e.g., `ls > output.txt`).

- **Signal Handling**: Handling interrupts (e.g., `Ctrl+C`) to terminate running processes or commands.

While these features add complexity, they are also what make a shell useful. Understanding the basics of process control and inter-process communication (pipes) will help you implement these features.

---

### Conclusion

Building a shell from scratch is an excellent way to learn about process management, system calls, and low-level programming in C. By breaking the problem down into the **Read**, **Parse**, and **Execute** phases, you can focus on each aspect individually and build up to more complex features like piping and redirection.

Once you understand how these phases interact, you can experiment with adding more functionality and customization to your shell. And who knows? You might end up building a command-line interface that fits your workflow better than existing ones.

---
GitHub: [@FullMLAlchemist](https://github.com/Atharva2099)
Twitter: [@Attharave](https://x.com/attharave)