## Introduction to Python 

## Author: Nitesh Kumar

<img src="https://indiaeducationdiary.in/wp-content/uploads/2022/01/UPES-LOGO-01.jpg" alt="drawing" width="250"/>


## Email: nitesh.kumar@ddn.upes.ac.in

# Python Basics from Scratch

## Table of Contents
1. Introduction to Python
2. Variables and Data Types
3. Basic Operators
4. String Manipulation
5. Control Structures (If, Elif, Else)
6. Loops (For, While) & Loop Control (Break, Continue)
7. Functions
8. Data Structures I: Lists, Tuples, and List Comprehensions
9. Data Structures II: Dictionaries
10. Error Handling (Try / Except)
11. Modules (Math)
12. Putting It All Together: 15 Scientific Examples
13. Introduction to NumPy
14. Introduction to Pandas

---

## 1. Introduction to Python
Python is a high-level, interpreted programming language known for its readability and simplicity. It is widely used in various fields, including web development, data analysis, artificial intelligence, and more.

### Example Code

In [1]:
print("Hello, World!")

Hello, World!



### Challenge
1. Print your name.
2. Print the result of 2 + 3.
3. Print a string that includes your favorite hobby.
4. Print the current year.
5. Print a simple math operation (e.g., 10 * 5).


In [1]:
# Solution 1: Print your name.


# Solution 2: Print the result of 2 + 3.


# Solution 3: Print a string that includes your favorite hobby.


# Solution 4: Print the current year.


# Solution 5: Print a simple math operation (e.g., 10 * 5).



---

## 2. Variables and Data Types
Variables are used to store data values. Python has various data types, including integers, floats, strings, and booleans.

### Example Code

In [3]:
# Integer
age = 25
# Float
height = 5.9
# String
name = "Alice"
# Boolean
is_student = True

print(age, height, name, is_student)

# You can check the type of any variable using the type() function:
print(f"Type of age: {type(age)}")
print(f"Type of height: {type(height)}")
print(f"Type of name: {type(name)}")
print(f"Type of is_student: {type(is_student)}")

25 5.9 Alice True
Type of age: <class 'int'>
Type of height: <class 'float'>
Type of name: <class 'str'>
Type of is_student: <class 'bool'>



### Challenge
1. Create a variable for your age and print it.
2. Create a variable for your height (e.g., 1.75 meters) and print it.
3. Create a variable for your favorite color and print it.
4. Create a boolean variable indicating if you like Python.
5. Print all the variables you created in one line.



In [2]:
# solution 1: Integer

# solution 2: Float

# solution 3: String

# solution 4: Boolean

# solution 5: Print all variables in one line.

---

## 3. Basic Operators
Operators are used to perform operations on variables and values. Python supports arithmetic, comparison, and logical operators.

### 3.1 Arithmetic Operators
    

In [5]:
a = 10
b = 5

# Arithmetic Operators
sum_result = a + b
difference = a - b
product = a * b
quotient = a / b    # Note: Division always results in a float
remainder = a % b   # Modulus operator
exponent = a ** b # Exponentiation

print(f"Sum (a + b): {sum_result}")
print(f"Difference (a - b): {difference}")
print(f"Product (a * b): {product}")
print(f"Quotient (a / b): {quotient}")
print(f"Remainder (a % b): {remainder}")
print(f"Exponent (a ** b): {exponent}")

Sum (a + b): 15
Difference (a - b): 5
Product (a * b): 50
Quotient (a / b): 2.0
Remainder (a % b): 0
Exponent (a ** b): 100000



### Challenge
1. Calculate the remainder of 10 divided by 3.
2. Find the power of 2 raised to 5.
3. Subtract 15 from 30 and print the result.
4. Multiply 7 by 6 and print the result.
5. Divide 100 by 4 and print the result.

In [3]:
# Solutions:



### 3.2 User Input
We can get input from a user using the `input()` function. **Important:** `input()` always returns a string. You must convert it to a number (like `float` or `int`) if you want to do math with it.

In [7]:
r_str = input('Enter the radius of the cylinder= ') # This is a string!
r = float(r_str) # Convert the string to a float (a number with decimals)

h_str = input('Enter the height of the cylinder= ') # Also a string
h = float(h_str) # Convert to float

pi = 3.1415

V = pi*r*r*h
A = 2*pi*r*(r+h)

print('\n') # '\n' is a special character that means "new line"
print('Total surface area:', A)
print('Total Volume:', V)

Enter the radius of the cylinder= 2.5
Enter the height of the cylinder= 7



Total surface area: 149.22125000000003
Total Volume: 137.440625


### Challenge

1. Write a program that asks the user for two numbers and prints their sum.
2. Ask the user for the radius of a circle and print its area (use pi = 3.1415).
3. Ask the user for their name and age, then print a message: "Hello, <name>! You are <age> years old."
4. Ask the user for a number and print whether it is positive, negative, or zero.
5. Ask the user for the length and width of a rectangle, then print its perimeter and area.

In [None]:
# Solutions:


### 3.3 Print Formatting (f-strings)

It's often useful to format our output neatly. The easiest way is with **f-strings**. By putting an `f` before the opening quote, you can put variables directly inside curly braces `{}`.

You can also control the formatting:
- `{V:.2f}`: Formats the variable `V` as a **f**loating-point number with **2** decimal places.
- `{V:.0f}`: Formats as a floating-point number with **0** decimal places (looks like an integer).
- `{V:.2e}`: Formats in scientific **e**-notation with **2** decimal places. 

In [5]:
# Use a sample volume value to demonstrate the format specifiers
V = 137.4458698785
print(f'Integer format: The volume is {V:.0f} cm^3 \n')
print(f'Two decimal places: The volume is {V:.2f} cm^3 \n')
print(f'Scientific notation: The volume is {V:.2e} cm^3 \n')

Integer format: The volume is 137 cm^3 

Two decimal places: The volume is 137.45 cm^3 

Scientific notation: The volume is 1.37e+02 cm^3 



---
## 4. String Manipulation

Strings are sequences of characters. You can access parts of them using indexing and slicing, and they have many useful built-in methods.

In [9]:
my_string = "Hello, World!"

# Indexing (starts at 0)
print(f"First character (index 0): {my_string[0]}")
print(f"Fifth character (index 4): {my_string[4]}")
print(f"Last character (index -1): {my_string[-1]}")

# Slicing [start:stop] (stop index is not included)
print(f"Slice from index 7 to 12: {my_string[7:12]}")

# Common Methods
print(f"All uppercase: {my_string.upper()}")
print(f"All lowercase: {my_string.lower()}")
print(f"Split into a list: {my_string.split(',')}")

# Stripping whitespace
other_string = "   Some Spaces   "
print(f"String with whitespace: '{other_string}'")
print(f"Stripped of whitespace: '{other_string.strip()}'")

First character (index 0): H
Fifth character (index 4): o
Last character (index -1): !
Slice from index 7 to 12: World
All uppercase: HELLO, WORLD!
All lowercase: hello, world!
Split into a list: ['Hello', ' World!']
String with whitespace: '   Some Spaces   '
Stripped of whitespace: 'Some Spaces'


---

## 5. Control Structures (If, Elif, Else)
Control structures allow you to control the flow of your program. This includes conditional statements.

### Example Code

In [10]:
# If-Elif-Else Statement
number = 0

if number > 0:
    print("Positive number")
elif number == 0: # 'elif' is short for 'else if'
    print("The number is zero.")
else:
    print("Negative number")

The number is zero.


---
## 6. Loops (For, While) & Loop Control (Break, Continue)

Loops allow you to execute a block of code multiple times.

### 6.1 For Loops
A `for` loop iterates over the items of a sequence or other iterable, such as a list, a tuple, a dictionary, a set, or a string. The `range()` function is commonly used to generate a sequence of numbers.

In [11]:
# The range(5) function generates numbers from 0 up to (but not including) 5.
print("Printing numbers from 0 to 4:")
for i in range(5):
    print(i)

Printing numbers from 0 to 4:
0
1
2
3
4
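
The `range()` loop above is one case; the same `for` syntax also walks directly over the items of a list or the characters of a string. A minimal sketch (the `fruits` and `word` values are illustrative, and lists are covered in more detail in Section 8):

```python
# Iterate directly over the items of a list
fruits = ["apple", "banana", "cherry"]
for fruit in fruits:
    print(fruit)

# Iterate over the characters of a string
word = "abc"
for letter in word:
    print(letter)
```

In both loops the loop variable takes each item in turn, so no index arithmetic is needed.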


### 6.2 While Loops
A `while` loop runs as long as a certain condition is True.

In [12]:
i = 1
while i < 6: # This loop will run as long as 'i' is less than 6
    print(i)
    i += 1  # This is crucial! It's short for i = i + 1. Without it, you get an infinite loop!

1
2
3
4
5


### 6.3 Loop Control: `break` and `continue`
- `break`: Exits the loop entirely.
- `continue`: Skips the rest of the current iteration and moves to the next one.

In [13]:
# Example of 'continue'
print("Using 'continue' to skip odd numbers:")
for i in range(6):
    if i % 2 != 0: # If the number is odd
        continue     # Skip the print statement and go to the next iteration
    print(i) # This line only runs for even numbers

# Example of 'break'
print("\nUsing 'break' to stop the loop early:")
for i in range(100): # This could loop 100 times
    if i == 5:           # But we will stop it early
        print("Found the number 5! Stopping loop.")
        break            # Exit the loop completely
    print(i)

Using 'continue' to skip odd numbers:
0
2
4
Using 'break' to stop the loop early:
0
1
2
3
4
Found the number 5! Stopping loop.



### Challenge (If & Loops)
1. Write a program that checks if a number (e.g., `num = 7`) is even or odd.
2. Create a loop that prints numbers from 1 to 10 (inclusive).
3. Write a program that prints "Hello" 5 times using a `for` loop.
4. Create a program that prints all numbers from 1 to 20 that are divisible by 3.
5. Write a program that prints the first 5 square numbers (1, 4, 9, 16, 25).


In [6]:
# Solution 1: Even or Odd
print("--- Solution 1 ---")


# Solution 2: Print 1 to 10
print("--- Solution 2 ---")


# Solution 3: Print "Hello" 5 times
print("--- Solution 3 ---")

# Solution 4: Divisible by 3
print("--- Solution 4 ---")

# Solution 5: First 5 squares
print("--- Solution 5 ---")


--- Solution 1 ---
--- Solution 2 ---
--- Solution 3 ---
--- Solution 4 ---
--- Solution 5 ---



---

## 7. Functions
Functions are reusable blocks of code that perform a specific task. They help make your code organized, reusable, and easier to read. You define a function using the `def` keyword.

### Example Code

In [15]:
def greet(name):
    """This is a docstring. It explains what the function does."""
    return f"Hello, {name}!"

# Call the function
greeting = greet("Alice")
print(greeting)

Hello, Alice!


In [16]:
def add_numbers(a, b):
    """Takes two numbers and returns their sum."""
    c = a + b
    return c

z = add_numbers(5, 8)
print(z)

13



### Challenge
1. Write a function that takes two numbers and returns their sum (you already saw this, try to write it from memory!).
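2. **Create a function `is_prime(num)` that checks if a number is prime.** (A prime number is a whole number greater than 1 that is divisible only by 1 and itself. Hint: loop from 2 up to `num-1` and check whether any division leaves a remainder of 0; if one does, the number is not prime.)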
3. Write a function that returns the factorial of a number. (e.g., 5! = 5 * 4 * 3 * 2 * 1)
4. Create a function that takes a string and returns its length (Hint: Python has a built-in `len()` function).
5. Write a function that converts Celsius to Fahrenheit. Formula: $F = (C \times 9/5) + 32$.

In [7]:
# Solution 1: Add function
print("--- Solution 1: Add ---")


# Solution 2: Prime checker
print("--- Solution 2: Is Prime ---")

# Solution 3: Factorial
print("--- Solution 3: Factorial ---")

# Solution 4: String length
print("--- Solution 4: String Length ---")

# Solution 5: C-to-F converter
print("--- Solution 5: Celsius to Fahrenheit ---")


--- Solution 1: Add ---
--- Solution 2: Is Prime ---
--- Solution 3: Factorial ---
--- Solution 4: String Length ---
--- Solution 5: Celsius to Fahrenheit ---



---

## 8. Data Structures I: Lists, Tuples, and List Comprehensions
Lists and tuples are used to store multiple items in a single variable.
- **Lists** are created with `[]`, and they are **mutable** (you can change, add, or remove items).
- **Tuples** are created with `()`, and they are **immutable** (you cannot change them after creation).

In [18]:
# List (Mutable)
fruits = ["apple", "banana", "cherry"]
print(f"Original list: {fruits}")

fruits.append("orange") # Add an item to the end
print(f"Appended list: {fruits}")

# Tuple (Immutable)
colors = ("red", "green", "blue")
print(f"Tuple: {colors}")

Original list: ['apple', 'banana', 'cherry']
Appended list: ['apple', 'banana', 'cherry', 'orange']
Tuple: ('red', 'green', 'blue')


The code cell below will **purposefully create an error** to prove that tuples are immutable. This is expected behavior.

In [19]:
# Python code to test that 
# tuples are immutable 
  
tuple1 = (0, 1, 2, 3) 
tuple1[0] = 4 # This line will raise a TypeError
print(tuple1) 


TypeError: 'tuple' object does not support item assignment

### 8.1 List Comprehensions

A very powerful and "Pythonic" way to create lists is by using a **list comprehension**. It offers a shorter syntax when you want to create a new list based on the values of an existing list or range.

In [20]:
# Example: Get a list of the first 10 square numbers

# Method 1: Regular 'for' loop
print("Regular for loop:")
squares_list = []
for x in range(10):
    squares_list.append(x**2)
print(squares_list)

# Method 2: List Comprehension (more concise, and often a little faster)
print("\nList Comprehension:")
squares_comp = [x**2 for x in range(10)]
print(squares_comp)

Regular for loop:
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
List Comprehension:
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]



### Challenge
1. Create a list of your favorite movies and print it.
2. Add a new movie to the list and print the updated list.
3. Create a tuple of your favorite books and print it.
4. Write a program that prints each fruit in the `fruits` list (from the example) on a new line using a `for` loop.
5. Use a list comprehension to create a list of all even numbers from 0 to 20.

In [8]:
# Solution 1: Favorite movies list
print("--- Solution 1 ---")


# Solution 2: Add a new movie
print("--- Solution 2 ---")


# Solution 3: Favorite books tuple
print("--- Solution 3 ---")


# Solution 4: Loop through fruits list
print("--- Solution 4 ---")


# Solution 5: List comprehension for evens
print("--- Solution 5 ---")


--- Solution 1 ---
--- Solution 2 ---
--- Solution 3 ---
--- Solution 4 ---
--- Solution 5 ---


---

## 9. Data Structures II: Dictionaries
Dictionaries are used to store data values in **key:value** pairs. They are created with `{}`. Dictionaries are mutable.

In [22]:
person = {
    "name": "Alice", # "name" is the key, "Alice" is the value
    "age": 25,
    "city": "New York"
}

# Access a value by its key
print(f"Alice's age is: {person['age']}")

# Get all keys and values
print(f"All keys: {person.keys()}")
print(f"All values: {person.values()}")

# Add a new key:value pair
print("Adding job...")
person["job"] = "Engineer"
print(f"Full dictionary: {person}")

Alice's age is: 25
All keys: dict_keys(['name', 'age', 'city'])
All values: dict_values(['Alice', 25, 'New York'])
Adding job...
Full dictionary: {'name': 'Alice', 'age': 25, 'city': 'New York', 'job': 'Engineer'}



### Challenge
1. Create a dictionary to store your favorite book's title, author, and year published.
2. Print only the author's name from the dictionary.
3. Add a new key-value pair for the 'genre' of the book.
4. Write a program that prints all keys in the dictionary using a `for` loop.
5. Write a program that prints all values in the dictionary using a `for` loop.

In [9]:
# Solution 1: Create dictionary
print("--- Solutions 1 & 2 ---")

# Solution 3: Add genre
print("--- Solution 3 ---")


# Solution 4: Loop over keys
print("--- Solution 4: Print Keys ---")

# Solution 5: Loop over values
print("--- Solution 5: Print Values ---")


--- Solutions 1 & 2 ---
--- Solution 3 ---
--- Solution 4: Print Keys ---
--- Solution 5: Print Values ---


---
## 10. Error Handling (Try / Except)

Sometimes, code can fail for reasons you can't predict (like bad user input). If an error (also called an **Exception**) occurs and nothing handles it, the program stops with an error message.

We use a `try...except` block to "catch" the error gracefully and handle it without crashing.

In [24]:
# Example 1: Handling bad user input

user_input = input("Enter a number: ")

try:
    num = float(user_input) # This line will FAIL if user enters 'hello'
    print(f"You entered a valid number: {num}")
except ValueError:
    # This block only runs IF a ValueError occurs in the 'try' block
    print(f"Error: '{user_input}' is not a valid number!")

Enter a number: 10


You entered a valid number: 10.0


In [25]:
# Example 2: Handling a math error

def safe_divide(a, b):
    try:
        result = a / b
        print(f"Attempting {a} / {b}... Result: {result}")
    except ZeroDivisionError:
        print(f"Attempting {a} / {b}... Error: Cannot divide by zero!")

safe_divide(10, 2)
safe_divide(10, 0) # This would normally crash the program

Attempting 10 / 2... Result: 5.0
Attempting 10 / 0... Error: Cannot divide by zero!


---
## 11. Modules (Math)

A **module** is simply a Python file (`.py`) containing functions, classes, and variables. We use the `import` statement to use the code from one module in another file.

Python has a large **standard library** of built-in modules we can use. `math` is a popular one for scientific computing.

In [26]:
import math # Import the entire module

# Now we must use 'math.' prefix to access its functions
print(f"Square root of 16 is: {math.sqrt(16)}")
print(f"Value of Pi: {math.pi}")
print(f"Sine of 90 degrees (pi/2 radians): {math.sin(math.pi / 2)}")

Square root of 16 is: 4.0
Value of Pi: 3.141592653589793
Sine of 90 degrees (pi/2 radians): 1.0


You can also import specific functions using `from` to avoid typing the module name.


In [27]:
from math import sqrt, cos, pi # Import only what we need

# Now we can call them directly without the 'math.' prefix
print(f"Square root of 64 is: {sqrt(64)}")
print(f"Cosine of Pi is: {cos(pi)}")

Square root of 64 is: 8.0
Cosine of Pi is: -1.0


---
## 12. Putting It All Together: 15 Scientific Examples

This section provides 15 practice problems that combine the concepts we've learned (variables, functions, loops, lists, dicts, etc.) in real-world scientific scenarios.

### 1. Simple Temperature Converter (Chemistry)

**Question:** Write a Python function that converts temperature from Celsius to both Fahrenheit and Kelvin.
- Fahrenheit = $ \text{Celsius} \times \frac{9}{5} + 32 $
- Kelvin = $ \text{Celsius} + 273.15 $
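
One possible solution sketch, assuming a single function that returns both converted values as a tuple (the name `convert_temperature` is illustrative, not part of the original problem):

```python
def convert_temperature(celsius):
    """Return (fahrenheit, kelvin) for a temperature given in Celsius."""
    fahrenheit = celsius * 9 / 5 + 32
    kelvin = celsius + 273.15
    return fahrenheit, kelvin


# Example: the boiling point of water
f, k = convert_temperature(100)
print(f"100 C = {f:.2f} F = {k:.2f} K")
```

Returning a tuple lets the caller unpack both results in one line, as in `f, k = convert_temperature(100)`.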

---

### 2. Gravitational Force Calculator (Physics)

**Question:** Create a function that calculates the gravitational force between two masses using the formula:
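$$
F = \frac{G \cdot m_1 \cdot m_2}{r^2}
$$
where $G = 6.674 \times 10^{-11}\ \text{N}\,\text{m}^2/\text{kg}^2$ is the gravitational constant, $m_1$ and $m_2$ are the masses in kilograms, and $r$ is the distance between them in metres.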
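
A minimal sketch of one way to write this, assuming SI units throughout (the function name `gravitational_force`, the distance check, and the example masses are illustrative choices):

```python
G = 6.674e-11  # gravitational constant, value taken from the problem statement


def gravitational_force(m1, m2, r):
    """Return the force in newtons between masses m1 and m2 (kg) separated by r (m)."""
    if r <= 0:
        # Assumption: treat a non-positive distance as invalid input
        raise ValueError("Distance r must be positive.")
    return G * m1 * m2 / r**2


# Example: two 1000 kg masses placed 10 m apart (illustrative values)
force = gravitational_force(1000, 1000, 10)
print(f"Force: {force:.3e} N")
```

The scientific-notation format `.3e` from Section 3.3 is convenient here because the force is very small.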

---

### 3. Prime Number Checker (Mathematics)

**Question:** Write a function `is_prime(number)` to check if a number is prime.

---

### 4. Fibonacci Sequence Generator (Mathematics)

**Question:** Write a function to generate the first `n` numbers in the Fibonacci sequence (where each number is the sum of the two preceding ones: 0, 1, 1, 2, 3, 5, 8...).
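
One way to approach this, sketched under the assumption that the sequence starts at 0 and is returned as a list (the name `fibonacci` is illustrative):

```python
def fibonacci(n):
    """Return a list with the first n Fibonacci numbers, starting 0, 1."""
    sequence = []
    a, b = 0, 1
    for _ in range(n):
        sequence.append(a)
        a, b = b, a + b  # update both values at once using tuple assignment
    return sequence


print(fibonacci(7))  # [0, 1, 1, 2, 3, 5, 8]
```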

---

### 5. pH Value Classifier (Chemistry)

**Question:** Write a program that takes a pH value and classifies it as acidic (<7), neutral (=7), or basic (>7).
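
A possible sketch using the `if-elif-else` pattern from Section 5. It is written as a function (the name `classify_ph` is illustrative) so it can be tried on several values without typing input each time:

```python
def classify_ph(ph):
    """Classify a pH value as 'acidic', 'neutral', or 'basic'."""
    if ph < 7:
        return "acidic"
    elif ph == 7:
        return "neutral"
    else:
        return "basic"


for sample in [2.5, 7, 9.1]:
    print(f"pH {sample}: {classify_ph(sample)}")
```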

---

### 6. Rock Classification (Geology)

**Question:** Create a program that classifies rocks. Store example rocks in tuples for each type (Igneous, Sedimentary, Metamorphic) and have the user input a rock name.

---

### 7. Projectile Motion Calculator (Physics)

**Question:** Write a program that calculates the range ($R$) and maximum height ($H$) of a projectile using the `math` module.
- $ R = \frac{v^2 \sin(2\theta)}{g} $
- $ H = \frac{v^2 \sin^2(\theta)}{2g} $

(where $g = 9.81\ \text{m/s}^2$, $v$ is the launch speed in m/s, and $\theta$ must be converted to radians: `math.radians(angle)`)
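
The two formulas can be sketched as a single function, assuming the speed is given in m/s and the angle in degrees (the name `projectile` and the example launch values are illustrative):

```python
import math

g = 9.81  # gravitational acceleration in m/s^2, value from the problem statement


def projectile(v, angle_deg):
    """Return (range, max_height) in metres for speed v (m/s) and launch angle in degrees."""
    theta = math.radians(angle_deg)  # trig functions in math expect radians
    horizontal_range = v**2 * math.sin(2 * theta) / g
    max_height = v**2 * math.sin(theta) ** 2 / (2 * g)
    return horizontal_range, max_height


R, H = projectile(20, 45)
print(f"Range: {R:.2f} m, Max height: {H:.2f} m")
```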

---

### 8. Matrix Multiplication (Mathematics)

**Question:** Write a function to perform matrix multiplication (dot product) for two 2x2 matrices (represented as lists of lists).
*(Note: This is much easier in NumPy, which we'll see later, but it's a great exercise in nested loops!)*
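
A sketch of the nested-loop approach, assuming both inputs are 2x2 lists of lists (the name `matmul_2x2` is illustrative):

```python
def matmul_2x2(A, B):
    """Multiply two 2x2 matrices given as lists of lists."""
    result = [[0, 0], [0, 0]]
    for i in range(2):          # row of A
        for j in range(2):      # column of B
            for k in range(2):  # sum over the shared index
                result[i][j] += A[i][k] * B[k][j]
    return result


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_2x2(A, B))  # [[19, 22], [43, 50]]
```

Each entry `result[i][j]` is the sum of products of row `i` of `A` with column `j` of `B`.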

---

### 9. Periodic Table Lookup (Chemistry)

**Question:** Create a program using a dictionary to store the symbols of the first 10 elements (Atomic numbers 1-10). Allow the user to look up a symbol by its atomic number.
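
A minimal sketch of the lookup, written as a function so it can be tested without user input. The names `ELEMENTS` and `lookup_symbol` are illustrative, and the sketch uses the dictionary method `.get()`, which returns `None` instead of raising an error when a key is missing:

```python
# Symbols of the first 10 elements, keyed by atomic number
ELEMENTS = {
    1: "H", 2: "He", 3: "Li", 4: "Be", 5: "B",
    6: "C", 7: "N", 8: "O", 9: "F", 10: "Ne",
}


def lookup_symbol(atomic_number):
    """Return the symbol for an atomic number from 1 to 10, or None if it is not stored."""
    return ELEMENTS.get(atomic_number)


print(lookup_symbol(6))   # C
print(lookup_symbol(11))  # None
```

To make it interactive, the atomic number could be read with `int(input(...))` as shown in Section 3.2.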

---

### 10. Earthquake Magnitude Classifier (Geology)

**Question:** Write a program that classifies earthquakes based on the Richter scale magnitude using `if-elif-else`.

---
### 11. Factorial Calculator (Mathematics)

**Question:** Write a function that calculates the factorial of a given number `n` using a loop.
$ n! = n \times (n-1) \times (n-2) \times \dots \times 1 $

---

### 12. Simulating Radioactive Decay (Physics)

**Question:** Write a program that calculates remaining atoms after radioactive decay using the formula:
$ N(t) = N_0 e^{-\lambda t} $
(Requires the `math` module for `math.exp()`, which is $e^x$)
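
A possible sketch of the decay formula (the name `remaining_atoms` and the sample numbers are illustrative). As a sanity check, it uses the standard relation that the half-life equals $\ln 2 / \lambda$, so about half of the atoms should remain after that time:

```python
import math


def remaining_atoms(n0, decay_constant, t):
    """Return N(t) = N0 * exp(-lambda * t)."""
    return n0 * math.exp(-decay_constant * t)


# Illustrative values: 1000 atoms with a decay constant of 0.1 per second
decay_constant = 0.1
half_life = math.log(2) / decay_constant  # time after which half the atoms remain
print(f"Atoms left after one half-life: {remaining_atoms(1000, decay_constant, half_life):.1f}")
```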

---

### 13. Distance Between Two Points (Mathematics)

**Question:** Write a program that calculates the distance between two points $(x_1, y_1)$ and $(x_2, y_2)$ using the formula:
$ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} $
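
The formula translates almost directly into code. A minimal sketch (the function name `distance` is illustrative):

```python
import math


def distance(x1, y1, x2, y2):
    """Return the straight-line distance between (x1, y1) and (x2, y2)."""
    return math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)


print(distance(0, 0, 3, 4))  # 5.0
```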

---

### 14. Planetary Weight Calculator (Physics)

**Question:** Write a program that calculates a person's weight on different planets using a dictionary of gravitational factors. 
$\text{Weight on planet} = \text{Weight on Earth} \times \text{gravitational factor}$
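
A sketch of the dictionary-based approach. The gravitational factors below are rounded, approximate values chosen for illustration, and the names `GRAVITY_FACTORS` and `weight_on_planets` are not from the original problem:

```python
# Approximate surface gravity relative to Earth (rounded, illustrative values)
GRAVITY_FACTORS = {"Mercury": 0.38, "Venus": 0.90, "Mars": 0.38, "Jupiter": 2.53}


def weight_on_planets(earth_weight):
    """Return a dictionary mapping each planet name to the weight on that planet."""
    weights = {}
    for planet in GRAVITY_FACTORS:
        weights[planet] = earth_weight * GRAVITY_FACTORS[planet]
    return weights


results = weight_on_planets(70)
for planet in results:
    print(f"{planet}: {results[planet]:.1f}")
```

Adding another planet only requires one more entry in the dictionary; the loop does not change.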

---

### 15. Unit Converter (General)

**Question:** Write a program that converts units. Use a dictionary to store conversion factors and `if-elif-else` to handle user choice.

---
## 13. Introduction to NumPy

NumPy (Numerical Python) is the foundational library for scientific computing in Python. Its main feature is the powerful **N-dimensional array (ndarray)** object. It is much faster than standard Python lists for numerical operations.

By convention, we almost always import it `as np`.

In [43]:
import numpy as np

### 13.1 Creating Arrays
You can create an array from a Python list, or using built-in NumPy functions.

In [44]:
# Creating a 1D array from a list
arr1 = np.array([1, 2, 3])
print("1D Array from list:\n", arr1)

# Creating a 2D array (a matrix)
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
print("\n2D Array (Matrix):\n", arr2)

# Creating arrays with zeros
zeros = np.zeros((2, 3))  # A 2x3 array of zeros (note the tuple (2, 3))
print("\nArray of zeros:\n", zeros)

# Creating arrays with a range of values (like Python's range())
arr_range = np.arange(0, 10, 2)  # From 0 up to 10 (exclusive), step 2
print("\nArray from range (0 to 9, step 2):\n", arr_range)

# Creating arrays with equally spaced numbers
lin_space = np.linspace(0, 1, 5)  # 5 numbers between 0 and 1 (inclusive)
print("\nArray of 5 evenly spaced numbers from 0 to 1:\n", lin_space)

1D Array from list:
[1 2 3]

2D Array (Matrix):
[[1 2 3]
 [4 5 6]]

Array of zeros:
[[0. 0. 0.]
 [0. 0. 0.]]

Array from range (0 to 9, step 2):
[0 2 4 6 8]

Array of 5 evenly spaced numbers from 0 to 1:
[0.   0.25 0.5  0.75 1.  ]


### 13.2 Array Attributes
You can inspect the properties of an array.

In [45]:
# Using arr2 from the previous cell
print(f"Shape (rows, cols): {arr2.shape}")
print(f"Dimensions: {arr2.ndim}")
print(f"Size (total elements): {arr2.size}")
print(f"Data type: {arr2.dtype}")

Shape (rows, cols): (2, 3)
Dimensions: 2
Size (total elements): 6
Data type: int64


### 13.3 Mathematical Operations (Element-wise)
This is NumPy's most powerful feature. Operations are applied **element-by-element** automatically, which is much faster than using a `for` loop.

In [46]:
arr_a = np.array([1, 2, 3])
arr_b = np.array([4, 5, 6])
print(f"arr_a: {arr_a}")
print(f"arr_b: {arr_b}")

# Element-wise operations
sum_arr = arr_a + arr_b
print(f"a + b = {sum_arr}")

prod_arr = arr_a * arr_b
print(f"a * b = {prod_arr}")

# Scalar operations (Broadcasting)
arr_scalar = arr_a * 2  # The '2' is broadcast to every element
print(f"a * 2 = {arr_scalar}")

# Universal functions (ufuncs)
sin_arr = np.sin(arr_a)  # Applies sine function to every element
print(f"sin(a) = {sin_arr}")

arr_a: [1 2 3]
arr_b: [4 5 6]
a + b = [5 7 9]
a * b = [ 4 10 18]
a * 2 = [2 4 6]
sin(a) = [0.84147098 0.90929743 0.14112001]


### 13.4 Statistical Functions

In [47]:
data = np.array([1, 2, 3, 4, 5])
print(f"Data: {data}")

# Can call as np.mean(data) OR data.mean()
print(f"Mean: {np.mean(data)}")
print(f"Std Dev: {np.std(data)}")
print(f"Median: {np.median(data)}")
print(f"Sum: {np.sum(data)}")

Data: [1 2 3 4 5]
Mean: 3.0
Std Dev: 1.4142135623730951
Median: 3.0
Sum: 15


### 13.5 Linear Algebra (Matrix Math)
While `*` is element-wise multiplication, `np.dot()` (or the `@` operator) performs matrix multiplication (dot product).

In [48]:
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print("A:\n", A)
print("B:\n", B)

# Matrix multiplication (compare this to the complex loops we wrote in Example 8!)
matmul = np.dot(A, B)
# An alternative, modern syntax is the '@' operator:
# matmul = A @ B
print("\nMatrix Product (A @ B):\n", matmul)

# Matrix inverse (using the 'linalg' submodule)
inv_A = np.linalg.inv(A)
print("\nInverse of A:\n", inv_A)

A:
 [[1 2]
 [3 4]]
B:
 [[5 6]
 [7 8]]
Matrix Product (A @ B):
 [[19 22]
 [43 50]]
Inverse of A:
 [[-2.   1. ]
 [ 1.5 -0.5]]


## Problems for practice (NumPy)

### Problem 1: Creating Arrays
Create a 3x3 matrix with values ranging from 1 to 9.

### Problem 2: Array Slicing
Given `arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])`, extract the second column (i.e., `[2, 5, 8]`).

### Problem 3: Arithmetic Operations
Create `arr1 = np.array([1, 2, 3])` and `arr2 = np.array([10, 20, 30])`. Add them, multiply them, and subtract arr1 from arr2.

### Problem 4: Broadcasting
Given `arr = np.array([[10, 20, 30], [40, 50, 60]])`, add 5 to every element in the array.

### Problem 5: Reshaping Arrays
Create a 1D array of numbers from 0 to 11 (using `np.arange`). Reshape this 1D array into a 3x4 2D array.

### Problem 6: Statistical Functions
Given `scores = np.array([55, 89, 76, 65, 93, 42, 67])`, calculate the mean, median, and max score.

### Problem 7: Random Numbers
Generate a 4x4 matrix of random numbers sampled from a normal distribution (mean=0, std=1).

### Problem 8: Matrix Multiplication
Given `A = np.array([[1, 2], [3, 4]])` and `B = np.array([[5, 6], [7, 8]])`, compute the dot product.

### Problem 9: Boolean Indexing
Given `arr = np.array([10, 25, 33, 45, 55, 67, 72, 89, 91])`, extract only the numbers greater than 50.

### Problem 10: Element-wise Conditional
Using `np.where()`, replace all elements in `arr = np.array([1, 4, 6, 8, 10])` that are greater than 5 with 0.

---
## 14. Introduction to Pandas

Pandas is one of the most widely used Python libraries for data analysis and manipulation. It is built on top of NumPy.
The two primary Pandas data structures are:
- **Series:** A 1D labeled array (like a NumPy array but with custom labels/index).
- **DataFrame:** A 2D labeled table with columns (like an Excel spreadsheet or SQL table). This is the structure you will use most often.
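
Since the rest of this section focuses on DataFrames, here is a brief sketch of a Series for comparison. The variable name `scores` and the labels are illustrative, reusing the names that appear in the DataFrame examples:

```python
import pandas as pd

# A Series is one column of values with a label (index) for each entry
scores = pd.Series([85, 90, 95], index=["Alice", "Bob", "Charlie"])
print(scores)

# Values can be looked up by label, much like a dictionary
print(scores["Bob"])
```

Selecting a single column from a DataFrame also returns a Series.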

In [10]:
import pandas as pd

### 14.1 Creating a DataFrame
The most common way is from a Python dictionary, where dictionary keys become column names.

In [11]:
# Creating a DataFrame from a dictionary
data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Score': [85, 90, 95]
}
df = pd.DataFrame(data)

# Displaying the DataFrame (Jupyter notebooks format this as a nice table)
df

      Name  Age  Score
0    Alice   25     85
1      Bob   30     90
2  Charlie   35     95


### 14.2 Loading Data from Files
A primary use of Pandas is loading data from files such as CSVs with `pd.read_csv()`. Because no `data.csv` ships with this notebook, the cell below first writes a small dummy file and then reads it back.

In [12]:
# Create a dummy CSV file just for this example
dummy_csv_content = "Name,Age,Score\nDavid,22,88\nEva,29,91"
with open("data.csv", "w") as f:
    f.write(dummy_csv_content)

# Reading a CSV file into a DataFrame
df_from_csv = pd.read_csv('data.csv')

print("DataFrame loaded from data.csv:")
print(df_from_csv)

DataFrame loaded from data.csv:
    Name  Age  Score
0  David   22     88
1    Eva   29     91


### 14.3 DataFrame Attributes & Info
Quickly inspecting your data.

In [13]:
# Using the 'df' created in Section 14.1
print(f"Shape (rows, cols): {df.shape}")
print(f"Columns: {df.columns}")

# .info() is great for seeing data types and missing values
print("\nInfo:")
df.info()

# .describe() gives a quick statistical summary of numerical columns
print("\nStatistical Summary:")
df.describe()

Shape (rows, cols): (3, 3)
Columns: Index(['Name', 'Age', 'Score'], dtype='object')

Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 3 entries, 0 to 2
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Name    3 non-null      object
 1   Age     3 non-null      int64 
 2   Score   3 non-null      int64 
dtypes: int64(2), object(1)
memory usage: 204.0+ bytes

Statistical Summary:


        Age  Score
count   3.0    3.0
mean   30.0   90.0
std     5.0    5.0
min    25.0   85.0
25%    27.5   87.5
50%    30.0   90.0
75%    32.5   92.5
max    35.0   95.0


### 14.4 Selecting Data (Columns & Rows)
Selecting columns and filtering rows are among the operations you will use most often in Pandas.

In [14]:
# Selecting a single column (this returns a Series)
print("--- Selecting a single column (returns a Series) ---")
ages = df['Age']
print(ages)

# Selecting multiple columns (note the double brackets [[...]])
print("\n--- Selecting multiple columns (returns a DataFrame) ---")
subset = df[['Name', 'Score']]
display(subset) # display() is richer than print() in Jupyter

# Selecting rows by condition (Boolean Indexing)
print("\n--- Filtering Rows (Boolean Indexing) --- ")
# Step 1: Create the condition: df['Age'] > 25
# Step 2: Pass the condition back into the DataFrame
older_than_25 = df[df['Age'] > 25]
display(older_than_25)

--- Selecting a single column (returns a Series) ---
0    25
1    30
2    35
Name: Age, dtype: int64

--- Selecting multiple columns (returns a DataFrame) ---


      Name  Score
0    Alice     85
1      Bob     90
2  Charlie     95



--- Filtering Rows (Boolean Indexing) --- 


      Name  Age  Score
1      Bob   30     90
2  Charlie   35     95


### 14.5 Creating New Columns

In [15]:
# We can create a new column based on a condition on other columns
df['Passed'] = df['Score'] >= 90 # Passing score is 90
display(df)

# Note: Alice now shows False, as her score (85) is not >= 90. 
# Let's fix that. Let's say passing is > 80.
df['Passed'] = df['Score'] > 80
display(df)

      Name  Age  Score  Passed
0    Alice   25     85   False
1      Bob   30     90    True
2  Charlie   35     95    True

      Name  Age  Score  Passed
0    Alice   25     85    True
1      Bob   30     90    True
2  Charlie   35     95    True


### 14.6 Handling Missing Data
Real-world data is messy. Pandas uses `NaN` (Not a Number) to represent missing data. We can drop it or fill it.

In [16]:
import numpy as np  # needed for np.nan; must be imported in this session

data_missing = {
    'Name': ['Anna', 'Brian', 'Cathy'],
    'Age': [23, np.nan, 35], # np.nan represents a missing value
    'Score': [85, 88, np.nan]
}
df_messy = pd.DataFrame(data_missing)
print("Original DataFrame with missing data:")
display(df_messy)

# Option 1: Drop rows with any NaN values
df_dropped = df_messy.dropna()
print("\nDataFrame after dropping rows with any missing data:")
display(df_dropped)

# Option 2: Fill missing values
# We can fill 'Age' with the mean age, and 'Score' with 0
age_mean = df_messy['Age'].mean() # Calculate mean age (ignores NaN)
df_filled = df_messy.fillna({"Age": age_mean, "Score": 0})
print("\nDataFrame after filling missing data:")
display(df_filled)

Original DataFrame with missing data:
    Name   Age  Score
0   Anna  23.0   85.0
1  Brian   NaN   88.0
2  Cathy  35.0    NaN

DataFrame after dropping rows with any missing data:
   Name   Age  Score
0  Anna  23.0   85.0

DataFrame after filling missing data:
    Name   Age  Score
0   Anna  23.0   85.0
1  Brian  29.0   88.0
2  Cathy  35.0    0.0

### 14.7 Grouping and Aggregation (GroupBy)
This is one of the most powerful features of Pandas. It allows you to split data into groups, apply a function to each group, and combine the results (the split-apply-combine pattern).

In [17]:
df_teams = pd.DataFrame({
    'Team': ['A', 'B', 'A', 'B', 'A'],
    'Player': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Score': [88, 95, 82, 91, 89]
})
print("Original Data:")
display(df_teams)

# Group by the 'Team' column, then calculate the mean of the remaining numerical columns ('Score')
team_avg = df_teams.groupby('Team').mean(numeric_only=True)
print("\nAverage Score per Team:")
display(team_avg)

Original Data:


  Team   Player  Score
0    A    Alice     88
1    B      Bob     95
2    A  Charlie     82
3    B    David     91
4    A      Eva     89



Average Score per Team:


          Score
Team
A     86.333333
B     93.000000


## Problems for practice (Pandas)

### Problem 1: Creating a DataFrame
Create a DataFrame from this dict and print it:
```python
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eva'],
    'Age': [25, 30, 35, 40, 45],
    'Score': [85, 90, 95, 100, 88]
}
```

### Problem 2: Selecting Columns
Using the DataFrame from Problem 1, select and print only the `Name` and `Age` columns.

### Problem 3: Filtering Rows
Using the DataFrame from Problem 1, filter the rows and display only those where `Score` is greater than 90.

### Problem 4: Adding a New Column
Using the DataFrame from Problem 1, add a new column named `Grade` that contains 'A' if `Score` >= 90, and 'B' if not. (Hint: use `np.where(condition, 'A', 'B')`).

### Problem 5: Grouping and Aggregation
Given this DataFrame:
```python
df_group = pd.DataFrame({
    'Team': ['A', 'B', 'A', 'B', 'A', 'B'],
    'Score': [88, 95, 82, 91, 89, 87]
})
```
Group by `Team` and calculate the average `Score` AND the sum of `Score` for each team.

### Problem 6: Sorting Data
Using the DataFrame from Problem 1, sort the DataFrame by `Score` in descending order (highest score first) and print the result.

---
## Congratulations!

You have completed the Python basics course, covering core syntax, data structures, functions, and the fundamentals of the two most important libraries for scientific computing: **NumPy** and **Pandas**. You are now ready to tackle more complex data analysis tasks!