Python Programming — Data Science, Web Development, and Automation

Python: The Swiss Army Knife

The Language That Quietly Took Over Everything

Instagram: Python. Spotify's recommendation engine: Python. Netflix's data pipeline: Python. YouTube's original codebase: Python. NASA uses it for mission planning. Hedge funds use it for trading algorithms. Biologists use it to sequence DNA. It is the second most popular programming language in the world and the most taught in universities -- not because it is the fastest or the most powerful, but because it reads like English and gets out of your way so you can focus on the problem, not the syntax.

That last point matters more than any benchmark. Most programming languages force you to spend half your mental energy on the language itself -- semicolons, curly braces, type declarations, boilerplate. Python strips all of that away. A Python program to read a file and print its contents is three lines. The equivalent in Java is fifteen. The equivalent in C is twenty-five. The programs do the same thing. The difference is how much ceremony the language demands before it lets you do your work.

This is why Python became the default language for people who are not professional software engineers but need to get things done with code -- data scientists, physicists, financial analysts, biologists, journalists, and students. It is also why it became the dominant language for artificial intelligence, machine learning, and data science. When your job is to solve problems, not to wrestle with syntax, Python wins.

#2: Position on the TIOBE Index -- the most widely tracked language popularity ranking
1991: Year Guido van Rossum released Python -- older than Java, JavaScript, and C#
500K+: Packages on PyPI, the Python Package Index -- pre-built tools for nearly any task
48.07%: Stack Overflow developers who want to work with Python -- highest of any language

Why Python Won

In 1989, a Dutch programmer named Guido van Rossum was bored during Christmas week. He wanted a side project -- a new programming language that would be fun to use. Not powerful in the academic sense. Not optimized for speed. Fun. Readable. Something that felt like writing pseudocode but actually ran. He named it after Monty Python's Flying Circus, the BBC comedy series, because he wanted the language to feel lighthearted.

That philosophy became Python's design principle, codified in a document called "The Zen of Python" (you can see it by typing import this in any Python interpreter). The most important line: "There should be one -- and preferably only one -- obvious way to do it." Where other languages offer several competing ways to write the same loop, Python steers you toward one obvious form. Where other languages mark code blocks with braces and leave indentation to taste, Python makes indentation part of the syntax. These constraints feel restrictive at first, but they produce code that is consistent, readable, and maintainable -- even when written by someone else, even months later.

Compare the same operation in three languages -- printing each item from a list:

Python:

fruits = ["apple", "banana", "cherry"]
for fruit in fruits:
    print(fruit)

JavaScript:

const fruits = ["apple", "banana", "cherry"];
for (const fruit of fruits) {
    console.log(fruit);
}

Java:

import java.util.List;
import java.util.Arrays;

public class Main {
    public static void main(String[] args) {
        List<String> fruits = Arrays.asList("apple", "banana", "cherry");
        for (String fruit : fruits) {
            System.out.println(fruit);
        }
    }
}

Same result. Python does it in three lines with zero boilerplate. Java requires a class declaration, a main method, imports, type annotations, and a semicolon after every statement -- just to print three words. This is not a criticism of Java. Java's verbosity serves a purpose in large-scale enterprise systems. But when you are a beginner trying to learn what a loop is, or a data scientist who needs to process a CSV file before lunch, that overhead is pure friction.

Key Insight

Python did not win because it is the fastest language, the most feature-rich, or the most theoretically elegant. It won because it minimizes the distance between thinking about a solution and implementing it. In a world where most code is written by people who are not full-time software engineers, that matters more than raw performance.

Python in 15 Minutes

This is not a replacement for the Programming Fundamentals page -- it is Python-specific syntax. If you have read that page, everything below will feel familiar. You are just seeing the Python version.

Variables

No type declarations. No semicolons. Just name, equals sign, value.

# Numbers
price = 29.99
quantity = 3
total = price * quantity  # 89.97

# Strings (text)
name = "Hozaki"
greeting = f"Welcome to {name}"  # f-strings embed variables inside text

# Booleans
is_active = True
is_expired = False
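
f-strings also accept a format specifier after a colon, which is how later examples on this page print currency and percentages. A minimal sketch follows; the variable names and values are made up for illustration.

```python
# Illustrative values only; these names are assumptions for this sketch.
revenue = 1234567.891
conversion_rate = 0.8765
item = "widget"

# :,.2f -> thousands separators and two decimal places
revenue_text = f"Revenue: ${revenue:,.2f}"

# :.2% -> multiply by 100 and append a percent sign
rate_text = f"Conversion: {conversion_rate:.2%}"

# :>10 -> right-align inside a field ten characters wide
aligned = f"[{item:>10}]"

print(revenue_text)  # Revenue: $1,234,567.89
print(rate_text)     # Conversion: 87.65%
print(aligned)       # [    widget]
```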

Strings

Text manipulation is one of Python's greatest strengths. Methods are built into every string object.

message = "  Hello, World!  "
message.strip()          # "Hello, World!" -- removes whitespace
message.lower()          # "  hello, world!  "
message.replace("World", "Python")  # "  Hello, Python!  "
message.split(",")       # ["  Hello", " World!  "]

# Check content
"Hello" in message       # True
message.startswith("  H")  # True
len(message)             # 17

Lists

Ordered collections. The most-used data structure in Python.

scores = [85, 92, 78, 95, 88]
scores.append(91)        # Add to end: [85, 92, 78, 95, 88, 91]
scores.sort()            # Sort in place: [78, 85, 88, 91, 92, 95]
scores[0]                # First element: 78
scores[-1]               # Last element: 95
len(scores)              # 6

# List comprehension -- Python's most iconic feature
doubled = [s * 2 for s in scores]        # [156, 170, 176, 182, 184, 190]
passing = [s for s in scores if s >= 85]  # [85, 88, 91, 92, 95]

Dictionaries

Key-value pairs. Like a real dictionary: look up a word (key) and get its definition (value).

user = {
    "name": "Alice",
    "age": 28,
    "email": "alice@example.com",
    "is_premium": True
}

user["name"]             # "Alice"
user["age"] = 29         # Update a value
user["city"] = "Tokyo"   # Add a new key

# Loop through a dictionary
for key, value in user.items():
    print(f"{key}: {value}")

Conditionals

No parentheses required around the condition. No curly braces. Just a colon and indentation.

temperature = 35

if temperature > 30:
    print("Too hot")
elif temperature > 20:
    print("Just right")
else:
    print("Too cold")

Loops

# For loop -- iterate over any collection
for i in range(5):
    print(i)  # Prints 0, 1, 2, 3, 4

# While loop -- keep going until a condition is false
attempts = 0
while attempts < 3:
    print(f"Attempt {attempts + 1}")
    attempts += 1

Functions

Define once, call anywhere. Parameters in, return value out.

def calculate_tax(price, rate=0.08):
    """Calculate tax on a given price."""
    tax = price * rate
    return round(tax, 2)

calculate_tax(100)         # 8.0 (uses default rate)
calculate_tax(100, 0.10)   # 10.0 (custom rate)
calculate_tax(49.99)       # 4.0

Real-World Example

That triple-quoted string inside the function ("""Calculate tax on a given price.""") is called a docstring. Every serious Python project uses them. They are not comments -- they are documentation that tools can extract automatically. When you type help(calculate_tax) in the interpreter, Python displays that docstring. Django, Flask, NumPy, and every major library document their entire API this way.
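
As a small illustration of how tools get at docstrings, the sketch below reads the same text through the __doc__ attribute and through inspect.getdoc, which is roughly what help() draws on. The function is a copy of the one above; nothing else is assumed.

```python
import inspect

def calculate_tax(price, rate=0.08):
    """Calculate tax on a given price."""
    tax = price * rate
    return round(tax, 2)

# The docstring is stored on the function object itself.
raw_doc = calculate_tax.__doc__

# inspect.getdoc returns the same text with indentation cleaned up,
# which matters for multi-line docstrings.
clean_doc = inspect.getdoc(calculate_tax)

# inspect.signature exposes the parameters and their defaults.
signature = str(inspect.signature(calculate_tax))

print(raw_doc)    # Calculate tax on a given price.
print(signature)  # (price, rate=0.08)
```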

What Python Is Great At

Python is not a specialist. It is a generalist that happens to be world-class in several specific domains. Here are the five areas where Python dominates, each with real code you could run today.

Data Science and Analysis

This is where roughly 70% of Python usage in business lives. Reading data, cleaning it, analyzing it, and presenting results. The two core libraries are pandas (data manipulation) and NumPy (numerical computation).

import pandas as pd

# Read a CSV file into a DataFrame
sales = pd.read_csv("sales_2026.csv")

# Show first 5 rows
print(sales.head())

# Filter: only rows where revenue exceeds $10,000
big_deals = sales[sales["revenue"] > 10000]

# Group by sales rep, calculate average deal size
rep_performance = sales.groupby("rep_name")["revenue"].mean()

# Sort by revenue descending
top_reps = rep_performance.sort_values(ascending=False)

print(top_reps)

That is seven lines of code (comments aside) to load a spreadsheet, filter it, group it, calculate averages, and rank the results. The same analysis in Excel requires pivot tables, manual formula ranges, and three times the effort. At scale, Excel hits a hard limit: a worksheet holds at most 1,048,576 rows. Pandas handles millions of rows as long as they fit in memory.

Web Development

Django is the full-featured web framework. Flask is the lightweight alternative. Together they power millions of websites.

from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/api/greeting")
def greeting():
    return jsonify({"message": "Hello from Python!", "status": "ok"})

if __name__ == "__main__":
    app.run(debug=True)

Seven lines. A working web API that listens for requests and returns JSON. That is Flask. (The debug=True flag is meant for local development only.) Django adds an admin panel, user authentication, an ORM for database access, and dozens of other features out of the box.

Automation and Scripting

Renaming thousands of files. Sending automated emails. Scraping product prices from websites. Moving data between systems. Python is the duct tape of the programming world.

import os

# Rename all .jpeg files to .jpg in a directory
folder = "/Users/alice/photos"
count = 0
for filename in os.listdir(folder):
    if filename.endswith(".jpeg"):
        old_path = os.path.join(folder, filename)
        new_path = os.path.join(folder, filename[:-len(".jpeg")] + ".jpg")  # swap only the extension
        os.rename(old_path, new_path)
        count += 1

print(f"Renamed {count} files")

Ten lines to rename 5,000 files. Without Python, someone would spend an afternoon doing this by hand, one right-click at a time.
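
A slightly more defensive variant can be sketched with pathlib from the standard library. It changes only the final extension rather than every ".jpeg" substring in the name, and it skips a file when the target name already exists. The function name and the temporary demo folder are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

def rename_jpeg_to_jpg(folder):
    """Rename *.jpeg files in folder to *.jpg and return how many were renamed."""
    count = 0
    # sorted() materializes the matches before any file is renamed.
    for path in sorted(Path(folder).glob("*.jpeg")):
        target = path.with_suffix(".jpg")  # replaces only the final extension
        if target.exists():
            continue  # never overwrite an existing file
        path.rename(target)
        count += 1
    return count

# Demo in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as demo_folder:
    for name in ["a.jpeg", "b.jpeg", "my.jpeg.backup.jpeg", "notes.txt"]:
        (Path(demo_folder) / name).touch()

    renamed = rename_jpeg_to_jpg(demo_folder)
    remaining = sorted(p.name for p in Path(demo_folder).iterdir())

print(renamed)    # 3
print(remaining)  # ['a.jpg', 'b.jpg', 'my.jpeg.backup.jpg', 'notes.txt']
```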

Artificial Intelligence and Machine Learning

TensorFlow (Google), PyTorch (Meta), and scikit-learn are all Python-first. The entire modern AI stack runs on Python.

from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
import pandas as pd

# Load data
data = pd.read_csv("customer_churn.csv")
X = data.drop("churned", axis=1)  # Features
y = data["churned"]               # Target label

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Train a model
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)

# Test accuracy
predictions = model.predict(X_test)
print(f"Accuracy: {accuracy_score(y_test, predictions):.2%}")

Twelve lines to train a machine learning model that predicts whether a customer will cancel their subscription. The same kind of model that Netflix uses to decide which shows to recommend and that banks use to detect fraud.

Web Scraping

BeautifulSoup and requests turn any webpage into structured data.

import requests
from bs4 import BeautifulSoup

response = requests.get("https://news.ycombinator.com")
soup = BeautifulSoup(response.text, "html.parser")

# Extract all article titles
titles = soup.select(".titleline > a")
for i, title in enumerate(titles[:10], 1):
    print(f"{i}. {title.text}")

Seven lines to download a webpage, parse its HTML, and extract the top 10 headlines. Journalists use this to monitor news sources. Researchers use it to collect data. Businesses use it to track competitor pricing.

What Python Is Bad At

No honest guide pretends a tool is perfect. Python has real limitations, and ignoring them wastes your time.

Where Python Excels

Data science: pandas, NumPy, Matplotlib -- the entire ecosystem is built for Python. No other language comes close in breadth of data tools.

Web backends: Django powers Instagram (2B+ users). Flask powers Netflix's internal tools. Proven at massive scale.

Machine learning: TensorFlow, PyTorch, scikit-learn, Hugging Face -- all Python-first. The AI revolution runs on Python.

Automation: System scripts, file processing, API integration. Python replaces hours of manual work in minutes.

Prototyping: Get from idea to working code faster than any other mainstream language.

Where Python Struggles

Mobile apps: Kivy and BeeWare exist but are niche. Real mobile apps use Swift (iOS), Kotlin (Android), or React Native/Flutter.

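Browser code: Browsers do not run Python natively. JavaScript is the standard language for browser code; projects such as Pyodide and PyScript run Python in the browser through WebAssembly, but they remain niche.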

AAA game engines: Too slow for real-time rendering. Game engines use C++ (Unreal) or C# (Unity). Python is used for game tooling, not game loops.

Real-time systems: Garbage collection causes unpredictable pauses. Embedded systems and high-frequency trading use C/C++ or Rust.

Raw computation speed: 10-100x slower than C/C++ for CPU-intensive tasks. Mitigated by calling C libraries (NumPy, TensorFlow).

The speed limitation deserves extra clarity. Python is usually described as an interpreted language: the standard interpreter, CPython, compiles your code to bytecode and then executes that bytecode step by step at runtime, rather than compiling it to machine code in advance. This makes it flexible and easy to debug but inherently slower than compiled languages like C, Rust, or Go. For most tasks -- web servers, data analysis, scripting -- this does not matter. The bottleneck is the network, the disk, or the database, not the CPU. But when the task is pure computation -- physics simulations, video encoding, cryptographic operations -- Python's raw speed becomes a real constraint.
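
To make the runtime model concrete: CPython turns a function into bytecode instructions and its interpreter loop executes them one at a time. The dis module from the standard library shows those instructions. The function below is a made-up example, and the exact instruction names vary between Python versions.

```python
import dis

def add_tax(price):
    # Hypothetical example function, used only to have something to inspect.
    return price * 1.08

# Each Instruction is one step the interpreter loop executes at runtime.
instructions = list(dis.get_instructions(add_tax))
opnames = [ins.opname for ins in instructions]

print(len(instructions) > 0)  # True
dis.dis(add_tax)              # prints the full bytecode listing
```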

The workaround that the entire Python ecosystem relies on: write the speed-critical parts in C and call them from Python. NumPy is a Python library, but its core is written in C and Fortran. TensorFlow is a Python library, but its core is C++ and CUDA. You write comfortable Python code, and under the hood, C is doing the heavy lifting. This is sometimes called Python's "glue language" role -- it connects fast components and orchestrates their work.
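
The effect is visible even without third-party libraries, because many built-ins are themselves implemented in C in the standard CPython interpreter. The sketch below times a hand-written loop against the built-in sum. Timings depend on the machine, so no specific numbers are claimed here.

```python
import timeit

numbers = list(range(100_000))

def python_loop_sum(values):
    """Add the values with an explicit loop executed by the interpreter."""
    total = 0
    for value in values:
        total += value
    return total

# Both approaches must agree before their speed is compared.
assert python_loop_sum(numbers) == sum(numbers)

loop_seconds = timeit.timeit(lambda: python_loop_sum(numbers), number=20)
builtin_seconds = timeit.timeit(lambda: sum(numbers), number=20)

print(f"explicit loop: {loop_seconds:.4f} s")
print(f"built-in sum:  {builtin_seconds:.4f} s")
```

On CPython the built-in usually comes out noticeably faster. NumPy applies the same idea to whole arrays.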

Libraries: Standing on the Shoulders of 500,000 Packages

The single most important command in Python is not print or for. It is pip install.

pip install requests

That one command downloads the requests library from PyPI (the Python Package Index) and installs it on your machine. Now you can fetch any webpage in three lines:

import requests

response = requests.get("https://api.github.com/users/octocat")
print(response.json()["name"])  # "The Octocat"

Without requests, you would reach for the standard library's urllib, which works but leaves more of the details (JSON decoding, sessions, retries, friendlier errors) to you. Going lower still, to raw sockets, means 50+ lines of socket management, HTTP header construction, SSL certificate handling, and response parsing. With requests: three lines. That is the power of the Python ecosystem. You do not build everything from scratch. You import what others have built and focus on your actual problem.
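
For comparison, here is a sketch of the same kind of request using only the standard library's urllib. The helper name fetch_json is an assumption, and the demo uses a data: URL so that it runs without network access. Substituting an https:// address such as the GitHub one above would be the real-world use.

```python
import json
import urllib.parse
import urllib.request

def fetch_json(url, timeout=10):
    """Fetch a URL and decode the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8")
    return json.loads(body)

# Offline demo: a data: URL carries its own payload, so no server is contacted.
payload = json.dumps({"name": "The Octocat"})
demo_url = "data:application/json," + urllib.parse.quote(payload)

profile = fetch_json(demo_url)
print(profile["name"])  # The Octocat
```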

[Figure: The Python Ecosystem -- major libraries organized by domain (500K+ packages). Data Science: pandas, NumPy, Matplotlib, Jupyter. Web Dev: Django, Flask, FastAPI, SQLAlchemy. AI / ML: TensorFlow, PyTorch, scikit-learn, Hugging Face. Automation: Selenium, BeautifulSoup, requests, Paramiko.]
Python's ecosystem spans four major domains. Each domain has mature, battle-tested libraries maintained by thousands of contributors. You do not choose Python and then find libraries -- you find the library you need and it happens to be Python.

PyPI (the Python Package Index) hosts over 500,000 packages. Some perspective on what that means: there is a package to generate fake data for testing (faker). There is a package to convert speech to text (SpeechRecognition). There is a package to control a Raspberry Pi's GPIO pins (RPi.GPIO). There is a package to generate PDF invoices, parse Excel spreadsheets, send SMS messages, connect to every major database, interact with every major cloud platform, and do things you have never thought of. Whatever your problem is, someone has probably built a library for it.

Write code → Need a library → pip install library → import library → Use it

Python vs. JavaScript vs. Java

These three languages cover most of the world's professional programming. They overlap in some areas but have fundamentally different philosophies and strengths. Choosing between them is not about which is "better" -- it is about which is right for what you want to build.

Python

Typing: Dynamic -- variables can change type at runtime. x = 5 then x = "hello" is valid.

Syntax example:

def greet(name):
    return f"Hello, {name}"

Speed: Slowest of the three. 10-100x slower than Java for CPU-bound tasks. Fast enough for 95% of real workloads.

Primary domains: Data science, AI/ML, automation, web backends, scientific computing.

Learning curve: Gentlest. Reads like English. Beginners write useful code in days.

Job market: Dominant in data science, AI, and academia. Strong in backend web. Growing everywhere.

JavaScript

Typing: Dynamic with quirks. "5" + 3 = "53" but "5" - 3 = 2. TypeScript adds optional static typing.

Syntax example:

function greet(name) {
    return `Hello, ${name}`;
}

Speed: Middle. V8 engine (JIT compiled) is significantly faster than Python. Slower than Java.

Primary domains: Frontend web (only option), full-stack web (Node.js), mobile (React Native).

Learning curve: Moderate. Easy to start, but quirks and asynchronous patterns trip up beginners.

Job market: Largest overall demand. Every web company needs JavaScript. Highest job count globally.

Java

Typing: Static -- every variable must declare its type. String name = "Alice"; Catches errors at compile time.

Syntax example:

public String greet(String name) {
    return "Hello, " + name;
}

Speed: Fastest of the three. JVM with JIT compilation approaches C++ speed for long-running processes.

Primary domains: Enterprise backends, Android (legacy), banking/finance, large-scale distributed systems.

Learning curve: Steepest. Verbose syntax, OOP concepts, build tools, and type system all required upfront.

Job market: Massive in enterprise and finance. High salaries but often corporate environments. Steady demand.

Summary: Which to Choose

Learn Python first if: You want data science, AI, automation, or the gentlest entry point into programming.

Learn JavaScript first if: You want to build websites, interactive UIs, or full-stack web applications.

Learn Java first if: You are targeting enterprise jobs, Android development, or large-scale distributed systems.

The truth: Most professional developers learn all three eventually. Your first language is not your last. Pick the one that matches what you want to build right now.

Working with Data in Python: A Real Mini-Project

Let's do what 70% of Python usage in business actually looks like: load data, clean it, analyze it, and extract insights. Imagine you have a CSV file of sales data from a small e-commerce store.

import pandas as pd

# Load the sales data
sales = pd.read_csv("sales_data.csv")

# Inspect the data
print(sales.shape)          # (2847, 6) -- 2847 rows, 6 columns
print(sales.columns.tolist())
# ['date', 'product', 'category', 'quantity', 'price', 'region']

# Check for missing values
print(sales.isnull().sum())
# date        0
# product     3
# category    0
# quantity    0
# price       12
# region      0

# Drop rows with missing prices (can't calculate revenue without them)
sales = sales.dropna(subset=["price"])

# Create a revenue column
sales["revenue"] = sales["quantity"] * sales["price"]

# Total revenue
print(f"Total revenue: ${sales['revenue'].sum():,.2f}")

# Revenue by category
by_category = sales.groupby("category")["revenue"].sum().sort_values(ascending=False)
print("\nRevenue by category:")
print(by_category)

# Top 10 products by units sold
top_products = sales.groupby("product")["quantity"].sum().nlargest(10)
print("\nTop 10 products by units sold:")
print(top_products)

# Average order value by region
avg_by_region = sales.groupby("region")["revenue"].mean().round(2)
print("\nAverage order value by region:")
print(avg_by_region)

# Monthly trend
sales["date"] = pd.to_datetime(sales["date"])
monthly = sales.groupby(sales["date"].dt.to_period("M"))["revenue"].sum()
print("\nMonthly revenue trend:")
print(monthly)

Roughly forty lines, comments included. That script loads a spreadsheet, cleans it, calculates revenue, breaks it down by category, finds the best-selling products, compares regions, and shows a monthly trend. A business analyst doing this in Excel would need pivot tables, VLOOKUP formulas, and probably an hour. This runs in under a second and is repeatable -- run it next month with new data and you get updated results instantly.

Key Insight

The real power of Python for data work is not any single operation -- it is the pipeline. Load, clean, transform, analyze, visualize, export. Each step is a few lines. The pipeline runs end to end in seconds. And because it is code, it is version-controlled, testable, and repeatable. A spreadsheet analysis dies the moment someone accidentally deletes a formula. A Python script runs the same way every time.
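
The load, clean, transform, analyze shape can be sketched as small functions. This version uses only the standard library's csv module so that it runs without pandas. The sample rows and function names are invented for illustration and are not the sales_data.csv file described above.

```python
import csv
import io
from collections import defaultdict

# Invented sample data standing in for a CSV file on disk.
SAMPLE_CSV = """category,quantity,price
books,2,10.00
books,1,
games,3,20.00
"""

def load(text):
    """Load: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def clean(rows):
    """Clean: drop rows with a missing price."""
    return [row for row in rows if row["price"].strip()]

def revenue_by_category(rows):
    """Transform and analyze: sum quantity * price per category."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["category"]] += int(row["quantity"]) * float(row["price"])
    return dict(totals)

result = revenue_by_category(clean(load(SAMPLE_CSV)))
print(result)  # {'books': 20.0, 'games': 60.0}

# Because each step is a function, each step can be checked in isolation.
assert len(clean(load(SAMPLE_CSV))) == 2
```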

Real-World Python: Three Stories at Scale

Instagram: 2 Billion Users on Django

Instagram is the largest deployment of the Django web framework in the world. When they were acquired by Facebook in 2012 for $1 billion, their team was 13 people. Thirteen employees serving 30 million users. They chose Python and Django because it let a tiny team build features fast. By the time they reached a billion users, they had invested so heavily in their Python infrastructure that switching languages would have cost more than optimizing what they had.

Their solution: not rewriting in a faster language, but optimizing Python itself. Instagram's engineering team contributed directly to CPython (the standard Python implementation), improving garbage collection and memory usage. They also used Cython -- a Python superset that compiles to C -- for their most performance-critical code paths. The lesson: Python's speed ceiling is higher than people assume, because you can always drop into C where it matters.

Real-World Example

Instagram's engineering blog documented that their Django servers handle over 500 million daily active users with Python. When they optimized their Python garbage collector, they reduced memory usage by 10% across their entire fleet -- which, at their scale, saved tens of millions of dollars in server costs per year. A 10% memory improvement on a million servers is not a trivial optimization. It is a budget line item visible to the CFO.

Spotify: Machine Learning Across 400 Million Playlists

Every Monday morning, 400 million Spotify users open their "Discover Weekly" playlist to find 30 songs they have never heard but are likely to love. That playlist is generated by a Python-based machine learning pipeline that analyzes listening history, collaborative filtering data (what similar users listen to), and audio features extracted from the songs themselves.

Spotify's data infrastructure team uses Python at every stage: data ingestion (pulling listening events from their servers), feature engineering (transforming raw data into meaningful signals), model training (building and updating recommendation models), and serving (delivering personalized playlists to each user). They use Luigi -- an open-source Python workflow engine that Spotify itself built and released -- to orchestrate these data pipelines. Luigi manages task dependencies, handles failures gracefully, and ensures that every step runs in the correct order.

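# Simplified concept of how Spotify might structure a recommendation task.
# fetch_user_history, recommendation_model, and save_playlist are hypothetical
# placeholders that you would define yourself; they are not part of Luigi.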
import luigi
import pandas as pd

class ExtractListeningHistory(luigi.Task):
    user_id = luigi.Parameter()

    def output(self):
        return luigi.LocalTarget(f"data/{self.user_id}_history.csv")

    def run(self):
        # Fetch listening events from database
        history = fetch_user_history(self.user_id)
        history.to_csv(self.output().path, index=False)

class GenerateRecommendations(luigi.Task):
    user_id = luigi.Parameter()

    def requires(self):
        return ExtractListeningHistory(user_id=self.user_id)

    def run(self):
        history = pd.read_csv(self.input().path)
        recommendations = recommendation_model.predict(history)
        save_playlist(self.user_id, recommendations)
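
The core idea behind a workflow engine, running each task only after the tasks it depends on, can be illustrated without Luigi. The sketch below uses graphlib.TopologicalSorter from the standard library (Python 3.9+). It is not how Luigi is implemented, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter

# Map each task to the set of tasks that must finish before it can start.
# Task names are hypothetical stand-ins for pipeline stages.
dependencies = {
    "extract_history": set(),
    "build_features": {"extract_history"},
    "train_model": {"build_features"},
    "generate_playlist": {"train_model", "build_features"},
}

# static_order() yields tasks so that every dependency comes first.
run_order = list(TopologicalSorter(dependencies).static_order())

for step, task in enumerate(run_order, start=1):
    print(f"{step}. {task}")
```

A real engine adds what this sketch leaves out: persisted outputs, retries, and skipping tasks whose output already exists.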

JPMorgan: 360,000 Hours of Lawyer Work Replaced

In 2017, JPMorgan Chase deployed a Python-based system called COIN (Contract Intelligence) that reviews commercial loan agreements. The task -- interpreting 12,000 commercial credit agreements per year -- previously required 360,000 hours of work by lawyers and loan officers. COIN does it in seconds.

The system uses natural language processing (NLP) libraries to parse legal documents, extract key terms, identify risks, and flag anomalies. What makes this story significant is not just the automation -- it is the error rate. COIN makes fewer mistakes than the human reviewers it replaced. Legal document review is tedious, repetitive work that humans do poorly because attention lapses over thousands of pages. A Python script does not get tired, does not lose focus, and applies the same rules consistently to every document.

JPMorgan now employs Python programmers in roles that used to be filled by paralegals. The bank's "Athena" trading platform -- 35 million lines of Python code -- manages $5 trillion in assets. Python is not just for startups and data scientists. It is infrastructure that the world's largest financial institutions bet their operations on.

From Script to Career: What Python Jobs Look Like

Knowing Python is not a career by itself -- it is a tool that unlocks careers. Which career depends on what you combine Python with.

1. Data Analyst ($55K-$90K)

What you do: Pull data from databases, clean it, analyze trends, build dashboards, and present findings to business teams. You answer questions like "Which products are underperforming?" and "Where are we losing customers?"

Python stack: pandas, NumPy, Matplotlib/Seaborn for visualization, SQL for databases, Jupyter Notebooks for exploratory analysis.

Learning path: Python basics (2-4 weeks) → pandas/SQL (4-6 weeks) → Statistics fundamentals (4 weeks) → Portfolio projects (ongoing).

2. Backend Developer ($75K-$130K)

What you do: Build the server-side logic that powers web applications. APIs, authentication, database interactions, business rules. When someone clicks "Place Order," your code processes the payment, updates inventory, and sends the confirmation email.

Python stack: Django or Flask/FastAPI, PostgreSQL or MySQL, Redis for caching, Docker for deployment, REST APIs or GraphQL.

Learning path: Python basics (2-4 weeks) → Web fundamentals / HTTP (2 weeks) → Django or Flask (6-8 weeks) → Databases/SQL (4 weeks) → Deploy a real project.

3. Machine Learning Engineer ($100K-$180K)

What you do: Build and deploy models that make predictions -- fraud detection, recommendation systems, image recognition, natural language processing. You work at the intersection of software engineering and statistics.

Python stack: scikit-learn, TensorFlow or PyTorch, pandas, NumPy, MLflow for model tracking, cloud platforms (AWS SageMaker, GCP Vertex AI).

Learning path: Python basics (2-4 weeks) → Statistics and linear algebra (8 weeks) → scikit-learn (6 weeks) → Deep learning (8 weeks) → ML system design (ongoing).

4. DevOps / Automation Engineer ($80K-$140K)

What you do: Automate infrastructure, deployments, monitoring, and operational tasks. You write the scripts that keep servers running, deploy code to production, and alert teams when something breaks.

Python stack: Ansible (Python-based), Boto3 (AWS SDK), Fabric, subprocess module, Docker SDK, monitoring scripts.

Learning path: Python basics (2-4 weeks) → Linux/command line (4 weeks) → Cloud platform (AWS/GCP) (6-8 weeks) → CI/CD pipelines (4 weeks) → Infrastructure as code.

Salary ranges are based on the Stack Overflow Developer Survey and Glassdoor data for the United States. Adjust expectations based on your location, experience, and whether the role is remote. The consistent pattern across all four paths: Python basics take 2-4 weeks. Domain-specific skills take 2-6 months. Getting your first job takes 6-12 months of focused effort. There are no shortcuts, but the path is well-documented and thousands of people walk it every year.

Setting Up Python: Your First Five Minutes

Getting started should take five minutes, not five hours. Here is the fastest path.

1. Install Python

Go to python.org/downloads and download the latest Python 3 release. On macOS and Windows, run the installer. Check "Add Python to PATH" on Windows. On Linux, Python 3 is usually pre-installed (python3 --version to verify).

2. Install VS Code

Download VS Code from code.visualstudio.com. Install the Python extension (by Microsoft) from the Extensions marketplace. This gives you syntax highlighting, auto-completion, debugging, and linting -- everything you need in one editor.

3. Write Your First Script

Create a file called hello.py, type the code below, and run it with python3 hello.py in your terminal (on Windows, python hello.py or py hello.py).

name = input("What is your name? ")
print(f"Hello, {name}! Welcome to Python.")

4. Set Up a Virtual Environment

Before installing any libraries, create a virtual environment. This keeps each project's dependencies isolated -- critical once you work on multiple projects.

python3 -m venv myproject
source myproject/bin/activate  # macOS/Linux
myproject\Scripts\activate     # Windows
pip install requests pandas    # Install libraries into this environment
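
To confirm from inside Python that the environment is active, one common check compares sys.prefix with sys.base_prefix. The two differ when the interpreter is running from a virtual environment created by venv. The helper name is an assumption for this sketch.

```python
import sys

def in_virtual_environment():
    """Return True when this interpreter runs inside a venv-style environment."""
    return sys.prefix != sys.base_prefix

print(f"Interpreter: {sys.executable}")
print(f"Inside a virtual environment: {in_virtual_environment()}")
```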

Common Beginner Mistakes (and How to Avoid Them)

Every Python beginner hits the same walls. Knowing them in advance saves hours of frustration.

IndentationError. Python uses indentation to define code blocks. Mix tabs and spaces, or indent inconsistently, and Python refuses to run. Solution: configure your editor to insert 4 spaces when you press Tab. VS Code does this by default for Python files.

# Wrong -- inconsistent indentation
def greet(name):
    if name:
        print("Hello")
      return name  # 6 spaces: lines up with no enclosing block, so Python raises IndentationError

# Right -- consistent 4-space indentation
def greet(name):
    if name:
        print("Hello")
    return name
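
Python reports these problems when it compiles the file, before any line runs. The sketch below feeds two small source strings to the built-in compile function to show which one is rejected. The snippets and the helper name are illustrative.

```python
# Source whose return line dedents to a level that matches no enclosing block.
bad_source = (
    "def greet(name):\n"
    "    if name:\n"
    "        print('Hello')\n"
    "      return name\n"
)

# The same function with consistent four-space indentation.
good_source = (
    "def greet(name):\n"
    "    if name:\n"
    "        print('Hello')\n"
    "    return name\n"
)

def indentation_problem(source):
    """Return the error message if the source has an indentation error, else None."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as error:
        return error.msg
    return None

print(indentation_problem(bad_source))   # a message about the mismatched unindent
print(indentation_problem(good_source))  # None
```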

Mutable default arguments. This one trips up even intermediate developers.

# Dangerous -- the list is shared across all calls
def add_item(item, items=[]):
    items.append(item)
    return items

print(add_item("a"))  # ["a"]
print(add_item("b"))  # ["a", "b"] -- surprise!

# Safe -- use None as default
def add_item(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items
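
The reason is that default values are evaluated once, when the def statement runs, and stored on the function object. The sketch below inspects that stored default through the __defaults__ attribute. The function names are illustrative.

```python
def add_item_shared(item, items=[]):
    # The same list object is reused on every call that omits items.
    items.append(item)
    return items

def add_item_safe(item, items=None):
    # A fresh list is created on each call that omits items.
    if items is None:
        items = []
    items.append(item)
    return items

add_item_shared("a")
add_item_shared("b")
shared_default = add_item_shared.__defaults__[0]

add_item_safe("a")
add_item_safe("b")
safe_default = add_item_safe.__defaults__[0]

print(shared_default)  # ['a', 'b'] -- the stored default itself has grown
print(safe_default)    # None -- nothing accumulates between calls
```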

Off-by-one with range(). range(5) produces 0, 1, 2, 3, 4 -- not 1 through 5. range(1, 5) produces 1, 2, 3, 4 -- not 1 through 5. The end value is always excluded. This is consistent with how Python handles slicing (list[0:3] gives elements 0, 1, 2) but confusing for newcomers.
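
A quick sketch of these boundaries, using list() to make the ranges visible:

```python
zero_to_four = list(range(5))      # [0, 1, 2, 3, 4]
one_to_four = list(range(1, 5))    # [1, 2, 3, 4]
one_to_five = list(range(1, 6))    # [1, 2, 3, 4, 5] -- stop is one past the last value

letters = ["a", "b", "c", "d", "e"]
first_three = letters[0:3]         # ['a', 'b', 'c'] -- index 3 is excluded

# A half-open range has exactly stop - start elements.
assert len(range(1, 6)) == 6 - 1

print(one_to_five)
print(first_three)
```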

Modifying a list while iterating over it.

# Wrong -- modifying a list while looping over it causes skipped elements
numbers = [1, 2, 4, 5]
for n in numbers:
    if n % 2 == 0:
        numbers.remove(n)
# Result: [1, 4, 5] -- removing 2 shifts 4 into its slot, so the loop never checks it

# Right -- use a list comprehension to create a new list
numbers = [1, 2, 4, 5]
odd_numbers = [n for n in numbers if n % 2 != 0]
# Result: [1, 5] -- always
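
When the original list object has to be updated in place, for example because other code holds a reference to it, two common options are to iterate over a copy or to assign the filtered result back through a slice. A minimal sketch with made-up data:

```python
# Option 1: loop over a shallow copy, remove from the original.
numbers = [1, 2, 4, 5, 6]
for n in numbers[:]:          # numbers[:] is a copy, so removals do not shift the loop
    if n % 2 == 0:
        numbers.remove(n)
print(numbers)  # [1, 5]

# Option 2: slice assignment replaces the contents but keeps the same list object.
values = [1, 2, 4, 5, 6]
alias = values                # a second reference to the same list
values[:] = [v for v in values if v % 2 != 0]
print(alias)    # [1, 5]
```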

Frequently Asked Questions

Is Python too slow for real work?

It is slower than C, C++, Rust, Go, and Java for raw CPU computation. It is fast enough for 95% of real-world use cases. Web servers spend most of their time waiting for database queries and network responses -- Python is not the bottleneck. Data science workloads use NumPy and pandas, which run C and Fortran under the hood -- Python is just the orchestrator. Machine learning training happens on GPUs via TensorFlow and PyTorch, which are C++ and CUDA at their core. The cases where Python's speed genuinely matters -- high-frequency trading, video game engines, operating systems -- are cases where nobody would choose Python in the first place.

Python 2 or Python 3?

Python 3. Always. Python 2 was officially discontinued on January 1, 2020. It no longer receives security patches. Every major library has dropped Python 2 support. If you encounter a tutorial that uses print "hello" (without parentheses), it is a Python 2 tutorial and you should find a newer one. The current stable release is Python 3.12+, and that is what you should install and use.
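
A script can also check the interpreter version at startup and stop with a clear message. The minimum version below is an arbitrary choice for this sketch, not a recommendation from any particular library.

```python
import sys

MINIMUM_VERSION = (3, 8)  # assumption for illustration; pick what your project needs

def version_is_supported(version_info=sys.version_info, minimum=MINIMUM_VERSION):
    """Compare the (major, minor) pair against the minimum supported version."""
    return tuple(version_info[:2]) >= minimum

if not version_is_supported():
    raise SystemExit(f"Python {MINIMUM_VERSION[0]}.{MINIMUM_VERSION[1]}+ is required")

print(f"Running Python {sys.version_info.major}.{sys.version_info.minor}")
```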

Can I build a mobile app with Python?

Technically yes. Kivy builds cross-platform mobile apps, and BeeWare packages Python code into apps that use each platform's native interface widgets. In practice, neither has the ecosystem, performance, or polish of native development (Swift for iOS, Kotlin for Android) or cross-platform frameworks (React Native, Flutter). If mobile development is your goal, learn JavaScript (React Native) or Dart (Flutter). If you already know Python and just need a quick internal tool on mobile, Kivy is workable. For anything user-facing and serious, use the right tool for the job.

What IDE should I use?

VS Code (free, by Microsoft) with the Python extension is the best starting point. It provides syntax highlighting, auto-completion, integrated debugging, linting, and terminal access -- everything a beginner and intermediate developer needs. PyCharm (by JetBrains) is purpose-built for Python with more advanced features like refactoring tools, database integration, and Django-specific support. The community edition is free; the professional edition requires a paid license. Start with VS Code. Move to PyCharm if you find yourself wanting more.

What should I build to practice?

Start with problems you actually have. Automate something tedious in your life: rename files, organize photos by date, track your expenses from a CSV bank export, scrape prices from a website you check daily. Then build something for other people: a command-line tool, a simple web API, a data dashboard. The projects that teach the most are the ones where you hit problems the tutorials never mention -- because that is when you learn to debug, read documentation, and think through edge cases on your own.

Key Insight

Python's greatest strength is not any single feature -- it is the compound effect of readability, a massive ecosystem, and universal applicability. You learn it once and use it everywhere: automating your file system today, analyzing spreadsheets next month, building a web app next quarter, training a machine learning model next year. No other language offers that breadth with that low a barrier to entry. That is why it won, and that is why it will keep winning.