CODE FARM

"You become responsible, forever, for what you have tamed."

- Antoine de Saint-Exupéry, The Little Prince

Python Learning Notes

>>> import this
The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

1. Running

  • Using the interactive interpreter (shell)

    $ python3 -q
    >>> 2+2
    4
    >>> quit()

    IPython provides an enhanced text-based REPL with completion and introspection, whereas JupyterLab is a web-based environment that executes Python via an IPython kernel (ipykernel).

    $ pip install ipython
    $ ipython
    In [1]: 2+2
    Out[1]: 4
    
    In [2]: len?
    $ pip install jupyterlab
    $ jupyter lab
  • Using python files

    print(2+2)
    $ python3 test.py
    4
  • Using python files with shebang

    In computing, a shebang is the character sequence consisting of the characters number sign and exclamation mark (#!) at the beginning of a script. It is also called sharp-exclamation, sha-bang, hashbang, pound-bang, or hash-pling.

     — From Wikipedia, the free encyclopedia

    #!/usr/bin/env python3
    print(2+2)
    $ chmod +x test.py
    $ ./test.py
    4
  • Executing modules as scripts

    In Python, python -m runs an importable module (one found on the module search path) as a script directly from the command line, so there is no need to know where its .py file lives.

    $ python3 -m venv --help
    usage: venv [-h] [--system-site-packages] [--symlinks | --copies] [--clear] [--upgrade] [--without-pip]
                [--prompt PROMPT] [--upgrade-deps]
                ENV_DIR [ENV_DIR ...]
    
    Creates virtual Python environments in one or more target directories.
    . . .
    $ python3 -m webbrowser https://www.google.com
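
    For a module of your own to behave well under python -m, its script logic is usually placed behind a __name__ check. The following is a minimal sketch, assuming a hypothetical file named greet.py; the module name and both functions are illustrative and not part of any library.

```python
# greet.py (hypothetical module name, for illustration only)
import sys


def greet(name: str) -> str:
    """Return a greeting; safe to import without side effects."""
    return f"Hello, {name}!"


def main(argv: list[str]) -> int:
    # use the first command-line argument, or fall back to a default
    name = argv[1] if len(argv) > 1 else "world"
    print(greet(name))
    return 0


if __name__ == "__main__":
    # true both for `python3 greet.py` and for `python3 -m greet`
    main(sys.argv)
```

    With this layout, `import greet` only defines the functions, while `python3 -m greet Ada` runs main().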

2. Indentations, comments, and multi-line expressions

  • Python uses indentation, four spaces per level by PEP 8 convention, instead of curly brackets or keywords to delimit code blocks.

    • Don’t mix tabs and spaces, to avoid messing up the indentation count.

    • Guido van Rossum designed Python to use indentation for structure, avoiding the parentheses and braces common in other languages.

      disaster = True
      if disaster:
          print("Woe!")
      else:
          print("Whee!")
    • A compound statement body can optionally follow the colon on the same line.

      if x > y: print(x)  # Simple statement on header line
  • Line breaks generally terminate statements automatically.

    x = 1  # x = 1;
  • Multiple statements may be placed on one line using semicolon separators.

    a = 1; b = 2; print(a + b) # Three statements on one line
  • Python expressions can span multiple lines when enclosed within delimiters like (), [], or {}.

    • A trailing backslash (\) also continues a statement onto the next line, but it is error-prone (any character after it, even a space, breaks the continuation) and is unnecessary inside delimiters.

      # Backslash continuation (error-prone, not recommended)
      long_expression = 1 + 2 + 3 + 4 + 5 + \
                        6 + 7 + 8 + 9 + 10
    • In modern Python, favor delimiters like (), [], or {} over the backslash (\) to improve readability and structure in multi-line expressions.

      # Parentheses for complex calculations
      long_calculation = (a * b +
                          c) * (d /
                                e - f)
      
      # Brackets for multi-line lists or data structures
      data = [
          "item1",
          "item2 with a longer description",
          "item3"
      ]
      
      # Braces for multi-line dictionaries
      person_info = {
          "name": "Alice",
          "age": 30,
          "hobbies": ["reading", "hiking"]
      }
  • A comment is marked by the # character (hash, sharp, pound, or octothorpe) and extends to the end of the line.

    # 60 sec/min * 60 min/hr * 24 hr/day
    seconds_per_day = 86400
    seconds_per_day = 86400 # 60 sec/min * 60 min/hr * 24 hr/day
    # Python does NOT
    # have a multiline comment.
    print("No comment: quotes make the # harmless.")

3. Keywords

False               class               from                or
None                continue            global              pass
True                def                 if                  raise
and                 del                 import              return
as                  elif                in                  try
assert              else                is                  while
async               except              lambda              with
await               finally             nonlocal            yield
break               for                 not

4. Types

  • Python is dynamically and strongly typed with built-in garbage collection.

    • A dynamically typed language determines a variable type at runtime rather than requiring an explicit declaration during definition.

      age = 30  # age is an integer (no need to declare the data type explicitly)
      age = "thirty"  # age is now a string
    • A statically typed language requires a variable type to be declared at compile time to ensure type compatibility.

      // In Java, declare the type of a variable before assigning a value.
      int age = 30;  // age is declared as an integer
      age = "thirty";  // error: incompatible types: String cannot be converted to int
    • A strongly typed language requires strict type safety by preventing operations between incompatible data types.

      Static typing dictates when a type is verified (compile time), whereas strong typing dictates how strictly that type is enforced (both compile time and runtime).
    • In Python, every data type is an object whose associated methods and attributes are verified for compatibility at runtime.

      # Python supports type inference on assignment.
      name = "Alice"  # String inferred
      name + 10       # TypeError: mixed types (Strongly typed)

      In computer programming, duck typing is an application of the duck test—"If it walks like a duck and it quacks like a duck, then it must be a duck"—to determine whether an object can be used for a particular purpose.

       — From Wikipedia, the free encyclopedia
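
      The duck test quoted above can be sketched in a few lines. The class and function names here (Duck, Robot, make_it_quack) are made up for illustration; the point is only that the function never checks the argument's type.

```python
class Duck:
    def quack(self) -> str:
        return "Quack!"


class Robot:
    def quack(self) -> str:
        return "Beep-quack!"


def make_it_quack(thing) -> str:
    # no isinstance() check: any object with a quack() method is accepted
    return thing.quack()


print(make_it_quack(Duck()))   # Quack!
print(make_it_quack(Robot()))  # Beep-quack!

try:
    make_it_quack(42)          # int has no quack() method
except AttributeError as err:
    print(f"not a duck: {err}")
```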

      # str, tuple, list, bytes, bytearray
      # dict, set, frozenset
      # int, bool, float, complex, decimal, fraction
      # function, generator, class, method
      # module, NoneType, Ellipsis, type, code, frame, traceback
      bool    # True, False
      int     # 47, 25000, 25_000, 0b0100_0000, 0o100, 0x40, sys.maxsize, - sys.maxsize - 1
      float   # 3.14, 2.7e5, float('inf'), float('-inf'), float('nan')
      complex # 3j, 5 + 9j
      
      str # unicode: 'alas', "alack", '''a verse attack'''
      
      tuple # (2, 4, 8)
      list  # ['Winken', 'Blinken', 'Nod']
      
      bytes # b'ab\xff'
      bytearray # bytearray(...)
      
      dict      # {}, {'game': 'bingo', 'dog': 'dingo', 'drummer': 'Ringo'}
      set       # set([3, 5, 7])
      frozenset # frozenset(['Elsa', 'Otto'])
      
      # import decimal, fractions
      decimal.Decimal(1/3)     # Decimal('0.333333333333333314829616256247390992939472198486328125')
      fractions.Fraction(1, 3) # Fraction(1, 3)
      # int(), float(), bin(), oct(), hex(), chr(), and ord()
      int(True), int(False)                                # (1, 0)
      int(98.6), int(1.0e4)                                # (98, 10_000)
      int('99'), int('-23'), int('+12'), int('1_000_000')  # (99, -23, 12, 1_000_000)
      
      int('10', 2), 'binary', int('10', 8), 'octal', int('10', 16), 'hexadecimal', int('10', 22), 'chesterdigital'
      # (2, 'binary', 8, 'octal', 16, 'hexadecimal', 22, 'chesterdigital')
      
      float(True), float(False)                     # (1.0, 0.0)
      float('98.6'), float('-1.5'), float('1.0e4')  # (98.6, -1.5, 10_000.0)
      
      bin(65), oct(65), hex(65)  # ('0b1000001', '0o101', '0x41')
      chr(65), ord('A')          # ('A', 65)
      
      False + 0, True + 0, False + 0., True + 0.  # (0, 1, 0.0, 1.0)
      True + True, True + False, False + False    # (2, 1, 0)

4.1. type hints

  • In Python, type hints (annotations) provide optional metadata to specify expected data types for variables, parameters, and return values.

    from typing import Annotated, Any
    
    # primitives & unions (3.10+)
    age: int = 30
    pi: float | None = 3.14  # nullable | optional
    is_active: bool = True
    raw: bytes = b"\x01\x02"
    flex: Any = "can be anything"
    
    # generics (3.9+): list, dict, tuple, set
    def process(
        ids: list[int],
        data: dict[str, float],
        point: tuple[int, int, str],
        unique: set[bytes]
    ) -> str: ...
    
    # classes & metadata
    class User:
        def __init__(self, name: str): self.name = name
    
    def register(
        user: User,
        note: Annotated[str, "Max 20 chars"]
    ) -> bool:
        return True

    any is a built-in function for truthiness checks, whereas typing.Any is the type hint for unconstrained values.

    from typing import Any
    
    x: any = 10                               # function object
    y: Any = 10                               # type hint
    
    print(f"x hint: {__annotations__['x']}")  # <built-in function any>
    print(f"y hint: {__annotations__['y']}")  # typing.Any
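
    Because hints are optional metadata, the interpreter itself does not enforce them; checking is left to external tools such as static type checkers. A small sketch (the function name double is illustrative):

```python
from typing import get_type_hints


def double(n: int) -> int:
    return n * 2


# the interpreter does not enforce the hint: a str argument still runs
print(double(4))     # 8
print(double("ab"))  # abab

# the hints are stored as metadata that tools can read
print(get_type_hints(double))  # {'n': <class 'int'>, 'return': <class 'int'>}
```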

4.2. assignments

  • In Python, a variable must be assigned to an object before it is referenced; otherwise, a NameError is raised.

    # assignment statements
    spam = 'Spam'                   # simple assignment
    spam, ham = 'yum', 'YUM'        # tuple unpacking
    [spam, ham] = ['yum', 'YUM']    # list unpacking
    a, b, c, d = 'spam'             # sequence unpacking (each character to a variable)
    a, *b = 'spam'                  # extended sequence unpacking (a='s', b=['p', 'a', 'm'])
    a, *_ = 'spam'                  # use the underscore (_) for unwanted variables
    spam = ham = 'lunch'            # multiple assignment (both variables refer to the same object)
    spams = 0                       # the name must be bound before it can be augmented
    spams += 42                     # augmented assignment (equivalent to spams = spams + 42)
    spam = ham = eggs = 0           # multiple variable names can be assigned a value at the same time
    # swap variable names
    a, b = 1, 2
    b, a = a, b  # 2, 1
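
    The NameError mentioned above is easy to reproduce. In this sketch, ghost is a deliberately unassigned, made-up name:

```python
try:
    print(ghost)               # 'ghost' was never assigned
except NameError as err:
    message = str(err)

print(message)                 # name 'ghost' is not defined
```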

4.3. bindings

  • In Python, variables are labels referencing memory objects (PyObjects) defined by a type, unique ID, value, and reference count.

    import sys
    
    # 1. Type & ID: Exploring the PyObject
    val = 5.20
    print(type(val))  # <class 'float'>
    print(id(val))    # Unique memory address (ID)
    
    # 2. Reference Counting: Labels on a PyObject
    x = y = z = 1000.1
    base_count = sys.getrefcount(x)
    
    del y
    print(sys.getrefcount(x) == base_count - 1)  # True: one label removed
    
    del z
    print(sys.getrefcount(x) == base_count - 2)  # True: only 'x' remains

4.4. identities

  • A class is a blueprint for creating objects; in Python, "class" and "type" are synonymous.

    type(7)             # <class 'int'>
    type(7) == int      # True
    
    isinstance(7, int)           # True
    isinstance(type(int), type)  # True
    
    # 1. instances vs. blueprints
    print(type(7) == int)          # True
    print(isinstance(7, int))      # True
    
    # 2. bool is a subclass of int
    print(issubclass(bool, int))   # True
    print(isinstance(True, int))   # True (True is an int instance)
    
    # 3. meta identity
    print(isinstance(int, type))   # True (Classes are type objects)

4.5. equality

  • In Python, == compares object values via recursive equivalence while is checks if two variables reference the same memory address.

    # 1. value equivalence (==)
    L1 = [1, 2, 3]
    L2 = [1, 2, 3]
    print(L1 == L2)    # True: content is identical
    print(L1 is L2)    # False: different objects in memory
    
    # 2. object identity (is)
    S1 = 'spam'
    S2 = 'spam'
    print(S1 == S2)    # True: same value
    print(S1 is S2)    # True: same object (interned)
    
    # 3. memory addresses (id)
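    x, y = int('100000'), int('100000')  # built at runtime so the compiler cannot share one constant
    print(x == y)      # True
    print(x is y)      # False in CPython: distinct objects for ints outside the small-int cache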
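
    For user-defined classes, == is whatever __eq__ says it is, while is can never be customized. A minimal sketch, assuming a hypothetical Point class:

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        # == delegates here; compare by value, not identity
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p, q = Point(1, 2), Point(1, 2)
print(p == q)  # True: same coordinates
print(p is q)  # False: two separate objects

# nested containers are compared element by element
print([1, [2, {'k': (3, 4)}]] == [1, [2, {'k': (3, 4)}]])  # True

# singletons such as None are conventionally tested with 'is'
result = None
print(result is None)  # True
```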

4.6. sequences

  • Strings, tuples, and lists are ordered, zero-indexed collections; while tuples and lists store any data type, strings are strictly sequences of characters.

    # concatenation (+) and repetition (*)
    combo = ('cat',) + ('dog', 'cow')  # ('cat', 'dog', 'cow')
    alarm = ('bark',) * 3              # ('bark', 'bark', 'bark')
    
    # membership & unpacking
    'c' in 'cat'                        # True
    c, d, w = ['meow', 'bark', 'moo']   # unpacking
    
    # iteration: direct vs. indexed vs. enumerated
    items = ['meow', 'bark', 'moo']
    for item in items: ...
    for i in range(len(items)): ...
    for i, v in enumerate(items): ...
    # indexing
    s = 'hello!'  # len(s) is 6
    
    # positive offsets (0 to len-1)
    print(s[0])     # 'h' (first)
    print(s[5])     # '!' (last)
    
    # negative offsets (-1 to -len)
    print(s[-1])    # '!' (same as s[len(s)-1])
    print(s[-6])    # 'h' (same as s[0])
    
    # out of bounds
    # s[6]          # IndexError: index out of range
    # slicing
    s = 'hello!'
    
    # [start:stop] - stop is non-inclusive
    print(s[1:3])   # 'el' (offsets 1 and 2)
    print(s[:3])    # 'hel' (default start: 0)
    print(s[1:])    # 'ello!' (default end: len)
    
    # [start:stop:step]
    print(s[::2])   # 'hlo' (every 2nd character)
    print(s[::-1])  # '!olleh' (negative step reverses)
    
    # shallow copy
    print(s[:])              # 'hello!' (top-level copy)
    print(s[slice(0, 6, 1)]) # 'hello!' (the internal logic)

4.7. truthiness

  • In Python, truthiness and falsiness determine a value’s evaluation in a Boolean context where most non-empty collections and non-zero numbers are truthy while None and empty or zero-valued objects are falsy.

    # truthy: objects with content or non-zero value
    bool(42)          # True
    bool("hello")     # True
    bool([1, 2])      # True
    
    # falsy: empty, zero, or null
    bool(0)           # False
    bool("")          # False
    bool([])          # False
    bool(None)        # False
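
    User-defined objects can take part in the same rules: bool() consults __bool__ first and falls back to __len__. The classes Basket and Switch below are invented for this sketch.

```python
class Basket:
    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        # with no __bool__ defined, truthiness falls back to len() != 0
        return len(self.items)


class Switch:
    def __init__(self, on):
        self.on = on

    def __bool__(self):
        # __bool__ is consulted before __len__
        return self.on


print(bool(Basket([])))       # False
print(bool(Basket(['egg'])))  # True
print(bool(Switch(False)))    # False
print(bool(object()))         # True: objects are truthy by default
```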

4.8. and, or, not

  • In Python, logical operators combine Boolean expressions where not negates a value and both and and or use short-circuiting to return the operand that determines the result.

    # 1. negation
    print(not True)           # False
    print(not 0)              # True
    
    # 2. short-circuiting AND: returns first Falsy or last Truthy
    print([] and "hello")     # []
    print(10 and "hello")     # "hello"
    
    # 3. short-circuiting OR: returns first Truthy or last Falsy
    print("apple" or "pear")  # "apple"
    print(None or 0)          # 0
    
    letter = 'o'
    if letter == 'a' or letter == 'e' or letter == 'i' or letter == 'o' or letter == 'u':
        print(letter, 'is a vowel')
    else:
        print(letter, 'is not a vowel')
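
    Short-circuiting means the right operand may never be evaluated at all. The sketch below records which operands actually run (check and calls are illustrative names), then shows the common default-value idiom and a shorter form of the vowel test.

```python
calls = []


def check(label, value):
    calls.append(label)  # record that this operand was evaluated
    return value


check('a', False) and check('b', True)  # 'b' is skipped
check('c', True) or check('d', True)    # 'd' is skipped
print(calls)  # ['a', 'c']

# default-value idiom: fall back when the left operand is falsy
name = '' or 'anonymous'
print(name)  # anonymous

# the vowel test can be written as one membership check
letter = 'o'
print(letter in 'aeiou')  # True
```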

4.9. ~, <<, >>, &, ^, |

  • In Python, bitwise operators perform bit-level manipulations; the binary ones bind more loosely than arithmetic operators, in decreasing order of precedence << >>, &, ^, and then |, whereas the unary ~ binds more tightly than * and /.

    x = 5  # 0b0101
    y = 1  # 0b0001
    
    # 1. AND, OR, XOR
    print(f"0b{(x & y):04b}")  # 0b0001 (both bits must be 1)
    print(f"0b{(x | y):04b}")  # 0b0101 (either bit is 1)
    print(f"0b{(x ^ y):04b}")  # 0b0100 (bits must differ)
    
    # 2. shifts & inversion
    print(f"0b{(x << 1):04b}") # 0b1010 (shift left: multiply by 2)
    print(f"0b{(x >> 1):04b}") # 0b0010 (shift right: floor divide by 2)
    print(f"0b{~x:b}")         # 0b-110 (invert: -(x+1))
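
    A typical use of these operators is packing boolean flags into one integer. This is a sketch with hypothetical flag names (READ, WRITE, EXEC), followed by a few precedence checks.

```python
READ, WRITE, EXEC = 0b100, 0b010, 0b001  # hypothetical permission flags

perms = READ | WRITE          # set two flags
print(f"{perms:03b}")         # 110
print(bool(perms & WRITE))    # True: flag is set
perms &= ~WRITE               # clear a flag
print(f"{perms:03b}")         # 100
perms ^= EXEC                 # toggle a flag
print(f"{perms:03b}")         # 101

# precedence: + binds tighter than <<, and & binds tighter than |
print(1 << 2 + 1)             # 8, parsed as 1 << (2 + 1)
print(1 | 2 & 4)              # 1, parsed as 1 | (2 & 4)
print(-~5)                    # 6, parsed as -(~5) = -(-6)
```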

4.10. /, //, %

  • In Python, / performs true division returning a float while // and % perform floor division and modulo returning integers only if both operands are integers.

    # 1. true division (/): always float
    print(10 / 2)          # 5.0
    print(11 / 2)          # 5.5
    
    # 2. floor division (//): rounds toward negative infinity
    print( 11   // 2)         #  5   (int)
    print( 11.0 // 2)         #  5.0 (float if any operand is float)
    print(-11   // 2)         # -6   (floor of -5.5 is -6)
    
    # 3. modulo (%): remainder of division
    print( 10 %  3)          #  1
    print(-10 %  3)          #  2   (result sign matches divisor)
    print( 10 % -3)          # -2
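
    Floor division and modulo are defined together so that a == (a // b) * b + (a % b) holds for every sign combination. A quick check using the built-in divmod():

```python
for a, b in [(11, 2), (-11, 2), (11, -2), (-11, -2)]:
    q, r = divmod(a, b)          # same as (a // b, a % b)
    assert a == q * b + r        # the identity holds for every sign combination
    print(f"{a:>3} // {b:>2} = {q:>2}, {a:>3} % {b:>2} = {r:>2}")

# int() truncates toward zero instead, so the two differ for negatives
print(-11 // 2, int(-11 / 2))    # -6 -5
```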

5. Bytes and bytearray

  • In Python, eight-bit integer sequences represent values from 0 to 255 as either immutable bytes or mutable bytearray objects.

    # 1. bytes: immutable literal (b'...')
    b_seq = b'abc'
    # b_seq[0] = 65      # TypeError: immutable
    
    # 2. bytearray: mutable constructor
    ba_seq = bytearray(b'abc')
    ba_seq[0] = 65       # 'a' (97) becomes 'A' (65)
    print(ba_seq)        # bytearray(b'Abc')
    
    # 3. indexing returns integers, slicing returns new sequences
    print(b_seq[0])      # 97 (integer)
    print(b_seq[:1])     # b'a' (bytes)
    
    # 4. initialization from size or list
    empty_bytes = bytes(5)          # b'\x00\x00\x00\x00\x00'
    from_list = bytes([97, 98, 99]) # b'abc'

    Endianness is a computer architecture convention for multi-byte data where "big-endian" (standard for IBM mainframes and networking) stores the most significant byte at the lowest address and "little-endian" (standard for x86 and ARM) stores the least significant byte first.

    import sys, struct
    
    # 1. check local architecture
    print(sys.byteorder)      # 'little' (common)
    
    # 2. multi-byte integer (hex: 0x0400)
    n = 1024
    
    # 3. convert to bytes
    big    = n.to_bytes(2, 'big')    # b'\x04\x00' (MSB first)
    little = n.to_bytes(2, 'little') # b'\x00\x04' (LSB first)
    
    print(f"Big:    {big.hex(' ')}")     # Big:    04 00
    print(f"Little: {little.hex(' ')}")  # Little: 00 04
    
    # 4. interpretation risk
    wrong = int.from_bytes(little, 'big')
    print(wrong)  # 4 (interpreted as 0x0004)
    
    # 5. using struct '>' for network order
    network_pkt = struct.pack('>H', n)  # pack as big-endian (>) unsigned short (H)
    print(network_pkt)                  # b'\x04\x00'
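
    Packing is only half of the exchange: the receiver must unpack with the same byte order and layout. A small round-trip sketch, where the two fields (port, size) are invented for illustration:

```python
import struct

# big-endian (>): unsigned short (H) followed by unsigned int (I)
payload = struct.pack('>HI', 1024, 70000)
print(payload.hex(' '))                     # 04 00 00 01 11 70

port, size = struct.unpack('>HI', payload)  # same format string on the way back
print(port, size)                           # 1024 70000
print(struct.calcsize('>HI'))               # 6: no padding with an explicit byte order
```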

6. Strings

In Python, strings exist as Unicode str for text, immutable bytes for binary data, and mutable bytearray for modified raw data.

  • In Python, files use text mode for Unicode strings or binary mode for raw, untranslated bytes.

# create a sample file with a special character
with open('demo.txt', 'w', encoding='utf-8') as f:
    f.write('Hi 👋')

# text mode: returns a 'str' (Unicode)
with open('demo.txt', 'r', encoding='utf-8') as f:
    print(f"Text:   {f.read()}")

# binary mode: returns 'bytes' (Raw)
with open('demo.txt', 'rb') as f:
    print(f"Binary: {f.read()}")

Designed by Unix legends Ken Thompson and Rob Pike on a diner placemat in New Jersey, UTF-8 is a variable-width encoding that serves as the standard for Python, Linux, and the Web.

# 1. define Unicode string with an emoji
cafe = 'café ☕'

# 2. len() counts Unicode characters (the emoji is 1 char)
print(len(cafe))       # 6

# 3. encode to bytes
# 'é' is 2 bytes (\xc3\xa9) | '☕' is 3 bytes (\xe2\x98\x95)
cafe_bytes = cafe.encode()

# 4. len() counts raw bytes
print(len(cafe_bytes)) # 9

# 5. decode back to str
print(cafe_bytes.decode()) # 'café ☕'
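
"Variable-width" means each code point takes between one and four bytes. The loop below is a small illustration using characters that already appear in these notes:

```python
for ch in ['a', 'é', '☕', '👋']:
    encoded = ch.encode('utf-8')
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex(' ')}")
# U+0061 -> 1 byte(s): 61
# U+00E9 -> 2 byte(s): c3 a9
# U+2615 -> 3 byte(s): e2 98 95
# U+1F44B -> 4 byte(s): f0 9f 91 8b
```
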
  • Python strings are created using single, double, or triple quotes, with triple quotes specifically designed to handle multiline text and preserve formatting like newlines and indentation.

    # 1. single and double quotes (interchangeable)
    s1 = 'Snap'
    s2 = "Crackle"
    
    # 2. nesting quotes without escapes
    s3 = "'Nay!' said the naysayer."
    s4 = 'The rare double quote: ".'
    
    # 3. triple quotes for multiline blocks
    poem = """There was a Young Lady of Norway,
        Who casually sat in a doorway;
        When the door squeezed her flat,
        She exclaimed, "What of that?"
        This courageous Young Lady of Norway."""
    
    # 4. raw representation (showing \n and spaces)
    print(repr(poem))
    # 1. repeating and combining
    hi = 'Na ' * 4 + 'Hey ' * 4
    
    # 2. escaping and line continuation
    farewell = '\\' + '\t' + 'Goodbye.' \
               + ' Done.'
    
    # 3. implicit concatenation
    s = ("Auto-" "merged " "literals")
  • Python supports specialized string types via single-letter prefixes that determine how the interpreter processes formatting, escape sequences, and underlying data structures.

    # 1. f-strings: formatted string literals
    animal, loc = 'wereduck', 'werepond'
    print(f'The {animal} is in the {loc}')
    
    # 2. r-strings: raw strings (backslashes are kept literally, not treated as escapes)
    path = r'C:\Users\name'  # Interpreted as 'C:\\Users\\name'
    
    # 3. b-strings: bytes literals (binary data)
    blob = b'\x14\xcd'
    print(list(blob))        # [20, 205]
    
    # 4. fr-strings: raw f-strings (combined)
    var = "Value"
    print(fr'Raw plus {var}')
  • Python supports three formatting methodologies: legacy C-style expressions, the .format() method, and modern interpolated f-strings.

    actor = 'Richard Gere'
    cat, weight = 'Chester', 28
    
    # 1. C-style (%)
    s1 = 'Actor: %s' % actor
    s2 = 'Our cat %s weighs %d lbs' % (cat, weight)
    s3 = '%(cat)s is %(weight)d' % {'cat': cat, 'weight': weight}
    
    # 2. str.format()
    s4 = '{0}, {1} and {2}'.format('spam', 'ham', 'eggs')
    s5 = '{motto}, {0} and {food}'.format('ham', motto='spam', food='eggs')
    s6 = '{}, {} and {}'.format('spam', 'ham', 'eggs')
    
    # 3. f-strings
    s7 = f'Our cat {cat} weighs {weight} pounds'
  • Python’s re module provides a suite of tools for pattern matching, substitution, and splitting string data using regular expressions.

    import re
    
    source = "Charles Baudelaire's 'Les Fleurs du Mal'"
    
    # 1. compiling a pattern (optional, improves performance for reuse)
    pattern = re.compile('Les Fleurs du Mal')
    
    # 2. search(): find first occurrence anywhere
    m = pattern.search(source)
    if m:
        print("Match found within the string.")
    
    # 3. match(): find exact match at the START only
    print(re.match('Les Fleurs du Mal', source))  # None
    
    # 4. findall(): returns a list of all matches
    print(re.findall('es', source))  # ['es', 'es']
    
    # 5. split(): break string at every pattern occurrence
    print(re.split(r'\s', source))   # split by whitespace
    
    # 6. sub(): search and replace patterns
    print(re.sub("'", '?', source)) # replaces single quotes with ?
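
    Beyond whole-pattern matches, groups let a pattern pull out named pieces of the text. The pattern below is one possible way to split the sample string; the group names author and title are chosen for this sketch.

```python
import re

source = "Charles Baudelaire's 'Les Fleurs du Mal'"

# named groups capture the author and the quoted title
pattern = re.compile(r"(?P<author>.+)'s '(?P<title>.+)'")

m = pattern.fullmatch(source)  # the whole string must match
if m:
    print(m.group('author'))   # Charles Baudelaire
    print(m.group('title'))    # Les Fleurs du Mal
    print(m.groupdict())       # both groups as a dict
```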

7. If, while, and for

The walrus operator (:=) assigns a value to a variable as part of an expression, so the value is stored and evaluated in a single step.

limit = 280
msg = "Blah " * 60

# value is stored in 'diff' and evaluated by '>=' simultaneously
if (diff := limit - len(msg)) >= 0: # walrus operator
    print("Fitting tweet")
else:
    print(f"Over by {abs(diff)}")
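
The same operator is handy in loops and comprehensions, where it avoids computing a value twice. A sketch using an in-memory stream as stand-in input:

```python
import io

stream = io.StringIO("alpha\nbeta\n\ngamma\n")

# read until the first blank line; assignment and test share one expression
lines = []
while (line := stream.readline().strip()):
    lines.append(line)
print(lines)    # ['alpha', 'beta']

# reuse a computed value inside a comprehension without computing it twice
lengths = [n for word in ['a', 'bbb', 'cc'] if (n := len(word)) > 1]
print(lengths)  # [3, 2]
```
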
  • Branch with if, elif, and else:

    # 1. standard multi-way branching
    color = "mauve"
    if color == "red":
        print("a tomato")
    elif color == "green":
        print("a green pepper")
    else:
        print("unknown:", color)
    
    # 2. ternary expression
    result = 't' if 'spam' else 'f'
    
    # 3. chained comparisons
    x = 2.5
    if 4 > x > 2 > 1: ...  # evaluates as (4 > x) and (x > 2) and (2 > 1)
    
    # 4. dictionary-based branching (dispatch tables)
    menu = {'spam': 1.25, 'ham': 1.99, 'eggs': 0.99}
    price = menu.get('bacon', 'N/A')
    
    actions = {'spam': lambda: print("order spam"), 'ham': lambda: print("order ham")}
    actions.get('spam', lambda: print("default action"))()
  • Repeat with while, and break, continue, and else:

    items = [1, 3, 5]
    
    while items:
        if (val := items[0]) == 0:
            break           # exit and skip 'else'
    
        items = items[1:]   # slice to progress
    
        if val % 2 == 0:
            continue        # skip to next condition check
        print(f"{val} squared is {val**2}")
    else: # optional
        print("no zeros found") # ONLY if the break above was never hit
  • Iterate with for/in, and break, continue and else:

    # 1. loop control: continue and break
    for char in 'thud':
        if char == 'u': continue  # skip remaining block for this item
        if char == 'x': break     # exit loop immediately
        print(char)
    else: # optional
        print("no 'x' found")     # ONLY if the break above was never hit
    
    # 2. range-based loops (start, stop, step)
    for i in range(0, 10, 2):
        print(i, end=' ')         # 0 2 4 6 8
    
    # 3. parallel and indexed iteration
    s = 'spam'
    for i, char in enumerate(s):  # generates (index, item) pairs
        print(f'{i}: {char}')
    
    for a, b in zip(s, s.upper()): # pairs elements from multiple iterables
        print(a, b)                # s S, p P, a A, m M
    
    # 4. sequence unpacking
    pairs = [[1, 2], [3, 4]]
    for x, y in pairs:             # direct assignment to variables
        print(x + y)

8. Tuples and lists

  • A tuple is an immutable, ordered sequence built with commas as a structural operator or tuple() as a constructor for iterables.

    'cat',                   # singleton  (trailing comma)
    'cat', 'dog', 'cattle'   # multi-item (separating commas)
    
    tuple()                  # constructor: empty ()
    tuple('cat')             # constructor: iterable to ('c', 'a', 't')

    Parentheses are grouping symbols used for empty literals, visual clarity, resolving syntactic ambiguity, and defining generator expressions.

    ()                       # empty literal
    ('cat',)                 # tuple
    ('cat')                  # string
    
    type(('cat',))           # <class 'tuple'>
    type('cat',)             # <class 'str'>
    
    (x for x in range(10))   # generator expression

    A named tuple is a hybrid object factory that creates classes supporting positional indexing (tuple), dotted name attribute (class), and dictionary conversion (_asdict()).

    # modern class-based; supports PEP 484 type hints and IDE autocompletion
    from typing import NamedTuple
    
    class Rec(NamedTuple):
        name: str                   # explicit field type
        age: float                  # enables static analysis
        jobs: list[str]             # self-documenting schema
    # legacy factory-based; quick, dynamic, but lacks static type hints
    from collections import namedtuple
    
    Rec = namedtuple('Rec', ['name', 'age', 'jobs'])
    bob = Rec('Bob', age=40.5, jobs=['dev', 'mgr'])
    
    bob[0]                          # positional indexing (tuple)
    bob.name                        # dotted name attribute (class)
    bob._asdict()['name']           # dictionary conversion (dict)
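
    The class-based form is used the same way as the factory-based one. A short sketch of the tuple-like behaviour, reusing the Rec schema from above:

```python
from typing import NamedTuple


class Rec(NamedTuple):
    name: str
    age: float
    jobs: list[str]


bob = Rec('Bob', age=40.5, jobs=['dev', 'mgr'])
name, age, jobs = bob                 # unpacks like a tuple
older = bob._replace(age=41.5)        # returns a new record; bob is unchanged
print(older.age, bob.age)             # 41.5 40.5
print(bob._fields)                    # ('name', 'age', 'jobs')

try:
    bob.age = 50                      # fields are read-only
except AttributeError:
    print("immutable")
```
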
  • A list is a mutable, ordered sequence built with brackets [] as a literal or list() as a constructor for iterables.

    []                                # []
    ['meow', 'bark', 'moo']           # ['meow', 'bark', 'moo']
    [('cat', 'meow'), 'bark', 'moo']  # [('cat', 'meow'), 'bark', 'moo']
    
    list()                            # []
    list('cat')                       # ['c', 'a', 't']
    # append(), insert(), extend()
    wow = ['meow']  # ['meow']
    wow.append('moo')  # ['meow', 'moo']
    wow.insert(1, 'bark')  # ['meow', 'bark', 'moo']
    wow.extend(['cluck', 'baa']) # ['meow', 'bark', 'moo', 'cluck', 'baa']
    
    # plus(+), repeat(*)
    plus = ['meow', 'bark', 'moo'] + ['cluck', 'baa'] # ['meow', 'bark', 'moo', 'cluck', 'baa']
    repeat = ['bark'] * 3 # ['bark', 'bark', 'bark']
    
    # index, and slice assignment
    L = ['spam', 'Spam', 'SPAM!']
    # index assignment
    L[1] = 'eggs'  # ['spam', 'eggs', 'SPAM!']
    # slice assignment: delete+insert  # list[start:stop:step] = iterable
    #   if the iterable is shorter, elements are deleted from the slice.
    #   if the iterable is longer, extra elements are inserted.
    L[0:2] = ['eat', 'more']  # ['eat', 'more', 'SPAM!']
    # del, remove(), clear()
    farm = ['cat', 'dog', 'cattle', 'chicken', 'duck']
    
    del farm[-1]
    # ['cat', 'dog', 'cattle', 'chicken']
    
    farm.remove('dog')
    # ['cat', 'cattle', 'chicken']
    
    farm.clear()
    # []
    # pop: remove and return item at index (default last).
    farm = ['cat', 'cattle', 'chicken']
    
    farm.pop()  # 'chicken'
    # ['cat', 'cattle']
    
    farm.pop(-1)  # 'cattle'
    # ['cat']
    # sort() and sorted()
    farm = ['cat', 'dog', 'cattle']
    
    # a sorted copy
    sorted(farm)  # ['cat', 'cattle', 'dog']
    print(farm)  # ['cat', 'dog', 'cattle']
    
    # sorting in-place
    farm.sort()
    print(farm)  # ['cat', 'cattle', 'dog']
    # list comprehensions: [expression for item in iterable]
    even_numbers = [2 * num for num in range(5)]
    # [0, 2, 4, 6, 8]
    
    # list comprehensions: [expression for item in iterable if condition]
    odd_numbers = [num for num in range(10) if num % 2 == 1]
    # [1, 3, 5, 7, 9]
    import copy
    
    # shallow: copies the top-level container with shared nested objects.
    a = [['cat', 'meow'], ['dog', 'bark']]
    c = a[:]
    b = a.copy()  # same result as a[:]
    d = list(c)
    
    # deep: creates an independent clone of the container and all nested objects.
    e = copy.deepcopy(a)
    
    a[0][1] = 'moo'
    
    a  # [['cat', 'moo'], ['dog', 'bark']]
    b  # [['cat', 'moo'], ['dog', 'bark']]
    c  # [['cat', 'moo'], ['dog', 'bark']]
    d  # [['cat', 'moo'], ['dog', 'bark']]
    
    e  # [['cat', 'meow'], ['dog', 'bark']]

    A deque (double-ended queue) is optimized for O(1) appends and pops from either end, whereas a list incurs O(N) costs for left-side mutations.

    from collections import deque
    
    q = deque([], maxlen=5)         # fixed-length sliding window
    q.append(0)                     # O(1) end-point growth
    q.appendleft(5)                 # O(1) start-point growth (vs list's O(N))
    
    q.pop()                         # O(1) end-point shrinkage
    q.popleft()                     # O(1) start-point shrinkage
    
    q.extend([1, 2])                # right-side batch add
    q.extendleft([3, 4])            # left-side batch add: deque([4, 3, ...])
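
    The maxlen argument is what makes a deque a sliding window: once full, each append silently discards an item from the opposite end. A small sketch with made-up readings:

```python
from collections import deque

window = deque(maxlen=3)
for reading in [1, 2, 3, 4, 5]:
    window.append(reading)    # once full, the oldest item falls off the left
print(window)                 # deque([3, 4, 5], maxlen=3)

window.rotate(1)              # shift items right by one, wrapping around
print(list(window))           # [5, 3, 4]
```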

9. Dictionaries and sets

In Python, keys in dictionaries and elements in sets must be hashable, which for the built-in types means immutable.

The built-in hash() operates natively with built-in types, but delegates to __hash__ for user-defined types, defaulting to object identity.

# 1. built-in immutables: hashable (int, str, frozenset, and tuples of hashable items)
hash('python')  # returns an integer (stable within one process; str hashes are salted per run)

# 2. built-in mutables: never hashable (list, dict, set)
# hash([1, 2])  --> raises TypeError: unhashable type: 'list'

# 3. user-defined classes: hashable by default via identity
class Foo: ...
a, b = Foo(), Foo()
print(hash(a), hash(b))  # different hashes based on memory address (id)
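
A class that defines value-based equality should define a matching __hash__, so that equal objects land in the same dict or set slot. A minimal sketch, assuming a hypothetical Card class:

```python
class Card:
    def __init__(self, rank, suit):
        self.rank, self.suit = rank, suit

    def __eq__(self, other):
        if not isinstance(other, Card):
            return NotImplemented
        return (self.rank, self.suit) == (other.rank, other.suit)

    def __hash__(self):
        # equal objects must hash equally, so hash the same fields __eq__ compares
        return hash((self.rank, self.suit))


a, b = Card('A', 'spades'), Card('A', 'spades')
print(a == b, hash(a) == hash(b))  # True True
print(len({a, b}))                 # 1: the set treats them as one element

# a tuple is hashable only if everything inside it is hashable
try:
    hash((1, [2, 3]))
except TypeError as err:
    print(err)                     # unhashable type: 'list'
```
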
  • A dict is a mutable, associative array/map of unique keys to values, built with curly braces {} as a literal or dict() as a constructor.

    {}                              # {}
    {'cat': 'meow', 'dog': 'bark'}  # {'cat': 'meow', 'dog': 'bark'}
    
    dict()                          # constructor: empty {}
    dict(cat='meow', dog='bark')    # constructor: keyword args
    dict([('cat', 'meow')])         # constructor: iterable of pairs
    # [key], get()
    animals = {'cat': 'meow', 'dog': 'bark'}
    animals['cattle'] = 'moo'  # {'cat': 'meow', 'dog': 'bark', 'cattle': 'moo'}
    animals['cat']  # 'meow'
    animals['sheep']  # KeyError: 'sheep'
    animals.get('sheep')  # None
    animals.get('sheep', 'baa')  # 'baa'
    
    # testing
    animals = {'cat': 'meow', 'dog': 'bark', 'cattle': 'moo'}
    'cat' in animals  # True
    'sheep' in animals  # False
    animals['sheep'] if 'sheep' in animals else 'oops!'  # 'oops!'
    # keys(), values(), items(), len()
    animals.keys()  # dict_keys(['cat', 'dog', 'cattle'])
    animals.values()  # dict_values(['meow', 'bark', 'moo'])
    animals.items()  # dict_items([('cat', 'meow'), ('dog', 'bark'), ('cattle', 'moo')])
    len(animals)  # 3
    # `**`, update()
    {**{'cat': 'meow'}, **{'dog': 'bark'}}  # {'cat': 'meow', 'dog': 'bark'}
    animals = {'cat': 'meow'}
    animals.update({'dog': 'bark'})  # {'cat': 'meow', 'dog': 'bark'}
    # del, pop(), clear()
    animals = {'cat': 'meow', 'dog': 'bark', 'cattle': 'moo'}
    del animals['dog']
    # {'cat': 'meow', 'cattle': 'moo'}
    animals.pop('cattle')  # 'moo'
    # {'cat': 'meow'}
    animals.clear()
    # {}
    # iterations
    animals = {'cat': 'meow', 'dog': 'bark', 'cattle': 'moo'}
    for key in animals:  # for key in animals.keys()
        print(f'{key} => {animals[key]}', end='\t')
    # cat => meow	dog => bark	cattle => moo
    for key, value in animals.items():
        print(f'{key} => {value}', end='\t')
    # cat => meow     dog => bark     cattle => moo
    # dictionary comprehensions: {key_expression: value_expression for item in iterable}
    word = 'letters'
    letter_counts = {letter: word.count(letter) for letter in word}
    # {'l': 1, 'e': 2, 't': 2, 'r': 1, 's': 1}
    
    # dictionary comprehensions: {key_expression: value_expression for item in iterable if condition}
    vowels = 'aeiou'
    word = 'onomatopoeia'
    vowel_counts = {letter: word.count(letter)
                    for letter in set(word) if letter in vowels}
    # {'i': 1, 'o': 4, 'a': 2, 'e': 1}
    # setdefault()
    d = {}
    d[0].extend(range(5))  # KeyError: 0
    d.setdefault(0, []).extend(range(5))
    d[0]  # [0, 1, 2, 3, 4]
    • A defaultdict is a dict subclass that calls a factory function to provide a default value for missing keys.

      from collections import defaultdict  # defaultdict(default_factory=None, /, [...])
      
      # factory: list -> defaults to []
      d_list = defaultdict(list)
      d_list[0].extend(range(5))      # auto-creates [] then extends
      
      # factory: int -> defaults to 0
      d_int = defaultdict(int)
      d_int['count'] += 1             # auto-creates 0 then increments
    • A Counter is a dict subclass for counting hashable items, storing elements as keys and their frequencies as values.

      from collections import Counter  # Counter(iterable=None, /, **kwds)
      
      word = 'banana'
      
      # O(N*K) for K distinct letters, up to O(N²): one full scan of the string per distinct letter
      {l: word.count(l) for l in set(word)}
      
      # O(N) — single pass population
      c = Counter(word)               # Counter({'a': 3, 'n': 2, 'b': 1})
      c.most_common(1)                # [('a', 3)]
      list(c.elements())              # ['b', 'a', 'a', 'a', 'n', 'n']
      c['z']                          # missing keys return 0
    • A TypedDict is a typing construct that declares a fixed set of string keys and value types for static validation, providing a schema for JSON-like maps while remaining a plain dict at runtime.

      from typing import TypedDict, NotRequired
      
      class User(TypedDict):
          name: str                   # Required
          id: int                     # Required
          email: NotRequired[str]     # Optional (PEP 655)
      
      # 1. type check: static tools (Mypy) flag missing keys or wrong types
      user: User = {"name": "Alice", "id": 42}
      
      # 2. runtime: a plain dict
      print(type(user))               # <class 'dict'>
      print(user["name"])             # Standard string-key access
  • A set is a mutable, unordered collection of unique, hashable elements, built with curly braces {} or the set() constructor.

    {}            # <class 'dict'>
    {0, 2, 4, 6}  # {0, 2, 4, 6}
    
    set()                                                 # set()
    set('letter')                                         # {'l', 't', 'r', 'e'}
    set({'cat': 'meow', 'dog': 'bark', 'cattle': 'moo'})  # {'cat', 'cattle', 'dog'}
    
    frozenset()                    # frozenset()
    frozenset([3, 1, 4, 1, 5, 9])  # frozenset({1, 3, 4, 5, 9})
    # len(), add(), remove()
    nums = {0, 1, 2, 3, 4, }
    len(nums)  # 5
    nums.add(5)  # {0, 1, 2, 3, 4, 5}
    nums.remove(0)  # {1, 2, 3, 4, 5}
    # iteration
    for num in {0, 2, 4, 6, 8}:
        print(num, end='\t')
    # 0	2	4	6	8
    # testing
    2 in {0, 2, 4}  # True
    3 in {0, 2, 4}  # False
    # `&`: intersection(), `|`: union(), `-`: difference(), `^`: symmetric_difference()
    a = {1, 3}
    b = {2, 3}
    a & b  # {3}
    a | b  # {1, 2, 3}
    a - b  # {1}
    a ^ b  # {1, 2}
    # `<=`: issubset(), `<`: proper subset, `>=`: issuperset(), `>`: proper superset
    a <= b  # False
    a < b  # False
    a >= b  # False
    a > b  # False
    # set comprehensions: { expression for item in iterable }
    {num for num in range(10)}  # {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
    
    # set comprehensions: { expression for item in iterable if condition }
    {num for num in range(10) if num % 2 == 0}  # {0, 2, 4, 6, 8}
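
To tie the dictionary and set tools in this section together, here is a small sketch that groups words by first letter twice, once with setdefault() and once with defaultdict, then counts initials with Counter and intersects two letter sets. The word list is made up for illustration.

```python
from collections import Counter, defaultdict

words = ['cat', 'cattle', 'dog', 'duck', 'sheep']   # illustrative data

# grouping with a plain dict and setdefault()
by_setdefault = {}
for w in words:
    by_setdefault.setdefault(w[0], []).append(w)

# the same grouping with defaultdict(list)
by_factory = defaultdict(list)
for w in words:
    by_factory[w[0]].append(w)

initials = Counter(w[0] for w in words)             # frequency of first letters
shared = set('cat') & set('cattle')                 # letters common to both words

print(by_setdefault)   # {'c': ['cat', 'cattle'], 'd': ['dog', 'duck'], 's': ['sheep']}
print(initials['c'], initials['s'], initials['z'])  # 2 1 0
```

Both groupings produce the same mapping; defaultdict simply moves the "create the empty list" step into the container.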

10. Iterations

An iterable is an object supporting the iter() call, while an iterator is the specific object returned by that call which supports next() to produce values.

An iterator is any object with a __next__ method that returns the next result and raises StopIteration to signal completion, providing the mechanism by which the iteration protocol advances and terminates.

The iteration protocol—utilized by tools like for loops, comprehensions, and map(), and implemented by objects like files, lists, and generators—relies on two key steps:

  • An iterable’s __iter__ method, triggered by iter(), returns an iterator to manage the state and lifecycle of the iteration.

  • The iterator’s __next__ method, triggered by next(), produces values sequentially until a StopIteration exception signals the end of the series.

nums = [1, 2]          # iterable
i = iter(nums)         # iterator created here
print(next(i))         # 1
print(next(i))         # 2
# next(i) now raises StopIteration
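
The two protocol steps can also be implemented by hand. The following is a minimal sketch of a user-defined iterator, using a hypothetical class named Countdown, together with the manual loop that a for statement performs behind the scenes.

```python
class Countdown:
    """Hypothetical iterator that counts down from `start` to 1."""
    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self                      # an iterator is its own iterable

    def __next__(self):
        if self.current <= 0:
            raise StopIteration          # signals the end of the series
        value = self.current
        self.current -= 1
        return value

print(list(Countdown(3)))                # [3, 2, 1]

# roughly what `for item in Countdown(2)` does internally
collected = []
it = iter(Countdown(2))
while True:
    try:
        collected.append(next(it))
    except StopIteration:
        break
print(collected)                         # [2, 1]
```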

Iteration contexts in Python include the for loop, list comprehensions, the map built-in, and the in membership test, as well as the built-ins sorted, sum, any, all, list, and tuple, the string join method, and sequence assignment. All of them use the iteration protocol to step across an iterable one item at a time.

List comprehensions usually run faster than equivalent manual for loops: both are compiled to bytecode, but the comprehension appends with a dedicated instruction instead of looking up and calling res.append on every iteration.

L = [1, 2, 3, 4, 5]
res = []
for x in L:
    res.append(x+10)
print(res)  # [11, 12, 13, 14, 15]
res2 = [x + 10 for x in L]
print(res2)  # [11, 12, 13, 14, 15]
# filter clauses: if
[line.rstrip() for line in open('script2.py') if line[0] == 'p']
# nested loops: for
[x + y for x in 'abc' for y in 'lmn']
# all, any, map, filter, reduce, zip, enumerate, shuffle, sample, reversed, sorted
nums = list(range(10))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

all(num > 0 for num in nums)  # False
any(num > 0 for num in nums)  # True

# map, filter, zip, and enumerate return lazy iterators; wrap them in list() to see the items
list(map(lambda x: x * x, nums))  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
list(filter(lambda x: x % 2 == 0, nums))  # [0, 2, 4, 6, 8]
from functools import reduce
reduce(lambda x, y: x + y, nums)  # ((0 + 1) + 2) + ... = 45

list(zip(range(3), range(4), range(5)))  # [(0, 0, 0), (1, 1, 1), (2, 2, 2)]

funcs = ['map', 'filter', 'reduce']
list(enumerate(funcs))     # [(0, 'map'), (1, 'filter'), (2, 'reduce')]
list(enumerate(funcs, 1))  # [(1, 'map'), (2, 'filter'), (3, 'reduce')]
[(i, func) for i, func in enumerate(funcs, start=1)]  # [(1, 'map'), (2, 'filter'), (3, 'reduce')]

# from random import shuffle, sample
shuffle(nums)  # Shuffle list x in place, and return None.
nums  # [4, 2, 5, 9, 6, 0, 1, 3, 8, 7]
sample(nums, k=len(nums))  # [5, 3, 7, 6, 8, 4, 0, 1, 2, 9]

list(reversed(nums))  # [7, 8, 3, 1, 0, 6, 9, 5, 2, 4] (reversed() itself returns an iterator)
nums[::-1]  # [7, 8, 3, 1, 0, 6, 9, 5, 2, 4]

sorted(nums, reverse=True)  # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
import itertools

names = ["Alan", "Adam", "Wes", "Will", "Albert", "Steven"]

for letter, group in itertools.groupby(names, lambda x: x[0]):  # groups consecutive items only
    print(letter, list(group))

# A ['Alan', 'Adam']
# W ['Wes', 'Will']
# A ['Albert']
# S ['Steven']

for num in itertools.chain(range(3), range(3, 7), range(7, 10), [10]):
    print(num, end='\t')

# 0       1       2       3       4       5       6       7       8       9       10

list(itertools.combinations([0, 1, 2], 2))
# [(0, 1), (0, 2), (1, 2)]

list(itertools.combinations_with_replacement([0, 1, 2], 2))
# [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]

list(itertools.permutations([0, 1, 2], 2))
# [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]

list(itertools.product([0, 1, 2], [3, 4, 5], repeat=1))
# [(0, 3), (0, 4), (0, 5), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5)]
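
Two practical consequences of the iteration protocol are worth a short sketch: iterators such as the result of map() are one-shot, and itertools.groupby only merges adjacent items, so the input is usually sorted by the same key first. The data below reuses the names list from the groupby example.

```python
import itertools

squares = map(lambda x: x * x, [1, 2, 3])
first_pass = list(squares)               # consumes the iterator
second_pass = list(squares)              # nothing is left to produce

names = ["Alan", "Adam", "Wes", "Will", "Albert", "Steven"]
grouped = {
    letter: list(group)
    for letter, group in itertools.groupby(sorted(names), key=lambda n: n[0])
}

print(first_pass, second_pass)           # [1, 4, 9] []
print(grouped)
# {'A': ['Adam', 'Alan', 'Albert'], 'S': ['Steven'], 'W': ['Wes', 'Will']}
```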

11. Files and directories

A file is a byte sequence identified by a filename within a directory-based filesystem, categorized into text files—which automatically handle Unicode encoding and line endings—and binary files, which provide raw, unaltered access via the bytes type.

  • A file is opened by open() with an optional mode selecting read/write access and text or binary handling, returning a stream object for reading or writing data.

    open(f, 'r')  # read an EXISTING file
    open(f, 'w')  # create or overwrite a file
    open(f, 'a')  # create or append to a file
    open(f, 'x')  # create a NON-EXISTING file (fails if exists)
    
    open(f, 'r+') # read and write an EXISTING file
    open(f, 'w+') # read and write a file (creates or overwrites)
    open(f, 'a+') # read and append to a file (creates if missing)
    
    open(f, 'rb') # read an EXISTING file as a raw stream of bytes
    open(f, 'wb') # write a file as a raw stream of bytes
    # text mode (str): .txt, .csv, .json
    with open("file.txt", "w", encoding="utf-8") as f:
        f.write("Line 1\n")            # write string
        f.writelines(["L2\n", "L3\n"]) # write list of strings
    
    with open("file.txt", "r", encoding="utf-8") as f:
        content = f.read()             # read entire file as a single string
        f.seek(0)                      # rewind to the start before reading again
        lines = f.readlines()          # read entire file as a list of strings
        f.seek(0)
        for line in f: ...             # read line by line (lazy loading)
    # binary mode (bytes): .jpg, .pdf, .zip, .exe
    with open("image.jpg", "rb") as f:
        header = f.read(10)            # read first 10 bytes
        data = f.read()                # read remainder as bytes object
    
    with open("copy.jpg", "wb") as f:
        f.write(data)                  # write bytes object
    • By default, files open in text mode (t) with universal newlines: on read, OS-specific endings (CRLF on Windows, LF on Unix) are mapped to the standard \n character, and on write, \n is translated to the platform's line separator.

      open(f, 'r', newline=None)  # default enables universal newline translation
      open(f, 'r', newline='')    # disables translation to return raw endings
      open(f, 'w', newline='\n')  # forces LF line endings regardless of OS
    • By default, text files are decoded with a system-dependent locale encoding (e.g., cp1252 on Windows), which can cause cross-platform failures when reading UTF-8 files.

      import locale
      
      print(locale.getpreferredencoding())  # preferred encoding
      
      open(f, 'r', encoding='utf-8')        # explicit & safe
  • pathlib is a modern, object-oriented module for path manipulation, replacing the raw string-based logic of os.path.

    from pathlib import Path
    
    # 1. initialization
    home = Path.home()                # home dir
    cfg = Path("data/v1/config.yaml") # object initialization
    p = Path.cwd() / "src" / "app.py" # path combination
    
    # 2. attributes
    p.name    # 'app.py'
    p.stem    # 'app'
    p.suffix  # '.py'
    p.parent  # parent dir: <cwd>/src
    p.parts   # (..., 'src', 'app.py'), preceded by the parts of the cwd
    
    # 3. verification & metadata
    p.exists()  # existence
    p.is_file() # file
    p.is_dir()  # directory
    p.stat()    # size, mtime, etc.
    p.resolve() # absolute path
    
    # 4. mutations
    p.mkdir(parents=True, exist_ok=True) # create dir + parents
    p.touch()                            # create file/update timestamp
    p.unlink(missing_ok=True)            # delete file
    p.rmdir()                            # delete empty dir
    p.rename("new.py")                   # move/rename
    p.replace("new.py")                  # atomic move/overwrite
    
    # 5. search & iteration
    p.iterdir()     # shallow contents generator
    p.glob("*.csv") # shallow pattern match
    p.rglob("*.py") # recursive pattern match
    
    # 6. stream & I/O
    with p.open('r') as f: f.read()  # manual stream
    p.read_text(encoding="utf-8")           # one-shot read (locale encoding if omitted)
    p.write_text("data", encoding="utf-8")  # one-shot write (locale encoding if omitted)

    pathlib supports * (shallow), ** (recursive), ? (single-char), and [] (sets/ranges), but excludes shell-style {} expansion.

    p.glob("*.py")       # shallow  : current directory only
    p.glob("**/*.py")    # recursive: explicit double-star pattern
    p.rglob("*.py")      # recursive: shorthand (implies ** prefix)
    
    # multi-extension workaround (no {} support)
    target_exts = {'.jpg', '.png', '.gif'}
    images = (f for f in p.rglob("*") if f.suffix in target_exts)
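
The file and pathlib calls above can be exercised safely in a throwaway directory. This is a sketch under the assumption that a temporary directory is acceptable; the file names are invented for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes").mkdir()
    text_file = root / "notes" / "farm.txt"           # hypothetical file names
    text_file.write_text("cat\ndog\n", encoding="utf-8")
    (root / "pic.png").write_bytes(b"\x89PNG")        # raw bytes, no decoding

    # text mode with an explicit encoding; iterate lazily line by line
    with text_file.open("r", encoding="utf-8") as f:
        lines = [line.rstrip("\n") for line in f]

    # multi-extension search, since glob patterns have no {} expansion
    wanted = {".txt", ".png"}
    found = sorted(p.name for p in root.rglob("*") if p.suffix in wanted)

print(lines)   # ['cat', 'dog']
print(found)   # ['farm.txt', 'pic.png']
```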

12. Functions

Python functions are first-class objects existing as named blocks with def, anonymous expressions with lambda, or methods with a bound instance.

  • def is a statement creating a named function at runtime, while lambda is an expression coding an anonymous, single-expression function.

    def add(x, y): return x + y     # named, multiple statements
    add_alt = lambda x, y: x + y    # anonymous, one expression
    
    if persistent:
        def save(): ...             # def works inside logic blocks
    else:
        save = lambda: None         # lambda works where expressions are expected
    
    def future_func(): pass         # NOOP: classic
    def todo_func(): ...            # NOOP: modern
  • return sends a result and exits, while yield produces a result and suspends state to generate a series over time.

    def get_one(): return 1         # terminate
    def get_seq(): yield 1; yield 2 # generator
  • global binds names to the module-level scope, while nonlocal binds names to the nearest enclosing function scope.

    # global: modifies module-level x
    def change_global():
        global x; x = 2
    
    # nonlocal: modifies outer function y
    def outer():
        y = 1
        def inner():
            nonlocal y; y = 2
  • Python uses pass-by-assignment, matching arguments from left to right by default, or by keyword (name=value).

    def myfunc(arg1, arg2, meat='ham', *args, **kwargs): ...
    
    # 'spam', 'eggs' -> positional
    # meat='ham'     -> keyword
    # *args          -> unpacks an iterable into extra positional arguments
    # **kwargs       -> unpacks a mapping into extra keyword arguments
    args, kwargs = (), {'drink': 'tea'}
    myfunc('spam', 'eggs', meat='ham', *args, **kwargs)

    In Python, the / indicates that everything before it is positional-only, and the * (when used alone) indicates that everything after it is keyword-only.

    def feed_animal(qty, /, kind="goat"): ...  # 'qty' is positional-only; 'kind' is standard
    feed_animal(5, "sheep")                    # valid
    # feed_animal(qty=5, kind="sheep")         # TypeError: 'qty' is positional-only
    
    def harvest(*, crop, tool="scythe"): ...   # everything to the right must be named
    harvest(crop="wheat")                      # valid
    # harvest("wheat")                         # TypeError: takes 0 positional arguments
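
Both markers can appear in one signature. The sketch below uses a hypothetical function named feed and shows which call styles are accepted and which raise TypeError.

```python
def feed(qty, /, kind="goat", *, times=1):
    """qty: positional-only, kind: either style, times: keyword-only."""
    return f"{qty * times} portions for the {kind}"

by_position = feed(2, "sheep", times=3)
by_keyword = feed(2, kind="sheep", times=3)

errors = []
for bad_call in (lambda: feed(qty=2), lambda: feed(2, "sheep", 3)):
    try:
        bad_call()
    except TypeError as exc:
        errors.append(type(exc).__name__)

print(by_position)   # 6 portions for the sheep
print(errors)        # ['TypeError', 'TypeError']
```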

12.1. Attributes and annotations

  • A function is a first-class object supporting system and user-defined attributes alongside metadata annotations.

    # 1. annotations (type hint vs. general metadata)
    def cube(n: int) -> int: ...            # type hint
    def spam(a: 'tag'): return a            # general metadata
    
    # 2. user-defined attribute
    cube.category = "math"
    
    # 3. system-defined attributes
    print(cube.__name__)                    # 'cube'
    print(cube.__annotations__)             # {'n': <class 'int'>, 'return': <class 'int'>}
    print(cube.category)                    # 'math'
    
    # 4. first-class citizen (higher-order function)
    def execute(func, value):
        return func(value)
    
    print(execute(cube, 3))                 # 27
    print(execute(lambda n: n**3, 3))       # 27

12.2. Lambdas

  • A lambda expression is created by the keyword lambda with a comma-separated argument list and a single expression that returns the function’s result.

    from functools import reduce
    nums = range(10)
    
    # map: mapping functions over iterables
    list(map(lambda x: x+1, nums))              # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    
    # filter: selecting items in iterables
    list(filter(lambda x: x % 2 == 0, nums))    # [0, 2, 4, 6, 8]
    
    # reduce: combining items in iterables
    reduce(lambda x, y: x+y, nums)              # 45
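
Lambdas are also commonly passed as the key argument of sorted(), max(), and min(). A brief sketch, reusing the animals dictionary from the dictionary section as sample data:

```python
animals = {'cat': 'meow', 'dog': 'bark', 'cattle': 'moo'}

# sort (name, sound) pairs by the sound instead of the name
by_sound = sorted(animals.items(), key=lambda pair: pair[1])

# pick the key with the longest name
longest = max(animals, key=lambda name: len(name))

print(by_sound)  # [('dog', 'bark'), ('cat', 'meow'), ('cattle', 'moo')]
print(longest)   # cattle
```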

12.3. Namespaces

A namespace is a scope where names live within LEGB levels (local, enclosing, global, and built-in).

  • A name resolution is the process of searching LEGB levels in order and stopping at the first match.

  • A name assignment is bound to the local scope by default, unless overridden by global or nonlocal.

    a = 5.21                    # global (G)
    
    def tester(start):
        state = start           # enclosing (E)
        def nested(label):
            nonlocal state      # bound to 'state' in tester
            global a            # bound to 'a' at module level
    
            state += 1
            print(locals())     # local names and values
            print(globals())    # global names and values
    
            print(vars())       # local names and values (same as locals)
            import math
            print(vars(math))   # attribute names and values of math module
        return nested
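
The lookup order and the effect of nonlocal can be observed with a few small functions. This is a sketch with hypothetical names (outer, shadow, reader, make_counter) chosen only for the demonstration.

```python
label = 'global'                      # G: module level

def outer():
    label = 'enclosing'               # E: enclosing function scope
    def shadow():
        label = 'local'               # L: assignment creates a new local name
        return label
    def reader():
        return label                  # no local 'label', so E is found first
    return shadow(), reader()

def make_counter():
    count = 0
    def bump():
        nonlocal count                # rebind 'count' in make_counter's scope
        count += 1
        return count
    return bump

scopes = outer()
bump = make_counter()
bump()
total = bump()

print(scopes, label, total)           # ('local', 'enclosing') global 2
```

Without the nonlocal declaration, count += 1 would raise UnboundLocalError, because the assignment would make count local to bump.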

12.4. Closures

  • A closure is a function object that remembers the values in its enclosing lexical environment even after the outer scope has finished executing.

    A lexical environment (or lexical scope) is a static structure where variable accessibility is determined by the physical placement of code at write-time rather than the execution path at runtime.

    # 1. named closure (using def)
    def maker(n):
        def action(x):
            return x ** n  # remembers n from enclosing scope
        return action
    
    f = maker(2)
    g = maker(3)
    
    print(f(4))  # 16 (remembers n=2)
    print(g(4))  # 64 (remembers n=3)
    
    # 2. anonymous closure (using lambda)
    def lambda_maker(n):
        return lambda x: x ** n  # n captured by lambda expression
    
    h = lambda_maker(4)
    print(h(2))  # 16 (remembers n=4)

    If a lambda or def defined within a function is nested inside a loop, all generated functions will share the loop variable’s final value because the variable is bound late at call-time rather than definition-time.

    # 1. the trap: late binding
    def make_actions():
        # 'i' is not "captured" yet; it is just a name to be looked up later
        return [lambda x: i ** x for i in range(5)]
    
    acts = make_actions()
    # At call-time, the loop is finished and 'i' is 4 in the enclosing scope
    print([f(2) for f in acts])  # [16, 16, 16, 16, 16]
    
    # 2. the fix: early binding
    def make_actions():
        # i=i binds the current value of 'i' to a local parameter immediately
        return [lambda x, i=i: i ** x for i in range(5)]
    
    acts = make_actions()
    print([f(2) for f in acts])  # [0, 1, 4, 9, 16]
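
Two related points can be sketched briefly: the captured value of a closure is visible through its __closure__ cells, and functools.partial is an alternative to the i=i default-argument fix because it binds the value when the partial object is created. The helper power is hypothetical.

```python
from functools import partial

def maker(n):
    def action(x):
        return x ** n
    return action

square = maker(2)
captured = square.__closure__[0].cell_contents   # the remembered n

def power(base, exponent):
    """Hypothetical helper used to avoid a late-binding lambda."""
    return base ** exponent

acts = [partial(power, i) for i in range(5)]     # i is bound immediately
results = [f(2) for f in acts]

print(captured)  # 2
print(results)   # [0, 1, 4, 9, 16]
```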

12.5. Generators

  • A generator is a specialized iterator—a one-way stream that produces items one at a time on demand through the iteration protocol instead of returning a complete sequence at once.

    • A generator function is a generator factory with a def statement and yield to produce an object featuring state suspension, retaining its local scope and code position between yields.

      def count_factory(n):
          for i in range(n):
              yield i               # suspends execution, retains local scope/position
      
      def delegated_factory(n):
          yield from range(n)       # shorthand for "for i in range(n): yield i"
      
      for val in count_factory(5):
          print(val)                # 0, 1, 2, 3, 4
    • A generator expression is a memory-saving shorthand written with () that produces items on demand; it may run slightly slower than a list comprehension when every item is consumed, but it avoids building the whole result in memory, which matters for large datasets.

      gen_exp = (i for i in range(5))  # memory-efficient
      
      print(next(gen_exp))             # 0            / yields on-demand
      print(list(gen_exp))             # [1, 2, 3, 4] / exhausts the remaining values
      
      next(gen_exp)                    # raises StopIteration (generator is exhausted)
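
Because generators produce values on demand, they can describe an unbounded series and be chained into pipelines. A minimal sketch, assuming a hypothetical generator function named naturals:

```python
from itertools import islice

def naturals():
    """Hypothetical endless generator: 0, 1, 2, ..."""
    n = 0
    while True:
        yield n                       # suspends here until the next request
        n += 1

evens = (n for n in naturals() if n % 2 == 0)    # lazy filter stage
first_five = list(islice(evens, 5))              # pull only five items

print(first_five)  # [0, 2, 4, 6, 8]
```

Nothing is computed until islice requests a value, so the infinite loop in naturals never runs away.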

13. Classes

A class is a blueprint defining a namespace of shared attributes to create instance objects with a unique namespace for instance attributes while delegating shared attribute lookups to the class.

class Animal:
    """blueprint for creating animal instances with shared and unique traits."""

    kingdom    = "Animalia"       # public
    _territory = "Earth"          # protected (convention)
    __ancient  = "Paleolith"      # private (mangled to _Animal__ancient)

    def __init__(self, species, color, voice):
        self.voice     = voice    # public
        self._color    = color    # protected (convention)
        self.__species = species  # private (mangled)
        # self refers to the specific instance object being created

    # stacking @classmethod on @property was deprecated in Python 3.11 and
    # removed in 3.13, so class-level getters are plain class methods here
    @classmethod
    def territory(cls):
        """getter for protected class territory."""
        return cls._territory

    @classmethod
    def ancient(cls):
        """getter for private class ancient."""
        return cls.__ancient

    @property
    def species(self):
        """getter for private instance species."""
        return self.__species

    @species.setter
    def species(self, value):
        """setter for private instance species."""
        self.__species = value

    @property
    def color(self):
        """getter for protected instance color."""
        return self._color

    @color.setter
    def color(self, value):
        """setter for protected instance color."""
        self._color = value

    def wow(self):
        """prints the animal voice."""
        print(f"{self.voice}!")

    @classmethod
    def change_kingdom(cls, new_name):
        """modifies the shared class namespace."""
        cls.kingdom = new_name
        # cls refers to the class object itself, not a specific instance

    @staticmethod
    def is_living():
        """utility bound to the class namespace."""
        return True

dog = Animal("Canine", "Brown", "Woof")
cat = Animal("Feline", "Orange", "Meow")

dog.wow()                  # Woof!
print(cat.species)         # Feline

print(Animal.territory())  # Earth
print(Animal.ancient())    # Paleolith

The term attribute is an umbrella for any named member accessed with dot (.) notation of an object (states, properties, or methods), which are categorized by scopes (class and instance), visibility (public, protected, and private), and storage mechanisms (__dict__ and __slots__).

  • __dict__ is an object-level dictionary that stores an object’s attributes, supporting attribute lookup and dynamic extension.

    class Animal:
        def __init__(self, species):
            self.species = species
    
    dog = Animal("Canine")
    print(dog.__dict__)                 # {'species': 'Canine'}
    
    dog.color = "Brown"                 # add a data attribute
    print(dog.__dict__)                 # {'species': 'Canine', 'color': 'Brown'}
    
    dog.wow = lambda: print("Woof!")    # add a callable attribute (a plain function, not bound to self)
    print(dog.__dict__)                 # {..., 'wow': <function <lambda> at ...>}
    
    dog.wow()                           # Woof!
    
    cat = Animal("Feline")
    print(cat.__dict__)                 # {'species': 'Feline'}
    
    Animal.kingdom = "Animalia"         # add a class data attribute
    print(Animal.__dict__)              # {..., 'kingdom': 'Animalia'}
    
    print(cat.__dict__)                 # {'species': 'Feline'}
    
    print(dog.kingdom)                  # Animalia
    print(cat.kingdom)                  # Animalia
  • __slots__ is a class-level sequence that declares a fixed set of permitted attributes, saving memory by omitting the per-instance __dict__.

    class SlottedAnimal:
        __slots__ = ('species', 'color')
    
        def __init__(self, species):
            self.species = species
    
    bird = SlottedAnimal("Avian")
    
    bird.color = "Blue"
    print(f"{bird.species}, {bird.color}") # Avian, Blue
    
    # print(bird.__dict__) -> 'SlottedAnimal' object has no attribute '__dict__'
    # bird.age = 5         -> 'SlottedAnimal' object has no attribute 'age' and no __dict__ for setting new attributes

__new__, __init__, and __del__ are specialized lifecycle methods to instantiate the object (memory), initialize its attributes (state), and clean up resources (destruction).

class Robot:
    def __new__(cls, *args, **kwargs):
        print("__new__")
        return super().__new__(cls)

    def __init__(self, name):
        self.name = name
        print("__init__")

    def __del__(self):
        print("__del__")

bot = Robot("WALL·E")  # triggers __new__ then __init__
del bot                 # triggers __del__
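
The storage and visibility rules described above can be verified at runtime. The sketch below uses two hypothetical classes, Plain and Slotted, to show name mangling of a double-underscore attribute and the AttributeError raised for an undeclared slot.

```python
class Plain:
    def __init__(self):
        self.__secret = 1             # stored as _Plain__secret

class Slotted:
    __slots__ = ('x',)                # only 'x' may be assigned

plain = Plain()
mangled = '_Plain__secret' in plain.__dict__

slotted = Slotted()
slotted.x = 1
try:
    slotted.y = 2                     # not declared in __slots__
    rejected = False
except AttributeError:
    rejected = True

has_dict = hasattr(slotted, '__dict__')

print(mangled, rejected, has_dict)    # True True False
```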

13.1. Methods

  • An instance method is a function that implicitly receives (binds) the instance as its first argument (self) to reference and manipulate instance attributes.

    class Robot:
        def __init__(self, name):
            self.name = name
    
        def rename(self, new_name):
            self.name = new_name
    
    bot = Robot('WALL·E')
    bot.rename('WALL·EVE')
  • A class method is a function decorated by @classmethod that implicitly receives (binds) the class as its first argument (cls) to reference and manage class attributes.

    class Robot:
        name = 'WALL·E'
    
        @classmethod
        def rename(cls, new_name):
            cls.name = new_name
    
    Robot.rename('WALL·EVE')
  • A static method is a function decorated by @staticmethod that receives no implicit argument and serves as a namespace-bound utility.

    class Robot:
        @staticmethod
        def is_sustainable(plant_count):
            """Check if life is sustainable based on current plant discovery."""
            return plant_count > 0
    
    if Robot.is_sustainable(1):
        print("Return to Earth!")
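
A common use of the three method kinds together is an alternative constructor. This is a sketch only: the from_csv and is_valid_name methods and the count attribute are hypothetical additions, not part of the Robot examples above.

```python
class Robot:
    count = 0                                  # shared class attribute

    def __init__(self, name):
        self.name = name                       # instance attribute
        Robot.count += 1

    @classmethod
    def from_csv(cls, row):
        """Alternative constructor: build a robot from 'name,role' text."""
        return cls(row.split(',')[0])

    @staticmethod
    def is_valid_name(name):
        """Utility that needs neither the instance nor the class."""
        return bool(name.strip())

    def rename(self, new_name):
        if self.is_valid_name(new_name):
            self.name = new_name

bot = Robot.from_csv('EVE,probe')
bot.rename('   ')                              # ignored: blank name
print(bot.name, Robot.count)                   # EVE 1
```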

13.2. Inheritances

  • A subclass is a child class that extends or overrides the functionality from one or more base classes.

    class Robot:
        def __init__(self, name):
            self.name = name
    
        def move(self):
            print(f"{self.name} moves on treads.")
    
    class WallE(Robot):
        def __init__(self, name):
            self.name = name
    
        def work(self):
            print(f"{self.name} compacting trash.")
    • A Mixin is a small, specialized class used in multiple inheritance to plug in a specific feature (like flight or tread) without changing the core identity of the target class (like a Robot).

      class Robot:
          def __init__(self, name):
              self.name = name
      
      class FlightMixin:
          def move(self):
              print(f"{self.name} is flying through the air!")
      
      class TreadMixin:
          def move(self):
              print(f"{self.name} is rolling on treads.")
      
      class Eve(FlightMixin, Robot):
          """Identity: Robot | Feature: Flight"""
          def scan(self):
              # Uses the identity name to perform a unique action
              print(f"{self.name} is scanning for plant life...")
      
      class WallE(TreadMixin, Robot):
          """Identity: Robot | Feature: Treads"""
          def work(self):
              # Uses the identity name to perform a unique action
              print(f"{self.name} is compacting trash cubes.")
      
      issubclass(Eve, Robot)         # True
      issubclass(Eve, FlightMixin)   # True
      issubclass(WallE, Robot)       # True
      issubclass(WallE, TreadMixin)  # True
  • In Python, inheritance is an attribute lookup process that uses C3 Linearization to flatten class hierarchies into a single, predictable search path called the MRO (Method Resolution Order).

    • Class.mro() is a method that returns a list of classes representing the search path derived from C3 Linearization for attribute lookup.

      class Base: ...
      
      class Mixin(Base): ...
      
      class Child(Mixin, Base): ...
      
      print(Child.mro())              # [<class 'Child'>, <class 'Mixin'>, <class 'Base'>, <class 'object'>]
    • super() is a proxy object that delegates method calls to the next class in the MRO without specifying the class name explicitly.

      class Robot:
          def __init__(self, name):
              self.name = name
      
      class WallE(Robot):
          def __init__(self, name):
              super().__init__(name)      # FLEXIBLE: finds Robot automatically
      
      class Eve(Robot):
          def __init__(self, name):
              Robot.__init__(self, name)  # RIGID: must name class and pass 'self' manually
  • In Python, duck typing is a loose implementation of polymorphism that prioritizes an object’s behavior (methods and attributes) over its inheritance or class identity.

    # If it walks like a duck and quacks like a duck, it’s a duck.
    #     —— A Wise Person
    class Duck:
    
        def wow(self):
            return 'quack!'
    
    class Cat:
    
        def wow(self):
            return 'meow!'
    
    def speak(entity):
        print(entity.wow())
    
    speak(Duck())  # quack!
    speak(Cat())   # meow!
  • ABC (Abstract Base Class) is an explicit contract for interfaces with runtime type checking (i.e., isinstance()), while Protocol is an implicit shape using structural subtyping for static type checking (e.g., Pyright), aligning with Python’s duck typing philosophy.

    from abc import ABC, abstractmethod
    from typing import Protocol
    
    class RobotABC(ABC):
        @abstractmethod
        def move(self): ...
    
    class WallE(RobotABC):                  # explicitly inherits
        def move(self):
            print("Solar rolling...")
    
    class Flyer(Protocol):
        def move(self) -> None: ...
    
    class Eve:                              # implicitly matches
        def move(self):
            print("Thruster flight...")
    
    def activate(unit: Flyer):
        unit.move()
    
    activate(WallE())                       # works (has .move)
    activate(Eve())                         # works (has .move)
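
To see the MRO and super() cooperate, and to see an ABC enforce its contract, consider this sketch. The class names (Base, Left, Right, Child, Mover) are hypothetical and chosen only to form a diamond hierarchy.

```python
from abc import ABC, abstractmethod

class Base:
    def describe(self):
        return ['Base']

class Left(Base):
    def describe(self):
        return ['Left'] + super().describe()    # next class in the MRO

class Right(Base):
    def describe(self):
        return ['Right'] + super().describe()

class Child(Left, Right):
    def describe(self):
        return ['Child'] + super().describe()

order = [cls.__name__ for cls in Child.mro()]
chain = Child().describe()

class Mover(ABC):
    @abstractmethod
    def move(self): ...

try:
    Mover()                                     # abstract method not implemented
    blocked = False
except TypeError:
    blocked = True

print(order)    # ['Child', 'Left', 'Right', 'Base', 'object']
print(chain)    # ['Child', 'Left', 'Right', 'Base']
print(blocked)  # True
```

Note that super() inside Left reaches Right, not Base, because the next class is determined by the MRO of the instance's class.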

13.3. Enumeration

  • An Enum (Enumeration) is a specialized class that represents a set of symbolic names bound to unique, constant values.

    from enum import Enum, auto
    
    class RobotStatus(Enum):
        IDLE   = auto()         # 1
        MOVING = auto()         # 2
        ERROR  = 99             # explicit override
    
    status = RobotStatus.IDLE
    print(status.name)          # IDLE
    
    for status in RobotStatus:
        print(f"{status.name:8} | {status.value}")
  • A Flag is a specialized Enum that represents a set of simultaneous states or permissions using bitwise operators rather than a single choice.

    from enum import Enum, Flag, auto
    
    class RobotSystem(Flag):
        SENSORS = auto()                             # 1
        MOTORS  = auto()                             # 2
        WIFI    = auto()                             # 4
    
    active = RobotSystem.SENSORS | RobotSystem.WIFI
    
    print(RobotSystem.SENSORS in active)             # True
    print(RobotSystem.MOTORS in active)              # False
    
    for system in active:                            # iterating a Flag value requires Python 3.11+
        print(f"System Active: {system.name}")
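
As a small sketch building on the two classes above (redefined here so the block runs on its own), members can be looked up by value or by name, and individual flags can be switched on and off with the bitwise operators.

```python
from enum import Enum, Flag, auto

class RobotStatus(Enum):
    IDLE = auto()
    MOVING = auto()
    ERROR = 99

class RobotSystem(Flag):
    SENSORS = auto()
    MOTORS = auto()
    WIFI = auto()

by_value = RobotStatus(99)          # lookup by value
by_name = RobotStatus["MOVING"]     # lookup by name
print(by_value, by_name)            # RobotStatus.ERROR RobotStatus.MOVING

active = RobotSystem.SENSORS | RobotSystem.WIFI
active |= RobotSystem.MOTORS        # switch a flag on
active &= ~RobotSystem.WIFI         # switch a flag off
print(RobotSystem.WIFI in active)   # False
print(active.value)                 # 3
```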

13.4. Operator overloading

  • Operator Overloading allows custom classes to redefine the behavior of built-in operators (like +, ==, []) and built-in functions (like str(), len(), or callable()) using dunder methods.

    class Robot:
        def __init__(self, name, power):
            self.name, self.power = name, power
            self.tools = ["Scanner", "Laser"]
    
        # visibility
        def __repr__(self): return f"Robot({self.name!r}, {self.power})"
        def __str__(self):  return f"{self.name} ({self.power}%)"
    
        # identity (value equality & hashability)
        def __eq__(self, other):
            return isinstance(other, Robot) and self.name == other.name
        def __hash__(self): return hash(self.name)
    
        # collection logic
        def __len__(self): return len(self.tools)
        def __getitem__(self, idx): return self.tools[idx]
    
    walle = Robot("Wall-E", 100)
    
    print(walle)                        # Wall-E (100%)
    print(walle[1])                     # Laser
    print(len(walle))                   # 2
    print(walle == Robot("Wall-E", 50)) # True
  • A dataclass is a specialized class decorated by @dataclass (similar to Lombok in Java), designed primarily to store data while automatically generating boilerplate methods like __init__, __repr__, and __eq__.

    from dataclasses import dataclass
    
    @dataclass
    class Point:
        x: float
        y: float
    
    p1 = Point(1.0, 2.0)
    p2 = Point(1.0, 2.0)
    
    print(p1)        # Point(x=1.0, y=2.0)
    print(p1 == p2)  # True
  • The __call__ method makes an instance callable, so the object itself can be invoked with function-call syntax.

    class Robot:
        def __init__(self, name: str):
            self.name = name
    
        def __call__(self, task: str):
            print(f"{self.name} is now executing: {task}")
    
    walle = Robot("Wall-E")
    
    walle("Compressing Trash")  # Wall-E is now executing: Compressing Trash
  • A context manager (__enter__ and __exit__) is used to automate resource setup and cleanup of an object in a with statement block.

    class Robot:
        def __init__(self, name: str):
            self.name = name
    
        def __enter__(self):
            print(f"[Start] {self.name} connected.")
            return self
    
        def __exit__(self, *args):
            print(f"[End] {self.name} disconnected.")
    
    with Robot("Wall-E") as bot:
        print(f"Executing tasks with {bot.name}...")

    A context manager can also be defined as a generator using the @contextmanager decorator.

    from contextlib import contextmanager
    
    @contextmanager
    def robot_session(name: str):
        # __enter__ logic
        print(f"[Start] {name} online.")
        try:
            yield name  # what 'as' receives
        finally:
            # __exit__ logic
            print(f"[End] {name} offline.")
    
    with robot_session("Wall-E") as bot_name:
        print(f"Task: {bot_name} is cleaning.")
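
The arithmetic operators mentioned at the start of this section are not shown in the examples above. The following sketch uses a hypothetical Vector class to combine a frozen dataclass (which generates __eq__ and __hash__) with hand-written __add__ and __mul__. Returning NotImplemented lets Python try the other operand or raise TypeError.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Vector:                       # hypothetical example class
    x: float
    y: float

    def __add__(self, other):
        if not isinstance(other, Vector):
            return NotImplemented   # let Python try the reflected operation
        return Vector(self.x + other.x, self.y + other.y)

    def __mul__(self, factor):
        if not isinstance(factor, (int, float)):
            return NotImplemented
        return Vector(self.x * factor, self.y * factor)

    __rmul__ = __mul__              # allows 2 * v as well as v * 2

v = Vector(1.0, 2.0) + Vector(3.0, 4.0)
print(v)                            # Vector(x=4.0, y=6.0)
print(2 * v == v * 2)               # True
print(len({v, Vector(4.0, 6.0)}))   # 1 (equal values hash alike)
```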

14. Decorators

A decorator is a higher-order function—defined as either a function or a class—that wraps another function or class to extend its functionality without modifying its source code.

  • A function-based decorator is a higher-order function that uses a closure to intercept the target’s execution within an inner wrapper.

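    @functools.wraps(target) is a standard-library decorator that copies the original function’s metadata (such as __name__, __qualname__, __doc__, and __module__) onto the new wrapper and records the original as __wrapped__, which lets tools such as inspect.signature() report the original argument list.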
    import functools
    
    def trace(target):
        @functools.wraps(target)
        def wrapper(*args, **kwargs):
            print(f"[Trace] Intercepting: {target.__name__}")
            return target(*args, **kwargs)
        return wrapper
    
    # 1: intercepting a standalone function
    @trace
    def launch_rocket():
        print("Rocket launched!")
    
    # 2: intercepting class instantiation (the constructor call)
    @trace
    class Satellite:
        def __init__(self):
            print("Satellite object created.")
    launch_rocket()
    # [Trace] Intercepting: launch_rocket
    # Rocket launched!
    Satellite()
    # [Trace] Intercepting: Satellite
    # Satellite object created.
  • A class-based decorator is a class that implements __init__ to capture the target and __call__ to act as the higher-order interface, providing a structured way to maintain internal state.

    functools.update_wrapper(self, target) is a utility function used inside a class-based decorator to copy the target’s metadata (such as __name__, __doc__, and __module__) to the decorator instance.
    import functools
    
    class trace:
        def __init__(self, target):
            # captures the target and preserves its metadata
            functools.update_wrapper(self, target)
            self.target = target
            self.count = 0
    
        def __call__(self, *args, **kwargs):
            # acts as the higher-order interface to intercept execution
            self.count += 1
            print(f"[Trace] Intercepting call #{self.count} to: {self.target.__name__}")
            return self.target(*args, **kwargs)
    
    # 1: intercepting a function
    @trace
    def launch_rocket():
        print("Rocket launched!")
    
    # 2: intercepting class instantiation
    @trace
    class Satellite:
        def __init__(self):
            print("Satellite object created.")
    launch_rocket()
    launch_rocket()
    # [Trace] Intercepting call #1 to: launch_rocket
    # Rocket launched!
    # [Trace] Intercepting call #2 to: launch_rocket
    # Rocket launched!
    Satellite()
    # [Trace] Intercepting call #1 to: Satellite
    # Satellite object created.
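
Decorators that take their own arguments add one more layer of nesting: the outer function receives the configuration and returns the actual decorator. A minimal sketch, where the names repeat and ping are hypothetical:

```python
import functools

def repeat(times: int):
    """Decorator factory: returns a decorator configured with `times`."""
    def decorator(target):
        @functools.wraps(target)
        def wrapper(*args, **kwargs):
            result = None
            for _ in range(times):          # call the target `times` times
                result = target(*args, **kwargs)
            return result                   # result of the last call
        return wrapper
    return decorator

calls = []

@repeat(times=3)                            # same as: ping = repeat(times=3)(ping)
def ping(host: str) -> str:
    """Record a ping to `host`."""
    calls.append(host)
    return f"pinged {host}"

print(ping("axiom"))    # pinged axiom
print(len(calls))       # 3
print(ping.__name__)    # ping (preserved by functools.wraps)
```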

15. Exceptions

  • An exception is an instance of a class derived from BaseException; user-defined exceptions should inherit from Exception.

    class OopsException(Exception): pass  # user-defined exception
  • The raise statement raises (triggers) a built-in or user-defined exception.

    raise instance  # raise an instance of an exception class
    raise clazz     # raise a class: an instance is created with no constructor arguments
    raise           # re-raise the exception currently being handled
    try:
        1 / 0
    except Exception as E:
        raise TypeError('Bad') from E  # raise newexception from otherexception
    
    # Traceback (most recent call last):
    # ZeroDivisionError: division by zero
    #
    # The above exception was the direct cause of the following exception:
    #
    # Traceback (most recent call last):
    # TypeError: Bad
  • The assert statement raises an AssertionError exception if a condition is false.

    # assert test, data # the data part is optional
    assert False, 'Nobody expects the Spanish Inquisition!'  # AssertionError: Nobody expects the Spanish Inquisition!
  • The try statement catches and recovers from exceptions with one or more handlers for exceptions that may be raised during the block’s execution.

    # try -> except -> else -> finally
    try:
        raise OopsException('panic')  # raising exceptions
    except OopsException as err:  # 3.X localizes 'as' names to except block
        print(err)  # catch and recover from exceptions
    except (RuntimeError, TypeError, NameError) as err:  # multiple exceptions as a parenthesized tuple
        ...
    except Exception as other:  # except to catch all exceptions
        ...
    except:  # bare except also catches SystemExit and KeyboardInterrupt; generally avoid
        ...
    else:
        ... # run if no exception was raised during try block
    finally:  # termination actions
        ...
  • The with/as statement is designed to automate startup and termination activities that must occur around a block of code.

    # file = open('lumberjack.txt', 'w', encoding='utf-8')
    # try:
    #     file.write('The larch!\n')
    # finally:
    #     file.close()
    with open('lumberjack.txt', 'w', encoding='utf-8') as file:  # always close file on exit
        file.write('The larch!\n')
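
The clause order of the try statement and the effect of raise ... from ... can be observed directly. The sketch below (the helper run is hypothetical) records which clauses execute with and without an exception, then inspects the __cause__ attribute that exception chaining sets.

```python
def run(divisor: int) -> list[str]:
    """Return the order in which the try-statement clauses execute."""
    steps = []
    try:
        steps.append("try")
        1 / divisor
    except ZeroDivisionError:
        steps.append("except")
    else:
        steps.append("else")
    finally:
        steps.append("finally")
    return steps

print(run(1))   # ['try', 'else', 'finally']
print(run(0))   # ['try', 'except', 'finally']

# 'raise ... from ...' stores the original exception in __cause__
try:
    try:
        1 / 0
    except ZeroDivisionError as err:
        raise TypeError("Bad") from err
except TypeError as chained:
    cause = chained.__cause__   # keep a reference; 'chained' is unbound after the block

print(type(cause).__name__)     # ZeroDivisionError
```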

16. Ellipsis (...)

Ellipsis (...) is Python’s built-in Ellipsis object, a singleton constant (like None, True, False).

>>> ...
Ellipsis
>>> type(...)
<class 'ellipsis'>
>>> Ellipsis is ...
True
>>>
  • Use ... as a placeholder in function/class bodies (similar to pass):

    def function_to_implement_later():
        ...  # Placeholder - does nothing
    
    class IncompleteClass:
        ...  # Placeholder
    
    # Equivalent to:
    def function_to_implement_later():
        pass
  • In type annotations, ... represents any number of elements:

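    # Tuple with any number of ints (annotating *args this way would instead
    # mean that each positional argument is itself such a tuple)
    def process(args: tuple[int, ...]) -> None: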
        pass
    
    # Callable with variadic arguments
    from collections.abc import Callable
    func: Callable[..., int]  # Any args, returns int
    
    # Fixed-length tuple
    point: tuple[int, int] = (1, 2)
    
    # Variadic tuple
    numbers: tuple[int, ...] = (1, 2, 3, 4, 5)
  • Libraries use ... as a sentinel to mark special cases (distinct from None):

    from pydantic import BaseModel, Field
    
    class User(BaseModel):
        # Required field (no default)
        email: str = Field(..., description="Email address")
    
        # Optional field (defaults to None)
        avatar: str | None = Field(None, description="Avatar URL")
    
        # Optional with default value
        is_active: bool = Field(default=True, description="Active status")
    # Illustrative sketch only, not Pydantic's actual source
    # (RequiredField and OptionalField are hypothetical names)
    def Field(default=..., **kwargs):
        if default is ...:  # Checks if default is the Ellipsis object
            # Field is REQUIRED - no default value
            return RequiredField(**kwargs)
        else:
            # Field is OPTIONAL - has a default value
            return OptionalField(default=default, **kwargs)
  • In NumPy, ... represents all remaining dimensions:

    import numpy as np
    
    arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
    # Shape: (2, 2, 2)
    
    arr[..., 0]    # All dimensions except last, then first element
    # Equivalent to: arr[:, :, 0]
    
    arr[0, ...]    # First element of first dimension, all others
    # Equivalent to: arr[0, :, :]
  • In type stub files, ... indicates implementation not shown:

    # module.pyi (type stub)
    def complex_function(arg1: int, arg2: str) -> bool: ...
        # Implementation details not shown in stub
    
    class MyClass:
        def method(self) -> None: ...
        # Method signature only
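
The sentinel pattern needs no third-party library. In this minimal sketch (the function lookup and the settings dictionary are hypothetical), ... as the default value distinguishes "no default supplied" from an explicit None.

```python
def lookup(settings: dict, key: str, default=...):
    """Return settings[key]; raise KeyError unless a default was supplied."""
    if key in settings:
        return settings[key]
    if default is ...:          # caller passed no default at all
        raise KeyError(key)
    return default              # may legitimately be None

settings = {"name": "Wall-E"}

print(lookup(settings, "name"))           # Wall-E
print(lookup(settings, "owner", None))    # None
try:
    lookup(settings, "owner")
except KeyError as err:
    print(f"missing: {err}")              # missing: 'owner'
```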

17. Modules

A module is a single Python file containing reusable code, while a package is a directory of module files.

  • A regular package is a unified directory anchored by an __init__.py file, which locks the package to a single physical location and enables package-level initialization.

  • A namespace package is a virtual directory without an __init__.py file, which allows a single package name to span multiple detached physical locations.

    /project_root/
    ├── path_a/ (add to sys.path)
    │   └── company/                <-- namespace package (no __init__.py)
    │       └── auth/               <-- regular package (physical anchor)
    │           ├── __init__.py
    │           └── logic.py
    │
    └── path_b/ (add to sys.path)
        └── company/                <-- namespace package
            └── billing/            <-- regular package
                ├── __init__.py
                └── processor.py
    import company
    
    # regular package: returns a list with a single string path
    print(company.__path__)
    # ['/home/user/project/company']
    
    # namespace package: returns a _NamespacePath object containing multiple paths
    print(company.__path__)
    # _NamespacePath(['/path_a/company', '/path_b/company'])
  • sys.path is a mutable list of directory strings that the Python interpreter scans, in order, to locate modules and packages; it is built at startup from the script’s home directory (or the current directory in interactive mode), the PYTHONPATH environment variable, and the standard library and site-packages locations.

    import sys
    
    # loop through the list to see the search priority (index 0 is highest)
    for index, path in enumerate(sys.path):
        # an empty string '' in sys.path stands for the current working directory
        print(f"{index}: {path if path else '(current working directory)'}")
    
    # 0: (current working directory)
    # 1: /usr/lib/python313.zip
    # 2: /usr/lib/python3.13
    # 3: /usr/lib/python3.13/lib-dynload
    # 4: /usr/local/lib/python3.13/dist-packages
    # 5: /usr/lib/python3/dist-packages
  • Imports are executable statements that load a module (once, after which it is cached in sys.modules) and bind the module or selected attributes to names in the importing scope; regular packages are located through their __init__.py anchor, while namespace packages are assembled from every matching directory on the search path.

    # 1. Standard Library
    import sys
    from importlib import reload
    
    # 2. Third-Party
    import numpy as np
    
    # 3. Local (Absolute preferred)
    from app.models import User
    
    # 4. Relative to current package hierarchy (requires __name__ != "__main__")
    # --- Relative Imports (Context: inside app/services/logic.py) ---
    # .  = Current Package (app.services)
    from .utils import helper
    
    # .. = Parent Package (app)
    from ..database import connection
    
    # 5. Best Practice: Avoid circularity with TYPE_CHECKING
    from typing import TYPE_CHECKING
    if TYPE_CHECKING:
        from .heavy_module import ComplexType
    
    # 6. Runtime Logic
    # reload(np)                 # Re-executes the module (e.g., after a file change)
    print(sys.path[0])           # The "Home" directory (highest search priority)
  • __all__ is a list that acts as an explicit export contract for from module import *: when it is present, Python copies exactly the names listed, irrespective of leading underscores; when it is absent, Python copies every name that does not begin with an underscore (_X).

    # utils.py
    __all__ = ["calculate", "_internal_override"]
    
    def calculate():
        return "Public API"
    
    def _internal_override():
        return "Private but explicitly exported"
    
    def hidden_helper():
        return "Skipped because I'm not in __all__"
    
    # ---------------------------------------------------------
    # main.py
    from utils import *
    
    print(calculate())          # Works
    print(_internal_override()) # Works
    # print(hidden_helper())    # NameError
  • If a module’s __name__ variable is the string "__main__", the file is being executed as the top-level script rather than being imported by another module as a library.

    # cat.py
    def wow():
        return __name__
    
    if __name__ == '__main__':
        print(f'executed: {wow()}')
    $ python3 cat.py  # directly executed (as a script)
    executed: __main__
    # imported by another module
    from cat import wow
    print(f'imported: {wow()}')  # imported: cat
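
The __all__ rules described above can be checked without creating files. The following sketch builds a throwaway in-memory module (the name demo_utils is hypothetical) and runs a star import into a fresh namespace, first with __all__ and then without it.

```python
import sys
import types

source = '''
__all__ = ["calculate", "_internal_override"]

def calculate(): return "public"
def _internal_override(): return "exported via __all__"
def hidden_helper(): return "not exported"
'''

# build a throwaway module in memory and register it so it can be imported
module = types.ModuleType("demo_utils")
exec(source, module.__dict__)
sys.modules["demo_utils"] = module

namespace = {}
exec("from demo_utils import *", namespace)
imported = sorted(name for name in namespace if name != "__builtins__")
print(imported)           # ['_internal_override', 'calculate']

# without __all__, names with a leading underscore are skipped instead
del module.__all__
namespace = {}
exec("from demo_utils import *", namespace)
imported_default = sorted(name for name in namespace if name != "__builtins__")
print(imported_default)   # ['calculate', 'hidden_helper']
```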

18. Dependencies

Dependencies are external libraries required by a project, managed within isolated environments to prevent version conflicts and ensure reproducibility across different machines.

A dependency’s version specifier consists of one or more version clauses (as defined by PEP 440) separated by commas (e.g., ~= 1.2, != 1.2.3, < 1.5.0), which are combined with a logical AND to define the accepted version range.

# standard pip (requires quotes for complex specifiers)
pip install "requests~=1.2, !=1.2.3, <1.5.0"

# modern uv (faster, supports the same syntax)
uv pip install "requests~=1.2, !=1.2.3, <1.5.0"

18.1. uv & pip

uv is an extremely fast Python package and project manager, written in Rust, to replace pip, pip-tools, pipx, virtualenv, and more.

Use uv pip as a drop-in, high-performance replacement for legacy pip commands (e.g., uv pip install -r requirements.txt).
# install uv (Linux/macOS)
curl -LsSf https://astral.sh/uv/install.sh | sh

# enable shell completion (Bash)
echo 'eval "$(uv generate-shell-completion bash)"' >> ~/.bashrc
# install uv (Windows)
winget install --id=astral-sh.uv -e

# enable shell completion (PowerShell)
if (!(Test-Path -Path $PROFILE)) { New-Item -ItemType File -Path $PROFILE -Force }
Add-Content -Path $PROFILE -Value '(& uv generate-shell-completion powershell) | Out-String | Invoke-Expression'
  • Python Version Management

    uv python install     # download/install python versions
    uv python list        # view available and installed versions
    uv python find        # locate an installed python version
    uv python pin         # lock project to a specific python version
    uv python uninstall   # remove a Python version
  • Project Management (pyproject.toml)

    uv init               # create a new python project
    uv add <pkg>          # add a dependency (updates pyproject.toml)
    uv remove <pkg>       # remove a dependency
    uv sync               # sync the environment with the lockfile
    uv lock               # generate/update the uv.lock file
    uv run <script.py>    # run script in the managed environment
    uv tree               # view the dependency tree
    uv build              # build distribution archives (wheel/sdist)
    uv publish            # upload to a package index (PyPI)
  • Tool Management (uvx)

    uvx <tool>            # run a tool (ruff, black) in an ephemeral env
    uv tool install       # install a tool user-wide (isolated)
    uv tool list          # list installed user-wide tools
    uv tool uninstall     # remove a tool
    uv tool update-shell  # ensure tool paths are in your PATH
  • System Maintenance

    uv cache clean        # remove all cache entries
    uv cache prune        # remove only outdated cache entries
    uv cache dir          # show the cache directory path
    uv self update        # update uv to the latest version
  • Index Options

    # 1. using Environment variables (best for global mirrors)
    export UV_DEFAULT_INDEX="https://pypi.tuna.tsinghua.edu.cn/simple"
    
    # additional indexes
    export UV_INDEX="https://mirrors.sustech.edu.cn/pypi/web/simple https://mirrors.aliyun.com/pypi/simple/"
    
    # 2. using uv.toml (in pyproject.toml, the table is [[tool.uv.index]])
    [[index]]
    # primary mirror (replaces pypi.org)
    default = true
    url = "https://pypi.tuna.tsinghua.edu.cn/simple"
    
    [[index]]
    # additional source
    url = "https://mirrors.sustech.edu.cn/pypi/web/simple"
    
    # 3. one-off command line use
    uv pip install <pkg> --default-index https://pypi.tuna.tsinghua.edu.cn/simple

18.2. conda

Conda is a cross-platform package and environment manager for Python and non-Python dependencies (C++, CUDA, R, etc.), whereas uv specializes in isolated, per-project Python workflows.

my_project/
├── environment.yml    # conda: provisions Python + uv + C-libs
├── pyproject.toml     # uv: defines Python packages & metadata
└── .venv/             # created by uv: Local project isolation
# environment.yml
name: ds-stack
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.12        # base interpreter for the entire environment
  - pip                # required for Conda's internal package hooks
  - uv                 # high-speed resolver installed via conda-forge

  # non-python/system Dependencies
  - gxx_linux-64
  - cudatoolkit=11.8

  # binary-heavy Python libs (optional to keep in Conda)
  - numpy
# pyproject.toml
[project]
name = "analysis-app"
version = "0.1.0"
dependencies = [
    "pandas>=2.2.0",
    "scikit-learn",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",    # the industry standard for python testing
    "pytest-cov",       # extension for coverage reporting
    "ruff",             # fast linting and formatting
]
plot = ["matplotlib", "seaborn"]

[tool.uv]
managed = true          # enables uv to manage the project and lockfile automatically

[tool.ruff]
line-length = 88
fix = true

[tool.ruff.lint]
select = ["E", "F", "I"]   # rule selection belongs in the lint table in current ruff versions

[tool.pytest.ini_options]
testpaths = ["tests"]   # directory where tests are located
pythonpath = ["src"]    # ensures project code is importable during tests
# 1. provision the base system (Python + uv)
conda env create -f environment.yml
conda activate ds-stack

# 2. synchronize to the final desired state in one step
uv sync --all-extras

# 3. execute tools directly via uv
uv run ruff check .
  • Miniconda is the preferred lightweight version of the full Anaconda distribution.

    # download and install Miniconda (Linux)
    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
    bash Miniconda3-latest-Linux-x86_64.sh
    
    # recommended: prevent 'base' environment from auto-activating on shell start
    conda config --set auto_activate_base false
  • Conda channels are package host locations (like PyPI) where the definition order dictates search priority.

    • By default, Conda automatically uses repo.anaconda.com to download and update packages.

      $ conda config --get channels
      --add channels 'https://repo.anaconda.com/pkgs/msys2'   # lowest priority (1)
      --add channels 'https://repo.anaconda.com/pkgs/r' (2)
      --add channels 'https://repo.anaconda.com/pkgs/main'   # highest priority (3)
      1 A Windows-specific channel that provides Unix-like tools and libraries necessary for many packages to function on Windows.
      2 A specialized channel dedicated to packages for the R programming language.
      3 The default, general-purpose channel maintained by Anaconda, Inc., primarily hosting Python-based scientific computing packages.
    • In addition, Conda clients search conda.anaconda.org for community channels like conda-forge or bioconda.

      conda-forge is a separate, community-led channel, required to be added explicitly.

      $ conda config --add channels conda-forge
      $ conda config --get channels
      --add channels 'https://repo.anaconda.com/pkgs/msys2'   # lowest priority
      --add channels 'https://repo.anaconda.com/pkgs/r'
      --add channels 'https://repo.anaconda.com/pkgs/main'
      --add channels 'conda-forge'   # highest priority
    • Conda can be configured to use mirror servers instead of the default online repositories.

      # mirror defaults
      default_channels: (1)
          - https://my-mirror.com/pkgs/main
          - https://my-mirror.com/pkgs/r
          - https://my-mirror.com/pkgs/msys2
      
      # mirror all community channels
      channel_alias: https://my-mirror.com (2)
      
      # mirror only some community channels
      custom_channels: (3)
          conda-forge: https://my-mirror.com/conda-forge
      1 The default_channels setting completely replaces Conda’s built-in default channels, redirecting all package requests for them to specified mirror URLs.
      2 The channel_alias setting establishes a base URL that prefixes all non-default channel names (e.g., conda-forge in conda install -c conda-forge), thereby redirecting their package requests to a designated mirror location.
      3 The custom_channels setting allows for direct mapping of specific channel names to particular mirror URLs, providing fine-grained control and overriding any channel_alias for those listed channels.
      # using TUNA mirrors
      show_channel_urls: true
      default_channels:
        - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
        - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
        - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2
      custom_channels:
        conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  • Conda is a powerful command line tool for package and environment management that runs on Windows, macOS, and Linux.

    • Create, activate, list, share, remove, and update environments.

      # create a new, empty environment
      conda create -n <env-name>
      
      # create a new environment with the listed packages
      conda create -n <env-name> python pandas
      
      # create a new environment with a specific Python version
      conda create -n <env-name> python=3.12
      
      # create an environment from a file (use 'conda env update -f environment.yml' to update an existing one)
      conda env create -f environment.yml
      # list all environments
      conda env list
      # activate an environment
      conda activate myenv
      
      # deactivate the current environment
      conda deactivate
      # export the current environment to a file (verbose)
      conda env export > environment.yml
      
      # export only explicitly installed packages (recommended)
      conda env export --from-history > environment.yml
      # remove an environment and all its packages
      conda env remove --name myenv
    • Run commands (conda run) in an environment without shell activation.

      # Best Practice: Use `conda run` in scripts to execute a command in an
      # environment without needing to activate it first. This is more robust
      # for automation as it doesn't modify the shell's state.
      
      # run a command in a specific conda environment without activating it
      conda run -n myenv python my_script.py
      
      # run an arbitrary command
      conda run -n myenv pytest
      
      # for interactive commands or long-running services, use --no-capture-output
      # to see the output in real-time instead of all at the end.
      conda run -n myenv --no-capture-output python my_interactive_app.py
      #!/bin/bash
      
      # A script that runs a Python application, demonstrating a priority-based
      # approach for choosing the Python interpreter:
      # 1. Use the python from an active Conda environment.
      # 2. If not active, use `conda run` with the environment from `environment.yml`.
      # 3. As a fallback, use the system's `python3`.
      
      PROJECT_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)"
      PYTHON_CMD="python3"
      
      # Check for active Conda environment (CONDA_PREFIX is set when conda env is activated)
      if [ -n "$CONDA_PREFIX" ]; then
          PYTHON_CMD="python"
      elif command -v conda >/dev/null && [ -f "$PROJECT_ROOT/environment.yml" ]; then
          ENV_NAME=$(grep -E "^name:" "$PROJECT_ROOT/environment.yml" | sed -E 's/^name:[[:space:]]*([^[:space:]#]+).*/\1/' | head -n1)
          # Use 'conda run' to execute in the specified environment without activating it
          [ -n "$ENV_NAME" ] && PYTHON_CMD="conda run --no-capture-output -n $ENV_NAME python"
      fi
      
      cd "$PROJECT_ROOT"
      
      # Execute the main module (e.g., 'cli.main', 'app.main', etc.), passing through all script arguments
      PYTHONPATH="$PROJECT_ROOT" $PYTHON_CMD -m cli.main "$@"
    • Find, install, remove, list, and update packages.

      # search for a package across all configured channels
      conda search scipy
      
      # search ONLY in a specific channel, ignoring all other configured channels
      conda search --override-channels -c conda-forge scipy
      
      # search an additional channel with highest priority (results are combined with configured channels)
      conda search -c conda-forge scipy
      # search for a package with a version pattern (e.g., greater than or equal to)
      conda search "numpy>=1.20"
      
      # search for a package with a version prefix
      conda search "numpy=1.20.*"
      
      # search for a package for a different platform
      conda search numpy --platform linux-64
      
      # filter search results using external tools like grep (e.g., for a specific Python build)
      conda search numpy | grep py39
      # show detailed information for a specific package build
      conda search scipy=1.15.2=py313hf4aebb8_1 --info
      # install a package into the currently active environment
      conda install matplotlib
      
      # install a package into a specific environment
      conda install -n myenv matplotlib
      
      # install a package from a specific channel
      conda install -c conda-forge numpy
      # remove a package from the currently active environment
      conda remove matplotlib
      
      # remove a package from a specific environment
      conda remove -n myenv pandas
      # list installed packages in the current environment
      conda list
      
      # list installed packages in a specific environment
      conda list -n myenv
      # update a specific package
      conda update biopython
      
      # update Python in the current environment
      conda update python
      
      # update a specific package in a specific environment
      conda update -n myenv biopython
    • Update the Conda package manager itself.

      # update conda itself (simple, but may use non-default channels)
      conda update conda
      
      # update conda from the official defaults channel (recommended for stability)
      conda update -n base -c defaults conda
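
The interpreter-selection idea from the launcher script above can also be checked from inside Python. The following sketch is a heuristic, not an official API: the function describe_environment is hypothetical, and it assumes that a virtual environment shows up as sys.prefix differing from sys.base_prefix and that conda activate sets CONDA_PREFIX.

```python
import os
import sys

def describe_environment(environ=None, prefix=None, base_prefix=None) -> str:
    """Classify the running interpreter as 'venv', 'conda', or 'system'."""
    environ = os.environ if environ is None else environ
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix

    if prefix != base_prefix:            # a venv (e.g. uv's .venv) is in use
        return "venv"
    if environ.get("CONDA_PREFIX"):      # set while a conda environment is active
        return "conda"
    return "system"

print(describe_environment())            # depends on how this script is launched

# explicit arguments make the three branches visible
print(describe_environment({"CONDA_PREFIX": "/opt/conda/envs/ds-stack"}, "/p", "/p"))  # conda
print(describe_environment({}, "/proj/.venv", "/usr"))                                 # venv
print(describe_environment({}, "/usr", "/usr"))                                        # system
```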

19. Concurrency

import os
import platform
from pathlib import Path

# 1. system info
print(platform.platform())           # cross-platform alternative to os.uname()
print(os.cpu_count())                # 4
print(Path.cwd())                    # pathlib equivalent of os.getcwd()
print(os.getpid())                   # 1295

# 2. process & user (Unix-specific)
if os.name == 'posix':
    print(os.getloadavg())           # (0.05126953125, 0.03955078125, 0.00341796875)
    print(os.getuid(), os.getgid())  # 1000 1000

# 3. shell calling ('date -u' on Unix, 'date /t' on Windows)
cmd = 'date /t' if os.name == 'nt' else 'date -u'
os.system(cmd)

While os provides static info, psutil is the cross-platform standard for real-time resource monitoring (CPU, RAM, Disk).

import psutil  # uv add psutil
print(psutil.cpu_percent(interval=1))  # CPU load
print(psutil.virtual_memory().percent) # RAM usage %

19.1. Processes and threads

  • A process can be spawned with subprocess or multiprocessing in a dedicated memory space.

    # execute and communicate with external system programs (e.g., shell commands)
    import subprocess
    import os
    
    # Modern way to run and capture output
    cmd = ['date', '/t'] if os.name == 'nt' else ['date', '-u']
    result = subprocess.run(cmd, capture_output=True, text=True)
    
    print(result.stdout.strip())
    print(f"Exit Code: {result.returncode}")
    # execute functions in parallel across multiple CPU cores on independent processes
    import multiprocessing
    import time
    import os
    
    def whoami(label: str) -> None:
        print(f"Process {os.getpid()}: {label}")
    
    def loopy(name: str) -> None:
        whoami(name)
        for num in range(1, 6):
            print(f"\t{num}. Honk!")
            time.sleep(1)
    
    if __name__ == "__main__":
        whoami("Main")
    
        # 1. start and join short-lived workers
        workers = [multiprocessing.Process(target=whoami, args=(f"Worker {n}",)) for n in range(4)]
    
        for p in workers: p.start()
        for p in workers: p.join()
    
        # 2. manage a long-running background process
        lp = multiprocessing.Process(target=loopy, args=("loopy",))
        lp.start()
    
        print("Main program is now waiting 5 seconds...")
        time.sleep(5)
    
        print("Terminating loopy...")
        lp.terminate()
        lp.join()
    
        print("Execution complete.")
  • A thread is managed with the threading module to execute concurrent tasks within a single process, sharing the same memory space.

    In free-threaded builds (experimental, available in 3.13+), the global interpreter lock (GIL), which otherwise restricts bytecode execution to one thread at a time, is optional, so multiple threads can run bytecode within the same interpreter in parallel across multiple CPU cores for both I/O-bound and CPU-bound tasks.

    import sys
    import sysconfig
    
    # True if the Python binary supports it
    build_support = sysconfig.get_config_var("Py_GIL_DISABLED") == 1
    
    # True if the GIL is currently enabled (sys._is_gil_enabled was added in 3.13; older versions always have the GIL)
    is_gil_enabled = getattr(sys, "_is_gil_enabled", lambda: True)()
    
    print(f"Supported: {build_support} | GIL Active: {is_gil_enabled}")

    import threading
    import time
    
    def task(name: str, delay: int) -> None:
        print(f"Thread {name} starting...")
        time.sleep(delay)
        print(f"Thread {name} done.")
    
    if __name__ == "__main__":
        threads = [
            threading.Thread(target=task, args=("A", 2)),
            threading.Thread(target=task, args=("B", 1))
        ]
    
        for t in threads: t.start()  # fire all threads (Concurrent)
        for t in threads: t.join()   # wait for all to finish
    
        print("All threads finished.")
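
Because threads share one memory space, concurrent updates to the same object usually need a lock. The following is a minimal sketch of that idea, not part of the notes above; the names add_many and run_counter are hypothetical helpers chosen for illustration.

```python
import threading

def add_many(counter: dict, lock: threading.Lock, times: int) -> None:
    """Increment a shared counter, holding the lock for each update."""
    for _ in range(times):
        with lock:  # only one thread at a time may update the shared value
            counter["value"] += 1

def run_counter(num_threads: int = 4, times: int = 10_000) -> int:
    """Start several threads that all mutate the same dict (hypothetical helper)."""
    counter = {"value": 0}   # shared state, visible to every thread
    lock = threading.Lock()
    threads = [
        threading.Thread(target=add_many, args=(counter, lock, times))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]

total = run_counter()
print(total)  # 40000
```

Separate processes would not see each other's updates to counter, since each process has its own memory; sharing across processes requires a queue or another explicit channel.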

19.2. Queues, futures and coroutines

  • A queue is a concurrency-safe FIFO channel (like channels in Go, which share data by communicating) used to distribute tasks between workers without data corruption.

    • A threading queue is ideal for I/O-bound tasks where the execution flow is primarily limited by high-latency operations such as disk access, network communication, or web requests.

      import threading, queue, time
      
      def washer(dishes, dish_queue):
          for dish in dishes:
              print(f"Washing {dish}")
              time.sleep(1) # Network/Disk Latency
              dish_queue.put(dish)
      
      def dryer(dish_queue):
          while True:
              dish = dish_queue.get()
              print(f"Drying {dish}")
              time.sleep(2) # Network/Disk Latency
              dish_queue.task_done()
      
      dish_queue = queue.Queue()
      for _ in range(2):
          threading.Thread(target=dryer, args=(dish_queue,), daemon=True).start()
      
      dishes = ['salad', 'bread', 'entree', 'dessert']
      washer(dishes, dish_queue)
      dish_queue.join()
    • A multiprocessing queue is ideal for CPU-bound tasks where the execution flow is limited by intensive data processing or heavy mathematical calculations across multiple CPU cores.

      import multiprocessing as mp
      
      def washer(dishes, output):
          for dish in dishes:
              print(f"Washing {dish} dish")
              output.put(dish)
      
      def dryer(input_queue):
          while True:
              dish = input_queue.get()
              print(f"Drying {dish} dish")
              input_queue.task_done()
      
      if __name__ == "__main__":
          dish_queue = mp.JoinableQueue()
      
          dryer_proc = mp.Process(target=dryer, args=(dish_queue,), daemon=True)
          dryer_proc.start()
      
          dishes = ['salad', 'bread', 'entree', 'dessert']
          washer(dishes, dish_queue)
      
          dish_queue.join()
  • A future is an object (like Task in .NET) that represents the eventual result of an asynchronous operation and lets the caller track its status. The work can run on threads for I/O-bound tasks or on processes for CPU-bound tasks.

    from concurrent import futures
    from typing import Generator, Callable
    import math
    import sys
    
    def calc(val: int) -> tuple[int, float]:
        """Calculates square root of a value."""
        return val, math.sqrt(float(val))
    
    def use_threads(num: int, values: list[int]) -> Generator[tuple[int, float], None, None]:
        """Executes calc using a thread pool."""
        with futures.ThreadPoolExecutor(max_workers=num) as executor:
            tasks = [executor.submit(calc, v) for v in values]
            for f in futures.as_completed(tasks):
                yield f.result()
    
    def use_processes(num: int, values: list[int]) -> Generator[tuple[int, float], None, None]:
        """Executes calc using a process pool."""
        with futures.ProcessPoolExecutor(max_workers=num) as executor:
            tasks = [executor.submit(calc, v) for v in values]
            for f in futures.as_completed(tasks):
                yield f.result()
    
    def main(workers: int, values: list[int]) -> None:
        print(f"Workers: {workers} | Tasks: {len(values)}")
    
        modes: list[tuple[str, Callable[[int, list[int]], Generator[tuple[int, float], None, None]]]] = [
            ("Threads", use_threads),
            ("Processes", use_processes)
        ]
    
        for mode_name, func in modes:
            print(f"\nUsing {mode_name}:")
            for val, result in func(workers, values):
                print(f"sqrt({val}) = {result:.4f}")
    
    if __name__ == '__main__':
        num_workers: int = int(sys.argv[1]) if len(sys.argv) > 1 else 3
        data: list[int] = list(range(1, 6))
    
        main(num_workers, data)
  • A coroutine is a pausable function (defined with async def and suspended at each await) managed by an event loop for high-concurrency I/O-bound tasks, such as networking and database connections, without the overhead of multiple threads.

    import asyncio
    
    async def task():
        # 'await' pauses the coroutine, yielding control back to the loop
        await asyncio.sleep(1)
        return "Data retrieved"
    
    async def main():
        # awaiting the coroutine runs it on the event loop and suspends main() until it returns
        result = await task()
        print(result)
    
    if __name__ == "__main__":
        asyncio.run(main())
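
A single awaited coroutine runs sequentially, so the benefit of the event loop only shows when several coroutines wait at the same time. The following is a minimal sketch using asyncio.gather; fetch and run_all are hypothetical names, and asyncio.sleep stands in for real network or database latency.

```python
import asyncio
import time

async def fetch(name: str, delay: float) -> str:
    """Simulate an I/O-bound call (hypothetical helper)."""
    await asyncio.sleep(delay)  # yields control so other coroutines can run
    return f"{name} done"

async def run_all() -> list[str]:
    # gather schedules both coroutines concurrently and returns
    # their results in the order the awaitables were passed in
    return await asyncio.gather(fetch("A", 0.2), fetch("B", 0.1))

start = time.perf_counter()
results = asyncio.run(run_all())
elapsed = time.perf_counter() - start

print(results)  # ['A done', 'B done']
print(f"Elapsed: {elapsed:.2f}s")
```

The elapsed time should be close to the longest delay rather than the sum of both delays, because the two sleeps overlap on one thread.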

20. SQL

  • DB-API (PEP 249) is Python’s standardized interface (like ADO.NET) for relational databases. Drivers expose a consistent set of methods and support parameterized queries, which prevent SQL injection.

    import sqlite3
    
    # connect to a temporary in-memory database
    # (the with block commits or rolls back on exit; it does not close the connection)
    with sqlite3.connect(":memory:") as conn:
        cursor = conn.cursor()
    
        # 1. setup table
        cursor.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    
        # 2. insert data using parameterization (?) to prevent injection
        users = [("Alice", "alice@example.com"), ("Bob", "bob@example.com")]
        cursor.executemany("INSERT INTO users (name, email) VALUES (?, ?)", users)
        conn.commit()
    
        # 3. query and fetch
        cursor.execute("SELECT * FROM users")
        for row in cursor.fetchall():
            print(f"User: {row[1]} | Email: {row[2]}")
  • SQLAlchemy is a widely used SQL toolkit and ORM (like EF in .NET) that sits atop the DB-API and maps tables to Python classes, so queries are written against objects instead of raw SQL strings.

    from sqlalchemy import create_engine, Column, Integer, String
    from sqlalchemy.orm import declarative_base, sessionmaker
    
    # 1. setup: define the schema as a Python class
    Base = declarative_base()
    
    class User(Base):
        __tablename__ = 'users'
        id = Column(Integer, primary_key=True)
        name = Column(String)
        email = Column(String, unique=True)
    
    # 2. connection: create an in-memory engine and the tables
    engine = create_engine('sqlite:///:memory:')
    Base.metadata.create_all(engine)
    Session = sessionmaker(bind=engine)
    
    # 3. logic: use a session to interact with objects
    with Session() as session:
        # insert data (replaces cursor.executemany)
        users = [User(name="Alice", email="alice@example.com"),
                 User(name="Bob", email="bob@example.com")]
        session.add_all(users)
        session.commit()
    
        # query and fetch (replaces SELECT * and fetchall)
        for user in session.query(User).all():
            print(f"User: {user.name} | Email: {user.email}")
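
To illustrate why parameterized queries matter, here is a minimal sketch that compares a bound parameter with string formatting on the same hostile input. The helpers build_demo_db and find_user and the sample input are assumptions made for this example only.

```python
import sqlite3
from contextlib import closing

def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two users (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("Alice",), ("Bob",)])
    conn.commit()
    return conn

def find_user(conn: sqlite3.Connection, name: str) -> list[tuple]:
    """Look up users by name with a bound parameter."""
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()

# closing() makes sure the connection is actually closed at the end
with closing(build_demo_db()) as conn:
    malicious = "Alice' OR '1'='1"

    # safe: the whole input is compared as one literal string, so nothing matches
    safe_rows = find_user(conn, malicious)

    # unsafe (for demonstration only): the input becomes part of the SQL text,
    # so the injected OR clause matches every row
    unsafe_rows = conn.execute(
        f"SELECT id, name FROM users WHERE name = '{malicious}'"
    ).fetchall()

print(safe_rows)         # []
print(len(unsafe_rows))  # 2
```

The bound version never lets user input change the structure of the statement, which is why the DB-API placeholders are preferred over building SQL with f-strings.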

References