> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
- 1. Running
- 2. Indentations, comments, and multi-line expressions
- 3. Keywords
- 4. Types
- 5. Bytes and bytearray
- 6. Strings
- 7. If, while, and for
- 8. Tuples and lists
- 9. Dictionaries and sets
- 10. Iterations
- 11. Files and directories
- 12. Functions
- 13. Classes
- 13.1. Methods
- 13.2. Inheritances
- 13.3. Operator overloading
- 13.3.1. Indexing and slicing: __getitem__ and __setitem__
- 13.3.2. Iterable objects: __iter__ and __next__
- 13.3.3. Membership: __contains__, __iter__, and __getitem__
- 13.3.4. Attribute access: __getattr__ and __setattr__
- 13.3.5. String representation: __repr__ and __str__
- 13.3.6. Right-side and in-place uses: __radd__ and __iadd__
- 13.3.7. Call expressions: __call__
- 13.3.8. Boolean tests: __bool__ and __len__
- 13.3.9. with/as Context Managers: __enter__ and __exit__
- 13.4. Enum
- 14. Exceptions
- 15. Decorators
- 16. Ellipsis (…)
- 17. Modules and packages
- 18. Testing
- 19. Processes and concurrency
- 20. SQL
- Appendix A: Install Python from Source Code on Linux
- Appendix B: Build a Docker Image for FastAPI
- References
1. Running
-
Using the interactive interpreter (shell)
```
$ python3 -q
>>> 2+2
4
>>> quit()
```

IPython provides an enhanced text-based REPL with completion and introspection, whereas JupyterLab is a web-based environment that executes Python via an IPython kernel (`ipykernel`).

```
$ pip install ipython
$ ipython
In [1]: 2+2
Out[1]: 4
In [2]: len?

$ pip install jupyterlab
$ jupyter lab
```

-
Using python files
```py
print(2+2)
```

```
$ python3 test.py
4
```

-
Using python files with shebang
In computing, a shebang is the character sequence consisting of the characters number sign and exclamation mark (
#!) at the beginning of a script. It is also called sharp-exclamation, sha-bang, hashbang, pound-bang, or hash-pling.— From Wikipedia, the free encyclopedia
```py
#!/usr/bin/env python3
print(2+2)
```

```
$ ./test.py
4
```

-
Executing modules as scripts
In Python, `python -m` executes installed modules as scripts directly from the command line, removing the need for a separate `.py` file.

```
$ python3 -m venv --help
usage: venv [-h] [--system-site-packages] [--symlinks | --copies] [--clear]
            [--upgrade] [--without-pip] [--prompt PROMPT] [--upgrade-deps]
            ENV_DIR [ENV_DIR ...]

Creates virtual Python environments in one or more target directories.
. . .

$ python3 -m webbrowser https://www.google.com
```
2. Indentations, comments, and multi-line expressions
-
Python uses four-space indentation (PEP-8) instead of curly brackets or keywords to delimit code blocks.
-
Don’t mix tabs and spaces, to avoid messing up the indentation count.
-
Guido van Rossum designed Python to use indentation for structure, avoiding the parentheses and braces common in other languages.
```py
disaster = True
if disaster:
    print("Woe!")
else:
    print("Whee!")
```

-
A compound statement body can optionally follow the colon on the same line.
if x > y: print(x) # Simple statement on header line
-
-
Line breaks generally terminate statements automatically.
x = 1 # x = 1; -
Multiple statements may be placed on one line using semicolon separators.
a = 1; b = 2; print(a + b) # Three statements on one line -
Python expressions can span multiple lines when enclosed within delimiters like `()`, `[]`, or `{}`.

-

A trailing backslash (`\`) also continues a statement onto the next line; it still works in modern Python but is error-prone and discouraged in favor of delimiters.

```py
# backslash continuation (error-prone, not recommended)
long_expression = 1 + 2 + 3 + 4 + 5 + \
                  6 + 7 + 8 + 9 + 10
```

-

In modern Python, favor delimiters like `()`, `[]`, or `{}` over the backslash (`\`) to improve readability and structure in multi-line expressions.

```py
# parentheses for complex calculations
long_calculation = (a * b + c) * (d / e - f)

# brackets for multi-line lists or data structures
data = [
    "item1",
    "item2 with a longer description",
    "item3"
]

# braces for multi-line dictionaries
person_info = {
    "name": "Alice",
    "age": 30,
    "hobbies": ["reading", "hiking"]
}
```
-
-
A comment is marked by the `#` character (hash, sharp, pound, or octothorpe) and extends to the end of the line.

```py
# 60 sec/min * 60 min/hr * 24 hr/day
seconds_per_day = 86400

seconds_per_day = 86400  # 60 sec/min * 60 min/hr * 24 hr/day

# Python does NOT
# have a multiline comment.
print("No comment: quotes make the # harmless.")
```
3. Keywords
```
False      class      from       or
None       continue   global     pass
True       def        if         raise
and        del        import     return
as         elif       in         try
assert     else       is         while
async      except     lambda     with
await      finally    nonlocal   yield
break      for        not
```
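The reserved words can also be inspected programmatically; a minimal sketch using the standard `keyword` module:

```py
import keyword

print(len(keyword.kwlist))          # 35 keywords in recent Python 3 releases
print(keyword.kwlist[:5])           # ['False', 'None', 'True', 'and', 'as']
print(keyword.iskeyword('if'))      # True
print(keyword.iskeyword('spam'))    # False
print(keyword.softkwlist)           # soft keywords, e.g. ['_', 'case', 'match', 'type']
```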
4. Types
-
Python is dynamically and strongly typed with built-in garbage collection.
-
A dynamically typed language determines a variable type at runtime rather than requiring an explicit declaration during definition.
age = 30 # age is an integer (no need to declare the data type explicitly) age = "thirty" # age is now a string -
A statically typed language requires a variable type to be declared at compile time to ensure type compatibility.
// In Java, declare the type of a variable before assigning a value. int age = 30; // age is declared as an integer age = "thirty"; // error: incompatible types: String cannot be converted to int -
A strongly typed language requires strict type safety by preventing operations between incompatible data types.
Static typing dictates when a type is verified (compile time), whereas strong typing dictates how strictly that type is enforced (both compile time and runtime). -
In Python, every data type is an object whose associated methods and attributes are verified for compatibility at runtime.
# Python supports type inference on assignment. name = "Alice" # String inferred name + 10 # TypeError: mixed types (Strongly typed)In computer programming, duck typing is an application of the duck test—"If it walks like a duck and it quacks like a duck, then it must be a duck"—to determine whether an object can be used for a particular purpose.
— From Wikipedia, the free encyclopedia
# str, tuple, list, bytes, bytearray # dict, set, frozenset # int, bool, float, complex, decimal, fraction # function, generator, class, method # module, NoneType, Ellipsis, type, code, frame, tracebackbool # True, False int # 47, 25000, 25_000, 0b0100_0000, 0o100, 0x40, sys.maxsize, - sys.maxsize - 1 float # 3.14, 2.7e5, float('inf'), float('-inf'), float('nan') complex # 3j, 5 + 9j str # unicode: 'alas', "alack", '''a verse attack''' tuple # (2, 4, 8) list # ['Winken', 'Blinken', 'Nod'] bytes # b'ab\xff' bytearray # bytearray(...) dict # {}, {'game': 'bingo', 'dog': 'dingo', 'drummer': 'Ringo'} set # set([3, 5, 7]) frozenset # frozenset(['Elsa', 'Otto']) # import decimal, fractions decimal.Decimal(1/3) # Decimal('0.333333333333333314829616256247390992939472198486328125') fractions.Fraction(1, 3) # Fraction(1, 3)# int(), float(), bin(), oct(), hex(), chr(), and ord() int(True), int(False) # (1, 0) int(98.6), int(1.0e4) # (98, 10_000) int('99'), int('-23'), int('+12'), int('1_000_000') # (99, -23, 12, 1_000_000) int('10', 2), 'binary', int('10', 8), 'octal', int('10', 16), 'hexadecimal', int('10', 22), 'chesterdigital' # (2, 'binary', 8, 'octal', 16, 'hexadecimal', 22, 'chesterdigital') float(True), float(False) # (1.0, 0.0) float('98.6'), float('-1.5'), float('1.0e4') # (98.6, -1.5, 10_000.0) bin(65), oct(65), hex(65) # ('0b1000001', '0o101', '0x41') chr(65), ord('A') # ('A', 65) False + 0, True + 0, False + 0., True + 0. # (0, 1, 0.0, 1.0) True + True, True + False, False + False # (2, 1, 0)
-
4.1. type hints
-
In Python, type hints (annotations) provide optional metadata to specify expected data types for variables, parameters, and return values.
from typing import Annotated, Any # primitives & unions (3.10+) age: int = 30 pi: float | None = 3.14 # nullable | optional is_active: bool = True raw: bytes = b"\x01\x02" flex: Any = "can be anything" # generics (3.9+): list, dict, tuple, set def process( ids: list[int], data: dict[str, float], point: tuple[int, int, str], unique: set[bytes] ) -> str: ... # classes & metadata class User: def __init__(self, name: str): self.name = name def register( user: User, note: Annotated[str, "Max 20 chars"] ) -> bool: return Trueanyis a built-in function for truthiness checks, whereastyping.Anyis the type hint for unconstrained values.from typing import Any x: any = 10 # function object y: Any = 10 # type hint print(f"x hint: {__annotations__['x']}") # <built-in function any> print(f"y hint: {__annotations__['y']}") # typing.Any
4.2. assignments
-
In Python, variables must be assigned to an object before being referenced, otherwise, a
NameErroris raised.# assignment statements spam = 'Spam' # simple assignment spam, ham = 'yum', 'YUM' # tuple unpacking [spam, ham] = ['yum', 'YUM'] # list unpacking a, b, c, d = 'spam' # sequence unpacking (each character to a variable) a, *b = 'spam' # extended sequence unpacking (a='s', b=['p', 'a', 'm']) a, *_ = 'spam' # use the underscore (_) for unwanted variables spam = ham = 'lunch' # multiple assignment (both variables refer to the same object) spams += 42 # augmented assignment (equivalent to spams = spams + 42) spam = ham = eggs = 0 # multiple variable names can be assigned a value at the same time# swap variable names a, b = 1, 2 b, a = a, b # 2, 1
4.3. bindings
-
In Python, variables are labels referencing memory objects (PyObjects) defined by a type, unique ID, value, and reference count.
import sys # 1. Type & ID: Exploring the PyObject val = 5.20 print(type(val)) # <class 'float'> print(id(val)) # Unique memory address (ID) # 2. Reference Counting: Labels on a PyObject x = y = z = 1000.1 base_count = sys.getrefcount(x) del y print(sys.getrefcount(x) == base_count - 1) # True: one label removed del z print(sys.getrefcount(x) == base_count - 2) # True: only 'x' remains
4.4. identities
-
A class is a blueprint for creating objects; in Python, "class" and "type" are synonymous.
type(7) # <class 'int'> type(7) == int # True isinstance(7, int) # True isinstance(type(int), type) # True # 1. instances vs. blueprints print(type(7) == int) # True print(isinstance(7, int)) # True # 2. bool is a subclass of int print(issubclass(bool, int)) # True print(isinstance(True, int)) # True (True is an int instance) # 3. meta identity print(isinstance(int, type)) # True (Classes are type objects)
4.5. equality
-
In Python,
==compares object values via recursive equivalence whileischecks if two variables reference the same memory address.# 1. value equivalence (==) L1 = [1, 2, 3] L2 = [1, 2, 3] print(L1 == L2) # True: content is identical print(L1 is L2) # False: different objects in memory # 2. object identity (is) S1 = 'spam' S2 = 'spam' print(S1 == S2) # True: same value print(S1 is S2) # True: same object (interned) # 3. memory addresses (id) x, y = 1024, 1024 print(x == y) # True print(x is y) # False: distinct IDs for large ints
4.6. sequences
-
Strings, tuples, and lists are ordered, zero-indexed collections; while tuples and lists store any data type, strings are strictly sequences of characters.
```py
# concatenation (+) and repetition (*)
combo = ('cat',) + ('dog', 'cow')     # ('cat', 'dog', 'cow')
alarm = ('bark',) * 3                 # ('bark', 'bark', 'bark')

# membership & unpacking
'c' in 'cat'                          # True
c, d, w = ['meow', 'bark', 'moo']     # unpacking

# iteration: direct vs. indexed vs. enumerated
items = ['meow', 'bark', 'moo']
for item in items: ...
for i in range(len(items)): ...
for i, v in enumerate(items): ...
```

```py
# indexing
s = 'hello!'              # len(s) is 6

# positive offsets (0 to len-1)
print(s[0])               # 'h' (first)
print(s[5])               # '!' (last)

# negative offsets (-1 to -len)
print(s[-1])              # '!' (same as s[len(s)-1])
print(s[-6])              # 'h' (same as s[0])

# out of bounds
# s[6]                    # IndexError: index out of range
```

```py
# slicing
s = 'hello!'

# [start:stop] - stop is non-inclusive
print(s[1:3])             # 'el' (offsets 1 and 2)
print(s[:3])              # 'hel' (default start: 0)
print(s[1:])              # 'ello!' (default end: len)

# [start:stop:step]
print(s[::2])             # 'hlo' (every 2nd character)
print(s[::-1])            # '!olleh' (negative step reverses)

# shallow copy
print(s[:])               # 'hello!' (top-level copy)
print(s[slice(0, 6, 1)])  # 'hello!' (the internal logic)
```
4.7. truthiness
-
In Python, truthiness and falsiness determine a value’s evaluation in a Boolean context where most non-empty collections and non-zero numbers are truthy while None and empty or zero-valued objects are falsy.
# truthy: objects with content or non-zero value bool(42) # True bool("hello") # True bool([1, 2]) # True # falsy: empty, zero, or null bool(0) # False bool("") # False bool([]) # False bool(None) # False
4.8. and, or, not
-
In Python, logical operators combine Boolean expressions where
notnegates a value and bothandandoruse short-circuiting to return the operand that determines the result.# 1. negation print(not True) # False print(not 0) # True # 2. short-circuiting AND: returns first Falsy or last Truthy print([] and "hello") # [] print(10 and "hello") # "hello" # 3. short-circuiting OR: returns first Truthy or last Falsy print("apple" or "pear") # "apple" print(None or 0) # 0 letter = 'o' if letter == 'a' or letter == 'e' or letter == 'i' or letter == 'o' or letter == 'u': print(letter, 'is a vowel') else: print(letter, 'is not a vowel')
4.9. ~, <<, >>, &, ^, |
-
In Python, bitwise operators perform bit-level manipulations with a precedence lower than arithmetic operators following the specific order of
~,<<>>,&,^, and then|.x = 5 # 0b0101 y = 1 # 0b0001 # 1. AND, OR, XOR print(f"0b{(x & y):04b}") # 0b0001 (both bits must be 1) print(f"0b{(x | y):04b}") # 0b0101 (either bit is 1) print(f"0b{(x ^ y):04b}") # 0b0100 (bits must differ) # 2. shifts & inversion print(f"0b{(x << 1):04b}") # 0b1010 (shift left: multiply by 2) print(f"0b{(x >> 1):04b}") # 0b0010 (shift right: floor divide by 2) print(f"0b{~x:b}") # 0b-110 (invert: -(x+1))
4.10. /, //, %
-
In Python,
/performs true division returning a float while//and%perform floor division and modulo returning integers only if both operands are integers.# 1. true division (/): always float print(10 / 2) # 5.0 print(11 / 2) # 5.5 # 2. floor division (//): truncates toward negative infinity print( 11 // 2) # 5 (int) print( 11.0 // 2) # 5.0 (float if any operand is float) print(-11 // 2) # -6 (floor of -5.5 is -6) # 3. modulo (%): remainder of division print( 10 % 3) # 1 print(-10 % 3) # 2 (result sign matches divisor) print( 10 % -3) # -2
5. Bytes and bytearray
-
In Python, eight-bit integer sequences represent values from 0 to 255 as either immutable bytes or mutable bytearray objects.
# 1. bytes: immutable literal (b'...') b_seq = b'abc' # b_seq[0] = 65 # TypeError: immutable # 2. bytearray: mutable constructor ba_seq = bytearray(b'abc') ba_seq[0] = 65 # 'a' (97) becomes 'A' (65) print(ba_seq) # bytearray(b'Abc') # 3. indexing returns integers, slicing returns new sequences print(b_seq[0]) # 97 (integer) print(b_seq[:1]) # b'a' (bytes) # 4. initialization from size or list empty_bytes = bytes(5) # b'\x00\x00\x00\x00\x00' from_list = bytes([97, 98, 99]) # b'abc'Endianness is a computer architecture convention for multi-byte data where "big-endian" (standard for IBM mainframes and networking) stores the most significant byte at the lowest address and "little-endian" (standard for x86 and ARM) stores the least significant byte first.
import sys, struct # 1. check local architecture print(sys.byteorder) # 'little' (common) # 2. multi-byte integer (hex: 0x0400) n = 1024 # 3. convert to bytes big = n.to_bytes(2, 'big') # b'\x04\x00' (MSB first) little = n.to_bytes(2, 'little') # b'\x00\x04' (LSB first) print(f"Big: {big.hex(' ')}") # Big: 04 00 print(f"Little: {little.hex(' ')}") # Little: 00 04 # 4. interpretation risk wrong = int.from_bytes(little, 'big') print(wrong) # 4 (interpreted as 0x0004) # 5. using struct '>' for network order network_pkt = struct.pack('>H', n) # pack as big-endian (>) unsigned short (H) print(network_pkt) # b'\x04\x00'
6. Strings
In Python, strings exist as Unicode str for text, immutable bytes for binary data, and mutable bytearray for modified raw data.
> Designed by Unix legends Ken Thompson and Rob Pike on a diner placemat in New Jersey, UTF-8 is a variable-width encoding that serves as the standard for Python, Linux, and the Web.
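A small sketch showing UTF-8's variable width—one to four bytes per character:

```py
# UTF-8 uses 1–4 bytes per code point
for ch in ('A', 'é', '€', '🐍'):
    encoded = ch.encode('utf-8')
    print(ch, encoded, len(encoded))
# A b'A' 1
# é b'\xc3\xa9' 2
# € b'\xe2\x82\xac' 3
# 🐍 b'\xf0\x9f\x90\x8d' 4

print(b'\xe2\x82\xac'.decode('utf-8'))   # '€' (round-trip back to str)
```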
-
Python strings are created using single, double, or triple quotes, with triple quotes specifically designed to handle multiline text and preserve formatting like newlines and indentation.
# 1. single and double quotes (interchangeable) s1 = 'Snap' s2 = "Crackle" # 2. nesting quotes without escapes s3 = "'Nay!' said the naysayer." s4 = 'The rare double quote: ".' # 3. triple quotes for multiline blocks poem = """There was a Young Lady of Norway, Who casually sat in a doorway; When the door squeezed her flat, She exclaimed, "What of that?" This courageous Young Lady of Norway.""" # 4. raw representation (showing \n and spaces) print(repr(poem))# 1. repeating and combining hi = 'Na ' * 4 + 'Hey ' * 4 # 2. escaping and line continuation farewell = '\\' + '\t' + 'Goodbye.' \ + ' Done.' # 3. implicit concatenation s = ("Auto-" "merged " "literals") -
Python supports specialized string types via single-letter prefixes that determine how the interpreter processes formatting, escape sequences, and underlying data structures.
```py
# 1. f-strings: formatted string literals
animal, loc = 'wereduck', 'werepond'
print(f'The {animal} is in the {loc}')

# 2. r-strings: raw strings (ignore backslashes)
path = r'C:\Users\name'          # interpreted as 'C:\\Users\\name'

# 3. b-strings: bytes literals (binary data)
blob = b'\x14\xcd'
print(list(blob))                # [20, 205]

# 4. fr-strings: raw f-strings (combined)
var = "Value"
print(fr'Raw plus {var}')
```

-
Python supports three formatting methodologies: legacy C-style expressions, the
.format()method, and modern interpolated f-strings.actor = 'Richard Gere' cat, weight = 'Chester', 28 # 1. C-style (%) s1 = 'Actor: %s' % actor s2 = 'Our cat %s weighs %d lbs' % (cat, weight) s3 = '%(cat)s is %(weight)d' % {'cat': cat, 'weight': weight} # 2. str.format() s4 = '{0}, {1} and {2}'.format('spam', 'ham', 'eggs') s5 = '{motto}, {0} and {food}'.format('ham', motto='spam', food='eggs') s6 = '{}, {} and {}'.format('spam', 'ham', 'eggs') # 3. f-strings s7 = f'Our cat {cat} weighs {weight} pounds' -
Python’s
remodule provides a suite of tools for pattern matching, substitution, and splitting string data using regular expressions.import re source = "Charles Baudelaire's 'Les Fleurs du Mal'" # 1. compiling a pattern (optional, improves performance for reuse) pattern = re.compile('Les Fleurs du Mal') # 2. search(): find first occurrence anywhere m = pattern.search(source) if m: print("Match found within the string.") # 3. match(): find exact match at the START only print(re.match('Les Fleurs du Mal', source)) # None # 4. findall(): returns a list of all matches print(re.findall('es', source)) # ['es', 'es'] # 5. split(): break string at every pattern occurrence print(re.split(r'\s', source)) # split by whitespace # 6. sub(): search and replace patterns print(re.sub("'", '?', source)) # replaces single quotes with ?
7. If, while, and for
> The walrus operator (`:=`), introduced in Python 3.8, is an assignment expression: it assigns a value to a name and returns that value, so a single expression can both compute and test a result.
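A minimal sketch of the walrus operator on its own (variable names are illustrative):

```py
data = [2, 4, 6, 8]

# assign and test in one expression
if (n := len(data)) > 3:
    print(f"{n} items")        # 4 items

# reuse a computed value inside a comprehension
halved = [y for x in data if (y := x / 2) > 1]
print(halved)                  # [2.0, 3.0, 4.0]
```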
-
Branch with
if,elif, andelse:# 1. standard multi-way branching color = "mauve" if color == "red": print("a tomato") elif color == "green": print("a green pepper") else: print("unknown:", color) # 2. ternary expression result = 't' if 'spam' else 'f' # 3. chained comparisons x = 2.5 if 4 > x > 2 > 1: ... # evaluates as (4 > x) and (x > 2) and (2 > 1) # 4. dictionary-based branching (dispatch tables) menu = {'spam': 1.25, 'ham': 1.99, 'eggs': 0.99} price = menu.get('bacon', 'N/A') actions = {'spam': lambda: print("order spam"), 'ham': lambda: print("order ham")} actions.get('spam', lambda: print("default action"))() -
Repeat with
while, andbreak,continue, andelse:items = [1, 3, 5] while items: if (val := items[0]) == 0: break # exit and skip 'else' items = items[1:] # slice to progress if val % 2 == 0: continue # skip to next condition check print(f"{val} squared is {val**2}") else: # optional print("no zeros found") # ONLY if the break above was never hit -
Iterate with
for/in, andbreak,continueandelse:# 1. loop control: continue and break for char in 'thud': if char == 'u': continue # skip remaining block for this item if char == 'x': break # exit loop immediately print(char) else: # optional print("no 'x' found") # ONLY if the break above was never hit # 2. range-based loops (start, stop, step) for i in range(0, 10, 2): print(i, end=' ') # 0 2 4 6 8 # 3. parallel and indexed iteration s = 'spam' for i, char in enumerate(s): # generates (index, item) pairs print(f'{i}: {char}') for a, b in zip(s, s.upper()): # pairs elements from multiple iterables print(a, b) # s S, p P, a A, m M # 4. sequence unpacking pairs = [[1, 2], [3, 4]] for x, y in pairs: # direct assignment to variables print(x + y)
8. Tuples and lists
-
A
tupleis an immutable, ordered sequence built with commas as a structural operator ortuple()as a constructor for iterables.'cat', # singleton (trailing comma) 'cat', 'dog', 'cattle' # multi-item (separating commas) tuple() # constructor: empty () tuple('cat') # constructor: iterable to ('c', 'a', 't')Parentheses are grouping symbols used for empty literals, visual clarity, resolving syntactic ambiguity, and defining generator expressions.
() # empty literal ('cat',) # tuple ('cat') # string type(('cat',)) # <class 'tuple'> type('cat',) # <class 'str'> (x for x in range(10)) # generator expressionA
named tupleis a hybrid object factory that creates classes supporting positional indexing (tuple), dotted name attribute (class), and dictionary conversion (_asdict()).# modern class-based; supports PEP 484 type hints and IDE autocompletion from typing import NamedTuple class Rec(NamedTuple): name: str # explicit field type age: float # enables static analysis jobs: list[str] # self-documenting schema# legacy factory-based; quick, dynamic, but lacks static type hints from collections import namedtuple Rec = namedtuple('Rec', ['name', 'age', 'jobs']) bob = Rec('Bob', age=40.5, jobs=['dev', 'mgr']) bob[0] # positional indexing (tuple) bob.name # dotted name attribute (class) bob._asdict()['name'] # dictionary conversion (dict) -
A
listis a mutable, ordered sequence built with brackets[]as a literal orlist()as a constructor for iterables.[] # [] ['meow', 'bark', 'moo'] # ['meow', 'bark', 'moo'] [('cat', 'meow'), 'bark', 'moo'] # [('cat', 'meow'), 'bark', 'moo'] list() # [] list('cat') # ['c', 'a', 't']# append(), insert(), extend() wow = ['meow'] # ['meow'] wow.append('moo') # ['meow', 'moo'] wow.insert(1, 'bark') # ['meow', 'bark', 'moo'] wow.extend(['cluck', 'baa']) # ['meow', 'bark', 'moo', 'cluck', 'baa'] ```py # plus(+), repeat(*) plus = ['meow', 'bark', 'moo'] + ['cluck', 'baa'] # ['meow', 'bark', 'moo', 'cluck', 'baa'] repeat = ['bark'] * 3 # ['bark', 'bark', 'bark'] ```py # index, and slice assignment L = ['spam', 'Spam', 'SPAM!'] # index assignment L[1] = 'eggs' # ['spam', 'eggs', 'SPAM!'] # slice assignment: delete+insert # list[start:stop:step] = iterable # if the iterable is shorter, elements are deleted from the slice. # if the iterable is longer, extra elements are inserted. L[0:2] = ['eat', 'more'] # ['eat', 'more', 'SPAM!']# del, remove(), clear() farm = ['cat', 'dog', 'cattle', 'chicken', 'duck'] del farm[-1] # ['cat', 'dog', 'cattle', 'chicken'] farm.remove('dog') # ['cat', 'cattle', 'chicken'] farm.clear() # []# pop: remove and return item at index (default last). farm = ['cat', 'cattle', 'chicken'] farm.pop() # 'chicken' # ['cat', 'cattle'] farm.pop(-1) # 'cattle' # ['cat']# sort() and sorted() farm = ['cat', 'dog', 'cattle'] # a sorted copy sorted(farm) # ['cat', 'cattle', 'dog'] print(farm) # ['cat', 'dog', 'cattle'] # sorting in-place farm.sort() print(farm) # ['cat', 'cattle', 'dog']# list comprehensions: [expression for item in iterable] even_numbers = [2 * num for num in range(5)] # [0, 2, 4, 6, 8] # list comprehensions: [expression for item in iterable if condition] odd_numbers = [num for num in range(10) if num % 2 == 1] # [1, 3, 5, 7, 9]# shallow: copies the top-level container with shared nested objects. a = [['cat', 'meow'], ['dog', 'bark']] c = a[:] b = a.copy() # slower than direct bytecode slicing a[:] d = list(c) # deep: creates an independent clone of the container and all nested objects. e = copy.deepcopy(a) # import copy a[0][1] = 'moo' a # [['cat', 'moo'], ['dog', 'bark']] b # [['cat', 'moo'], ['dog', 'bark']] c # [['cat', 'moo'], ['dog', 'bark']] d # [['cat', 'moo'], ['dog', 'bark']] e # [['cat', 'meow'], ['dog', 'bark']]A
deque(double-ended queue) is optimized forO(1)appends and pops from either end, whereas a list incursO(N)costs for left-side mutations.from collections import deque q = deque([], maxlen=5) # fixed-length sliding window q.append(0) # O(1) end-point growth q.appendleft(5) # O(1) start-point growth (vs list's O(N)) q.pop() # O(1) end-point shrinkage q.popleft() # O(1) start-point shrinkage q.extend([1, 2]) # right-side batch add q.extendleft([3, 4]) # left-side batch add: deque([4, 3, ...])
9. Dictionaries and sets
> In Python, keys in dictionaries and elements in sets must be immutable (hashable) data types. The built-in `hash()` function returns an integer for hashable objects and raises a `TypeError` for unhashable ones such as lists.
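A short illustration of hashable vs. unhashable keys:

```py
hash('cat')                    # OK: str is immutable/hashable
hash((1, 2))                   # OK: tuple of hashables
# hash([1, 2])                 # TypeError: unhashable type: 'list'

d = {}
d[('x', 1)] = 'ok'             # tuple key works
# d[['x', 1]] = 'nope'         # TypeError: unhashable type: 'list'
{frozenset({1, 2})}            # frozenset is hashable; a plain set is not
```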
-
A
dictis a mutable, associative array/map of unique keys to values, built with curly braces{}as a literal ordict()as a constructor.{} # {} {'cat': 'meow', 'dog': 'bark'} # {'cat': 'meow', 'dog': 'bark'} dict() # constructor: empty {} dict(cat='meow', dog='bark') # constructor: keyword args dict([('cat', 'meow')]) # constructor: iterable of pairs# [key], get() animals = {'cat': 'meow', 'dog': 'bark'} animals['cattle'] = 'moo' # {'cat': 'meow', 'dog': 'bark', 'cattle': 'moo'} animals['cat'] # 'meow' animals['sheep'] # KeyError: 'sheep' animals.get('sheep') # None animals.get('sheep', 'baa') # 'baa'# testing animals = {'cat': 'meow', 'dog': 'bark'} 'cat' in animals # True 'sheep' in animals # False animals['sheep'] if 'sheep' in animals else 'oops!' # 'oops!'# keys(), values(), items(), len() animals.keys() # dict_keys(['cat', 'dog', 'cattle']) animals.values() # dict_values(['meow', 'bark', 'moo']) animals.items() # dict_items([('cat', 'meow'), ('dog', 'bark'), ('cattle', 'moo')]) len(animals) # 3# `**`, update() {**{'cat': 'meow'}, **{'dog': 'bark'}} # {'cat': 'meow', 'dog': 'bark'} animals = {'cat': 'meow'} animals.update({'dog': 'bark'}) # {'cat': 'meow', 'dog': 'bark'}# del, pop(), clear() animals = {'cat': 'meow', 'dog': 'bark', 'cattle': 'moo'} del animals['dog'] # {'cat': 'meow', 'cattle': 'moo'} animals.pop('cattle') # 'moo' # {'cat': 'meow'} animals.clear() # {}# iterations animals = {'cat': 'meow', 'dog': 'bark', 'cattle': 'moo'} for key in animals: # for key in animals.keys() print(f'{key} => {animals[key]}', end='\t') # cat => meow dog => bark cattle => moo for key, value in animals.items(): print(f'{key} => {value}', end='\t') # cat => meow dog => bark cattle => moo# dictionary comprehensions: {key_expression : value_expression for expression in iterable} word = 'letters' letter_counts = {letter: word.count(letter) for letter in word} # {'l': 1, 'e': 2, 't': 2, 'r': 1, 's': 1} # dictionary comprehensions: {key_expression : value_expression for expression in iterable if condition} vowels = 'aeiou' word = 'onomatopoeia' vowel_counts = {letter: word.count(letter) for letter in set(word) if letter in vowels} # {'i': 1, 'o': 4, 'a': 2, 'e': 1}# setdefault() d = {} d[0].extend(range(5)) # KeyError: 0 d.setdefault(0, []).extend(range(5)) d[0] # [0, 1, 2, 3, 4]-
A
defaultdictis adictsubclass that calls a factory function to provide a default value for missing keys.from collections import defaultdict # defaultdict(default_factory=None, /, [...]) # factory: list -> defaults to [] d_list = defaultdict(list) d_list[0].extend(range(5)) # auto-creates [] then extends # factory: int -> defaults to 0 d_int = defaultdict(int) d_int['count'] += 1 # auto-creates 0 then increments -
A
Counteris a dict subclass for counting hashable items, storing elements as keys and their frequencies as values.from collections import Counter # Counter(iterable=None, /, **kwds) word = 'banana' # O(N²) — scans string for 'b', then 'a', then 'n'... {l: word.count(l) for l in set(word)} # O(N) — single pass population c = Counter(word) # Counter({'a': 3, 'n': 2, 'b': 1}) c.most_common(1) # [('a', 3)] list(c.elements()) # ['b', 'a', 'a', 'a', 'n', 'n'] c['z'] # missing keys return 0 -
A
typed dictis a dict-like factory that defines fixed keys and types for static validation, providing a schema for flexible JSON-like maps.from typing import TypedDict, NotRequired class User(TypedDict): name: str # Required id: int # Required email: NotRequired[str] # Optional (PEP 655) # 1. type check: static tools (Mypy) flag missing keys or wrong types user: User = {"name": "Alice", "id": 42} # 2. runtime: a plain dict print(type(user)) # <class 'dict'> print(user["name"]) # Standard string-key access
-
-
A
setis a mutable, unordered collection of unique, hashable elements, built with curly braces{}or theset()constructor.{} # <class 'dict'> {0, 2, 4, 6} # {0, 2, 4, 6} set() # set() set('letter') # {'l', 't', 'r', 'e'} set({'cat': 'meow', 'dog': 'bark', 'cattle': 'moo'}) # {'cat', 'cattle', 'dog'} frozenset() # frozenset() frozenset([3, 1, 4, 1, 5, 9]) # frozenset({1, 3, 4, 5, 9})# len(), add(), remove() nums = {0, 1, 2, 3, 4, } len(nums) # 5 nums.add(5) # {0, 1, 2, 3, 4, 5} nums.remove(0) # {1, 2, 3, 4, 5}# iteration for num in {0, 2, 4, 6, 8}: print(num, end='\t') # 0 2 4 6 8# testing 2 in {0, 2, 4} # True 3 in {0, 2, 4} # False# `&`: intersection(), `|`: union(), `-`: difference(), `^`: symmetric_difference() a = {1, 3} b = {2, 3} a & b # {3} a | b # {1, 2, 3} a - b # {1} a ^ b # {1, 2}# `<=`: issubset(), `<`: proper subset, `>=`: issuperset(), `>`: proper superset a <= b # False a < b # False a >= b # False a > b # False# set comprehensions: { expression for expression in iterable } {num for num in range(10)} # {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} # set comprehensions: { expression for expression in iterable if condition } {num for num in range(10) if num % 2 == 0} # {0, 2, 4, 6, 8}
10. Iterations
An iterable is an object supporting the iter() call, while an iterator is the specific object returned by that call which supports next() to produce values.
> An iterator is any object with a `__next__` method. The iteration protocol—utilized by tools like `for` loops, comprehensions, and `map`—calls `iter()` to obtain an iterator and then calls `next()` repeatedly until a `StopIteration` exception is raised.
```py
nums = [1, 2]        # iterable
i = iter(nums)       # iterator created here
print(next(i))       # 1
print(next(i))       # 2
# next(i) now raises StopIteration
```
Iteration contexts in Python include the `for` loop, list comprehensions, the `map` built-in function, the `in` membership test, and the built-in functions `sorted`, `sum`, `any`, and `all`, as well as the `list` and `tuple` constructors, string `join` methods, and sequence assignments—all of which use the iteration protocol to step across iterable objects one item at a time.
> List comprehensions are executed by optimized internal routines and generally run faster than equivalent manual `for` loops.
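A rough micro-benchmark sketch of the claim above; absolute timings vary by machine and Python version:

```py
import timeit

loop_stmt = """
result = []
for n in range(1000):
    result.append(n * 2)
"""
comp_stmt = "result = [n * 2 for n in range(1000)]"

print(timeit.timeit(loop_stmt, number=10_000))   # manual for loop + append
print(timeit.timeit(comp_stmt, number=10_000))   # list comprehension (typically faster)
```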
```py
# all, any, map, filter, reduce, zip, enumerate, shuffle, sample, reversed, sorted
nums = list(range(10))                               # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
all(num > 0 for num in nums)                         # False
any(num > 0 for num in nums)                         # True
map(lambda x: x * x, nums)                           # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
filter(lambda x: x % 2 == 0, nums)                   # [0, 2, 4, 6, 8]

# from functools import reduce
reduce(lambda x, y: x + y, nums)                     # ((0 + 1) + 2) + ... = 45

zip(range(3), range(4), range(5))                    # [(0, 0, 0), (1, 1, 1), (2, 2, 2)]

funcs = ['map', 'filter', 'reduce']
enumerate(funcs)                                     # [(0, 'map'), (1, 'filter'), (2, 'reduce')]
enumerate(funcs, 1)                                  # [(1, 'map'), (2, 'filter'), (3, 'reduce')]
[(i, func) for i, func in enumerate(funcs, start=1)] # [(1, 'map'), (2, 'filter'), (3, 'reduce')]

# from random import shuffle, sample
shuffle(nums)                                        # shuffle list nums in place and return None
nums                                                 # [4, 2, 5, 9, 6, 0, 1, 3, 8, 7]
sample(nums, k=len(nums))                            # [5, 3, 7, 6, 8, 4, 0, 1, 2, 9]

reversed(nums)                                       # [7, 8, 3, 1, 0, 6, 9, 5, 2, 4]
nums[::-1]                                           # [7, 8, 3, 1, 0, 6, 9, 5, 2, 4]
sorted(nums, reverse=True)                           # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```

```py
import itertools

names = ["Alan", "Adam", "Wes", "Will", "Albert", "Steven"]
for letter, group in itertools.groupby(names, lambda x: x[0]):
    print(letter, list(group))
# A ['Alan', 'Adam']
# W ['Wes', 'Will']
# A ['Albert']
# S ['Steven']

for num in itertools.chain(range(3), range(3, 7), range(7, 10), [10]):
    print(num, end='\t')
# 0 1 2 3 4 5 6 7 8 9 10

list(itertools.combinations([0, 1, 2], 2))
# [(0, 1), (0, 2), (1, 2)]
list(itertools.combinations_with_replacement([0, 1, 2], 2))
# [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]
list(itertools.permutations([0, 1, 2], 2))
# [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]
list(itertools.product([0, 1, 2], [3, 4, 5], repeat=1))
# [(0, 3), (0, 4), (0, 5), (1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5)]
```
11. Files and directories
A file is a byte sequence identified by a filename within a directory-based filesystem, categorized into text files—which automatically handle Unicode encoding and line endings—and binary files, which provide raw, unaltered access via the bytes type.
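A small sketch contrasting text and binary access to the same file (the filename is illustrative):

```py
# text mode: str objects, Unicode encoding, newline translation
with open('demo.txt', 'w', encoding='utf-8') as f:
    f.write('café\n')                 # '\n' is mapped to the OS line ending on write

with open('demo.txt', encoding='utf-8') as f:
    print(repr(f.read()))             # 'café\n' (decoded str, universal newlines)

# binary mode: bytes objects, raw and unaltered
with open('demo.txt', 'rb') as f:
    print(f.read())                   # b'caf\xc3\xa9\n' (b'...\r\n' on Windows)
```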
-
A file is opened by `open()` with an optional `mode` indicating permissions and newline handling, resulting in a stream object for reading or writing data.

```py
open(f, 'r')     # read an EXISTING file
open(f, 'w')     # create or overwrite a file
open(f, 'a')     # create or append to a file
open(f, 'x')     # create a NON-EXISTING file (fails if exists)
open(f, 'r+')    # read and write an EXISTING file
open(f, 'w+')    # read and write a file (creates or overwrites)
open(f, 'a+')    # read and append to a file (creates if missing)
open(f, 'rb')    # read an EXISTING file as a raw stream of bytes
open(f, 'wb')    # write a file as a raw stream of bytes
```

```py
# text mode (str): .txt, .csv, .json
with open("file.txt", "w", encoding="utf-8") as f:
    f.write("Line 1\n")                  # write string
    f.writelines(["L2\n", "L3\n"])       # write list of strings

with open("file.txt", "r") as f:
    content = f.read()                   # read entire file as a single string
    f.seek(0)
    lines = f.readlines()                # read entire file as a list of strings
    f.seek(0)
    for line in f: ...                   # read line by line (lazy loading)
```

```py
# binary mode (bytes): .jpg, .pdf, .zip, .exe
with open("image.jpg", "rb") as f:
    header = f.read(10)                  # read first 10 bytes
    data = f.read()                      # read remainder as bytes object

with open("copy.jpg", "wb") as f:
    f.write(data)                        # write bytes object
```

-
By default, files open in text mode (
`t`) using universal newlines, which transparently maps OS-specific endings (CRLF on Windows, LF on Unix) to the standard `\n` character.

```py
open(f, 'r', newline=None)    # default enables universal newline translation
open(f, 'r', newline='')      # disables translation to return raw endings
open(f, 'w', newline='\n')    # forces LF line endings regardless of OS
```

-
By default, files open with the system's locale-dependent encoding, which can cause cross-platform failures (e.g., cp1252 on Windows) when reading UTF-8 files.

```py
import locale
print(locale.getpreferredencoding())   # preferred (locale) encoding
open(f, 'r', encoding='utf-8')         # explicit & safe
```
-
-
pathlibis a modern, object-oriented module for path manipulation, replacing the raw string-based logic ofos.path.from pathlib import Path # 1. initialization p = Path("data/v1/config.yaml") # object initialization p = Path.cwd() / "src" / "app.py" # path combination p = Path.home() # home dir # 2. attributes p.name # app.py p.stem # app" p.suffix # .py p.parent # parent dir p.parts # ('/', 'src', 'app.py') # 3. verification & metadata p.exists() # existence p.is_file() # file p.is_dir() # directory p.stat() # size, mtime, etc. p.resolve() # absolute path # 4. mutations p.mkdir(parents=True, exist_ok=True) # create dir + parents p.touch() # create file/update timestamp p.unlink(missing_ok=True) # delete file p.rmdir() # delete empty dir p.rename("new.py") # move/rename p.replace("new.py") # atomic move/overwrite # 5. search & iteration p.iterdir() # shallow contents generator p.glob("*.csv") # shallow pattern match p.rglob("*.py") # recursive pattern match # 6. stream & I/O with p.open('r') as f: f.read() # manual stream p.read_text() # fast read (UTF-8) p.write_text("data") # fast write (UTF-8)pathlibsupports*(shallow),**(recursive),?(single-char), and[](sets/ranges), but excludes shell-style{}expansion.p.glob("*.py") # shallow : current directory only p.glob("**/*.py") # recursive: explicit double-star pattern p.rglob("*.py") # recursive: shorthand (implies ** prefix) # multi-extension workaround (no {} support) target_exts = {'.jpg', '.png', '.gif'} images = (f for f in p.rglob("*") if f.suffix in target_exts)
12. Functions
Python functions are first-class objects existing as named blocks with def, anonymous expressions with lambda, or methods with a bound instance.
-
defis a statement creating a named function at runtime, whilelambdais an expression coding an anonymous, single-expression function.def add(x, y): return x + y # named, multiple statements add_alt = lambda x, y: x + y # anonymous, one expression if persistent: def save(): ... # def works inside logic blocks else: save = lambda: None # lambda works where expressions are expected def future_func(): pass # NOOP: classic def todo_func(): ... # NOOP: modern -
returnsends a result and exits, whileyieldproduces a result and suspends state to generate a series over time.def get_one(): return 1 # terminate def get_seq(): yield 1; yield 2 # generator -
globalbinds names to the module-level scope, whilenonlocalbinds names to the nearest enclosing function scope.# global: modifies module-level x def change_global(): global x; x = 2 # nonlocal: modifies outer function y def outer(): y = 1 def inner(): nonlocal y; y = 2 -
Python uses pass-by-assignment, matching arguments from left to right by default, or by keyword (`name=value`).

```py
def myfunc(arg1, arg2, meat='ham', *args, **kwargs): ...

myfunc('spam', 'eggs', 'bacon', 'toast', side='beans')
# 'spam', 'eggs'  -> matched positionally (arg1, arg2)
# 'bacon'         -> matched positionally (meat, overriding the default)
# 'toast'         -> collected by *args
# side='beans'    -> collected by **kwargs
```

In Python, the `/` indicates that everything before it is positional-only, and the `*` (when used alone) indicates that everything after it is keyword-only.

```py
def feed_animal(qty, /, kind="goat"): ...   # 'qty' is positional-only; 'kind' is standard
feed_animal(5, "sheep")                     # valid
# feed_animal(qty=5, kind="sheep")          # TypeError

def harvest(*, crop, tool="scythe"): ...    # everything to the right must be named
harvest(crop="wheat")                       # valid
# harvest("wheat")                          # TypeError
```
12.1. Attributes and annotations
-
A function is a first-class object supporting system and user-defined attributes alongside metadata annotations.
# 1. annotations (type hint vs. general metadata) def cube(n: int) -> int: ... # type hint def spam(a: 'tag'): return a # general metadata # 2. user-defined attribute cube.category = "math" # 3. system-defined attributes print(cube.__name__) # 'cube' print(cube.__annotations__) # {'n': <class 'int'>, 'return': <class 'int'>} print(cube.category) # 'math' # 4. first-class citizen (high order) def execute(func, value): return func(value) print(execute(cube, 3)) # 27 print(execute(lambda n: n**3, 3)) # 27
12.2. Lambdas
-
A lambda expression is created by the keyword
lambdawith a comma-separated argument list and a single expression that returns the function’s result.from functools import reduce nums = range(10) # map: mapping functions over iterables list(map(lambda x: x+1, nums)) # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] # filter: selecting items in iterables list(filter(lambda x: x % 2 == 0, nums)) # [0, 2, 4, 6, 8] # reduce: combining items in iterables reduce(lambda x, y: x+y, nums) # 45
12.3. Namespaces
A namespace is a scope where names live within LEGB levels (local, enclosing, global, and built-in).
-
Name resolution is the process of searching the LEGB levels in order and stopping at the first match.
-
A name assignment is bound to the local scope by default, unless overridden by
`global` or `nonlocal`.

```py
a = 5.21                       # global (G)

def tester(start):
    state = start              # enclosing (E)
    def nested(label):
        nonlocal state         # bound to 'state' in tester
        global a               # bound to 'a' at module level
        state += 1
        print(locals())        # local names and values
        print(globals())       # global names and values
        print(vars())          # local names and values (same as locals)
        import math
        print(vars(math))      # attribute names and values of the math module
    return nested
```
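A minimal sketch of the LEGB lookup order described above—the search stops at the first match:

```py
x = 'global'                    # G: module level

def outer():
    x = 'enclosing'             # E: enclosing function
    def inner():
        x = 'local'             # L: innermost scope wins
        print(x)                # 'local'
    inner()
    print(x)                    # 'enclosing'

outer()
print(x)                        # 'global'
print(len('spam'))              # 4 — 'len' found in the built-in scope (B)
```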
12.4. Closures
-
A
closureis a function object that remembers the values in its enclosing lexical environment even after the outer scope has finished executing.A
lexical environment(orlexical scope) is a static structure where variable accessibility is determined by the physical placement of code at write-time rather than the execution path at runtime.# 1. named closure (using def) def maker(n): def action(x): return x ** n # remembers n from enclosing scope return action f = maker(2) g = maker(3) print(f(4)) # 16 (remembers n=2) print(g(4)) # 64 (remembers n=3) # 2. anonymous closure (using lambda) def lambda_maker(n): return lambda x: x ** n # n captured by lambda expression h = lambda_maker(4) print(h(2)) # 16 (remembers n=4)If a lambda or def defined within a function is nested inside a loop, all generated functions will share the loop variable’s final value because the variable is bound late at call-time rather than definition-time.
# 1. the trap: late binding def make_actions(): # 'i' is not "captured" yet; it is just a name to be looked up later return [lambda x: i ** x for i in range(5)] acts = make_actions() # At call-time, the loop is finished and 'i' is 4 in the enclosing scope print([f(2) for f in acts]) # [16, 16, 16, 16, 16] # 2. the fix: early binding def make_actions(): # i=i binds the current value of 'i' to a local parameter immediately return [lambda x, i=i: i ** x for i in range(5)] acts = make_actions() print([f(2) for f in acts]) # [0, 1, 4, 9, 16]
12.5. Generators
-
A
generatoris a specializediterator—a one-way stream that produces items one at a time on demand through the iteration protocol instead of returning a complete sequence at once.-
A
generator functionis a generator factory with adefstatement andyieldto produce an object featuring state suspension, retaining its local scope and code position between yields.def count_factory(n): for i in range(n): yield i # suspends execution, retains local scope/position def delegated_factory(n): yield from range(n) # shorthand for "for i in range(n): yield i" for val in count_factory(5): print(val) # 0, 1, 2, 3, 4 -
A
`generator expression` is a memory-space optimization shorthand with `()` to produce items on demand, running slower than list comprehensions due to iteration overhead but essential for large datasets.

```py
gen_exp = (i for i in range(5))   # memory-efficient
print(next(gen_exp))              # 0 (yields on demand)
print(list(gen_exp))              # [1, 2, 3, 4] (exhausts the remaining values)
next(gen_exp)                     # raises StopIteration
```
-
13. Classes
A class is a blueprint defining a namespace of shared attributes to create instance objects with a unique namespace for instance attributes while delegating shared attribute lookups to the class.
```py
class Animal:
    """blueprint for creating animal instances with shared and unique traits."""

    kingdom = "Animalia"        # public
    _territory = "Earth"        # protected (convention)
    __ancient = "Paleolith"     # private (mangled)

    def __init__(self, species, color, voice):
        self.voice = voice              # public
        self._color = color             # protected (convention)
        self.__species = species        # private (mangled)
        # self refers to the specific instance object being created

    # chaining @classmethod and @property works only in Python 3.9–3.12
    # (deprecated in 3.11, removed in 3.13)
    @classmethod
    @property
    def territory(cls):
        """getter for protected class territory."""
        return cls._territory

    @classmethod
    @property
    def ancient(cls):
        """getter for private class ancient."""
        return cls.__ancient

    @property
    def species(self):
        """getter for private instance species."""
        return self.__species

    @species.setter
    def species(self, value):
        """setter for private instance species."""
        self.__species = value

    @property
    def color(self):
        """getter for protected instance color."""
        return self._color

    @color.setter
    def color(self, value):
        self._color = value

    def wow(self):
        """prints the animal voice."""
        print(f"{self.voice}!")

    @classmethod
    def change_kingdom(cls, new_name):
        """modifies the shared class namespace."""
        cls.kingdom = new_name
        # cls refers to the class object itself, not a specific instance

    @staticmethod
    def is_living():
        """utility bound to the class namespace."""
        return True


dog = Animal("Canine", "Brown", "Woof")
cat = Animal("Feline", "Orange", "Meow")

dog.wow()                   # Woof!
print(cat.species)          # Feline
print(Animal.territory)     # Earth
print(Animal.ancient)       # Paleolith
```
> The term attribute is an umbrella for any named member accessed with dot (`.`) notation—data attributes and methods alike, whether stored on the instance or inherited from its class.
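A short sketch of attribute access and introspection on an instance (the `Dog` class is illustrative):

```py
class Dog:
    kingdom = 'Animalia'                 # class attribute (shared)
    def __init__(self):
        self.name = 'Rex'                # instance attribute (unique)
    def speak(self):                     # method (also an attribute)
        return 'Woof'

d = Dog()
d.name                                   # dot notation on the instance
d.kingdom                                # falls back to the class namespace
d.speak()                                # bound method lookup, then call
getattr(d, 'name')                       # same lookup by name string
d.__dict__                               # {'name': 'Rex'} — instance namespace only
[a for a in dir(d) if not a.startswith('_')]   # ['kingdom', 'name', 'speak']
```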
13.1. Methods
-
An instance method is a function that implicitly receives (binds) the instance as its first argument (
self) to reference and manipulate instance attributes.class Robot: def __init__(self, name): self.name = name def rename(self, new_name): self.name = new_name bot = Robot('WALL·E') bot.rename('WALL·EVE') -
A class method is a function decorated by
@classmethodthat implicitly receives (binds) the class as its first argument (cls) to reference and manage class attributes.class Robot: name = 'WALL·E' @classmethod def rename(cls, new_name): cls.name += new_name Robot.rename('WALL·EVE') -
A static method is a function decorated by
@staticmethodthat receives no implicit argument and serves as a namespace-bound utility.class Robot: @staticmethod def is_sustainable(plant_count): """Check if life is sustainable based on current plant discovery.""" return plant_count > 0 if Robot.is_sustainable(1): print("Return to Earth!")
13.2. Inheritances
-
A subclass is a child class that extends or overrides the functionality from one or more base classes.
class Robot: def __init__(self, name): self.name = name def move(self): print(f"{self.name} moves on treads.") class WallE(Robot): def __init__(self, name): self.name = name def work(self): print(f"{self.name} compacting trash.")-
A Mixin is a small, specialized class used in multiple inheritance to plug in a specific feature (like flight or tread) without changing the core identity of the target class (like a Robot).
class Robot: def __init__(self, name): self.name = name class FlightMixin: def move(self): print(f"{self.name} is flying through the air!") class TreadMixin: def move(self): print(f"{self.name} is rolling on treads.") class Eve(FlightMixin, Robot): """Identity: Robot | Feature: Flight""" def scan(self): # Uses the identity name to perform a unique action print(f"{self.name} is scanning for plant life...") class WallE(TreadMixin, Robot): """Identity: Robot | Feature: Treads""" def work(self): # Uses the identity name to perform a unique action print(f"{self.name} is compacting trash cubes.") issubclass(Eve, Robot) # True issubclass(Eve, FlightMixin) # True issubclass(WallE, Robot) # True issubclass(WallE, TreadMixin) # True
-
-
In Python, inheritance is an attribute lookup process that uses C3 Linearization to flatten class hierarchies into a single, predictable search path called the MRO (Method Resolution Order).
-
Class.mro()is a method that returns a list of classes representing the search path derived from C3 Linearization for attribute lookup.class Base: ... class Mixin(Base): ... class Child(Mixin, Base): ... print(Child.mro()) # [<class 'Child'>, <class 'Mixin'>, <class 'Base'>, <class 'object'>] -
super()is a proxy object that delegates method calls to the next class in the MRO without specifying the class name explicitly.class Robot: def __init__(self, name): self.name = name class WallE(Robot): def __init__(self, name): super().__init__(name) # FLEXIBLE: finds Robot automatically class Eve(Robot): def __init__(self, name): Robot.__init__(self, name) # RIGID: must name class and pass 'self' manually
-
-
In Python, duck typing is a loose implementation of polymorphism that prioritizes an object’s behavior (methods and attributes) over its inheritance or class identity.
```py
# If it walks like a duck and quacks like a duck, it’s a duck.
# —— A Wise Person

class Duck:
    def wow(self): return 'quack!'

class Cat:
    def wow(self): return 'meow!'

def speak(entity):
    print(entity.wow())      # relies on behavior, not type

speak(Duck())   # quack!
speak(Cat())    # meow!
```

-
ABC (Abstract Base Class) is an explicit contract for interfaces with runtime type checking (i.e.,
isinstance()), while Protocol is an implicit shape using structural subtyping for static type checking (e.g., Pyright), aligning with Python’s duck typing philosophy.from abc import ABC, abstractmethod from typing import Protocol class RobotABC(ABC): @abstractmethod def move(self): ... class WallE(RobotABC): # explicitly inherits def move(self): print("Solar rolling...") class Flyer(Protocol): def move(self) -> None: ... class Eve: # implicitly matches def move(self): print("Thruster flight...") def activate(unit: Flyer): unit.move() activate(WallE()) # works (has .move) activate(Eve()) # works (has .move)
13.3. Operator overloading
-
A dataclass is a specialized class decorated by
@dataclass(similar to Lombok in Java), designed primarily to store data while automatically generating boilerplate methods like__init__,__repr__, and__eq__.from dataclasses import dataclass @dataclass class Point: x: float y: float p1 = Point(1.0, 2.0) p2 = Point(1.0, 2.0) print(p1) # Point(x=1.0, y=2.0) print(p1 == p2) # True -
Operator overloading lets classes intercept normal Python operations.
-
Classes can overload all Python expression operators.
-
Classes can also overload built-in operations such as printing, function calls, attribute access, etc.
-
Overloading makes class instances act more like built-in types.
-
Overloading is implemented by providing specially named methods in a class.
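A compact sketch of the idea before the per-method sections below—the `Money` class is illustrative, intercepting `+`, `len()`, and printing:

```py
class Money:
    def __init__(self, cents):
        self.cents = cents
    def __add__(self, other):            # intercepts the + operator
        return Money(self.cents + other.cents)
    def __len__(self):                   # intercepts len()
        return len(str(self.cents))
    def __repr__(self):                  # intercepts printing and echoes
        return f'Money({self.cents})'

total = Money(150) + Money(99)
print(total)                             # Money(249)
print(len(total))                        # 3
```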
13.3.1. Indexing and slicing: __getitem__ and __setitem__
-
When an instance
Xappears in an indexing expression likeX[i], Python calls the__getitem__method inherited by the instance, passingXand the index in brackets to the arguments.class Indexer: def __getitem__(self, index): return index ** 2 X = Indexer() X[2] # X[i] calls X.__getitem__(i) # 4 for i in range(5): print(X[i], end=' ') # Runs __getitem__(X, i) each time # 0 1 4 9 16 -
In addition to indexing,
__getitem__is also called for slice expressions—using upper and lower bounds and a stride bundled up into a slice object.class Indexer: data = [5, 6, 7, 8, 9] def __getitem__(self, index: int | slice) -> int | list[int]: # Called for index or slice print('getitem:', index) return self.data[index] # Perform index or sliceX = Indexer() X[0] # getitem: 0 # 5 X[-1] # getitem: -1 # 9 X[2:4] # getitem: slice(2, 4, None) # [7, 8] X[1:] # getitem: slice(1, None, None) # [6, 7, 8, 9] X[:-1] # getitem: slice(None, -1, None) # [5, 6, 7, 8] X[::2] # getitem: slice(None, None, 2) # [5, 7, 9] -
The
__getitem__may be also called automatically as an iteration fallback option (all iteration contexts will try the__iter__method first), for example, theforloops,inmembership test, list comprehensions, themapbuilt-in, list and tuple assignments, and type constructors.class StepperIndex: def __init__(self, data): self.data = data def __getitem__(self, i): return self.data[i]X = StepperIndex('Spam') X[1] # Indexing calls __getitem__ # 'p' for item in X: # for loops call __getitem__ print(item, end=' ') # for indexes items 0..N # S p a m 'p' in X # All call __getitem__ too # True [c for c in X] # List comprehension # ['S', 'p', 'a', 'm'] list(map(str.upper, X)) # map calls # ['S', 'P', 'A', 'M'] (a, b, c, d) = X # Sequence assignments a, c, d # ('S', 'a', 'm') list(X), tuple(X), ''.join(X) # And so on... # (['S', 'p', 'a', 'm'], ('S', 'p', 'a', 'm'), 'Spam') -
The
__setitem__index assignment method similarly intercepts both index and slice assignments.class IndexSetter: def __init__(self, data): self.data = data def __setitem__(self, index, value): # Intercept index or slice assignment self.data[index] = value # Assign index or slice -
The
__index__method returns an integer value for an instance when needed and is used by built-ins that convert to digit strings.class C: def __index__(self): return 255X = C() hex(X) # '0xff' bin(X) # '0b11111111' oct(X) # '0o377'
13.3.2. Iterable objects: __iter__ and __next__
-
Technically, iteration contexts work by passing an iterable object to the
iterbuilt-in function to invoke an__iter__method, which is expected to return an iterator object. -
If it’s provided, Python then repeatedly calls the iterator object’s
__next__method to produce items until aStopIterationexception is raised. -
A
`next` built-in function is also available as a convenience for manual iterations—`next(I)` is the same as `I.__next__()`.
In all iteration contexts, Python tries to use
__iter__first, which returns an object that supports the iteration protocol with a__next__method: if no__iter__is found by inheritance search, Python falls back on the__getitem__indexing method, which is called repeatedly, with successively higher indexes, until anIndexErrorexception is raised.class Squares: def __init__(self, start, stop): # Save state when created self.value = start - 1 self.stop = stop def __iter__(self): # Get iterator object on iter return self # One-shot iteration, single traversal only def __next__(self): # Return a square on each iteration if self.value == self.stop: # Also called by next built-in raise StopIteration self.value += 1 return self.value ** 2 -
If used, the
yieldstatement can create the__next__method automatically.class Squares: # __iter__ + yield generator def __init__(self, start, stop): # __next__ is automatic/implied self.start = start self.stop = stop def __iter__(self): for value in range(self.start, self.stop + 1): yield value ** 2 -
To achieve the multiple-iterator effect on one object,
__iter__simply needs to define a new stateful object for the iterator, instead of returningselffor each iterator request.class SkipObject: def __init__(self, wrapped): # Save item to be used self.wrapped = wrapped def __iter__(self): return SkipIterator(self.wrapped) # New iterator each time class SkipIterator: def __init__(self, wrapped): self.wrapped = wrapped # Iterator state information self.offset = 0 def __next__(self): if self.offset >= len(self.wrapped): # Terminate iterations raise StopIteration else: item = self.wrapped[self.offset] # else return and skip self.offset += 2 return item
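A brief usage sketch of the `SkipObject` class above, showing two independently active iterators:

```py
alpha = SkipObject('abcdef')
I1 = iter(alpha)                  # each iter() call returns a fresh SkipIterator
I2 = iter(alpha)
print(next(I1), next(I1))         # a c   (every other item)
print(next(I2))                   # a     (independent position)
print([x for x in alpha])         # ['a', 'c', 'e']
```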
13.3.3. Membership: __contains__, __iter__, and __getitem__
-
In the iterations domain, classes can implement the
inmembership operator as an iteration, using either the__iter__or__getitem__methods. -
To support more specific membership, though, classes may code a
__contains__method—when present, this method is preferred over__iter__, which is preferred over__getitem__. -
The
__contains__method should define membership as applying to keys for a mapping (and can use quick lookups), and as a search for sequences.class Iters: def __init__(self, value): self.data = value def __getitem__(self, i): # Fallback for iteration print('get[%s]:' % i, end='') # Also for index, slice return self.data[i] def __iter__(self): # Preferred for iteration print('iter=> next:', end='') # Allows multiple active iterators for x in self.data: # no __next__ to alias to next yield x print('next:', end='') def __contains__(self, x): # Preferred for 'in' print('contains: ', end='') return x in self.data
13.3.4. Attribute access: __getattr__ and __setattr__
-
The
__getattr__method intercepts attribute references.-
It’s called with the attribute name as a string whenever trying to qualify an instance with an undefined (nonexistent) attribute name.
-
It is not called if Python can find the attribute using its inheritance tree search procedure.
class Empty: def __getattr__(self, attrname): # On self.undefined if attrname == 'age': # age becomes a dynamically computed attribute return 40 else: raise AttributeError(attrname) # raises the builtin AttributeError exceptionX = Empty() X.age # 40 getattr(X, 'age') # 40 X.name # AttributeError: name getattr(X, 'name', 'Jon') # 'Jon' hasattr(X, 'name') # False setattr(X, 'name', 'Jon X') X.name # 'Jon X'
-
-
In the same department, the
__setattr__intercepts all attribute assignments.-
If the method is defined or inherited,
self.attr = valuebecomesself.__setattr__('attr', value). -
Assigning to any
selfattributes within__setattr__calls__setattr__again, potentially causing an infinite recursion loop. -
Avoid loops by coding instance attribute assignments as assignments to attribute dictionary keys:
self.__dict__['name'] = x, not self.name = x.
class Accesscontrol: def __setattr__(self, attr, value): if attr == 'age': self.__dict__[attr] = value + 10 # Not self.name=val or setattr # It’s also possible to avoid recursive loops in a class that uses __setattr__ by routing # any attribute assignments to a higher superclass with a call, instead of assigning keys # in __dict__: # self.__dict__[attr] = value + 10 # OK: doesn't loop # object.__setattr__(self, attr, value + 10) # OK: doesn't loop (new-style only) else: raise AttributeError(attr + ' not allowed')X = Accesscontrol() X.age = 40 X.age # 50 X.name = 'Bob' # AttributeError: name not allowed
-
-
A third attribute management method,
__delattr__, is passed the attribute name string and invoked on all attribute deletions (i.e.,del object.attr).-
Like __setattr__, it must avoid recursive loops by routing attribute deletions through __dict__ or a superclass call (a minimal sketch follows below).
-
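A minimal __delattr__ sketch under those constraints (the DeleteControl class name is made up for illustration):
class DeleteControl:
    def __delattr__(self, attr):           # Called on del instance.attr
        if attr == 'age':
            raise AttributeError(attr + ' may not be deleted')
        else:
            del self.__dict__[attr]        # Route through __dict__ to avoid a recursive loop
X = DeleteControl()
X.name = 'Bob'
del X.name          # OK: removed from the instance's attribute dictionary
del X.age           # AttributeError: age may not be deleted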
-
The built-in
getattrfunction is used to fetch an attribute from an object by name string—getattr(X,N)is likeX.N, except thatNis an expression that evaluates to a string at runtime, not a variable.class Wrapper: # A wrapper (sometimes called a proxy) class def __init__(self, object): self.wrapped = object # Save object def __getattr__(self, attrname): print('Trace: ' + attrname) # Trace fetch return getattr(self.wrapped, attrname) # Delegate fetch
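A brief usage sketch of the Wrapper proxy above:
x = Wrapper([1, 2, 3])    # Wrap a list
x.append(4)               # Trace: append -- the fetch is traced, then delegated to the wrapped list
x.wrapped                 # [1, 2, 3, 4] -- a real instance attribute, so __getattr__ is not run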
13.3.5. String representation: __repr__ and __str__
If defined, __repr__ (or its close relative, __str__) is called automatically when class instances are printed or converted to strings.
-
__str__is tried first for theprintoperation and thestrbuilt-in function (the internal equivalent of whichprintruns). It generally should return a user-friendly display. -
__repr__is used in all other contexts: for interactive echoes, thereprfunction, and nested appearances, as well as byprintandstrif no__str__is present. It should generally return an as-code string that could be used to re-create the object, or a detailed display for developers.class adder: def __init__(self, value=0): self.data = value # Initialize data def __add__(self, other): self.data += other # Add other in place (bad form?)x = adder() # Default displays print(x) # <__main__.adder object at 0x7fd1fd745a50> x # <__main__.adder object at 0x7fd1fd745a50>class addrepr(adder): # Inherit __init__, __add__ def __repr__(self): # Add string representation return 'addrepr(%s)' % self.data # Convert to as-code stringx = addrepr(2) x # Runs __repr__ # addrepr(2) print(x) # Runs __repr__ # addrepr(2) str(x), repr(x) # Runs __repr__ for both # ('addrepr(2)', 'addrepr(2)')class addstr(adder): def __str__(self): # __str__ but no __repr__ return '[Value: %s]' % self.data # Convert to nice stringx = addstr(3) x # Default __repr__ # <demo.addstr object at 0x7fd1fd63d2d0> print(x) # # Runs __str__ # [Value: 3] str(x), repr(x) # ('[Value: 3]', '<demo.addstr object at 0x7fd1fd63d2d0>')class addboth(adder): def __str__(self): return '[Value: %s]' % self.data # User-friendly string def __repr__(self): return 'addboth(%s)' % self.data # As-code stringx = addboth(4) x # Runs __repr__ # addboth(4) print(x) # Runs __str__ # [Value: 4] str(x), repr(x) # ('[Value: 4]', 'addboth(4)')
13.3.6. Right-side and in-place uses: __radd__ and __iadd__
-
Every binary operator has left-side, right-side, and in-place variant overloading methods (e.g., __add__, __radd__, and __iadd__).
-
For example, __add__ handles instances that appear on the left side of the + operator; it is not called when the instance appears only on the right side.
class Number: def __init__(self, value=0): self.data = value def __add__(self, other): return self.data+otherx = Number(5) x + 2 # 7 2 + x # TypeError: unsupported operand type(s) for +: 'int' and 'Number'
-
To implement more general expressions, and hence support commutative-style operators, code the
__radd__method as well.class Number: def __init__(self, value=0): self.data = value def __add__(self, other): return self.data+other def __radd__(self, other): return self.data+other # Reusing __add__ in __radd__ # def __radd__(self, other): # return self.__add__(other) # Call __add__ explicitly # return self + other # Swap order and re-add # __radd__ = __add__ # Alias: cut out the middlemanx = Number(5) x + 2 # 7 2 + x # 7 -
To also implement
+= in-place augmented addition, code either an __iadd__ or an __add__. The latter is used if the former is absent, but may not be able to optimize in-place cases.
class Number: def __init__(self, value=0): self.data = value def __add__(self, other): return self.data+other __radd__ = __add__ def __iadd__(self, other): # __iadd__ explicit: x += y self.data += other # Usually returns self return selfx = Number(5) x += 1 x += 1 x.data # 7
13.3.7. Call expressions: __call__
-
Python runs a
__call__method for function call expressions applied to the instances, passing along whatever positional or keyword arguments were sent.class Callee: def __call__(self, *pargs, **kargs): # Intercept instance calls print('Called:', pargs, kargs) # Accept arbitrary argumentsC = Callee() C(1, 2, 3) # C is a callable object # Called: (1, 2, 3) {} C(1, 2, 3, x=4, y=5) # Called: (1, 2, 3) {'y': 5, 'x': 4}class C: def __call__(self, a, b, c=5, d=6): ... # Normals and defaults class C: def __call__(self, *pargs, **kargs): ... # Collect arbitrary arguments class C: def __call__(self, *pargs, d=6, **kargs): ... # 3.X keyword-only argument
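Because __call__ runs on instances, it combines naturally with per-instance state; a minimal sketch with a made-up Counter class (not from the original text):
class Counter:
    def __init__(self):
        self.calls = 0                       # State retained between calls
    def __call__(self, *pargs, **kargs):     # Intercepts instance-call expressions
        self.calls += 1
        print('call %s:' % self.calls, pargs, kargs)
c = Counter()
c('spam')           # call 1: ('spam',) {}
c('ham', n=2)       # call 2: ('ham',) {'n': 2}
c.calls             # 2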
13.3.8. Boolean tests: __bool__ and __len__
-
In Boolean contexts, Python first tries
__bool__to obtain a direct Boolean value; if that method is missing, Python tries__len__to infer a truth value from the object’s length.class Truth: def __bool__(self): return TrueX = Truth() if X: print('yes!') # yes!class Truth: def __bool__(self): return FalseX = Truth() bool(X) # Falseclass Truth: def __len__(self): return 0 X = Truth() if not X: print('no!') # no! -
If both methods are present, Python prefers
__bool__over__len__, because it is more specific:class Truth: def __bool__(self): return True # 3.X tries __bool__ first def __len__(self): return 0 # 2.X tries __len__ first X = Truth() if X: print('yes!') # yes! -
If neither truth method is defined, the object is vacuously considered true (though any potential implications for more metaphysically inclined readers are strictly coincidental):
class Truth: passX = Truth() bool(X) # True
13.3.9. with/as Context Managers: __enter__ and __exit__
with expression [as variable], [expression [as variable]]:
with-block
The with statement can be used with any object implementing __enter__() (resource acquisition/setup) and __exit__() (resource release/cleanup) to enable automatic resource management.
-
Files: The with
open('filename', 'mode') as file:syntax opens a file, assigns it to a variable (file), and automatically closes the file when the indented block exits, even in case of exceptions. -
Database Connections:
with sqlite3.connect(':memory:') as con: creates a connection and assigns it to a variable; when the block exits, the connection's context manager commits the transaction (or rolls it back on an exception). Note that it does not close the connection itself.
-
Locks: In multithreaded environments, with can be used with lock objects to acquire a lock at the beginning of the block and release it at the end, ensuring proper synchronization.
fi = open('test.txt', 'w', encoding='utf-8') try: fi.write('hello world') finally: fi.close()with open('test.txt', 'r', encoding='utf-8') as fo: txt = fo.read() print(txt)with open('data', 'r', encoding='utf-8') as fin, open('res', 'wb') as fout: # multiple context managers for line in fin: if 'some key' in line: fout.write(line)
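For the lock case mentioned in the list above, a minimal sketch with threading.Lock: the with statement acquires the lock on entry and releases it on exit, even if the block raises an exception.
import threading
counter = 0
lock = threading.Lock()
def add_one():
    global counter
    with lock:              # acquire() on entry, release() on exit
        counter += 1
threads = [threading.Thread(target=add_one) for _ in range(100)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)              # 100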
class Cat:
"""A custom context manager class that simulates a cat entering and leaving."""
def __enter__(self):
"""
Called when entering the `with` block. Prints a message and returns itself.
Returns:
The Cat instance (self) to be used within the `with` block.
"""
print("I'm coming in!")
return self # Return self to provide the managed object to the `with` block
def __exit__(self, exc_type: type, exc_value: object, traceback: object) -> bool:
"""
Called when exiting the `with` block, regardless of exceptions.
Prints a message, optionally handles exceptions, and returns True to suppress them.
Args:
exc_type (type): The type of exception raised within the `with` block (if any).
exc_value (object): The actual exception object raised (if any).
traceback (object): A traceback object containing information about the call stack
(if any exception was raised).
Returns:
bool: True to suppress any exceptions raised within the `with` block,
False to re-raise them. (Can be modified for specific exception handling)
"""
print("I'm going out.")
# Suppress potential exceptions (modify for specific handling)
return True
def wow(self) -> None:
"""
Method to simulate a cat's meow. Prints "meow!".
Returns:
None
"""
print("meow!")
with Cat() as cat: # type: Cat
"""Enters the context manager and assigns the Cat object to 'cat'."""
cat.wow() # Calls the cat's meow method within the context
# I'm coming in!
# meow!
# I'm going out.
from contextlib import contextmanager
class Cat:
"""A simple Cat class with a meow method."""
def wow(self) -> None:
"""
Method to simulate a cat's meow. Prints "meow!".
Returns:
None
"""
print("meow!")
@contextmanager
def cat_context():
"""
A generator-based context manager that simulates a cat entering and leaving.
Yields:
Cat: The Cat instance to be used within the `with` block.
"""
print("I'm coming in!")
cat = Cat()
try:
yield cat # Provide the cat to the `with` block
except Exception as e:
# Suppress exceptions (like returning True in __exit__)
pass # Swallow the exception
finally:
print("I'm going out.")
with cat_context() as cat:
"""Enters the context manager and assigns the Cat object to 'cat'."""
cat.wow() # Calls the cat's meow method within the context
# I'm coming in!
# meow!
# I'm going out.
13.4. Enum
Added in version 3.4.
from enum import Enum
class Weekday(Enum):
MONDAY = 1
TUESDAY = 2
WEDNESDAY = 3
THURSDAY = 4
FRIDAY = 5
SATURDAY = 6
SUNDAY = 7
Weekday(3) # <Weekday.WEDNESDAY: 3>
Weekday["WEDNESDAY"] # <Weekday.WEDNESDAY: 3>
print(Weekday.THURSDAY) # Weekday.THURSDAY
print(Weekday.TUESDAY.name) # TUESDAY
print(Weekday.WEDNESDAY.value) # 3
for day in list(Weekday):
print(day)
# Weekday.MONDAY
# Weekday.TUESDAY
# Weekday.WEDNESDAY
# Weekday.THURSDAY
# Weekday.FRIDAY
# Weekday.SATURDAY
# Weekday.SUNDAY
from enum import Flag
class Weekday(Flag):
MONDAY = 1
TUESDAY = 2
WEDNESDAY = 4
THURSDAY = 8
FRIDAY = 16
SATURDAY = 32
SUNDAY = 64
weekend = Weekday.SATURDAY | Weekday.SUNDAY
# <Weekday.SATURDAY|SUNDAY: 96>
for day in weekend:
print(day)
# Weekday.SATURDAY
# Weekday.SUNDAY
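The enum module also provides auto() to generate member values automatically and @unique to forbid duplicate values; a minimal sketch (not from the original text):
from enum import Enum, auto, unique
@unique
class Color(Enum):
    RED = auto()      # 1
    GREEN = auto()    # 2
    BLUE = auto()     # 3
Color.GREEN           # <Color.GREEN: 2>
Color.GREEN.value     # 2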
14. Exceptions
-
An exception is raised and caught as an instance of a class; user-defined exceptions are classes that inherit (directly or indirectly) from the built-in Exception class.
class OopsException(Exception): pass # user-defined exception
-
The
raisestatement raises (triggers) a built-in or user-defined exception.raise instance # raise instance of class raise clazz # make and raise instance of class: makes an instance with no constructor arguments raise # reraise the most recent exceptiontry: 1 / 0 except Exception as E: raise TypeError('Bad') from E # raise newexception from otherexception # Traceback (most recent call last): # ZeroDivisionError: division by zero # # The above exception was the direct cause of the following exception: # # Traceback (most recent call last): # TypeError: Bad -
The
assertstatement raises anAssertionErrorexception if a condition is false.# assert test, data # the data part is optional assert False, 'Nobody expects the Spanish Inquisition!' # AssertionError: Nobody expects the Spanish Inquisition! -
The
trystatement catches and recovers from exceptions with one or more handlers for exceptions that may be raised during the block’s execution.# try -> except -> else -> finally try: raise OopsException('panic') # raising exceptions except OopsException as err: # 3.X localizes 'as' names to except block print(err) # catch and recover from exceptions except (RuntimeError, TypeError, NameError) as err: # multiple exceptions as a parenthesized tuple ... except Exception as other: # except to catch all exceptions ... except: # bare except to catch all exceptions ... else: ... # run if no exception was raised during try block finally: # termination actions ... -
The
with/asstatement is designed to automate startup and termination activities that must occur around a block of code.# try: # file = open('lumberjack.txt', 'w', encoding='utf-8') # file.write('The larch!\n') # finally: # if file: file.close() with open('lumberjack.txt', 'w', encoding='utf-8') as file: # always close file on exit file.write('The larch!\n')
15. Decorators
A decorator is a callable that returns a callable to specify management or augmentation code for functions and classes.
-
Function decorators do name rebinding at function definition time, installing wrapper objects that intercept later function calls and process them as needed, usually passing the call on to the original function to run the managed action.
def decorator(F): # Process function F return F @decorator # Decorate function def func(): ... # func = decorator(func)def decorator(F): # Save or use function F # Return a different callable, a proxy: nested def, class with __call__, etc. ... @decorator def func(): ... # func = decorator(func)def decorator(F): # On @ decoration def wrapper(*args, **kargs): # On wrapped function call that retains the original function in an enclosing scope # Use F, args, and kargs # F(*args, **kargs) calls original function ... return wrapper @decorator # func = decorator(func) def func(x, y, z=122): # func is passed to decorator's F ... func(6, 7, 8) # 6, 7, 8 are passed to wrapper's *args, **kargsclass decorator: def __init__(self, func): # On @ decoration self.func = func def __call__(self, *args): # On wrapped function call by overloading the call operation # Use self.func and args # self.func(*args) calls original function @decorator def func(x, y): # func = decorator(func) ... # func is passed to __init__ func(6, 7) # 6, 7 are passed to __call__'s *argsdef decorator(A, B): # Save or use A, B def actualDecorator(F): # Save or use function F # Return a callable: nested def, class with __call__, etc. return callable return actualDecorator @decorator(A, B) def F(arg): # F = decorator(A, B)(F) # Rebind F to result of decorator's return value ... -
Class decorators do name rebinding at class definition time, installing wrapper objects that intercept later instance-creation calls and process them as needed, usually passing the call on to the original class to create a managed instance.
def decorator(C): # Process class C return C @decorator # Decorate class class C: ... # C = decorator(C)def decorator(C): # Save or use class C # Return a different callable, a proxy: nested def, class with __call__, etc. @decorator class C: ... # C = decorator(C)def decorator(cls): # On @ decoration class Wrapper: def __init__(self, *args): # On instance creation self.wrapped = cls(*args) def __getattr__(self, name): # On attribute fetch return getattr(self.wrapped, name) return Wrapper @decorator class C: # C = decorator(C) def __init__(self, x, y): # Run by Wrapper.__init__ self.attr = 'spam' x = C(6, 7) # Really calls Wrapper(6, 7) print(x.attr) # Runs Wrapper.__getattr__, prints "spam"
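As a concrete, runnable instance of the wrapper-function pattern above, here is a small tracing decorator that counts calls and uses functools.wraps to preserve the decorated function's name and docstring (a sketch, not from the original text):
import functools
def tracer(func):                          # On @ decoration
    @functools.wraps(func)                 # Keep func.__name__, __doc__, etc. on the wrapper
    def wrapper(*args, **kargs):           # On later calls to the decorated function
        wrapper.calls += 1
        print('call %s to %s' % (wrapper.calls, func.__name__))
        return func(*args, **kargs)        # Run the original function
    wrapper.calls = 0
    return wrapper
@tracer
def spam(a, b, c):                         # spam = tracer(spam)
    return a + b + c
spam(1, 2, 3)       # call 1 to spam -> returns 6
spam(4, 5, 6)       # call 2 to spam -> returns 12
spam.__name__       # 'spam' (preserved by functools.wraps)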
16. Ellipsis (…)
Ellipsis (…) is Python’s built-in Ellipsis object, a singleton constant (like None, True, False).
>>> ...
Ellipsis
>>> type (...)
<class 'ellipsis'>
>>> Ellipsis is ...
True
>>>
-
Use
…as a placeholder in function/class bodies (similar topass):def function_to_implement_later(): ... # Placeholder - does nothing class IncompleteClass: ... # Placeholder # Equivalent to: def function_to_implement_later(): pass -
In type annotations,
… represents any number of elements:
# Tuple with any number of ints def process(nums: tuple[int, ...]) -> None: pass # Callable with variadic arguments from collections.abc import Callable func: Callable[..., int] # Any args, returns int # Fixed-length tuple point: tuple[int, int] = (1, 2) # Variadic tuple numbers: tuple[int, ...] = (1, 2, 3, 4, 5)
-
Libraries use
…as a sentinel to mark special cases (distinct fromNone):from pydantic import BaseModel, Field class User(BaseModel): # Required field (no default) email: str = Field(..., description="Email address") # Optional field (defaults to None) avatar: str | None = Field(None, description="Avatar URL") # Optional with default value is_active: bool = Field(default=True, description="Active status")# Simplified Pydantic internal logic def Field(default=..., **kwargs): if default is ...: # Checks if default is the Ellipsis object # Field is REQUIRED - no default value return RequiredField(**kwargs) else: # Field is OPTIONAL - has a default value return OptionalField(default=default, **kwargs) -
In NumPy,
…represents all remaining dimensions:import numpy as np arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]]) # Shape: (2, 2, 2) arr[..., 0] # All dimensions except last, then first element # Equivalent to: arr[:, :, 0] arr[0, ...] # First element of first dimension, all others # Equivalent to: arr[0, :, :] -
In type stub files,
…indicates implementation not shown:# module.pyi (type stub) def complex_function(arg1: int, arg2: str) -> bool: ... # Implementation details not shown in stub class MyClass: def method(self) -> None: ... # Method signature only
17. Modules and packages
# A module is a single Python file (.py extension) containing Python code,
# that can include functions, classes, variables, and statements.
# animal.py (module file)
class Animal:
def __init__(self, voice: str) -> None:
self.__voice = voice
def wow(self):
print(f'{self.__voice}!')
# A package is a directory containing multiple Python modules and potentially
# subdirectories with even more modules, that represents a collection of related
# modules organized under a common namespace.
#
# A package import turns a directory into another Python namespace, with attributes
# corresponding to the subdirectories and module files that the directory contains.
# .
# ├── animals
# │ ├── cat.py
# │ ├── dog.py
# │ └── __init__.py
# └── main.py
# animals/cat.py
def wow():
print('meow!')
# animals/dog.py
def wow():
print('bark!')
# main.py
from animals import cat # from package import module
import animals.dog as dog # import package.module
cat.wow() # meow!
dog.wow() # bark!
17.1. search path
In the context of programming languages and environments, the search path is the list of directories that the interpreter examines to locate specific files, particularly modules or libraries. Python's module search path is built by concatenating several major components and ultimately becomes sys.path, a mutable list of directory name strings:
-
Home directory (automatic)
-
When running a program, this entry is the directory containing the program’s top-level script file.
-
When working interactively, this entry is the current working directory.
-
-
PYTHONPATH directories (if set)
-
In brief, PYTHONPATH is simply a list of user-defined and platform-specific names of directories that contain Python code files.
-
The
os.pathsep constant in Python provides the platform-specific separator used between directory names on the module search path (e.g., in PYTHONPATH).
-
Windows:
C:\Python310;C:\Users\YourName\Documents\my_modulesimport os, platform platform.system(), os.pathsep # ('Windows', ';') -
Linux/macOS:
/usr/lib/python3.10/site-packages:/home/yourname/my_modulesimport os, platform platform.system(), os.pathsep # ('Linux', ':')
-
-
-
Standard library directories
-
The contents of any .pth files (if present)
-
The site-packages directory of third-party extensions (automatic)
import sys
for path in sys.path:
print(f"'{path}'")
''                                         # the current working directory (for a script, this first entry is the script's directory)
'/usr/lib/python311.zip' # standard library, built-in modules
'/usr/lib/python3.11'
'/usr/lib/python3.11/lib-dynload' # dynamically loaded modules or libraries
'/usr/local/lib/python3.11/dist-packages' # third-party libraries
'/usr/lib/python3/dist-packages'
# sys.path is a list, and can be updated programmatically
sys.path
# ['', '/usr/lib/python311.zip', '/usr/lib/python3.11', '/usr/lib/python3.11/lib-dynload', '/usr/local/lib/python3.11/dist-packages', '/usr/lib/python3/dist-packages']
sys.path.insert(0, '/tmp')
sys.path
# ['/tmp', '', '/usr/lib/python311.zip', '/usr/lib/python3.11', '/usr/lib/python3.11/lib-dynload', '/usr/local/lib/python3.11/dist-packages', '/usr/lib/python3/dist-packages']
17.2. __init__.py
# dir0\ # Container on module search path
# dir1\
# __init__.py
# dir2\
# __init__.py
# mod.py
import dir1.dir2.mod
-
dir1anddir2both must contain an__init__.pyfile at least until Python 3.3. -
dir0, the container, does not require an__init__.pyfile; this file will simply be ignored if present. -
dir0, notdir0\dir1, must be listed on the module search pathsys.path.
The __init__.py file serves as a hook for package initialization-time actions, declares a directory as a package, generates a module namespace for a directory, and implements the behavior of from * (i.e., from .. import *) statements when used with directory imports:
-
Package initialization: The first time a Python program imports through a directory, it automatically runs all the code in the directory’s
__init__.py file, which is a natural place to put code that initializes the state required by files in a package.
-
Module usability declarations: Package
__init__.pyfiles are also partly present to declare that a directory is a regular module package. -
Module namespace initialization: In the package import model, the directory paths in a script become real nested object paths after an import.
-
from *statement behavior: As an advanced feature, the__all__lists in__init__.pyfiles can define what is exported when a directory is imported with thefrom *statement form.
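Putting these roles together, a hypothetical animals/__init__.py (reusing the animals package layout shown earlier) might run initialization code, re-export names from its submodules, and define __all__ for from * imports:
# animals/__init__.py
print('initializing the animals package')   # package initialization: runs once, on first import
from . import cat, dog                      # expose submodules as animals.cat and animals.dog
from .cat import wow as cat_wow             # re-export a convenience name
__all__ = ['cat', 'dog', 'cat_wow']         # what `from animals import *` copies out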
17.3. import and from statements, reload call
-
import fetches the module as a whole, so its names must be qualified (e.g., module_name.attr) to be accessed.
import module_name
-
from fetches (or copies) specific names out of the module into another scope; when a * is used instead of specific names (allowed only at the top level of a module file, not within a function), it copies all names assigned at the top level of the referenced module.
# import specific functions or classes from a module. from module_name import element1, element2 # import a specific element and assign it an alias for easier use. from module_name import element1 as alias # copy out _all_ variables from module_name import *
-
Like
def,importandfromare executable statements, not compile-time declarations, and they are implicit assignments:-
importassigns an entire module object to a single name. -
fromassigns one or more names to objects of the same names in another module.
-
-
Modules are loaded and run on the first
importorfrom, and only the first. -
Unlike
importandfrom:-
reloadis a function in Python, not a statement. -
reloadis passed an existing module object, not a new name. -
reloadlives in a module in Python 3.X and must be imported itself.
# import module # initial import # ...use module.attributes... # ... # now, go change the module file # ... # from importlib import reload # get reload itself (in 3.x) # reload(module) # get updated exports # ...use module.attributes... -
-
A namespace package has no __init__.py file (unlike a regular package, whose __init__.py is run automatically) and is otherwise not fundamentally different; it is just another way of creating packages, which are still relative to sys.path at the top level: the leftmost component of a dotted namespace package path must still be located in an entry on the normal module search path.
import dir1.dir2.mod from dir1.dir2.mod import x import splitdir.modmkdir -p /code/ns/dir{1,2}/sub # two dirs of same name in different dirs# module files in different directories # /code/ns/dir1/sub/mod1.py print(r'dir1\sub\mod1') # /code/ns/dir2/sub/mod2.py print(r'dir2\sub\mod2')PYTHONPATH=/code/ns/dir1:/code/ns/dir2 python -qimport sub sub # namespace packages: nested search paths # <module 'sub' (<_frozen_importlib_external.NamespaceLoader object at 0x7fd1eeda5c50>)> sub.__path__ # _NamespacePath(['/code/ns/dir1/sub', '/code/ns/dir2/sub']) from sub import mod1 # dir1\sub\mod1 import sub.mod2 # content from two different directories # dir2\sub\mod2 mod1 # <module 'sub.mod1' from '/code/ns/dir1/sub/mod1.py'> sub.mod2 # <module 'sub.mod2' from '/code/ns/dir2/sub/mod2.py'>
17.4. relative imports
-
The
from statement can use leading dots (.) to specify that it requires modules located within the same package (known as package relative imports), instead of modules located elsewhere on the module import search path (called absolute imports).
from . import string # relative to this package, imports mypkg.string from .string import name1, name2 # imports names from mypkg.string from .. import string # imports string sibling of mypkg├── main.py └── spam ├── eggs.py ├── ham.py └── __init__.py# spam/ham.py from . import eggs print('eggs')# main.py from spam import ham$ python3 main.py eggs
Running
main.pydirectly sets the module’s__name__attribute to "__main__", causing issues with relative imports which rely on it being set to the package name.# mypkg\ # main.py # string.py# string.py def some_function(): ...# main.py from .string import some_function$ python3 main.py Traceback (most recent call last): from .string import some_function ImportError: attempted relative import with no known parent package
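A common fix is to run the file as a module of its package with the -m switch, from the directory that contains the package, so the package context needed by relative imports is set; a sketch based on the mypkg layout above:
$ python3 -m mypkg.main     # run main.py as the mypkg.main submodule; its relative imports now resolve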
17.5. import best practices
-
Import statements should be organized in this order: standard library, third-party packages, local application imports, separated by blank lines between groups and sorted alphabetically within each group for consistency and easier maintenance.
# Standard library from collections import defaultdict from datetime import datetime # Third-party from fastapi import APIRouter from sqlalchemy.orm import Session # Local application from app.models import User from app.services import UserService -
Use absolute imports for cross-package imports and when importing from internal modules within the same package, and use relative imports primarily in
__init__.pyfiles to re-export from sibling modules.# Absolute import (preferred for most cases) from app.services.user import UserService # Relative import (for __init__.py files) from .user import UserService
# app/services/__init__.py from .auth import AuthService from .user import UserService from .transaction import TransactionService
-
External code should import through the package's __init__.py rather than directly from submodules, to provide a stable public API and hide internal structure.
# External code (e.g., routers, dependencies) from app.services import UserService # Good - uses __init__.py # Not this: from app.services.user import UserService # Bypasses package API
-
Never import through
__init__.pyfrom within the same package to prevent circular dependencies.# ❌ BAD: app/services/settlement.py from app.services import TransactionService # Circular import! # ✅ GOOD: app/services/settlement.py from app.services.transaction import TransactionService # Direct import -
When modules in the same package need each other, use direct absolute imports to the specific module, not the package’s
__init__.py.# Internal module importing from same package from app.services.user import UserService from app.services.transaction import TransactionService -
Use
TYPE_CHECKING from the typing module to import types that are only needed for type hints, not at runtime, preventing circular imports and reducing runtime overhead.
from typing import TYPE_CHECKING if TYPE_CHECKING: from .transaction import TransactionService from .user import UserService def process(service: "TransactionService") -> None: # String annotation avoids a runtime NameError pass
-
Never use wildcard imports (
from module import *) except in specific controlled scenarios, as they pollute the namespace and make code harder to understand.
17.6. _X, __all__, __name__, and __main__
-
Python looks for an
__all__list in the module first and copies its names irrespective of any underscores; if__all__is not defined,from *copies all names without a single leading underscore (_X):# unders.py a, _b, c, _d = 1, 2, 3, 4from unders import * # Load non _X names only a, c # (1, 3) _b # NameError: name '_b' is not defined import unders # But other importers get every name unders._b # 2# alls.py __all__ = ['a', '_c'] # __all__ has precedence over _X a, b, _c, _d = 1, 2, 3, 4from alls import * # load __all__ names only a, _c # (1, 3) b # NameError: name 'b' is not defined from alls import a, b, _c, _d # but other importers get every name a, b, _c, _d # (1, 2, 3, 4) import alls alls.a, alls.b, alls._c, alls._d # (1, 2, 3, 4) -
If a module’s
__name__variable is the string "__main__", it means that the file is being executed as a top-level script as a program instead of being imported from another file as a library in the program.# cat.py def wow(): return __name__ if __name__ == '__main__': print(f'executed: {wow()}')$ python3 cat.py # directly executed (as a script) executed: __main__# imported by another module from cat import wow print(f'imported: {wow()}') # imported: cat
17.7. modules by name strings
-
To import the referenced module given its string name, build and run an
import statement with exec, or pass the string name in a call to __import__ or importlib.import_module.
# The `import` statement can’t directly load a module given its name as a # string—Python expects a variable name that’s taken literally and not evaluated, # not a string or expression. import 'string' # File "<stdin>", line 1 # import 'string' # ^^^^^^^^ # SyntaxError: invalid syntax# The most general approach is to construct an `import` statement as a string of Python # code and pass it to the `exec` built-in function to run, but it must compile the `import` # statement each time it runs, and compiling can be slow. modname = 'string' exec('import ' + modname) # Run a string of code string # <module 'string' from '/usr/lib/python3.11/string.py'># In most cases it’s probably simpler and may run quicker to use the built-in `__import__` # function to load from a name string instead, which returns the module object, so assign it # to a name here to keep it. modname = 'string' string = __import__(modname) string # <module 'string' from '/usr/lib/python3.11/string.py'># The newer call `importlib.import_module` does the same work as the built-in `__import__` # function, and is generally preferred in more recent Pythons for direct calls to import # by name string. import importlib modname = 'string' string = importlib.import_module(modname)
17.8. pip: pip install packages
# ensure can run pip from the command line
python3 -m pip --version # pip --version
# pip 23.0.1 from /usr/lib/python3/dist-packages/pip (python 3.11)
# OR, install pip, venv modules in Debian/Ubuntu for the system python.
apt install python3-pip python3-venv # On Debian/Ubuntu systems
17.8.1. virtual environment
# create a virtual environment
python3 -m venv python-learning-notes_env
# activate a virtual environment
source python-learning-notes_env/bin/activate
# ensure pip, setuptools, and wheel are up to date
pip install --upgrade pip setuptools wheel
# show pip version
pip --version # python3 -m pip --version
# pip 24.0 from .../python-learning-notes_env/lib/python3.11/site-packages/pip (python 3.11)
# deactivate a virtual environment: the deactivate command is often implemented as a shell function.
deactivate
17.8.2. Version specifiers
A version specifier consists of a series of version clauses, separated by commas. For example:
~= 0.9, >= 1.0, != 1.3.4.*, < 2.0
The comparison operator determines the kind of version clause:
-
~=: Compatible release clause -
==: Version matching clause -
!=: Version exclusion clause -
<=,>=: Inclusive ordered comparison clause -
<,>: Exclusive ordered comparison clause -
===: Arbitrary equality clause.
Examples:
-
~=3.1: version 3.1 or later, but not version 4.0 or later. -
~=3.1.2: version 3.1.2 or later, but not version 3.2.0 or later. -
~=3.1a1: version 3.1a1 or later, but not version 4.0 or later. -
== 3.1: specifically version 3.1 (or 3.1.0), excludes all pre-releases, post releases, developmental releases and any 3.1.x maintenance releases. -
== 3.1.*: any version that starts with 3.1. Equivalent to the~=3.1.0compatible release clause. -
~=3.1.0, != 3.1.3: version 3.1.0 or later, but not version 3.1.3 and not version 3.2.0 or later.
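In practice these clauses usually appear in a requirements file or in pyproject.toml dependencies; a hypothetical requirements.txt sketch (package names and versions are illustrative only):
# requirements.txt
requests~=2.31           # compatible release: >=2.31, <3.0
fastapi==0.111.*         # any 0.111.x release
sqlalchemy>=2.0,<2.1     # inclusive lower bound, exclusive upper bound
numpy!=1.26.1            # exclude one specific release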
17.8.3. pip install
# install the latest stable version.
pip install <package_name>
# install a package with extras, i.e., optional dependencies (e.g., pip install 'transformers[torch]').
pip install <package_name>[extra1[,extra2,...]]
# install the exact version (e.g., pip install vllm==0.4.3).
pip install <package_name>==<version>
# install the latest version greater than or equal to the specified one (e.g., pip install 'vllm>=0.4.0' gets 0.4.0 or any newer release); quote the specifier so the shell does not treat > as output redirection.
pip install '<package_name>>=<version>'
# install a compatible release (~= operator): e.g., pip install 'vllm~=0.4' allows any 0.x release from 0.4 up (but not 1.0), while 'vllm~=0.4.3' allows 0.4.x patch releases from 0.4.3 up.
pip install <package_name>~=<version>
# upgrade an already installed package to the latest version from PyPI.
pip install --upgrade <package_name>
# install from an alternate index
pip install --index-url http://my.package.repo/simple/ <package_name>
# search an additional index during install, in addition to PyPI
pip install --extra-index-url http://my.package.repo/simple <package_name>
# install pre-release and development versions, in addition to stable versions
pip install --pre <package_name>
17.8.4. cache, configuration
# get the cache directory that pip is currently configured to use
pip cache dir # ~/.cache/pip
# Configuration files can change the default values for command line options, and pip has 3 levels:
# - global: system-wide configuration file, shared across users.
# - user: per-user configuration file.
# - site: per-environment configuration file; i.e. per-virtualenv.
# the names of the settings are derived from the long command line option.
[global]
timeout = 60
index-url = https://download.zope.org/ppix
# per-command section: pip install
[install]
ignore-installed = true
no-dependencies = yes
# finding the config directory programmatically:
Debian GNU/Linux$ pip config list -v
For variant 'global', will try loading '/etc/xdg/pip/pip.conf'
For variant 'global', will try loading '/etc/pip.conf'
For variant 'user', will try loading '~/.pip/pip.conf'
For variant 'user', will try loading '~/.config/pip/pip.conf'
For variant 'site', will try loading '$VIRTUAL_ENV/pip.conf' or '/usr/pip.conf'
Microsoft Windows 11 > pip config list -v
For variant 'global', will try loading '%ALLUSERSPROFILE%\pip\pip.ini'
For variant 'user', will try loading '%USERPROFILE%\pip\pip.ini'
For variant 'user', will try loading '%APPDATA%\pip\pip.ini'
For variant 'site', will try loading '%VIRTUAL_ENV%\pip.ini' or '%LOCALAPPDATA%\Programs\Python\Python312\pip.ini'
17.8.5. mirror
# default: https://pypi.org/simple
# set the PyPI mirror
pip config --user set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
# pip config --user set global.index-url https://mirrors.aliyun.com/pypi/simple/
# pip config set global.extra-index-url "https://mirrors.sustech.edu.cn/pypi/web/simple https://mirrors.aliyun.com/pypi/simple/"
17.8.6. conda
Conda is a free, open-source software program for package and environment management originally developed by Anaconda.
-
Miniconda is a free, miniature installation of Anaconda Distribution that includes only conda, Python, the packages they both depend on, and a small number of other useful packages.
# download and install the latest version wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh bash ~/Miniconda3-latest-Linux-x86_64.sh # optional: disable auto-activation of base environment on startup conda config --set auto_activate_base false -
Conda channels are the locations where packages are stored; they serve as the base for hosting and managing packages. By default, conda can serve packages from two main locations:
-
By default, Conda automatically uses
repo.anaconda.comto download and update packages.$ conda config --get channels --add channels 'https://repo.anaconda.com/pkgs/msys2' # lowest priority (1) --add channels 'https://repo.anaconda.com/pkgs/r' (2) --add channels 'https://repo.anaconda.com/pkgs/main' # highest priority (3)1 A Windows-specific channel that provides Unix-like tools and libraries necessary for many packages to function on Windows. 2 A specialized channel dedicated to packages for the R programming language. 3 The default, general-purpose channel maintained by Anaconda, Inc., primarily hosting Python-based scientific computing packages. -
In addition, Conda clients search
conda.anaconda.orgfor community channels likeconda-forgeorbioconda.conda-forgeis a separate, community-led channel, required to be added explicitly.$ conda config --add channels conda-forge $ conda config --get channels --add channels 'https://repo.anaconda.com/pkgs/msys2' # lowest priority --add channels 'https://repo.anaconda.com/pkgs/r' --add channels 'https://repo.anaconda.com/pkgs/main' --add channels 'conda-forge' # highest priorityThe
conda config --add channels conda-forgecommand modifies the Conda configuration file (~/.condarcor%USERPROFILE%\.condarc), addingconda-forgeto the top of the channels list, thereby assigning it the highest priority during package searches. -
Conda can be configured to use mirror servers instead of the default online repositories.
# mirror defaults default_channels: (1) - https://my-mirror.com/pkgs/main - https://my-mirror.com/pkgs/r - https://my-mirror.com/pkgs/msys2 # mirror all community channels channel_alias: https://my-mirror.com (2) # mirror only some community channels custom_channels: (3) conda-forge: https://my-mirror.com/conda-forge1 The default_channelssetting completely replaces Conda’s built-in default channels, redirecting all package requests for them to specified mirror URLs.2 The channel_aliassetting establishes a base URL that prefixes all non-default channel names (e.g.,conda-forgeinconda install -c conda-forge), thereby redirecting their package requests to a designated mirror location.3 The custom_channelssetting allows for direct mapping of specific channel names to particular mirror URLs, providing fine-grained control and overriding anychannel_aliasfor those listed channels.# using TUNA mirrors show_channel_urls: true default_channels: - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/msys2 custom_channels: conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
-
-
Conda is a powerful command line tool for package and environment management that runs on Windows, macOS, and Linux.
-
Create, activate, list, share, remove, and update environments.
# create a new, empty environment conda create -n <env-name> # create a new environment with default packages conda create -n <env-name> python pandas # create a new environment with a specific Python version conda create -n <env-name> python=3.12 # create or update an environment from a file conda env create -f environment.yml# list all environments conda env list# activate an environment conda activate myenv # deactivate the current environment conda deactivate# export the current environment to a file (verbose) conda env export > environment.yml # export only explicitly installed packages (recommended) conda env export --from-history > environment.yml# remove an environment and all its packages conda env remove --name my_env -
Run commands (
conda run) in an environment without shell activation.# Best Practice: Use `conda run` in scripts to execute a command in an # environment without needing to activate it first. This is more robust # for automation as it doesn't modify the shell's state. # run a command in a specific conda environment without activating it conda run -n myenv python my_script.py # run an arbitrary command conda run -n myenv pytest # for interactive commands or long-running services, use --no-capture-output # to see the output in real-time instead of all at the end. conda run -n myenv --no-capture-output python my_interactive_app.py#!/bin/bash # A script that runs a Python application, demonstrating a priority-based # approach for choosing the Python interpreter: # 1. Use the python from an active Conda environment. # 2. If not active, use `conda run` with the environment from `environment.yml`. # 3. As a fallback, use the system's `python3`. PROJECT_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" PYTHON_CMD="python3" # Check for active Conda environment (CONDA_PREFIX is set when conda env is activated) if [ -n "$CONDA_PREFIX" ]; then PYTHON_CMD="python" elif command -v conda >/dev/null && [ -f "$PROJECT_ROOT/environment.yml" ]; then ENV_NAME=$(grep -E "^name:" "$PROJECT_ROOT/environment.yml" | sed -E 's/^name:[[:space:]]*([^[:space:]#]+).*/\1/' | head -n1) # Use 'conda run' to execute in the specified environment without activating it [ -n "$ENV_NAME" ] && PYTHON_CMD="conda run --no-capture-output -n $ENV_NAME python" fi cd "$PROJECT_ROOT" # Execute the main module (e.g., 'cli.main', 'app.main', etc.), passing through all script arguments PYTHONPATH="$PROJECT_ROOT" $PYTHON_CMD -m cli.main "$@" -
Find, install, remove, list, and update packages.
# search for a package across all configured channels conda search scipy # search ONLY in a specific channel, ignoring all other configured channels conda search --override-channels -c conda-forge scipy # search an additional channel with highest priority (results are combined with configured channels) conda search -c conda-forge scipy# search for a package with a version pattern (e.g., greater than or equal to) conda search "numpy>=1.20" # search for a package with a version prefix conda search "numpy=1.20.*" # search for a package for a different platform conda search numpy --platform linux-64 # filter search results using external tools like grep (e.g., for a specific Python build) conda search numpy | grep py39# show detailed information for a specific package build conda search scipy=1.15.2=py313hf4aebb8_1 --info# install a package into the currently active environment conda install matplotlib # install a package into a specific environment conda install -n myenv matplotlib # install a package from a specific channel conda install -c conda-forge numpy# remove a package from the currently active environment conda remove matplotlib # remove a package from a specific environment conda remove -n myenv pandas# list installed packages in the current environment conda list # list installed packages in a specific environment conda list -n myenv# update a specific package conda update biopython # update Python in the current environment conda update python # update a specific package in a specific environment conda update -n myenv biopython -
Update the Conda package manager itself.
# update conda itself (simple, but may use non-default channels) conda update conda # update conda from the official defaults channel (recommended for stability) conda update -n base -c defaults conda -
Using pip in a Conda Environment.
# install pip into an environment conda install -n myenv pip # use pip to install a package (after activating the environment) conda activate myenv pip install <package-name>
-
17.8.7. uv
uv is an extremely fast Python package and project manager, written in Rust, to replace pip, pip-tools, pipx, virtualenv, and more.
-
uv provides a standalone installer to download and install uv:
$ curl -LsSf https://astral.sh/uv/install.sh | sh downloading uv 0.9.26 x86_64-unknown-linux-gnu no checksums to verify installing to /home/user/.local/bin uv uvx everything's installed!pip install uvwinget install --id=astral-sh.uv -eecho 'eval "$(uv generate-shell-completion bash)"' >> ~/.bashrc -
installing and managing Python versions.
# install Python versions. uv python install # view available Python versions. uv python list # find an installed Python version. uv python find # pin the current project to use a specific Python version. uv python pin # uninstall a Python version. uv python uninstall -
creating and working on Python projects, i.e., with a
pyproject.toml.# create a new Python project. uv init # add a dependency to the project. uv add # remove a dependency from the project. uv remove # sync the project's dependencies with the environment. uv sync # create a lockfile for the project's dependencies. uv lock # run a command in the project environment. uv run # view the dependency tree for the project. uv tree # build the project into distribution archives. uv build # publish the project to a package index. uv publish -
running and installing tools published to Python package indexes, e.g.,
rufforblack.# run a tool in a temporary environment. uvx # an alias for `uv tool run` # install a tool user-wide. uv tool install # uninstall a tool. uv tool uninstall # list installed tools. uv tool list # update the shell to include tool executables. uv tool update-shell -
managing and inspecting uv’s state, such as the cache, storage directories, or performing a self-update:
# remove cache entries. uv cache clean # remove outdated cache entries. uv cache prune # show the uv cache directory path. uv cache dir # show the uv tool directory path. uv tool dir # show the uv installed Python versions path. uv python dir # update uv to the latest version. uv self update
18. Testing
-
unittest# **Key Points About `unittest` in Python:** # # * **Test Cases:** Individual units of testing that verify specific functionality. # * **Test Suites:** Collections of test cases that can be run together. # * **Assertions:** Methods used to check if expected results match actual results. # * **Test Case Structure:** Arrange-Act-Assert (AAA) is a common structure. # * **Test Fixtures:** `setUp()` and `tearDown()` methods for setup and cleanup. # * **Running Tests:** `unittest.main()` is the primary way to run tests. # * **Best Practices:** Write clear, concise, and well-organized tests. # * **Naming Conventions:** Test case functions must be prefixed with `test_`. # # **Common Assertions:** # # * `assertEqual(a, b)`: Checks if `a` equals `b`. # * `assertNotEqual(a, b)`: Checks if `a` does not equal `b`. # * `assertTrue(condition)`: Checks if `condition` is `True`. # * `assertFalse(condition)`: Checks if `condition` is `False`. # * `assertIn(item, container)`: Checks if `item` is in `container`. # * `assertNotIn(item, container)`: Checks if `item` is not in `container`. # test_cap.py import unittest def cap(text: str) -> str: return text.capitalize() class TestCap(unittest.TestCase): def setUp(self) -> None: pass def tearDown(self) -> None: pass def test_one_word(self): text = 'duck' # _arrange_ the objects, create and set them up as necessary. result = cap(text) # _act_ on an object. self.assertEqual('Duck', result) # _assert_ that something is as expected. def test_multi_words(self): text = 'hello world' # _arrange_ the objects, create and set them up as necessary. result = cap(text) # _act_ on an object. self.assertEqual('Hello World', result) # _assert_ that something is as expected. def test_table_driven(self): # _arrange_ the objects, create and set them up as necessary. tests = [ ('duck', 'Duck'), ('hello world', 'Hello World') ] for text, expected in tests: result = cap(text) # _act_ on an object. self.assertEqual(result, expected) # _assert_ that something is as expected. if __name__ == '__main__': unittest.main()$ python3 test_cap.py F. ====================================================================== FAIL: test_multi_words (__main__.TestCap.test_multi_words) ---------------------------------------------------------------------- Traceback (most recent call last): File "...", line 27, in test_multi_words self.assertEqual('Hello World', result) AssertionError: 'Hello World' != 'Hello world!' - Hello World ? ^ + Hello world ? ^ ---------------------------------------------------------------------- Ran 2 tests in 0.003s FAILED (failures=1) -
doctest# doctest_cap.py def cap(text: str) -> str: """ >>> cap('duck') 'Duck' >>> cap('hello world') 'Hello World' """ return text.capitalize() if __name__ == '__main__': import doctest doctest.testmod()$ python3 doctest_cap.py ********************************************************************** File "...", line 5, in __main__.cap Failed example: cap('hello world') Expected: 'Hello World' Got: 'Hello world' ********************************************************************** 1 items had failures: 1 of 2 in __main__.cap ***Test Failed*** 1 failures. -
pytest# test_cap.py def cap(text: str) -> str: return text.capitalize() def test_one_word(): text = 'duck' result = cap(text) assert result == 'Duck' def test_multiple_words(): text = 'hello world' result = cap(text) assert result == 'Hello World'$ pipenv install pytest Installing pytest... Installing dependencies from Pipfile.lock (207fdb)... $ pytest ============================================== test session starts ============================================== platform linux -- Python 3.11.2, pytest-8.2.1, pluggy-1.5.0 rootdir: ... collected 2 items test_cap.py .F [100%] =================================================== FAILURES ==================================================== ______________________________________________ test_multiple_words ______________________________________________ def test_multiple_words(): text = 'hello world' result = cap(text) > assert result == 'Hello World' E AssertionError: assert 'Hello world' == 'Hello World' E E - Hello World E ? ^ E + Hello world E ? ^ test_cap.py:12: AssertionError ============================================ short test summary info ============================================ FAILED test_cap.py::test_multiple_words - AssertionError: assert 'Hello world' == 'Hello World' ========================================== 1 failed, 1 passed in 0.09s ==========================================
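pytest can also express the table-driven style from the unittest example with the @pytest.mark.parametrize decorator, which generates one test per input row; a minimal sketch (not from the original text):
# test_cap_param.py
import pytest
def cap(text: str) -> str:
    return text.capitalize()
@pytest.mark.parametrize("text, expected", [
    ('duck', 'Duck'),
    ('hello world', 'Hello World'),   # this row fails: capitalize() returns 'Hello world'
])
def test_cap(text, expected):
    assert cap(text) == expected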
19. Processes and concurrency
# The standard library’s os module provides a common way of accessing some system information.
import os
os.uname()
# posix.uname_result(sysname='Linux', nodename='node-0', release='6.1.0-21-amd64', version='#1 SMP PREEMPT_DYNAMIC Debian 6.1.90-1 (2024-05-03)', machine='x86_64')
os.getloadavg()
# (0.05126953125, 0.03955078125, 0.00341796875)
os.cpu_count()
# 4
(os.getpid(), os.getcwd(), os.getuid(), os.getgid())
# (1295, '/tmp', 1000, 1000)
os.system('date -u')
# Thu Jun 6 11:23:23 AM UTC 2024
# 0
# get system and process information with the third-party package psutil
import psutil # pip install psutil
print(psutil.cpu_times(percpu=True))
# [scputimes(user=4.37, nice=0.0, system=6.71, idle=1468.69, iowait=0.26, irq=0.0, softirq=1.86, steal=0.0, guest=0.0, guest_nice=0.0), scputimes(user=11.84, nice=0.0, system=9.3, idle=1465.29, iowait=1.02, irq=0.0, softirq=0.75, steal=0.0, guest=0.0, guest_nice=0.0), scputimes(user=10.31, nice=0.0, system=8.58, idle=1468.4, iowait=1.66, irq=0.0, softirq=0.97, steal=0.0, guest=0.0, guest_nice=0.0), scputimes(user=9.11, nice=0.0, system=10.02, idle=1467.95, iowait=0.81, irq=0.0, softirq=0.65, steal=0.0, guest=0.0, guest_nice=0.0)]
print(psutil.cpu_percent(percpu=False))
# 0.0
print(psutil.cpu_percent(percpu=True))
# [0.3, 0.4, 0.4, 0.1]
19.1. subprocess and multiprocessing
import subprocess
# run another program in a shell
# and grab whatever output it created (both standard output and standard error output)
print(subprocess.getoutput('date')) # Thu Jun 6 07:19:50 PM CST 2024
# A variant method called `check_output()` takes a list of the command and arguments.
# By default it returns standard output only as type bytes rather than a string, and
# does not use the shell:
print(subprocess.check_output(['date', '-u'])) # b'Thu Jun 6 11:30:09 AM UTC 2024\n'
# return a tuple with the status code and output of the other program
print(subprocess.getstatusoutput('date')) # (0, 'Thu Jun 6 07:32:25 PM CST 2024')
# capture the exit status only
ret = subprocess.call('date -u', shell=True)
# Thu Jun 6 11:45:51 AM UTC 2024
print(ret)
# 0
# makes a list of the arguments, not need to call the shell
ret = subprocess.call(['date', '-u'])
# Thu Jun 6 11:50:04 AM UTC 2024
print(ret)
# 0
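Since Python 3.5, subprocess.run() is the recommended general-purpose interface; a short sketch of the same date command (the capture_output and text keywords shown here require Python 3.7+):
import subprocess
cp = subprocess.run(['date', '-u'], capture_output=True, text=True)
print(cp.returncode)    # 0
print(cp.stdout)        # e.g. 'Thu Jun  6 11:55:02 AM UTC 2024\n'
print(cp.stderr)        # ''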
# create multiple independent processes
import multiprocessing
import os
def whoami(what):
print("Process %s says: %s" % (os.getpid(), what))
if __name__ == "__main__":
whoami("I'm the main program")
for n in range(4):
p = multiprocessing.Process(
target=whoami, args=("I'm function %s" % n,))
p.start()
# Process 1648 says: I'm the main program
# Process 1649 says: I'm function 0
# Process 1650 says: I'm function 1
# Process 1651 says: I'm function 2
# Process 1652 says: I'm function 3
# kill a process with terminate()
import multiprocessing
import time
import os
def whoami(name):
print("I'm %s, in process %s" % (name, os.getpid()))
def loopy(name):
whoami(name)
start = 1
stop = 1000000
for num in range(start, stop):
print("\tNumber %s of %s. Honk!" % (num, stop))
time.sleep(1)
if __name__ == "__main__":
whoami("main")
p = multiprocessing.Process(target=loopy, args=("loopy",))
p.start()
time.sleep(5)
p.terminate()
# I'm main, in process 13084
# I'm loopy, in process 14664
# Number 1 of 1000000. Honk!
# Number 2 of 1000000. Honk!
# Number 3 of 1000000. Honk!
# Number 4 of 1000000. Honk!
# Number 5 of 1000000. Honk!
19.2. Queues, processes, and threads
A queue is like a list: things are added at one end and taken away from the other; this behavior is most commonly referred to as FIFO (first in, first out). In general, queues transport messages, which can be any kind of information, and are used for distributed task management, also known as work queues, job queues, or task queues.
Threads can be dangerous. Like manual memory management in languages such as C and C++, they can cause bugs that are extremely hard to find, let alone fix. To use threads, all the code in the program (and in external libraries that it uses) must be thread safe.
In Python, threads do not speed up CPU-bound tasks because of an implementation detail in the standard Python system called the Global Interpreter Lock (GIL).
-
Use threads for I/O-bound problems
-
Use processes, networking, or events (discussed in the next section) for CPU-bound problems
import multiprocessing as mp
def washer(dishes, output):
for dish in dishes:
print('Washing', dish, 'dish')
output.put(dish)
def dryer(input):
while True:
dish = input.get()
print('Drying', dish, 'dish')
input.task_done()
dish_queue = mp.JoinableQueue()
dryer_proc = mp.Process(target=dryer, args=(dish_queue,))
dryer_proc.daemon = True
dryer_proc.start()
dishes = ['salad', 'bread', 'entree', 'dessert']
washer(dishes, dish_queue)
dish_queue.join()
# Washing salad dish
# Washing bread dish
# Washing entree dish
# Washing dessert dish
# Drying salad dish
# Drying bread dish
# Drying entree dish
# Drying dessert dish
import threading
import queue
import time
def washer(dishes, dish_queue):
for dish in dishes:
print("Washing", dish)
time.sleep(5)
dish_queue.put(dish)
def dryer(dish_queue):
while True:
dish = dish_queue.get()
print("Drying", dish)
time.sleep(10)
dish_queue.task_done()
dish_queue = queue.Queue()
for n in range(2):
dryer_thread = threading.Thread(target=dryer, args=(dish_queue,))
dryer_thread.start()
dishes = ['salad', 'bread', 'entree', 'dessert']
washer(dishes, dish_queue)
dish_queue.join()
# Washing salad
# Washing bread
# Drying salad
# Washing entree
# Drying bread
# Washing dessert
# Drying entree
# Drying dessert
19.3. concurrent.futures
The concurrent.futures module in the standard library can be used to schedule an asynchronous pool of workers, using threads (when I/O-bound) or processes (when CPU-bound), and get back a future to track their state and collect the results.
Use concurrent.futures any time you need to launch a bunch of concurrent tasks, such as the following:
-
Crawling URLs on the web
-
Processing files, such as resizing images
-
Calling service APIs
from concurrent import futures
import math
import sys
def calc(val):
    result = math.sqrt(float(val))
    return val, result

def use_threads(num, values):
    with futures.ThreadPoolExecutor(num) as tex:
        tasks = [tex.submit(calc, value) for value in values]
        for f in futures.as_completed(tasks):
            yield f.result()

def use_processes(num, values):
    with futures.ProcessPoolExecutor(num) as pex:
        tasks = [pex.submit(calc, value) for value in values]
        for f in futures.as_completed(tasks):
            yield f.result()

def main(workers, values):
    print(f"Using {workers} workers for {len(values)} values")
    print("Using threads:")
    for val, result in use_threads(workers, values):
        print(f'{val} {result:.4f}')
    print("Using processes:")
    for val, result in use_processes(workers, values):
        print(f'{val} {result:.4f}')

if __name__ == '__main__':
    workers = 3
    if len(sys.argv) > 1:
        workers = int(sys.argv[1])
    values = list(range(1, 6))  # 1 .. 5
    main(workers, values)
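As a side note, Executor.map offers a simpler alternative to submit() plus as_completed() when you do not care about completion order. A minimal sketch under that assumption (not part of the original example):
from concurrent import futures
import math

def calc(val):
    # same shape of result as the example above: (input, square root)
    return val, math.sqrt(float(val))

if __name__ == '__main__':
    with futures.ThreadPoolExecutor(max_workers=3) as tex:
        # map() yields results in input order, not completion order
        for val, result in tex.map(calc, range(1, 6)):
            print(f'{val} {result:.4f}')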
19.4. Asynchronous I/O
Python 3.4 introduced the asyncio module for asynchronous programming, and Python 3.5 added the async and await keywords, enabling coroutines (pausable functions) and an event loop to schedule them. This makes asyncio well suited to I/O-bound and structured network applications such as high-performance networking, web servers, database connections, and distributed task queues.
import asyncio
async def say(phrase, seconds):
    print(phrase)
    await asyncio.sleep(seconds)

async def wicked():
    task_1 = asyncio.create_task(say("Surrender,", 2))
    task_2 = asyncio.create_task(say("Dorothy!", 0))
    await task_1
    await task_2

# blocking: creates a new event loop, runs the coroutine until it completes,
# then closes the loop
asyncio.run(wicked())
import asyncio
async def say(phrase, seconds):
    print(phrase)
    await asyncio.sleep(seconds)

async def wicked():
    task_1 = asyncio.create_task(say("Surrender,", 2))
    task_2 = asyncio.create_task(say("Dorothy!", 0))
    await asyncio.gather(task_1, task_2)  # wait for all tasks to finish concurrently

# manual event-loop management (older style; asyncio.run() is preferred today)
loop = asyncio.get_event_loop()
loop.run_until_complete(wicked())
loop.close()
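On Python 3.11 and later, asyncio.TaskGroup offers structured concurrency as an alternative to creating and gathering tasks by hand. A minimal sketch of the same example (assuming Python 3.11+, not from the text):
import asyncio

async def say(phrase, seconds):
    print(phrase)
    await asyncio.sleep(seconds)

async def wicked():
    # the group waits for both tasks when the async-with block exits
    async with asyncio.TaskGroup() as tg:
        tg.create_task(say("Surrender,", 2))
        tg.create_task(say("Dorothy!", 0))

asyncio.run(wicked())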
20. SQL
DB-API (Database API), similar to JDBC in Java, is a standardized interface that lets Python programs interact with relational databases through a consistent set of functions and methods. It simplifies database access by providing common ground for working with different database systems such as MySQL, PostgreSQL, SQL Server, and SQLite.
-
DB-API focuses on fundamental database operations like connecting, executing SQL queries, fetching results, and committing/rolling back transactions.
-
Different database modules (e.g.,
MySQLdb, psycopg2, sqlite3) implement the DB-API standard, ensuring consistency in these core functionalities across various systems.
-
DB-API promotes parameterization of SQL queries using placeholders (
%s, ?, etc.) for values, which enhances security by preventing SQL injection vulnerabilities and improves portability by separating data from the query itself.
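To make the last point concrete, here is a minimal sketch (not from the text) contrasting an injectable query built with string formatting against a parameterized one:
import sqlite3

connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE users (username TEXT)")
cursor.execute("INSERT INTO users VALUES ('Alice')")

name = "Alice' OR '1'='1"  # hostile input

# unsafe: the input is pasted into the SQL text and changes its meaning
print(cursor.execute(
    "SELECT * FROM users WHERE username = '%s'" % name).fetchall())  # every row

# safe: the input is bound as data and never parsed as SQL
print(cursor.execute(
    "SELECT * FROM users WHERE username = ?", (name,)).fetchall())   # []

connection.close()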
20.1. Using DB-API with SQLite in Memory
import sqlite3
# Connect to an in-memory database (no file needed)
with sqlite3.connect(":memory:") as connection:
    # Create a cursor object
    cursor = connection.cursor()
    # Create a table (assuming you don't have one)
    cursor.execute('''
        CREATE TABLE IF NOT EXISTS users (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            username TEXT NOT NULL,
            email TEXT UNIQUE NOT NULL)
    ''')
    # Insert some data using parameterization
    users = [("Alice", "alice@example.com"), ("Bob", "bob@example.com")]
    cursor.executemany(
        "INSERT INTO users (username, email) VALUES (?, ?)", users)
    # Commit the changes
    connection.commit()
    # Query the data
    cursor.execute("SELECT * FROM users")
    # Fetch all results
    results = cursor.fetchall()
    # Print the results
    for row in results:
        print(f"ID: {row[0]}, Username: {row[1]}, Email: {row[2]}")
Appendix A: Install Python from Source Code on Linux
-
Download Python Source Releases
# replace the Python version (e.g. 3.13.0) as needed
curl -LO https://www.python.org/ftp/python/3.13.0/Python-3.13.0.tar.xz
-
Extract the XZ compressed source tarball
tar xvf Python-3.13.0.tar.xz
-
Configure, make, and install Python
cd Python-3.13.0 && ./configure && sudo make install
By default, make install will install all the files in /usr/local/bin, /usr/local/lib, etc. You can specify an installation prefix other than /usr/local using --prefix on ./configure, for instance --prefix=$HOME.
$ ls /usr/local/lib/
libpython3.12.a  libpython3.13.a  pkgconfig  python3.11  python3.12  python3.13
$ ls /usr/local/bin/
2to3  2to3-3.12  idle3  idle3.12  idle3.13  pip3  pip3.12  pip3.13  pydoc3  pydoc3.12  pydoc3.13  python3  python3.12  python3.12-config  python3.13  python3.13-config  python3-config
-
Check the Python version
$ python3 --version
Python 3.13.0
Appendix B: Build a Docker Image for FastAPI
$ ls
Dockerfile main.py requirements.txt
# syntax=docker/dockerfile:1
ARG PYTHON_VERSION=3.11
FROM python:${PYTHON_VERSION}-alpine AS builder
ARG PYTHON_VERSION
WORKDIR /app
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt \
    -i https://pypi.tuna.tsinghua.edu.cn/simple

FROM python:${PYTHON_VERSION}-alpine
ARG PYTHON_VERSION
ENV APP_UID=1654
RUN apk add --no-cache shadow \
    && groupadd -r app -g $APP_UID \
    && useradd --no-log-init -r -g app -u $APP_UID app
USER app
COPY --from=builder /usr/local/lib/python${PYTHON_VERSION}/site-packages /usr/local/lib/python${PYTHON_VERSION}/site-packages
COPY --from=builder /usr/local/bin /usr/local/bin
# set a working directory so the application files are not copied into /
WORKDIR /app
COPY . .
EXPOSE 8000
CMD ["fastapi", "run", "main.py"]
References
-
[1] Bill Lubanovic, Introducing Python: Modern Computing in Simple Packages, 2nd edition, O’Reilly Media, November 2019.
-
[2] Mark Lutz, Learning Python: Powerful Object-Oriented Programming, 5th edition, O’Reilly Media, July 2013.
-
[3] https://en.wikipedia.org/wiki/Python_(programming_language)
-
[7] https://numpy.org/doc/stable/user/absolute_beginners.html
-
[8] Wes McKinney, Python for Data Analysis, 3rd edition, O’Reilly Media, August 2022.