Helm is a package manager for Kubernetes that uses a packaging format called charts, and is implemented in two distinct parts: the Helm Client and the Helm Library. [1]
A chart is a bundle of information needed to create an instance of a Kubernetes application, and a repository is a place where charts can be collected and shared.
A release is a running instance of a chart combined with a specific config, where a config contains configuration information that can be merged into a packaged chart to create a releasable object.
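Concretely, a chart is a directory of files whose identity is described by its Chart.yaml. A minimal sketch (the chart name my-app is a hypothetical placeholder):
apiVersion: v2
name: my-app                 # hypothetical chart name
description: A minimal example chart
type: application
version: 0.1.0               # chart version (SemVer 2)
appVersion: "1.0.0"          # version of the application being packaged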
Helm can be installed either from the official binary releases or through community package managers. [2]
Every release of Helm provides binary releases for a variety of OSes. These binary versions can be manually downloaded and installed.
Download a desired version that is supported by your Kubernetes cluster (see the Helm version support policy).
Unpack it (tar -zxvf helm-v3.0.0-linux-amd64.tar.gz).
Find the helm binary in the unpacked directory, and move it to its desired destination (mv linux-amd64/helm /usr/local/bin/helm).
For example, to install Helm 3.14.x, which supports Kubernetes 1.26.x through 1.29.x:
$ curl -sLO https://get.helm.sh/helm-v3.14.2-linux-amd64.tar.gz
$ tar xf helm-v3.14.2-linux-amd64.tar.gz
$ sudo cp linux-amd64/helm /usr/local/bin/
The following commands install Helm from Apt (Debian/Ubuntu):
curl https://baltocdn.com/helm/signing.asc | gpg --dearmor | sudo tee /usr/share/keyrings/helm.gpg > /dev/null
sudo apt-get install apt-transport-https --yes
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" | sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt-get update
sudo apt-get install helm
It’s recommended to run the following command to generate the completion scripts for the specified shell (e.g., Bash):
helm completion bash | sudo tee /etc/bash_completion.d/helm > /dev/null
source /etc/bash_completion # reload the completion scripts
Once Helm is ready, a chart repository can be added using the helm repo add command. Check Artifact Hub for available Helm chart repositories. [3]
For example, to add the bitnami repo:
helm repo add bitnami https://charts.bitnami.com/bitnami
Once the repository is added, you can list the charts it provides:
$ helm search repo bitnami
NAME CHART VERSION APP VERSION DESCRIPTION
bitnami/airflow 17.2.1 2.8.2 Apache Airflow is a tool to express and execute...
bitnami/apache 10.9.1 2.4.58 Apache HTTP Server is an open-source HTTP serve...
bitnami/apisix 2.10.0 3.8.0 Apache APISIX is high-performance, real-time AP...
...
To install a chart, you can run the helm install command. Helm has several ways to find and install a chart, but the easiest is to use the bitnami charts. [3]
Make sure you have the latest list of charts:
$ helm repo update bitnami
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "bitnami" chart repository
Update Complete. ⎈Happy Helming!⎈
Install the bitnami/mysql chart.
$ helm install bitnami/mysql --generate-name
NAME: mysql-1709977095
LAST DEPLOYED: Sat Mar 9 17:38:19 2024
NAMESPACE: default
STATUS: deployed
REVISION: 1
...
List the deployed releases.
$ helm list
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
mysql-1709977095 default 1 2024-03-09 17:38:19.258433628 +0800 CST deployed mysql-9.23.0 8.0.36
Uninstall a release.
$ helm uninstall mysql-1709977095
release "mysql-1709977095" uninstalled
Docker uses storage drivers to store image layers, and to store data in the writable layer of a container. [1]
Storage drivers are optimized for space efficiency, but (depending on the storage driver) write speeds are lower than native file system performance, especially for storage drivers that use a copy-on-write filesystem.
Use Docker volumes for write-intensive data, data that must persist beyond the container’s lifespan, and data that must be shared between containers.
A Docker image is built up from a series of layers. Each layer represents an instruction in the image’s Dockerfile. Each layer except the very last one is read-only. Consider the following Dockerfile:
# syntax=docker/dockerfile:1
FROM ubuntu:22.04
LABEL org.opencontainers.image.authors="org@example.com"
COPY . /app
RUN make /app
RUN rm -r $HOME/.cache
CMD python /app/app.py
This Dockerfile contains six instructions. Commands that modify the filesystem (here FROM, COPY, and the two RUNs) create a layer.
The FROM statement starts out by creating a layer from the ubuntu:22.04 image.
The LABEL command only modifies the image’s metadata, and doesn’t produce a new layer.
The COPY command adds some files from your Docker client’s current directory.
The first RUN command builds your application using the make command, and writes the result to a new layer.
The second RUN command removes a cache directory, and writes the result to a new layer.
Finally, the CMD instruction specifies what command to run within the container; it only modifies the image’s metadata and doesn’t produce an image layer.
When a new container is created, a new writable layer is added on top of the underlying layers, which is often called the container layer.
A storage driver handles the details about the way these layers interact with each other.
To see what storage driver Docker is currently using, use docker info and look for the Storage Driver line:
$ docker info 2> /dev/null | grep 'Storage Driver' -A 5
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
$ df -T /var/lib/docker
Filesystem Type 1K-blocks Used Available Use% Mounted on
/dev/sda1 ext4 102624184 57865288 39499736 60% /
containerd, the industry-standard container runtime, uses snapshotters instead of the classic storage drivers for storing image and container data.
Docker has two options for containers to store files on the host machine, so that the files are persisted even after the container stops: volumes, and bind mounts. [3]
Volumes are stored in a part of the host filesystem which is managed by Docker (/var/lib/docker/volumes/ on Linux). Non-Docker processes should not modify this part of the filesystem. Volumes are the best way to persist data in Docker.
Bind mounts may be stored anywhere on the host system. They may even be important system files or directories. Non-Docker processes on the Docker host or a Docker container can modify them at any time.
tmpfs mounts are stored in the host system’s memory only, and are never written to the host system’s filesystem.
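As a hedged sketch, the three options map onto a Compose file like this (the service and volume names are hypothetical):
version: "2.4"
services:
  app:
    image: nginx:alpine
    volumes:
      - app-data:/var/lib/app          # named volume, managed by Docker
      - ./config:/etc/app/config:ro    # bind mount from the host
    tmpfs:
      - /tmp/scratch                   # in-memory only, never persisted
volumes:
  app-data: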
Kubernetes supports many types of volumes. Ephemeral volume types have a lifetime of a pod, but persistent volumes exist beyond the lifetime of a pod. [4]
To use a volume, specify the volumes to provide for the Pod in .spec.volumes and declare where to mount those volumes into containers in .spec.containers[*].volumeMounts.
A process in a container sees a filesystem view composed from the initial contents of the container image, plus volumes (if defined) mounted inside the container.
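For instance, a minimal sketch of a Pod that declares an emptyDir volume (described below) in .spec.volumes and mounts it through .spec.containers[*].volumeMounts; all names are hypothetical:
apiVersion: v1
kind: Pod
metadata:
  name: volume-demo
spec:
  containers:
  - name: app
    image: busybox:stable
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: scratch
      mountPath: /scratch      # where the volume appears inside the container
  volumes:
  - name: scratch
    emptyDir: {}               # created when the Pod is assigned to a node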
Kubernetes supports several types of volumes.
configMap
A ConfigMap provides a way to inject configuration data into pods. The data stored in a ConfigMap can be referenced in a volume of type configMap and then consumed by containerized applications running in a pod (see the sketch after this list).
downwardAPI
A downwardAPI volume makes downward API data available to applications. Within the volume, you can find the exposed data as read-only files in plain text format.
emptyDir
For a Pod that defines an emptyDir volume, the volume is created when the Pod is assigned to a node. As the name says, the emptyDir volume is initially empty.
All containers in the Pod can read and write the same files in the emptyDir volume, though that volume can be mounted at the same or different paths in each container.
When a Pod is removed from a node for any reason, the data in the emptyDir is deleted permanently.
The emptyDir.medium field controls where emptyDir volumes are stored. By default emptyDir volumes are stored on whatever medium backs the node, such as disk, SSD, or network storage, determined by the medium of the filesystem holding the kubelet root dir (typically /var/lib/kubelet).
If you set the emptyDir.medium field to "Memory", Kubernetes mounts a tmpfs (RAM-backed filesystem) for you instead.
While tmpfs is very fast, be aware that, unlike disks, files you write count against the memory limit of the container that wrote them.
hostPath
A hostPath volume mounts a file or directory from the host node’s filesystem into your Pod. This is not something that most Pods will need, but it offers a powerful escape hatch for some applications.
local
A local volume represents a mounted local storage device such as a disk, partition or directory.
Local volumes can only be used as a statically created PersistentVolume. When using local volumes, it is recommended to create a StorageClass with volumeBindingMode set to WaitForFirstConsumer.
nfs
An nfs volume allows an existing NFS (Network File System) share to be mounted into a Pod. NFS can be mounted by multiple writers simultaneously.
persistentVolumeClaim
A persistentVolumeClaim volume is used to mount a PersistentVolume into a Pod. PersistentVolumeClaims are a way for users to "claim" durable storage (such as an iSCSI volume) without knowing the details of the particular cloud environment.
projected
A projected volume maps several existing volume sources into the same directory.
secret
A secret volume is used to pass sensitive information, such as passwords, to Pods. secret volumes are backed by tmpfs (a RAM-backed filesystem), so their contents are never written to non-volatile storage.
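As referenced under configMap above, a hedged sketch of consuming a ConfigMap through a volume (the ConfigMap app-config is assumed to already exist in the same namespace):
apiVersion: v1
kind: Pod
metadata:
  name: configmap-demo
spec:
  containers:
  - name: app
    image: busybox:stable
    command: ["sh", "-c", "ls /etc/config && sleep 3600"]
    volumeMounts:
    - name: config
      mountPath: /etc/config     # each ConfigMap key becomes a file here
  volumes:
  - name: config
    configMap:
      name: app-config           # hypothetical ConfigMap name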
Container Storage Interface (CSI) defines a standard interface for container orchestration systems (like Kubernetes) to expose arbitrary storage systems to their container workloads.
Once a CSI compatible volume driver is deployed on a Kubernetes cluster, users may use the csi volume type to attach or mount the volumes exposed by the CSI driver.
A csi volume can be used in a Pod in three different ways:
through a reference to a PersistentVolumeClaim
with a generic ephemeral volume
with a CSI ephemeral volume if the driver supports that
The following fields are available to storage administrators to configure a CSI persistent volume:
driver: A string value that specifies the name of the volume driver to use.
volumeHandle: A string value that uniquely identifies the volume.
readOnly: An optional boolean value indicating whether the volume is to be "ControllerPublished" (attached) as read only. Default is false.
fsType: If the PV’s VolumeMode is Filesystem, this field may be used to specify the filesystem that should be used to mount the volume. If the volume has not been formatted and formatting is supported, this value will be used to format the volume.
volumeAttributes: A map of string to string that specifies static properties of a volume.
controllerPublishSecretRef: A reference to the secret object containing sensitive information to pass to the CSI driver to complete the CSI ControllerPublishVolume and ControllerUnpublishVolume calls.
nodeExpandSecretRef: A reference to the secret containing sensitive information to pass to the CSI driver to complete the CSI NodeExpandVolume call.
nodePublishSecretRef: A reference to the secret object containing sensitive information to pass to the CSI driver to complete the CSI NodePublishVolume call.
nodeStageSecretRef: A reference to the secret object containing sensitive information to pass to the CSI driver to complete the CSI NodeStageVolume call.
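Putting a few of these fields together, a hedged sketch of a CSI PersistentVolume (the driver name csi.example.com and the volume handle are hypothetical):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: csi-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  csi:
    driver: csi.example.com        # hypothetical CSI driver name
    volumeHandle: vol-0123456789   # unique ID understood by the driver
    fsType: ext4
    volumeAttributes:
      tier: standard               # driver-specific static property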
Mount propagation [4] allows for sharing volumes mounted by a container to other containers in the same pod, or even to other pods on the same node, which is controlled by the mountPropagation field in containers[*].volumeMounts.
None - This volume mount will not receive any subsequent mounts that are mounted to this volume or any of its subdirectories by the host. In similar fashion, no mounts created by the container will be visible on the host. This is the default mode.
HostToContainer - This volume mount will receive all subsequent mounts that are mounted to this volume or any of its subdirectories.
Bidirectional - This volume mount behaves the same as the HostToContainer mount. In addition, all volume mounts created by the container will be propagated back to the host and to all containers of all pods that use the same volume.
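A minimal sketch of where the field sits in a Pod spec (names are hypothetical; HostToContainer lets the container see mounts the host adds later):
apiVersion: v1
kind: Pod
metadata:
  name: propagation-demo
spec:
  containers:
  - name: agent
    image: busybox:stable
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: host-mnt
      mountPath: /mnt
      mountPropagation: HostToContainer   # receive mounts made on the host after startup
  volumes:
  - name: host-mnt
    hostPath:
      path: /mnt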
Managing storage is a distinct problem from managing compute instances. The PersistentVolume subsystem provides an API for users and administrators that abstracts details of how storage is provided from how it is consumed.
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes.
It is a resource in the cluster just like a node is a cluster resource, that captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
PVs are volume plugins like Volumes, but have a lifecycle independent of any individual Pod that uses the PV.
A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod.
Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory).
Claims can request specific size and access modes (e.g., ReadWriteOnce, ReadOnlyMany, ReadWriteMany, or ReadWriteOncePod).
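For example, a minimal claim requesting a size and an access mode might look like this (the class name standard is a hypothetical assumption):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
  - ReadWriteOnce              # requested access mode
  resources:
    requests:
      storage: 8Gi             # requested size
  storageClassName: standard   # hypothetical class name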
While PersistentVolumeClaims allow a user to consume abstract storage resources, it is common that users need PersistentVolumes with varying properties, such as performance, for different problems.
A StorageClass provides a way for administrators to describe the classes of storage they offer. Different classes might map to quality-of-service levels, or to backup policies, or to arbitrary policies determined by the cluster administrators. [5]
Each StorageClass contains the fields provisioner, parameters, and reclaimPolicy, which are used when a PersistentVolume belonging to the class needs to be dynamically provisioned to satisfy a PersistentVolumeClaim (PVC).
The name of a StorageClass object is significant, and is how users can request a particular class. Administrators set the name and other parameters of a class when first creating StorageClass objects.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
PVs are resources in the cluster. PVCs are requests for those resources and also act as claim checks to the resource. The interaction between PVs and PVCs follows this lifecycle: [6]
There are two ways PVs may be provisioned: statically or dynamically.
Static
A cluster administrator creates a number of PVs. They carry the details of the real storage, which is available for use by cluster users. They exist in the Kubernetes API and are available for consumption.
Dynamic
When none of the static PVs the administrator created match a user’s PersistentVolumeClaim, the cluster may try to dynamically provision a volume specially for the PVC based on StorageClasses.
A control loop in the control plane watches for new PVCs, finds a matching PV (if possible), and binds them together.
If a PV was dynamically provisioned for a new PVC, the loop will always bind that PV to the PVC.
Otherwise, the user will always get at least what they asked for, but the volume may be in excess of what was requested.
The volumeBindingMode field of a StorageClass controls when volume binding and dynamic provisioning should occur; when unset, Immediate mode is used by default. [5]
The Immediate mode indicates that volume binding and dynamic provisioning occur once the PersistentVolumeClaim is created.
For storage backends that are topology-constrained and not globally accessible from all Nodes in the cluster, PersistentVolumes will be bound or provisioned without knowledge of the Pod’s scheduling requirements. This may result in unschedulable Pods.
A cluster administrator can address this issue by specifying the WaitForFirstConsumer mode, which will delay the binding and provisioning of a PersistentVolume until a Pod using the PersistentVolumeClaim is created.
PersistentVolumes will be selected or provisioned conforming to the topology that is specified by the Pod’s scheduling constraints.
Pods use claims as volumes.
The cluster inspects the claim to find the bound volume and mounts that volume for a Pod.
For volumes that support multiple access modes, the user specifies which mode is desired when using their claim as a volume in a Pod.
If a user deletes a PVC in active use by a Pod, the PVC is not removed immediately. PVC removal is postponed until the PVC is no longer actively used by any Pods. Also, if an admin deletes a PV that is bound to a PVC, the PV is not removed immediately. PV removal is postponed until the PV is no longer bound to a PVC.
The reclaim policy for a PersistentVolume tells the cluster what to do with it after it has been released of its claim, which can either be Retained or Deleted.
FEATURE STATE: Kubernetes v1.23 [alpha]
Finalizers can be added on a PersistentVolume to ensure that PersistentVolumes having a Delete reclaim policy are deleted only after the backing storage is deleted.
The newly introduced finalizers kubernetes.io/pv-controller and external-provisioner.volume.kubernetes.io/finalizer are only added to dynamically provisioned volumes.
The finalizer kubernetes.io/pv-controller is added to in-tree plugin volumes.
The finalizer external-provisioner.volume.kubernetes.io/finalizer is added for CSI volumes.
If you want a PVC to bind to a specific PV, you need to pre-bind them.
By specifying a PersistentVolume in a PersistentVolumeClaim, you declare a binding between that specific PV and PVC.
If the PersistentVolume exists and has not reserved PersistentVolumeClaims through its claimRef field, then the PersistentVolume and PersistentVolumeClaim will be bound.
The binding happens regardless of some volume matching criteria, including node affinity.
The control plane still checks that storage class, access modes, and requested storage size are valid.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: foo-pvc
  namespace: foo
spec:
  # Empty string must be explicitly set otherwise default StorageClass will be set.
  storageClassName: ""
  volumeName: foo-pv
  ...
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: foo-pv
spec:
  storageClassName: ""
  claimRef:
    name: foo-pvc
    namespace: foo
  ...
FEATURE STATE: Kubernetes v1.24 [stable]
To request a larger volume for a PVC, edit the PVC object and specify a larger size. This triggers expansion of the volume that backs the underlying PersistentVolume. A new PersistentVolume is never created to satisfy the claim. Instead, an existing volume is resized.
You can only expand a PVC if its storage class’s allowVolumeExpansion field is set to true.
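A hedged sketch of such a class (the disk.csi.azure.com provisioner from the AKS section below is used for illustration; any resize-capable driver works):
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: expandable
provisioner: disk.csi.azure.com   # any provisioner whose driver supports resize
allowVolumeExpansion: true        # required for PVC expansion
reclaimPolicy: Delete
With that in place, editing the PVC’s spec.resources.requests.storage to a larger value triggers the resize.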
Pods access storage by using the claim as a volume.
Claims must exist in the same namespace as the Pod using the claim.
The cluster finds the claim in the Pod’s namespace and uses it to get the PersistentVolume backing the claim.
The volume is then mounted to the host and into the Pod.
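A minimal sketch of a Pod mounting a claim as a volume (reusing the hypothetical claim name data-pvc from the sketch above):
apiVersion: v1
kind: Pod
metadata:
  name: pvc-demo
spec:
  containers:
  - name: app
    image: busybox:stable
    command: ["sh", "-c", "sleep 3600"]
    volumeMounts:
    - name: data
      mountPath: /data            # the bound PV is mounted here
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-pvc         # must be in the Pod's namespace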
FEATURE STATE: Kubernetes v1.18 [stable]
The following volume plugins support raw block volumes, including dynamic provisioning where applicable:
CSI
FC (Fibre Channel)
iSCSI
Local volume
OpenStack Cinder
RBD (Ceph Block Device; deprecated)
VsphereVolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: block-pv
spec:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 5Gi
  local:
    path: /dev/sdb
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: node.local.io/block-storage
          operator: In
          values:
          - local
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  volumeMode: Block
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    limits:
      storage: 5Gi
    requests:
      storage: 5Gi
  storageClassName: local-storage
  volumeMode: Block
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-with-block-volume
spec:
  containers:
  - name: busybox
    image: busybox:stable
    command: ["/bin/sh", "-c"]
    args: ["tail -f /dev/null"]
    volumeDevices:
    - name: data
      devicePath: /dev/xvda
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: block-pvc
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
loop0 7:0 0 10G 0 loop
sda 8:0 0 100G 0 disk
└─sda1 8:1 0 100G 0 part /
sdb 8:16 0 10G 0 disk
$ kubectl get storageclasses.storage.k8s.io local-storage
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
local-storage kubernetes.io/no-provisioner Delete WaitForFirstConsumer false 3d11h
The Container Storage Interface (CSI) is a standard for exposing arbitrary block and file storage systems to containerized workloads on Kubernetes.
By adopting and using CSI, Azure Kubernetes Service (AKS) can write, deploy, and iterate plug-ins to expose new or improve existing storage systems in Kubernetes without having to touch the core Kubernetes code and wait for its release cycles. [6]
A PersistentVolumeClaim requests storage of a particular StorageClass, access mode, and size. The Kubernetes API server can dynamically provision the underlying Azure storage resource if no existing resource can fulfill the claim based on the defined StorageClass.
The CSI storage driver support on AKS allows you to natively use:
Azure Disks can be used to create a Kubernetes DataDisk resource.
Disks can use Azure Premium Storage, backed by high-performance SSDs, or Azure Standard Storage, backed by regular HDDs or Standard SSDs. For most production and development workloads, use Premium Storage.
Azure Disks are mounted as ReadWriteOnce and are only available to one node in AKS. For storage volumes that can be accessed by multiple nodes simultaneously, use Azure Files.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azuredisk-csi-waitforfirstconsumer
provisioner: disk.csi.azure.com
parameters:
  skuname: StandardSSD_LRS
allowVolumeExpansion: true
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
Azure Files can be used to mount an SMB 3.0/3.1 share backed by an Azure storage account to pods.
With Azure Files, you can share data across multiple nodes and pods.
Azure Files can use Azure Standard storage backed by regular HDDs or Azure Premium storage backed by high-performance SSDs.
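For illustration, a hedged sketch of a StorageClass for Azure Files using the file.csi.azure.com provisioner (the class name and SKU choice are assumptions):
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azurefile-csi
provisioner: file.csi.azure.com
parameters:
  skuName: Standard_LRS     # or Premium_LRS for SSD-backed shares
reclaimPolicy: Delete
allowVolumeExpansion: true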
Azure Blob storage can be used to mount Blob storage (or object storage) as a file system into a container or pod.
Using Blob storage enables your cluster to support applications that work with large unstructured datasets like log file data, images or documents, HPC, and others.
Additionally, if you ingest data into Azure Data Lake storage, you can directly mount and use it in AKS without configuring another interim filesystem.
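Similarly, a hedged sketch of a Blob storage class using the blob.csi.azure.com provisioner (the class name, SKU, and protocol choice are assumptions):
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: azureblob-fuse
provisioner: blob.csi.azure.com
parameters:
  skuName: Standard_LRS
  protocol: fuse            # mount via blobfuse; an NFS protocol option also exists
reclaimPolicy: Delete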
Istio addresses the challenges developers and operators face with a distributed or microservices architecture. [1]
Modern applications are typically architected as distributed collections of microservices, with each collection of microservices performing some discrete business function. [1]
A service mesh is a dedicated infrastructure layer that you can add to your applications. It allows you to transparently add capabilities like observability, traffic management, and security, without adding them to your own code.
The term “service mesh” describes both the type of software you use to implement this pattern, and the security or network domain that is created when you use that software.
Istio is an open source service mesh that layers transparently onto existing distributed applications.
Istio’s powerful features provide a uniform and more efficient way to secure, connect, and monitor services. Istio is the path to load balancing, service-to-service authentication, and monitoring – with few or no service code changes. Its powerful control plane brings vital features, including:
Secure service-to-service communication in a cluster with TLS encryption, strong identity-based authentication and authorization
Automatic load balancing for HTTP, gRPC, WebSocket, and TCP traffic
Fine-grained control of traffic behavior with rich routing rules, retries, failovers, and fault injection
A pluggable policy layer and configuration API supporting access controls, rate limits and quotas
Automatic metrics, logs, and traces for all traffic within a cluster, including cluster ingress and egress
Istio’s control plane runs on Kubernetes, and you can add applications deployed in that cluster to your mesh, extend the mesh to other clusters, or even connect VMs or other endpoints running outside of Kubernetes.
An Istio service mesh is logically split into a data plane and a control plane. [2]
The data plane is composed of a set of intelligent proxies (Envoy) deployed as sidecars. These proxies mediate and control all network communication between microservices. They also collect and report telemetry on all mesh traffic.
The control plane manages and configures the proxies to route traffic.
Envoy
Istio uses an extended version of the Envoy proxy. Envoy is a high-performance proxy developed in C++ to mediate all inbound and outbound traffic for all services in the service mesh. Envoy proxies are the only Istio components that interact with data plane traffic.
Istiod
Istiod provides service discovery, configuration and certificate management.
Go to the Istio release page to download the installation file for your OS, or download and extract the latest release automatically (Linux or macOS): [3]
curl -L https://istio.io/downloadIstio | sh -
Move to the Istio package directory. For example, if the package is istio-1.20.3:
cd istio-1.20.3
The installation directory contains:
Sample applications in samples/
The istioctl client binary in the bin/ directory.
Add the istioctl client to your path (Linux or macOS):
export PATH=$PWD/bin:$PATH
For this installation, we use the demo configuration profile. It’s selected to have a good set of defaults for testing, but there are other profiles for production or performance testing.
$ istioctl install --set profile=demo -y
✔ Istio core installed
✔ Istiod installed
✔ Egress gateways installed
✔ Ingress gateways installed
✔ Installation complete
Made this installation the default for injection and validation.
Add a namespace label to instruct Istio to automatically inject Envoy sidecar proxies when you deploy your application later:
$ kubectl label namespace default istio-injection=enabled
namespace/default labeled
Uninstall
The Istio uninstall deletes the RBAC permissions and all resources hierarchically under the istio-system namespace. It is safe to ignore errors for non-existent resources because they may have been deleted hierarchically.
istioctl uninstall -y --purge
The istio-system namespace is not removed by default. If no longer needed, use the following command to remove it:
kubectl delete namespace istio-system
The label to instruct Istio to automatically inject Envoy sidecar proxies is not removed by default. If no longer needed, use the following command to remove it:
kubectl label namespace default istio-injection-
The controller is a central, coordinating process which stores configuration, loads plugins, and renders the various user interfaces for Jenkins. An agent is typically a machine, or container, which connects to a Jenkins controller and executes tasks when directed by the controller. A node is a machine which is part of the Jenkins environment and capable of executing Pipelines or jobs; both the controller and agents are considered to be nodes. An executor is a slot for execution of work defined by a Pipeline or job on a node; a node may have zero or more executors configured, which corresponds to how many concurrent jobs or Pipelines are able to execute on that node. A workspace is a disposable directory on the file system of a node where work can be done by a Pipeline or job. Workspaces are typically left in place after a build or Pipeline run completes, unless specific workspace cleanup policies have been put in place on the Jenkins controller. [1]
A Jenkins controller can operate by itself both managing the build environment and executing the builds with its own executors and resources. If you stick with this "standalone" configuration you will most likely run out of resources when the number or the load of your projects increase.
An agent, where the workload of building projects are delegated to, is a machine set up to offload projects from the controller. The method with which builds are scheduled depends on the configuration given to each project. For example, some projects may be configured to "restrict where this project is run" which ties the project to a specific agent or set of labeled agents. Other projects which omit this configuration will select an agent from the available pool in Jenkins.
In a distributed builds environment, the Jenkins controller will use its resources to only handle HTTP requests and manage the build environment. Actual execution of builds will be delegated to the agents. With this configuration it is possible to horizontally scale an architecture, which allows a single Jenkins installation to host a large number of projects and build environments. [2]
In order for a machine to be recognized as an agent, it needs to run a specific agent program to establish bi-directional communication with the controller.
There are different ways to establish a connection between controller and agent:
The SSH connector: Configuring an agent to use the SSH connector is the preferred and the most stable way to establish controller-agent communication.
The Inbound connector: In this case the communication is established through a connection initiated by an agent program running on the agent machine.
The Inbound-HTTP connector: This approach is quite similar to the Inbound-TCP Java Web Start approach, with the difference in this case being that the agent is executed as headless and the connection can be tunneled via HTTP(s).
Custom-script: It is also possible to create a custom script to initialize the communication between controller and agent if the other solutions do not provide enough flexibility for a specific use-case.
Builds in a distributed builds architecture use nodes, agents, and executors, which are distinct from the Jenkins controller itself. Understanding what each of these components are is useful when managing nodes: [3]
The Jenkins controller is the Jenkins service itself and where Jenkins is installed. It is also a web server that also acts as a "brain" for deciding how, when, and where to run tasks. Management tasks such as configuration, authorization, and authentication are executed on the controller, which serves HTTP requests. Files written when a Pipeline executes are written to the filesystem on the controller, unless they are off-loaded to an artifact repository such as Nexus or Artifactory.
Agents manage the task execution on behalf of the Jenkins controller by using executors. An agent is a small (170KB single jar) Java client process that connects to a Jenkins controller and is assumed to be unreliable. An agent can use any operating system that supports Java. Any tools required for building and testing get installed on the node where the agent runs. Because these tools are a part of the node, they can be installed directly or in a container, such as Docker or Kubernetes. Each agent is effectively a process with its own Process Identifier (PID) on the host machine. In practice, nodes and agents are essentially the same but it is good to remember that they are conceptually distinct.
Nodes are the "machines" on which build agents run. Jenkins monitors each attached node for disk space, free temp space, free swap, clock time/sync, and response time. A node is taken offline if any of these values go outside the configured threshold. Jenkins supports two types of nodes:
agents (described above)
built-in node
The built-in node is a node that exists within the controller process. It is possible to use agents and the built-in node to run tasks. However, running tasks on the built-in node is discouraged for security, performance, and scalability reasons. The number of executors configured for the node determines the node’s ability to run tasks. Set the number of executors to 0 to disable running tasks on the built-in node.
An executor is a slot for the execution of tasks. Effectively, it is a thread in the agent. The number of executors on a node defines the number of concurrent tasks that can run. In other words, this determines the number of concurrent Pipeline stages that can execute at the same time. The correct number of executors per build node must be determined based on the resources available on the node and the resources required for the workload. When determining how many executors to run on a node, consider CPU and memory requirements, as well as the amount of I/O and network activity:
One executor per node is the safest configuration.
One executor per CPU core can work well, if the tasks running are small.
Monitor I/O performance, CPU load, memory usage, and I/O throughput carefully when running multiple executors on a node.
Due to Docker’s fundamental platform and container design, a Docker image for a given application, such as Jenkins, can be run on any supported operating system or cloud service also running Docker. [4]
Open up a terminal window, and create a directory named controller.
mkdir controller
cd controller
Create an environment file named .env and set the project name to jenkins.
echo -n COMPOSE_PROJECT_NAME=jenkins > .env
Create a groovy file named executors.groovy with the following content.
import jenkins.model.*
Jenkins.instance.setNumExecutors(0) // Recommended to not run builds on the built-in node
Create a bridge network for the controller.
docker network create -d bridge jenkins-controller
Create a compose file named compose.yml with the following content.
version: "2.4"
services:
controller:
container_name: jenkins-controller
build:
context: .
dockerfile_inline: |
ARG JENKINS_TAG=2.426.3-jdk21
FROM jenkins/jenkins:$${JENKINS_TAG} (1)
COPY --chown=jenkins:jenkins executors.groovy /usr/share/jenkins/ref/init.groovy.d/executors.groovy (2)
restart: on-failure
ports:
- "8080:8080"
- "50000:50000" (3)
volumes:
- jenkins-home:/var/jenkins_home:rw (4)
networks:
controller:
volumes:
jenkins-home:
name: jenkins-home
networks:
controller:
external: true (5)
name: jenkins-controller
1 | Use the recommended official jenkins/jenkins image from the Docker Hub repository. [4] |
2 | Extend the image and change it to your desired number of executors (recommended 0 executors on the built-in node). [5] |
3 | In order to connect agents through an inbound TCP connection, map the port: -p 50000:50000 . That port will be used when you connect agents to the controller.
If you are only using SSH (outbound) build agents, this port is not required, as connections are established from the controller. If you connect agents using web sockets (since Jenkins 2.217), the TCP agent port is not used either. [5] |
4 | NOTE: Avoid using a bind mount from a folder on the host machine into /var/jenkins_home, as this might result in file permission issues (the user used inside the container might not have rights to the folder on the host machine). If you really need to bind mount jenkins_home, ensure that the directory on the host is accessible by the jenkins user inside the container (jenkins user - uid 1000) or use -u some_other_user parameter with docker run . [5] |
5 | external specifies that this network’s lifecycle is maintained outside of that of the application. |
(Optional) Create a compose file named compose.override.yml with the following content.
Docker Compose lets you merge and override a set of Compose files together to create a composite Compose file. By default, Compose reads two files, a compose.yml and an optional compose.override.yml file. By convention, the compose.yml contains your base configuration. The override file can contain configuration overrides for existing services or entirely new services. [8] |
version: "2.4"
services:
controller:
build:
args:
- JENKINS_TAG=2.426.3-jdk21
environment:
- TZ=Asia/Shanghai
Starting the controller container:
docker compose up -d
Post-installation setup wizard.
Follow this post-installation setup to finish the last steps, and print the initial admin password at the console.
(Optional) Expose Jenkins with a Kubernetes service.
apiVersion: v1
kind: Service
metadata:
  labels:
    app: jenkins
  name: jenkins
spec:
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
      name: ''
  type: ClusterIP
---
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: jenkins-1
  labels:
    kubernetes.io/service-name: jenkins
addressType: IPv4
ports:
  - name: ''
    appProtocol: http
    protocol: TCP
    port: 8080
endpoints:
  - addresses:
      - "192.168.56.130" (1)
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: jenkins.dev.test
  labels:
    app: jenkins
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
spec:
  ingressClassName: "nginx"
  tls: (2)
    - hosts:
        - "*.dev.test"
      secretName: "dev.test"
  rules:
    - host: jenkins.dev.test (2)
      http:
        paths:
          - path: /
            pathType: ImplementationSpecific
            backend:
              service:
                name: jenkins
                port:
                  number: 8080
1 | Replace the IP address with that of the server hosting the Jenkins controller, e.g., 192.168.56.130. |
2 | Replace the TLS and hosts of the Ingress with your settings. |
Generating an SSH key pair.
To generate the SSH key pair, execute a command line tool named ssh-keygen on a machine you have access to. [6]
ssh-keygen -t ed25519 -f ~/.ssh/jenkins_agent_key
Create a Jenkins SSH credential.
Go to your Jenkins dashboard.
Go to Manage Jenkins option in left main menu and click on the Credentials button under the Security.
Select the dropdown option Add Credentials from the (global) item under the Stores scoped to Jenkins.
Fill in the form.
Kind: SSH Username with private key
ID: jenkins
Description: Jenkins SSH private key
Username: jenkins
Private Key: Select Enter directly and press the Add button to insert the content of your private key file at ~/.ssh/jenkins_agent_key.
Passphrase: Fill your passphrase used to generate the SSH key pair (leave empty if you didn’t use one at the previous step) and then press the Create button.
Open up a terminal window, and create a directory named agents.
mkdir agents
cd agents
Create an environment file named .env and set the project name to jenkins-agents.
echo -n COMPOSE_PROJECT_NAME=jenkins-agents > .env
Create a bridge network for the agent.
docker network create -d bridge jenkins-agents
Create a compose file named compose.yml with the following content.
version: "2.4"
services:
agent:
container_name: jenkins-agent
image: jenkins/ssh-agent:alpine-jdk21
restart: on-failure
ports:
- "2200:22"
environment:
- "JENKINS_AGENT_SSH_PUBKEY=[your-public-key]" (1)
# e.g. - "JENKINS_AGENT_SSH_PUBKEY=ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIKBBHLJ+8RuLPO8dO1tm3RAt5kc3HqYwJUYMmRPjhtI3" (1)
volumes:
- agent-home:/home/jenkins/agent:rw (2)
networks:
agents:
volumes:
agent-home:
name: jenkins-agent-home
networks:
agents:
external: true
name: jenkins-agents
1 | The value of JENKINS_AGENT_SSH_PUBKEY MUST include the full contents of your .pub file created above (i.e. ~/.ssh/jenkins_agent_key.pub), including the ssh-XXXX prefix. [6] |
2 | When using the Linux image, you have to set the value of the Remote root directory to /home/jenkins/agent in the agent configuration UI.
When using the Windows image, you have to set the value of the Remote root directory to |
Starting the agent container.
docker compose up -d
Set up the jenkins-agent on Jenkins.
Go to your Jenkins dashboard.
Go to Manage Jenkins option in left main menu.
Go to Nodes item under the System Configuration.
Go to New Node option in top right menu.
Fill the Node name and select the type; (e.g. Name: agent1, Type: Permanent Agent), and then press the Create button.
Now fill the fields.
Remote root directory; (e.g. /home/jenkins/agent)
Labels; (e.g. agent1 )
Usage; (e.g. Use this node as much as possible)
Launch method; (e.g. Launch agents via SSH)
Host; (e.g. localhost or your IP address)
Credentials; (e.g. jenkins)
Host Key verification Strategy (e.g.: Non verifying Verification Strategy. test only, NOT recommended)
See also, Host Key Verification Strategy.
It’s recommended to use the Manually trusted key Verification Strategy, then open the agent configuration page to trust the host key manually.
Expand the Advanced tab, and set the Port to 2200.
Press the Save button and the agent1 will be registered, and be launched by the Controller.
Delegating the first job to agent1.
Go to your Jenkins dashboard
Select New Item on side menu
Enter an item name. (e.g.: First Job to Agent1)
Select the Freestyle project and press OK.
Now select the option Execute shell at Build Steps section.
Add the command echo $NODE_NAME in the Command field of the Execute shell step; the name of the agent will be printed in the log when this job is run.
Press the Save button and then select the option Build Now.
Wait some seconds and then go to Console Output page.
Started by user admin
Running as SYSTEM
Building remotely on agent1 in workspace /home/jenkins/agent/workspace/test
[test] $ /bin/sh -xe /tmp/jenkins5590136104445527177.sh
+ echo agent1
agent1
Finished: SUCCESS
Open up a terminal window, and create a directory named agents/dind:
mkdir -p agents/dind
cd agents/dind
Create an environment file named .env and set the project name to jenkins-agents-dind:
echo -n COMPOSE_PROJECT_NAME=jenkins-agents-dind > .env
Create a bridge network for the agent:
docker network create -d bridge jenkins-agents-dind
Create a compose file named compose.yml with the following content:
version: "2.4"
services:
agent:
container_name: jenkins-agent-dind
# image: qqbuby/jenkins-ssh-dind-agent:5.25.0-jdk21
build:
context: .
dockerfile_inline: |
ARG SSH_AGENET_TAG=jdk21
FROM jenkins/ssh-agent:$${SSH_AGENET_TAG}
ARG DOCKER_CE_CLI_VERSION=5:25.0.1-1~debian.12~bookworm
RUN apt-get update \
&& DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
ca-certificates \
curl \
lsb-release \
&& rm -rf /var/lib/apt/lists/*
RUN curl -fsSLo /usr/share/keyrings/docker-archive-keyring.asc https://download.docker.com/linux/debian/gpg
RUN echo "deb [arch=$(dpkg --print-architecture) \
signed-by=/usr/share/keyrings/docker-archive-keyring.asc] \
https://download.docker.com/linux/debian \
$(lsb_release -cs) stable" > /etc/apt/sources.list.d/docker.list
RUN apt-get update \
&& DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
docker-ce-cli=$${DOCKER_CE_CLI_VERSION} \ (1)
&& rm -rf /var/lib/apt/lists/*
restart: on-failure
ports:
- "2210:22" (2)
environment:
- "JENKINS_AGENT_SSH_PUBKEY=[your-public-key]" (3)
# e.g. - "JENKINS_AGENT_SSH_PUBKEY=ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIKBBHLJ+8RuLPO8dO1tm3RAt5kc3HqYwJUYMmRPjhtI3"
- DOCKER_HOST=tcp://docker:2376
- DOCKER_CERT_PATH=/certs/client
- DOCKER_TLS_VERIFY=1
volumes:
- agent-home:/home/jenkins/agent:rw
- docker-certs:/certs/client:ro
networks:
agents:
depends_on:
- docker
docker:
container_name: jenkins-docker
image: docker:25
restart: on-failure
ports:
- "2376"
privileged: true
environment:
- DOCKER_TLS_CERTDIR=/certs
volumes:
- agent-home:/home/jenkins/agent:rw (4)
- docker-certs:/certs/client:rw
- docker-root:/var/lib/docker:rw
networks:
agents:
aliases:
- docker
volumes:
agent-home:
name: jenkins-agent-home-dind
docker-certs:
name: jenkins-agent-docker-certs
docker-root:
name: jenkins-agent-docker-root
networks:
agents:
external: true
name: jenkins-agents-dind
1 | Extend the jenkins/ssh-agent image to install Docker CLI. |
2 | If your machine already has an SSH server running on port 22, use another port to publish the agent container port 22 (SSH), such as 2210:22 . |
3 | The value of JENKINS_AGENT_SSH_PUBKEY MUST include the full contents of your .pub file created above (i.e. ~/.ssh/jenkins_agent_key.pub), including the ssh-XXXX prefix. [6] |
4 | Share the agent home volume (i.e. agent-home) with the Docker container; otherwise the pipeline will get stuck. |
(Optional) Create a compose file named compose.override.yml with the following content:
version: "2.4"
services:
agent:
build:
args:
- SSH_AGENET_TAG=jdk21
- DOCKER_CE_CLI_VERSION=5:25.0.1-1~debian.12~bookworm
docker:
image: docker:25
# If an insecure registry isn’t marked as insecure,
# docker pull, docker push, and docker search result
# in error messages, prompting the user to either
# secure or pass the --insecure-registry flag to the
# Docker daemon.
# command: ["--insecure-registry=192.168.56.0/24"]
Starting the agent and docker container:
docker compose up -d
Refer to Configuring agents using the SSH connector in Docker (replacing the SSH port 2200 with 2210) to set up the agent on Jenkins, create a Freestyle project using Execute shell with the docker version command, select the option Build Now, and then go to the Console Output page.
Started by user admin
Running as SYSTEM
Building remotely on agent1 in workspace /home/jenkins/agent/workspace/test
[test] $ /bin/sh -xe /tmp/jenkins2069680891022148280.sh
+ docker version
Client: Docker Engine - Community
Version: 25.0.1
API version: 1.44
Go version: go1.21.6
Git commit: 29cf629
Built: Tue Jan 23 23:09:46 2024
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 25.0.1
API version: 1.44 (minimum version 1.24)
Go version: go1.21.6
Git commit: 71fa3ab
Built: Tue Jan 23 23:09:59 2024
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: v1.7.12
GitCommit: 71909c1814c544ac47ab91d2e8b84718e517bb99
runc:
Version: 1.1.11
GitCommit: v1.1.11-0-g4bccb38
docker-init:
Version: 0.19.0
GitCommit: de40ad0
Finished: SUCCESS
Blue Ocean as it stands provides easy-to-use Pipeline visualization. It was intended to be a rethink of the Jenkins user experience, designed from the ground up for Jenkins Pipeline. Blue Ocean was intended to reduce clutter and increase clarity for all users. [9]
Sophisticated visualization of continuous delivery (CD) Pipelines, allowing for fast and intuitive comprehension of your Pipeline’s status.
Pipeline editor makes the creation of Pipelines more approachable, by guiding the user through a visual process to create a Pipeline.
Personalization to suit the role-based needs of each member of the team.
Pinpoint precision when intervention is needed or issues arise. Blue Ocean shows where attention is needed, facilitating exception handling and increasing productivity.
Native integration for branches and pull requests, which enables maximum developer productivity when collaborating on code in GitHub and Bitbucket.
When Jenkins is installed on most platforms, the Blue Ocean plugin and all necessary dependent plugins, which compose the Blue Ocean suite of plugins, are not installed by default.
To install the Blue Ocean suite of plugins on an existing Jenkins instance: [10]
Ensure you are logged in to Jenkins as a user with the Administer permission.
From the Jenkins home page, select Manage Jenkins on the left and then Plugins under the System Configuration.
Select the Available plugins tab and enter blueocean in the Filter text box. This filters the list of plugins based on the name and description.
Select the box to the left of Blue Ocean, and then select either the Install after restart option (recommended) or the Install without restart option at the top right of the page.
It is not necessary to select other plugins in this list. The main Blue Ocean plugin automatically selects and installs all dependent plugins, composing the Blue Ocean suite of plugins. If you select the Install without restart option, you must restart Jenkins to gain full Blue Ocean functionality. |
Once a Jenkins environment has Blue Ocean installed and you are logged in to the Jenkins classic UI, the Blue Ocean UI can be accessed by selecting Open Blue Ocean on the left side of the screen.
Alternatively, access Blue Ocean directly by appending /blue to the end of the Jenkins server’s URL, for example https://jenkins-server-url/blue.
If you need to access these features, select the Go to classic icon at the top of a common section of Blue Ocean’s navigation bar.
Jenkins Pipeline (or simply "Pipeline" with a capital "P") is a suite of plugins which supports implementing and integrating continuous delivery pipelines into Jenkins.
The definition of a Jenkins Pipeline is written into a text file (called a Jenkinsfile) which in turn can be committed to a project’s source control repository. This is the foundation of "Pipeline-as-code": treating the CD pipeline as a part of the application to be versioned and reviewed like any other code. [9]
The following concepts are key aspects of Jenkins Pipeline, which tie in closely to Pipeline syntax.
Pipeline
A Pipeline is a user-defined model of a CD pipeline. A Pipeline’s code defines your entire build process, which typically includes stages for building an application, testing it and then delivering it.
Also, a pipeline block is a key part of Declarative Pipeline syntax.
Node
A node is a machine which is part of the Jenkins environment and is capable of executing a Pipeline.
Also, a node block is a key part of Scripted Pipeline syntax.
Stage
A stage block defines a conceptually distinct subset of tasks performed through the entire Pipeline (e.g. "Build", "Test" and "Deploy" stages), which is used by many plugins to visualize or present Jenkins Pipeline status/progress.
Step
A single task. Fundamentally, a step tells Jenkins what to do at a particular point in time (or "step" in the process). For example, to execute the shell command make, use the sh step: sh 'make'. When a plugin extends the Pipeline DSL, that typically means the plugin has implemented a new step.
For an overview of available steps, please refer to the Pipeline Steps reference which contains a comprehensive list of steps built into Pipeline as well as steps provided by plugins. [12]
A Pipeline can be created in one of the following ways:
Through Blue Ocean - after setting up a Pipeline project in Blue Ocean, the Blue Ocean UI helps you write your Pipeline’s Jenkinsfile and commit it to source control.
Blue Ocean automatically generates an SSH public/private key pair or provides you with an existing pair for the current Jenkins user. This credential is automatically registered in Jenkins with the following details for this Jenkins user:
Through the classic UI - you can enter a basic Pipeline directly in Jenkins through the classic UI.
In SCM - you can write a Jenkinsfile manually, which you can commit to your project’s source control repository.
The Multibranch Pipeline project type enables you to implement different Jenkinsfiles for different branches of the same project. In a Multibranch Pipeline project, Jenkins automatically discovers, manages and executes Pipelines for branches which contain a Jenkinsfile in source control. |
Using a text editor, ideally one which supports Groovy syntax highlighting, create a new Jenkinsfile in the root directory of the project. [11]
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                echo 'Building..'
            }
        }
        stage('Test') {
            steps {
                echo 'Testing..'
            }
        }
        stage('Deploy') {
            steps {
                echo 'Deploying....'
            }
        }
    }
}
The Declarative Pipeline example above contains the minimum necessary structure to implement a continuous delivery pipeline. The agent directive, which is required, instructs Jenkins to allocate an executor and workspace for the Pipeline. Without an agent directive, not only is the Declarative Pipeline not valid, it would not be capable of doing any work! By default the agent directive ensures that the source repository is checked out and made available for steps in the subsequent stages.
The stages and steps directives are also required for a valid Declarative Pipeline, as they instruct Jenkins what to execute and in which stage it should be executed.
Many organizations use Docker to unify their build and test environments across machines, and to provide an efficient mechanism for deploying applications.
To use Docker with Pipeline, install the Docker Pipeline plugin.
Pipeline is designed to easily use Docker images as the execution environment for a single Stage or the entire Pipeline. This means that a user can define the tools required for their Pipeline, without having to manually configure agents. Any tool that can be packaged in a Docker container can be used with ease, by making only minor edits to a Jenkinsfile. [13]
pipeline {
    agent {
        docker { image 'node:20.11.0-alpine3.19' }
    }
    stages {
        stage('Test') {
            steps {
                sh 'id'
                sh 'node --version'
            }
        }
    }
}
When the Pipeline executes, Jenkins will automatically start the specified container and execute the defined steps within:
. . .
[Pipeline] {
[Pipeline] stage
[Pipeline] { (Test)
[Pipeline] sh
+ id
uid=1000(node) gid=1000(node) groups=1000(node)
[Pipeline] sh
+ node --version
v20.11.0
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
. . .
If it is important to keep the workspace synchronized with other stages, use reuseNode true. Otherwise, a dockerized stage can be run on the same agent or any other agent, but in a temporary workspace.
By default, for a containerized stage, Jenkins:
Picks an agent.
Creates a new empty workspace.
Clones pipeline code into it.
Mounts this new workspace into the container.
If you have multiple Jenkins agents, your containerized stage can be started on any of them.
When reuseNode is set to true, no new workspace will be created, and the current workspace from the current agent will be mounted into the container. After this, the container will be started on the same node, so all of the data will be synchronized.
pipeline {
    agent any
    stages {
        stage('Build') {
            agent {
                docker {
                    image 'gradle:8.2.0-jdk17-alpine'
                    // Run the container on the node specified at the
                    // top-level of the Pipeline, in the same workspace,
                    // rather than on a new node entirely:
                    reuseNode true
                }
            }
            steps {
                sh 'gradle --version'
            }
        }
    }
}
Many build tools will download external dependencies and cache them locally for future re-use. Since containers are initially created with "clean" file systems, this can result in slower Pipelines, as they may not take advantage of on-disk caches between subsequent Pipeline runs.
Pipeline supports adding custom arguments that are passed to Docker, allowing users to specify custom Docker Volumes to mount, which can be used for caching data on the agent between Pipeline runs. The following example will cache ~/.m2 between Pipeline runs utilizing the maven container, avoiding the need to re-download dependencies for subsequent Pipeline runs.
pipeline {
    agent {
        docker {
            image 'maven:3.9.3-eclipse-temurin-17'
            args '-v $HOME/.m2:/root/.m2'
        }
    }
    stages {
        stage('Build') {
            steps {
                sh 'mvn -B'
            }
        }
    }
}
It has become increasingly common for code bases to rely on multiple different technologies. For example, a repository might have both a Java-based back-end API implementation and a JavaScript-based front-end implementation. Combining Docker and Pipeline allows a Jenkinsfile to use multiple types of technologies, by combining the agent {} directive with different stages.
pipeline {
    agent none
    stages {
        stage('Back-end') {
            agent {
                docker { image 'maven:3.9.6-eclipse-temurin-17-alpine' }
            }
            steps {
                sh 'mvn --version'
            }
        }
        stage('Front-end') {
            agent {
                docker { image 'node:20.11.0-alpine3.19' }
            }
            steps {
                sh 'node --version'
            }
        }
    }
}
Install Kubernetes CLI plugin.
Using the GUI: From the Jenkins dashboard navigate to Manage Jenkins > Plugins and select the Available tab. Locate this plugin by searching for kubernetes-cli.
Using the CLI tool:
jenkins-plugin-cli --plugins kubernetes-cli:1.12.1
Configure Credentials
The following types of credentials are supported and can be used to authenticate against Kubernetes clusters:
Token, as secrets (Kind: Secret text)(see Plain Credentials plugin)
Plain KubeConfig files (Kind: Secret file) (see Plain Credentials plugin)
Username and Password (see Credentials plugin)
Certificates (see Credentials plugin)
OpenShift OAuth tokens, as secrets (see Kubernetes Credentials plugin)
If the Jenkins Agent is running within a Pod (e.g. by using the Kubernetes plugin), you can fallback to the Pod’s ServiceAccount by not setting any credentials.
Now, let’s create a KubeConfig credential using the Secret file kind. On the Jenkins dashboard, go to Manage Jenkins > Credentials, move the mouse over (global) and select Add credentials. Fill in the fields as below:
Kind: Secret file.
Scope: Global (Jenkins, nodes, items, all child items, etc)
File: Upload your cluster kubeconfig file.
ID: kubernetes-admin.
Description: (optional)
Create a testing Freestyle project job:
Scroll down to the Build Environment section.
Select Configure Kubernetes CLI (kubectl) with multiple credentials.
In the Credential dropdown, select the credentials (e.g., kubernetes-admin) to authenticate on the cluster, or the kubeconfig stored in Jenkins.
Under Build Steps, add an Execute shell step with the kubectl cluster-info command.
Click "Save", and select the option Build Now then go to Console Output page.
Wait a seconds and then go to Console Output page.
Started by user admin
Running as SYSTEM
Building remotely on agent-dind-2 in workspace /home/jenkins/agent/workspace/First Job to K8s
[First Job to K8s] $ /bin/sh -xe /tmp/jenkins17537654207595799867.sh
+ kubectl cluster-info
/tmp/jenkins17537654207595799867.sh: 2: kubectl: not found (1)
Build step 'Execute shell' marked build as failure
[kubernetes-cli] kubectl configuration cleaned up
Finished: FAILURE
(1) To solve the kubectl: not found problem, install the kubectl command-line tool on the agent node. See also Install kubectl on Linux.
Again, click Build Now, and see the log on the Console Output page.
Started by user admin
Running as SYSTEM
Building remotely on agent-dind-2 in workspace /home/jenkins/agent/workspace/First Job to K8s
[First Job to K8s] $ /bin/sh -xe /tmp/jenkins9182137363539535938.sh
+ kubectl cluster-info
Kubernetes control plane is running at https://192.168.56.130:6443
CoreDNS is running at https://192.168.56.130:6443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
[kubernetes-cli] kubectl configuration cleaned up
Finished: SUCCESS
GitLab is a fully featured software development platform that includes, among other powerful features, built-in GitLab CI/CD to leverage the ability to build, test, and deploy your apps without requiring you to integrate with CI/CD external tools. [14]
However, many organizations have been using Jenkins for their deployment processes, and need an integration with Jenkins to be able to onboard to GitLab before switching to GitLab CI/CD. Others have to use Jenkins to build and deploy their applications because of the inability to change the established infrastructure for current projects, but they want to use GitLab for all the other capabilities.
With GitLab’s Jenkins integration, you can effortlessly set up your project to build with Jenkins, and GitLab will output the results for you right from GitLab’s UI.
After you configure a Jenkins integration, Jenkins triggers a build when you push code to your repository or create a merge request in GitLab. The Jenkins pipeline status displays on merge request widgets and on the GitLab project’s home page. [21]
To configure a Jenkins integration with GitLab:
Grant Jenkins access to the GitLab project.
Configure the Jenkins server.
Configure the Jenkins project.
Configure the GitLab project.
Open a terminal, and create a bridge network named gitlab-ce.
docker network create gitlab-ce
Create a compose.yml file.
version: "2.4"
services:
gitlab-ce:
container_name: gitlab-ce
image: gitlab/gitlab-ce:16.5.8-ce.0 # Pin GitLab to a specific Community Edition version
restart: "on-failure:3"
volumes:
- data:/var/opt/gitlab:rw # For storing application data.
- logs:/var/log/gitlab:rw # For storing logs.
- config:/etc/gitlab:rw # For storing the GitLab configuration files.
networks:
gitlab-ce:
volumes:
data:
name: gitlab-ce-data
logs:
name: gitlab-ce-logs
config:
name: gitlab-ce-config
networks:
gitlab-ce:
external: true
name: gitlab-ce
Create a compose.override.yml file.
version: "2.4"
services:
gitlab-ce:
# Pin GitLab to a specific Community Edition version
image: gitlab/gitlab-ce:16.5.8-ce.0
# Use a valid externally-accessible hostname or IP address. Do not use `localhost`.
hostname: 'node-0'
environment:
# If you want to use a different host port than 80 (HTTP), 443 (HTTPS), or 22 (SSH), you
# need to add a separate --publish directive to the docker run command.
GITLAB_OMNIBUS_CONFIG: |
# Add any other gitlab.rb configuration here, each on its own line
external_url 'http://node-0:8929'
gitlab_rails['gitlab_shell_ssh_port'] = 2424
ports:
- '8929:8929'
- '2424:22'
extra_hosts:
- "node-0:192.168.56.130"
Start the gitlab-ce container.
docker compose up -d
The initialization process may take a long time. You can track this process with: [20]
docker logs -f gitlab-ce
After starting the container, you can visit http://node-0:8929. It might take a while before the Docker container starts to respond to queries.
Visit the GitLab URL, and sign in with the username root and the password from the following command:
sudo cat $(docker inspect gitlab-ce-config -f "{{.Mountpoint}}")/initial_root_password
The password file is automatically deleted on the first container restart after 24 hours.
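Alternatively, you can read the password directly from the running container:
docker exec -it gitlab-ce grep 'Password:' /etc/gitlab/initial_root_password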
Sonatype Nexus Repository Manager provides a central platform for storing build artifacts. [15]
Open a terminal, create a .env file, and set the project name to sonatype-nexus.
echo -n COMPOSE_PROJECT_NAME=sonatype-nexus > .env
Create a bridge network named sonatype-nexus.
docker network create -d bridge sonatype-nexus
Create a compose.yml file.
version: "2.4"
services:
nexus:
container_name: sonatype-nexus
user: nexus:nexus
image: sonatype/nexus3:3.64.0
restart: "on-failure:3"
volumes:
- data:/nexus-data:rw
networks:
nexus:
volumes:
data:
name: nexus-data
networks:
nexus:
external: true
name: sonatype-nexus
Create a compose.override.yml file.
version: "2.4"
services:
nexus:
ports:
- "8081:8081"
- "8082:8082" # Using for Docker Registry
# environment:
# NEXUS_CONTEXT: nexus (1)
# INSTALL4J_ADD_VM_PARAMS, passed to the Install4J startup script. Defaults to -Xms2703m -Xmx2703m -XX:MaxDirectMemorySize=2703m -Djava.util.prefs.userRoot=${NEXUS_DATA}/javaprefs.
Start the sonatype-nexus container.
docker compose up -d
Open a browser at http://localhost:8081, click the Sign in button at the top right, fill in the login fields, and then complete the required setup tasks.
Your admin user password is located in /nexus-data/admin.password on the server.
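For example, you can read it from the running container:
docker exec sonatype-nexus cat /nexus-data/admin.password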
A hosted repository using the Docker repository format is typically called a private Docker registry. It can be used to upload your own container images as well as third-party images. It is common practice to create two separate hosted repositories for these purposes. [18]
Go to the Nexus dashboard, and select the gear icon at the top bar, or enter http://localhost:8081/#admin/repository.
Select the Repositories on the left menu to the Manage repositories panel, or enter http://localhost:8081/#admin/repository/repositories.
Click the Create repository button, and select the docker (hosted) recipe, then fill the form.
Name: docker-registry
HTTP: 8082
Click the Create repository button at the bottom.
Log in with Docker, and push/pull images to/from the Nexus registry.
docker login -u admin -p [YOUR ADMIN PASSWORD OF NEXUS] http://localhost:8082
$ docker pull busybox
Using default tag: latest
latest: Pulling from library/busybox
9ad63333ebc9: Pull complete
Digest: sha256:6d9ac9237a84afe1516540f40a0fafdc86859b2141954b4d643af7066d598b74
Status: Downloaded newer image for busybox:latest
docker.io/library/busybox:latest
$ docker tag busybox:latest localhost:8082/busybox
$ docker push localhost:8082/busybox
Using default tag: latest
The push refers to repository [localhost:8082/busybox]
2e112031b4b9: Pushed
latest: digest: sha256:d319b0e3e1745e504544e931cde012fc5470eba649acc8a7b3607402942e5db7 size: 527
$ docker pull localhost:8082/busybox
Using default tag: latest
latest: Pulling from busybox
Digest: sha256:d319b0e3e1745e504544e931cde012fc5470eba649acc8a7b3607402942e5db7
Status: Image is up to date for localhost:8082/busybox:latest
localhost:8082/busybox:latest
Go back to the Browser (e.g. http://localhost:8081/#browse/browse:docker-registry) in the Nexus to check the Repository status.
By default, Docker assumes all registries to be secure, except for local registries. Communicating with an insecure registry isn’t possible if Docker assumes that registry is secure. In order to communicate with an insecure registry, the Docker daemon requires the --insecure-registry flag, given either as a registry host:port or as a CIDR range.
The flag can be used multiple times to allow multiple registries to be marked as insecure. If an insecure registry isn’t marked as insecure, pulls and pushes to it will fail. Local registries, whose IP address falls in the 127.0.0.0/8 range, are automatically marked as insecure as of Docker 1.3.2. It isn’t recommended to rely on this, as it may change in the future.
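A common alternative to the command-line flag is the daemon configuration file. A minimal sketch for /etc/docker/daemon.json, using the Nexus registry address from this guide (adjust it to yours), followed by a daemon restart:
{
  "insecure-registries": ["192.168.56.130:8082"]
}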
A hosted repository for NuGet can be used to upload your own packages as well as third-party packages. The repository manager includes a hosted NuGet repository named nuget-hosted by default. [19]
Go to the Nexus dashboard, sign in, and click the user name at the top right, or enter http://localhost:8081/#user/account.
On the left panel, select the NuGet API Key.
Click Access API Key, authenticate with your credentials, and then click Copy to Clipboard.
Click the gear icon at the top panel, select the Realms on the left panel under the Security.
Select the NuGet API-Key Realm on the left Available tab panel, and transfer it to the right Active tab panel.
Click the Save button at the bottom right.
Push a NuGet package to Nexus.
$ dotnet new classlib -o HelloLib
The template "Class Library" was created successfully.
. . .
$ dotnet pack HelloLib/
$ dotnet nuget push HelloLib/bin/Release/HelloLib.1.0.0.nupkg -k [REPLACE WITH YOUR API KEY] -s http://localhost:8081/repository/nuget-hosted/index.json
warn : You are running the 'push' operation with an 'HTTP' source, 'http://localhost:8081/repository/nuget-hosted/index.json'. Non-HTTPS access will be removed in a future version. Consider migrating to an 'HTTPS' source.
Pushing HelloLib.1.0.0.nupkg to 'http://localhost:8081/repository/nuget-hosted'...
warn : You are running the 'push' operation with an 'HTTP' source, 'http://localhost:8081/repository/nuget-hosted/'. Non-HTTPS access will be removed in a future version. Consider migrating to an 'HTTPS' source.
PUT http://localhost:8081/repository/nuget-hosted/
Created http://localhost:8081/repository/nuget-hosted/ 40ms
Your package was pushed.
You can also create a nuget.config file and add the NuGet source to the project.
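A minimal sketch of such a nuget.config, pointing at the nuget-hosted repository above (the source key nexus is arbitrary):
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <add key="nexus" value="http://localhost:8081/repository/nuget-hosted/index.json" />
  </packageSources>
</configuration>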
Open a terminal, create a working folder if you haven’t already, and enter it.
In the working folder, run the following command to create a demo ASP.NET Core Web project:
dotnet new gitignore
dotnet new globaljson --sdk-version=8.0.101 --roll-forward=latestFeature
dotnet new sln -n jenkins-getting-started
dotnet new web -o src/HelloWorld
dotnet sln add -s src src/HelloWorld/
Create a Dockerfile used to build the Docker image.
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /source
# Copy everything
COPY . ./
# Restore as distinct layers
RUN dotnet restore
# Build and publish a release
RUN dotnet publish -c release -o /app --no-restore
# Build runtime image
FROM mcr.microsoft.com/dotnet/aspnet:8.0
WORKDIR /app
COPY --from=build /app ./
ENTRYPOINT ["dotnet", "HelloWorld.dll"]
Create Jenkinsfile.
pipeline {
environment {
// Explicitly specify the DOTNET_CLI_HOME environment variable to a writable directory, like /tmp:
// See also: https://github.com/dotnet/cli/pull/9327
// https://github.com/dotnet/sdk/blob/main/src/Common/CliFolderPathCalculatorCore.cs#L14
// System.UnauthorizedAccessException: Access to the path '/.dotnet' is denied.
DOTNET_CLI_HOME = '/tmp'
}
agent any
stages {
stage('Build') {
agent {
docker {
image 'mcr.microsoft.com/dotnet/sdk:8.0'
// Run the container on the node specified at the
// top-level of the Pipeline, in the same workspace,
// rather than on a new node entirely:
reuseNode true
}
}
steps {
sh 'dotnet build'
}
}
stage('Test') {
agent {
docker {
image 'mcr.microsoft.com/dotnet/sdk:8.0'
// Run the container on the node specified at the
// top-level of the Pipeline, in the same workspace,
// rather than on a new node entirely:
reuseNode true
}
}
steps {
sh 'dotnet test'
}
}
stage('Deploy') {
agent {
docker {
image 'mcr.microsoft.com/dotnet/sdk:8.0'
// Run the container on the node specified at the
// top-level of the Pipeline, in the same workspace,
// rather than on a new node entirely:
reuseNode true
}
}
steps {
sh 'dotnet publish'
}
}
stage('Docker') {
// Execute the stage on a node pre-configured to accept Docker-based Pipelines
environment {
// Create the Docker Registry credential with ID as `jenkins-docker-registry-creds` on Jenkins.
DOCKER_REGISTRY_CREDS = credentials('jenkins-docker-registry-creds')
// Replace the following variables with your registry.
REGISTRY_SCHEME= 'http'
REGISTRY_HOSTNAME = '192.168.56.130'
REGISTRY_PORT = '8082'
}
steps {
sh 'docker build . -t $REGISTRY_HOSTNAME:$REGISTRY_PORT/hello-world:$BRANCH_NAME'
sh 'docker login -u $DOCKER_REGISTRY_CREDS_USR -p $DOCKER_REGISTRY_CREDS_PSW $REGISTRY_SCHEME://$REGISTRY_HOSTNAME:$REGISTRY_PORT'
sh 'docker push $REGISTRY_HOSTNAME:$REGISTRY_PORT/hello-world:$BRANCH_NAME'
sh 'docker logout $REGISTRY_SCHEME://$REGISTRY_HOSTNAME:$REGISTRY_PORT'
}
}
}
}
The final project structure should be as below.
$ tree
.
├── Dockerfile
├── global.json
├── Jenkinsfile
├── jenkins-getting-started.sln
└── src
└── HelloWorld
├── appsettings.Development.json
├── appsettings.json
├── HelloWorld.csproj
├── Program.cs
└── Properties
└── launchSettings.json
4 directories, 9 files
Build and test the project.
Run the Web application.
$ dotnet run --project src/HelloWorld/
Building...
info: Microsoft.Hosting.Lifetime[14]
Now listening on: http://localhost:5062
info: Microsoft.Hosting.Lifetime[0]
Application started. Press Ctrl+C to shut down.
info: Microsoft.Hosting.Lifetime[0]
Hosting environment: Development
...
Open another terminal, and test the above endpoint.
$ curl -i http://localhost:5062
HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8
Date: Tue, 30 Jan 2024 03:25:20 GMT
Server: Kestrel
Transfer-Encoding: chunked
Hello World!
The following is a sample output on Jenkins.
. . .
+ dotnet build
MSBuild version 17.8.3+195e7f5a3 for .NET
Determining projects to restore...
. . .
+ docker build . -t 192.168.56.130:8082/hello-world:main
DEPRECATED: The legacy builder is deprecated and will be removed in a future release.
Install the buildx component to build images with BuildKit:
https://docs.docker.com/go/buildx/
Sending build context to Docker daemon 1.535MB
. . .
+ docker login -u **** -p **** http://192.168.56.130:8082
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
WARNING! Your password will be stored unencrypted in /home/jenkins/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
[Pipeline] sh
+ docker push 192.168.56.130:8082/hello-world:main
The push refers to repository [192.168.56.130:8082/hello-world]
. . .
+ docker logout http://192.168.56.130:8082
Removing login credentials for 192.168.56.130:8082
. . .
An error message is displayed in the "Manage Jenkins" page: It appears that your reverse proxy setup is broken. [22]
For a reverse proxy to work correctly, it needs to rewrite both the request and the response. Request rewriting involves receiving an inbound HTTP call and then making a forwarding request to Jenkins (sometimes with some HTTP headers modified, sometimes not). Failing to configure the request rewriting is easy to catch, because you just won’t see any pages at all.
But correct reverse proxying also involves one of two options: EITHER
rewrite the response with a "Location" header in the response, which is used during redirects. Jenkins sends Location: http://actual.server:8080/jenkins/foobar and the reverse proxy must rewrite it to Location: http://nice.name/jenkins/foobar. Unfortunately, failing to configure this correctly is harder to catch; OR
set the headers X-Forwarded-Host (and perhaps X-Forwarded-Port) on the forwarded request. Jenkins will parse those headers and generate all the redirects and other links on the basis of those headers. Depending on your reverse proxy, it may be easier to set X-Forwarded-Host and X-Forwarded-Port to the hostname and port in the original Host header respectively, or it may be easier to just pass the original Host header through as X-Forwarded-Host and delete the X-Forwarded-Port header from the request. You will also need to set the X-Forwarded-Proto header if your reverse proxy is changing from https to http or vice-versa.
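If you use Nginx as the reverse proxy, the header-forwarding option might look like the following minimal sketch (assuming Jenkins listens on 127.0.0.1:8080 under the /jenkins prefix; adjust both to your setup):
location /jenkins/ {
    proxy_pass http://127.0.0.1:8080/jenkins/;
    # Forward the original host, port, and scheme so Jenkins can
    # generate correct redirects and absolute links:
    proxy_set_header Host              $host;
    proxy_set_header X-Forwarded-Host  $host;
    proxy_set_header X-Forwarded-Port  $server_port;
    proxy_set_header X-Forwarded-Proto $scheme;
}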
[2] https://www.jenkins.io/doc/book/scaling/architecting-for-scale/
[5] https://github.com/jenkinsci/docker/blob/master/README.md
[8] https://docs.docker.com/compose/multiple-compose-files/merge/
[10] https://www.jenkins.io/doc/book/blueocean/getting-started/
[15] https://www.sonatype.com/products/sonatype-nexus-repository
[18] https://help.sonatype.com/en/hosted-repository-for-docker---private-registry-for-docker-.html
[19] https://help.sonatype.com/en/nuget-hosted-repositories.html
[22] https://www.jenkins.io/doc/book/system-administration/reverse-proxy-configuration-troubleshooting/
TL;DR: The docker0 is the default bridge when starting the Docker daemon.
Container networking refers to the ability for containers to connect to and communicate with each other, or to non-Docker workloads. [1]
Containers have networking enabled by default, and they can make outgoing connections. A container has no information about what kind of network it’s attached to, or whether their peers are also Docker workloads or not. A container only sees a network interface with an IP address, a gateway, a routing table, DNS services, and other networking details. That is, unless the container uses the none network driver.
You can create custom, user-defined networks, and connect multiple containers to the same network. Once connected to a user-defined network, containers can communicate with each other using container IP addresses or container names.
The following example creates a network using the bridge network driver and running a container in the created network:
docker network create -d bridge my-net
docker run --network=my-net -itd --name=container3 busybox
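Containers on a user-defined bridge network can also resolve each other by name; for a quick check (using the container3 started above):
docker run --rm --network=my-net busybox ping -c 1 container3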
The following network drivers are available by default, and provide core networking functionality:
Driver | Description
---|---
bridge | The default network driver.
host | Remove network isolation between the container and the Docker host.
none | Completely isolate a container from the host and other containers.
overlay | Overlay networks connect multiple Docker daemons together.
ipvlan | IPvlan networks provide full control over both IPv4 and IPv6 addressing.
macvlan | Assign a MAC address to a container.
In terms of networking, a bridge network is a Link Layer device which forwards traffic between network segments. A bridge can be a hardware device or a software device running within a host machine’s kernel. [2]
In terms of Docker, a bridge network uses a software bridge which lets containers connected to the same bridge network communicate, while providing isolation from containers that aren’t connected to that bridge network. The Docker bridge driver automatically installs rules in the host machine so that containers on different bridge networks can’t communicate directly with each other.
Bridge networks apply to containers running on the same Docker daemon host. For communication among containers running on different Docker daemon hosts, you can either manage routing at the OS level, or you can use an overlay network.
When you start Docker, a default bridge network (also called bridge) is created automatically, and newly-started containers connect to it unless otherwise specified. You can also create user-defined custom bridge networks.
$ docker run --rm -it qqbuby/net-tools:1.0 hostname -i
172.17.0.2
$ docker run --rm -it qqbuby/net-tools:1.0 ip r
default via 172.17.0.1 dev eth0
172.17.0.0/16 dev eth0 proto kernel scope link src 172.17.0.2
$ docker network ls
NETWORK ID NAME DRIVER SCOPE
78ec0b2e6034 bridge bridge local
52eb0b3c3639 host host local
97378c7bca5f none null local
To configure the default bridge network, specify options in daemon.json. Here is an example daemon.json with several options specified. Only specify the settings you need to customize. [3]
{
"bip": "192.168.1.1/24",
"fixed-cidr": "192.168.1.0/25",
"fixed-cidr-v6": "2001:db8::/64",
"mtu": 1500,
"default-gateway": "192.168.1.254",
"default-gateway-v6": "2001:db8:abcd::89",
"dns": ["10.20.1.2","10.20.1.3"]
}
Restart Docker for the changes to take effect.
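For example, on a systemd-based host:
sudo systemctl restart docker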
$ ip a show docker0
11: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:67:64:cd:53 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.1/24 brd 192.168.1.255 scope global docker0
valid_lft forever preferred_lft forever
inet6 fe80::42:67ff:fe64:cd53/64 scope link
valid_lft forever preferred_lft forever
$ docker inspect bridge
[
{
"Name": "bridge",
"Id": "335bd5ba267bde54a9b125270c4a010d0031ece7e75f43addf70df04571290b1",
"Created": "2024-01-25T12:54:54.930262205+08:00",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "192.168.1.0/24",
"IPRange": "192.168.1.0/25",
"Gateway": "192.168.1.1",
"AuxiliaryAddresses": {
"DefaultGatewayIPv4": "192.168.1.254"
}
},
{
"Subnet": "2001:db8::/64",
"AuxiliaryAddresses": {
"DefaultGatewayIPv6": "2001:db8:abcd::89"
}
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {},
"Options": {
"com.docker.network.bridge.default_bridge": "true",
"com.docker.network.bridge.enable_icc": "true",
"com.docker.network.bridge.enable_ip_masquerade": "true",
"com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
"com.docker.network.bridge.name": "docker0",
"com.docker.network.driver.mtu": "1500"
},
"Labels": {}
}
]
$ docker run --rm -it qqbuby/net-tools:1.0 hostname -i
192.168.1.2
$ docker run --rm -it qqbuby/net-tools:1.0 ip r
default via 192.168.1.254 dev eth0
192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.2
Use the docker network create command to create a user-defined bridge network.
docker network create my-net
You can specify the subnet, the IP address range, the gateway, and other options. See the docker network create reference or the output of docker network create --help for details.
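For example, a sketch that picks an explicit subnet, IP range, and gateway (all values here are arbitrary):
docker network create --subnet 172.28.0.0/16 --ip-range 172.28.5.0/24 --gateway 172.28.5.254 my-custom-net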
$ docker network create my-net
6fae1652e77ea4aa5452ce8f7321005dec3cbdfd5480bd6ad2caf92ae2646f85
$ docker network ls
NETWORK ID NAME DRIVER SCOPE
449f407d5f92 bridge bridge local
e25b7a4625b8 host host local
6fae1652e77e my-net bridge local
8c9b745f69e5 none null local
$ docker inspect my-net
[
{
"Name": "my-net",
"Id": "6fae1652e77ea4aa5452ce8f7321005dec3cbdfd5480bd6ad2caf92ae2646f85",
"Created": "2024-01-25T14:05:37.630914427+08:00",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": {},
"Config": [
{
"Subnet": "172.17.0.0/16",
"Gateway": "172.17.0.1"
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {},
"Options": {},
"Labels": {}
}
]
The IP range of the network my-net is still 172.17.0.0/16.
To configure the default address range, specify options in daemon.json.
{
"bip": "192.168.1.1/24",
"fixed-cidr": "192.168.1.0/25",
"fixed-cidr-v6": "2001:db8::/64",
"mtu": 1500,
"default-gateway": "192.168.1.254",
"default-gateway-v6": "2001:db8:abcd::89",
"dns": ["10.20.1.2","10.20.1.3"],
"default-address-pools": [
{
"base": "10.201.0.0/16",
"size": 24
},
{
"base": "10.202.0.0/16",
"size": 24
}
]
}
Restart Docker for the changes to take effect.
$ docker info
...
Default Address Pools:
Base: 10.201.0.0/16, Size: 24
Base: 10.202.0.0/16, Size: 24
$ docker network create my-net2
c77a9f13ba7732575a3d99d5bfde8852ee5c6827a3cad7d7f268be306394856e
$ docker inspect my-net2
[
{
"Name": "my-net2",
"Id": "c77a9f13ba7732575a3d99d5bfde8852ee5c6827a3cad7d7f268be306394856e",
"Created": "2024-01-25T14:10:14.285419243+08:00",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": {},
"Config": [
{
"Subnet": "10.201.0.0/24",
"Gateway": "10.201.0.1"
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {},
"Options": {},
"Labels": {}
}
]
$ docker run --rm -it --network my-net2 qqbuby/net-tools:1.0 hostname -i
10.201.0.2
$ docker run --rm -it --network my-net2 qqbuby/net-tools:1.0 ip r
default via 10.201.0.1 dev eth0
10.201.0.0/24 dev eth0 proto kernel scope link src 10.201.0.2
The IP range of the network my-net2 is 10.201.0.0/24, allocated from the configured default address pools.
Remove the user-defined network.
$ docker network rm my-net my-net2
my-net
my-net2
The default bridge network can’t be removed with the docker network rm command.
Stop Docker, delete the docker0 interface, and clean the network config files.
$ sudo systemctl stop docker.service docker.socket
$ sudo ip link delete docker0
$ sudo rm /var/lib/docker/network/files/local-kv.db
This will reset all the networking in the current Docker instance.
Start Docker and check the changes.
$ sudo systemctl start docker.service
$ ip a show docker0
24: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:e6:78:3a:e3 brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
Apache Kafka Raft (KRaft, pronounced craft) is the consensus protocol that was introduced in KIP-500 to remove Apache Kafka’s dependency on ZooKeeper for metadata management.
KRaft mode makes use of a new quorum controller service in Kafka which replaces the previous controller and makes use of an event-based variant of the Raft consensus protocol. [1]
KRaft mode is production ready for new clusters as of Apache Kafka 3.3. The development progress for additional features like migration from ZooKeeper is tracked in KIP-833.
The KRaft controller nodes comprise a Raft quorum which manages the Kafka metadata log. This log contains information about each change to the cluster metadata. Everything that is currently stored in ZooKeeper, such as topics, partitions, ISRs, configurations, and so on, is stored in this log. [2]
A Kafka cluster can be broken down into two components: a control plane and a data plane, each with its own responsibilities that work together to transfer data where it needs to go. [3]
Control plane responsibilities include:
Knowing which servers are alive.
Making appropriate changes when a server is detected as down.
Storing and exchanging metadata.
Data plane responsibilities include:
Handling requests to produce and fetch records and other application requests.
Reacting to metadata changes from the control plane.
Historically, Kafka used an Apache ZooKeeper cluster to provide most of its control plane functionality. ZooKeeper tracks each broker and provides replicated and consistent storage for the cluster metadata. ZooKeeper also elects one Kafka broker to be the controller. The controller has extra, non data plane duties to manage the state of the cluster, such as responding to brokers that crash or restart.
The new architecture removes the ZooKeeper dependency and replaces it with a flavor of the Raft consensus protocol, allowing each server in the Kafka cluster to take the role of broker, controller, or both. The controller cluster will perform the same roles as the cluster of ZooKeeper nodes did previously, but the Kafka controller will now be elected from the controllers instead of the brokers.
For a Kafka cluster to be highly available, you need to make certain both the data plane and control plane (whichever kind is being used) are highly available.
Your local environment must have Java 8+ installed.
Go to https://kafka.apache.org/, download the latest Kafka:
$ curl -LO https://dlcdn.apache.org/kafka/3.6.1/kafka_2.13-3.6.1.tgz
Create a kafka user and extract the tarball to its home directory:
$ sudo useradd -m kafka # [-s /bin/bash] Specify the login shell of the new account.
$ sudo su - kafka
$ sudo tar xf kafka_2.13-3.6.1.tgz -C /home/kafka/ --strip-components=1
Running Kafka as root is not a recommended configuration.
Generate a Cluster UUID:
$ KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"
Format Log Directories:
$ bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c config/kraft/server.properties
Formatting /tmp/kraft-combined-logs with metadata.version 3.6-IV2.
Start the Kafka Server:
$ bin/kafka-server-start.sh config/kraft/server.properties
...
[2024-01-12 23:22:34,872] INFO [SocketServer listenerType=CONTROLLER, nodeId=1] Enabling request processing. (kafka.network.SocketServer)
[2024-01-12 23:22:34,881] INFO [MetadataLoader id=1] InitializeNewPublishers: initializing ScramPublisher controller id=1 with a snapshot at offset 4 (org.apache.kafka.image.loader.MetadataLoader)
[2024-01-12 23:22:34,911] INFO Awaiting socket connections on 0.0.0.0:9093. (kafka.network.DataPlaneAcceptor)
...
[2024-01-12 23:22:36,629] INFO [SocketServer listenerType=BROKER, nodeId=1] Enabling request processing. (kafka.network.SocketServer)
[2024-01-12 23:22:36,629] INFO Awaiting socket connections on 0.0.0.0:9092. (kafka.network.DataPlaneAcceptor)
...
The application logs (not to be confused with the commit log) are located in the logs/ directory, as configured in config/log4j.properties.
Once the Kafka server has successfully launched:
Open another terminal session and create a topic:
$ bin/kafka-topics.sh --create --topic quickstart-events --bootstrap-server localhost:9092
Created topic quickstart-events.
$ bin/kafka-topics.sh --describe --topic quickstart-events --bootstrap-server localhost:9092
Topic: quickstart-events TopicId: wx6vplZjRHaJubPnPP3_QQ PartitionCount: 1 ReplicationFactor: 1 Configs: segment.bytes=1073741824
Topic: quickstart-events Partition: 0 Leader: 1 Replicas: 1 Isr: 1
Run the console producer client to write a few events into your topic:
$ bin/kafka-console-producer.sh --topic quickstart-events --bootstrap-server localhost:9092
This is my first event
This is my second event
Open another terminal session and run the console consumer client to read the events you just created:
$ bin/kafka-console-consumer.sh --topic quickstart-events --from-beginning --bootstrap-server localhost:9092
This is my first event
This is my second event
Make sure the nodes in the cluster can reach each other. You can use hostnames, DNS names, or even IP addresses to connect them.
You can run ip a s to show the addresses assigned to all network interfaces.
The following steps will be demonstrated with the following two nodes (/etc/hosts):
192.168.46.131 node-1
192.168.46.132 node-2
Create a kafka user and extract the tarball to its home directory on each node:
$ sudo useradd -m kafka # [-s /bin/bash] Specify the login shell of the new account.
$ sudo su - kafka
$ sudo tar xf kafka_2.13-3.6.1.tgz -C /home/kafka/ --strip-components=1
Running Kafka as root is not a recommended configuration.
Generate a Cluster UUID:
$ KAFKA_CLUSTER_ID="$(bin/kafka-storage.sh random-uuid)"
$ echo $KAFKA_CLUSTER_ID
MkU3OEVBNTcwNTJENDM2Qk
Note down the value of KAFKA_CLUSTER_ID and copy it to each node in /etc/profile.d/kafka.sh with the following content:
export KAFKA_CLUSTER_ID=MkU3OEVBNTcwNTJENDM2Qk
Load the environment variables to the current shell with the following command:
$ source /etc/profile
Back up the original config directory on each node:
$ cp -a config config.org
Create the log.dirs directories with the following commands on each node:
$ sudo mkdir -p /var/lib/kafka
$ sudo chown kafka:kafka /var/lib/kafka
Update the config/kraft/controller.properties:
# The node id associated with this instance's roles
# !!! on the second node, set the node.id to be 3002.
node.id=3001
# The connect string for the controller quorum
controller.quorum.voters=3001@node-1:9093,3002@node-2:9093
# Use to specify where the metadata log for clusters in KRaft mode is placed.
log.dirs=/var/lib/kafka/controller
Each node ID (node.id) must be unique across all the servers in a particular cluster.
Update the config/kraft/broker.properties:
# The node id associated with this instance's roles
# !!! on the second node, set the node.id to be 1002.
node.id=1001
# The connect string for the controller quorum
controller.quorum.voters=3001@node-1:9093,3002@node-2:9093
# The address the socket server listens on.
listeners=PLAINTEXT://:9092
# Listener name, hostname and port the broker will advertise to clients.
# !!! on the second node, set it to be `PLAINTEXT://node-2:9092`.
advertised.listeners=PLAINTEXT://node-1:9092
# The directory in which the log data is kept.
log.dirs=/var/lib/kafka/data
Each node ID (node.id) must be unique across all the servers in a particular cluster.
The advertised.listeners should be reachable by clients outside the cluster. You can set it to a reachable hostname or DNS name, or an external IP address. [7]
Format Log Directories:
$ bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c config/kraft/controller.properties
Formatting /var/lib/kafka/controller with metadata.version 3.6-IV2.
$ bin/kafka-storage.sh format -t $KAFKA_CLUSTER_ID -c config/kraft/broker.properties
Formatting /var/lib/kafka/data with metadata.version 3.6-IV2.
Start the Kafka Controller and Broker on each node:
$ bin/kafka-server-start.sh -daemon config/kraft/controller.properties
$ bin/kafka-server-start.sh -daemon config/kraft/broker.properties
Note that authentication is disabled for JMX by default in Kafka, and security configs must be overridden for production deployments.
Use the kafka-metadata-quorum tool to query the metadata quorum status.
The following code example displays a summary of the metadata quorum:
$ bin/kafka-metadata-quorum.sh --bootstrap-server node-1:9092 describe --status
ClusterId: MkU3OEVBNTcwNTJENDM2Qg
LeaderId: 3002
LeaderEpoch: 83
HighWatermark: 779
MaxFollowerLag: 0
MaxFollowerLagTimeMs: 408
CurrentVoters: [3001,3002]
CurrentObservers: [1001,1002]
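As a quick smoke test of the two-node cluster (a sketch; the topic name smoke-test is arbitrary), create a replicated topic and describe it:
$ bin/kafka-topics.sh --create --topic smoke-test --partitions 2 --replication-factor 2 --bootstrap-server node-1:9092,node-2:9092
$ bin/kafka-topics.sh --describe --topic smoke-test --bootstrap-server node-1:9092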
Schema Registry provides a centralized repository for managing and validating schemas for topic message data, and for serialization and deserialization of the data over the network. [8] [9]
The Schema Registry is not part of Apache Kafka but there are several open source options to choose from. Here we use the Confluent Schema Registry for this example. [10]
Schema Registry lives outside of and separately from your Kafka brokers. Your producers and consumers still talk to Kafka to publish and read data (messages) to topics. Concurrently, they can also talk to Schema Registry to send and retrieve schemas that describe the data models for the messages. [14]
Schema Registry is a distributed storage layer for schemas which uses Kafka as its underlying storage mechanism. Some key design decisions:
Assigns globally unique ID to each registered schema. Allocated IDs are guaranteed to be monotonically increasing and unique, but not necessarily consecutive.
Kafka provides the durable backend, and functions as a write-ahead changelog for the state of Schema Registry and the schemas it contains.
Schema Registry is designed to be distributed, with single-primary architecture, and ZooKeeper/Kafka coordinates primary election (based on the configuration).
Download Confluent Platform using only Confluent Community components by using the curl command:
$ curl -O https://packages.confluent.io/archive/7.5/confluent-community-7.5.3.tar.gz
Extract the contents of the archive to /home/kafka/confluent:
$ mkdir /home/kafka/confluent
$ tar xf confluent-community-7.5.3.tar.gz -C /home/kafka/confluent/ --strip-components=1
$ cd /home/kafka/confluent
$ cp -a etc/ etc.org
Navigate to the Schema Registry properties file (etc/schema-registry/schema-registry.properties) and specify or update the following properties:
# Specify the address the socket server listens on, e.g. listeners = PLAINTEXT://your.host.name:9092
listeners=http://0.0.0.0:8081
# The advertised host name. Make sure to set this if running Schema Registry with multiple nodes.
host.name=node-1
# List of Kafka brokers to connect to, e.g. PLAINTEXT://hostname:9092,SSL://hostname2:9092
kafkastore.bootstrap.servers=PLAINTEXT://node-1:9092,PLAINTEXT://node-2:9092
Schema Registry on Confluent Platform can be deployed using a single primary source, with either Kafka or ZooKeeper leader election. You can also set up multiple Schema Registry servers for high availability deployments, where you switch to a secondary Schema Registry cluster if the primary goes down, and for data migration, one time or as a continuous feed. [13]
Start Schema Registry. Run this command in its own terminal:
$ bin/schema-registry-start -daemon etc/schema-registry/schema-registry.properties
View the runtime logs of Schema Registry:
$ tail -f logs/schema-registry.log
[2024-01-13 01:58:05,916] INFO DefaultSessionIdManager workerName=node0 (org.eclipse.jetty.server.session)
[2024-01-13 01:58:05,916] INFO No SessionScavenger set, using defaults (org.eclipse.jetty.server.session)
[2024-01-13 01:58:05,918] INFO node0 Scavenging every 600000ms (org.eclipse.jetty.server.session)
[2024-01-13 01:58:06,798] INFO HV000001: Hibernate Validator 6.1.7.Final (org.hibernate.validator.internal.util.Version)
[2024-01-13 01:58:07,291] INFO Started o.e.j.s.ServletContextHandler@53a84ff4{/,null,AVAILABLE} (org.eclipse.jetty.server.handler.ContextHandler)
[2024-01-13 01:58:07,319] INFO Started o.e.j.s.ServletContextHandler@5807efad{/ws,null,AVAILABLE} (org.eclipse.jetty.server.handler.ContextHandler)
[2024-01-13 01:58:07,349] INFO Started NetworkTrafficServerConnector@65a15628{HTTP/1.1, (http/1.1, h2c)}{0.0.0.0:8081} (org.eclipse.jetty.server.AbstractConnector)
[2024-01-13 01:58:07,354] INFO Started @9485ms (org.eclipse.jetty.server.Server)
[2024-01-13 01:58:07,355] INFO Schema Registry version: 7.5.3 commitId: 03b675da443c5687684ecae6736d873560f7c441 (io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain)
[2024-01-13 01:58:07,356] INFO Server started, listening for requests... (io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain)
Show the _schemas topic information:
$ bin/kafka-topics.sh --describe --topic _schemas --bootstrap-server node-1:9092
Topic: _schemas TopicId: 9A_-36hMRYuTfUyhQwMm6Q PartitionCount: 1 ReplicationFactor: 2 Configs: cleanup.policy=compact,segment.bytes=1073741824
Topic: _schemas Partition: 0 Leader: 1001 Replicas: 1001,1002 Isr: 1001,1002
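To verify the registry end to end, you can register a trivial schema and list the subjects through its REST API (a minimal sketch; the subject name quickstart-value is arbitrary):
$ curl -s -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" --data '{"schema": "{\"type\": \"string\"}"}' http://node-1:8081/subjects/quickstart-value/versions
$ curl -s http://node-1:8081/subjects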
UI for Apache Kafka is a free, open-source web UI to monitor and manage Apache Kafka clusters. [15]
$ docker run -it -p 8080:8080 -e DYNAMIC_CONFIG_ENABLED=true provectuslabs/kafka-ui
Make sure the nodes in the cluster can reach each other. You can use hostnames, DNS names, or external IP addresses to connect them.
You can run ip a s to show the addresses assigned to all network interfaces.
The following steps will be demonstrated with the following two nodes:
192.168.56.131 node-1
192.168.56.132 node-2
Optional: Install Docker Engine
See https://docs.docker.com/engine/install/ to install Docker Engine.
You might need to configure the Docker daemon to use a different data directory (by default: /var/lib/docker on Linux) and the log driver options.
Optional: Stop and disable the firewalld.service.
View the current status:
sudo firewall-cmd --state
Stop the FirewallD service:
sudo systemctl stop firewalld.service
List the rules:
$ sudo iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
Disable the FirewallD service
sudo systemctl disable firewalld.service
Optional: Generate a Cluster UUID:
$ KAFKA_CLUSTER_ID="$(docker run --rm confluentinc/cp-kafka:7.5.3 kafka-storage random-uuid)"
$ echo $KAFKA_CLUSTER_ID
MkU3OEVBNTcwNTJENDM2Qg
Copy the docker/ directory to all the nodes in the Kafka cluster:
Start the controllers:
On node-1:
Update the compose.override.yml in docker/controller/compose.override.yml:
version: "2.4"
services:
controller:
environment:
KAFKA_NODE_ID: 3001
KAFKA_CONTROLLER_QUORUM_VOTERS: '3001@node-1:9093,3002@node-2:9093'
CLUSTER_ID: 'MkU3OEVBNTcwNTJENDM2Qg'
extra_hosts:
- "node-1:192.168.56.131"
- "node-2:192.168.56.132"
Update the CLUSTER_ID with the KAFKA_CLUSTER_ID that was generated in the step above.
Each node ID (KAFKA_NODE_ID) must be unique across all the nodes in a particular cluster.
Start the Kraft controller:
cd docker/controller
docker compose up -d
On node-2:
Repeat the above steps and update the KAFKA_NODE_ID with 3002.
Start the brokers:
On node-1:
Update the compose.override.yml in docker/broker/compose.override.yml:
version: "2.4"
services:
broker:
environment:
KAFKA_NODE_ID: 1001
KAFKA_ADVERTISED_LISTENERS: 'PLAINTEXT://node-1:9092'
KAFKA_CONTROLLER_QUORUM_VOTERS: '3001@node-1:9093,3002@node-2:9093'
CLUSTER_ID: 'MkU3OEVBNTcwNTJENDM2Qg'
extra_hosts:
- "node-1:192.168.56.131"
- "node-2:192.168.56.132"
Update the CLUSTER_ID with the KAFKA_CLUSTER_ID that was generated at step 2.
Each node ID (KAFKA_NODE_ID) must be unique across all the nodes in a particular cluster.
The KAFKA_ADVERTISED_LISTENERS should be reachable by clients outside the cluster. You can set it to a reachable hostname or DNS name, or an external IP address.
Start the broker:
cd docker/broker
docker compose up -d
Use kcat to display the current state of the Kafka cluster and its topics, partitions, replicas, and in-sync replicas (ISR).
$ docker run --rm --add-host node-1:192.168.56.131 confluentinc/cp-kcat:7.5.3 -b node-1:9092 -L
Metadata for all topics (from broker -1: node-1:9092/bootstrap):
1 brokers:
broker 1001 at node-1:9092 (controller)
0 topics:
Use the kafka-metadata-quorum tool to view the metadata quorum status.
$ docker run --rm --add-host node-1:192.168.56.131 confluentinc/cp-kafka:7.5.3 kafka-metadata-quorum --bootstrap-server node-1:9092 describe --status
ClusterId: MkU3OEVBNTcwNTJENDM2Qg
LeaderId: 3002
LeaderEpoch: 28
HighWatermark: 47816
MaxFollowerLag: 0
MaxFollowerLagTimeMs: 32
CurrentVoters: [3001,3002]
CurrentObservers: [1001]
On node-2:
Repeat the above steps and update the KAFKA_NODE_ID with 1002, and KAFKA_ADVERTISED_LISTENERS with 'PLAINTEXT://node-2:9092'.
Use kcat to display the current state of the Kafka cluster and its topics, partitions, replicas, and in-sync replicas (ISR).
$ docker run --rm --add-host node-2:192.168.56.132 confluentinc/cp-kcat:7.5.3 -b node-2:9092 -L
Metadata for all topics (from broker 1002: node-2:9092/1002):
2 brokers:
broker 1001 at node-2:9092
broker 1002 at node-2:9092 (controller)
0 topics:
Use the kafka-metadata-quorum tool to view the metadata quorum status.
$ docker run --rm --add-host node-2:192.168.56.132 confluentinc/cp-kafka:7.5.3 kafka-metadata-quorum --bootstrap-server node-2:9092 describe --status
ClusterId: MkU3OEVBNTcwNTJENDM2Qg
LeaderId: 3002
LeaderEpoch: 28
HighWatermark: 47816
MaxFollowerLag: 0
MaxFollowerLagTimeMs: 32
CurrentVoters: [3001,3002]
CurrentObservers: [1001,1002]
Start the Schema Registry:
On node-1:
Update the compose.override.yml in docker/schema-registry/compose.override.yml:
version: "2.4"
services:
schema-registry:
environment:
SCHEMA_REGISTRY_HOST_NAME: node-1
SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: node-1:9092,node-2:9092
SCHEMA_REGISTRY_DEBUG: true
extra_hosts:
- "node-1:192.168.56.131"
- "node-2:192.168.56.132"
Start the Schema Registry:
cd docker/schema-registry
docker compose up -d
On node-2:
Repeat the above steps, and replace the SCHEMA_REGISTRY_HOST_NAME with node-2 to set up a replica if you need a highly available service.
Start the Kafka UI:
On node-1:
Update the compose.override.yml in docker/kafka-ui/compose.override.yml:
version: "2.4"
services:
kafka-ui:
environment:
KAFKA_CLUSTERS_0_NAME: iot
KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: node-1:9092,node-2:9092
extra_hosts:
- "node-1:192.168.56.131"
- "node-2:192.168.56.132"
Start the kafka-ui:
cd docker/kafka-ui
docker compose up -d
Go to http://node-1:8080 with your browser to view the cluster status.
On node-2:
Repeat the above steps to set up a replica of the kafka-ui if you need a highly available service.
Go to http://node-2:8080 with your browser to view the cluster status.
[2] https://docs.confluent.io/platform/current/kafka-metadata/kraft.html
[3] https://www.redhat.com/en/resources/high-availability-for-apache-kafka-detail
[6] https://access.redhat.com/documentation/en-us/red_hat_amq_streams/2.5/html/using_amq_streams_on_rhel/monitoring-str
[7] https://www.confluent.io/blog/kafka-listeners-explained/
[8] https://docs.confluent.io/platform/current/schema-registry/index.html
[9] https://www.conduktor.io/blog/what-is-the-schema-registry-and-why-do-you-need-to-use-it/
[10] "20170707-EB-Confluent_Kafka_Definitive-Guide_Complete", https://www.confluent.io/resources/kafka-the-definitive-guide/
[11] https://docs.confluent.io/platform/current/installation/installing_cp/zip-tar.html
[12] https://docs.confluent.io/platform/current/schema-registry/installation/deployment.html
[13] https://docs.confluent.io/platform/current/schema-registry/multidc.html
[14] https://docs.confluent.io/platform/current/schema-registry/fundamentals/index.html
[15] https://docs.kafka-ui.provectus.io/overview/getting-started
[16] https://docs.confluent.io/platform/current/schema-registry/fundamentals/serdes-develop/index.html
[19] https://docs.confluent.io/platform/current/installation/docker/installation.html
How do you produce and consume events on Kafka with Spring Cloud Stream?
Apache Kafka is a highly popular option for data streaming. Spring Cloud Stream is a framework built upon Spring Boot for building message-driven microservice applications and it provides built-in capabilities to work with Apache Kafka as the underlying message broker.
Here’s a basic guide on how to produce and consume events on Kafka using Spring Cloud Stream:
Set Up Kafka: First you need to have a working Kafka instance. You can either install it on your machine or run it in a docker container.
Create Spring Boot Application: You can create a Spring Boot application using Spring Initializr.
Add Required Dependencies:
You need to have the following dependencies in your pom.xml or build.gradle file.
For Maven:
<dependencies>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-starter-stream-kafka</artifactId>
</dependency>
<dependency>
<groupId>org.springframework.cloud</groupId>
<artifactId>spring-cloud-stream</artifactId>
</dependency>
</dependencies>
For Gradle:
dependencies {
implementation 'org.springframework.cloud:spring-cloud-starter-stream-kafka'
implementation 'org.springframework.cloud:spring-cloud-stream'
}
Configure the Application:
In your application.properties or application.yml file, set the Kafka broker address, the name of the topic, and other parameters if necessary.
spring:
cloud:
stream:
bindings:
output:
destination: topic-name
binder: kafka
kafka:
binder:
brokers: localhost:9092
Create a Source:
A source is a channel through which data flows out to the Kafka broker. Annotate an output channel with @Output.
public interface ProducerBinding {
String OUTPUT = "output";
@Output(OUTPUT)
MessageChannel output();
}
Publish Messages:
Inject the source in your service, create a message, and send it using output().send(MessageBuilder.withPayload(yourMessage).build()).
@Autowired
private ProducerBinding source;
public void publish(String message){
source.output().send(MessageBuilder.withPayload(message).build());
}
Create a Sink:
A sink is a channel through which data flows into the application from Kafka. Annotate an input channel with @Input.
public interface ConsumerBinding {
String INPUT = "input";
@Input(INPUT)
SubscribableChannel input();
}
Consume Messages:
Create a consumer class where you listen to the messages. Annotate a method with @StreamListener to receive messages.
@Service
public class ConsumerService {
@StreamListener(ConsumerBinding.INPUT)
public void handle(String value) {
// Process the message
System.out.println("Received: " + value);
}
}
Run the Application: Now, you can run your Spring Boot application. It should be able to send and receive the messages from Kafka.
Remember, this is just a basic guide that covers the core concepts. There are more configurations and details that you can explore as per your use case. Make sure to refer to the official documentation for both Spring Cloud Stream and Apache Kafka.
What are bindings and binders?
Both “bindings” and “binder” are terms used within the Spring Cloud Stream (SCS) framework, and they’re central to understanding how Spring Cloud Stream operates.
Bindings: In Spring Cloud Stream, a binding is a bridge between your application’s message channels and the message broker (like Apache Kafka or RabbitMQ). They form the core abstraction in Spring Cloud Stream programming model.
The concept of ‘binding’ is about connecting your application with the messaging system via a Message Channel (a key abstraction in Spring Integration, where ‘input’ or ‘output’ represent an inbound or outbound entity).
This wiring is auto-created by Spring Cloud Stream. In simple terms, SCS provides connectivity between your application and messaging middleware, and this connectivity is facilitated by ‘bindings’.
For example, if you define a binding named “output” as in the previous example, Spring Cloud Stream will know that a channel named “output” should be used to dispatch the messages to the messaging middleware.
Binder: The Binder is the component of Spring Cloud Stream responsible for providing the necessary configuration and implementation for the connection to the messaging system.
Binders exist for various message brokers, including but not limited to RabbitMQ, Apache Kafka, Amazon Kinesis, etc. The binder brings the bits that are needed to connect the application to the specific message broker and acts as an adapter layer between your application and the messaging system.
When Spring Cloud Stream receives a message that needs to be published, it’s the binder which takes care of how to connect to the broker, how to serialize the message, where to publish it (topics, routing key, etc.), and so on. It also deals with how to read and convert incoming messages before sending them to your application.
Through this system of bindings and binders, Spring Cloud Stream abstracts away the specifics of the underlying message brokers, allowing developers to focus on writing the business logic.
The application.yml file is a configuration file used by Spring Boot applications. It’s a space-sensitive file in YAML format that Spring Boot reads at launch to configure properties of the application.
In the example you referenced:
spring:
cloud:
stream:
bindings:
output:
destination: topic-name
binder: kafka
kafka:
binder:
brokers: localhost:9092
Here’s the explanation for this:
spring: This is the root configuration prefix for all Spring Boot applications.
cloud: This is the specific configuration prefix for all Spring Cloud applications.
stream: This is the specific configuration prefix from the Spring Cloud Stream project. It’s used to configure the Spring Cloud Stream properties.
bindings: These are the definitions of your input and output channels. In your case, you’re defining an output channel.
output: This is the name of your output channel. It corresponds to the OUTPUT constant value (‘output’) defined in your code.
destination: This is the name of the Kafka topic where the events will be produced. Messages sent to the ‘output’ channel will end up in this topic on the Kafka broker.
binder: This defines the binder configuration that this binding should be created from. This is optional; if not present, the default binder configuration will be used.
kafka: This is for Kafka-specific configuration.
binder: It holds properties for the Kafka binder.
brokers: This is the hostname and port for your Kafka broker(s). The application will connect to Kafka using these details.
Remember that YAML files are whitespace and indentation sensitive. Misalignment may lead to an error, or properties may not be assigned correctly.
What does the default binder mean?
In Spring Cloud Stream, the binder is the piece that ties your application to the specific message broker (like RabbitMQ, Kafka, Google Pub-Sub, etc.). It does this by providing the necessary configuration and implementation of the broker-specific code.
When you define the ‘binder’ property for a specific binding, you’re telling Spring Cloud Stream to use that as the binder (and hence the message broker) for that binding.
However, when you don’t specify a ‘binder’ property for a binding, Spring Cloud Stream will use the ‘defaultBinder’ that’s defined in your application configurations.
For example, consider a scenario where your application connects to two different message brokers - Kafka and RabbitMQ. In your configurations, you can define both Kafka and RabbitMQ binders, but also specify the Kafka binder as the ‘defaultBinder’.
In the application.yaml file:
spring:
cloud:
stream:
defaultBinder: kafka
kafka:
binder:
brokers: localhost:9092
rabbitmq:
binder:
address: localhost
...
Any binding definitions that don’t explicitly set the ‘binder’ attribute will automatically use the Kafka binder because that’s defined as the ‘defaultBinder’. However, if you want a particular channel to use RabbitMQ instead, you can achieve that by explicitly stating so in the ‘binder’ attribute for that specific channel.
spring:
cloud:
stream:
bindings:
myOutputChannel:
destination: mytopic
myRabbitChannel:
destination: myqueue
binder: rabbitmq
In this example, myOutputChannel will use Kafka as the binder (as Kafka is defined as the default binder), and myRabbitChannel will use RabbitMQ as the binder (as it’s explicitly defined for this binding).
What if there is only one binder, like Kafka? Do we need to specify the defaultBinder explicitly?
No, if you only have one binder (like Kafka) in your application, you don’t need to specify it as the ‘defaultBinder’ explicitly.
Spring Cloud Stream will automatically use the only available binder for your application. If you have only one binder configuration in your classpath, then Spring Cloud Stream smartly understands that to be the default binder for your application.
Here is an example configuration where no default binder is specified, and there is only one binder, Kafka:
spring:
cloud:
stream:
kafka:
binder:
brokers: localhost:9092
bindings:
output:
destination: topic-name
In this case, the spring.cloud.stream.kafka.binder.brokers property maps to the Kafka broker, and spring.cloud.stream.bindings.output.destination specifies the name of the topic on the Kafka broker to which the messages will be published.
Your application will use Kafka as the binder since it’s the only binder available in the setup, and you don’t need to specify it as a default binder explicitly.
The sample code you showed uses MessageChannel, etc.; nowadays SCS uses Spring Cloud Function types like Supplier and Consumer.
You’re correct. The Spring Cloud Stream framework has moved toward a more functional programming model, using java.util.function’s Supplier, Consumer, and Function for message handling. This is especially recommended for Spring Cloud Stream 3.0 and above. Here’s a quick sketch of how to use them with your Apache Kafka setup:
Producer:
Firstly, define a Supplier bean for producing a message:
@Bean
public Supplier<String> produce() {
return () -> {
// generate message content
String message = "Hello, Kafka!";
return message;
};
}
In your application.yml, configure the output stream for the Supplier:
spring:
cloud:
function:
definition: produce
stream:
bindings:
produce-out-0:
destination: topic-name
kafka:
binder:
brokers: localhost:9092
Consumer:
Next, declare a Consumer to consume messages from a broker:
@Bean
public Consumer<String> consume() {
return message -> {
// process message content
System.out.println("Received: " + message);
};
}
In your application.yml, configure the input stream for the Consumer:
spring:
cloud:
function:
definition: consume
stream:
bindings:
consume-in-0:
destination: topic-name
kafka:
binder:
brokers: localhost:9092
By convention, SCS uses function names as the binding names: ‘produce-out-0’ and ‘consume-in-0’. ‘0’ is the index in case you have multiple inputs or outputs. But you can change these via Spring Cloud Stream’s ‘spring.cloud.stream.function.bindings.*’ property.
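For instance, a minimal sketch of renaming a generated binding so the rest of the configuration can refer to it as plain output (the alias name is arbitrary):
spring:
  cloud:
    stream:
      function:
        bindings:
          produce-out-0: output # alias the generated binding name
      bindings:
        output:
          destination: topic-name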
In the latest functional programming model, you don’t explicitly deal with message channels or use annotations such as @Input
or @Output
.
The method’s signature implies whether the application is intended to be used as a source, sink, or processor.
Remember, these examples are just basic scenarios and more complicated scenarios like content-type negotiation or error handling in Apache Kafka with Spring Cloud Stream may need additional steps.
How does the produce function work? I mean how it is triggered to send messages?
Great question! The Supplier function in Spring Cloud Stream is used to produce or generate messages that need to be sent to the message broker system, like Apache Kafka or RabbitMQ.
However, a Supplier
in Spring Cloud Stream is not called or triggered directly. Instead, the framework does it automatically for you at regular intervals. This interval is by default set at 1 message/second, but it can be easily configured using the property spring.cloud.stream.poller.fixed-delay
.
Here’s an example of how to adjust the poller’s interval:
spring:
cloud:
stream:
poller:
fixed-delay: 5000
This configuration will trigger the Supplier
function approximately every 5 seconds (value is in milliseconds) to generate a new message.
Moreover, when the application starts up, Spring Cloud Stream automatically sets up a background loop that calls the Supplier.get()
method at the specified rate (or the default rate, if no rate is specified) and sends its output to the specified message broker destination.
Note: The Supplier function is generally used in source applications where you are producing messages and not consuming any.
What if we want to send a message manually, like when we receive a user login event on a Rest Controller?
Indeed, the Supplier function is automatically triggered at regular intervals which is not an ideal scenario when you want to produce a message based on a certain event - like a user login event, as in your example.
In such a case, you typically use a MessageChannel
to manually send messages. However, with the functional programming model, you could use a StreamBridge
. The StreamBridge
is a utility that is automatically available in the application context; you can autowire it and use it anywhere in your application to dynamically send messages to a destination.
Here’s an example of how to use StreamBridge
:
@RestController
public class LoginController {
private final StreamBridge streamBridge;
@Autowired
public LoginController(StreamBridge streamBridge) {
this.streamBridge = streamBridge;
}
@PostMapping("/login")
public String loginUser(@RequestBody User user) {
// validate and login user, generate response
// ...
// send a message when a user logs in
streamBridge.send("produce-out-0", user);
return "Login Successful!";
}
}
In this example, each time the “/login” endpoint is hit, a User
object would be sent to the output channel “produce-out-0”.
Remember to configure the output binding in your application.yml
appropriately:
spring:
cloud:
stream:
bindings:
produce-out-0:
destination: topic-name
kafka:
binder:
brokers: localhost:9092
The StreamBridge
utility provides a flexible way to produce messages from anywhere within your application, making it a great choice when you need to manually send messages in a Spring Cloud Stream application.
What if there are multiple functions, each used to handle a different topic?
If you have multiple functions corresponding to different topics (destinations), you can easily configure them in Spring Cloud Stream.
Each java.util.function.Function
, java.util.function.Consumer
or java.util.function.Supplier
bean can correspond to a different binder destination.
Here’s an example:
Producer Function Beans:
@Bean
public Supplier<String> produceUserEvent() {
return () -> {
// generate user event message content
String message = "User event";
return message;
};
}
@Bean
public Supplier<String> produceProductEvent() {
return () -> {
// generate product event message content
String message = "Product event";
return message;
};
}
Consumer Function Beans:
@Bean
public Consumer<String> consumeUserEvent() {
return message -> {
// process user event message content
System.out.println("Received user event: " + message);
};
}
@Bean
public Consumer<String> consumeProductEvent() {
return message -> {
// process product event message content
System.out.println("Received product event: " + message);
};
}
Configuration in application.yml
:
You can declare the bindings and specific destinations in your application.yml
:
spring:
cloud:
function:
definition: produceUserEvent;consumeUserEvent;produceProductEvent;consumeProductEvent
stream:
bindings:
produceUserEvent-out-0:
destination: user-events
consumeUserEvent-in-0:
destination: user-events
produceProductEvent-out-0:
destination: product-events
consumeProductEvent-in-0:
destination: product-events
kafka:
binder:
brokers: localhost:9092
In this example, produceUserEvent-out-0
and consumeUserEvent-in-0
are used to produce and consume messages for “user events”, while produceProductEvent-out-0
and consumeProductEvent-in-0
are used to produce and consume “product events”. Each -out-0
or -in-0
suffix represents the output or input channel binding for that function.
In the end, each function serves its own topic, producing or consuming a different set of data, with its own separate responsibility.
Got it. It seems that function.definition is used to tell SCS how to identify the handlers?
That’s correct. The spring.cloud.function.definition property in the application configuration can be used to specify the beans - either Supplier, Consumer, or Function - that Spring Cloud Stream should bind to messaging channels.
If you have multiple function beans in your Spring application context, but you only want to use some of them for messaging purposes, this is where spring.cloud.function.definition
comes in handy.
The property takes a semi-colon separated list of the bean names to be bound to the messaging system.
For example:
spring:
cloud:
function:
definition: myFunction;myConsumer;mySupplier
In this example, myFunction
, myConsumer
, mySupplier
are the names of Function
, Consumer
, Supplier
beans respectively. Spring Cloud Stream will bind only these beans to messaging channels according to the output/input bindings specified elsewhere in your properties.
It’s important to remember that the function definition should have the same name as the function bean in the application. If your function bean is defined as follows:
@Bean
public Consumer<String> myConsumer() {
return data -> { /*...*/ };
}
You should use myConsumer
in the spring.cloud.function.definition
.
How to compose functions to handle a topic?
To compose functions, you use the pipe symbol |
, similar to how you would in a Unix/Linux command line. Here’s an example:
@Bean
public Function<String, String> firstFunction() {
return input -> input.toUpperCase();
}
@Bean
public Function<String, String> secondFunction() {
return input -> "Message: " + input;
}
And in your application configuration file:
spring:
cloud:
function:
definition: firstFunction|secondFunction
The firstFunction
will be applied first, and then its output will be used as the input to the secondFunction
. This way, when a message comes in, it’ll be first converted to uppercase by firstFunction
, and then secondFunction
will prepend “Message: ” to it.
Keep in mind, when composing functions: composing two or more consumers (Consumer<A> | Consumer<B>) or two or more suppliers (Supplier<A> | Supplier<B>) is not valid per the Spring Cloud Stream documentation, because a supplier does not consume any input and a consumer does not produce any output.
How are composite functions bound to a destination?
The composed (composite) functions can be bound to a destination (topic) through the application configuration file (like application.yml) in Spring Cloud Stream.
The basic procedure is to define the destination in the spring.cloud.stream.bindings.<functionName>-in-0.destination
for input bindings and spring.cloud.stream.bindings.<functionName>-out-0.destination
for output bindings, where <functionName>
is the composed function name.
For example, consider two functions: uppercase
and prefix
which you have composed as processFunction
:
@Bean
public Function<String, String> uppercase() {
return String::toUpperCase;
}
@Bean
public Function<String, String> prefix() {
return input -> "Hello " + input;
}
You define the composite function in your spring.cloud.function.definition
like this:
spring:
cloud:
function:
definition: processFunction
To bind the composite function to a topic, define the input and output bindings:
spring:
cloud:
stream:
bindings:
processFunction-in-0:
destination: input-topic
processFunction-out-0:
destination: output-topic
In this setup, messages coming in from input-topic
will trigger the composite function processFunction
(which consists of uppercase
and prefix
) and the results will be sent to output-topic
.
Remember, the bindings
property should match the function name. So if your composed function is called processFunction
, you should have your bindings as processFunction-in-0
and processFunction-out-0
.
I didn’t see any configuration about the relation between processFunction and the uppercase and prefix functions.
I apologize for the confusion; I should have clarified that in the explanation. The processFunction was intended as a placeholder name in the bindings configuration for the composed function (uppercase | prefix).
The binding happens based on the spring.cloud.function.definition property. The composed function is defined as the concatenation of the beans you want to compose (uppercase | prefix, in this case) in that property.
Here’s how you can specify uppercase
and prefix
functions and bind them to input and output channels:
@Bean
public Function<String, String> uppercase() {
return String::toUpperCase;
}
@Bean
public Function<String, String> prefix() {
return input -> "Hello " + input;
}
In your application.yml
, you compose them together:
spring:
cloud:
function:
definition: uppercase|prefix
stream:
bindings:
uppercase|prefix-in-0:
destination: input-topic
uppercase|prefix-out-0:
destination: output-topic
In this setup, when a message comes in from the input-topic
, it is first processed by the uppercase
function, and then its output is processed by the prefix
function, and then the final output is sent to output-topic
.
The use of |
(pipe) character enables function composition, where the output of one function becomes the input of the next one in line.
Generics let you tailor a method, class, structure, or interface to the precise data type it acts upon. For example, instead of using the Hashtable
class, which allows keys and values to be of any type, you can use the Dictionary<TKey,TValue>
generic class and specify the types allowed for the key and the value. Among the benefits of generics are increased code reusability and type safety. [1]
Generics are classes, structures, interfaces, and methods that have placeholders (type parameters) for one or more of the types that they store or use. A generic collection class might use a type parameter as a placeholder for the type of objects that it stores. The type parameters appear as the types of its fields and the parameter types of its methods. A generic method might use its type parameter as the type of its return value or as the type of one of its formal parameters.
The following code illustrates a simple generic class definition.
public class SimpleGenericClass<T>
{
public T Field;
}
When you create an instance of a generic class, you specify the actual types to substitute for the type parameters. This establishes a new generic class, referred to as a constructed generic class, with your chosen types substituted everywhere that the type parameters appear. The result is a type-safe class that is tailored to your choice of types, as the following code illustrates.
public static void Main()
{
SimpleGenericClass<string> g = new SimpleGenericClass<string>();
g.Field = "A string";
//...
Console.WriteLine("SimpleGenericClass.Field = \"{0}\"", g.Field);
Console.WriteLine("SimpleGenericClass.Field.GetType() = {0}", g.Field.GetType().FullName);
}
The following terms are used to discuss generics in .NET:
A generic type definition is a class, structure, or interface declaration that functions as a template, with placeholders for the types that it can contain or use. For example, the System.Collections.Generic.Dictionary<TKey,TValue>
class can contain two types: keys and values. Because a generic type definition is only a template, you cannot create instances of a class, structure, or interface that is a generic type definition.
Generic type parameters, or type parameters, are the placeholders in a generic type or method definition. The System.Collections.Generic.Dictionary<TKey,TValue>
generic type has two type parameters, TKey
and TValue
, that represent the types of its keys and values.
A constructed generic type, or constructed type, is the result of specifying types for the generic type parameters of a generic type definition.
A generic type argument is any type that is substituted for a generic type parameter.
The general term generic type includes both constructed types and generic type definitions.
Covariance and contravariance of generic type parameters enable you to use constructed generic types whose type arguments are more derived (covariance) or less derived (contravariance) than a target constructed type. Covariance and contravariance are collectively referred to as variance.
Constraints are limits placed on generic type parameters. For example, you might limit a type parameter to types that implement the System.Collections.Generic.IComparer<T>
generic interface, to ensure that instances of the type can be ordered. You can also constrain type parameters to types that have a particular base class, that have a parameterless constructor, or that are reference types or value types. Users of the generic type cannot substitute type arguments that do not satisfy the constraints.
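For instance, a minimal sketch of constraint declarations (the OrderedBag and Factory type names are illustrative, not from the cited docs):
// T must implement IComparer<T>, so instances of T can order other instances.
public class OrderedBag<T> where T : IComparer<T>
{
    public T? Comparer;
}
// U must be a reference type that has a parameterless constructor.
public class Factory<U> where U : class, new()
{
    public U Create() => new U();
}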
A generic method definition is a method with two parameter lists: a list of generic type parameters and a list of formal parameters. Type parameters can appear as the return type or as the types of the formal parameters, as the following code shows.
T MyGenericMethod<T>(T arg)
{
T temp = arg;
//...
return temp;
}
Generic methods can appear on generic or nongeneric types. It’s important to note that a method is not generic just because it belongs to a generic type, or even because it has formal parameters whose types are the generic parameters of the enclosing type. A method is generic only if it has its own list of type parameters. In the following code, only method G is generic.
class A
{
T G<T>(T arg)
{
T temp = arg;
//...
return temp;
}
}
class MyGenericClass<T>
{
T M(T arg)
{
T temp = arg;
//...
return temp;
}
}
There are many advantages to using generic collections and delegates:
Type safety. Generics shift the burden of type safety from you to the compiler. There is no need to write code to test for the correct data type because it is enforced at compile time. The need for type casting and the possibility of run-time errors are reduced.
Less code, and code is more easily reused. There is no need to inherit from a base type and override members. For example, the LinkedList<T>
class is ready for immediate use: you can create a linked list of strings with the following variable declaration:
LinkedList<string> llist = new LinkedList<string>();
Better performance. Generic collection types generally perform better for storing and manipulating value types because there is no need to box the value types.
Boxing is the process of converting a value type to the type object or to any interface type implemented by this value type. When the common language runtime (CLR) boxes a value type, it wraps the value inside a
System.Object
instance and stores it on the managed heap. Unboxing extracts the value type from the object. Boxing is implicit; unboxing is explicit. The concept of boxing and unboxing underlies the C# unified view of the type system in which a value of any type can be treated as an object.
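A small sketch contrasting boxing with a generic collection (a console context is assumed):
int n = 42;
object boxed = n;          // boxing: the int is copied into a new heap object
int unboxed = (int)boxed;  // unboxing: explicit cast extracts the value
List<int> numbers = new List<int> { 1, 2, 3 };
int first = numbers[0];    // no boxing: the values are stored directly as int
Console.WriteLine($"{unboxed} {first}");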
Generic delegates enable type-safe callbacks without the need to create multiple delegate classes. For example, the Predicate<T>
generic delegate allows you to create a method that implements your own search criteria for a particular type and to use your method with methods of the Array
type such as Find
, FindLast
, and FindAll
.
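For instance, a brief sketch of Predicate<T> with the Array search methods named above:
int[] values = { 3, 8, 15, 4, 42 };
Predicate<int> isEven = n => n % 2 == 0;
Console.WriteLine(Array.Find(values, isEven));            // 8 (first match)
Console.WriteLine(Array.FindLast(values, isEven));        // 42 (last match)
Console.WriteLine(Array.FindAll(values, isEven).Length);  // 3 (all matches)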
Generics streamline dynamically generated code. When you use generics with dynamically generated code you do not need to generate the type. This increases the number of scenarios in which you can use lightweight dynamic methods instead of generating entire assemblies.
The following are some limitations of generics:
Generic types can be derived from most base classes, such as MarshalByRefObject
(and constraints can be used to require that generic type parameters derive from base classes like MarshalByRefObject
). However, .NET does not support context-bound generic types. A generic type can be derived from ContextBoundObject
, but trying to create an instance of that type causes a TypeLoadException
.
Enumerations cannot have generic type parameters. An enumeration can be generic only incidentally (for example, because it is nested in a generic type that is defined using Visual Basic, C#, or C++).
Lightweight dynamic methods cannot be generic.
In Visual Basic, C#, and C++, a nested type that is enclosed in a generic type cannot be instantiated unless types have been assigned to the type parameters of all enclosing types. Another way of saying this is that in reflection, a nested type that is defined using these languages includes the type parameters of all its enclosing types. This allows the type parameters of enclosing types to be used in the member definitions of a nested type.
Liskov’s notion of a behavioural subtype defines a notion of substitutability for objects; that is, if S is a subtype of T, then objects of type T in a program may be replaced with objects of type S without altering any of the desirable properties of that program (e.g. correctness). [3]
Liskov substitution principle imposes some standard requirements on signatures that have been adopted in newer object-oriented programming languages (usually at the level of classes rather than types):
Contravariance of method parameter types in the subtype.
Covariance of method return types in the subtype.
New exceptions cannot be thrown by the methods in the subtype, except if they are subtypes of exceptions thrown by the methods of the supertype.
Covariance and contravariance are terms that refer to the ability to use a more derived type (more specific) or a less derived type (less specific) than originally specified. Generic type parameters support covariance and contravariance to provide greater flexibility in assigning and using generic types. [2]
When you’re referring to a type system, covariance, contravariance, and invariance have the following definitions. The examples assume a base class named Base
and a derived class named Derived
.
Covariance
Enables you to use a more derived type than originally specified.
You can assign an instance of IEnumerable<Derived>
to a variable of type IEnumerable<Base>
.
Contravariance
Enables you to use a more generic (less derived) type than originally specified.
You can assign an instance of Action<Base>
to a variable of type Action<Derived>
.
Invariance
Means that you can use only the type originally specified. An invariant generic type parameter is neither covariant nor contravariant.
You cannot assign an instance of List<Base>
to a variable of type List<Derived>
or vice versa.
Covariant type parameters enable you to make assignments that look much like ordinary polymorphism, as shown in the following code.
IEnumerable<Derived> d = new List<Derived>();
IEnumerable<Base> b = d;
Contravariance, on the other hand, seems counterintuitive.
Action<Base> b = (target) => { Console.WriteLine(target.GetType().Name); };
Action<Derived> d = b;
d(new Derived());
In general, a covariant type parameter can be used as the return type of a delegate, and contravariant type parameters can be used as parameter types. For an interface, covariant type parameters can be used as the return types of the interface’s methods, and contravariant type parameters can be used as the parameter types of the interface’s methods.
Covariance and contravariance are collectively referred to as variance. A generic type parameter that is not marked covariant or contravariant is referred to as invariant. A brief summary of facts about variance in the common language runtime:
Variant type parameters are restricted to generic interface and generic delegate types.
A generic interface or generic delegate type can have both covariant and contravariant type parameters.
Variance applies only to reference types; if you specify a value type for a variant type parameter, that type parameter is invariant for the resulting constructed type.
Variance does not apply to delegate combination. That is, given two delegates of types Action<Derived>
and Action<Base>
(Action(Of Derived)
and Action(Of Base)
in Visual Basic), you cannot combine the second delegate with the first although the result would be type safe. Variance allows the second delegate to be assigned to a variable of type Action<Derived>
, but delegates can combine only if their types match exactly.
Starting in C# 9, covariant return types are supported. An overriding method can declare a more derived return type than the method it overrides, and an overriding, read-only property can declare a more derived type.
abstract class Animal
{
public abstract Food GetFood();
...
}
class Tiger : Animal
{
public override Meat GetFood() => ...;
}
A covariant type parameter is marked with the out
keyword (Out
keyword in Visual Basic).
You can use a covariant type parameter as the return value of a method that belongs to an interface, or as the return type of a delegate.
If a method of an interface has a parameter that is a generic delegate type, a covariant type parameter of the interface type can be used to specify a contravariant type parameter of the delegate type.
You cannot use a covariant type parameter as a generic type constraint for interface methods. [4]
interface ICovariant<out R>
{
// The following statement generates a compiler error
// because you can use only contravariant or invariant types
// in generic constraints.
// void DoSomething<T>() where T : R;
}
A contravariant type parameter is marked with the in
keyword (In
keyword in Visual Basic).
You can use a contravariant type parameter as the type of a parameter of a method that belongs to an interface, or as the type of a parameter of a delegate.
You can use a contravariant type parameter as a generic type constraint for an interface method.
interface IContravariant<in A>
{
void SetSomething(A sampleArg);
void DoSomething<T>() where T : A;
// The following statement generates a compiler error.
// A GetSomething();
}
An interface or delegate type can have both covariant and contravariant type parameters.
public delegate TResult Func<in T, out TResult>(T arg);
Only interface types and delegate types can have variant type parameters.
When a generic type or method is compiled into Microsoft intermediate language (MSIL), it contains metadata that identifies it as having type parameters. How the MSIL for a generic type is used differs based on whether the supplied type parameter is a value type or reference type. [5]
When a generic type is first constructed with a value type as a parameter, the runtime creates a specialized generic type with the supplied parameter or parameters substituted in the appropriate locations in the MSIL. Specialized generic types are created one time for each unique value type that is used as a parameter.
However, if the generic type is constructed with a different value type as its parameter at another point, the runtime generates another version of the generic type and substitutes the type arguments in the appropriate locations in MSIL. Conversions are no longer necessary because each specialized generic class natively contains the value type.
The first time a generic type is constructed with any reference type, the runtime creates a specialized generic type with object references substituted for the parameters in the MSIL. Then, every time that a constructed type is instantiated with a reference type as its parameter, regardless of what type it is, the runtime reuses the previously created specialized version of the generic type. This is possible because all references are the same size.
Because the number of reference types can vary wildly from program to program, the C# implementation of generics greatly reduces the amount of code by reducing to one the number of specialized classes created by the compiler for generic classes of reference types.
Moreover, when a generic C# class is instantiated by using a value type or reference type parameter, reflection can query it at run time and both its actual type and its type parameter can be ascertained.
The runtime creates specific versions of the generic type based on the actual types used to instantiate it. For example, if you have a List<T> and you create a List<int>, the runtime generates a specialized version of the type for int. When you instantiate the generic type with a reference type, like List<string>, the runtime reuses the single specialized version that is shared by all reference type arguments. However, the .NET CLR maintains type safety by treating these as separate types at the type system level, even though the underlying implementation is the same.
From the point of view of reflection, the difference between a generic type and an ordinary type is that a generic type has associated with it a set of type parameters (if it is a generic type definition) or type arguments (if it is a constructed type). A generic method differs from an ordinary method in the same way. [6]
There are two keys to understanding how reflection handles generic types and methods:
The type parameters of generic type definitions and generic method definitions are represented by instances of the Type class.
If an instance of Type represents a generic type, then it includes an array of types that represent the type parameters (for generic type definitions) or the type arguments (for constructed types). The same is true of an instance of the MethodInfo class that represents a generic method.
A generic type or method is closed if instantiable types have been substituted for all its type parameters, including all the type parameters of all enclosing types. You can only create an instance of a generic type if it is closed.
WriteLine(typeof(Dictionary<,>));
WriteLine(typeof(MySet<>));
WriteLine(typeof(MySet<int>));
MySet<int> mySet = (MySet<int>)typeof(MySet<int>).GetConstructor([])!.Invoke(null);
mySet.AddRange([1, 2, 1, 1, 2, 1, 1, 2, 3, 4]);
mySet.Keys.ToList().ForEach(Console.WriteLine);
static void WriteLine(Type type)
{
string message = string.Format("{0} {1,-31} {2}",
$"IsGenericType: {type.IsGenericType}.",
$"IsGenericTypeDefinition: {type.IsGenericTypeDefinition}.",
$"Open or Closed: {(type.ContainsGenericParameters ? "Open" : "Closed")}.");
Console.WriteLine(message);
}
class MySet<E> : Dictionary<E, bool> where E : notnull
{
public void AddRange(params E[] items) => Array.ForEach(items, Add);
public void Add(E e) { if (ContainsKey(e)) return; Add(e, true); }
}
// $ dotnet run
// IsGenericType: True. IsGenericTypeDefinition: True.  Open or Closed: Open.
// IsGenericType: True. IsGenericTypeDefinition: True.  Open or Closed: Open.
// IsGenericType: True. IsGenericTypeDefinition: False. Open or Closed: Closed.
// 1
// 2
// 3
// 4
> Why does the value type NOT need to be boxed in a generic collection in .NET? * In .NET, boxing is the process of converting a value type to a reference type; this involves creating a new object on the heap and copying the value into the new object. Unboxing is the reverse process. Both can be costly in terms of performance. When using a generic collection, like List<int>, the values are stored directly, without boxing. Generics in .NET use a mechanism called 'reified generics', which means that the system knows the actual types a generic class was instantiated with, even at runtime. It uses this information to directly allocate and deallocate memory for the values in the collection, avoiding the need to box and unbox value types. That’s one of the key benefits of utilizing generics in .NET: they allow you to enforce type safety without incurring the performance cost of boxing and unboxing for value types.
> With reified generics, the memory would still be allocated on the managed heap? * Yes, the memory for any object, including those constructed using generic types, is still allocated on the managed heap in .NET. However, the point of difference lies in how these objects are managed. With reference types, an object is created on the heap and a reference to this object is used to access it. Value types, on the other hand, hold their value directly. When value types are used in a generic collection, a consecutive block of memory is allocated on the heap for the collection to store the actual values directly rather than references to the values. This optimization contributes to both performance and reduced pressure on the garbage collector. The 'reified generics' mechanism keeps the information about the specific type that a generic type was instantiated with, which helps with efficient memory allocation and type safety at runtime.
[1] https://learn.microsoft.com/en-us/dotnet/standard/generics/
[2] https://learn.microsoft.com/en-us/dotnet/standard/generics/covariance-and-contravariance
[3] https://en.wikipedia.org/wiki/Liskov_substitution_principle
[4] https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/covariance-contravariance/creating-variant-generic-interfaces
[5] https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/generics/generics-in-the-run-time
[6] https://learn.microsoft.com/en-us/dotnet/framework/reflection-and-codedom/reflection-and-generic-types
Delegates are reference types that serve a purpose similar to that of function pointers in C++. They are used for event handlers and callback functions in .NET. Unlike function pointers, delegates are secure, verifiable, and type safe. A delegate type can represent any instance method or static method that has a compatible signature. [1]
A parameter of a delegate is compatible with the corresponding parameter of a method if the type of the delegate parameter is more restrictive than the type of the method parameter, because this guarantees that an argument passed to the delegate can be passed safely to the method.
Similarly, the return type of a delegate is compatible with the return type of a method if the return type of the method is more restrictive than the return type of the delegate, because this guarantees that the return value of the method can be cast safely to the return type of the delegate.
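As a hedged sketch of these compatibility rules (the Base, Derived, and Factory names are illustrative):
class Base { }
class Derived : Base { }
delegate Base Factory(Derived arg);
class VarianceDemo
{
    // The method's parameter type (Base) is less restrictive than the delegate's (Derived),
    // and its return type (Derived) is more restrictive than the delegate's (Base).
    static Derived Make(Base arg) => new Derived();
    static void Main()
    {
        Factory f = Make;               // compatible: safe in both directions
        Base result = f(new Derived());
        Console.WriteLine(result.GetType().Name); // Derived
    }
}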
Liskov’s notion of a behavioural subtype defines a notion of substitutability for objects; that is, if S is a subtype of T, then objects of type T in a program may be replaced with objects of type S without altering any of the desirable properties of that program (e.g. correctness).
All delegates inherit from System.MulticastDelegate, which inherits from System.Delegate. The C#, Visual Basic, and C++ languages do not allow inheritance from these types. Instead, they provide keywords for declaring delegates.
Because delegates inherit from MulticastDelegate, a delegate has an invocation list, which is a list of methods that the delegate represents and that are executed when the delegate is invoked. All methods in the list receive the arguments supplied when the delegate is invoked.
The return value is not defined for a delegate that has more than one method in its invocation list, even if the delegate has a return type.
In many cases, such as with callback methods, a delegate represents only one method, and the only actions you have to take are creating the delegate and invoking it.
For delegates that represent multiple methods, .NET provides methods of the Delegate and MulticastDelegate delegate classes to support operations such as adding a method to a delegate’s invocation list (the Delegate.Combine
method), removing a method (the Delegate.Remove
method), and getting the invocation list (the Delegate.GetInvocationList
method).
The following example declares a delegate named Callback
that can encapsulate a method that takes a string
as an argument and returns void
: [2]
public delegate void Callback(string message);
A delegate object is normally constructed by providing the name of the method the delegate will wrap, or with a lambda expression [4]. Once a delegate is instantiated in this manner it can be invoked. Invoking a delegate calls the method attached to the delegate instance. The parameters passed to the delegate by the caller are passed to the method, and the return value, if any, from the method is returned to the caller by the delegate. For example:
// Create a method for a delegate.
public static void DelegateMethod(string message)
{
Console.WriteLine(message);
}
// Instantiate the delegate.
Callback handler = DelegateMethod;
// Call the delegate.
handler("Hello World");
A delegate can call more than one method when invoked. This is referred to as multicasting. To add an extra method to the delegate’s list of methods (the invocation list) simply requires adding two delegates using the addition or addition assignment operators ('+' or '+='). For example:
var obj = new MethodClass();
Callback d1 = obj.Method1;
Callback d2 = obj.Method2;
Callback d3 = MethodClass.Method3;
Callback allMethodsDelegate = d1 + d2;
allMethodsDelegate += d3;
allMethodsDelegate -= d2;
Delegate[] delegates = allMethodsDelegate.GetInvocationList();
int invocationCount = delegates.Length;
public class MethodClass
{
public void Method1(string message) => Console.WriteLine($"Method 1: {message}");
public void Method2(string message) => Console.WriteLine($"Method 2: {message}");
public static void Method3(string message) => Console.WriteLine($"Method 3: {message}");
}
In order to streamline the development process, .NET includes a set of delegate types that programmers can reuse and not have to create new types. These types are Func<>
, Action<>
and Predicate<>
, and they can be used without having to define new delegate types.
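A quick sketch of the three built-in delegate types:
Func<int, int> square = x => x * x;          // takes an int, returns an int
Action<string> print = Console.WriteLine;    // takes a string, returns nothing
Predicate<int> isPositive = n => n > 0;      // takes an int, returns a bool
print($"square(5) = {square(5)}");           // square(5) = 25
print($"isPositive(-3) = {isPositive(-3)}"); // isPositive(-3) = False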
.NET Framework 2.0 introduced the concept of anonymous delegates, which let you create "inline" delegates without having to specify any additional type or method.
List<int> nums = new List<int>([1, 5, 3, 2, 0, 4]);
nums.Sort(delegate (int x, int y) { return y - x; });
nums.ForEach(delegate (int num) { Console.Write($"{num} "); });
// 5 4 3 2 1 0
Lambda expressions, or just "lambdas" for short, were introduced in C# 3.0 as one of the core building blocks of Language Integrated Query (LINQ). They are just a more convenient syntax for using delegates. They declare a parameter list and method body, but don’t have a formal identity of their own, unless they are assigned to a delegate. Unlike delegates, they can be directly assigned as the right-hand side of event registration or in various LINQ clauses and methods.
List<int> nums = new List<int>([1, 5, 3, 2, 0, 4]);
nums.Sort((x, y) => y - x);
nums.ForEach(num => Console.Write($"{num} "));
// 5 4 3 2 1 0
> What’s the difference between delegates, lambdas, and anonymous delegates, in essence? * In C#, delegates, lambdas, and anonymous methods are all used to create and work with function pointers; they’re best understood in the context of their introduction and the situations they’re used for.
In essence, under the hood, lambdas and anonymous methods are compiled into delegates by the C# compiler.
Events in .NET are based on the delegate model. The delegate model follows the observer design pattern, which enables a subscriber to register with and receive notifications from a provider. An event sender pushes a notification that an event has happened, and an event receiver receives that notification and defines a response to it. [3]
To define an event, you use the C# event
or the Visual Basic Event
keyword in the signature of your event class, and specify the type of delegate for the event.
Typically, to raise an event, you add a method that is marked as protected
and virtual (in C#) or Protected
and Overridable
(in Visual Basic).
Counter counter = new Counter();
counter.Callback += Console.WriteLine;
counter.Count++;
counter.Count++;
class Counter
{
public event Callback? Callback;
private void OnCallback(string message)
{
Callback?.Invoke(message);
}
private int _count;
public int Count
{
get => _count;
set
{
if (value != _count)
{
int old = _count;
_count = value;
OnCallback($"Count was changed from {old} to {_count}.");
}
}
}
}
// $ dotnet run
// Count was changed from 0 to 1.
// Count was changed from 1 to 2.
[1] https://learn.microsoft.com/en-us/dotnet/standard/base-types/common-type-system#delegates
[2] https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/delegates/using-delegates
[3] https://learn.microsoft.com/en-us/dotnet/standard/events/
[4] https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/operators/lambda-expressions
"Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once." — Rob Pike
Multithreading allows you to increase the responsiveness of your application and, if your application runs on a multiprocessor or multi-core system, increase its throughput. [1]
A process is an executing program. An operating system uses processes to separate the applications that are being executed.
A thread is the basic unit to which an operating system allocates processor time. Each thread has a scheduling priority and maintains a set of structures the system uses to save the thread context when the thread’s execution is paused.
The thread context includes all the information the thread needs to seamlessly resume execution, including the thread’s set of CPU registers and stack. Multiple threads can run in the context of a process. All threads of a process share its virtual address space. A thread can execute any part of the program code, including parts currently being executed by another thread.
.NET Framework provides a way to isolate applications within a process with the use of application domains. (Application domains are not available on .NET Core.)
By default, a .NET program is started with a single thread, often called the primary thread. However, it can create additional threads to execute code in parallel or concurrently with the primary thread. These threads are often called worker threads.
Starting with .NET Framework 4, the recommended way to utilize multithreading is to use Task Parallel Library (TPL) and Parallel LINQ (PLINQ).
Both TPL and PLINQ rely on ThreadPool threads. The System.Threading.ThreadPool
class provides a .NET application with a pool of worker threads. You can also queue work to thread pool threads directly.
Finally, you can use the System.Threading.Thread class, which represents a managed thread.
With .NET, you can write applications that perform multiple operations at the same time. Operations with the potential of holding up other operations can execute on separate threads, a process known as multithreading or free threading. [2]
Applications that use multithreading are more responsive to user input because the user interface stays active as processor-intensive tasks execute on separate threads. Multithreading is also useful when you create scalable applications because you can add threads as the workload increases.
You create a new thread by creating a new instance of the System.Threading.Thread class. You provide the name of the method that you want to execute on the new thread to the constructor. To start a created thread, call the Thread.Start
method.
new Thread(() => Console.WriteLine("Hello Thread")).Start();
To terminate the execution of a thread, use the System.Threading.CancellationToken. It provides a unified way to stop threads cooperatively.
Sometimes it’s not possible to stop a thread cooperatively because it runs third-party code not designed for cooperative cancellation. In this case, you might want to terminate its execution forcibly. To terminate the execution of a thread forcibly, in .NET Framework you can use the Thread.Abort
method. That method raises a ThreadAbortException
on the thread on which it’s invoked.
The Thread.Abort method isn’t supported in .NET Core. If you need to terminate the execution of third-party code forcibly in .NET Core, run it in the separate process and use the Process.Kill method.
The System.Threading.CancellationToken
isn’t available before .NET Framework 4. To stop a thread in older .NET Framework versions, use the thread synchronization techniques to implement the cooperative cancellation manually. For example, you can create a volatile bool
field shouldStop
and use it to request the code executed by the thread to stop.
Use the Thread.Join
method to make the calling thread wait for the termination of the thread being stopped.
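A minimal sketch of this manual cooperative cancellation (the Worker type and its members are illustrative):
var worker = new Worker();
var thread = new Thread(worker.DoWork);
thread.Start();
Thread.Sleep(300);
worker.RequestStop(); // request cancellation via the volatile flag
thread.Join();        // wait for the worker thread to terminate
class Worker
{
    private volatile bool _shouldStop;
    public void DoWork()
    {
        while (!_shouldStop) Thread.Sleep(100); // poll the flag between units of work
        Console.WriteLine("Worker thread: stopping gracefully.");
    }
    public void RequestStop() => _shouldStop = true;
}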
You use the Thread.Sleep
method to pause the current thread for a specified amount of time. You can interrupt a blocked thread by calling the Thread.Interrupt
method.
Calling the Thread.Sleep
method causes the current thread to immediately block for the number of milliseconds or the time interval you pass to the method, and yields the remainder of its time slice to another thread. Once that interval elapses, the sleeping thread resumes execution. [4]
One thread cannot call Thread.Sleep on another thread. Thread.Sleep is a static method that always causes the current thread to sleep.
Calling Thread.Sleep
with a value of Timeout.Infinite
causes a thread to sleep until it is interrupted by another thread that calls the Thread.Interrupt
method on the sleeping thread, or until it is terminated by a call to its Thread.Abort
method.
You can interrupt a waiting thread by calling the Thread.Interrupt
method on the blocked thread to throw a ThreadInterruptedException
, which breaks the thread out of the blocking call. The thread should catch the ThreadInterruptedException
and do whatever is appropriate to continue working. If the thread ignores the exception, the runtime catches the exception and stops the thread.
If the target thread is not blocked when Thread.Interrupt is called, the thread is not interrupted until it blocks. If the thread never blocks, it could complete without ever being interrupted.
If a wait is a managed wait, then Thread.Interrupt
and Thread.Abort
both wake the thread immediately. If a wait is an unmanaged wait (for example, a platform invoke call to the Win32 WaitForSingleObject
function), neither Thread.Interrupt
nor Thread.Abort
can take control of the thread until it returns to or calls into managed code. In managed code, the behavior is as follows:
Thread.Interrupt
wakes a thread out of any wait it might be in and causes a ThreadInterruptedException
to be thrown in the destination thread.
.NET Framework only: Thread.Abort
wakes a thread out of any wait it might be in and causes a ThreadAbortException
to be thrown on the thread.
Thread sleepingThread = new Thread(() =>
{
Console.WriteLine("Thread '{0}' about to sleep indefinitely.", Thread.CurrentThread.Name);
try
{
Thread.Sleep(Timeout.Infinite);
}
catch (ThreadInterruptedException)
{
Console.WriteLine("Thread '{0}' awoken.", Thread.CurrentThread.Name);
}
finally
{
Console.WriteLine("Thread '{0}' executing finally block.", Thread.CurrentThread.Name);
}
Console.WriteLine("Thread '{0} finishing normal execution.", Thread.CurrentThread.Name);
});
sleepingThread.Name = "Sleeping";
sleepingThread.Start();
Thread.Sleep(2000);
sleepingThread.Interrupt();
// Thread 'Sleeping' about to sleep indefinitely.
// Thread 'Sleeping' awoken.
// Thread 'Sleeping' executing finally block.
// Thread 'Sleeping' finishing normal execution.
Starting with .NET Framework 4, .NET uses a unified model for cooperative cancellation of asynchronous or long-running synchronous operations. This model is based on a lightweight object called a cancellation token. The object that invokes one or more cancelable operations, for example by creating new threads or tasks, passes the token to each operation. Individual operations can in turn pass copies of the token to other operations. At some later time, the object that created the token can use it to request that the operations stop what they are doing. Only the requesting object can issue the cancellation request, and each listener is responsible for noticing the request and responding to it in an appropriate and timely manner. [3]
The general pattern for implementing the cooperative cancellation model is:
Instantiate a CancellationTokenSource
object, which manages and sends cancellation notification to the individual cancellation tokens.
Pass the token returned by the CancellationTokenSource.Token
property to each task or thread that listens for cancellation.
Provide a mechanism for each task or thread to respond to cancellation.
Call the CancellationTokenSource.Cancel
method to provide notification of cancellation.
// Create the token source.
CancellationTokenSource cts = new CancellationTokenSource();
// Pass the token to the cancelable operation.
ThreadPool.QueueUserWorkItem(obj =>
{
if (obj is CancellationToken token)
{
for (int i = 0; i < 100000; i++)
{
if (token.IsCancellationRequested)
{
Console.WriteLine("In iteration {0}, cancellation has been requested...", i + 1);
// Perform cleanup if necessary.
//...
// Terminate the operation.
break;
}
// Simulate some work.
Thread.SpinWait(500000);
}
}
}, cts.Token);
Thread.Sleep(2500);
// Request cancellation.
cts.Cancel();
Console.WriteLine("Cancellation set in token source...");
Thread.Sleep(2500);
// Cancellation should have happened, so call Dispose.
cts.Dispose();
// The example displays output like the following:
// Cancellation set in token source...
// In iteration 1430, cancellation has been requested...
The CancellationTokenSource class implements the IDisposable interface. You should be sure to call the CancellationTokenSource.Dispose method when you have finished using the cancellation token source to free any unmanaged resources it holds.
(Illustration omitted: the relationship between a token source and all the copies of its token.)
The cooperative cancellation model makes it easier to create cancellation-aware applications and libraries, and it supports the following features:
Cancellation is cooperative and is not forced on the listener. The listener determines how to gracefully terminate in response to a cancellation request.
Requesting is distinct from listening. An object that invokes a cancelable operation can control when (if ever) cancellation is requested.
The requesting object issues the cancellation request to all copies of the token by using just one method call.
A listener can listen to multiple tokens simultaneously by joining them into one linked token (see the sketch after this list).
User code can notice and respond to cancellation requests from library code, and library code can notice and respond to cancellation requests from user code.
Listeners can be notified of cancellation requests by polling, callback registration, or waiting on wait handles.
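For instance, a sketch of joining two tokens into one linked token:
using var ctsA = new CancellationTokenSource();
using var ctsB = new CancellationTokenSource();
// The linked source is canceled as soon as either of its sources is canceled.
using var linked = CancellationTokenSource.CreateLinkedTokenSource(ctsA.Token, ctsB.Token);
ctsB.Cancel();
Console.WriteLine(linked.Token.IsCancellationRequested); // True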
In more complex cases, it might be necessary for the user delegate to notify library code that cancellation has occurred. In such cases, the correct way to terminate the operation is for the delegate to call the ThrowIfCancellationRequested
method, which will cause an OperationCanceledException
to be thrown. Library code can catch this exception on the user delegate thread and examine the exception’s token to determine whether the exception indicates cooperative cancellation or some other exceptional situation.
The System.Threading.Tasks.Task
and System.Threading.Tasks.Task<TResult>
classes support cancellation by using cancellation tokens. You can terminate the operation by using one of these options:
By returning from the delegate. In many scenarios, this option is sufficient. However, a task instance that’s canceled in this way transitions to the TaskStatus.RanToCompletion
state, not to the TaskStatus.Canceled
state.
By throwing an OperationCanceledException
and passing it the token on which cancellation was requested. The preferred way to do this is to use the ThrowIfCancellationRequested
method. A task that’s canceled in this way transitions to the Canceled
state, which the calling code can use to verify that the task responded to its cancellation request.
When a task instance observes an OperationCanceledException
thrown by the user code, it compares the exception’s token to its associated token (the one that was passed to the API that created the Task). If the tokens are the same and the token’s IsCancellationRequested
property returns true
, the task interprets this as acknowledging cancellation and transitions to the Canceled
state. If you don’t use a Wait
or WaitAll
method to wait for the task, then the task just sets its status to Canceled
.
If you’re waiting on a Task that transitions to the Canceled
state, a System.Threading.Tasks.TaskCanceledException
exception (wrapped in an AggregateException
exception) is thrown. This exception indicates successful cancellation instead of a faulty situation. Therefore, the task’s Exception
property returns null
.
public class TaskCanceledException : OperationCanceledException
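A minimal sketch of acknowledged task cancellation (the timings are chosen arbitrarily):
var cts = new CancellationTokenSource();
CancellationToken token = cts.Token;
Task task = Task.Run(() =>
{
    while (true)
    {
        token.ThrowIfCancellationRequested(); // throws OperationCanceledException
        Thread.SpinWait(100000);              // simulate work
    }
}, token); // the same token is associated with the task
Thread.Sleep(100);
cts.Cancel();
try
{
    task.Wait();
}
catch (AggregateException ae) when (ae.InnerException is TaskCanceledException)
{
    Console.WriteLine(task.Status);            // Canceled
    Console.WriteLine(task.Exception is null); // True: cancellation is not a fault
}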
If the token’s IsCancellationRequested
property returns false
or if the exception’s token doesn’t match the Task’s token, the OperationCanceledException
is treated like a normal exception, causing the Task to transition to the Faulted
state. The presence of other exceptions will also cause the Task to transition to the Faulted
state. You can get the status of the completed task in the Status
property.
It’s possible that a task might continue to process some items after cancellation is requested.
A managed thread is either a background thread or a foreground thread. Background threads are identical to foreground threads with one exception: a background thread does not keep the managed execution environment running. Once all foreground threads have been stopped in a managed process (where the .exe file is a managed assembly), the system stops all background threads and shuts down.
Use the Thread.IsBackground
property to determine whether a thread is a background or a foreground thread, or to change its status. A thread can be changed to a background thread at any time by setting its IsBackground
property to true
.
Threads that belong to the managed thread pool (that is, threads whose IsThreadPoolThread
property is true
) are background threads. All threads that enter the managed execution environment from unmanaged code are marked as background threads. All threads generated by creating and starting a new Thread object are by default foreground threads.
If you use a thread to monitor an activity, such as a socket connection, set its IsBackground
property to true
so that the thread does not prevent your process from terminating.
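A small sketch of the difference (the monitoring loop is illustrative):
Thread monitor = new Thread(() =>
{
    while (true)
    {
        Console.WriteLine("monitoring...");
        Thread.Sleep(500);
    }
});
monitor.IsBackground = true; // without this line, the process would never exit
monitor.Start();
Thread.Sleep(1200);
// Main exits here; the runtime stops the background thread automatically.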
* In .NET, even though you can technically change the IsBackground property at any point while a thread is alive, it cannot be set once the thread has stopped; doing so throws a ThreadStateException. In the code you provided, you’re attempting to change the property after the thread has terminated, which is why it fails. Always remember that IsBackground must be set while the thread is alive, typically before or shortly after calling Start.
The System.Threading.ThreadPool class provides your application with a pool of worker threads that are managed by the system, allowing you to concentrate on application tasks rather than thread management. If you have short tasks that require background processing, the managed thread pool is an easy way to take advantage of multiple threads. Use of the thread pool is significantly easier in Framework 4 and later, since you can create Task
and Task<TResult>
objects that perform asynchronous tasks on thread pool threads. [5]
Thread pool threads are background threads. Each thread uses the default stack size, runs at the default priority, and is in the multithreaded apartment. Once a thread in the thread pool completes its task, it’s returned to a queue of waiting threads. From this moment it can be reused. This reuse enables applications to avoid the cost of creating a new thread for each task.
There is only one thread pool per process.
Unhandled exceptions in thread pool threads terminate the process. There are three exceptions to this rule:
A System.Threading.ThreadAbortException
is thrown in a thread pool thread because Thread.Abort
was called.
A System.AppDomainUnloadedException
is thrown in a thread pool thread because the application domain is being unloaded.
The common language runtime or a host process terminates the thread.
The number of operations that can be queued to the thread pool is limited only by available memory. However, the thread pool limits the number of threads that can be active in the process simultaneously. If all thread pool threads are busy, additional work items are queued until threads to execute them become available. The default size of the thread pool for a process depends on several factors, such as the size of the virtual address space. A process can call the ThreadPool.GetMaxThreads
method to determine the number of threads.
You can control the maximum number of threads by using the ThreadPool.GetMaxThreads
and ThreadPool.SetMaxThreads
methods.
The thread pool provides new worker threads or I/O completion threads on demand until it reaches a specified minimum for each category. You can use the ThreadPool.GetMinThreads
method to obtain these minimum values.
When demand is low, the actual number of thread pool threads can fall below the minimum values.
When a minimum is reached, the thread pool can create additional threads or wait until some tasks complete. The thread pool creates and destroys worker threads in order to optimize throughput, which is defined as the number of tasks that complete per unit of time. Too few threads might not make optimal use of available resources, whereas too many threads could increase resource contention.
You can use the ThreadPool.SetMinThreads method to set the minimum number of worker and I/O completion threads.
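For example, to inspect and adjust the pool limits:
ThreadPool.GetMinThreads(out int minWorker, out int minIo);
ThreadPool.GetMaxThreads(out int maxWorker, out int maxIo);
Console.WriteLine($"Min: {minWorker} worker / {minIo} I/O threads");
Console.WriteLine($"Max: {maxWorker} worker / {maxIo} I/O threads");
// Raising the minimum lets bursts of short tasks start without the pool's ramp-up delay.
ThreadPool.SetMinThreads(minWorker * 2, minIo);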
There are several scenarios in which it’s appropriate to create and manage your own threads instead of using thread pool threads:
You require a foreground thread.
You require a thread to have a particular priority.
You have tasks that cause the thread to block for long periods of time. The thread pool has a maximum number of threads, so a large number of blocked thread pool threads might prevent tasks from starting.
You need to place threads into a single-threaded apartment. All ThreadPool threads are in the multithreaded apartment.
You need to have a stable identity associated with the thread, or to dedicate a thread to a task.
.NET provides a range of types that you can use to synchronize access to a shared resource or coordinate thread interaction. [6]
Multiple .NET synchronization primitives derive from the System.Threading.WaitHandle class, which encapsulates a native operating system synchronization handle and uses a signaling mechanism for thread interaction. Those classes include:
System.Threading.Mutex
, which grants exclusive access to a shared resource. The state of a mutex is signaled if no thread owns it.
System.Threading.Semaphore
, which limits the number of threads that can access a shared resource or a pool of resources concurrently. The state of a semaphore is set to signaled when its count is greater than zero, and nonsignaled when its count is zero.
System.Threading.EventWaitHandle
, which represents a thread synchronization event and can be either in a signaled or unsignaled state.
System.Threading.AutoResetEvent
, which derives from EventWaitHandle
and, when signaled, resets automatically to an unsignaled state after releasing a single waiting thread.
System.Threading.ManualResetEvent
, which derives from EventWaitHandle
and, when signaled, stays in a signaled state until the Reset
method is called.
In .NET Framework, because WaitHandle
derives from System.MarshalByRefObject
, these types can be used to synchronize the activities of threads across application domain boundaries.
In .NET Framework, .NET Core, and .NET 5+, some of these types can represent named system synchronization handles, which are visible throughout the operating system and can be used for the inter-process synchronization:
Mutex
Semaphore (on Windows)
EventWaitHandle (on Windows)
Lightweight synchronization types don’t rely on underlying operating system handles and typically provide better performance. However, they cannot be used for the inter-process synchronization. Use those types for thread synchronization within one application.
Some of those types are alternatives to the types derived from WaitHandle
. For example, SemaphoreSlim
is a lightweight alternative to Semaphore
.
public class SemaphoreSlim : IDisposable
public sealed class Semaphore : System.Threading.WaitHandle
.NET provides a range of synchronization primitives to control access to a shared resource by multiple threads.
The System.Threading.Monitor class grants mutually exclusive access to a shared resource by acquiring or releasing a lock on the object that identifies the resource. While a lock is held, the thread that holds the lock can again acquire and release the lock. Any other thread is blocked from acquiring the lock and the Monitor.Enter
method waits until the lock is released. The Enter
method acquires a released lock. You can also use the Monitor.TryEnter
method to specify the amount of time during which a thread attempts to acquire a lock. Because the Monitor class has thread affinity, the thread that acquired a lock must release the lock by calling the Monitor.Exit
method.
You can coordinate the interaction of threads that acquire a lock on the same object by using the Monitor.Wait
, Monitor.Pulse
, and Monitor.PulseAll
methods.
The following example uses the Monitor.Enter, Monitor.Wait, and Monitor.Pulse methods to implement a simple single-slot blocking channel:
var ch = new BlockingChannel<object>();
ThreadPool.QueueUserWorkItem(_ =>
{
for (int i = 0; i < 10; i++)
{
ch.Add(i);
}
ch.Add(null!);
});
foreach (var v in ch)
{
Console.Write($"{v} ");
}
// A single-slot blocking channel: Add blocks while the slot is occupied and
// Get blocks while it is empty; Monitor.Wait/Pulse coordinate the two sides.
class BlockingChannel<T> : IEnumerable<T> where T : class, new()
{
private readonly object lockObj = new();
private bool _isEmpty = true;
private T? _val;
public void Add(T value)
{
Monitor.Enter(lockObj);
try
{
while (!_isEmpty)
{
Monitor.Wait(lockObj);
}
_isEmpty = false;
_val = value;
Monitor.Pulse(lockObj);
}
finally
{
Monitor.Exit(lockObj);
}
}
public T? Get()
{
Monitor.Enter(lockObj);
try
{
while (_isEmpty)
{
Monitor.Wait(lockObj);
}
_isEmpty = true;
Monitor.Pulse(lockObj);
return _val;
}
finally
{
Monitor.Exit(lockObj);
}
}
public IEnumerator<T> GetEnumerator()
{
while (true)
{
T? val = Get();
if (val == null) break;
yield return val;
}
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
// $ dotnet run
// 0 1 2 3 4 5 6 7 8 9
The System.Threading.Mutex class, like Monitor, grants exclusive access to a shared resource. Use one of the Mutex.WaitOne
method overloads to request the ownership of a mutex. Like Monitor, Mutex has thread affinity and the thread that acquired a mutex must release it by calling the Mutex.ReleaseMutex
method.
Unlike Monitor
, the Mutex
class can be used for inter-process synchronization. To do that, use a named mutex, which is visible throughout the operating system. To create a named mutex instance, use a Mutex constructor that specifies a name. You can also call the Mutex.OpenExisting
method to open an existing named system mutex.
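As a minimal sketch of inter-process mutual exclusion, the snippet below uses a named mutex; the name "Global\MyAppMutex" is a hypothetical example, and any two processes that open the same name synchronize on the same OS-level mutex:

using var mutex = new Mutex(initiallyOwned: false, name: "Global\\MyAppMutex");

if (mutex.WaitOne(TimeSpan.FromSeconds(5))) // request ownership, with a timeout
{
    try
    {
        Console.WriteLine("This process owns the mutex.");
        // ... work with the shared, cross-process resource here ...
    }
    finally
    {
        mutex.ReleaseMutex(); // must be released by the owning thread
    }
}
else
{
    Console.WriteLine("Timed out waiting for the mutex.");
}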
The System.Threading.SpinLock structure, like Monitor, grants exclusive access to a shared resource based on the availability of a lock. When SpinLock attempts to acquire a lock that is unavailable, it waits in a loop, repeatedly checking until the lock becomes available.
using System.Text; // StringBuilder

SpinLock sl = new SpinLock();
StringBuilder sb = new StringBuilder();
// Action taken by each parallel job.
// Append to the StringBuilder 10000 times, protecting
// access to sb with a SpinLock.
Action action = () =>
{
bool gotLock = false;
for (int i = 0; i < 10000; i++)
{
gotLock = false;
try
{
sl.Enter(ref gotLock);
sb.Append(i % 10);
}
finally
{
// Only give up the lock if you actually acquired it
if (gotLock) { sl.Exit(); }
}
}
};
// Invoke 3 concurrent instances of the action above
Parallel.Invoke(action, action, action);
// Check/Show the results
Console.WriteLine("sb.Length = {0} (should be 30000)", sb.Length);
Console.WriteLine("number of occurrences of '5' in sb: {0} (should be 3000)",
sb.ToString().Where(c => (c == '5')).Count());
The System.Threading.ReaderWriterLockSlim class grants exclusive access to a shared resource for writing and allows multiple threads to access the resource simultaneously for reading. You might want to use ReaderWriterLockSlim
to synchronize access to a shared data structure that supports thread-safe read operations but requires exclusive access to perform write operations. When a thread requests exclusive access (for example, by calling the ReaderWriterLockSlim.EnterWriteLock
method), subsequent reader and writer requests block until all existing readers have exited the lock, and the writer has entered and exited the lock.
using System.Diagnostics.CodeAnalysis; // [MaybeNullWhen]

class SynchronizedDictionary<TKey, TValue> : IDisposable where TKey : notnull
{
private readonly Dictionary<TKey, TValue> _dictionary = new Dictionary<TKey, TValue>();
private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
public void Add(TKey key, TValue value)
{
_lock.EnterWriteLock();
try
{
_dictionary.Add(key, value);
}
finally { _lock.ExitWriteLock(); }
}
public void TryAddValue(TKey key, TValue value)
{
_lock.EnterUpgradeableReadLock();
try
{
if (_dictionary.TryGetValue(key, out var res) && res != null && res.Equals(value)) return;
_lock.EnterWriteLock();
try
{
_dictionary[key] = value;
}
finally { _lock.ExitWriteLock(); }
}
finally { _lock.ExitUpgradeableReadLock(); }
}
public bool TryGetValue(TKey key, [MaybeNullWhen(false)] out TValue value)
{
_lock.EnterReadLock();
try
{
return _dictionary.TryGetValue(key, out value);
}
finally { _lock.ExitReadLock(); }
}
private bool _disposed;
protected virtual void Dispose(bool disposing)
{
if (!_disposed)
{
if (disposing)
{
// perform managed resource cleanup here
_lock.Dispose();
}
// perform unmanaged resource cleanup here
_disposed = true;
}
}
~SynchronizedDictionary() => Dispose(disposing: false);
public void Dispose()
{
Dispose(disposing: true);
GC.SuppressFinalize(this);
}
}
The System.Threading.Semaphore and System.Threading.SemaphoreSlim classes limit the number of threads that can access a shared resource or a pool of resources concurrently. Additional threads that request the resource wait until any thread releases the semaphore. Because the semaphore doesn’t have thread affinity, a thread can acquire the semaphore and another one can release it.
SemaphoreSlim is a lightweight alternative to Semaphore and can be used only for synchronization within a single process boundary.
On Windows, you can use Semaphore for inter-process synchronization. To do that, create a Semaphore instance that represents a named system semaphore by using one of the Semaphore constructors that specifies a name or the Semaphore.OpenExisting
method. SemaphoreSlim doesn’t support named system semaphores.
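A minimal sketch of SemaphoreSlim limiting in-process concurrency; the count of 2, the delay, and the method name are arbitrary illustration values:

var gate = new SemaphoreSlim(initialCount: 2); // at most two concurrent holders

async Task UseResourceAsync(int id)
{
    await gate.WaitAsync(); // asynchronously wait for a free slot
    try
    {
        Console.WriteLine($"{id} entered at {DateTime.Now:HH:mm:ss.fff}");
        await Task.Delay(200); // simulate work on the shared resource
    }
    finally
    {
        gate.Release(); // hand the slot to a waiting caller
    }
}

await Task.WhenAll(Enumerable.Range(0, 5).Select(UseResourceAsync));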
Thread interaction (or thread signaling) means that a thread must wait for notification, or a signal, from one or more threads in order to proceed. For example, if thread A calls the Thread.Join
method of thread B, thread A is blocked until thread B completes. The synchronization primitives described in the preceding section provide a different mechanism for signaling: by releasing a lock, a thread notifies another thread that it can proceed by acquiring the lock.
The System.Threading.EventWaitHandle class represents a thread synchronization event.
A synchronization event can be either in an unsignaled or signaled state. When the state of an event is unsignaled, a thread that calls the event’s WaitOne
overload is blocked until an event is signaled. The EventWaitHandle.Set
method sets the state of an event to signaled.
The behavior of an EventWaitHandle that has been signaled depends on its reset mode:
An EventWaitHandle created with the EventResetMode.AutoReset
flag resets automatically after releasing a single waiting thread. It’s like a turnstile that allows only one thread through each time it’s signaled. The System.Threading.AutoResetEvent class, which derives from EventWaitHandle, represents that behavior.
An EventWaitHandle created with the EventResetMode.ManualReset
flag remains signaled until its Reset
method is called. It’s like a gate that is closed until signaled and then stays open until someone closes it. The System.Threading.ManualResetEvent class, which derives from EventWaitHandle, represents that behavior. The System.Threading.ManualResetEventSlim class is a lightweight alternative to ManualResetEvent.
On Windows, you can use EventWaitHandle for inter-process synchronization. To do that, create an EventWaitHandle instance that represents a named system synchronization event by using one of the EventWaitHandle constructors that specifies a name or the EventWaitHandle.OpenExisting
method.
Event wait handles are not .NET events. There are no delegates or event handlers involved. The word "event" is used to describe them because they have traditionally been referred to as operating-system events, and because the act of signaling the wait handle indicates to waiting threads that an event has occurred.
Event Wait Handles That Reset Automatically [7]
You create an automatic reset event by specifying EventResetMode.AutoReset
when you create the EventWaitHandle
object. As its name implies, this synchronization event resets automatically when signaled, after releasing a single waiting thread. Signal the event by calling its Set
method.
Automatic reset events are usually used to provide exclusive access to a resource for a single thread at a time. A thread requests the resource by calling the WaitOne
method. If no other thread is holding the wait handle, the method returns true and the calling thread has control of the resource.
If an automatic reset event is signaled when no threads are waiting, it remains signaled until a thread attempts to wait on it. The event releases the thread and immediately resets, blocking subsequent threads.
Event Wait Handles That Reset Manually [7]
You create a manual reset event by specifying EventResetMode.ManualReset
when you create the EventWaitHandle
object. As its name implies, this synchronization event must be reset manually after it has been signaled. Until it is reset, by calling its Reset
method, threads that wait on the event handle proceed immediately without blocking.
A manual reset event acts like the gate of a corral. When the event is not signaled, threads that wait on it block, like horses in a corral. When the event is signaled, by calling its Set
method, all waiting threads are free to proceed. The event remains signaled until its Reset
method is called. This makes the manual reset event an ideal way to hold up threads that need to wait until one thread finishes a task.
Like horses leaving a corral, it takes time for the released threads to be scheduled by the operating system and to resume execution. If the Reset
method is called before all the threads have resumed execution, the remaining threads once again block. Which threads resume and which threads block depends on random factors like the load on the system, the number of threads waiting for the scheduler, and so on. This is not a problem if the thread that signals the event ends after signaling, which is the most common usage pattern. If you want the thread that signaled the event to begin a new task after all the waiting threads have resumed, you must block it until all the waiting threads have resumed. Otherwise, you have a race condition, and the behavior of your code is unpredictable.
EventWaitHandle ewh = new EventWaitHandle(false, EventResetMode.ManualReset);
ThreadPool.QueueUserWorkItem(_ =>
{
ewh.WaitOne();
Console.WriteLine("FooSingled");
});
ThreadPool.QueueUserWorkItem(_ =>
{
ewh.WaitOne();
Console.WriteLine("BarSingled");
});
ewh.Set();
Thread.Sleep(1000);
// $ dotnet run
// BarSignaled
// FooSignaled
The System.Threading.CountdownEvent class represents an event that becomes set when its count is zero. While CountdownEvent.CurrentCount
is greater than zero, a thread that calls CountdownEvent.Wait
is blocked. Call CountdownEvent.Signal
to decrement an event’s count.
In contrast to ManualResetEvent
or ManualResetEventSlim
, which you can use to unblock multiple threads with a signal from one thread, you can use CountdownEvent to unblock one or more threads with signals from multiple threads.
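A small sketch: three workers each signal once, and the waiting thread proceeds when the count reaches zero (the count and sleep time are illustrative):

var countdown = new CountdownEvent(initialCount: 3);

for (int i = 0; i < 3; i++)
{
    ThreadPool.QueueUserWorkItem(_ =>
    {
        Thread.Sleep(100);  // simulate work
        countdown.Signal(); // decrement the event's count
    });
}

countdown.Wait(); // blocks until the count reaches zero
Console.WriteLine("All workers signaled.");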
The System.Threading.Barrier class represents a thread execution barrier. A thread that calls the Barrier.SignalAndWait
method signals that it reached the barrier and waits until other participant threads reach the barrier. When all participant threads reach the barrier, they proceed and the barrier is reset and can be used again.
You might use Barrier when one or more threads require the results of other threads before proceeding to the next computation phase.
The System.Threading.Interlocked class provides static methods that perform simple atomic operations on a variable. Those atomic operations include addition, increment and decrement, exchange and conditional exchange that depends on a comparison, and read operation of a 64-bit integer value.
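For example, a short sketch of atomic increments from several threads; the iteration counts are arbitrary:

long counter = 0;

// Ten workers each perform 100,000 atomic increments; a plain counter++
// here would lose updates to races.
Parallel.For(0, 10, _ =>
{
    for (int i = 0; i < 100_000; i++)
    {
        Interlocked.Increment(ref counter);
    }
});

Console.WriteLine(Interlocked.Read(ref counter)); // 1000000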
The System.Threading.SpinWait structure provides support for spin-based waiting. You might want to use it when a thread has to wait for an event to be signaled or a condition to be met, but when the actual wait time is expected to be less than the waiting time required by using a wait handle or by otherwise blocking the thread. By using SpinWait, you can specify a short period of time to spin while waiting, and then yield (for example, by waiting or sleeping) only if the condition was not met in the specified time.
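A sketch using the static SpinWait.SpinUntil helper, which spins briefly and then yields or sleeps if the condition stays false; the flag and delay are illustrative:

bool ready = false;

ThreadPool.QueueUserWorkItem(_ =>
{
    Thread.Sleep(10); // simulate a very short delay before the condition holds
    Volatile.Write(ref ready, true);
});

// Spins first, then falls back to yielding/sleeping for longer waits.
SpinWait.SpinUntil(() => Volatile.Read(ref ready));
Console.WriteLine("Condition met.");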
The System.Collections.Concurrent namespace includes several collection classes that are both thread-safe and scalable. Multiple threads can safely and efficiently add or remove items from these collections, without requiring additional synchronization in user code. When you write new code, use the concurrent collection classes whenever multiple threads write to the collection concurrently. If you’re only reading from a shared collection, then you can use the classes in the System.Collections.Generic namespace.
Some of the concurrent collection types use lightweight synchronization mechanisms such as SpinLock
, SpinWait
, SemaphoreSlim
, and CountdownEvent
. These synchronization types typically use busy spinning for brief periods before they put the thread into a true Wait
state. When wait times are expected to be short, spinning is far less computationally expensive than waiting, which involves an expensive kernel transition. For collection classes that use spinning, this efficiency means that multiple threads can add and remove items at a high rate.
The ConcurrentQueue<T>
and ConcurrentStack<T>
classes don’t use locks at all. Instead, they rely on Interlocked
operations to achieve thread safety.
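For instance, a minimal ConcurrentQueue<T> sketch with many concurrent producers and a single consumer; the sizes are arbitrary:

using System.Collections.Concurrent;

var queue = new ConcurrentQueue<int>();

// Many threads enqueue concurrently without any user-level locking.
Parallel.For(0, 1000, i => queue.Enqueue(i));

int sum = 0;
while (queue.TryDequeue(out int item)) // non-blocking, lock-free dequeue
{
    sum += item;
}
Console.WriteLine(sum); // 499500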
The following table lists the collection types in the System.Collections.Concurrent namespace:
Type | Description
---|---
BlockingCollection<T> | Provides bounding and blocking functionality for any type that implements IProducerConsumerCollection<T>.
ConcurrentDictionary<TKey,TValue> | Thread-safe implementation of a dictionary of key-value pairs.
ConcurrentQueue<T> | Thread-safe implementation of a FIFO (first-in, first-out) queue.
ConcurrentStack<T> | Thread-safe implementation of a LIFO (last-in, first-out) stack.
ConcurrentBag<T> | Thread-safe implementation of an unordered collection of elements.
IProducerConsumerCollection<T> | The interface that a type must implement to be used in a BlockingCollection<T>.
* BlockingCollection<T>
and Channel<T>
are both useful for producer/consumer scenarios where one thread or task is producing data and another thread or task is consuming that data. However, their implementation and features are quite different, and they are designed to handle different use cases.
BlockingCollection<T>
is part of the System.Collections.Concurrent
namespace and was introduced in .NET Framework 4.0. It provides a thread-safe, blocking and bounded collection that can be used with multiple producers and consumers.
Benefits of BlockingCollection<T>
:
It simplifies thread communication, as it blocks and waits when trying to add to a full collection or take from an empty one.
It provides Add
and Take
methods for managing the collection which, if the collection is bounded, block when it is full (on Add) or empty (on Take), respectively; see the sketch after this list.
It implements IEnumerable<T>
, allowing easy enumeration of the items in the collection.
It has built-in functionality for creating a complete producer/consumer on top of any IProducerConsumerCollection<T>
.
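A minimal producer/consumer sketch with a bounded BlockingCollection<T>; the capacity and item counts are illustrative:

using System.Collections.Concurrent;

var collection = new BlockingCollection<int>(boundedCapacity: 5);

var producer = Task.Run(() =>
{
    for (int i = 0; i < 10; i++)
    {
        collection.Add(i); // blocks while the collection is full
    }
    collection.CompleteAdding(); // signal that no more items will arrive
});

// Blocks until items are available; the loop ends once CompleteAdding has
// been called and the collection is drained.
foreach (int item in collection.GetConsumingEnumerable())
{
    Console.Write($"{item} ");
}
await producer;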
Channel<T>
is part of the System.Threading.Channels
namespace and was introduced in .NET Core 3.0. It’s newer and designed for the modern .NET threading infrastructure using async and await design patterns. [8]
Benefits of Channel<T>
:
It supports the async programming model and can be used with async
and await
keywords in C#.
It is designed for scenarios where you have asynchronous data streams that need to be processed.
It provides both synchronous and asynchronous methods for adding (Writer.TryWrite
, Writer.WriteAsync
) and receiving (Reader.TryRead
, Reader.ReadAsync
) data.
It supports back pressure by naturally making the producer wait if the channel is full.
It allows for creating unbounded or bounded channels via Channel.CreateUnbounded<T>
and Channel.CreateBounded<T>
.
In general, Channel<T>
is more modern and better integrated with async programming model. Therefore, for newer applications it is recommended to use the Channel<T>
class.
However, if you have a legacy application where you cannot use async and await extensively, or where you are using ThreadPool and Tasks heavily, then BlockingCollection<T>
might be a better choice.
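For comparison, here is the same producer/consumer shape with a bounded Channel<T>, written with async/await throughout; the capacity and counts are again illustrative:

using System.Threading.Channels;

var channel = Channel.CreateBounded<int>(capacity: 5);

var producer = Task.Run(async () =>
{
    for (int i = 0; i < 10; i++)
    {
        await channel.Writer.WriteAsync(i); // waits while the channel is full
    }
    channel.Writer.Complete(); // no more items
});

// Asynchronously consumes items until the channel is completed and drained.
await foreach (int item in channel.Reader.ReadAllAsync())
{
    Console.Write($"{item} ");
}
await producer;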
Typically, WPF applications start with two threads: one for handling rendering and another for managing the UI. The rendering thread effectively runs hidden in the background while the UI thread receives input, handles events, paints the screen, and runs application code. Most applications use a single UI thread, although in some situations it is best to use several. [11]
The UI thread queues work items inside an object called a Dispatcher. The Dispatcher selects work items on a priority basis and runs each one to completion. Every UI thread must have at least one Dispatcher, and each Dispatcher can execute work items in exactly one thread.
The trick to building responsive, user-friendly applications is to maximize the Dispatcher throughput by keeping the work items small. This way items never get stale sitting in the Dispatcher queue waiting for processing. Any perceivable delay between input and response can frustrate a user.
How then are WPF applications supposed to handle big operations? What if your code involves a large calculation or needs to query a database on some remote server? Usually, the answer is to handle the big operation in a separate thread, leaving the UI thread free to tend to items in the Dispatcher queue. When the big operation is complete, it can report its result back to the UI thread for display.
If only one thread can modify the UI, how do background threads interact with the user? A background thread can ask the UI thread to perform an operation on its behalf. It does this by registering a work item with the Dispatcher of the UI thread. The Dispatcher class provides the methods for registering work items: Dispatcher.InvokeAsync
, Dispatcher.BeginInvoke
, and Dispatcher.Invoke
. These methods schedule a delegate for execution. Invoke
is a synchronous call – that is, it doesn’t return until the UI thread actually finishes executing the delegate. InvokeAsync
and BeginInvoke
are asynchronous and return immediately.
The volatile
keyword indicates that a field might be modified by multiple threads that are executing at the same time. The compiler, the runtime system, and even hardware may rearrange reads and writes to memory locations for performance reasons. Fields that are declared volatile are excluded from certain kinds of optimizations. There is no guarantee of a single total ordering of volatile writes as seen from all threads of execution. [9]
On a multiprocessor system, a volatile read operation does not guarantee to obtain the latest value written to that memory location by any processor. Similarly, a volatile write operation does not guarantee that the value written would be immediately visible to other processors.
The volatile
keyword can be applied to fields of these types:
Reference types.
Pointer types (in an unsafe context). Note that although the pointer itself can be volatile, the object that it points to cannot. In other words, you cannot declare a "pointer to volatile."
Simple types such as sbyte, byte, short, ushort, int, uint, char, float, and bool.
An enum type with one of the following base types: byte, sbyte, short, ushort, int, or uint.
Generic type parameters known to be reference types.
IntPtr and UIntPtr.
Other types, including double and long, cannot be marked volatile because reads and writes to fields of those types cannot be guaranteed to be atomic. To protect multi-threaded access to those types of fields, use the Interlocked
class members or protect access using the lock
statement.
The volatile
keyword can only be applied to fields of a class or struct. Local variables cannot be declared volatile.
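The classic use is a stop flag polled by a worker loop; a minimal sketch, with illustrative names:

class Worker
{
    // Written by one thread, polled by another; volatile keeps the read in
    // the loop from being cached in a register.
    private volatile bool _shouldStop;

    public void RequestStop() => _shouldStop = true;

    public void DoWork()
    {
        while (!_shouldStop)
        {
            // ... do a unit of work ...
        }
        Console.WriteLine("Worker stopped.");
    }
}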
.NET provides three patterns for performing asynchronous operations:
Task-based Asynchronous Pattern (TAP), which uses a single method to represent the initiation and completion of an asynchronous operation. TAP was introduced in .NET Framework 4. It’s the recommended approach to asynchronous programming in .NET; a minimal example follows this list. The async
and await
keywords in C# and the Async
and Await
operators in Visual Basic add language support for TAP.
Event-based Asynchronous Pattern (EAP)
, which is the event-based legacy model for providing asynchronous behavior. It requires a method that has the Async
suffix and one or more events, event handler delegate types, and EventArg-derived types. EAP was introduced in .NET Framework 2.0. It’s no longer recommended for new development.
Asynchronous Programming Model (APM)
pattern (also called the IAsyncResult pattern), which is the legacy model that uses the IAsyncResult
interface to provide asynchronous behavior. In this pattern, asynchronous operations require Begin
and End
methods (for example, BeginWrite
and EndWrite
to implement an asynchronous write operation). This pattern is no longer recommended for new development.
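A minimal TAP sketch, assuming a console app with network access; the method name and URL are illustrative:

// One Task-returning method represents the whole asynchronous operation.
async Task<int> GetPageLengthAsync(HttpClient client, string url)
{
    string content = await client.GetStringAsync(url); // suspension point
    return content.Length;
}

using var http = new HttpClient();
Console.WriteLine(await GetPageLengthAsync(http, "https://example.com"));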
Lazy initialization of an object means that its creation is deferred until it is first used. (For this topic, the terms lazy initialization and lazy instantiation are synonymous.) Lazy initialization is primarily used to improve performance, avoid wasteful computation, and reduce program memory requirements. [12]
Although you can write your own code to perform lazy initialization, we recommend that you use Lazy<T> instead. Lazy<T> and its related types also support thread-safety and provide a consistent exception propagation policy.
Type | Description
---|---
Lazy<T> | A wrapper class that provides lazy initialization semantics for any class library or user-defined type.
ThreadLocal<T> | Resembles Lazy<T>, except that it provides lazy initialization semantics on a thread-local basis: every thread has access to its own value.
LazyInitializer | Provides advanced static (Shared in Visual Basic) methods for lazy initialization of objects without the overhead of a class.
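A small Lazy<T> sketch; the factory and values are illustrative:

// The factory runs only on the first access of lazy.Value; the thread-safe
// mode guards against concurrent first accesses.
var lazy = new Lazy<string>(() =>
{
    Console.WriteLine("Initializing...");
    return "expensive value";
}, isThreadSafe: true);

Console.WriteLine(lazy.IsValueCreated); // False
Console.WriteLine(lazy.Value);          // prints "Initializing..." then the value
Console.WriteLine(lazy.Value);          // cached; the factory doesn't run again
Console.WriteLine(lazy.IsValueCreated); // True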
Many personal computers and workstations have multiple CPU cores that enable multiple threads to be executed simultaneously. To take advantage of the hardware, you can parallelize your code to distribute work across multiple processors. [13]
In the past, parallelization required low-level manipulation of threads and locks. Visual Studio and .NET enhance support for parallel programming by providing a runtime, class library types, and diagnostic tools. These features, which were introduced in .NET Framework 4, simplify parallel development. You can write efficient, fine-grained, and scalable parallel code in a natural idiom without having to work directly with threads or the thread pool.
The following illustration provides a high-level overview of the parallel programming architecture in .NET.
The Task Parallel Library (TPL) is a set of public types and APIs in the System.Threading and System.Threading.Tasks namespaces. The purpose of the TPL is to make developers more productive by simplifying the process of adding parallelism and concurrency to applications. The TPL dynamically scales the degree of concurrency to use all the available processors most efficiently. In addition, the TPL handles the partitioning of the work, the scheduling of threads on the ThreadPool, cancellation support, state management, and other low-level details. By using TPL, you can maximize the performance of your code while focusing on the work that your program is designed to accomplish.
Data parallelism refers to scenarios in which the same operation is performed concurrently (that is, in parallel) on elements in a source collection or array. In data parallel operations, the source collection is partitioned so that multiple threads can operate on different segments concurrently. [14]
The Task Parallel Library (TPL) supports data parallelism through the System.Threading.Tasks.Parallel class. This class provides method-based parallel implementations of for
and foreach
loops (For
and For Each
in Visual Basic). You write the loop logic for a Parallel.For
or Parallel.ForEach
loop much as you would write a sequential loop. You do not have to create threads or queue work items. In basic loops, you do not have to take locks. The TPL handles all the low-level work for you.
using System.Diagnostics; // Stopwatch

string path = Path.Combine(
Environment.GetFolderPath(Environment.SpecialFolder.UserProfile), ".nuget/packages/");
string[] fileNames = Directory.GetFiles(path, "*", SearchOption.AllDirectories);
Stopwatch sw = Stopwatch.StartNew();
for (int i = 0; i < 2; i++)
{
sw.Restart();
long parallelTotalSize = 0;
Parallel.ForEach(fileNames,
fileName => Interlocked.Add(ref parallelTotalSize, new FileInfo(fileName).Length));
Console.WriteLine($"Parallel: {parallelTotalSize}, {sw.ElapsedMilliseconds}ms");
sw.Restart();
long totalSize = 0;
foreach (string fileName in fileNames) totalSize += new FileInfo(fileName).Length;
Console.WriteLine($"Sequential : {totalSize}, {sw.ElapsedMilliseconds}ms");
}
// $ dotnet run
// Parallel: 2743226084, 400ms
// Sequential : 2743226084, 598ms
// Parallel: 2743226084, 220ms
// Sequential : 2743226084, 429ms
The Task Parallel Library (TPL) provides dataflow components to help increase the robustness of concurrency-enabled applications. These dataflow components are collectively referred to as the TPL Dataflow Library. This dataflow model promotes actor-based programming by providing in-process message passing for coarse-grained dataflow and pipelining tasks. The dataflow components build on the types and scheduling infrastructure of the TPL and integrate with the C#, Visual Basic, and F# language support for asynchronous programming. These dataflow components are useful when you have multiple operations that must communicate with one another asynchronously or when you want to process data as it becomes available.
The TPL Dataflow Library provides a foundation for message passing and parallelizing CPU-intensive and I/O-intensive applications that have high throughput and low latency. Because the runtime manages dependencies between data, you can often avoid the requirement to synchronize access to shared data. In addition, because the runtime schedules work based on the asynchronous arrival of data, dataflow can improve responsiveness and throughput by efficiently managing the underlying threads.
The TPL Dataflow Library consists of dataflow blocks, which are data structures that buffer and process data. The TPL defines three kinds of dataflow blocks: source blocks, target blocks, and propagator blocks.
A source block acts as a source of data and can be read from.
A target block acts as a receiver of data and can be written to.
A propagator block acts as both a source block and a target block, and can be read from and written to.
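A minimal pipeline sketch: a propagator (TransformBlock) linked to a target (ActionBlock). This assumes the System.Threading.Tasks.Dataflow NuGet package is referenced; the doubling transform is illustrative:

using System.Threading.Tasks.Dataflow;

var transform = new TransformBlock<int, int>(n => n * 2);      // propagator block
var print = new ActionBlock<int>(n => Console.Write($"{n} ")); // target block

// Completion flows from the transform to the printer when linked this way.
transform.LinkTo(print, new DataflowLinkOptions { PropagateCompletion = true });

for (int i = 0; i < 5; i++) transform.Post(i);
transform.Complete();
await print.Completion; // output: 0 2 4 6 8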
The Task Parallel Library (TPL) is based on the concept of a task, which represents an asynchronous operation. In some ways, a task resembles a thread or ThreadPool work item but at a higher level of abstraction. The term task parallelism refers to one or more independent tasks running concurrently. Tasks provide two primary benefits: [15]
More efficient and more scalable use of system resources.
Behind the scenes, tasks are queued to the ThreadPool, which has been enhanced with algorithms that determine and adjust to the number of threads. These algorithms provide load balancing to maximize throughput. This process makes tasks relatively lightweight, and you can create many of them to enable fine-grained parallelism.
More programmatic control than is possible with a thread or work item.
Tasks and the framework built around them provide a rich set of APIs that support waiting, cancellation, continuations, robust exception handling, detailed status, custom scheduling, and more.
For both reasons, TPL is the preferred API for writing multi-threaded, asynchronous, and parallel code in .NET.
Language-Integrated Query (LINQ) is the name for a set of technologies based on the integration of query capabilities directly into the C# language.
Traditionally, queries against data are expressed as simple strings without type checking at compile time or IntelliSense support. Furthermore, you have to learn a different query language for each type of data source: SQL databases, XML documents, various Web services, and so on.
With LINQ, a query is a first-class language construct, just like classes, methods, and events. [19]
In-memory data
There are two ways you enable LINQ querying of in-memory data. If the data is of a type that implements IEnumerable<T>
, you query the data by using LINQ to Objects. If it doesn’t make sense to enable enumeration by implementing the IEnumerable<T>
interface, you define LINQ standard query operator methods, either in that type or as extension methods for that type. Custom implementations of the standard query operators should use deferred execution to return the results.
Remote data
The best option for enabling LINQ querying of a remote data source is to implement the IQueryable<T>
interface.
At compile time, query expressions are converted to standard query operator method calls according to the rules defined in the C# specification. Any query that can be expressed by using query syntax can also be expressed by using method syntax. In some cases, query syntax is more readable and concise. In others, method syntax is more readable. There’s no semantic or performance difference between the two different forms.
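For instance, the two forms below are equivalent; the compiler turns the query expression into the method calls shown beneath it:

int[] numbers = { 5, 10, 8, 3, 6, 12 };

// Query syntax...
var evensQuery =
    from n in numbers
    where n % 2 == 0
    orderby n
    select n;

// ...compiles to the equivalent method syntax.
var evensMethod = numbers.Where(n => n % 2 == 0).OrderBy(n => n);

Console.WriteLine(string.Join(" ", evensQuery));  // 6 8 10 12
Console.WriteLine(string.Join(" ", evensMethod)); // 6 8 10 12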
Parallel LINQ (PLINQ) is a parallel implementation of the Language-Integrated Query (LINQ) pattern. PLINQ implements the full set of LINQ standard query operators as extension methods for the System.Linq namespace and has additional operators for parallel operations. PLINQ combines the simplicity and readability of LINQ syntax with the power of parallel programming. [20]
A PLINQ query in many ways resembles a non-parallel LINQ to Objects query. PLINQ queries, just like sequential LINQ queries, operate on any in-memory IEnumerable
or IEnumerable<T>
data source, and have deferred execution, which means they do not begin executing until the query is enumerated. The primary difference is that PLINQ attempts to make full use of all the processors on the system. It does this by partitioning the data source into segments, and then executing the query on each segment on separate worker threads in parallel on multiple processors. In many cases, parallel execution means that the query runs significantly faster.
The System.Linq.ParallelEnumerable class exposes almost all of PLINQ’s functionality. It includes implementations of all the standard query operators that LINQ to Objects supports, although it does not attempt to parallelize each one.
In addition to the standard query operators, the ParallelEnumerable class contains a set of methods that enable behaviors specific to parallel execution. These PLINQ-specific methods are listed in the following table.
ParallelEnumerable Operator | Description
---|---
AsParallel | The entry point for PLINQ. Specifies that the rest of the query should be parallelized, if it is possible.
AsSequential | Specifies that the rest of the query should be run sequentially, as a non-parallel LINQ query.
AsOrdered | Specifies that PLINQ should preserve the ordering of the source sequence for the rest of the query, or until the ordering is changed, for example by the use of an orderby (Order By in Visual Basic) clause.
AsUnordered | Specifies that PLINQ for the rest of the query is not required to preserve the ordering of the source sequence.
WithCancellation | Specifies that PLINQ should periodically monitor the state of the provided cancellation token and cancel execution if it is requested.
WithDegreeOfParallelism | Specifies the maximum number of processors that PLINQ should use to parallelize the query.
WithMergeOptions | Provides a hint about how PLINQ should, if it is possible, merge parallel results back into just one sequence on the consuming thread.
WithExecutionMode | Specifies whether PLINQ should parallelize the query even when the default behavior would be to run it sequentially.
ForAll | A multithreaded enumeration method that, unlike iterating over the results of the query, enables results to be processed in parallel without first merging back to the consumer thread.
Aggregate overload | An overload that is unique to PLINQ and enables intermediate aggregation over thread-local partitions, plus a final aggregation function to combine the results of all partitions.
using System.Diagnostics; // Stopwatch

IEnumerable<string> files = Directory.EnumerateFiles("/usr/share/man", "*.gz", SearchOption.AllDirectories);
Stopwatch sw = Stopwatch.StartNew();
for (int i = 0; i < 2; i++)
{
sw.Restart();
var parallelLetters = files.AsParallel()
.Select(SplitLetters)
.SelectMany(w => w)
.GroupBy(char.ToLower)
.OrderByDescending(g => g.Count())
.First();
Console.WriteLine($"Parallel: {parallelLetters.Key}: {parallelLetters.Count()}, {sw.ElapsedMilliseconds}ms");
sw.Restart();
var sequentialLetters = files // .AsParallel().AsSequential()
.Select(SplitLetters)
.SelectMany(w => w)
.GroupBy(char.ToLower)
.OrderByDescending(g => g.Count())
.First();
Console.WriteLine($"Sequential: {sequentialLetters.Key}: {sequentialLetters.Count()}, {sw.ElapsedMilliseconds}ms");
}
static IEnumerable<char> SplitLetters(string fileName)
{
using StreamReader reader = new StreamReader(fileName);
string? line;
while ((line = reader.ReadLine()) != null)
{
foreach (char c in line.ToCharArray())
{
if (char.IsLetter(c))
yield return c;
}
}
}
// $ dotnet run
// Parallel: e: 251378, 2242ms
// Sequential: e: 251378, 1996ms
// Parallel: e: 251378, 1133ms
// Sequential: e: 251378, 1824ms
You can avoid performance bottlenecks and enhance the overall responsiveness of your application by using asynchronous programming. However, traditional techniques for writing asynchronous applications can be complicated, making them difficult to write, debug, and maintain.
C# supports a simplified approach, async programming, that leverages asynchronous support in the .NET runtime. The compiler does the difficult work that the developer used to do, and your application retains a logical structure that resembles synchronous code. As a result, you get all the advantages of asynchronous programming with a fraction of the effort. [16]
Asynchrony is essential for activities that are potentially blocking, such as web access. Access to a web resource sometimes is slow or delayed. If such an activity is blocked in a synchronous process, the entire application must wait. In an asynchronous process, the application can continue with other work that doesn’t depend on the web resource until the potentially blocking task finishes.
Asynchrony proves especially valuable for applications that access the UI thread because all UI-related activity usually shares one thread. If any process is blocked in a synchronous application, all are blocked. Your application stops responding, and you might conclude that it has failed when instead it’s just waiting.
When you use asynchronous methods, the application continues to respond to the UI. You can resize or minimize a window, for example, or you can close the application if you don’t want to wait for it to finish.
The async-based approach adds the equivalent of an automatic transmission to the list of options that you can choose from when designing asynchronous operations. That is, you get all the benefits of traditional asynchronous programming but with much less effort from the developer.
Async methods are intended to be non-blocking operations. An await expression in an async method doesn’t block the current thread while the awaited task is running. Instead, the expression signs up the rest of the method as a continuation and returns control to the caller of the async method.
The async
and await
keywords don’t cause additional threads to be created. Async methods don’t require multithreading because an async method doesn’t run on its own thread. The method runs on the current synchronization context and uses time on the thread only when the method is active. You can use Task.Run
to move CPU-bound work to a background thread, but a background thread doesn’t help with a process that’s just waiting for results to become available.
If you specify that a method is an async method by using the async
modifier, you enable the following two capabilities.
The marked async method can use await
to designate suspension points. The await operator tells the compiler that the async method can’t continue past that point until the awaited asynchronous process is complete. In the meantime, control returns to the caller of the async method.
The suspension of an async method at an await expression doesn’t constitute an exit from the method, and finally blocks don’t run.
The marked async method can itself be awaited by methods that call it.
An async method typically contains one or more occurrences of an await operator, but the absence of await expressions doesn’t cause a compiler error. If an async method doesn’t use an await
operator to mark a suspension point, the method executes as a synchronous method does, despite the async
modifier. The compiler issues a warning for such methods.
SynchronizationContext was also introduced in .NET Framework 2.0, as an abstraction for a general scheduler. In particular, SynchronizationContext’s most used method is Post
, which queues a work item to whatever scheduler is represented by that context. [17]
Consider a UI framework like Windows Forms. As with most UI frameworks on Windows, controls are associated with a particular thread, and that thread runs a message pump which runs work that’s able to interact with those controls: only that thread should try to manipulate those controls, and any other thread that wants to interact with the controls should do so by sending a message to be consumed by the UI thread’s pump. Windows Forms makes this easy with methods like Control.BeginInvoke
, which queues the supplied delegate and arguments to be run by whatever thread is associated with that Control. You can thus write code like this:
private void button1_Click(object sender, EventArgs e)
{
ThreadPool.QueueUserWorkItem(_ =>
{
string message = ComputeMessage();
button1.BeginInvoke(() =>
{
button1.Text = message;
});
});
}
That will offload the ComputeMessage()
work to be done on a ThreadPool thread (so as to keep the UI responsive while it’s being processed), and then when that work has completed, queue a delegate back to the thread associated with button1
to update button1’s label. Easy enough. WPF has something similar, just with its Dispatcher
type:
private void button1_Click(object sender, RoutedEventArgs e)
{
ThreadPool.QueueUserWorkItem(_ =>
{
string message = ComputeMessage();
button1.Dispatcher.InvokeAsync(() =>
{
button1.Content = message;
});
});
}
Each application model then publishes, as SynchronizationContext.Current, a SynchronizationContext-derived type that does the "right thing." For example, Windows Forms has this:
public sealed class WindowsFormsSynchronizationContext : SynchronizationContext, IDisposable
{
public override void Post(SendOrPostCallback d, object? state) =>
_controlToSendTo?.BeginInvoke(d, new object?[] { state });
...
}
and WPF has this:
public sealed class DispatcherSynchronizationContext : SynchronizationContext
{
public override void Post(SendOrPostCallback d, Object state) =>
_dispatcher.BeginInvoke(_priority, d, state);
...
}
SynchronizationContext makes it possible to call reusable helpers and automatically be scheduled back whenever and to wherever the calling environment deems fit. As a result, it’s natural to expect that to "just work" with async/await, and it does.
button1.Text = await Task.Run(() => ComputeMessage());
That invocation of ComputeMessage
is offloaded to the thread pool, and upon the method’s completion, execution transitions back to the UI thread associated with the button, and the setting of its Text
property happens on that thread.
That integration with SynchronizationContext is left up to the awaiter implementation (the code generated for the state machine knows nothing about SynchronizationContext), as it’s the awaiter that is responsible for actually invoking or queueing the supplied continuation when the represented asynchronous operation completes. While a custom awaiter need not respect SynchronizationContext.Current
, the awaiters for Task
, Task<TResult>
, ValueTask
, and ValueTask<TResult>
all do. That means that, by default, when you await a Task
, a Task<TResult>
, a ValueTask
, a ValueTask<TResult>
, or even the result of a Task.Yield()
call, the awaiter by default will look up the current SynchronizationContext
and then if it successfully got a non-default one, will eventually queue the continuation to that context.
The ConfigureAwait method isn’t special: it’s not recognized in any special way by the compiler or by the runtime. It is simply a method that returns a struct (a ConfiguredTaskAwaitable
) that wraps the original task it was called on as well as the specified Boolean value. Remember that await can be used with any type that exposes the right pattern. By returning a different type, it means that when the compiler accesses the instance’s GetAwaiter
method (part of the pattern), it’s doing so off of the type returned from ConfigureAwait rather than off of the task directly, and that provides a hook to change the behavior of how the await behaves via this custom awaiter. ConfigureAwait(continueOnCapturedContext: false)
is used to avoid forcing the callback to be invoked on the original context or scheduler. [18]
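For instance, a hypothetical library helper (the name LoadAsync and its parameter are illustrative) would typically opt out of resuming on the captured context:

// Library code has no reason to resume on the caller's UI context,
// so it passes continueOnCapturedContext: false.
async Task<string> LoadAsync(string path)
{
    using var reader = new StreamReader(path);
    // The continuation may run on whatever thread completed the read.
    return await reader.ReadToEndAsync().ConfigureAwait(false);
}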
> What happens on Thread.Sleep(0) in .NET? * In .NET, it’s effectively a way to signal to the operating system that the thread is willing to give up the rest of its slice of processor time, if there are other threads that are ready to run on the same processor. This can be useful to prevent a thread from consuming too much CPU time in a busy-wait scenario, or when you want to give other threads the chance to run. Remember that if no other thread of equal priority is ready to run, the call returns immediately and the current thread keeps running.
//
// Summary:
// Sets the number of requests to the thread pool that can be active concurrently.
// All requests above that number remain queued until thread pool threads become
// available.
//
// Parameters:
// workerThreads:
// The maximum number of worker threads in the thread pool.
//
// completionPortThreads:
// The maximum number of asynchronous I/O threads in the thread pool.
//
// Returns:
// true if the change is successful; otherwise, false.
public static bool SetMaxThreads(int workerThreads, int completionPortThreads);
public static bool SetMinThreads(int workerThreads, int completionPortThreads);
> What are the worker and completion port threads in SetMaxThreads(int workerThreads, int completionPortThreads)? * The thread pool maintains two types of threads: worker threads, which for the most part handle compute-bound operations, and I/O (completion port) threads, which, as the name suggests, handle I/O-bound operations. Asynchronous I/O operations, which are operations that complete at some later point, often use callback methods to signal completion. When the system calls these callback methods, it does so using a thread from the ThreadPool. It’s typically not necessary to change the ThreadPool size, because the ThreadPool already optimizes the number of threads it uses based on the number of CPUs and the types of tasks it’s running. In most cases, the ThreadPool is efficient in managing its threads. Its algorithm aims to efficiently use all available CPU resources while not overloading the system with too many threads at the same time. It tries to maintain a balance.
> How does .NET identify I/O-bound or compute-bound operations? * The .NET runtime itself doesn’t actively categorize tasks as I/O-bound or compute-bound. It’s up to developers to ensure that appropriate APIs and techniques are used for I/O-bound and compute-bound tasks. For I/O-bound operations, which typically include file, database, and network operations, you should use asynchronous methods when they’re available to avoid blocking (or tying up) a thread from the thread pool. When an I/O-bound operation is asynchronous, this generally means the operation is offloaded to the operating system, freeing up the .NET runtime thread to perform other work. I/O bound means the data is transferred between the CPU and the peripheral devices (disk, network, etc.), during which the CPU isn’t doing much work. On the other hand, when a task is compute-bound, meaning it involves heavy computations and keeps the processor busy, developers should generally use synchronous methods and potentially multiple threads to utilize more cores and finish the computations faster. This kind of operation is typically done in parallel to split the work across multiple threads, where each thread uses synchronous code to do its part of the work. So, while .NET doesn’t automatically distinguish between I/O-bound and compute-bound operations, it provides facilities (like asynchronous methods for I/O-bound tasks and parallel processing techniques such as the TPL for compute-bound tasks) for developers to write efficient code for both types of operations.
> How to understand 'It’s up to developers…'? * What "It’s up to developers…" means is that the .NET runtime, or indeed any other runtime, cannot automatically discern if the operation you’re performing is I/O-bound or compute-bound. It’s the responsibility of the developer writing the code to understand what kind of operation they’re performing and to use the correct method to handle it. For example, use asynchronous APIs (such as Stream.ReadAsync) for I/O-bound work, and parallel constructs (such as Parallel.For) for compute-bound work.
The appropriate programming approach (asynchronous or multithreaded) is dictated by the nature of the operation. The developer must know their application, the tasks, and their nature, and then use the appropriate method. The .NET runtime will not do this discernment automatically.
> How does CLR manage the number of threads (worker and I/O threads) in the ThreadPool? * The CLR, or Common Language Runtime, uses a set algorithm to manage the number of threads in its ThreadPool. Upon application startup, the ThreadPool possesses few threads. As requests come in, threads are added to fulfill those requests. The CLR considers several factors to make an adjustment, including the workload demand, number of processors, and the number of existing threads. For each pending task or work item, a check is performed to see if a thread is available. If not, and the current thread count is below the maximum limit, a new thread is created, unless the thread injection "rate" limit has been reached. If a ThreadPool thread is idle for some time (around a minute), it gets terminated, which reduces resource usage. If the thread count reaches the max limit and the requests are still not fulfilled, they are queued. "Worker" threads are commonly used for executing synchronous and asynchronous managed code, and "I/O threads" are specifically designated for completing I/O-bound work. Both the maximum threads and "rate" are automatically set based on several factors like the size of virtual memory, number of processors available, etc. The thread injection rate (how often new threads are added) helps prevent situations where rapidly creating too many threads would cause resource saturation and decrease performance. You can manually get and change the minimum and maximum number of threads in the ThreadPool by using the ThreadPool.GetMinThreads/SetMinThreads and ThreadPool.GetMaxThreads/SetMaxThreads methods. However, it’s recommended to let the CLR manage the threading details because of its smart heuristics and tuning based on extensive research and testing. Custom changes should only be made after careful testing and analysis.
> What’s the algorithm of the thread pool in .NET? * According to .NET’s official documentation, the CLR employs a hill-climbing algorithm for adjusting the number of ThreadPool threads that can run simultaneously. This heuristic algorithm is designed specifically to find the most efficient and effective number of threads for any given workload. Here’s the idea, in broad strokes: the pool periodically adjusts the number of active threads, measures the resulting throughput (tasks completed per unit of time), and keeps moving the thread count in whichever direction improves it.
This algorithm works to maintain a balance between the number of threads and the amount of work that needs to be done. By ensuring threads are only created as needed, and cleaning up idle threads, it helps to optimize the system resources and performance. It’s important to note that the .NET ThreadPool’s algorithm is intentionally not publicly documented in detail because it’s subject to change as Microsoft fine-tunes it based on new findings and requirements. As always, it’s recommended to let the CLR manage the ThreadPool for you, unless you have a specific reason and have done careful performance analysis.
> About the Interlocked class in .NET, what if we increment a 64-bit integer on 32-bit hardware? * The Interlocked class still guarantees atomicity in that case. In a multithreaded or parallel computing scenario, multiple threads might be trying to increment the value of a shared 64-bit integer simultaneously. In such a case, Interlocked.Increment ensures that the read-modify-write sequence happens as one indivisible operation. A plain operation on a 64-bit integer (long/Int64) on 32-bit hardware requires two CPU instructions, because the size of the long value is greater than the word size of the 32-bit processor. But even then, the .NET runtime ensures via the Interlocked class that the operation is performed atomically. Still, it’s important to note that such atomic operations come with a slightly increased performance overhead as compared to non-atomic operations, due to preventing reordering and caching of the value being incremented. Thus, they must be used judiciously.
> How does CLR make the two CPU instructions atomic? * The .NET Common Language Runtime (CLR) and the underlying operating system provide certain guarantees to ensure atomicity of operations on 64-bit values, even on 32-bit platforms. On Windows, for example, this can be achieved using a combination of OS-provided synchronization primitives and specific processor instructions; on 32-bit x86, for instance, a LOCK-prefixed CMPXCHG8B instruction can compare and exchange a full 64-bit value atomically. When you call an Interlocked method such as Interlocked.Increment on a long, the runtime compiles it down to such an instruction. These atomic operations prevent any thread interference or inconsistencies while performing read-modify-write actions by ensuring that the operation is completed as a single, uninterruptible unit.
> I heard there are some risks on atomic operations in Go or sth else? * Yes, coding with atomic operations can indeed be risky, not just in Go but in any language, if they are misused or misunderstood. Here are a few reasons why this is the case: atomic operations only protect a single access to a single variable, so invariants spanning several variables can still race; their memory-ordering semantics are subtle and depend on the hardware, the compiler, and the language memory model; and lock-free algorithms built on them are prone to issues such as the ABA problem (see the next question).
In Go, for example, the sync/atomic package’s own documentation advises that, except for special low-level applications, synchronization is better done with channels or the facilities of the sync package. So while atomic operations are a powerful tool in concurrent programming, they are best suited for a limited set of problems and should be used judiciously and cautiously, usually when performance is a deciding factor and you’re comfortable and proficient with the complexities of concurrent system behavior and memory models.
> What’s the ABA problem? * The ABA problem is a particularly tricky issue that can occur in multi-threaded or concurrent programs that use lock-free data structures and make use of atomic primitives, such as Compare-and-Swap (CAS). Here’s a brief explanation of the ABA problem: Thread A reads a shared value and observes A, then gets preempted. Meanwhile, Thread B changes the value from A to B and then back to A. When Thread A resumes and performs its compare-and-swap, the operation succeeds because the value is again A, even though the underlying state changed in between.
The result here is that Thread A acted on the assumption that nothing had changed when actually something had, but it was effectively invisible to it. This can have outcomes ranging from inconsequential to catastrophic, depending on what the shared data is used for. This is the ABA problem. This issue is typically resolved by assigning a version number or a timestamp along with every change so that the revisiting thread can check not only the value but whether it is the same instance of the value it observed before. This kind of solution is often referred to as "tagging". Double-width Compare-and-Swap (DWCAS) or similar atomic primitives are used to atomically change the value and its associated version number or tag.
> How to understand 'hardware, compilers, and the language memory model'? * Understanding 'hardware, compilers, and the language memory model' is important when working with lower-level concurrent constructs such as atomic operations or volatile fields. Let me break them down: hardware can make memory operations visible out of program order (caches, store buffers, out-of-order execution); compilers and the JIT may reorder, combine, or eliminate reads and writes as optimizations; and the language memory model defines which ordering and visibility guarantees your code may rely on despite all of that.
All these can significantly impact the correctness, performance, and portability of multi-threaded code. Learning about them usually involves studying computer architecture, compilers, operating systems, and of course, the programming language’s specification. For most developers, however, using higher-level synchronization primitives and libraries that abstract these details away is sufficient and recommended.
> Anyway, for a single operation like Interlocked.Increment, is it safe? * Yes, for the Interlocked.Increment method, atomicity is guaranteed. This method guarantees that the increment operation, which is otherwise a read-modify-write operation consisting of multiple steps, is executed in a single, unbroken step. No context switch or thread interference can happen in the middle of the operation that could leave the data in an inconsistent state. It can be used safely in a multithreaded and concurrent environment, ensuring that all incrementing operations on a shared variable are done in full, one at a time. Please be aware that while each individual Interlocked call is atomic, a sequence of such calls is not; combining several operations still requires a lock or another synchronization mechanism.
> How to understand the volatile keyword in .NET? And how to understand "There is no guarantee of a single total ordering of volatile writes as seen from all threads of execution."? * The volatile keyword tells the compiler and the runtime that a field may be accessed by multiple threads at once, so reads and writes of it must not be cached in registers or reordered in certain ways. In multithreaded applications, the keyword is typically applied to a simple field (for example, a flag) that one thread writes and another thread polls. The quote "There is no guarantee of a single total ordering of volatile writes as seen from all threads of execution" means that the order in which threads observe volatile writes may not be consistent across all threads. While each individual thread sees its own volatile reads and writes in a constrained order, different threads can disagree about the global order of writes to different memory locations. Here’s an example to illustrate:
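(A reconstructed sketch; the fields _a and _b and the two methods are hypothetical names for illustration.)

class VolatileOrdering
{
    private volatile int _a;
    private volatile int _b;

    public void Writer() // Thread 1
    {
        _a = 1; // a volatile write cannot be reordered after the later write to _b
        _b = 1;
    }

    public void Reader() // Thread 2
    {
        if (_b == 1)
        {
            // The volatile read of _b has acquire semantics, so the read of _a
            // below cannot move before it; the value observed still depends on
            // timing relative to the writer.
            Console.WriteLine(_a);
        }
    }
}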
The volatile keyword ensures that Thread 1 will write to variable A before variable B. If Thread 2 reads variable B, it will see this write (or a later one). However, a volatile read of variable A performed elsewhere can still return an older value. This is because a volatile read has only 'acquire' semantics; it doesn’t force a refresh of all other modifications. This is why it’s often recommended to use higher-level synchronization constructs (like the lock statement, Monitor, or Interlocked) when multiple variables must stay consistent with each other.
> How to understand "On a multiprocessor system, a volatile read operation does not guarantee to obtain the latest value written to that memory location by any processor."? * This statement draws attention to an important fact about volatile keyword usage in .NET on a multiprocessor system where threads could be executing on different processors. The volatile keyword in C# tells the compiler not to reorder reads or writes to a field, or cache the field in a register, which helps ensure that all reads and writes on one processor occur in the order they’re specified in the code. Because of this, the volatile keyword is helpful for ensuring consistency in a multithreaded context. However, it does not guarantee immediate visibility on all processors. For example, if one processor writes to a volatile field and another processor reads from that field, the reading processor might not get the most recent value written by the writing processor. The reasons can include delay due to propagation of write from one processor’s cache to others (in a multiprocessor, multi-caching systems) or the order in which these reads and writes are performed. This is why it’s stated that "On a multiprocessor system, a volatile read operation does not guarantee to obtain the latest value written to that memory location by any processor." In general, for complex synchronization or if strong cross-processor memory guarantees are required, usage of higher-level synchronization primitives or constructs (like locks, |
> What’s the diff of the volatile keyword and the Volatile class? * The volatile keyword is a field modifier: every read and write of that field is treated as volatile, everywhere in the code. The Volatile class instead provides static Volatile.Read and Volatile.Write methods that let you perform a volatile access on demand, on an individual read or write. The Volatile class also works in places where the keyword cannot be used, for example on long and double fields, where Volatile.Read and Volatile.Write additionally guarantee atomicity. So, while both are used for volatile memory operations, the keyword applies blanket semantics to a field, whereas the Volatile class gives you explicit, per-access control.
> It’s about the happens-before relation in a memory model. It seems we should avoid using the volatile keyword? * In general, understanding and correctly using the volatile keyword requires a solid grasp of the memory model, and it’s easy to get wrong. The primary purpose of volatile is to guarantee visibility and limited ordering for a single field, not mutual exclusion. However, using it correctly for anything beyond a simple flag is subtle. Higher-level concurrency constructs (the lock statement, Monitor, Interlocked, SemaphoreSlim, and so on) express intent more clearly and are much harder to misuse.
In the end, prefer those higher-level constructs, and reach for volatile only in well-understood, low-level scenarios.
> What’s the diff of asynchronous and parallel programming in .NET? * In .NET, both asynchronous and parallel programming paradigms are used to execute code that can potentially run concurrently to improve performance. However, they are used in different scenarios and address different types of problems.
Asynchronous programming is about freeing up the main executing thread to do other work when it would otherwise be blocked, usually waiting for I/O operations (like reading a file, downloading from the internet, querying a database, etc.). The goal of asynchronous programming is to make the application more responsive. With async programming, the application can yield control while it waits on these long-running tasks, allowing it to perform other tasks concurrently. In .NET, you often see the async and await keywords and Task-returning APIs used for this.
Parallel programming, on the other hand, is about executing multiple operations, or tasks, at the same time by utilizing multiple cores of the CPU. This is particularly helpful for CPU-bound work that can be broken down into smaller, independent tasks and processed in parallel. For example, if you’re processing an array of elements, you can use parallel programming to process multiple elements at the same time on different cores. In .NET, this could be achieved using the Parallel class (Parallel.For, Parallel.ForEach), PLINQ, or tasks from the Task Parallel Library; see the sketch below.
In summary, use asynchronous programming when dealing with I/O-bound tasks to avoid blocking and keep the application responsive, and use parallel programming when breaking CPU-intensive work into pieces that run simultaneously to improve throughput.
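A minimal sketch of the two paradigms side by side (FetchAsync and SumOfSquares are illustrative names):

using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

class AsyncVsParallel
{
    // I/O-bound: async frees the calling thread while the download is in flight.
    static async Task<string> FetchAsync(HttpClient client, string url) =>
        await client.GetStringAsync(url);

    // CPU-bound: Parallel.For spreads the loop across the available cores.
    static long SumOfSquares(int[] data)
    {
        long total = 0;
        Parallel.For(0, data.Length,
            () => 0L,                                     // per-thread running total
            (i, _, local) => local + (long)data[i] * data[i],
            local => Interlocked.Add(ref total, local));  // merge once per thread
        return total;
    }
}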
> What’s the 'control' meaning in async and await programming? * In the context of asynchronous programming with async and await, 'control' refers to the flow of execution. When you call an async method and execution reaches an await on a task that hasn’t completed yet, the method suspends itself and returns control to its caller, meaning the calling thread is free to continue with other work instead of blocking. When the awaited operation (such as an I/O operation, web request, etc.) completes, the remainder of the async method, its continuation, is scheduled to resume. So "yielding control" in async/await means temporarily handing execution back to the caller while waiting, rather than occupying a thread for the duration of the wait.
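A minimal console sketch of control moving back and forth; the numbered lines typically print in order 1 through 5:

using System;
using System.Threading.Tasks;

class ControlDemo
{
    static async Task Main()
    {
        Console.WriteLine("1: caller starts");
        Task work = DoWorkAsync();   // runs synchronously until its first await
        Console.WriteLine("3: control is back in the caller");
        await work;
        Console.WriteLine("5: caller resumes after the work completes");
    }

    static async Task DoWorkAsync()
    {
        Console.WriteLine("2: inside DoWorkAsync, before await");
        await Task.Delay(50);        // yields control back to the caller here
        Console.WriteLine("4: continuation runs after the delay");
    }
}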
> What’s underneath the 'control', like a thread or something else? * "Control" primarily refers to the sequence of execution of code, the flow of a program; it is not itself a thread. When we talk about control in the context of async/await, we mean which piece of code the current thread will run next.
At a lower level in the .NET runtime, this "control" is facilitated by the task scheduler and, in many scenarios, the synchronization context. Here’s a simplified view: when a method awaits an incomplete operation (e.g., a Task that has not finished), the compiler-generated state machine captures the current SynchronizationContext (or, failing that, the current TaskScheduler) and registers the rest of the method as a continuation to be run through it.
In a UI-based application, the synchronization context is typically tied to the main UI thread, which allows the continuation (the code following the await keyword) to be executed on that same UI thread. This makes async programming easier and safer in a UI context. In an environment without a SynchronizationContext (like console applications or thread pool threads), the default TaskScheduler schedules the continuation to run on the ThreadPool.
With this in mind, when you write await, you are effectively saying: suspend here, hand control back to the caller, and resume this method later via the captured context or scheduler. Again, this is a simplified view. In reality, there are several optimizations and mechanisms in place to make async programming more efficient and easier to use.
> Does it mean the 'someTask' will be executed on another thread? * Not necessarily. When you call an async method, it does its work on the same thread that called it until it awaits an incomplete task. At the await, if the task isn’t already complete, the async method is paused and control returns to the caller of the async method. This all happens on the same thread, and no extra threads are created. What happens next depends on what the awaited task is doing. Here’s a simple way to think about it:
When the awaited operation completes, the remainder of the async method is scheduled to continue running. The scheduling depends on the context captured when the 'await' keyword was encountered. In a UI application, it resumes on the UI thread (as that is the captured context); in a server application, it generally runs on a ThreadPool thread, not necessarily the same thread it started on. Remember, in most cases, async doesn’t mean concurrent or multi-threaded. Async methods don’t require multithreading because an async method doesn’t run on its own thread: the method runs on the current synchronization context and uses time on a thread only when the method is active. You can use Task.Run to move CPU-bound work to a background thread explicitly when that is what you need.
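A minimal console sketch that makes this visible by printing managed thread IDs (the exact IDs vary from run to run):

using System;
using System.Threading.Tasks;

class ThreadIdDemo
{
    static async Task Main()
    {
        Console.WriteLine($"Before await: thread {Environment.CurrentManagedThreadId}");
        await Task.Delay(100);  // no thread is blocked while the delay is pending
        // A console app has no SynchronizationContext, so the continuation
        // runs on a ThreadPool thread, often a different one than before.
        Console.WriteLine($"After await:  thread {Environment.CurrentManagedThreadId}");
    }
}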
> How to understand "Async methods don’t require multithreading because an async method doesn’t run on its own thread."? * In asynchronous programming, the fundamental misunderstanding is often thinking that "asynchronous" automatically means "on a different thread". This is not strictly true. When you mark a method with An At that point, assuming the task being awaited has not yet completed, the method yields and the remainder of the method is turned into a continuation that will be invoked once the awaited operation completes. What’s important to understand here is that the continuation does not have to run on the same thread where it left off - it could run on whatever thread the completed task calls back on. So an However, if you were to use So to sum it up, |
> Can the async/await improve the responsiveness on ASP.NET Core? * Yes, utilizing async/await can significantly improve the scalability of an ASP.NET Core application, and with it the responsiveness under load. The primary benefit of using async/await on the server is thread economy: request-handling threads are not held hostage by I/O waits. In a synchronous ASP.NET Core application without async/await, each in-flight request occupies a thread for its entire duration, so under heavy load the thread pool can be exhausted and new requests must queue. In ASP.NET Core, when you use async methods and await async tasks, the thread handling a request can be used for starting another request as soon as the current one enters a waiting state (awaiting an I/O-bound operation). So by leveraging async/await, the same pool of threads can serve far more concurrent requests.
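As an illustration, a minimal sketch of an async endpoint; OrdersController, IOrderRepository, and FindAsync are hypothetical names, not part of ASP.NET Core itself:

using Microsoft.AspNetCore.Mvc;
using System.Threading.Tasks;

public interface IOrderRepository          // hypothetical dependency
{
    Task<object?> FindAsync(int id);
}

[ApiController]
[Route("api/orders")]
public class OrdersController : ControllerBase
{
    private readonly IOrderRepository _repository;

    public OrdersController(IOrderRepository repository) => _repository = repository;

    [HttpGet("{id}")]
    public async Task<IActionResult> Get(int id)
    {
        // While the lookup is in flight, this request thread goes back to the
        // pool and can start serving other requests.
        var order = await _repository.FindAsync(id);
        if (order is null) return NotFound();
        return Ok(order);
    }
}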
> It seems that it is because the I/O operations are offloaded from the current thread, and release the CPU capacity. But who should be in charge of the I/O operations? * Yes, that’s correct; the benefit of async I/O comes precisely from offloading the wait. The operating system is in charge of coordinating and managing I/O operations. When you start an I/O-bound operation, like reading from disk or making a network request, the request is handed to the operating system, which interfaces with the necessary hardware, such as the disk controller or network card. From your application’s perspective, once you start an I/O operation and await it, no thread in your process is tied up while the hardware does its work; when the operation completes, the OS notifies the runtime (on Windows, for example, through I/O completion ports) and the awaiting method’s continuation is scheduled. So you can think of I/O-bound async work as "no thread at all while waiting", rather than "another thread doing the waiting for you".
> Actually, if we use sync methods, the current thread will be in wait status for the I/O, but it still can free up CPU. * Yes, you’re correct. Even with synchronous I/O operations, the CPU isn’t actively engaged the whole time; the operating system still manages the I/O operation, and the CPU is freed up to perform other tasks. However, the key difference lies in the usage of threads. In synchronous operations, the thread performing the operation goes into a wait state and cannot be used for anything else. This means the application can handle fewer concurrent operations, as it might run out of available threads if many I/O operations are waiting to complete. With asynchronous operations using async/await, the thread is released back to the thread pool as soon as the wait begins and can serve other work in the meantime. So while both sync and async free up the CPU during I/O operations, async also allows the thread itself to be reused, leading to more efficient utilization of system resources.
> This is the non-blocking I/O in essence? * Exactly; using async/await over truly asynchronous APIs is non-blocking I/O in essence. In blocking or synchronous I/O, the execution thread is halted, or "blocked", until the I/O operation completes. This means the thread can’t be used for any other tasks, regardless of how long the I/O operation takes. Non-blocking or asynchronous I/O, on the other hand, starts the I/O operation and then allows the execution thread to continue with other tasks. It doesn’t wait for the I/O operation to complete; instead, it is notified once the I/O operation is done, at which point it can proceed with any steps that depend on the result. Thus, by enabling non-blocking I/O, the async/await pattern lets a small number of threads drive a large number of concurrent I/O operations.
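A minimal sketch of the effect: a hundred requests in flight at once, driven by only a handful of threads (https://example.com is a placeholder URL):

using System;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

class NonBlockingDemo
{
    static async Task Main()
    {
        using var client = new HttpClient();
        var tasks = Enumerable.Range(0, 100)
                              .Select(_ => client.GetStringAsync("https://example.com"));
        // While all 100 requests are pending, no thread in the process is
        // blocked; the OS signals completions and continuations run on the pool.
        string[] bodies = await Task.WhenAll(tasks);
        Console.WriteLine(bodies.Length);
    }
}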
* Here’s a summary of the key points regarding the use of async/await for I/O-bound work:
During an I/O operation the CPU is largely idle either way; the operating system drives the actual I/O.
With synchronous I/O, the calling thread blocks for the duration and cannot serve other work.
With async/await, the thread is released during the wait and reused, which is non-blocking I/O in essence.
This thread economy is what improves scalability and responsiveness under load, especially on servers.
> Is there a SynchronizationContext on ASP.NET Core? * No, there’s no SynchronizationContext in ASP.NET Core. In the traditional .NET Framework, several environments install one: WinForms and WPF each provide a UI synchronization context that marshals await continuations back to the UI thread. In ASP.NET on the .NET Framework, an AspNetSynchronizationContext ensured that continuations resumed with the original request context, which is also why ConfigureAwait(false) became a common optimization in library code of that era. However, ASP.NET Core does not have a SynchronizationContext at all; continuations after an await simply resume on a thread pool thread, so ConfigureAwait(false) is not needed for context reasons in ASP.NET Core application code. With that said, developers need to be aware that the code after an await may run on a different thread than the code before it, so they should not rely on thread-affine state across an await.
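For library code that may also run on .NET Framework hosts where a context does exist, ConfigureAwait(false) remains reasonable hygiene; a minimal sketch (FileUtil is an illustrative name):

using System.IO;
using System.Threading.Tasks;

public static class FileUtil
{
    // ConfigureAwait(false) tells the awaiter not to capture a
    // SynchronizationContext. On ASP.NET Core there is none, so this is
    // effectively a no-op there, but it still protects callers on
    // context-bound hosts such as WPF or classic ASP.NET.
    public static async Task<string> ReadAllTextAsync(string path)
    {
        using var reader = new StreamReader(path);
        return await reader.ReadToEndAsync().ConfigureAwait(false);
    }
}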
> What’s the diff of AsOrdered and AsUnordered in PLINQ? * In PLINQ, the methods AsOrdered and AsUnordered control whether a parallel query preserves the ordering of its source sequence. AsOrdered tells PLINQ to treat the source as ordered, so results are produced in source order; this requires the query to track where each element came from and to buffer results, which adds overhead. AsUnordered tells PLINQ that ordering is not required from that point onward in the query, removing the bookkeeping and letting results be yielded as soon as they are computed.
It’s important to note that, by default, PLINQ does not preserve the original order of the source; you opt in with AsOrdered (or an explicit OrderBy) when ordering matters, and AsUnordered can switch ordering back off for later stages of the same query. Partitioning is the process in which PLINQ divides the source data into multiple chunks, or partitions, so that it can process them independently and concurrently on multiple cores. Order preservation affects this partitioning process: an ordered query must remember each element’s original position and buffer results before yielding them, which constrains the partitioning strategies PLINQ can use and adds coordination cost.
Generally, if the order in which results are produced doesn’t matter, you should use AsUnordered (or simply not request ordering in the first place), since unordered queries avoid that bookkeeping and usually run faster and scale better.
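A minimal sketch contrasting the two (the first five ordered squares are 0, 1, 4, 9, 16):

using System;
using System.Linq;

class PlinqDemo
{
    static void Main()
    {
        var numbers = Enumerable.Range(0, 1000);

        // Opt in to source ordering: results come back in source order.
        int[] ordered = numbers.AsParallel()
                               .AsOrdered()
                               .Select(n => n * n)
                               .Take(5)
                               .ToArray();

        // No ordering requested (the PLINQ default): results may arrive in
        // any order, skipping the buffering that order preservation needs.
        int[] unordered = numbers.AsParallel()
                                 .Select(n => n * n)
                                 .ToArray();

        Console.WriteLine(string.Join(", ", ordered));  // 0, 1, 4, 9, 16
        Console.WriteLine(unordered.Length);            // 1000
    }
}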