1. What Is PostgreSQL

PostgreSQL is an object-relational database management system (ORDBMS) based on POSTGRES, Version 4.2, developed at the University of California at Berkeley Computer Science Department. POSTGRES pioneered many concepts that only became available in some commercial database systems much later. [1]

PostgreSQL is an open-source descendant of this original Berkeley code. It supports a large part of the SQL standard and offers many modern features:

  • complex queries

  • foreign keys

  • triggers

  • updatable views

  • transactional integrity

  • multiversion concurrency control

Also, PostgreSQL can be extended by the user in many ways, for example by adding new

  • data types

  • functions

  • operators

  • aggregate functions

  • index methods

  • procedural languages

2. Installation from Docker

2.1. start a postgres instance

$ docker run --name some-postgres -e POSTGRES_PASSWORD=mysecretpassword -d postgres

2.2. …​ or via psql

$ docker run -it --rm --network some-network postgres psql -h some-postgres -U postgres
psql (9.5.0)
Type "help" for help.

postgres=# SELECT 1;
(1 row)

2.3. …​ via docker-compose

    image: postgres:14
    restart: on-failure
      # The PostgreSQL image uses several environment variables which are easy to miss.
      # The only variable required is POSTGRES_PASSWORD, the rest are optional.
      # Warning: the Docker specific variables will only have an effect if you start
      # the container with a data directory that is empty; any pre-existing database
      # will be left untouched on container startup.
      # This environment variable is required for you to use the PostgreSQL image.
      # It must not be empty or undefined. This environment variable sets the superuser
      # password for PostgreSQL.
      # The default superuser is defined by the POSTGRES_USER environment variable.
      # Note 1: The PostgreSQL image sets up trust authentication locally so you may notice
      # a password is not required when connecting from localhost (inside the same container).
      # However, a password will be required if connecting from a different host/container.
      # Note 2: This variable defines the superuser password in the PostgreSQL instance, as
      # set by the initdb script during initial container startup. It has no effect on the
      # PGPASSWORD environment variable that may be used by the psql client at runtime, as
      # described at https://www.postgresql.org/docs/current/libpq-envars.html.
      # PGPASSWORD, if used, will be specified as a separate environment variable.
      POSTGRES_PASSWORD: postgres

      # This optional environment variable is used in conjunction with POSTGRES_PASSWORD to
      # set a user and its password. This variable will create the specified user with
      # superuser power and a database with the same name. If it is not specified, then the
      # default user of `postgres` will be used.
      # Be aware that if this parameter is specified, PostgreSQL will still show The files
      # belonging to this database system will be owned by user "postgres" during initialization.
      # This refers to the Linux system user (from /etc/passwd in the image) that the postgres
      # daemon runs as, and as such is unrelated to the POSTGRES_USER option.
      POSTGRES_USER: postgres

      # This optional environment variable can be used to define a different name for the
      # default database that is created when the image is first started. If it is not
      # specified, then the value of `POSTGRES_USER` will be used.
      POSTGRES_DB: postgres

      # This optional variable can be used to define another location - like a subdirectory - for
      # the database files. The default is /var/lib/postgresql/data. If the data volume you're
      # using is a filesystem mountpoint (like with GCE persistent disks) or remote folder that
      # cannot be chowned to the postgres user (like some NFS mounts), Postgres initdb recommends
      # a subdirectory be created to contain the data.
      # For example:
      # $ docker run -d \
      # 	--name some-postgres \
      # 	-e POSTGRES_PASSWORD=mysecretpassword \
      # 	-e PGDATA=/var/lib/postgresql/data/pgdata \
      # 	-v /custom/mount:/var/lib/postgresql/data \
      # 	postgres
      # This is an environment variable that is not Docker specific. Because the variable is used by
      # the postgres server binary (see the PostgreSQL docs), the entrypoint script takes it into account.
      PGDATA: /var/lib/postgresql/data/pgdata

    volumes: []
      # If you would like to do additional initialization in an image derived from this one,
      # add one or more *.sql, *.sql.gz, or *.sh scripts
      # under /docker-entrypoint-initdb.d (creating the directory if necessary). After the
      # entrypoint calls initdb to create the default postgres user and database, it will run
      # any *.sql files, run any executable *.sh scripts, and source any non-executable *.sh
      # scripts found in that directory to do further initialization before starting the service.
      # Warning: scripts in /docker-entrypoint-initdb.d are only run if you start the container
      # with a data directory that is empty; any pre-existing database will be left untouched on
      # container startup. One common problem is that if one of your /docker-entrypoint-initdb.d
      # scripts fails (which will cause the entrypoint script to exit) and your orchestrator
      # restarts the container with the already initialized data directory, it will not continue
      # on with your scripts.
      # These initialization files will be executed in sorted name order as defined by the current
      # locale, which defaults to en_US.utf8. Any *.sql files will be executed by POSTGRES_USER,
      # which defaults to the postgres superuser. It is recommended that any psql commands that are
      # run inside of a *.sh script be executed as POSTGRES_USER by using
      # the --username "$POSTGRES_USER" flag. This user will be able to connect without a password
      # due to the presence of trust authentication for Unix socket connections made inside the container.
      # - ./init.sql:/docker-entrypoint-initdb.d/1-init.sql
      # - ./data.sql:/docker-entrypoint-initdb.d/2-data.sql
$ docker-compose config
      PGDATA: /var/lib/postgresql/data/pgdata
      POSTGRES_DB: postgres
      POSTGRES_PASSWORD: postgres
      POSTGRES_USER: postgres
    image: postgres:14
    restart: on-failure
    volumes: []
version: '3.9'

$ docker-compose up -d
Creating network "pg_default" with the default driver
Creating pg_database_1 ... done

$ docker-compose exec database psql -U postgres
psql (14.2 (Debian 14.2-1.pgdg110+1))
Type "help" for help.


3. Clusters, Databases, Tables, Rows, Columns

PostgreSQL is a relational database management system (RDBMS). That means it is a system for managing data stored in relations. Relation is essentially a mathematical term for table. The notion of storing data in tables is so commonplace today that it might seem inherently obvious, but there are a number of other ways of organizing databases. Files and directories on Unix-like operating systems form an example of a hierarchical database. A more modern development is the object-oriented database. [2]

Each table is a named collection of rows. Each row of a given table has the same set of named columns, and each column is of a specific data type. Whereas columns have a fixed order in each row, it is important to remember that SQL does not guarantee the order of the rows within the table in any way (although they can be explicitly sorted for display).

Tables are grouped into databases, and a collection of databases managed by a single PostgreSQL server instance constitutes a database cluster.

A small number of objects, like role, database, and tablespace names, are defined at the cluster level and stored in the pg_global tablespace. Inside the cluster are multiple databases, which are isolated from each other but can access cluster-level objects. Inside each database are multiple schemas, which contain objects like tables and functions. So the full hierarchy is: cluster, database, schema, table (or some other kind of object, such as a function). [3]

Databases are called “catalogs” in the SQL standard.

  1. ceate a login user named 'test' with password 'test'

    CREATE USER test OWNER test PASSWORD 'test';
  2. ceate a database/catalog as the user same name, i.e. 'test'

  3. alter a database owner to a user

    $ psql -h localhost postgres postgres -W -q
    postgres=# SELECT datname AS Name, pg_get_userbyid(datdba) AS Owner FROM pg_database WHERE datname = 'testdb';
      name  | owner
     testdb | postgres
    (1 row)
    postgres=# \du
                                       List of roles
     Role name |                         Attributes                         | Member of
     postgres  | Superuser, Create role, Create DB, Replication, Bypass RLS | {}
     test      |                                                            | {}
    postgres=# ALTER DATABASE testdb OWNER TO test;
    postgres=# SELECT datname AS Name, pg_get_userbyid(datdba) AS Owner FROM pg_database WHERE datname = 'testdb';
      name  | owner
     testdb | test
    (1 row)
  4. connect the test catalog with psql

    $ psql -h localhost -W -q testdb test

4. Schemas

A PostgreSQL database cluster contains one or more named databases. Roles and a few other object types are shared across the entire cluster. A client connection to the server can only access data in a single database, the one specified in the connection request. [4]

A database contains one or more named schemas, which in turn contain tables.

  • Schemas also contain other kinds of named objects, including data types, functions, and operators.

  • The same object name can be used in different schemas without conflict; for example, both schema1 and myschema can contain tables named mytable.

  • Unlike databases, schemas are not rigidly separated: a user can access objects in any of the schemas in the database they are connected to, if they have privileges to do so.

There are several reasons why one might want to use schemas:

  • To allow many users to use one database without interfering with each other.

  • To organize database objects into logical groups to make them more manageable.

  • Third-party applications can be put into separate schemas so they do not collide with the names of other objects.

Schemas are analogous to directories at the operating system level, except that schemas cannot be nested.

4.1. Creating a Schema

To create a schema, use the CREATE SCHEMA command. Give the schema a name of your choice. For example:


To create or access objects in a schema, write a qualified name consisting of the schema name and table name separated by a dot:


Actually, the even more general syntax


can be used too, but at present this is just for pro forma compliance with the SQL standard. If you write a database name, it must be the same as the database you are connected to.

To drop a schema if it’s empty (all objects in it have been dropped), use:

DROP SCHEMA myschema;

To drop a schema including all contained objects, use:


Schema names beginning with pg_ are reserved for system purposes and cannot be created by users.

To list schema, use:

select catalog_name,schema_name,schema_owner from information_schema.schemata;

or use psql:

postgres=> \dn+

4.2. The Public Schema

By default tables (and other objects) are automatically put into a schema named “public”. Every new database contains such a schema. Thus, the following are equivalent:

CREATE TABLE products ( ... );


CREATE TABLE public.products ( ... );

4.3. The Schema Search Path

Qualified names are tedious to write, and it’s often best not to wire a particular schema name into applications anyway.

Therefore tables are often referred to by unqualified names, which consist of just the table name.

  • The system determines which table is meant by following a search path, which is a list of schemas to look in.

  • The first matching table in the search path is taken to be the one wanted.

  • If there is no match in the search path, an error is reported, even if matching table names exist in other schemas in the database.

  • The first schema named in the search path is called the current schema.

    Aside from being the first schema searched, it is also the schema in which new tables will be created if the CREATE TABLE command does not specify a schema name.

To show the current search path, use the following command:

SHOW search_path;

In the default setup this returns:

 "$user", public
  • The first element specifies that a schema with the same name as the current user is to be searched. If no such schema exists, the entry is ignored.

  • The second element refers to the public schema that we have seen already.

The first schema in the search path that exists is the default location for creating new objects.

  • That is the reason that by default objects are created in the public schema.

  • When objects are referenced in any other context without schema qualification (table modification, data modification, or query commands) the search path is traversed until a matching object is found.

  • Therefore, in the default configuration, any unqualified access again can only refer to the public schema.

To put our new schema in the path, we use:

SET search_path TO myschema,public;

4.4. Schemas and Privileges

By default, users cannot access any objects in schemas they do not own.

  • To allow that, the owner of the schema must grant the USAGE privilege on the schema.

  • To allow users to make use of the objects in the schema, additional privileges might need to be granted, as appropriate for the object.

A user can also be allowed to create objects in someone else’s schema.

  • To allow that, the CREATE privilege on the schema needs to be granted.

Note that by default, everyone has CREATE and USAGE privileges on the schema public.

This allows all users that are able to connect to a given database to create objects in its public schema.

4.5. The System Catalog Schema

In addition to public and user-created schemas, each database contains a pg_catalog schema, which contains the system tables and all the built-in data types, functions, and operators. pg_catalog is always effectively part of the search path.

If it is not named explicitly in the path then it is implicitly searched before searching the path’s schemas. This ensures that built-in names will always be findable. However, you can explicitly place pg_catalog at the end of your search path if you prefer to have user-defined names override built-in names.

Since system table names begin with pg_, it is best to avoid such names to ensure that you won’t suffer a conflict if some future version defines a system table named the same as your table. (With the default search path, an unqualified reference to your table name would then be resolved as the system table instead.) System tables will continue to follow the convention of having names beginning with pg_, so that they will not conflict with unqualified user-table names so long as users avoid the pg_ prefix.

4.6. Portability

In the SQL standard, the notion of objects in the same schema being owned by different users does not exist. Moreover, some implementations do not allow you to create schemas that have a different name than their owner.

In fact, the concepts of schema and user are nearly equivalent in a database system that implements only the basic schema support specified in the standard. Therefore, many users consider qualified names to really consist of user_name.table_name. This is how PostgreSQL will effectively behave if you create a per-user schema for every user.

Also, there is no concept of a public schema in the SQL standard. For maximum conformance to the standard, you should not use the public schema.

Of course, some SQL database systems might not implement schemas at all, or provide namespace support by allowing (possibly limited) cross-database access. If you need to work with those systems, then maximum portability would be achieved by not using schemas at all.

5. Tablespaces

Tablespaces in PostgreSQL allow database administrators to define locations in the file system where the files representing database objects can be stored. Once created, a tablespace can be referred to by name when creating database objects.

By using tablespaces, an administrator can control the disk layout of a PostgreSQL installation. This is useful in at least two ways.

First, if the partition or volume on which the cluster was initialized runs out of space and cannot be extended, a tablespace can be created on a different partition and used until the system can be reconfigured.

Second, tablespaces allow an administrator to use knowledge of the usage pattern of database objects to optimize performance. For example, an index which is very heavily used can be placed on a very fast, highly available disk, such as an expensive solid state device. At the same time a table storing archived data which is rarely used or not performance critical could be stored on a less expensive, slower disk system.

6. Database Roles, Users, Groups

PostgreSQL manages database access permissions using the concept of roles.

A role can be thought of as either a database user, or a group of database users, depending on how the role is set up.

Roles can own database objects (for example, tables and functions) and can assign privileges on those objects to other roles to control who has access to which objects.

Furthermore, it is possible to grant membership in a role to another role, thus allowing the member role to use privileges assigned to another role.

The concept of roles subsumes the concepts of “users” and “groups”.

  • In PostgreSQL versions before 8.1, users and groups were distinct kinds of entities, but now there are only roles.

  • Any role can act as a user, a group, or both.

6.1. Database Roles

Database roles are conceptually completely separate from operating system users. In practice it might be convenient to maintain a correspondence, but this is not required.

Database roles are global across a database cluster installation (and not per individual database).

To create a role use the CREATE ROLE SQL command:


To remove an existing role, use the analogous DROP ROLE command:


To determine the set of existing roles, examine the pg_roles system catalog, for example

SELECT rolname FROM pg_roles;

The psql program’s \du meta-command is also useful for listing the existing roles.

In order to bootstrap the database system, a freshly initialized system always contains one predefined role.

  • This role is always a “superuser”, and by default (unless altered when running initdb) it will have the same name as the operating system user that initialized the database cluster.

  • Customarily, this role will be named postgres.

  • In order to create more roles you first have to connect as this initial role.

Every connection to the database server is made using the name of some particular role, and this role determines the initial access privileges for commands issued in that connection.

  • The role name to use for a particular database connection is indicated by the client that is initiating the connection request in an application-specific fashion.

    For example, the psql program uses the -U command line option to indicate the role to connect as.

  • Many applications assume the name of the current operating system user by default (including createuser and psql).

    Therefore it is often convenient to maintain a naming correspondence between roles and operating system users.

6.2. Role Attributes

A database role can have a number of attributes that define its privileges and interact with the client authentication system.

6.2.1. login privilege

Only roles that have the LOGIN attribute can be used as the initial role name for a database connection. A role with the LOGIN attribute can be considered the same as a “database user”.

To create a role with login privilege, use either:

CREATE USER is equivalent to CREATE ROLE except that CREATE USER includes LOGIN by default, while CREATE ROLE does not.

6.2.2. superuser status

A database superuser bypasses all permission checks, except the right to log in. This is a dangerous privilege and should not be used carelessly; it is best to do most of your work as a role that is not a superuser.

To create a new database superuser, use CREATE ROLE name SUPERUSER. You must do this as a role that is already a superuser.

6.2.3. database creation

A role must be explicitly given permission to create databases (except for superusers, since those bypass all permission checks).

To create such a role, use CREATE ROLE name CREATEDB.

6.2.4. role creation

A role must be explicitly given permission to create more roles (except for superusers, since those bypass all permission checks).

To create such a role, use CREATE ROLE name CREATEROLE.

A role with CREATEROLE privilege can alter and drop other roles, too, as well as grant or revoke membership in them.

However, to create, alter, drop, or change membership of a superuser role, superuser status is required; CREATEROLE is insufficient for that.

6.2.5. initiating replication

A role must explicitly be given permission to initiate streaming replication (except for superusers, since those bypass all permission checks). A role used for streaming replication must have LOGIN permission as well.

To create such a role, use CREATE ROLE name REPLICATION LOGIN.

6.2.6. password

A password is only significant if the client authentication method requires the user to supply a password when connecting to the database. The password and md5 authentication methods make use of passwords. Database passwords are separate from operating system passwords.

Specify a password upon role creation with CREATE ROLE name PASSWORD 'string'.

A role’s attributes can be modified after creation with ALTER ROLE. See the reference pages for the CREATE ROLE and ALTER ROLE commands for details.

It is good practice to create a role that has the CREATEDB and CREATEROLE privileges, but is not a superuser, and then use this role for all routine management of databases and roles. This approach avoids the dangers of operating as a superuser for tasks that do not really require it.

6.3. Role Membership

It is frequently convenient to group users together to ease management of privileges: that way, privileges can be granted to, or revoked from, a group as a whole.

In PostgreSQL this is done by creating a role that represents the group, and then granting membership in the group role to individual user roles.

To set up a group role, first create the role:


Typically a role being used as a group would not have the LOGIN attribute, though you can set it if you wish.

Once the group role exists, you can add and remove members using the GRANT and REVOKE commands:

GRANT group_role TO role1, ... ;
REVOKE group_role FROM role1, ... ;

You can grant membership to other group roles, too (since there isn’t really any distinction between group roles and non-group roles). The database will not let you set up circular membership loops. Also, it is not permitted to grant membership in a role to PUBLIC.

The members of a group role can use the privileges of the role in two ways.

  • First, every member of a group can explicitly do SET ROLE to temporarily “become” the group role.

    In this state, the database session has access to the privileges of the group role rather than the original login role, and any database objects created are considered owned by the group role not the login role.

  • Second, member roles that have the INHERIT attribute automatically have use of the privileges of roles of which they are members, including any privileges inherited by those roles.

    As an example, suppose we have done:

    GRANT admin TO joe;
    GRANT wheel TO admin;

6.4. Predefined Roles

PostgreSQL provides a set of predefined roles that provide access to certain, commonly needed, privileged capabilities and information.

Administrators (including roles that have the CREATEROLE privilege) can GRANT these roles to users and/or other roles in their environment, providing those users with access to the specified capabilities and information.

7. Privileges

When an object is created, it is assigned an owner. The owner is normally the role that executed the creation statement.

  • For most kinds of objects, the initial state is that only the owner (or a superuser) can do anything with the object.

  • To allow other roles to use it, privileges must be granted.

There are different kinds of privileges: SELECT, INSERT, UPDATE, DELETE, TRUNCATE, REFERENCES, TRIGGER, CREATE, CONNECT, TEMPORARY, EXECUTE, and USAGE. The privileges applicable to a particular object vary depending on the object’s type (table, function, etc).

An object can be assigned to a new owner with an ALTER command of the appropriate kind for the object, for example

ALTER TABLE table_name OWNER TO new_owner;

Superusers can always do this; ordinary roles can only do it if they are both the current owner of the object (or a member of the owning role) and a member of the new owning role.

To assign privileges, the GRANT command is used. For example, if joe is an existing role, and accounts is an existing table, the privilege to update the table can be granted with:

GRANT UPDATE ON accounts TO joe;

Writing ALL in place of a specific privilege grants all privileges that are relevant for the object type.

The special “role” name PUBLIC can be used to grant a privilege to every role on the system.

To revoke a previously-granted privilege, use the fittingly named REVOKE command:


8. SQL Syntax: Identifiers and Key Words

SQL input consists of a sequence of commands. A command is composed of a sequence of tokens, terminated by a semicolon (“;”). The end of the input stream also terminates a command. Which tokens are valid depends on the syntax of the particular command. [5]

A token can be a key word, an identifier, a quoted identifier, a literal (or constant), or a special character symbol. Tokens are normally separated by whitespace (space, tab, newline), but need not be if there is no ambiguity (which is generally only the case if a special character is adjacent to some other token type).

For example, the following is (syntactically) valid SQL input:


8.1. Identifiers and Key Words

Tokens such as SELECT, UPDATE, or VALUES in the example above are examples of key words, that is, words that have a fixed meaning in the SQL language. The tokens MY_TABLE and A are examples of identifiers. They identify names of tables, columns, or other database objects, depending on the command they are used in. Therefore they are sometimes simply called “names”. Key words and identifiers have the same lexical structure, meaning that one cannot know whether a token is an identifier or a key word without knowing the language.

Key words and unquoted identifiers are case insensitive. Therefore:


can equivalently be written as:

uPDaTE my_TabLE SeT a = 5;

A convention often used is to write key words in upper case and names in lower case, e.g.:

UPDATE my_table SET a = 5;

There is a second kind of identifier: the delimited identifier or quoted identifier. It is formed by enclosing an arbitrary sequence of characters in double-quotes ("). A delimited identifier is always an identifier, never a key word. So "select" could be used to refer to a column or table named “select”, whereas an unquoted select would be taken as a key word and would therefore provoke a parse error when used where a table or column name is expected. The example can be written with quoted identifiers like this:

UPDATE "my_table" SET "a" = 5;

Quoting an identifier also makes it case-sensitive, whereas unquoted names are always folded to lower case. For example, the identifiers FOO, foo, and "foo" are considered the same by PostgreSQL, but "Foo" and "FOO" are different from these three and each other. (The folding of unquoted names to lower case in PostgreSQL is incompatible with the SQL standard, which says that unquoted names should be folded to upper case. Thus, foo should be equivalent to "FOO" not "foo" according to the standard. If you want to write portable applications you are advised to always quote a particular name or never quote it.)

Object name case sensitivity might be different for SQL Server and PostgreSQL.

By default, SQL Server names are case insensitive. However, you can create a case sensitive SQL Server database by changing the COLLATION property.

In PostgreSQL, object names are case insensitive. [6]

9. Walk-through PostgresSQL

$ psql -V
psql (PostgreSQL) 15.3 (Debian 15.3-0+deb12u1)
$ psql --help
psql is the PostgreSQL interactive terminal.


General options:
  -?, --help[=options]     show this help, then exit
      --help=commands      list backslash commands, then exit
      --help=variables     list special variables, then exit
Connection options:
  -h, --host=HOSTNAME      database server host or socket directory (default: "local socket")
  -p, --port=PORT          database server port (default: "5432")
  -U, --username=USERNAME  database user name (default: "x")
  -w, --no-password        never prompt for password
  -W, --password           force password prompt (should happen automatically)

For more information, type "\?" (for internal commands) or "\help" (for SQL
commands) from within psql, or consult the psql section in the PostgreSQL
$ psql -h localhost -p 5432 -W postgres postgres
psql (15.3 (Debian 15.3-0+deb12u1), server 13.10 (Debian 13.10-1.pgdg110+1))
Type "help" for help.

postgres=# CREATE DATABASE testdb;
postgres=# CREATE ROLE test LOGIN; -- CREATE USER test;
postgres=# ALTER USER test PASSWORD 'test';
postgres=# GRANT ALL ON DATABASE testdb TO test;
postgres=# exit
$ psql -h localhost -p 5432 -q -W testdb test
testdb=> \dn+
                          List of schemas
  Name  |  Owner   |  Access privileges   |      Description
 public | postgres | postgres=UC/postgres+| standard public schema
        |          | =UC/postgres         |
(1 row)