A PostgreSQL extension providing an async networking interface accessible via SQL using a background worker and curl.

Overview
Issues
  • feat: libuv

    feat: libuv

    Fixes #46, #13.

    This PR integrates the libuv event loop with curl_multi_socket_action to replace the curl_multi_perform-based approach. Incremental change wasn't really possible, so this is practically a half-rewrite (sorry about the mess!)

    The main changes are:

    • Use libuv as the event loop. Previously we also had a loop that mostly dealt with TTL and submitting new requests to curl. We integrate this logic into an idle handle - basically it lets you run something once per loop iteration.
    • Unlike the previous curl_multi_perform implementation, we now submit requests to curl one by one, each in its own transaction. This fixes #13 and sidesteps the need for the batch_size GUC, so I omitted it.
    • The previous implementation stored per-request information in a global hash table. We sidestep the need for one by instead using a "request context" associated with each curl easy handle via CURLOPT_PRIVATE.
    • Fixes all the memory leaks that I can eyeball (for whatever that's worth).
    • The request/response handling logic is mostly left unchanged.
    • Adds a basic .clang-format. Feel free to configure it if you prefer.
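
    The per-handle "request context" idea can be illustrated with a hedged Python analog. The Handle and RequestContext names below are illustrative only, not pg_net's actual types; the real code attaches a C struct to the easy handle via CURLOPT_PRIVATE.

    ```python
    # Hedged analog of CURLOPT_PRIVATE: keep per-request state on the
    # handle itself instead of in a global hash table keyed by handle.

    class Handle:
        """Stand-in for a curl easy handle."""
        def __init__(self):
            self.private = None  # analogous to CURLOPT_PRIVATE

    class RequestContext:
        """Illustrative per-request state."""
        def __init__(self, request_id, url):
            self.request_id = request_id
            self.url = url
            self.response_chunks = []

    def start_request(request_id, url):
        handle = Handle()
        handle.private = RequestContext(request_id, url)  # no global table needed
        return handle

    def on_done(handle):
        # Recover the context straight from the handle on completion.
        return handle.private.request_id

    handle = start_request(42, "http://example.com")
    assert on_done(handle) == 42
    ```

    The payoff is that cleanup is local: when a transfer finishes, everything about the request is reachable from the handle, so there is no global table to keep in sync.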

    Caveats:

    • Fundamentally, event loops make understanding runtime behavior harder by exploding the state space of the program. Hopefully this is mitigated by judicious elogs, but it's not perfect.
    • There's one more known failure mode, explained here.

    Blockers:

    • replicate & fix the issue with invalid URLs & https
    opened by soedirgo 18
  • Pathological behavior when `http_request_queue` grows too big

    Pathological behavior when `http_request_queue` grows too big

    Hypothetical scenario:

    • a bug in pg_net causes requests to accumulate in http_request_queue
    • a patch is released
    • all requests are fired off at once
    • DB has trouble keeping up and causes OOM/EMFILE/etc.
    • worker enters crashloop
    • DB instance is forever sad

    I haven't tried reproducing the above scenario, but we should take preventive measures against this. Some possible actions:

    • limiting the maximum number of requests in progress
    • TRUNCATE all requests on worker restart (so it doesn't enter crashloop)
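
    The first idea - capping in-flight requests - can be sketched with a hedged Python analog. MAX_IN_FLIGHT is a made-up knob here; the real worker would enforce the cap in C around curl handles.

    ```python
    import asyncio

    MAX_IN_FLIGHT = 100  # hypothetical cap; could be exposed as a GUC

    async def fire(request_id, sem):
        # Stand-in for the actual HTTP call; the semaphore bounds
        # how many requests run concurrently.
        async with sem:
            await asyncio.sleep(0)
            return request_id

    async def drain(queue_ids):
        sem = asyncio.Semaphore(MAX_IN_FLIGHT)
        return await asyncio.gather(*(fire(i, sem) for i in queue_ids))

    # Even a 5000-request backlog is drained at most 100 at a time,
    # so file descriptors and memory stay bounded.
    results = asyncio.run(drain(range(5000)))
    assert len(results) == 5000
    ```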
    bug 
    opened by soedirgo 5
  • feat: support POST

    feat: support POST

    What kind of change does this PR introduce?

    Feature

    What is the new behavior?

    Support POST requests via net.http_post(). Right now the function is mostly copy-pasted from http_get(), but I think in the future this should be refactored to use a (public?) generic function net.http().

    @olirice the order of arguments to net.http_post() is a bit out of place since required arguments have to be listed first - let me know if you want them changed. Also I see I'm failing the pytest stuff - anything I should look at? Do you want me to add a test as well? (haven't looked at how it's set up)

    @steve-chavez CMIIW, we currently ignore params and headers in the request right? Atm content_type is handled separately, but I think we should handle it together with headers. In retrospect, I probably should've omitted content_type in the SQL too... (was following pgsql-http) Also, let me know if I've freed the palloc'd stuff properly.

    Additional context

    Closes #6.

    TODOs

    • [x] merge Content-Type into headers
    • [x] convert body to byte array to align with #5
    • [x] http_post fails when body is null
    • [ ] check memory leak
    opened by soedirgo 4
  • >1024 concurrent requests crashes background worker

    >1024 concurrent requests crashes background worker

    When a large number of requests start at the same time (> 1024), the worker crashes.

    Run a webserver locally:

    Process 1

    python3 -m http.server
    

    SQL

    create extension pg_net;
    select net.http_get('http://0.0.0.0:8000') from generate_series(1,5000);
    

    Error

    postgres=# 2021-07-23 16:20:16.307 CDT [32736] ERROR:  CURL error code: 56 
    
    bug 
    opened by olirice 3
  • pg_net and http exts have the same http_method domain

    pg_net and http exts have the same http_method domain

    Bug report

    Describe the bug

    If a user already has the http extension on their database and wants to create the pg_net extension, or vice versa, they will get a type "http_method" already exists error.

    To Reproduce

    1. create extension http;
    2. create extension pg_net;
    3. See error: type "http_method" already exists

    System information

    • Version of Postgres: 12.5
    bug 
    opened by w3b6x9 2
  • Insert values into response table from the background worker

    Insert values into response table from the background worker

    • inserts response body as text into response table
    • inserts headers as jsonb in the response table
    • inserts ContentType and timedOut in response table
    • removes the uuid-ossp dependency (back to bigint)
    opened by steve-chavez 2
  • Handle `Server returned nothing (no headers, no data)` (CURLE_GOT_NOTHING)

    Handle `Server returned nothing (no headers, no data)` (CURLE_GOT_NOTHING)

    Once this error is returned, the worker stops processing new requests.

    (needs reproducing first; I need to set up an HTTP server that produces this invalid response)

    We also need to enable more tracing to debug the worker state in production.

    bug wontfix 
    opened by steve-chavez 1
  • Discrepancy Between net.http_request_queue and net._http_response records

    Discrepancy Between net.http_request_queue and net._http_response records

    Bug report

    Describe the bug

    There are cases when net._http_response would have a subset of records from net.http_request_queue even though records from both tables should ideally match.

    Expected behavior

    There should be a record in net._http_response for every record in net.http_request_queue and requests should go through successfully or be marked as failed in net._http_response.

    Additional context

    Related issue: https://github.com/supabase/supabase/issues/2997

    bug 
    opened by w3b6x9 1
  • fix: allow usage of double quotes within json bodies

    fix: allow usage of double quotes within json bodies

    A valid JSON object that includes an escaped double quote within a string causes an error.

    Not sure if this is the best way to fix this!

    postgres=# select json_build_object('foo', 'b"ar')::text;
     json_build_object
    -------------------
     {"foo" : "b\"ar"}
    (1 row)
    
    postgres=# select json_build_object('foo', 'b"ar')::text::bytea;
    ERROR:  invalid input syntax for type bytea
    
    postgres=# select convert_to(json_build_object('foo', 'b"ar')::text, 'UTF8');
                  convert_to
    --------------------------------------
     \x7b22666f6f22203a2022625c226172227d
    (1 row)
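
    For intuition, a hedged Python analog: convert_to(..., 'UTF8') is a plain character-set encoding, while the text::bytea cast re-parses backslash escapes, which is what trips on the \" sequence.

    ```python
    import json

    # A JSON string containing an escaped double quote.
    payload = {"foo": 'b"ar'}
    text = json.dumps(payload)

    # Plain UTF-8 encoding, analogous to convert_to(text, 'UTF8'):
    # no escape re-parsing happens, so the backslash survives intact.
    raw = text.encode("utf-8")
    assert raw == b'{"foo": "b\\"ar"}'
    ```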
    
    opened by darora 1
  • feat: support params directly in url

    feat: support params directly in url

    https://curl.se/libcurl/c/libcurl-url.html

    also changes the query-append code to use curl.

    leaving _urlencode in to keep the tests passing

    also granted everything to (hardcoded) postgres - probably not correct but we'll cross that bridge when we come to it

    opened by soedirgo 1
  • Handle requests errors and responses with empty body and content-type

    Handle requests errors and responses with empty body and content-type

    Fixes #4. When doing invalid requests:

    -- invalid protocol
    select net.http_get('net://supabase.io');
    
    -- invalid host name
    select net.http_get('http://new.ycombinator.com');
    

    We now get:

    select id, url, error_msg from net.http_request_queue;
    
    ┌────┬────────────────────────────┬────────────────────────────┐
    │ id │            url             │         error_msg          │
    ├────┼────────────────────────────┼────────────────────────────┤
    │  1 │ net://supabase.io          │ Unsupported protocol       │
    │  2 │ http://new.ycombinator.com │ Couldn't resolve host name │
    └────┴────────────────────────────┴────────────────────────────┘
    

    Additionally, this also handles cases where there's no body and no content-type:

    select net.http_get('http://wikipedia.org');
    
    select id, status_code, content_type, body from net.http_response;
    ┌────┬─────────────┬──────────────┬────────┐
    │ id │ status_code │ content_type │  body  │
    ├────┼─────────────┼──────────────┼────────┤
    │  1 │         301 │ [NULL]       │ [NULL] │
    └────┴─────────────┴──────────────┴────────┘
    
    opened by steve-chavez 1
  • Allow http DELETE

    Allow http DELETE

    Feature request

    Is your feature request related to a problem? Please describe.

    We want to send a delete request to a serverless function within a Next.js API to trigger the deletion of the entity at an external provider after it was deleted in our database.

    Describe the solution you'd like

    Support http delete in addition to post and get.

    Describe alternatives you've considered

    Adding another endpoint, e.g. /delete, and sending a post request there

    enhancement 
    opened by psteinroe 0
  • Change table queue to in-memory queue and add callbacks

    Change table queue to in-memory queue and add callbacks

    Reasons

    • For all the requests to finish, an INSERT on the _http_response table must be done, which reduces throughput. There are cases where the client doesn't care about the response, so it doesn't need to be persisted. For example, function hooks don't do anything with the response.
    • As discussed in https://github.com/supabase/pg_net/pull/50#issuecomment-953796674, we don't have to ensure delivery and do retries since we're just an HTTP client, so we don't really need to persist the requests into http_request_queue.
    • The http_request_queue can grow big with many requests and cause production issues (internal link).
    • Users create FKs to our private tables https://github.com/supabase/pg_net/issues/44, making the bg worker crash when trying to do the TTL.

    Proposal

    Drop the _http_response and http_request_queue tables and instead use an in-memory queue, plus add two callbacks:

    create or replace function net.http_get(
       url text,
    -- ..,
       success_cb text default '',
       error_cb text default ''
    )
    returns void -- no id is returned now
    as $$
    -- ...
    $$ language sql;
    

    Which can be used like:

    select net.http_get(
      url := 'http://domain.com'
    , success_cb := $$ insert into my_table values ($1, $2, $3) $$ -- $1=status, $2=headers, $3=body
    , error_cb := $$ do $_$begin raise exception 'Failed request on %: "%"', $1, $2; end$_$; $$ -- $1=url, $2=error_message
    );
    

    Pros

    • The callbacks are optional, so we don't have to start a transaction for each request if there are none provided. Throughput should be greatly increased from this.
    • The error_cb can also be an insert/update on a table, so the request can be retried if needed.

    @olirice @soedirgo WDYT?

    enhancement 
    opened by steve-chavez 5
  • pg_net causes database to crash if url is null

    pg_net causes database to crash if url is null

    Bug report

    Describe the bug

    If the url is null, the database crashes

    To Reproduce

    1. supabase init
    2. supabase start
    3. docker restart <your-db-container-name> (see https://github.com/supabase/cli/issues/137)
    4. Go to the SQL editor and run the following code:

    create extension if not exists pg_net;
    
    create table if not exists test (
      id uuid primary key default gen_random_uuid()
    );
    
    create or replace function test_trigger()
      returns trigger
      language plpgsql
      security invoker
    as
    $$
    begin
      perform
        net.http_post(
            url:=null,
            body:='{"hello": "world"}'::jsonb
        );
      return new;
    end
    $$;
    
    create trigger call_test_trigger
    after insert on test
    for each row execute procedure test_trigger();
    
    insert into test default values;
    

    Result: Connection terminated unexpectedly

    Expected behavior

    An exception is raised telling the user that the url cannot be null.
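
    A hedged Python sketch of the expected guard (the real fix would live in pg_net's SQL wrapper or C worker, e.g. by making the function STRICT or raising explicitly; http_post below is a stand-in, not the real function):

    ```python
    def http_post(url, body=None):
        # Reject a null/None url up front instead of letting it
        # crash the background worker downstream.
        if url is None:
            raise ValueError("url cannot be null")
        return ("queued", url, body)

    try:
        http_post(None, body='{"hello": "world"}')
    except ValueError as exc:
        assert "null" in str(exc)
    ```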

    System information

    • OS: macOS
    bug 
    opened by psteinroe 0
  • some remarks

    some remarks

    1. https://github.com/supabase/pg_net/blob/04dd22fb7cf267d55057deccdf70aed373dc7032/src/worker.c#L434-L436 - No! The idle_cb function is called in the TopMemoryContext context, and it does not pfree what it pallocs, unlike https://github.com/supabase/pg_net/blob/04dd22fb7cf267d55057deccdf70aed373dc7032/src/worker.c#L452, https://github.com/supabase/pg_net/blob/04dd22fb7cf267d55057deccdf70aed373dc7032/src/worker.c#L486, https://github.com/supabase/pg_net/blob/04dd22fb7cf267d55057deccdf70aed373dc7032/src/worker.c#L490, etc.
    bug 
    opened by RekGRpth 17
  • Optionally log based on status code responses

    Optionally log based on status code responses

    Can we optionally log something based on the status code when the response is handled?

    Default to logging 4xx - 5xx response codes would be great, and then everyone would automatically get this info in their Postgres logs.
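
    A hedged Python sketch of the proposed default (names are illustrative; the real implementation would presumably use Postgres elog/ereport from the worker):

    ```python
    import logging

    logging.basicConfig(level=logging.WARNING)
    log = logging.getLogger("pg_net")

    def maybe_log(request_id, status_code):
        # Proposed default: only 4xx/5xx responses produce a log line,
        # so failures surface in the Postgres logs without noise from 2xx/3xx.
        if 400 <= status_code < 600:
            log.warning("request %s failed with status %s", request_id, status_code)
            return True
        return False

    assert maybe_log(1, 502) is True
    assert maybe_log(2, 200) is False
    ```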

    enhancement observability 
    opened by chasers 3
Releases (v0.3)