APSI is a C++ library for Asymmetric (unlabeled or labeled) Private Set Intersection.

Overview

APSI: C++ library for Asymmetric PSI

WARNING: This is research code

This library is research code and is not yet intended for production use. Use at your own risk.

Introduction

(Unlabeled) PSI and Labeled PSI

Private Set Intersection (PSI) refers to a functionality where two parties, each holding a private set of items, can check which items they have in common without revealing anything else to each other. Upper bounds on the sizes of the sets are assumed to be public information and are not protected.

The APSI (Asymmetric PSI) library provides a PSI functionality for asymmetric set sizes. For example, in many cases one party may hold a large dataset of millions of records, and the other party wishes to find out whether a single particular record or a small number of records appear in the dataset. We refer to this as APSI in unlabeled mode.

In many cases, however, the querier wishes to also retrieve some information per each record that matched. This can be viewed as a key-value store with a privacy preserving batched query capability. We use the terminology item and label to refer to the key and the value in such a key-value store, and call this APSI in labeled mode.

Note: Unless labeled mode is actually needed, it will be much more efficient (in both communication and computation) to use the unlabeled mode.

Sender and Receiver

We use the terminology sender and receiver to denote the two parties in the APSI protocol: a sender sends the result to the receiver. For example, in a common use case where a server hosts a look-up table that multiple clients can query with encrypted records. In this case the server acts as the sender, and the clients act as (independent) receivers.

How APSI Works

Homomorphic Encryption

APSI uses a relatively new encryption technology called homomorphic encryption that allows computations to be performed directly on encrypted data. Results of such computations remain encrypted and can be only decrypted by the owner of the secret key. There are many homomorphic encryption schemes with different properties; APSI uses the BFV encryption scheme implemented in the Microsoft SEAL library.

Computation on Encrypted Data

Microsoft SEAL enables computation representable with arithmetic circuits (e.g., additions and multiplications modulo a prime number) with limited depths rather than arbitrary computation on encrypted data. These computations can be done in a batched manner, where a single Microsoft SEAL ciphertext encrypts a large vector of values, and computations are done simultaneously and independently on every value in the vector; batching is crucial for APSI to achieve good performance.

Noise Budget

The capacity of computation that can be done on encrypted data is tracked by noise budget that each ciphertext carries. A freshly encrypted ciphertext has a certain amount of noise budget which is then consumed by computations – particularly multiplications. A ciphertext can no longer be decrypted correctly once its noise budget is fully consumed. To support computations of larger multiplicative depths, it is necessary to start with a larger initial noise budget, which can be done through appropriate changes to the encryption parameters.

Encryption Parameters

Homomorphic encryption schemes, such as BFV, are difficult to configure for optimal performance. APSI requires the user to explicitly provide the Microsoft SEAL encryption parameters. So we need to describe them here briefly. For much more details, we refer the reader to the examples in the Microsoft SEAL repository. We describe three important encryption parameters that the user should be familiar with.

plain_modulus is the easiest to understand. It must be a prime number congruent to 1 modulo 2 * poly_modulus_degree and defines the finite field datatype that the BFV scheme encrypts. For example, if plain_modulus is 65537 – a 16-bit prime – then the scheme encrypts integers modulo 65537, and computations on encrypted data preserves integer arithmetic modulo 65537. A larger plain_modulus leads to faster noise budget consumption. It is recommended to design computation with as small a plaint_modulus as possible.

poly_modulus_degree is a positive power-of-two integer that determines how many integers modulo plain_modulus can be encoded into a single Microsoft SEAL plaintext; typical values are 2048, 4096, 8192, and 16384. It is now easy for the reader to appreciate the value of batching: computation of thousands of values can be done at the cost of one computation on encrypted data. poly_modulus_degree also affects the security level of the encryption scheme: if other parameters remain the same, a bigger poly_modulus_degree is more secure.

coeff_modulus is a set of prime numbers that determine the noise budget of a freshly encrypted ciphertext. In Microsoft SEAL the coeff_modulus primes are rarely given explicitly by values but instead by bit counts – the library can create them. In APSI it is necessary to specify at least two primes in coeff_modulus, but it is beneficial to have as few of them as possible; using 2 – 8 primes is probably reasonable. The individual primes can be up to 60 bits. The noise budget depends linearly on the total bit count of the primes. coeff_modulus also affects the security level of the encryption scheme: if other parameters remain the same, a bigger total bit count is less secure. Thus, to obtain more computing capability, i.e., more noise budget, one needs to increase the total bit count of the coeff_modulus, and consequently may have to increase poly_modulus_degree for security. This will subsequently have an impact on the batching capability, so the computation itself may now change.

Fortunately, Microsoft SEAL prevents the user from accidentally setting insecure parameters. It checks that, for the given poly_modulus_degree, the total coeff_modulus bit count does not exceed the following bounds:

poly_modulus_degree max coeff_modulus bit count
1024 27
2048 54
4096 109
8192 218
16384 438
32768 881

In APSI, the user will need to explicitly provide the coeff_modulus prime bit counts, so the table above will be of great help in avoiding unnecessary exceptions being thrown by Microsoft SEAL.

Theory

Naive Idea

The basic idea of APSI is as follows. Suppose the sender holds a set {Y_i} of items – each an integer modulo plain_modulus – and the receiver holds a single item X – also an integer modulo plain_modulus. The receiver can choose a secret key, encrypts X to obtain a ciphertext Q = Enc(X), and sends it over to the sender. The sender can now evaluate the matching polynomial M(x) = (x - Y_0)(x - Y_1)...(x - Y_n) at x = Q. Here the values Y_i are unencrypted data held by the sender. Due to the capabilities of homomorphic encryption, M(Q) will hold an encryption of (X - Y_0)(X-Y_1)...(X-Y_n) which is zero if X matches one of the sender's items and non-zero otherwise. The sender who performs computation on X – encrypted data – will not be able to know this result due to the secret key being held only by the receiver.

One problem with the above is that the computation has an enormously high multiplicative depth. It is not uncommon for the sender to have millions or even hundreds of millions of items. This would require a very high initial noise budget and subsequently very large encryption parameters with an impossibly large computational overhead.

Lowering the Depth

The first step towards making this naive idea practical is to figure out ways of lowering the multiplicative depth of the computation. First, notice that the sender can split up its set into S equally sized parts and evaluate the matching polynomial independently on each of the parts, producing S results {M_i(Q)}. All of these results must be sent back to the receiver, so the sender-to-receiver communication has increased by a factor of S. Nevertheless, this turns out to be a really valuable trick in helping reduce the size of the encryption parameters.

The second step is to use batching in Microsoft SEAL. Per each of the S parts described above, the sender can further split its set into poly_modulus_degree many equally sized parts, and the receiver can batch-encrypt its item into a single batched query ciphertext Q = Enc([ X, X, ..., X ]). Now, the sender can evaluate vectorized versions of the matching polynomials on Q, improving the computational complexity by a factor of poly_modulus_degree and significantly reducing the multiplicative depth.

The third step is to have the receiver compute higher powers of its query, encrypt those separately, and send them all to the sender. Suppose the matching polynomials that the sender hopes to evaluate have degree d. Then, the sender will need ciphertexts encrypting all powers of the receiver's query, up to power d. Although the sender can always compute Q^2, ..., Q^d from a given Q, the computation can have high multiplicative depth even with the improvements described above. Instead, suppose the receiver precomputes certain powers of its query, encrypts them, and sends them to the sender in addition to Q. If the powers are chosen appropriately, the sender can compute all remaining necessary powers of Q with a much lower depth circuit. The receiver-to-sender communication cost increases by a factor of how many powers were sent. It is almost always beneficial to use this trick to reduce the multiplicative depth of the matching polynomials, and subsequently the size of the encryption parameters.

Cuckoo Hashing

The above techniques scale poorly when the receiver has more items. Indeed, it would seem that the query needs to be repeated once per receiver's item, so if the receiver holds 10,000 items, the communicational and computational cost would massively increase.

There is a well-known technique for fixing this issue. We use a hashing technique called cuckoo hashing, as implemented in the Kuku library. Cuckoo hashing uses multiple hash functions (usually 2 – 4) to achieve very high packing rates for a hash table with a bin size of 1. Instead of batch-encrypting its single item X into a query Q by repeating it into each batching slot, the receiver uses cuckoo hashing to insert multiple items {X_i} into a hash table of size poly_modulus_degree (the batch size) and bin size 1. The cuckoo hash table is then encrypted to form a query Q, and is sent to the sender.

The sender uses all the different cuckoo hash functions to hash its items {Y_i} into a large hash table with arbitrarily sized bins; notably, it does not use cuckoo hashing. In fact, it inserts each item multiple times – once per each cuckoo hash function. This is necessary, because the sender cannot know which of the hash functions the receiver's cuckoo hashing process ended up using for each item. If the number of cuckoo hash functions is H, then clearly this effectively increases the sender's set size by a factor of H. After hashing its items the sender breaks down its hash table into parts as described above in Lowering the Depth, and proceeds as before upon receiving Q.

The benefit is enormous. Cuckoo hashing allows dense packing of the receiver's items into a single query Q. For example, the receiver may be able to fit thousands of query items {X_i} into a single query Q, and the sender can perform the matching for all of these query items simultaneously, at the cost of increasing the sender's dataset size by a small factor H.

Large Items

Recall how each item had to be represented as an integer modulo plain_modulus. Unfortunately, plain_modulus has usually 16 – 30 bits and always less than 60 bits in Microsoft SEAL. Larger plain_modulus also causes larger noise budget consumption, lowering capability of computing on encrypted data. On the other hand, we may need to support arbitrary length items. For example, an item may be an entire document, an email address, a street address, or a driver's license number. Two tricks make this possible.

The first trick is to apply a hash function to all items on both the sender's and receiver's side, so that they have a capped standard length. We hash to 128 bits and truncate the hash to a shorter length (large enough to be collision-resistant) as necessary. The shortest item length we support (after truncation) is 80 bits, which is still far above the practical sizes of plain_modulus.

The second trick is to break up each item into multiple parts and encode them separately into consecutive batching slots. Namely, if plain_modulus is a B-bit prime, then we write only B - 1 bits of an item into a batching slot and the next B - 1 bits into the next slot. One downside is that a batched plaintext/ciphertext now only holds a fraction of poly_modulus_degree items. Typically we would use either 4 or 8 slots per item. For example, if plain_modulus is a 20-bit prime, then 4 slots could encode an item of length 80, and the query ciphertext Q (and its powers) can encrypt up to poly_modulus_degree / 4 of the receiver's items. The receiver now queries substantially fewer items than before. The solution is to decouple the cuckoo hash table size from the poly_modulus_degree and simply use two or more ciphertexts to encrypt {X_i} (and their powers).

OPRF

Unfortunately, the above approach reveals more than whether there is a match:

  1. It allows the receiver to learn if parts of its query matched;
  2. The result of the matching polynomial reveals information about the sender's data, even when there is no match. These are significant issues and unacceptable.

The solution is to use an Oblivious Pseudo-Random Function, or OPRF for short. An OPRF can be thought of as a keyed hash function OPRF(s, -) that only the sender knows; here s denotes the sender's key. Further, the receiver can obtain OPRF(s, X) without learning the function OPRF(s, -) or the key s, and without the sender learning X.

The way to do this is simple. The receiver hashes its item X to an elliptic curve point A in some cryptographically secure elliptic curve. Next, the receiver chooses a secret number r, computes the point B = rA, and sends it to the sender. The sender uses its secret s to compute C = sB, and sends it to back to the receiver. Upon receiving C, the receiver computes the inverse r^(-1) modulo the order of the elliptic curve, and further computes r^(-1) C = r^(-1) srA = sA. The receiver then extracts the OPRF hash value OPRF(s, X) from this point, for example by hashing its x-coordinate to an appropriate domain.

The sender knows s, so it can simply replace its items {Y_i} with {OPRF(s, Y_i)}. The receiver needs to communicate with the sender to obtain {OPRF(s, X_i)}; once the receiver has received these values, the protocol can proceed as described above. With OPRF, the problem of the receiver learning whether parts of its query matched goes away. Since all the items are hashed with a hash function known only by the sender, the receiver will benefit nothing from learning parts of the sender's hashed items. In fact, the sender's dataset is not private information and could in principle be sent in full to the receiver. Homomorphic encryption only protects the receiver's data.

There is one further detail that must be mentioned here. We choose OPRF(s, -) to have a 256-bit output and denote its first 128 bits by ItemHash(s, -). Instead of {OPRF(s, X_i)} we use {ItemHash(s, X_i)} as the items; the reason will be given later in Label Encryption.

Paterson-Stockmeyer

In some cases it is beneficial for the sender to use the Paterson-Stockmeyer algorithm instead for evaluating the matching and label polynomials. Consider a simple example where the matching polynomial happens to be

M(x) = 1 + 2x + 3x^2 + 4x^3 + 5x^4 + 6x^5 + 7x^6 + 8x^7 + 9x^8 + 10x^9 + 11x^10 + 12x^11 + 13x^12 + 14x^13 + 15x^14.

To evaluate M(Q) for an encrypted query Q=Enc(X), the sender must first compute all encrypted powers of X up to X^14 from some number of source powers the receiver sends to the sender (see Lowering the Depth). Next, the encrypted powers of X are multiplied with the encrypted coefficients of M and the results are added together. It is important to note that the first step (computing all powers) involves a lot of ciphertext-ciphertext multiplications, while the second step (multiply with coefficients and adding) involves only ciphertext-plaintext multiplications, which are much faster than ciphertext-ciphertext multiplications. If the sender splits it dataset into S equal parts (see Lowering the Depth), then the powers need to be computed only once and can be used repeatedly for each of the S parts.

Now consider the following approach, which is a special case of the Paterson-Stockmeyer algorithm. The polynomial M(x) can be alternatively written as:

M(x) =     (1 + 2x + 3x^2 + 4x^3) +
        x^4(5 + 6x + 7x^2 + 8x^3) +
        x^8(9 + 10x + 11x^2 + 12x^3) +
       x^12(13 + 14x + 15x^2 + 0x^3).

To evaluate M(Q), the sender needs to compute all encrypted powers of X up to X^3, and also X^4, X^8, and X^12: a total of 6 powers, as opposed to the 14 powers needed above. Next the sender needs to evaluate the four degree-3 polynomials by computing ciphertext-plaintext multiplications with the appropriate powers and adding up the terms. Then it needs to multiply the degree-3 polynomial results obtained above with appropriate powers of X^4 (either X^4, X^8, or X^12) and add up the results. One difference to the earlier approach is the number of ciphertext-ciphertext multiplications: in the first approach the number is 14 minus the number of precomputed powers the receiver sent; in the second approach it is 6 minor the number of precomputed powers the receiver sent. Another key difference is that the number of costly ciphertext-ciphertext multiplications is now proportional to S.

We decided to group the terms above into degree-3 polynomials, but we could have chosen to use either lower or higher degree polynomials instead. Different choices result in different communication-computation trade-offs. We refer to the degree of these inner polynomials as the (Paterson-Stockmeyer) low-degree. As is obvious from above, the sender must also compute powers of low-degree + 1 of the query; we call low-degree + 1 the (Paterson-Stockmeyer) high-degree.

The user may want to ensure that the multiplicative depth of computing the inner polynomials – taking into account an additional level from multiplying by the plaintext coefficients – matches the multiplicative depth of computing the powers of high-degree. This way the last multiplication will be depth-optimal.

Practice

We now begin to illustrate how the Theory is implemented in APSI. Our discussion only considers the unlabeled mode; the labeled mode is not very different, and we will discuss it later. For simplicity, we assume that the OPRF step has already been performed, and consider the simplified case where the receiver needs only a single ciphertext (and its powers) to encrypt the query. This could happen, for example, if poly_modulus_degree is 16, each item uses 2 batching slots, and the cuckoo hash table size is 8; these numbers are too small to work in reality, but are helpful to illustrate the concepts. Suppose the receiver wants to perform a query for a vector of items as follows.

Receiver's query vector

[ item92  ]
[ item14  |
[ item79  ]
[ item3   ]
[ item401 ]

After cuckoo hashing, the receiver's view is as follows. The entire vector of size 16 becomes a single Microsoft SEAL ciphertext.

Receiver's cuckoo hash table

[ item79-part1  ]
[ item79-part2  ]
[ empty         ]
[ empty         ]
[ item14-part1  ]
[ item14-part2  ]
[ item92-part1  ]
[ item92-part2  ]  ==>  query-ctxt
[ item401-part1 ]
[ item401-part2 ]
[ empty         ]
[ empty         ]
[ empty         ]
[ empty         ]
[ item3-part1   ]
[ item3-part2   ]

The sender creates one big hash table and then breaks it into several independent bin bundles. The matching polynomials for each bin bundle are evaluated independently on query-ctxt; this is the first idea presented in Lowering the Depth. For simplicity, we ignore the fact that the sender must use all of the cuckoo hash functions to insert each item.

Sender's big hash table

[ item416-part1 | item12-part1  ][ item71-part1  | item611-part1 ]
[ item416-part2 | item12-part2  ][ item71-part2  | item611-part2 ]
[ item125-part1 | item9-part1   ][ item512-part1 | empty         ]
[ item125-part2 | item9-part2   ][ item512-part2 | empty         ]
[ item500-part1 | item277-part1 ][ item14-part1  | item320-part1 ]
[ item500-part2 | item277-part2 ][ item14-part2  | item320-part2 ]
[ item92-part1  | empty         ][ empty         | empty         ]
[ item92-part2  | empty         ][ empty         | empty         ]
[ item498-part1 | item403-part1 ][ item88-part1  | item5-part1   ]
[ item498-part2 | item403-part2 ][ item88-part2  | item5-part2   ]
[ item216-part1 | empty         ][ empty         | empty         ]
[ item216-part2 | empty         ][ empty         | empty         ]
[ item315-part1 | item491-part1 ][ item262-part1 | empty         ]
[ item315-part2 | item491-part2 ][ item262-part1 | empty         ]
[ item100-part1 | item37-part1  ][ item90-part1  | item3-part1   ]
[ item100-part2 | item37-part2  ][ item90-part2  | item3-part2   ]

\-------------------------------/\-------------------------------/
           Bin bundle 1                     Bin bundle 2

The sender's table is created by first starting with a single bin bundle. Imagine first item416 is inserted and it happens to land in the very first bin of the hash table. Next suppose we add item500, item125, item12, item9, and finally item512 into the bins as shown in the diagram.

APSI allows to specify how many items the sender can fit (horizontally) into each bin bundle. For the sake of this example, we shall assume that this value is 2, but in reality it would be larger.

Once more room is needed, a new bin bundle is created. The sender started with only Bin bundle 1, but item512 would land in the same bin as item125 and item9, which is already full according to our bound of 2. Hence, APSI creates Bin bundle 2 and inserts item512 into it. Next, item277 is inserted into Bin bundle 1 since there is still room for it. In the end, we may end up with dozens or hundreds of bin bundles, and some of the last bin bundles to be added may be left with many empty locations.

For the matching, the encrypted query query-ctxt is matched – in encrypted form – against both Bin bundle 1 and Bin bundle 2, producing results result-ctxt-1 and result-ctxt-2 which are sent back to the receiver. The receiver decrypts the results and finds a result as follows.

Receiver decrypting the result

                    [ item79-no-match  ]                            [ item79-no-match  ]
                    [ empty            ]                            [ empty            ]
                    [ item14-no-match  ]                            [ item14-match     ]
result-ctxt-1  ==>  [ item92-match     ]        result-ctxt-2  ==>  [ item92-no-match  ]
                    [ item401-no-match ]                            [ item401-no-match ]
                    [ empty            ]                            [ empty            ]
                    [ empty            ]                            [ empty            ]
                    [ item3-no-match   ]                            [ item3-match      ]

APSI computes the logical OR of the match values for each result ciphertext and orders the results according to the order of the items appearing in the original query producing, for example, a result vector as follows. The order of the items in the query is arbitrary and irrelevant.

Receiver's query vector      Receiver's result vector

[ item92  ]                  [ match    ]
[ item14  ]                  [ match    ]
[ item79  ]                  [ no-match ]
[ item3   ]                  [ match    ]
[ item401 ]                  [ no-match ]

The receiver, in this case, concludes that item92, item14, and item3 are all present in the sender's database, whereas the other items are not.

A few important details are omitted from the description above. First, the original items on either side are never inserted directly into the APSI protocol, but instead their OPRF hashes are used.

Second, the sender needs to insert each item multiple times, once using each of the cuckoo hash functions. For example, if three cuckoo hash functions are used, the sender would insert, e.g., Hash1(item92), Hash2(item92), and Hash3(item92). The receiver, on the other hand, has inserted only Hash?(item92), where Hash? is one of the three hash functions; in any case, the match will be discovered. Thus, our diagram above is misleading: we should have used names like item92-hash1-part1.

Third, as explained in Large Items, in many cases the receiver's query consists of multiple ciphertexts – not just one like above. For example, suppose we use a cuckoo hash table of size 32, instead of size 8. Now a single plaintext cannot encode the receiver's query anymore. Instead, the query is broken into 4 ciphertexts, each encoding a contiguous chunck of the bigger cuckoo hash table.

Receiver's cuckoo hash table

[ item79-part1  ]
[ item79-part2  ]
[ empty         ]
[ empty         ]
[ item14-part1  ]
[ item14-part2  ]
[ item92-part1  ]
[ item92-part2  ]  ==>  query-ctxt0
[ item401-part1 ]
[ item401-part2 ]
[ empty         ]
[ empty         ]
[ empty         ]
[ empty         ]
[ item3-part1   ]
[ item3-part2   ]

[ ...           ]  ==>  query-ctxt1

[ ...           ]  ==>  query-ctxt2

[ ...           ]  ==>  query-ctxt3

A similar breakdown takes place on the sender's side, creating a jagged array of bin bundles. Here is an example of what the sender's view could be.

                +------------++------------++------------+
                |            ||            ||            |
Bundle index 0  | Bin bundle || Bin bundle || Bin bundle |
                |            ||            ||            |
                +------------++------------++------------+
                +------------++------------+
                |            ||            |
Bundle index 1  | Bin bundle || Bin bundle |
                |            ||            |
                +------------++------------+
                +------------++------------++------------++------------+
                |            ||            ||            ||            |
Bundle index 2  | Bin bundle || Bin bundle || Bin bundle || Bin bundle |
                |            ||            ||            ||            |
                +------------++------------++------------++------------+
                +------------++------------++------------+
                |            ||            ||            |
Bundle index 3  | Bin bundle || Bin bundle || Bin bundle |
                |            ||            ||            |
                +------------++------------++------------+

When the sender receives query-ctxt0, it must compute the matching for each bin bundle at bundle index 0. Similarly, query-ctxt1 must be matched against each bin bundle at bundle index 1, and so on. The number of result ciphertexts obtained by the receiver will be equal to the total number of bin bundles held by the sender; the client cannot know this number in advance.

Labeled Mode

Basic Idea

The labeled mode is not too different but requires some extra explanation. The receiver, in addition to learning whether its query items are in the sender's set, will learn data the sender has associated to these items. One can think of this as a key-value store with privacy-preserving querying.

To understand how the labeled mode works, recall from Basic Idea how the matching polynomial M(x) outputs either an encryption of zero or an encryption of a non-zero value when being evaluated at the receiver's encrypted item Q. In the labeled mode, the sender creates another polynomial L(x), the label interpolation polynomial, that has the following property: if {(Y_i, V_i)} denotes the sender's set of item-label pairs, then L(Y_i) = V_i. Upon receiving Q, the sender computes the ciphertext pair (M(Q), L(Q)) and returns them to the receiver. The receiver decrypts the pair and checks whether the first value decrypts to zero. If it does, the second value decrypts to the corresponding label.

Large Labels

One immediate issue is that all encrypted computations happen modulo the plain_modulus, but the sender's labels might be much longer than that. This was a problem for the items, and was resolved in Large Items by hashing the items first to a bounded size (80 – 128 bits) and then using a sequence of batching slots to encode the items. This solution works to some extent for labels as well. Namely, the labels can be broken into parts similarly to how the items are, and for each part we can form a label interpolation polynomial that outputs that part of the label when evaluated at the corresponding part of the item.

This is not yet a fulfilling solution, because our items do not have a fixed size and are fairly short anyway (up to 128 bits). Labels that are longer than the items can be broken into multiple parts each of the length of the item. For each part we can construct a separate label interpolation polynomial, evaluate them all at the encrypted query, and return each encrypted result to the receiver. The receiver decrypts the results and concatenates them to recover the label for those items that were matched.

Label Encryption

There is a serious issue with the above approach that must be resolved. Recall how we used OPRF to prevent partial (or full) leakage of the sender's items to the receiver: given an item Y, the matching polynomial is not actually computed for Y itself, but rather for ItemHash(s, Y), which denoted the first 128 bits of the item's OPRF value OPRF(s, Y). This means that the label interpolation polynomial L should actually have the property that L(ItemHash(s, Y_i)) = V_i for each of the sender's items Y_i. However, if the receiver can guess a part of some ItemHash(s, Y_i) it can use it to query for the corresponding part of the label for that item, which is unacceptable since the receiver does not actually know the item Y_i.

To solve this issue, the sender uses a symmetric encryption function Enc(, , ) to encrypt the labels V_i using keys derived from OPRF(s, Y_i). Specifically, as the encryption key LabelKey(s, Y_i) for the label V_i corresponding to an item Y_i we use the remaining 128 bits of the 256-bit output of OPRF(s, Y_i), so the label we must communicate to the receiver becomes Enc(V_i, LabelKey(s, Y_i), nonce).

There is still a bit of a problem, because the receiver must somehow know the nonce as well. One option is to use a constant or empty nonce. In this case extreme care must be taken, because if an adversary can learn encryptions of two different labels for the same item with the same OPRF key s, then they may be able to learn information about the labels. This can happen, because APSI supports updating labels for items. Another option is to use a long randomly generated nonce – different for each encryption – which the receiver must somehow learn. APSI achieves this by randomly sampling a nonce, concatenating it with the encryption of V_i, and using the concatenation for the interpolation polynomial L. In other words, the sender samples a random nonce for each item Y_i and computes the label interpolation polynomial L such that L(ItemHash(s, Y_i)) = nonce | Enc(V_i, LabelKey(s, Y_i), nonce).

The receiver benefits nothing from learning parts (or all) of the encrypted label unless it also knows the original item. Furthermore, even if the receiver manages to obtain nonce | Enc(V_i, LabelKey(s, Y_i), nonce) by guessing ItemHash(s, Y_i), and in an offline attack enumerates all possible items Y_i (or later learns Y_i through other means), it still cannot obtain the label because LabelKey(s, Y_i) is derived from OPRF(s, Y_i) – not just from Y_i. Of course at this later point the sender may decide to serve a normal query to the receiver for Y_i, in which case the receiver will learn V_i, as it is supposed to.

APSI allows the sender can specify the nonce size in bytes. The default nonce size is set to 16 bytes, but expert users who fully understand the issue may want to use smaller values to achieve improved performance.

Partial Item Collisions

There is one last subtle issue that must be addressed. Recall from Practice how the sender constructs a large hash table and breaks it into a jagged array of bin bundles. In the labeled mode each bin bundle holds not only the item parts in it, but also the corresponding label parts, and the label interpolation polynomials, as described above. The interpolation polynomials are not created for the entire label at a time, but for each part separately, although the encryption is applied to the full item before decomposing it into parts.

Now consider what happens when – by change – item416-part1 and item12-part1 (as in Practice) are the same. If the corresponding label parts label416-part1 and label12-part1 are different, it will be impossible to create a label interpolation polynomial L, as it cannot output both label416-part1 and label12-part1 on the corresponding item part.

This issue is resolved by checking that label parts do not already appear in the same locations before inserting an item into a bin bundle. If any of them does, the item simply cannot be inserted into that bin bundle, and a new bin bundle for the same bundle index must be created. Note that the problem exists only in the labeled mode and can lead to worse packing rate (items_inserted / theoretical_max) than in unlabeled mode, where no such limitation exists.

Using APSI

Receiver

The apsi::receiver::Receiver class implements all necessary functions to create and send parameter, OPRF, and PSI or labeled PSI queries (depending on the sender), and process any responses received. Most of the member functions are static, but a few (related to creating and processing the query itself) require an instance of the class to be created. All functions and types are in the apsi namespace, so we omit apsi:: from all names below. For simplicity, we also use Receiver to denote apsi::receiver::Receiver.

This same text appears in the receiver.h header file.

Receiver includes functionality to request protocol parameters (PSIParams object) from a sender. This is needed when the receiver does not know what parameters it is supposed to use with a specific sender. In other cases the receiver would know the parameters ahead of time, and can skip this step. In any case, once the receiver has an appropriate PSIParams object, an Receiver can be instantiated. The Receiver constructor automatically creates Microsoft SEAL public and private keys. The public keys are sent to the sender along with every query request, and the private keys are held internally by the Receiver object for decrypting query responses. New keys can be generated by calling the member function Receiver::reset_keys, and we recommend doing this after every query has been completed to protect against leaked keys.

The class includes two versions of an API to performs the necessary operations. The "simple" API consists of three functions: Receiver::RequestParams, Receiver::RequestOPRF, and Receiver::request_query. However, these functions only support network::NetworkChannel, such as network::ZMQReceiverChannel, for the communication. Other channels, such as network::StreamChannel, are only supported by the "advanced" API.

The advanced API requires many more steps. The full process is as follows:

  1. (optional) Receiver::CreateParamsRequest must be used to create a parameter request. The request must be sent to the sender on a channel with network::Channel::send. The sender must respond to the request and the response must be received on the channel with network::Channel::receive_response. The received Response object must be converted to the right type (ParamsResponse) with the to_params_response function. This function will return nullptr if the received response was not of the right type. A PSIParams object can be extracted from the response.

  2. A Receiver object must be created from a PSIParams object. The PSIParams must match what the sender uses.

  3. Receiver::CreateOPRFReceiver must be used to process the input vector of items and return an associated oprf::OPRFReceiver object. Next, Receiver::CreateOPRFRequest must be used to create an OPRF request from the oprf::OPRFReceiver, which can subsequently be sent to the sender with network::Channel::send. The sender must respond to the request and the response must be received on the channel with network::Channel::receive_response. The received Response object must be converted to the right type (OPRFResponse) with the to_oprf_response function. This function will return nullptr if the received response was not of the right type. Finally, Receiver::ExtractHashes must be called with the OPRFResponse and the oprf::OPRFReceiver object. This function returns std::pair, std::vector>, containing the OPRF hashed items and the label encryption keys. Both vectors in this pair must be kept for the next steps.

  4. Receiver::create_query (non-static member function) must then be used to create the query itself. The function returns std::pair, where the Request object contains the query itself to be send to the sender, and the IndexTranslationTable is an object associated to this query describing how the internal data structures of the query maps to the vector of OPRF hashed items given to Receiver::create_query. The IndexTranslationTable is needed later to process the responses from the sender. The Request object must be sent to the sender with network::Channel::send. The received Response object must be converted to the right type (QueryResponse) with the to_query_response function. This function will return nullptr if the received response was not of the right type. The QueryResponse contains only one important piece of data: the number of ResultPart objects the receiver should expect to receive from the sender in the next step.

  5. network::Channel::receive_result must be called repeatedly to receive all ResultParts. For each received ResultPart, Receiver::process_result_part must be called to find a std::vector representing the match data associated to that ResultPart. Alternatively, one can first retrieve all ResultParts, collect them into a std::vector, and use Receiver::process_result to find the complete result – just like what the simple API returns. Both Receiver::process_result_part and Receiver::process_result require the IndexTranslationTable and the std::vector objects created in the previous steps.

Request, Response, and ResultPart

The Request type is defined in requests.h as an alias for std::unique_ptr, where network::SenderOperation is a purely virtual class representing either a parameter request (network::SenderOperationParms), an OPRF request (network::SenderOperationOPRF), or a PSI or labeled PSI query request (network::SenderOperationQuery). The types ParamsRequest, OPRFRequest, and QueryRequest are similar aliases to unique pointers of these derived types. The functions to_params_request, to_oprf_request, and to_query_request convert a Request into the specific kind of request, returning nullptr if the Request was not of the right type. Conversely, the to_request function converts a ParamsRequest, OPRFRequest, or QueryRequest into a Request object.

Similarly, the Response type is defined in responses.h as an alias for std::unique_ptr, along with related type aliases ParamsResponse, OPRFResponse, and QueryResponse, and corresponding conversion functions to_params_response, to_oprf_responset, to_query_response, and to_response.

Finally, the ResultPart type is defined in responses.h as an alias for std::unique_ptr, where network::ResultPackage contains an encrypted result to a query request. Since the query is evaluated independently per each bin bundle (recall Practice), the results for each bin bundle are sent back to the receiver as separate ResultPart objects. The receiver must collect all these together to find the final result, as was described above in Receiver.

The important thing about Request, Response, and ResultPart is that these are the object handled by the network::Channel class member functions send, receive_operation, receive_response, and receive_result (see channel.h).

Sender

The Sender class implements all necessary functions to process and respond to parameter, OPRF, and PSI or labeled PSI queries (depending on the sender). Unlike the Receiver class, Sender also takes care of actually sending data back to the receiver. Sender is a static class and cannot be instantiated.

All functions and types are in the apsi namespace, so we omit apsi:: from all names below. For simplicity, we also use Sender to denote apsi::sender::Sender, and SenderDB to denote apsi::sender::SenderDB.

This same text appears in the sender.h header file.

Just like Receiver, there are two ways of using Sender. The "simple" approach supports network::ZMQSenderChannel and is implemented in the ZMQSenderDispatcher class in zmq/sender_dispatcher.h. The ZMQSenderDispatcher provides a very fast way of deploying an APSI Sender: it automatically binds to a ZeroMQ socket, starts listening to requests, and acts on them as appropriate.

The advanced Sender API consisting of three functions: RunParams, RunOPRF, and RunQuery. Of these, RunParams and RunOPRF take the request object (ParamsRequest or OPRFRequest) as input. RunQuery requires the QueryRequest to be "unpacked" into a Query object first.

The full process for the sender is as follows:

  1. Create an oprf::OPRFKey object and use oprf::OPRFSender::ComputeHashes with the oprf::OPRFKey to process the sender's items (or item-label pairs) and convert them into hashed items (or hashed-item-label pairs).

  2. Create a PSIParams object and a SenderDB object. The SenderDB must be created with the PSIParams and the hashed items (or hashed item-label pairs) must be loaded into it with SenderDB::set_data. The SenderDB can be used repeatedly and can be updated efficiently.

  3. (optional) Receive a parameter request with network::Channel::receive_operation. The received Request object must be converted to the right type (ParamsRequest) with the to_params_request function. This function will return nullptr if the received request was not of the right type. Once the request has been obtained, the RunParams function can be called with the ParamsRequest, the SenderDB, the network::Channel, and optionally a lambda function that implements custom logic for sending the ParamsResponse object on the channel.

  4. Receive an OPRF request with network::Channel::receive_operation. The received Request object must be converted to the right type (OPRFRequest) with the to_oprf_request function. This function will return nullptr if the received request was not of the right type. Once the request has been obtained, the RunOPRF function can be called with the OPRFRequest, the oprf::OPRFKey, the network::Channel, and optionally a lambda function that implements custom logic for sending the OPRFResponse object on the channel.

  5. Receive a query request with network::Channel::receive_operation. The received Request object must be converted to the right type (QueryRequest) with the to_query_request function. This function will return nullptr if the received request was not of the right type. Once the request has been obtained, a Query object must be created from it. The constructor of the Query class verifies that the QueryRequest is valid for the given SenderDB, and if it is not the constructor still returns successfully but the Query is marked as invalid (Query::is_valid() returns false) and cannot be used in the next step. Once a valid Query object is created, the RunQuery function can be used to perform the query and respond on the given channel. Optionally, two lambda functions can be given to RunQuery to provide custom logic for sending the QueryResponse and the ResultPart objects on the channel.

SenderDB

For simplicity, we use SenderDB to denote apsi::sender::SenderDB.

This same text appears in the sender_db.h header file.

A SenderDB maintains an in-memory representation of the sender's set of items and labels (in labeled mode). These items are not simply copied into the SenderDB data structures, but also preprocessed heavily to allow for faster online computation time. Since inserting a large number of new items into a SenderDB can take time, it is not recommended to recreate the SenderDB when the database changes a little bit. Instead, the class supports fast update and deletion operations that should be preferred: SenderDB::insert_or_assign and SenderDB::remove.

The SenderDB constructor allows the label byte count to be specified; unlabeled mode is activated by setting the label byte count to zero. It is possible to optionally specify the size of the nonce used in encrypting the labels, but this is best left to its default value unless the user is absolutely sure of what they are doing.

The SenderDB requires substantially more memory than the raw data would. Part of that memory can automatically be compressed when it is not in use; this feature is enabled by default, and can be disabled when constructing the SenderDB. The downside of in-memory compression is a performance reduction from decompressing parts of the data when they are used, and recompressing them if they are updated.

In many cases the SenderDB does not need to be modified after having been constructed, or loaded from disk. The function SenderDB::strip can be called to remove all data that is not strictly needed to serve query requests. A SenderDB that has been stripped cannot be modified, cannot be checked for the presence of specific items, and labels cannot be retrieved from it. A stripped SenderDB can be serialized and deserialized.

PSIParams

The apsi::PSIParams class encapsulates parameters for the PSI or labeled PSI protocol. These parameters are important to set correctly to ensure correct behavior and good performance. All of the concepts behind these parameters have come up in How APSI Works, which we urge the reader to review unless it is absolutely clear to them.

If a parameter set works for specific sender and receiver sets, then it will always work for a smaller or larger sender's set as well (with asymptotic linear scaling in communication and computation complexity), both in the unlabeled and labeled mode with arbitrary length labels. However, it does not necessarily work for a larger receiver's set.

For simplicity, we use PSIParams to denote apsi::PSIParams.

A PSIParams object contains four kinds of parameters, encapsulated in sub-structs: PSIParams::SEALParams, PSIParams::ItemParams, PSIParams::TableParams, and PSIParams::QueryParams. We shall discuss each separately.

SEALParams

The PSIParams::SEALParams simply wraps an instance of Microsoft SEAL seal::EncryptionParameters object with the encryption scheme always set to seal::scheme_type::bfv. Unfortunately these parameters are not entirely easy to comprehend, and while some explanation was given above in Encryption Parameters, we highly recommend the reader study the extensive comments in the Microsoft SEAL examples to have a better grasp of how the parameters should be set, and what their impact on performance is.

ItemParams

The PSIParams::ItemParams struct contains only one member variable: a 32-bit integer felts_per_item. This number was described in Large Items; it specifies how many Microsoft SEAL batching slots should represent each item, and hence influences the item length.

The item length (in bits) is a product of felts_per_item and floor(log_2(plain_modulus)), where plain_modulus refers to the Microsoft SEAL plain_modulus parameter set in the PSIParams::SEALParams. The PSIParams constructor will verify that the item length is bounded between 80 and 128 bits, and will throw an exception otherwise.

felts_per_item must be one of 2, 4, 8, 16, or 32.

TableParams

The PSIParams::ItemParams struct contains parameters describing the receiver's cuckoo hash table and the sender's data structure. It holds three member variables:

  • table_size denotes the size of the receiver's cuckoo hash table. It must be such that its size is a positive multiple of (possibly equal to) the number of batching slots in a Microsoft SEAL plaintext, i.e., it must be a multiple of the poly_modulus_degree parameter set in the PSIParams::SEALParams.
  • max_items_per_bin denotes how many items fit into each row of the sender's bin bundles. It cannot be zero.
  • hash_func_count denotes the number of hash functions used for cuckoo hashing. It must be at least 1 and at most 8. While setting hash_func_count to 1 means essentially disabling cuckoo hashing, it can improve performance in cases where the receiver is known to have only a single item (set membership).

QueryParams

The PSIParams::QueryParams struct contains two parameters: a std::uint32_t called ps_low_degree, and a std::set called query_powers. ps_low_degree determined the Paterson-Stockmeyer low-degree, as was discused in Paterson-Stockmeyer. If set to zero, the Paterson-Stockmeyer algorithm is not used. query_powers defines which encrypted powers of the query the receiver sends to the sender, as was discussed in Lowering the Depth. This is one of the most complex parameters to set, which is why we have dedicated an entire subsection below for describing how to choose it.

ps_low_degree can be any number between 0 and max_items_per_bin, although values 1 and max_items_per_bin are meaningless to use. query_powers must contain 1, cannot contain 0, and cannot contain values larger than max_items_per_bin. Any value in query_powers larger than ps_low_degree must be a multiple of ps_low_degree + 1.

PSIParams Constructor

To construct a PSIParams object, one needs to provide the constructor with a valid PSIParams::SEALParams, PSIParams::ItemParams, PSIParams::TableParams, and PSIParams::QueryParams. The constructor will perform the following validations on the parameters, in order, and will throw an exception (with a descriptive message) if any of them fails:

  1. PSIParams::TableParams::table_size is verified to be non-zero.
  2. PSIParams::TableParams::max_items_per_bin is verified to be non-zero.
  3. PSIParams::TableParams::hash_func_count is verified to be at least 1 and at most 8.
  4. PSIParams::ItemParams::felts_per_item is verified to be 2, 4, 8, 16, or 32.
  5. PSIParams::QueryParams::ps_low_degree is verified to not exceed max_items_per_bin.
  6. PSIParams::QueryParams::query_powers is verified to not contain 0, to contain 1, and to not contain values larger than max_items_per_bin. Any value larger than ps_low_degree is verified to be divisible by ps_low_degree + 1.
  7. PSIParams::SEALParams are verified to be valid and to support Microsoft SEAL batching. Specificially, the parameters must have a plain_modulus that is prime and congruent to 1 modulo 2 * poly_modulus_degree. Microsoft SEAL contains functions in seal::CoeffModulus and seal::PlainModulus classes (see modulus.h) to choose appropriate coeff_modulus and plain_modulus primes.
  8. The item bit count is computed as the product of felts_per_item and floor(log_2(plain_modulus)), and is verified to be at least 80 and at most 128.
  9. The number of item fitting vertically in a bin bundle is computed as poly_modulus_degree / felts_per_item. This number is verified to be non-zero.
  10. table_size is verified to be a multiple of poly_modulus_degree / felts_per_item.

If all of these checks pass, the PSIParams object is successfully created and is valid for use in APSI.

Loading from JSON

A PSIParams object can be most conveniently created from a JSON string with the PSIParams::Load method. The format of the JSON string is as in the following example:

{
    "table_params": {
        "hash_func_count": 3,
        "table_size": 512,
        "max_items_per_bin": 92
    },
    "item_params": {
        "felts_per_item": 8
    },
    "query_params": {
        "ps_low_degree": 0,
        "query_powers": [ 3, 4, 5, 8, 14, 20, 26, 32, 38, 41, 42, 43, 45, 46 ]
    },
    "seal_params": {
        "plain_modulus": 40961,
        "poly_modulus_degree": 4096,
        "coeff_modulus_bits": [ 49, 40, 20 ]
    }
}

The Microsoft SEAL plain_modulus parameter can be set to a specific prime, like in the example above in the seal_params section, or can be expressed as a desired number of bits for the prime, in which case the library will sample a prime of appropriate form. In such a case, the seal_params section would appear as follows:

    ...
    "seal_params": {
        "plain_modulus_bits": 24,
        "poly_modulus_degree": 4096,
        "coeff_modulus_bits": [ 49, 40, 20 ]
    }
    ...

We provide multiple example parameter sets in the parameters/ subdirectory. The files are named with triples of numbers, where the first two indicate recommended upper bounds for the sender's and receiver's set sizes, respectively, and the third (optional) number indicates that the parameters are meant for use in labeled mode and denote the label byte size. The file names end with an optional specifier -com or -cmp, indicating whether the parameters are optimized to minimize communication or computation cost.

Query Powers

It is unfortunately difficult to find good choices for the query_powers parameter in PSIParams. This is related to the so-called global postage-stamp problem in combinatorial number theory (see Challis and Robinson (2010)). In short, the global postage-stamp problem can be stated as follows:

For given positive integers h and k, determine a set of k integers { a_i | 1 = a_0 < a_1 < ... < a_k }, such that

  • any positive integer up to n can be realized as a sum of at most h of the a_i (possibly with repetition), and
  • n is as large as possible.

For example, if h = 2 and k = 3, then { 1, 3, 4 } provides a solution for n = 8. This is easy to verify:

Value First summand Second summand
1 1 N/A
2 1 1
3 3 N/A
4 4 N/A
5 1 4
6 3 3
7 3 4
8 4 4

For a larger example, if h = 3 and k = 3, then { 1, 4, 5 } provides a solution for n = 15. Simply start from 1 and write each number, in order, a sum of two of the previous numbers, in a way that minimizes the total number of summands:

Value First summand Second summand Total # of summands
1 1 N/A 1
2 1 1 2
3 1 2 3
4 4 N/A 1
5 5 N/A 1
6 1 5 2
7 2 5 3
8 4 4 2
9 4 5 2
10 5 5 2
11 5 6 3
12 4 8 3
13 5 8 3
14 4 10 3
15 5 10 3

The choice of { 1, 4, 5 } is optimal in the sense that there is no set {a_i} of size 3 (k = 3) that allows for n >= 16 without at least 4 total summands (h = 4).

The above table can now immediately be represented as a directed graph with each value (integers 1 through 15) labeling the nodes and the is-a-summand-of relationship represented by directed edges. In this case { 1, 4, 5 } will appear as sink nodes.

Now, recall from Practice how bin bundle rows can hold only a predetermined number of items. For this example, suppose that number was 15. Then, once the sender receives a query ciphertext from the receiver, it must compute – in encrypted form – all powers of the query up to 15. Computing these powers will require a circuit of multiplicative depth 4. Evaluating the matching polynomials will further require an additional multiplication (by the coefficients), so in the end the encrypted computation will have multiplicative depth 5: this requires large encryption parameters. Instead, the receiver can precompute the 4th and the 5th powers of the query, encrypt them, and send them to the sender in addition to the query itself (1st power). Now the sender can use the graph to compute all powers of the query in an efficient manner with only a depth 2 circuit. The coefficient multiplications will increase the depth of the full computation to 3, but this is considerably better than 5, and will allow for much smaller encryption parameters to be used. The downside is, of course, that the communication from the receiver to the sender is now three times larger than if only the query itself was sent. Still, the reduction in the size of the parameters is typically immensely beneficial, and using appropriate source powers will be the key to good performance.

We recommend using the tables in Challis and Robinson (2010) to determine good source powers. For example, suppose the bin bundle rows are desired to hold at least 70 items. Then, looking at the tables in Challis and Robinson (2010), we find the following possibly good source powers:

Multiplicative depth Source powers Highest power
1 (h = 2) 1, 3, 4, 9, 11, 16, 20, 25, 27, 32, 33, 35, 36 (k = 13) 72
2 (h = 3) 1, 4, 5, 15, 18, 27, 34 (k = 7) 70
2 (h = 4) 1, 3, 11, 15, 32 (k = 5) 70
3 (h = 5) 1, 4, 12, 21 (or 1, 5, 12, 28) (k = 4) 71
3 (h = 6) 1, 4, 19, 33 (k = 4) 114

Several comments are in order:

  • The second and the third row represent a communication-computationt trade-off. The two computations have the same depth (2), but one (second row) requires 40% more communication. The computational cost will be only slightly lower for the second row, because in both cases 70 - k encrypted multiplications must be performed. Hence, we can conclude that the second row will almost certainly not make sense, and the third row is objectively better.
  • It is not easy to compare rows with different multiplicative depth. Their performance differences will depend largely on the other protocol parameters – in particular the Microsoft SEAL encryption parameters.
  • If depth 3 is acceptable, then the last row may be the best choice, as it allows the bin bundle row size to be increased from 70 to 114. This will result in fewer bin bundles, and hence smaller communication from the sender to the receiver. However, now the sender must compute all powers of the query up to 114, increasing the online computation cost.
  • It is probably necessary to try all options to determine what is overall best for a particular use-case.
  • Challis and Robinson (2010) also shows a possible set for k = 3 with depth 3 (h = 7): { 1, 8, 13 }. While this only allows a highest power of 69, which does not quite satisfy our requirement of 70, such a set should be considered as it reduces the receiver-to-sender communication by 25%, while increasing the sender-to-receiver communication by only a tiny amount (roughly by a factor of 70/69 = 1.45%) due to the slightly smaller bin bundles. This will almost certainly be a beneficial trade-off.

Thread Control

Many of the computations APSI does are highly parallelizable. APSI uses a thread pool that can be controlled with the static functions apsi::ThreadPoolMgr::SetThreadCount and apsi::ThreadPoolMgr::GetThreadCount in apsi/thread_pool_mgr.h.

By default the thread count is set to std::thread::hardware_concurrency(), but this is in many cases not ideal due to caching issues, especially when hyper-threading is enabled. Instead, the user would typically want to use apsi::ThreadPoolMgr::SetThreadCount to set it to some lower value depending on the number of physical cores.

Logging

APSI can be optionally compiled to use log4cplus for logging. Logging is configured using the static member functions of apsi::Log in apsi/log.h. For example, these can be used to set the log level (one of "all", "debug", "info", "warning", "error", or "off"), set a log filename, and control console output.

Log messages for different log levels are emitted with the macros APSI_LOG_DEBUG, APSI_LOG_INFO, APSI_LOG_WARNING, and APSI_LOG_ERROR. The macros allow "streaming"; for example,

int val = 3;
APSI_LOG_INFO("My value is " << val);

would log the message My value is 3 at "info" log level.

Building APSI

Requirements

System Toolchain
Windows Visual Studio 2019 with C++ CMake Tools for Windows
Linux Clang++ (>= 7.0) or GNU G++ (>= 7.0), CMake (>= 3.12)
macOS Xcode toolchain (>= 9.3), CMake (>= 3.12)

Dependencies

APSI has multiple external dependencies that must be pre-installed. By far the easiest and recommended way to do this is using vcpkg. Each package's name in vcpkg is listed below.

Note: On Windows, the vcpkg triplet x64-windows-static-md should be specified. For examples, to install Microsoft SEAL on Windows, you must use .\vcpkg install seal[no-throw-tran]:x64-windows-static-md, while on other systems use simply ./vcpkg install seal[no-throw-tran].

The CMake build system can then automatically find these pre-installed packages, if the following arguments are passed to CMake configuration:

  • -DCMAKE_TOOLCHAIN_FILE=${vcpkg_root_dir}/scripts/buildsystems/vcpkg.cmake, and
  • -DVCPKG_TARGET_TRIPLET="x64-windows-static-md" on Windows only.
Dependency vcpkg name
Microsoft SEAL seal[no-throw-tran]
Microsoft Kuku kuku
Log4cplus log4cplus
cppzmq cppzmq (needed only for ZeroMQ networking support)
FlatBuffers flatbuffers
jsoncpp jsoncpp
Google Test gtest (needed only for building tests)
TCLAP tclap (needed only for building CLI)

Note on Microsoft SEAL and Intel HEXL

At this time the vcpkg version of Microsoft SEAL does not directly have support for Intel HEXL. To utilize the benefits of Intel HEXL, we recommend manually downloading, configuring (with -DSEAL_USE_INTEL_HEXL=ON), building, and installing Microsoft SEAL, instead of using vcpkg. Any potential benefit of using Intel HEXL is highly dependent on the CPU, but on certain systems it can have an enormous impact on the online computation time by accelerating the CPU intensive Microsoft SEAL operations.

Command-Line Interface (CLI)

The APSI library comes with example command-line programs implementing a sender and a receiver. In this section we describe how to run these programs.

Common Arguments

The following optional arguments are common both to the sender and the receiver applications.

Parameter Explanation
-t | --threads Number of threads to use
-f | --logFile Log file path
-s | --silent Do not write output to console
-l | --logLevel One of all, debug, info (default), warning, error, off

Receiver

The following arguments specify the receiver's behavior.

Parameter Explanation
-q | --queryFile Path to a text file containing query data (one per line)
-o | --outFile Path to a file where intersection result will be written
-a | --ipAddr IP address for a sender endpoint
--port TCP port to connect to (default is 1212)

Sender

The following arguments specify the sender's behavior and determine the parameters for the protocol. In our CLI implementation the sender always chooses the parameters and the receiver obtains them through a parameter request. Note that in other applications the receiver may already know the parameters, and the parameter request may not be necessary.

Parameter
Explanation
-d | --dbFile Path to a CSV file describing the sender's dataset (an item-label pair on each row) or a file containing a serialized SenderDB; the CLI will first attempt to load the data as a serialized SenderDB, and – upon failure – will proceed to attempt to read it as a CSV file
-p | --paramsFile Path to a JSON file describing the parameters to be used by the sender
--port TCP port to bind to (default is 1212)
-n | --nonceByteCount Number of bytes used for the nonce in labeled mode (default is 16)
-c | --compress Whether to compress the SenderDB in memory; this will make the memory footprint smaller at the cost of increased computation

Note: The first row of the CSV file provided to --dbFile determines whether APSI will be used in unlabeled or labeled mode. If the first row contains two values, the first will be interpreted as the item and the rest will be interpreted as the label data. Leading and trailing whitespaces will be trimmed from both the item and the label data. If the first row contains only a single value (i.e., no label), then APSI will read only items from the subsequent rows and set up an unlabeled SenderDB instance. In the labeled mode the longest label appearing will determine the label byte count.

Test Data

The library contains a Python script tools/scripts/test_data_creator.py that can be used to easily create test data for the CLI. Running the script is easy; it accepts three necessary arguments and two optional parameters as follows:

python3 test_data_creator.py    [] []

Here denotes the size of the sender's dataset that will be written in a file db.csv. If this file already exists, it will be overwritten. Similarly denotes the size of the receiver's query that will be written in a file query.csv. The third argument, , denotes the number of items common in both the generated db.csv and query.csv. The fourth (optional) argument denotes the byte size of the randomly generated labels in db.csv; if omitted, the data will be unlabeled. The last (optional) argument denotes the item byte size; it defaults to 64 if omitted. Since long the items will automatically be hashed before they are inserted into a SenderDB, the item byte size does not matter much in practice.

Acknowledgments

Multiple people have contributed substantially to the APSI library but may not appear in Git history. We wish to extend special thanks to Hao Chen, Peter Rindal, and Michael Rosenberg for major contributions to the protocol and the library. We wish to also thank Craig Costello and Patrick Longa for helping implement hash-to-curve for the awesome FourQ curve, and Mariana Botelho da Gama for helping with the Paterson-Stockmeyer implementation.

Contributing

For contributing to APSI, please see CONTRIBUTING.md.

Issues
  • Build fails on arm64-windows

    Build fails on arm64-windows

    See https://github.com/microsoft/APSI/issues/15, I get the same issue when I built on arm64-windows.

    After updated to the latest version 0.8.1(include fix https://github.com/microsoft/APSI/commit/2b61950707ca8759e31eb889081fdcd48f0a1e6c), shows the following error. CMake Error:

    CMake Error: TRY_RUN() invoked in cross-compiling mode, please set the following cache variables appropriately:
       APSI_FOURQ_ARM64_EXITCODE (advanced)
       APSI_FOURQ_ARM64_EXITCODE__TRYRUN_OUTPUT (advanced)
    For details see D:/buildtrees/apsi/arm64-windows-dbg/TryRunResults.cmake
    
    opened by Adela0814 5
  • arm64 support

    arm64 support

    Does this lib support arm64?

    https://github.com/microsoft/APSI/blob/39b45ceff89ca07d0e9481fac894252765989fcc/common/apsi/fourq/CMakeLists.txt#L29-L30

    You add a sub folder arm64, but there is no sub folder arm64.

    From https://github.com/microsoft/vcpkg/pull/23034#discussion_r805106335

    opened by autoantwort 4
  • Build error On Windows 11

    Build error On Windows 11

    I want to build the test by the clion , i add the cmake options -DCMAKE_TOOLCHAIN_FILE=D:/vcpkg/scripts/buildsystems/vcpkg.cmake -DVCPKG_TARGET_TRIPLET="x64-windows-static-md" However , the errors occur.

    "D:\Program Files\JetBrains\CLion 2021.3.3\bin\cmake\win\bin\cmake.exe" --build C:\Users\69424\Desktop\APSI-main\cmake-build-debug --target apsi [1/29] Building C object CMakeFiles/apsi.dir/common/apsi/fourq/crypto_util.c.obj FAILED: CMakeFiles/apsi.dir/common/apsi/fourq/crypto_util.c.obj D:\PROGRA~2\JETBRA~1\CLION2~1.3\bin\mingw\bin\gcc.exe -DZMQ_STATIC -IC:/Users/69424/Desktop/APSI-main/common -IC:/Users/69424/Desktop/APSI-main/sender -IC:/Users/69424/Desktop/APSI-main/receiver -IC:/Users/69424/Desktop/APSI-main/cmake-build-debug/common -IC:/Users/69424/Desktop/APSI-main/cmake-build-debug/sender -isystem D:/vcpkg/installed/x64-windows-static-md/include/SEAL-3.7 -isystem D:/vcpkg/installed/x64-windows-static-md/include -isystem D:/vcpkg/installed/x64-windows-static-md/include/Kuku-2.1 -g -DHAVE_CONFIG -DUSE_SECURE_SEED -DUSE_ENDO=true -D_AMD64_ -march=native -mavx2 -D_AVX2_ -MD -MT CMakeFiles/apsi.dir/common/apsi/fourq/crypto_util.c.obj -MF CMakeFiles\apsi.dir\common\apsi\fourq\crypto_util.c.obj.d -o CMakeFiles/apsi.dir/common/apsi/fourq/crypto_util.c.obj -c C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/crypto_util.c In file included from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ_api.h:22, from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ_internal.h:22, from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/crypto_util.c:9: C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ.h:36:2: error: #error -- "Unsupported OS" 36 | #error-- "Unsupported OS" | ^~~~~ In file included from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/crypto_util.c:9: C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ_internal.h:40:2: error: #error -- "Unsupported configuration" 40 | #error-- "Unsupported configuration" | ^~~~~ C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/crypto_util.c: In function 'Montgomery_inversion_mod_order': C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/crypto_util.c:188:17: warning: implicit declaration of function 'SHIFTL' [-Wimplicit-function-declaration] 188 | SHIFTL(modulus2[j], modulus2[j - 1], 1, modulus2[j], RADIX); | ^~~~~~ [2/29] Building C object CMakeFiles/apsi.dir/common/apsi/fourq/eccp2_no_endo.c.obj FAILED: CMakeFiles/apsi.dir/common/apsi/fourq/eccp2_no_endo.c.obj D:\PROGRA~2\JETBRA~1\CLION2~1.3\bin\mingw\bin\gcc.exe -DZMQ_STATIC -IC:/Users/69424/Desktop/APSI-main/common -IC:/Users/69424/Desktop/APSI-main/sender -IC:/Users/69424/Desktop/APSI-main/receiver -IC:/Users/69424/Desktop/APSI-main/cmake-build-debug/common -IC:/Users/69424/Desktop/APSI-main/cmake-build-debug/sender -isystem D:/vcpkg/installed/x64-windows-static-md/include/SEAL-3.7 -isystem D:/vcpkg/installed/x64-windows-static-md/include -isystem D:/vcpkg/installed/x64-windows-static-md/include/Kuku-2.1 -g -DHAVE_CONFIG -DUSE_SECURE_SEED -DUSE_ENDO=true -D_AMD64_ -march=native -mavx2 -D_AVX2_ -MD -MT CMakeFiles/apsi.dir/common/apsi/fourq/eccp2_no_endo.c.obj -MF CMakeFiles\apsi.dir\common\apsi\fourq\eccp2_no_endo.c.obj.d -o CMakeFiles/apsi.dir/common/apsi/fourq/eccp2_no_endo.c.obj -c C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/eccp2_no_endo.c In file included from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ_api.h:22, from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ_internal.h:22, from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/eccp2_no_endo.c:14: C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ.h:36:2: error: #error -- "Unsupported OS" 36 | #error-- "Unsupported OS" | ^~~~~ In file included from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/eccp2_no_endo.c:14: C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ_internal.h:40:2: error: #error -- "Unsupported configuration" 40 | #error-- "Unsupported configuration" | ^~~~~ [3/29] Building C object CMakeFiles/apsi.dir/common/apsi/fourq/eccp2.c.obj FAILED: CMakeFiles/apsi.dir/common/apsi/fourq/eccp2.c.obj D:\PROGRA~2\JETBRA~1\CLION2~1.3\bin\mingw\bin\gcc.exe -DZMQ_STATIC -IC:/Users/69424/Desktop/APSI-main/common -IC:/Users/69424/Desktop/APSI-main/sender -IC:/Users/69424/Desktop/APSI-main/receiver -IC:/Users/69424/Desktop/APSI-main/cmake-build-debug/common -IC:/Users/69424/Desktop/APSI-main/cmake-build-debug/sender -isystem D:/vcpkg/installed/x64-windows-static-md/include/SEAL-3.7 -isystem D:/vcpkg/installed/x64-windows-static-md/include -isystem D:/vcpkg/installed/x64-windows-static-md/include/Kuku-2.1 -g -DHAVE_CONFIG -DUSE_SECURE_SEED -DUSE_ENDO=true -D_AMD64_ -march=native -mavx2 -D_AVX2_ -MD -MT CMakeFiles/apsi.dir/common/apsi/fourq/eccp2.c.obj -MF CMakeFiles\apsi.dir\common\apsi\fourq\eccp2.c.obj.d -o CMakeFiles/apsi.dir/common/apsi/fourq/eccp2.c.obj -c C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/eccp2.c In file included from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ_api.h:22, from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ_internal.h:22, from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/eccp2.c:14: C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ.h:36:2: error: #error -- "Unsupported OS" 36 | #error-- "Unsupported OS" | ^~~~~ In file included from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/eccp2.c:14: C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ_internal.h:40:2: error: #error -- "Unsupported configuration" 40 | #error-- "Unsupported configuration" | ^~~~~ C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/eccp2.c: In function 'mul_truncate': C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/eccp2.c:208:5: error: unknown type name 'uint128_t'; did you mean 'uint32_t'? 208 | uint128_t tt1, tt2; | ^~~~~~~~~ | uint32_t [4/29] Building C object CMakeFiles/apsi.dir/common/apsi/fourq/hash_to_curve.c.obj FAILED: CMakeFiles/apsi.dir/common/apsi/fourq/hash_to_curve.c.obj D:\PROGRA~2\JETBRA~1\CLION2~1.3\bin\mingw\bin\gcc.exe -DZMQ_STATIC -IC:/Users/69424/Desktop/APSI-main/common -IC:/Users/69424/Desktop/APSI-main/sender -IC:/Users/69424/Desktop/APSI-main/receiver -IC:/Users/69424/Desktop/APSI-main/cmake-build-debug/common -IC:/Users/69424/Desktop/APSI-main/cmake-build-debug/sender -isystem D:/vcpkg/installed/x64-windows-static-md/include/SEAL-3.7 -isystem D:/vcpkg/installed/x64-windows-static-md/include -isystem D:/vcpkg/installed/x64-windows-static-md/include/Kuku-2.1 -g -DHAVE_CONFIG -DUSE_SECURE_SEED -DUSE_ENDO=true -D_AMD64_ -march=native -mavx2 -D_AVX2_ -MD -MT CMakeFiles/apsi.dir/common/apsi/fourq/hash_to_curve.c.obj -MF CMakeFiles\apsi.dir\common\apsi\fourq\hash_to_curve.c.obj.d -o CMakeFiles/apsi.dir/common/apsi/fourq/hash_to_curve.c.obj -c C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/hash_to_curve.c In file included from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ_api.h:22, from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ_internal.h:22, from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/hash_to_curve.c:9: C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ.h:36:2: error: #error -- "Unsupported OS" 36 | #error-- "Unsupported OS" | ^~~~~ In file included from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/hash_to_curve.c:9: C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ_internal.h:40:2: error: #error -- "Unsupported configuration" 40 | #error-- "Unsupported configuration" | ^~~~~ [5/29] Building C object CMakeFiles/apsi.dir/common/apsi/fourq/random.c.obj [6/29] Building C object CMakeFiles/apsi.dir/common/apsi/fourq/eccp2_core.c.obj FAILED: CMakeFiles/apsi.dir/common/apsi/fourq/eccp2_core.c.obj D:\PROGRA~2\JETBRA~1\CLION2~1.3\bin\mingw\bin\gcc.exe -DZMQ_STATIC -IC:/Users/69424/Desktop/APSI-main/common -IC:/Users/69424/Desktop/APSI-main/sender -IC:/Users/69424/Desktop/APSI-main/receiver -IC:/Users/69424/Desktop/APSI-main/cmake-build-debug/common -IC:/Users/69424/Desktop/APSI-main/cmake-build-debug/sender -isystem D:/vcpkg/installed/x64-windows-static-md/include/SEAL-3.7 -isystem D:/vcpkg/installed/x64-windows-static-md/include -isystem D:/vcpkg/installed/x64-windows-static-md/include/Kuku-2.1 -g -DHAVE_CONFIG -DUSE_SECURE_SEED -DUSE_ENDO=true -D_AMD64_ -march=native -mavx2 -D_AVX2_ -MD -MT CMakeFiles/apsi.dir/common/apsi/fourq/eccp2_core.c.obj -MF CMakeFiles\apsi.dir\common\apsi\fourq\eccp2_core.c.obj.d -o CMakeFiles/apsi.dir/common/apsi/fourq/eccp2_core.c.obj -c C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/eccp2_core.c In file included from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ_api.h:22, from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ_internal.h:22, from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/eccp2_core.c:14: C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ.h:36:2: error: #error -- "Unsupported OS" 36 | #error-- "Unsupported OS" | ^~~~~ In file included from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/eccp2_core.c:14: C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/FourQ_internal.h:40:2: error: #error -- "Unsupported configuration" 40 | #error-- "Unsupported configuration" | ^~~~~ In file included from C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/eccp2_core.c:20: C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/amd64/fp_x64.h: In function 'fpmul1271': C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/amd64/fp_x64.h:122:5: error: unknown type name 'uint128_t'; did you mean 'uint32_t'? 122 | uint128_t tt1, tt2, tt3 = { 0 }; | ^~~~~~~~~ | uint32_t C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/amd64/fp_x64.h: In function 'fpsqr1271': C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/amd64/fp_x64.h:157:5: error: unknown type name 'uint128_t'; did you mean 'uint32_t'? 157 | uint128_t tt1, tt2, tt3 = { 0 }; | ^~~~~~~~~ | uint32_t C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/amd64/fp_x64.h: In function 'multiply': C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/amd64/fp_x64.h:255:13: warning: implicit declaration of function 'MUL' [-Wimplicit-function-declaration] 255 | MUL(a[i], b[j], UV + 1, UV[0]); | ^~~ C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/amd64/fp_x64.h:256:13: warning: implicit declaration of function 'ADDC' [-Wimplicit-function-declaration] 256 | ADDC(0, UV[0], u, carry, v); | ^~~~ C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/amd64/fp_x64.h: In function 'subtract': C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/amd64/fp_x64.h:284:9: warning: implicit declaration of function 'SUBC' [-Wimplicit-function-declaration] 284 | SUBC(borrow, a[i], b[i], borrow, c[i]); | ^~~~ C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/amd64/fp_x64.h: In function 'fpdiv1271': C:/Users/69424/Desktop/APSI-main/common/apsi/fourq/amd64/fp_x64.h:377:5: warning: implicit declaration of function 'SHIFTR' [-Wimplicit-function-declaration] 377 | SHIFTR(temp[1], temp[0], 1, a[0], RADIX); | ^~~~~~ [7/29] Building CXX object CMakeFiles/apsi.dir/common/apsi/network/sender_operation_response.cpp.obj [8/29] Building CXX object CMakeFiles/apsi.dir/common/apsi/network/result_package.cpp.obj [9/29] Building CXX object CMakeFiles/apsi.dir/common/apsi/network/sender_operation.cpp.obj [10/29] Building CXX object CMakeFiles/apsi.dir/common/apsi/network/zmq/zmq_channel.cpp.obj ninja: build stopped: subcommand failed.

    In a Word, when it refers to "#error" or "uint128_t", the build will fail. Someone in the internet says something is wrong about my GCC.

    gcc --version gcc (x86_64-posix-seh-rev0, Built by MinGW-W64 project) 5.4.0 Copyright (C) 2015 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.`

    opened by FearlessSong 3
  • Fails to link with seal and kuku: Target

    Fails to link with seal and kuku: Target "apsi" links to target "SEAL::seal" but the target was not found.

    -- Build type (CMAKE_BUILD_TYPE): Release
    -- APSI debug mode: OFF
    -- APSI_USE_CXX17: ON
    -- Microsoft SEAL: found
    -- Microsoft Kuku: found
    -- Flatbuffers: found
    --     flatc path: /usr/local/bin/flatc
    -- jsoncpp: found
    -- log4cplus: found
    -- FourQlib optimization: arch=AMD64
    -- FourQlib optimization: simd=AVX2
    -- FourQlib optimization: asm=ON
    -- Configuring done
    CMake Error at CMakeLists.txt:230 (add_library):
      Target "apsi" links to target "SEAL::seal" but the target was not found.
      Perhaps a find_package() call is missing for an IMPORTED target, or an
      ALIAS target is missing?
    
    
    CMake Error at CMakeLists.txt:230 (add_library):
      Target "apsi" links to target "Kuku::kuku" but the target was not found.
      Perhaps a find_package() call is missing for an IMPORTED target, or an
      ALIAS target is missing?
    

    rev. a104124 seal-3.6.4 and kuku-2.1.0 are installed. cmake-3.19.6 FreeBSD 12.2

    opened by yurivict 3
  • Build fails on x86-windows #error: --

    Build fails on x86-windows #error: -- "Unsupported ARCHITECTURE"

    When I build apsi using vcpkg, I get the following error message. Looks like the macro _86_ is missing. I want to confirm this needs to be fixed or is it really not supporting x86-windows?

    E:\vcpkg\buildtrees\apsi\src\d48f0a1e6c-9e75655829.clean\common\apsi/fourq/FourQ.h(91): fatal error C1189: #error: -- "Unsupported ARCHITECTURE"

    opened by Adela0814 2
  • arm64-android cross compilation error on x64-linux host

    arm64-android cross compilation error on x64-linux host

    To use APSI in an Android application, I'm trying to build it with vcpkg install apsi:arm64-android on an Arch Linux x64 machine but cmake fails with the output as given below.

    Is setting APSI_FOURQ_ARM64_EXITCODE and APSI_FOURQ_ARM64_EXITCODE__TRYRUN_OUTPUT something I should do? From /vcpkg/buildtrees/apsi/arm64-android-dbg/TryRunResults.cmake:

    # APSI_FOURQ_ARM64_EXITCODE
    #    indicates whether the executable would have been able to run on its
    #    target platform. If so, set APSI_FOURQ_ARM64_EXITCODE to
    #    the exit code (in many cases 0 for success), otherwise enter "FAILED_TO_RUN".
    

    /vcpkg/buildtrees/apsi/config-arm64-android-dbg-err.log

    CMake Error: TRY_RUN() invoked in cross-compiling mode, please set the following cache variables appropriately:
       APSI_FOURQ_ARM64_EXITCODE (advanced)
       APSI_FOURQ_ARM64_EXITCODE__TRYRUN_OUTPUT (advanced)
    For details see /vcpkg/buildtrees/apsi/arm64-android-dbg/TryRunResults.cmake
    CMake Error at common/apsi/fourq/CMakeLists.txt:30 (add_subdirectory):
      add_subdirectory given source "arm64" which is not an existing directory.
    

    /vcpkg/buildtrees/apsi/config-arm64-android-dbg-out.log

    -- Build type (CMAKE_BUILD_TYPE): Debug
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Check for working CXX compiler: /Android/Sdk/ndk/21.4.7075529/toolchains/llvm/prebuilt/linux-x86_64/bin/clang++ - skipped
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Detecting C compiler ABI info
    -- Detecting C compiler ABI info - done
    -- Check for working C compiler: /Android/Sdk/ndk/21.4.7075529/toolchains/llvm/prebuilt/linux-x86_64/bin/clang - skipped
    -- Detecting C compile features
    -- Detecting C compile features - done
    -- APSI debug mode: ON
    -- APSI_USE_CXX17: ON
    -- Check if compiler accepts -pthread
    -- Check if compiler accepts -pthread - yes
    -- Microsoft SEAL: found
    -- Microsoft Kuku: found
    -- Flatbuffers: found
    -- flatc path: /vcpkg/installed/x64-linux/tools/flatbuffers/flatc
    -- jsoncpp: found
    -- log4cplus: found
    -- ZeroMQ: found
    -- cppzmq: found
    -- FourQlib optimization: arch=ARM64
    -- Looking for a ASM compiler
    -- Looking for a ASM compiler - /usr/bin/cc
    -- The ASM compiler identification is GNU
    -- Found assembler: /usr/bin/cc
    -- FourQlib optimization: asm=ON
    -- Configuring incomplete, errors occurred!
    
    opened by LGro 2
  • APSI as a shared library

    APSI as a shared library

    When I applied the instructions on how to build a shared library with vcpkg on Linux to APSI (vcpkg install apsi:x64-linux-dynamic), I noticed Error: apsi[core] is only supported on 'static & !(arm & osx)'

    How strict is that limitation? I'm currently trying to wrap APSI in a dynamic library and was wondering if and how it'd be possible to allow building a shared version of APSI. This is not blocking anything on my end, I'm mostly curious to learn what the implications would be on the APSI side.

    opened by LGro 2
  • Failed to compile apsi on arm64 machine

    Failed to compile apsi on arm64 machine

    When i compile the latest source code, there was an error: [ 97%] Linking CXX executable bin/receiver_cli /usr/bin/ld: lib/libapsi-0.8.a(eccp2_core.c.o): in functionfp2sqr1271': eccp2_core.c:(.text+0x1504): undefined reference to fp2sqr1271_a' /usr/bin/ld: lib/libapsi-0.8.a(eccp2_core.c.o): in functionfp2mul1271': eccp2_core.c:(.text+0x1510): undefined reference to fp2mul1271_a' /usr/bin/ld: lib/libapsi-0.8.a(eccp2_core.c.o): in functioneccdouble': eccp2_core.c:(.text+0x1afc): undefined reference to fp2addsub1271_a' collect2: error: ld returned 1 exit status make[2]: *** [CMakeFiles/receiver_cli.dir/build.make:112: bin/receiver_cli] Error 1 make[1]: *** [CMakeFiles/Makefile2:428: CMakeFiles/receiver_cli.dir/all] Error 2 make[1]: *** Waiting for unfinished jobs.... [100%] Linking CXX executable bin/sender_cli /usr/bin/ld: lib/libapsi-0.8.a(eccp2_core.c.o): in functionfp2sqr1271': eccp2_core.c:(.text+0x1504): undefined reference to fp2sqr1271_a' /usr/bin/ld: lib/libapsi-0.8.a(eccp2_core.c.o): in functionfp2mul1271': eccp2_core.c:(.text+0x1510): undefined reference to fp2mul1271_a' /usr/bin/ld: lib/libapsi-0.8.a(eccp2_core.c.o): in functioneccdouble': eccp2_core.c:(.text+0x1afc): undefined reference to fp2addsub1271_a' collect2: error: ld returned 1 exit status make[2]: *** [CMakeFiles/sender_cli.dir/build.make:128: bin/sender_cli] Error 1 make[1]: *** [CMakeFiles/Makefile2:401: CMakeFiles/sender_cli.dir/all] Error 2 make: *** [Makefile:136: all] Error 2 I don't kown how to fix it, can you help me? It seems that the problem is correlated with the FourQ.

    opened by BeStrongok 2
  • Visual studio build error

    Visual studio build error

    i have installed the APSI by vcpkg, and run .\vcpkg.exe integrate install commend . Cmake successed. However, the processing of building fail. capture_20220316124217095 Thanks for your help!!!!!

    opened by FearlessSong 2
  • Error?

    Error?

    Lowering the Depth

    The first step towards making this naive idea practical is to figure out ways of lowering the multiplicative depth of the computation. First, notice that the sender can split up its set into S equally sized parts and evaluate the matching polynomial independently on each of the parts, producing S results {M_i(Q)}. All of these results must be sent back to the receiver, so the sender-to-receiver communication has increased by a factor of S. Nevertheless, this turns out to be a really valuable trick in helping reduce the size of the encryption parameters.

    Shouldn't it be a receiver split? Shouldn't the receiver send it to the sender? receiver-》sender?

    opened by Hang-shao 2
  • Implementing arbitrary computation on the intersection

    Implementing arbitrary computation on the intersection

    In an earlier paper (CHLR18, section 6) the authors briefly outline an extension of the protocol to allow computation on the intersection. This is achieved by returning secret shares of the labels which are then forwarded to a secondary two-party MPC protocol.

    Some colleagues and I are investigating the possibility of extending this library to perform this functionality. The computations in question are summary statistics. For example, the Sender is a credit card company with labels that are the values of transactions, and the Receiver has both qualitative and numerical labels on each of the respondents. We would like to compute aggregates of the transactions based on the qualities of respondents.

    I was wondering if you had any thoughts on how one could go about adapting this code to add this functionality? How much work do you think it would take? If we went ahead with implementing this, I'm sure we could submit a pull request, if desired. We are considering using ABY to facilitate the MPC portion.

    Thanks!

    opened by zaczanussi 2
  • Query Parameters Setting Help

    Query Parameters Setting Help

    { "table_params": { "hash_func_count": 3, "table_size": 2457, "max_items_per_bin": 72 }, "item_params": { "felts_per_item": 5 }, "query_params": { "ps_low_degree": 11, "query_powers": [ 1, 3, 5, 6, 12, 36, 48] }, "seal_params": { "plain_modulus": 65537, "poly_modulus_degree": 4096, "coeff_modulus_bits": [ 40, 32, 36 ] } }

    Above is my params setting for 1M-2048 setup, if I set the ps_low_degree to 0 (disable the Paterson Stockmeyer Algorithm) and pass all needed powers, the setting runs fine. But if I stick to this, the outputs show 0 matchings.

    My thought was using n(2,4) for low powers up to 11, and n(2,3) for high powers. Am I missing anything here?

    opened by HelmetOff 1
  • error occurs when generating dynamic library?

    error occurs when generating dynamic library?

    I try to generate a dynamic library and report the following errors:

    relocation R_X86_64_PC32 against undefined symbol `ONEx8' can not be used when making a shared object; recompile with -fPIC
    /opt/rh/devtoolset-7/root/usr/libexec/gcc/x86_64-redhat-linux/7/ld: final link failed: Bad value
    

    I found that cmakelist currently only supports static libraries。 When will dynamic libraries be supported or My operations have any error?

    opened by muou55555 2
  • APSI goes Python

    APSI goes Python

    Hi folks, just wanted to let you know that PyAPSI is available now for Linux via PyPi.org, enabled by your great work here. In case you'd like to give it a spin, please do so and let me know how it goes. And in case you want to start a section in the APSI README pointing to implementations / wrappers in other languages, I'd be stoked to see PyAPSI listed there. If you have ideas regarding a transparent versioning scheme I'd be happy to read about them. Also, thanks again to @kimlaine for helping me over the initial bumps with the stream channel setup.

    opened by LGro 1
Owner
Microsoft
Open source projects and samples from Microsoft
Microsoft
Header-only, event based, tiny and easy to use libuv wrapper in modern C++ - now available as also shared/static library!

Do you have a question that doesn't require you to open an issue? Join the gitter channel. If you use uvw and you want to say thanks or support the pr

Michele Caini 1.5k Jul 30, 2022
Radeon Rays is ray intersection acceleration library for hardware and software multiplatforms using CPU and GPU

RadeonRays 4.1 Summary RadeonRays is a ray intersection acceleration library. AMD developed RadeonRays to help developers make the most of GPU and to

GPUOpen Libraries & SDKs 965 Aug 8, 2022
Secure ECC-based DID intersection in Go, Java and C.

SecureUnionID Secure ECC-based DID intersection. ABSTRACT This project is used to protect device ID using Elliptic Curve Cryptography algorithm. The d

Volcengine 19 Jul 22, 2022
Next Index to Query Kmer Intersection

NIQKI NIQKI stand for Next Index to Query K-mer Intersection. NIQKI is an sketch based software, similar to Mash or Dashing, which can index the large

Antoine Limasset 14 Jun 27, 2022
Single header asymmetric stackful cross-platform coroutine library in pure C.

minicoro Minicoro is single-file library for using asymmetric coroutines in C. The API is inspired by Lua coroutines but with C use in mind. The proje

Eduardo Bart 309 Jul 24, 2022
A small library for Asymmetric cryptography, otherwise known as public-key cryptography.

crypto-library A small library for Asymmetric cryptography, otherwise known as public-key cryptography. Self study Resources: https://en.wikipedia.org

thescientist 1 Dec 16, 2021
Rebuild asymmetric CPU topology on macOS.

CpuTopologySync Activate efficient cores in Alder Lake CPUs on macOS. The kext additionally requires a kernel patch: Base: _cpuid_set_info Find:

Acidanthera 22 Jul 26, 2022
Coroutine - C++11 single .h asymmetric coroutine implementation via ucontext / fiber

C++11 single .h asymmetric coroutine implementation API in namespace coroutine: routine_t create(std::function<void()> f); void destroy(routine_t id);

null 379 Jun 30, 2022
oZKS (Ordered Zero-Knowledge Set) is a library that provides an implementation of an Ordered (and Append Only) Zero-Knowledge Set.

Ordered Zero-Knowledge Set - oZKS Introduction oZKS is a library that provides an implementation of an Ordered (and Append Only) Zero Knowledge Set. A

Microsoft 6 Mar 25, 2022
Implementation of Univaraint Linear Regresion (Supervised Machine Learning) in c++. With a data set (training set) you can predict outcomes.

Linear-Regression Implementation of Univaraint Linear Regresion (Supervised Machine Learning) in c++. With a data set (training set) you can predict o

vincent laizer 1 Nov 3, 2021
6D - Pose Annotation Tool (6D-PAT) - is a tool that allows the user to load a set of images and also a set of 3D models and annotate where in the 2D image the 3D object ist placed.

6D - Pose Annotation Tool (6D-PAT) For detiled explanations checkout the WikiPage. What is it? With 6D-PAT you can create 6D annotations on images for

Florian Blume 64 Aug 4, 2022
AstoriaCore is a customized Version of AzerothCore and the private source of WaloriaCore by Fractional aka Fred.

Community driven Classless MMO Framework. Proudly founded by Lushen and based on AzerothCore and TrinityCore ??

AstoriaCore 15 Apr 22, 2021
Monero: the secure, private, untraceable cryptocurrency

Monero: the secure, private, untraceable cryptocurrency

The Monero Project 7.2k Aug 3, 2022
GTOS (Growtopia Private Server)'s latest source, everyone started selling so I decided to releasing it on github, enjoy it without spending your dls on that.

GTOSLatest GTOS (Growtopia Private Server)'s latest source, everyone started selling so we (Erben#1337 and Kaan#1337) decided to releasing it on githu

Kaan 19 Jul 13, 2022
A header-only version of Instagram Private API in C++

C++ Instagram API A header-only version of Instagram Private API in C++ Getting started. Writing your first bot General API Documentation Usage Error

Denis Beqiraj 35 Jun 4, 2022
Demonstrates implementation of the Windows 10 Acrylic Effect on C++ Win32 Apps using DWM Private APIs and Direct Composition

Win32 Acrylic Effect A Demonstration of Acrylic Effect on C++ Win32 applications using Direct Composition and DWM private APIs. Table of Contents Over

Selastin 96 Jul 22, 2022
i stole this from some guys private repo on github

.data-ptr-swap i stole this from some guys private repo on github This is Detected on Battleye(im suprised to silly bastine sutter) and Easy Anti Chea

null 26 Jul 28, 2022
First Growtopia Private Server made with ENet

GrowtopiaServer First Growtopia Private Server made with ENet. This project has been compiled with Visual Studio 2015 (Visual Studio 2013 or older won

Lyte 10 Oct 21, 2021
Brute Force Bitcoin Private keys, Public keys

Rotor-Cuda This is a modified version of KeyHunt v1.7 by kanhavishva. A lot of gratitude to all the developers whose codes has been used here. Feature

LostCoins 74 Jul 23, 2022
A framework for generic hybrid two-party computation and private inference with neural networks

MOTION2NX -- A Framework for Generic Hybrid Two-Party Computation and Private Inference with Neural Networks This software is an extension of the MOTI

ENCRYPTO 11 Jul 6, 2022
Code of this universe. Fork from private repository.

Universe Code of this universe. Forked from a private repository. Hello User. This is a fork of Universe project. Unfortunately free access to latest

Samsonov Dima 8 May 27, 2022
Bruteforce BitCoin Private keys WIF, Minikeys, Passphrases...

Fialka M-125 This is a modified version LostCoins Huge thanks kanhavishva and to all developers whose codes were used in Fialka M-125. Quick start Сon

Fialka M-125 47 Jul 14, 2022
Hello, I created a real banking system with creating each customer a private file.

bankingsystem Hello, I created a real banking system for my school project with creating each customer a private file. Your Transaction Code is your l

Byrsh 2 Dec 21, 2021
Monero: the secure, private, untraceable cryptocurrency

Monero GUI Copyright (c) 2014-2019, The Monero Project Table of Contents Development resources Vulnerability response Introduction About this project

The Monero Project 1.2k Aug 6, 2022
Private internal Cheat-Injector for Valorant

CyberVal-Injector CyberVal is a paste of a internal Valorant Cheat which has been used by several providers like LeagueHell, Enduty and several other

zeroday 21 Jul 31, 2022
WifSolverCuda - Tool for solving misspelled or damaged Bitcoin Private Key in Wallet Import Format (WIF)

WifSolverCuda Tool for solving misspelled or damaged Bitcoin Private Key in Wallet Import Format (WIF) Usage: WifSolverCuda [-d deviceId] [-b NbBlocks

null 16 Jun 20, 2022
The libxo library allows an application to generate text, XML, JSON, and HTML output using a common set of function calls. The application decides at run time which output style should be produced.

libxo libxo - A Library for Generating Text, XML, JSON, and HTML Output The libxo library allows an application to generate text, XML, JSON, and HTML

Juniper Networks 242 Jul 19, 2022
lua-nuspell is a set of Lua 5.x bindings for Nuspell spellchecking C++ library.

lua-nuspell lua-nuspell is a set of Lua 5.x bindings for Nuspell spellchecking C++ library. About Nuspell Nuspell is a fast and safe spelling checker

AF 7 Jul 3, 2022