📊 An Overview of PDP Integration for FIL+ Deals

@luca, @irene, March 2025

As the mainnet launch of PDP approaches, a natural question arises: how can PDP be integrated with those FIL+ deals that are meant to provide retrievability guarantees?

While we know that not all FIL+ deals require this feature, a subclass of them does and would benefit from showing PDP-like properties.

🤔 Why PDP?

PDP (Proof of Data Possession) periodically proves that a hot (i.e. unsealed) copy of the data is accessible to the storage provider (SP), offering assurance that the data can be retrieved without the need to unseal the sector. While this does not guarantee retrieval, it makes it more likely that the data can actually be served.

Furthermore, PDP can serve as a prerequisite for datacap allocation. If PDP proofs are verified and included in the compliance report, allocators can require SPs to meet this requirement before they request additional datacap.

🔌 PDP Integration

The simplest way to integrate PDP is to require FIL+ SPs to pair a PDP proof with the existing WindowPoSt. This ensures that an unsealed copy of the data remains accessible over time.

However, pairing these two distinct proofs presents one main challenge, which can be addressed in a few different ways.

The key property to enforce is that the data sealed into sectors and the data proven in the PDP proofset must be the same. If this is not the case, the PDP proof does not offer any meaningful guarantees beyond the status quo.

Without such a link, a malicious SP could submit a PDP proof for a proofset filled with arbitrary data (such as a string of zeros) and easily pass the PDP proof requirement without ever accessing the actual unsealed data.

To resolve this, the PieceCID of the FIL+ sector must be included in the PDP proofset.

Proofsets, by definition, are versatile objects, allowing multiple CIDs to be aggregated together to reduce the number of Merkle tree roots per proofset (at the cost of larger Merkle tree proofs). In principle, any PDP SP can decide whether or not to aggregate CIDs.
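The trade-off between fewer roots and larger proofs can be made concrete with a small sketch. It assumes that aggregation uses a balanced binary Merkle tree over equally sized pieces, which is an assumption for illustration and not something the proofset format mandates; the function names are hypothetical.

```python
def extra_proof_hashes(pieces_per_root: int) -> int:
    """Extra sibling hashes per Merkle proof when `pieces_per_root` pieces
    are aggregated under one root (assumes a balanced binary tree)."""
    if pieces_per_root < 1:
        raise ValueError("pieces_per_root must be at least 1")
    # ceil(log2(n)) computed with integers only
    return (pieces_per_root - 1).bit_length()


def root_count(total_pieces: int, pieces_per_root: int) -> int:
    """Number of roots needed to cover `total_pieces` pieces."""
    if total_pieces < 0 or pieces_per_root < 1:
        raise ValueError("invalid piece counts")
    # ceiling division
    return -(-total_pieces // pieces_per_root)


# Example: 1024 pieces, aggregated 8 per root
roots = root_count(1024, 8)
extra = extra_proof_hashes(8)
```

Under these assumptions, grouping 8 pieces per root divides the number of roots by 8 while adding 3 hashes to each proof path.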

⚙️ Option 1: Require Unaggregated CIDs in the Proofset

The easiest option is to require FIL+ SPs to provide a corresponding PDP proofset in which each sector's PieceCID is stored and proven individually, without aggregation.
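Under Option 1, the check reduces to set membership: every PieceCID sealed into the SP's FIL+ sectors must appear as a root of the proofset. The sketch below assumes both lists have already been obtained (how they are fetched is out of scope here) and treats CIDs as plain strings; the function names and the example CID strings are hypothetical.

```python
from typing import Iterable, List


def missing_piece_cids(sector_piece_cids: Iterable[str],
                       proofset_roots: Iterable[str]) -> List[str]:
    """Return the FIL+ PieceCIDs that do not appear as unaggregated
    roots in the proofset, preserving their original order."""
    roots = set(proofset_roots)
    return [cid for cid in sector_piece_cids if cid not in roots]


def satisfies_option_1(sector_piece_cids: Iterable[str],
                       proofset_roots: Iterable[str]) -> bool:
    """True when every sector PieceCID is a root of the proofset."""
    return not missing_piece_cids(sector_piece_cids, proofset_roots)


# Placeholder identifiers, not real CIDs
sealed = ["piece-a", "piece-b", "piece-c"]
proven = ["piece-a", "piece-c", "unrelated-root"]
gaps = missing_piece_cids(sealed, proven)
```

A non-empty result means the proofset does not cover all sealed data, so the PDP proof says nothing about the missing pieces.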

🔗 Option 2: Allow Aggregated CIDs with Proofs on Demand

Alternatively, aggregation of PieceCIDs within the proofset can be allowed, with the caveat that PieceCID inclusion proofs must be available on demand. These proofs would use Merkle tree (MT) proofs to show that a given PieceCID has been aggregated and belongs to a specific tree rooted in the aggregated MT root of the proofset.
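An on-demand inclusion proof of this kind can be sketched as a standard Merkle path check. This is a minimal illustration and not the PDP implementation: it uses plain SHA-256 over 32-byte leaves as a stand-in for the actual piece commitment hashing, requires a power-of-two number of leaves, and all function names are hypothetical.

```python
import hashlib
from typing import List


def hash_pair(left: bytes, right: bytes) -> bytes:
    """Parent node = SHA-256(left || right). Stand-in hash, an assumption."""
    return hashlib.sha256(left + right).digest()


def build_levels(leaves: List[bytes]) -> List[List[bytes]]:
    """Build all tree levels, from the leaves up to the single root."""
    n = len(leaves)
    if n == 0 or n & (n - 1):
        raise ValueError("leaf count must be a non-zero power of two")
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([hash_pair(prev[i], prev[i + 1])
                       for i in range(0, len(prev), 2)])
    return levels


def inclusion_proof(levels: List[List[bytes]], index: int) -> List[bytes]:
    """Collect the sibling hash at each level on the path to the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])  # sibling of the current node
        index //= 2
    return proof


def verify_inclusion(leaf: bytes, index: int,
                     proof: List[bytes], root: bytes) -> bool:
    """Recompute the root from the leaf and its sibling path."""
    node = leaf
    for sibling in proof:
        if index % 2 == 0:
            node = hash_pair(node, sibling)
        else:
            node = hash_pair(sibling, node)
        index //= 2
    return node == root


# Four placeholder piece commitments aggregated under one root
pieces = [hashlib.sha256(f"piece-{i}".encode()).digest() for i in range(4)]
levels = build_levels(pieces)
aggregated_root = levels[-1][0]
proof_for_2 = inclusion_proof(levels, 2)
ok = verify_inclusion(pieces[2], 2, proof_for_2, aggregated_root)
```

The verifier only needs the PieceCID, its position, the sibling path, and the aggregated root that is already part of the proofset, so the SP can produce the proof on demand without revealing the other pieces.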

✅ Checking Correct FIL+ CID Inclusion in the Proofset