Oracle (Truora) Study Notes 3: Principles of Oracles

Reprinted: Oracle | ethereum.org

The source article is a good introduction to the concept and general usage of oracles. It is very long, so only an excerpt is reproduced here.

Oracle

An oracle is a data feed that fetches data from sources outside the blockchain (off-chain) and stores it on the blockchain (on-chain) for use by smart contracts. Oracles are essential because smart contracts running on Ethereum cannot access information stored outside the blockchain network.

Giving smart contracts the ability to execute using off-chain data inputs extends the value of decentralized applications. For example, decentralized prediction markets rely on oracles to provide information about outcomes and are able to use this information to validate users' predictions. Suppose Alice bets 20 ETH on who will be the next President of the United States. In this case, a prediction market dapp needs an oracle to confirm the election result and determine whether Alice is eligible for payment.

Prerequisites

This page assumes that the reader is familiar with the basics of Ethereum, such as nodes, consensus mechanisms, and the Ethereum Virtual Machine. Readers should also have a solid understanding of smart contracts and smart contract anatomy, especially events.

What is a blockchain oracle?

Oracles are applications that acquire, verify, and communicate external information (that is, information stored off-chain) to smart contracts running on the blockchain. In addition to “pulling” off-chain data and broadcasting it on Ethereum, oracles also “push” information from the blockchain to external systems. An oracle that unlocks a smart lock after a user sends a fee via an Ethereum transaction is an example of a push message.

The oracle acts as a "bridge" between smart contracts on the blockchain and off-chain data providers. Without oracles, smart contract applications can only access on-chain data. Oracles provide a mechanism to trigger smart contract functions using off-chain data.

Oracles differ in data provenance (one or more sources), trust model (centralized or decentralized), and system architecture (immediate-read, publish-subscribe, or request-response). We can also distinguish oracles by whether they retrieve external data for use by on-chain contracts (input oracles), send information from the blockchain to off-chain applications (output oracles), or perform computational tasks off-chain (computational oracles).

Why do smart contracts need oracles?

Most developers just think of smart contracts as pieces of code that run at specific addresses on the blockchain. However, a broader view of smart contracts is that a smart contract is a self-executing software program capable of executing an agreement between parties when certain conditions are met, which explains the term "smart contract".

However, using smart contracts to enforce agreements between people is not trivial because Ethereum is a deterministic system. A deterministic system is one that always produces the same result given the same initial state and inputs; that is, there is no randomness or variation in how the output is computed from the inputs.

To achieve deterministic execution, blockchains restrict nodes to reaching consensus on simple binary (true/false) questions by using only data stored in the blockchain itself. Examples of such questions include:

  • "Did the account owner (identified by the public key) sign this transaction with the paired private key?"

  • "Does the account have sufficient funds to cover this transaction?"

  • "Is this transaction valid in this smart contract?" and so on.

If the blockchain received information from external sources (i.e., the real world), determinism would be impossible to achieve, preventing nodes from agreeing on the validity of changes to the blockchain's state. Take, for example, a smart contract that executes a trade based on the current ETH-USD exchange rate obtained from a traditional price API. This rate is likely to change frequently (not to mention that the API could be deprecated or hacked), so nodes executing the same contract code could arrive at different results.

For public blockchains such as Ethereum, where transactions are processed by thousands of nodes around the world, determinism is critical. With no central authority serving as a source of truth, nodes are expected to arrive at the same state after applying the same transactions. If node A executed a smart contract's code and got "3" while node B got "7" after running the same transaction, consensus would break down and Ethereum would lose its value as a decentralized computing platform.

The foregoing also highlights the problem of designing blockchains to obtain information from external sources. However, oracles solve this problem by taking information from off-chain sources and storing it on the blockchain for use by smart contracts. Since information stored on-chain is immutable and publicly available, Ethereum nodes can safely use off-chain data imported by oracles to compute state changes without breaking consensus.
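The contrast between querying an API directly and reading a value an oracle has stored on-chain can be illustrated with a toy Python simulation. This is only a sketch: `price_api`, the price figures, and the two settlement functions are hypothetical stand-ins, not part of any real client.

```python
from itertools import count

# Hypothetical off-chain price API: every call returns a slightly different quote.
_ticks = count()

def price_api() -> int:
    """Return an ETH-USD quote in cents; the value drifts with each call."""
    return 180_000 + next(_ticks)

def settle_with_api(amount_eth: int) -> int:
    """Each node queries the API itself, so the input can differ per node."""
    return amount_eth * price_api()

def settle_with_onchain(amount_eth: int, stored_price: int) -> int:
    """Each node reads the same value that an oracle stored on-chain."""
    return amount_eth * stored_price

# Three nodes execute the same transaction, each calling the API on its own.
api_results = [settle_with_api(2) for _ in range(3)]

# An oracle writes one observation on-chain; every node then reads that value.
onchain_price = price_api()
onchain_results = [settle_with_onchain(2, onchain_price) for _ in range(3)]

print(len(set(api_results)))      # number of distinct results across nodes
print(len(set(onchain_results)))  # number of distinct results across nodes
```

In the first case the nodes disagree because their inputs differ; in the second case every node computes from the same stored input, so they all reach the same state.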

To this end, an oracle typically consists of a smart contract running on-chain and some off-chain components. On-chain contracts receive data requests from other smart contracts and pass those requests to off-chain components (called oracle nodes). Such oracle nodes can query data sources—for example, using an application programming interface (API)—and send transactions to store the requested data in the smart contract's storage.

In essence, blockchain oracles bridge the information gap between the blockchain and the external environment, creating a "hybrid smart contract." The working principle of hybrid smart contracts is based on the combination of on-chain contract code and off-chain infrastructure. A good example of a hybrid smart contract is the decentralized prediction market described in the introduction. Other examples might include crop insurance smart contracts that pay out when a set of oracles determine that certain weather phenomena have occurred.

What is the oracle problem?

By relying on an entity (or entities) to introduce external information to the blockchain (i.e. store the information in the transaction's data payload), it is easy for smart contracts to obtain off-chain data. But this creates new problems:

  • How do we verify that the injected information was extracted from the correct source and has not been tampered with?

  • How do you ensure that this data is always available and regularly updated?

The so-called "oracle problem" refers to the issues that come with using blockchain oracles to send inputs to smart contracts. Data from an oracle must be correct for a smart contract to execute correctly. Trustlessness matters just as much: having to "trust" oracle operators to reliably provide accurate information undermines the most critical property of smart contracts.

Different oracles take different approaches to solving the oracle problem, which we explore later. While no oracle is perfect, the merits of an oracle should be measured in terms of how it handles the following challenges:

  1. Correctness: Oracles should not cause smart contracts to trigger state changes based on invalid off-chain data. Therefore, the oracle must guarantee the authenticity and integrity of the data - authenticity means that the data is obtained from the correct source. Integrity means that the data remains intact (i.e. the data has not been modified) before being sent on-chain.

  2. Availability: Oracles should not delay or prevent smart contracts from executing actions or triggering state changes. This characteristic requires that the data provided by the oracle be available when requested and without interruption.

  3. Incentive compatibility: Oracles should incentivize off-chain data providers to submit correct information to smart contracts. Incentive compatibility involves attributability and accountability. Attributability allows a piece of external information to be linked to its provider, while accountability bonds data providers to the information they supply so that they can be rewarded or penalized based on its quality.

How does the blockchain oracle service work?

User

A user is an entity (i.e. a smart contract) that requires information from outside the blockchain to complete a specific action. The basic workflow of an oracle service starts with a user sending a data request to the oracle contract. Data requests will typically answer some or all of the following questions:

  1. What sources can off-chain nodes consult for the requested information?

  2. How do reporters process information from the data sources and extract useful data points?

  3. How many oracle nodes can participate in retrieving the data?

  4. How should discrepancies between oracle reports be managed?

  5. What method should be used to filter submissions and aggregate reports into a single value?
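One way to picture such a request is as a small record with one field per question. The sketch below is illustrative only: the `DataRequest` class, its field names, the allowed aggregation methods, and the example URL are all assumptions made for this note, not a format defined by any oracle network.

```python
from dataclasses import dataclass

# Hypothetical set of aggregation methods a request may ask for.
ALLOWED_AGGREGATIONS = {"median", "mean", "mode"}

@dataclass(frozen=True)
class DataRequest:
    sources: tuple               # 1. sources the off-chain nodes may consult
    extract_path: str            # 2. which data point reporters extract
    max_reporters: int           # 3. how many oracle nodes may participate
    max_deviation: float         # 4. tolerated relative spread between reports
    aggregation: str = "median"  # 5. how reports are reduced to one value

    def __post_init__(self):
        # Reject requests that leave one of the questions unanswered.
        if not self.sources:
            raise ValueError("at least one source is required")
        if self.max_reporters < 1:
            raise ValueError("at least one reporter is required")
        if self.max_deviation < 0:
            raise ValueError("max_deviation must be non-negative")
        if self.aggregation not in ALLOWED_AGGREGATIONS:
            raise ValueError(f"unknown aggregation: {self.aggregation}")

request = DataRequest(
    sources=("https://api.example.com/eth-usd",),  # placeholder URL
    extract_path="price",
    max_reporters=3,
    max_deviation=0.01,
)
```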

Oracle contract

The oracle contract is the on-chain component of the oracle service: it listens for data requests from other contracts, relays data queries to oracle nodes, and broadcasts the returned data to client contracts. The contract may also perform some computation on the returned data points to produce an aggregate value, which it sends to the requesting contract.

The oracle contract exposes functions that client contracts call when making a data request. When it receives a new query, the contract emits a log event with the details of the data request. This notifies the off-chain nodes subscribed to the log (typically through something like the JSON-RPC eth_subscribe command), which then proceed to retrieve the data defined in the log event.

Below is an example oracle contract by Pedro Costa. It is a simple oracle service that queries off-chain APIs when requested by other smart contracts and stores the requested information on the blockchain:

pragma solidity >=0.4.21 <0.6.0;

contract Oracle {
  Request[] requests; //list of requests made to the contract
  uint currentId = 0; //increasing request id
  uint minQuorum = 2; //minimum number of responses to receive before declaring final result
  uint totalOracleCount = 3; // Hardcoded oracle count

  // defines a general api request
  struct Request {
    uint id;                            //request id
    string urlToQuery;                  //API url
    string attributeToFetch;            //json attribute (key) to retrieve in the response
    string agreedValue;                 //value from key
    mapping(uint => string) answers;    //answers provided by the oracles
    mapping(address => uint) quorum;    //oracles which will query the answer (1=oracle hasn't voted, 2=oracle has voted)
  }

  //event that triggers oracle outside of the blockchain
  event NewRequest (
    uint id,
    string urlToQuery,
    string attributeToFetch
  );

  //triggered when there's a consensus on the final result
  event UpdatedRequest (
    uint id,
    string urlToQuery,
    string attributeToFetch,
    string agreedValue
  );

  function createRequest (
    string memory _urlToQuery,
    string memory _attributeToFetch
  )
  public
  {
    uint length = requests.push(Request(currentId, _urlToQuery, _attributeToFetch, ""));
    Request storage r = requests[length-1];

    // Hardcoded oracles address
    r.quorum[address(0x6c2339b46F41a06f09CA0051ddAD54D1e582bA77)] = 1;
    r.quorum[address(0xb5346CF224c02186606e5f89EACC21eC25398077)] = 1;
    r.quorum[address(0xa2997F1CA363D11a0a35bB1Ac0Ff7849bc13e914)] = 1;

    // launch an event to be detected by oracle outside of blockchain
    emit NewRequest (
      currentId,
      _urlToQuery,
      _attributeToFetch
    );

    // increase request id
    currentId++;
  }

  //called by the oracle to record its answer
  function updateRequest (
    uint _id,
    string memory _valueRetrieved
  ) public {
    Request storage currRequest = requests[_id];

    //check if oracle is in the list of trusted oracles
    //and if the oracle hasn't voted yet
    if(currRequest.quorum[address(msg.sender)] == 1){
      //marking that this address has voted
      currRequest.quorum[msg.sender] = 2;

      //iterate through "array" of answers until a position is free and save the retrieved value
      uint tmpI = 0;
      bool found = false;
      while(!found) {
        //find first empty slot
        if(bytes(currRequest.answers[tmpI]).length == 0){
          found = true;
          currRequest.answers[tmpI] = _valueRetrieved;
        }
        tmpI++;
      }

      uint currentQuorum = 0;

      //iterate through oracle list and check if enough oracles (minimum quorum)
      //have voted the same answer as the current one
      for(uint i = 0; i < totalOracleCount; i++){
        bytes memory a = bytes(currRequest.answers[i]);
        bytes memory b = bytes(_valueRetrieved);

        if(keccak256(a) == keccak256(b)){
          currentQuorum++;
          if(currentQuorum >= minQuorum){
            currRequest.agreedValue = _valueRetrieved;
            emit UpdatedRequest (
              currRequest.id,
              currRequest.urlToQuery,
              currRequest.attributeToFetch,
              currRequest.agreedValue
            );
          }
        }
      }
    }
  }
}
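To make the voting rule in updateRequest easier to follow, here is a rough Python model of the same idea: only listed oracles may vote, each votes once, and a value is accepted as soon as minQuorum matching answers exist. The `RequestModel` class and its method names are invented for this sketch, and it does not reproduce every detail of the contract (for example, it does not model events).

```python
class RequestModel:
    """Simplified, off-chain model of one Request in the contract above."""

    def __init__(self, oracles, min_quorum=2):
        # False = has not voted yet, True = has voted
        self.has_voted = {oracle: False for oracle in oracles}
        self.answers = []
        self.min_quorum = min_quorum
        self.agreed_value = None

    def update(self, sender, value):
        """Record a vote; return True if this vote meets the quorum."""
        if sender not in self.has_voted or self.has_voted[sender]:
            return False  # unknown oracle or duplicate vote: ignored
        self.has_voted[sender] = True
        self.answers.append(value)
        # Count how many recorded answers match the value just submitted.
        if self.answers.count(value) >= self.min_quorum:
            self.agreed_value = value
            return True
        return False

req = RequestModel(["oracle-1", "oracle-2", "oracle-3"])
first = req.update("oracle-1", "1800")
repeat = req.update("oracle-1", "1800")   # same oracle voting again
outsider = req.update("stranger", "1800")  # not in the trusted list
second = req.update("oracle-2", "1799")
third = req.update("oracle-3", "1800")
```

With three oracles and a quorum of two, the third vote settles the request on "1800" even though one oracle disagreed.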

Oracle node

An oracle node is the off-chain component of an oracle service: it pulls information from external sources (such as APIs hosted on third-party servers) and puts it on-chain for use by smart contracts. The oracle node listens for events emitted by the on-chain oracle contract and then completes the task described in the log.

A common task for an oracle node is to send an HTTP GET request to an API service, parse the response to extract the relevant data, format the output in a blockchain-readable form, and send it on-chain by including it in a transaction to the oracle contract. Oracle nodes may also be required to attest to the validity and integrity of submitted information using "authenticity proofs," which we explore later.
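The fetch, parse, format, and submit steps can be sketched in Python as follows. To keep the example runnable without a network or a blockchain client, the HTTP call and the transaction are replaced by hypothetical stand-ins (`fake_fetch` and `submit_answer`), and the URL and JSON payload are made up.

```python
import json
from typing import Callable

def handle_request(url: str, attribute: str, fetch: Callable[[str], str]) -> str:
    """Fetch a JSON document, extract one attribute, and format it for the contract."""
    body = fetch(url)            # an HTTP GET in a real node
    data = json.loads(body)      # parse the response
    value = data[attribute]      # extract the relevant data point
    return str(value)            # the example contract stores answers as strings

def fake_fetch(url: str) -> str:
    """Stand-in for an HTTP GET; returns a canned response."""
    return '{"symbol": "ETHUSD", "price": 1800.5}'

submitted = []

def submit_answer(request_id: int, value: str) -> None:
    """Stand-in for sending an updateRequest transaction to the oracle contract."""
    submitted.append((request_id, value))

answer = handle_request("https://api.example.com/ticker", "price", fake_fetch)
submit_answer(0, answer)
```

Passing the fetch function as a parameter keeps the parsing logic testable; a real node would supply a function that performs the actual HTTP request.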

Computational oracles also rely on off-chain nodes to perform intensive computational tasks that are impractical to perform on-chain due to gas costs and block size constraints. For example, an oracle node might be tasked with generating a verifiably random number (for example, for a blockchain game).


Oracle design patterns

There are different types of oracles, including immediate read, publish-subscribe, and request-response, with the latter two being the most popular in Ethereum smart contracts. The following briefly introduces the two types of oracle services:

Publish-subscribe oracles

A publish-subscribe based oracle service exposes a "data feed" that other contracts can periodically read for information. In this case, the data may change frequently, so the client contract must listen for updates to the data in the oracle storage. An oracle that provides users with the latest ETH-USD price is a good example.
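A minimal Python sketch of this pattern follows, assuming a feed that stores only the latest value together with a round counter and a timestamp. The `PriceFeed` class, `read_fresh_price`, and the `max_age` freshness check are illustrative choices made for this note, not the interface of any particular oracle.

```python
from dataclasses import dataclass

@dataclass
class PriceFeed:
    """Toy data feed: the oracle publishes, client contracts read."""
    value: int = 0
    updated_at: int = 0
    round_id: int = 0

    def publish(self, value: int, now: int) -> None:
        # Called by the oracle whenever the off-chain value changes.
        self.value = value
        self.updated_at = now
        self.round_id += 1

    def latest(self):
        return self.round_id, self.value, self.updated_at

def read_fresh_price(feed: PriceFeed, now: int, max_age: int) -> int:
    """Client-side read that rejects a missing or outdated value."""
    round_id, value, updated_at = feed.latest()
    if round_id == 0 or now - updated_at > max_age:
        raise ValueError("stale or missing price")
    return value

feed = PriceFeed()
feed.publish(180_000, now=100)                      # ETH-USD in cents
price = read_fresh_price(feed, now=130, max_age=60)
```

Because the data changes frequently, a reader that ignores the update time risks acting on an old value, which is why the sketch checks the age of the last update.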

Request-response oracles

The request-response setup allows client contracts to request arbitrary data other than that provided by the pub-sub oracle. Request-response oracles are best suited for the following conditions:

  • The dataset is too large to be stored in the smart contract's storage

  • Users only need a fraction of the data at any point in time

Although more complex than publish-subscribe oracles, request-response oracles work essentially as described in the previous section: the oracle has an on-chain component that receives data requests and passes them to off-chain nodes for processing.

Users who initiate data queries must bear the cost of retrieving information from off-chain sources. The client contract must also provide funds to pay for the gas costs incurred by the oracle contract to return responses through the callback function specified in the request.
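The flow of a paid request followed by a callback can be modeled in a few lines of Python. Everything here is a simplification under stated assumptions: the `RequestResponseOracle` class, the flat `fee`, and the in-memory callback registry are hypothetical and stand in for on-chain payment and a contract callback function.

```python
class RequestResponseOracle:
    """Toy request-response oracle with a prepaid fee and a callback."""

    def __init__(self, fee: int):
        self.fee = fee
        self.pending = {}    # request id -> callback of the client
        self.next_id = 0
        self.collected = 0

    def request(self, query: str, callback, payment: int) -> int:
        # The client must cover the cost of retrieval and of the response.
        if payment < self.fee:
            raise ValueError("insufficient payment")
        request_id = self.next_id
        self.next_id += 1
        self.pending[request_id] = callback
        self.collected += payment
        return request_id

    def fulfill(self, request_id: int, result) -> None:
        # Called once an off-chain node has retrieved the data.
        callback = self.pending.pop(request_id)
        callback(request_id, result)

received = {}

def on_result(request_id, result):
    """Client-side callback named in the request."""
    received[request_id] = result

oracle = RequestResponseOracle(fee=10)
rid = oracle.request("ETH-USD", on_result, payment=10)
oracle.fulfill(rid, 180_000)
```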


Types of oracles

Centralized oracle

A centralized oracle is controlled by a single entity that aggregates off-chain information and updates the oracle contract's data as requested. Centralized oracles are efficient because they rely on a single source of truth. Centralized oracles may even be a better choice in cases where proprietary datasets are published directly by their owners and have recognized signatures. However, using a centralized oracle can present various problems.

Low correctness guarantees

When using a centralized oracle, it is impossible to confirm that the information provided is correct. An oracle provider may be "reputable," but that doesn't rule out the possibility that someone misbehaved or a hacker tampered with the system. If the oracle is compromised, the smart contract will execute based on bad data.

Poor availability

Centralized oracles cannot guarantee that off-chain data will always be provided to other smart contracts. If the provider decides to shut down the service or if a hacker hijacks the off-chain component of the oracle, the smart contract is at risk of a Denial of Service (DoS) attack.

Poor Incentive Compatibility

Centralized oracles often have poorly designed or non-existent incentives for data providers to send accurate, unaltered information. Paying an oracle for its services may encourage honest behavior, but this may not be enough. With smart contracts controlling enormous amounts of value, the payoff from manipulating oracle data is greater than ever.

Decentralized Oracle

Decentralized oracles are designed to overcome the limitations of centralized oracles by eliminating single points of failure. A decentralized oracle service consists of multiple participants in a peer-to-peer network that reach consensus on off-chain data before sending it to a smart contract.

Ideally, a decentralized oracle should be permissionless, trustless, and free from administration by a central party; in reality, oracles are decentralized to varying degrees. There are semi-decentralized oracle networks where anyone can participate, but an "owner" approves and removes nodes based on past performance. Fully decentralized oracle networks also exist: these typically run as standalone blockchains and have defined consensus mechanisms for coordinating nodes and punishing misbehavior.

Using a decentralized oracle has the following benefits:

High correctness guarantees

Decentralized oracles try to use different methods to achieve data correctness. These include the use of proofs to demonstrate the authenticity and integrity of returned information, and requiring multiple entities to collectively agree on the validity of off-chain data.

Authenticity proofs

Authenticity proofs are cryptographic mechanisms that enable independent verification of information retrieved from external sources. These proofs validate the source of the information and detect possible alterations to the data after retrieval.

Examples of authenticity proofs include:

Transport Layer Security (TLS) proofs: Oracle nodes often retrieve data from external data sources over a secure HTTP connection based on the Transport Layer Security (TLS) protocol. Some decentralized oracles use authenticity proofs to verify TLS sessions (i.e., to confirm the exchange of information between a node and a specific server) and to confirm that the contents of the session were not altered.

Trusted Execution Environment (TEE) attestations: A Trusted Execution Environment (TEE) is a sandboxed computing environment that is isolated from the operating processes of its host system. A TEE ensures that any application code or data stored or used within it retains its integrity, confidentiality, and immutability. Users can also generate an attestation to prove that an application instance is running inside a TEE.

Certain classes of decentralized oracles require oracle node operators to provide TEE attestations. These confirm to users that the node operator is running an instance of the oracle client in a trusted execution environment. Because a TEE prevents external processes from altering or reading an application's code and data, the attestations show that the oracle node has kept the information intact and confidential.

Consensus-based information verification

Centralized oracles rely on a single source of truth when feeding data to smart contracts, thus potentially publishing inaccurate information. Decentralized oracles solve this problem by relying on multiple oracle nodes to query off-chain information. By comparing data from multiple sources, decentralized oracles reduce the risk of passing invalid information to on-chain contracts.

However, decentralized oracles must deal with discrepancies in information retrieved from multiple off-chain sources. To minimize information discrepancies and ensure that the data sent to the oracle contract reflects the collective opinion of the oracle nodes, decentralized oracles employ the following mechanisms:

Voting/staking on the accuracy of data

Some decentralized oracle networks require participants to vote or stake on the accuracy of answers to data queries (e.g., "Who won the 2020 U.S. election?"). An aggregation protocol then aggregates the votes and stakes and takes the answer supported by the majority of participants as the valid one.

Nodes whose answers differ from the majority answer are penalized by having their tokens distributed to other nodes that provided more correct values. Requiring nodes to post a bond before providing data incentivizes honest responses, since nodes are assumed to be rational economic actors intent on maximizing returns.

Staking/voting also protects decentralized oracles from "Sybil attacks", in which malicious actors create multiple identities to game the consensus system. However, staking cannot prevent "freeloading" (oracle nodes copying information from other nodes) or "lazy validation" (oracle nodes following the majority without verifying the information themselves).
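The staking and penalty rule described above can be sketched as a small settlement function. Several details are assumptions made for this sketch rather than the rules of any specific network: answers are weighted by stake, losing stakes are slashed in full, the slashed amount is shared among winners in proportion to their stake using integer division, and ties are not handled specially.

```python
from collections import Counter

def settle_round(votes):
    """votes maps node id -> (answer, stake). Returns (winning answer, payouts)."""
    # Sum the stake behind each answer.
    weight = Counter()
    for answer, stake in votes.values():
        weight[answer] += stake
    winner = weight.most_common(1)[0][0]

    winning_stake = sum(s for a, s in votes.values() if a == winner)
    slashed = sum(s for a, s in votes.values() if a != winner)

    payouts = {}
    for node, (answer, stake) in votes.items():
        if answer == winner:
            # Stake is returned plus a pro-rata share of the slashed tokens.
            payouts[node] = stake + slashed * stake // winning_stake
        else:
            payouts[node] = 0  # minority stake is forfeited
    return winner, payouts

votes = {"n1": ("A", 100), "n2": ("A", 300), "n3": ("B", 200)}
winner, payouts = settle_round(votes)
```

Here answer "A" carries 400 of the 600 staked tokens, so the 200 tokens behind "B" are split between n1 and n2 in a 1:3 ratio.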

Schelling point mechanism

A Schelling point is a game-theoretic concept which assumes that, in the absence of any communication, multiple entities will tend to default to a common solution to a problem. Schelling point mechanisms are often used in decentralized oracle networks to enable nodes to reach consensus on answers to data requests.

An early example is SchellingCoin, a proposed data feed in which participants submit answers to "scalar" questions (questions whose answers are described by a magnitude, e.g., "What is the price of ETH?") along with a deposit. Users who provide values between the 25th and 75th percentiles are rewarded, while users whose values deviate significantly from the median are penalized.
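The reward rule can be sketched with Python's statistics module. This is an approximation under assumptions: the percentile method ("inclusive" interpolation) and the `max_deviation` threshold that decides what counts as deviating significantly are choices made for this example, since the text above does not specify them. At least two submissions are needed for the quartiles to be defined.

```python
from statistics import median, quantiles

def schelling_round(submissions, max_deviation=0.5):
    """submissions maps user -> value. Returns (median, rewarded, penalized)."""
    values = sorted(submissions.values())
    # Quartile cut points: 25th, 50th, and 75th percentiles.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    mid = median(values)
    # Reward users whose value falls between the 25th and 75th percentiles.
    rewarded = {u for u, v in submissions.items() if q1 <= v <= q3}
    # Penalize users whose value is far from the median (assumed threshold).
    penalized = {u for u, v in submissions.items()
                 if abs(v - mid) > max_deviation * mid}
    return mid, rewarded, penalized

submissions = {"a": 100, "b": 101, "c": 102, "d": 103, "e": 200}
mid, rewarded, penalized = schelling_round(submissions)
```

In this run the honest-looking cluster around the median is rewarded and the outlier "e" is penalized, while "a" sits just outside the rewarded band without being punished.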

While SchellingCoin does not exist today, many decentralized oracles (notably the Maker Protocol's oracles) still use the Schelling point mechanism to improve the accuracy of oracle data. Each Maker oracle consists of an off-chain peer-to-peer network of nodes ("relayers" and "feeds") that submit market prices for collateral assets, and an on-chain "Medianizer" contract that calculates the median of all provided values. Once the specified delay period is over, this median value becomes the new reference price for the associated asset.
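The combination of a median and a delay period can be modeled loosely in Python. This sketch is not Maker's implementation: the `DelayedMedianFeed` class, its `poke` and `read` methods, and the rule that a new submission replaces a value still waiting in the queue are all assumptions made for illustration.

```python
from statistics import median

class DelayedMedianFeed:
    """Toy feed: the median of reports becomes the reference price after a delay."""

    def __init__(self, delay: int):
        self.delay = delay
        self.current = None    # price that consumers may use now
        self.queued = None     # median waiting for the delay to pass
        self.queued_at = None

    def _promote(self, now: int) -> None:
        # Move the queued value into effect once the delay has elapsed.
        if self.queued is not None and now - self.queued_at >= self.delay:
            self.current = self.queued
            self.queued = None

    def poke(self, reports, now: int) -> None:
        """Queue the median of the submitted prices."""
        self._promote(now)
        self.queued = median(reports)
        self.queued_at = now

    def read(self, now: int):
        self._promote(now)
        return self.current

feed = DelayedMedianFeed(delay=3600)
feed.poke([1800, 1805, 1795], now=0)
before = feed.read(now=10)      # delay not yet over
after = feed.read(now=3600)     # delay over: queued median takes effect
```

The delay gives observers time to react to a suspicious value before it starts to affect contracts that depend on the price.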

Other examples of oracles that use Schelling point mechanisms include Chainlink Off-Chain Reporting and Witnet. In both systems, responses from oracle nodes in the peer-to-peer network are aggregated into a single value, such as a mean or median. Nodes are rewarded or penalized according to how closely their responses agree with, or deviate from, the aggregate value.

Schelling point mechanisms are attractive because they minimize the on-chain footprint (only one transaction needs to be sent) while still preserving decentralization. The latter is possible because nodes must sign off on the list of submitted responses before it is fed into the algorithm that produces the mean or median value.

Availability guarantees

Decentralized oracle services ensure high availability of off-chain data to smart contracts. High availability is achieved by decentralizing both off-chain sources of information and the nodes responsible for transferring the information on-chain.

This ensures fault tolerance, as oracle contracts are able to rely on multiple nodes (which also rely on multiple data sources) to execute queries issued by other contracts. Achieving decentralization at the level of information sources and node operators is critical—a network of oracle nodes that provide information retrieved from the same source will suffer from the same problems as centralized oracles.
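A small Python sketch of this fault tolerance follows: a node queries several independent sources, skips the ones that fail, and aggregates what remains. The source functions, the `min_responses` threshold, and the use of a median are hypothetical choices for this example.

```python
from statistics import median

def query_sources(sources, min_responses: int):
    """Query every source, tolerate failures, and return the median answer."""
    values = []
    for fetch in sources:
        try:
            values.append(fetch())
        except Exception:
            continue  # one unreachable source should not block the answer
    if len(values) < min_responses:
        raise RuntimeError("not enough sources responded")
    return median(values)

def exchange_a():
    return 1800

def exchange_b():
    raise ConnectionError("source offline")  # simulated outage

def exchange_c():
    return 1802

def exchange_d():
    return 1801

price = query_sources([exchange_a, exchange_b, exchange_c, exchange_d],
                      min_responses=3)
```

If every function pointed at the same upstream source, a single outage would take down all of them at once, which is the centralization risk the paragraph above warns about.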

Staking-based oracles can also penalize node operators who fail to respond quickly to data requests. This greatly incentivizes oracle nodes to invest in fault-tolerant infrastructure and provide data in a timely manner.

Good incentive compatibility

Decentralized oracles implement various incentive designs to prevent Byzantine behavior among oracle nodes. Specifically, they achieve attributability and accountability:

  1. Decentralized oracle nodes are often required to sign the data they provide in response to data requests. This information helps evaluate the historical performance of oracle nodes, so that users can filter out unreliable nodes when making data requests. Examples include Chainlink's oracle reputation and Witnet's algorithmic reputation system.

  2. As explained earlier, decentralized oracles may require nodes to place a stake on their confidence in the truthfulness of the data they submit. If the claim checks out, the stake can be returned along with rewards for honest service. But it can also be slashed if the information is incorrect, which provides a degree of accountability.
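The first point, using attributed reports to judge historical performance, can be sketched as a simple ledger. This is a hypothetical model: it assumes signatures have already been verified so that each report is reliably tied to a node id, and the `ReputationLedger` class, the 1% `tolerance`, and the accuracy-ratio score are invented for illustration rather than taken from Chainlink or Witnet.

```python
from collections import defaultdict

class ReputationLedger:
    """Tracks how often each node's report matched the accepted value."""

    def __init__(self, tolerance: float = 0.01):
        self.tolerance = tolerance
        self.stats = defaultdict(lambda: {"accurate": 0, "total": 0})

    def record(self, node_id: str, reported: float, accepted: float) -> None:
        # Attribution: the (already verified) signature ties the report to node_id.
        entry = self.stats[node_id]
        entry["total"] += 1
        if abs(reported - accepted) <= self.tolerance * abs(accepted):
            entry["accurate"] += 1

    def score(self, node_id: str) -> float:
        entry = self.stats.get(node_id)
        if not entry or entry["total"] == 0:
            return 0.0
        return entry["accurate"] / entry["total"]

    def reliable_nodes(self, min_score: float):
        """Nodes a user might keep when filtering by past performance."""
        return sorted(n for n in self.stats if self.score(n) >= min_score)

ledger = ReputationLedger()
ledger.record("node-a", 1800, 1800)
ledger.record("node-a", 1801, 1800)
ledger.record("node-b", 2500, 1800)
ledger.record("node-b", 1800, 1800)
```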


Applications of oracles in smart contracts

  • Retrieving financial data

  • Generating verifiable randomness

  • Getting outcomes for events

  • Automating smart contracts

Origin blog.csdn.net/u012084827/article/details/130803353