# Subgraph Query

Any project that has implemented a functioning [subgraph](https://thegraph.com/docs/en/developing/creating-a-subgraph/) can configure KPI options to track any custom metric provided by the selected subgraph. In addition to the [common ancillary data parameters](https://docs.outcome.finance/kpi-options/kpi-option-configuration/common-ancillary-data-parameters), the deployer should configure the subgraph-specific settings documented below. The full specification of the subgraph query data source is available in its [implementation document](https://github.com/UMAprotocol/UMIPs/blob/master/Implementations/subgraph-query.md), which verifiers should reference.

### Endpoint

When using a subgraph on the hosted service, the `Endpoint` string value should be set to the API endpoint in the format `"https://api.thegraph.com/subgraphs/name/<GITHUB_USER>/<SUBGRAPH_NAME>"`, where `<GITHUB_USER>` is replaced with the project's GitHub user name (used when creating the hosted subgraph) and `<SUBGRAPH_NAME>` is replaced with the name of the tracked subgraph.

Users of the subgraph implementation should be aware that the Graph protocol is planning to sunset the hosted service in Q1 2023. Hence, all contracts that are expected to expire after the end of 2022 should use the decentralized subgraph network. When using the decentralized network, the `Endpoint` parameter should not be provided; instead, the `SubgraphId` string value should be set to the identifier of the deployed subgraph on the decentralized network.

When the subgraph source code is publicly available, deployers should also pass the `Source` parameter with a URL string linking to the repository that contains it. This allows verifiers to check whether the returned data is consistent with the tracked `Metric` parameter and provides transparency on any subgraph implementation upgrades.

### Query Configuration

The `QueryString` string value should be set to the desired subgraph query as documented in the [GraphQL API](https://thegraph.com/docs/en/querying/graphql-api/). Both single-entity queries and queries for multiple entities within a collection are supported, though a collection of entities (array) can only be used once within the returned data object.

#### Single Entities

When querying for a single entity, the `MetricKey` string value should be set to the key path of the tracked metric relative to the returned `data` object, with nested elements joined by dots (`.`). `MetricKey` should always be consistent with the requested `QueryString`, so that the returned `data` object contains the tracked metric at the location defined by the `MetricKey` parameter.

As an illustration, consider a request on the [UMA voting subgraph](https://api.thegraph.com/subgraphs/name/umaprotocol/mainnet-voting) for the total number of voters who participated in a particular governance vote, where `QueryString` is constructed as:

```
{
  priceRequest(id: "Admin 123-1626827369") {
    latestRound {
      votersAmount
    }
  }
}
```

The above query would result in the following returned data object:

```
{
  "data": {
    "priceRequest": {
      "latestRound": {
        "votersAmount": "57"
      }
    }
  }
}
```

To locate the tracked metric (`votersAmount`), `MetricKey` should be set to `priceRequest.latestRound.votersAmount`.
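The key path lookup described above can be sketched in a few lines of Python. This is only an illustration of how a dot-separated `MetricKey` maps onto the returned `data` object; the helper name `resolve_key_path` is hypothetical and is not part of any verifier tooling.

```python
from functools import reduce


def resolve_key_path(obj, key_path):
    """Follow a dot-separated key path through nested dictionaries."""
    return reduce(lambda node, key: node[key], key_path.split("."), obj)


# Response copied from the example above.
response = {
    "data": {
        "priceRequest": {
            "latestRound": {"votersAmount": "57"}
        }
    }
}

# MetricKey is resolved relative to the returned `data` object.
metric = resolve_key_path(response["data"], "priceRequest.latestRound.votersAmount")
print(metric)  # prints 57 (the subgraph returns the value as a string)
```

A `MetricKey` that does not match the shape of the query result raises a `KeyError` in this sketch, which mirrors why `MetricKey` must stay consistent with `QueryString`.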

#### Collection of Entities

When querying for a collection of entities, the `CollectionKey` string value should be set to the key path of the relevant collection relative to the returned `data` object, and the `MetricKey` string value should be set to the key path of the tracked metric within each entity of the returned collection. Unless a data aggregation method is explicitly set, the values of the tracked metric are summed across the returned collection of entities.

As an illustration, consider a request on the [UMA voting subgraph](https://api.thegraph.com/subgraphs/name/umaprotocol/mainnet-voting) for voting activity in July 2022, where `QueryString` is constructed as:

```
{
  priceRequests(
    where: {
      resolutionTimestamp_lt: 1659312000,
      resolutionTimestamp_gte: 1656633600
    }
  ) {
    latestRound {
      votersAmount
    }
  }
}
```

The above query would result in the following returned data object (truncated):

```
{
  "data": {
    "priceRequests": [
      {
        "latestRound": {
          "votersAmount": "64"
        }
      },
      {
        "latestRound": {
          "votersAmount": "74"
        }
      },
      ...
    ]
  }
}
```

To locate the tracked metric (`votersAmount`) within the returned collection, `CollectionKey` should be set to `priceRequests` (as its value is an array) and `MetricKey` should be set to `latestRound.votersAmount`.
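As a rough sketch of the default behavior (summing the tracked metric across the collection), the Python below resolves `CollectionKey` against the `data` object and `MetricKey` against each entity. The function names are hypothetical, the sample holds only the two entries visible in the truncated response above, and the use of `Decimal` for parsing string values is an assumption of this example.

```python
from decimal import Decimal
from functools import reduce


def resolve_key_path(obj, key_path):
    """Follow a dot-separated key path through nested dictionaries."""
    return reduce(lambda node, key: node[key], key_path.split("."), obj)


def sum_collection_metric(data, collection_key, metric_key):
    """Sum the metric found at metric_key in every entity of the collection."""
    collection = resolve_key_path(data, collection_key)
    values = (Decimal(resolve_key_path(entity, metric_key)) for entity in collection)
    return sum(values, Decimal(0))


# Only the two entries shown in the truncated response above.
data = {
    "priceRequests": [
        {"latestRound": {"votersAmount": "64"}},
        {"latestRound": {"votersAmount": "74"}},
    ]
}

total = sum_collection_metric(data, "priceRequests", "latestRound.votersAmount")
print(total)  # prints 138
```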

#### Dynamic Query Configuration

Dynamic query configuration is most commonly useful when aggregation requires the subgraph query to be iterated with differing timestamp filter parameters, or to support repeated queries with pagination. This can be achieved by using macros enclosed in angle brackets (`<>`) in `QueryString` that are expanded at the time of verification. KPI options on subgraph queries support the following macros:

* `<QUERY_DTS>` is replaced with the daily query timestamp, which is normally the actual request timestamp (or the value of `RequestTimestampOverride` when provided) rounded down to 24:00 UTC. In case of iterative aggregation, the verifiers repeat the query, replacing `<QUERY_DTS>` with daily timestamps over the configured aggregation period.
* `<QUERY_DTS-[N]D>`, with `[N]` set to an integer number of days, is replaced with the daily query timestamp minus `[N] * 86400` seconds. As an illustration, the query above on voting activity in July 2022 can be rewritten to track the last 30 days before contract expiration:

  ```
  {
    priceRequests(
      where: {
        resolutionTimestamp_lt: <QUERY_DTS>,
        resolutionTimestamp_gte: <QUERY_DTS-30D>
      }
    ) {
      latestRound {
        votersAmount
      }
    }
  }
  ```
* `<QUERY_DBN>` is replaced with the latest block number that is at or before the daily query timestamp. This macro operates similarly to `<QUERY_DTS>`, but is useful when the subgraph schema does not support filtering by timestamps directly. Please consult the Graph protocol documentation on [time-travel queries](https://thegraph.com/docs/en/querying/graphql-api/#time-travel-queries) for querying subgraph state at any given block number. When querying subgraphs on chains other than Ethereum mainnet, the deployer also needs to pass the `ChainId` parameter set to the relevant numeric chain identifier, so that verifiers can correctly translate the daily query timestamp to the corresponding block number.
* `<QUERY_DBN-[N]D>`, with `[N]` set to an integer number of days, is replaced with the latest block number that is at or before the daily query timestamp reduced by `[N] * 86400` seconds. As an illustration, `<QUERY_DBN-30D>` is replaced with the block number corresponding to the timestamp that is 30 days before the daily query timestamp. As with `<QUERY_DBN>`, the `ChainId` parameter should be provided when querying subgraphs on chains other than Ethereum mainnet.
* `<PAGINATE>` informs verifiers that the returned collection could include more than 1000 entities and that the query should be repeated multiple times as documented in the Graph protocol [pagination](https://thegraph.com/docs/en/querying/graphql-api/#pagination) instructions. `<PAGINATE>` should be placed within the parameters section of the selected entity, so that the verifiers can safely replace it with `first:1000` on the first run or `first:1000,skip:X000` on any subsequent run, iterating over `X`. When the `<PAGINATE>` macro is used, it should not conflict with any other `first` or `skip` parameters. As an illustration, consider a request on the [UMA voting subgraph](https://api.thegraph.com/subgraphs/name/umaprotocol/mainnet-voting) for the cumulative amount of tokens revealed in voting until expiry, where `QueryString` is constructed as:

  ```
  {
    revealedVotes(
      <PAGINATE>,
      where: {
        time_lte: <QUERY_DTS>
      }
    ) {
      numTokens
    }
  }
  ```

  When relying on paginated queries, deployers should be aware that the Graph protocol does not support a `skip` argument higher than 5000, hence a maximum of 6000 data entries can be handled.
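To make the substitution rules above concrete, here is a minimal Python sketch of expanding the timestamp and pagination macros. It is not the verifiers' implementation: the function names are hypothetical, and it assumes that the daily query timestamp is the request timestamp floored to the start of its UTC day. The block number macros (`<QUERY_DBN>` and `<QUERY_DBN-[N]D>`) are left out because they require a chain lookup.

```python
import re

SECONDS_PER_DAY = 86400


def daily_query_timestamp(request_timestamp):
    """Floor a Unix timestamp to the start of its UTC day (assumed rounding rule)."""
    return request_timestamp - request_timestamp % SECONDS_PER_DAY


def expand_macros(query_string, query_dts, page=0, page_size=1000):
    """Expand <QUERY_DTS>, <QUERY_DTS-[N]D> and <PAGINATE> in a query string."""

    def replace_dts(match):
        # group(1) is None for plain <QUERY_DTS>, which means zero days back.
        days = int(match.group(1) or 0)
        return str(query_dts - days * SECONDS_PER_DAY)

    expanded = re.sub(r"<QUERY_DTS(?:-(\d+)D)?>", replace_dts, query_string)

    # First run uses first:1000, later runs add skip:X000.
    if page == 0:
        pagination = f"first:{page_size}"
    else:
        pagination = f"first:{page_size},skip:{page * page_size}"
    return expanded.replace("<PAGINATE>", pagination)


query = "{ revealedVotes(<PAGINATE>, where: { time_lte: <QUERY_DTS>, time_gte: <QUERY_DTS-30D> }) { numTokens } }"
dts = daily_query_timestamp(1659350000)  # an arbitrary time on 2022-08-01 UTC
print(expand_macros(query, dts, page=0))
print(expand_macros(query, dts, page=2))
```

With `page_size` of 1000 and the `skip` limit of 5000 noted above, `page` can range from 0 to 5 in this sketch, which corresponds to the 6000 entry ceiling.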

### Time Series Aggregation

To support time series aggregation for subgraph queries as documented in the [common ancillary data parameters](https://docs.outcome.finance/kpi-options/kpi-option-configuration/common-ancillary-data-parameters) section, the `QueryString` should include a macro for timestamp or block number based filtering. This instructs verifiers to iterate over daily interval timestamps within the period set in the `AggregationPeriod` parameter and to repeat the query using macro substitution.

#### Single Query Aggregation

If the subgraph schema provides timestamps, it is also possible to construct `QueryString` so that one request (or multiple pagination requests) returns all the data points and their corresponding timestamps for use in time series aggregation. The `AggregationPeriod` parameter is redundant in this mode, as the desired aggregation time range filtering should be performed within the query definition. This mode is enabled by setting the `TimestampKey` parameter string value to the key path of the timestamp of the returned metric within the returned collection of entities. This instructs verifiers to slice the returned timestamp / metric series into daily intervals ending at 24:00 UTC and to pick the metric / timestamp pair with the latest timestamp within each daily interval (if the interval includes any data). If more than one data item shares the same latest timestamp within a day, the values of such metrics are summed together.

As a real application example, consider a request on the [Idle Finance tranche subgraph](https://api.thegraph.com/subgraphs/name/samster91/idle-tranches) for the total value locked in the senior wstETH tranche, denominated in Wei, where `QueryString` is constructed as:

```
{
  trancheInfos (
    <PAGINATE>,
    where: {
      Tranche: \"0x2688fc68c4eac90d9e5e1b94776cf14eade8d877\",
      timeStamp_lte: <QUERY_DTS>,
      timeStamp_gte: <QUERY_DTS-90D>
    }
  )
  {
    timeStamp,
    contractValue
  }
}
```

When the above example `QueryString` is accompanied by the ancillary data parameters listed below, the verifiers combine the results of the paginated queries and select the `contractValue` metric with the latest `timeStamp` for each day within the 90-day period before contract expiration for time-weighted average processing:

```
CollectionKey:trancheInfos,
MetricKey:contractValue,
TimestampKey:timeStamp,
AggregationMethod:TWAP
```
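The daily slicing step described above can be sketched as follows. This is an illustration under stated assumptions rather than the verifiers' actual code: the function name is hypothetical, the sample entries are made-up values (not real subgraph data), the keys are treated as flat entity fields, and an entry stamped exactly at midnight UTC is assumed to belong to the day that starts at that instant. The subsequent TWAP calculation is not shown.

```python
from decimal import Decimal

SECONDS_PER_DAY = 86400


def last_value_per_day(entities, timestamp_key, metric_key):
    """Return {day start timestamp: metric at the latest timestamp of that day}.

    Entries that share the same latest timestamp within a day are summed.
    """
    latest = {}  # day index -> (latest timestamp seen, metric value)
    for entity in entities:
        ts = int(entity[timestamp_key])
        value = Decimal(entity[metric_key])
        day = ts // SECONDS_PER_DAY
        if day not in latest or ts > latest[day][0]:
            latest[day] = (ts, value)
        elif ts == latest[day][0]:
            latest[day] = (ts, latest[day][1] + value)
    return {day * SECONDS_PER_DAY: value for day, (_, value) in sorted(latest.items())}


# Made-up sample entries shaped like the trancheInfos collection.
sample = [
    {"timeStamp": "1659312100", "contractValue": "10"},
    {"timeStamp": "1659390000", "contractValue": "20"},
    {"timeStamp": "1659390000", "contractValue": "5"},   # same latest timestamp: summed
    {"timeStamp": "1659400000", "contractValue": "7"},   # falls into the next UTC day
]

daily = last_value_per_day(sample, "timeStamp", "contractValue")
print(daily)
```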

#### Daily Aggregation

The single query aggregation method documented above can be modified by providing an additional `DailyAggregation` parameter set to `true`, which sums up the metric values within each separate daily interval before processing them for further aggregation. This is mostly useful for tracking a daily volume metric when the subgraph provides only transaction data.

As an illustration, consider a request on the [SushiSwap exchange subgraph](https://api.thegraph.com/subgraphs/name/sushiswap/exchange) for daily volume in the UMA-WETH pair, where `QueryString` is constructed as:

```
{
  swaps(
    <PAGINATE>,
    where: {
      timestamp_lte: <QUERY_DTS>,
      timestamp_gte: <QUERY_DTS-30D>,
      pair: \"0x001b6450083e531a5a7bf310bd2c1af4247e23d4\"
    }
  ) {
    timestamp,
    amountUSD,
  }
}
```

When the above example `QueryString` is accompanied by the ancillary data parameters listed below, the verifiers first calculate the daily total values of `amountUSD` and then calculate the average daily volume over the last 30-day period before contract expiration:

```
CollectionKey:swaps,
MetricKey:amountUSD,
TimestampKey:timestamp,
DailyAggregation:true,
AggregationMethod:TWAP
```
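A minimal Python sketch of the first step (daily totals) is shown below. The function name and the sample swap values are hypothetical, and an entry stamped exactly at midnight UTC is assumed to belong to the day that starts at that instant. The further averaging across days is left to the configured `AggregationMethod` and is not implemented here.

```python
from collections import defaultdict
from decimal import Decimal

SECONDS_PER_DAY = 86400


def daily_totals(entities, timestamp_key, metric_key):
    """Sum the metric over all entities that fall into the same UTC day."""
    totals = defaultdict(Decimal)  # Decimal() starts each day at 0
    for entity in entities:
        day_start = int(entity[timestamp_key]) // SECONDS_PER_DAY * SECONDS_PER_DAY
        totals[day_start] += Decimal(entity[metric_key])
    return dict(sorted(totals.items()))


# Made-up sample entries shaped like the swaps collection.
swaps = [
    {"timestamp": "1659312100", "amountUSD": "100.5"},
    {"timestamp": "1659320000", "amountUSD": "50.25"},
    {"timestamp": "1659400000", "amountUSD": "10"},  # next UTC day
]

volume_by_day = daily_totals(swaps, "timestamp", "amountUSD")
print(volume_by_day)
```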
