Skip to content

Conversation

@iloveitaly
Copy link
Contributor

@iloveitaly iloveitaly commented Jul 23, 2022

Very rough right now: wanted to submit early to get some feedback on the approach.

@macanudo527
Copy link
Collaborator

You might want to check the CCXT converter plugin #53 . It pulls historical data from Binance.com as well as any CCXT supported exchange. Right now, it pulls historical data specifically from the exchange listed in the historical bar, but we could easily edit it to default to Binance or set a default for Nexo since it looks like they don't have a pricing API. I'm sure @eprbell could weigh in on the best option.

Either way I'm pretty sure there isn't a need for a separate converter plugin. It's currently undergoing final testing. I'd appreciate any input you have.

Copy link
Owner

@eprbell eprbell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting new plugins! The code looks pretty good and I just have a few nits and comments. As @macanudo527 mentioned, the Binance pair converter is probably overlapping with the CCXT pair converter he's working on (so we may not need it), but the Nexo CSV input plugin can definitely be merged.

Could you also add a small section for Nexo in the docs?

]
"""

# binance has more tokens in USDT, which should be equal to the USD price
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a safe assumption: USDT oscillates around the peg, most of the time slightly, sometimes quite a bit (https://www.coinbase.com/price/tether).

to_asset = "USDT"

symbol = f"{from_asset}{to_asset}"
twelve_hours = 60 * 60 * 12
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you make it into a global constant starting with underscore: _TWELVE_HOURS

utc_timestamp = int(timestamp.astimezone(timezone.utc).timestamp())

response = requests.get(f"https://api.binance.com/api/v3/klines?symbol={symbol}&interval=12h&startTime={utc_timestamp}&limit=1")
response_json = json.loads(response.text)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need any error checks here (e.g. see this example)?

response = requests.get(f"https://api.binance.com/api/v3/klines?symbol={symbol}&interval=12h&startTime={utc_timestamp}&limit=1")
response_json = json.loads(response.text)

return HistoricalBar(
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of hardcoded values (0, 1, 2, 3, etc.) please use global constants (uppercase and starting with underscore). E.g.: https://github.com/eprbell/dali-rp2/blob/main/src/dali/plugin/input/rest/coinbase.py#L44

asset: str,
exchange: str,
holder: str,
# TODO not sure why we aren't using `Keyword` here?
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate? Not sure I understand the comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The configuration.Keyword enum looks like what the input values should be here, curious what this isn't used as the type instead of a raw string.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the parameters to instantiate a transaction are strings (they come from REST/CSV sources or from the configuration file), so the design choice has been to pass in strings and then validate them inside the transaction constructor.

However it's true that for transaction_type we could have used an enum, because transaction_type is assigned in the code and not read from external input. I guess I chose string for uniformity with the rest of the parameters.

Note that the transaction_type parameter is validated fully at runtime and even if we changed transaction_type to enum, we would still need to validate the value, depending on the actual class type of the transaction (see the DIRECTION_2_TRANSACTION_TYPE_SET dictionary) so not much would change in the code (we would only skip the string check).

if not pair_converter_list:
pair_converter_list.append(HistoricCryptoPairConverterPlugin(Keyword.HISTORICAL_PRICE_HIGH.value))
pair_converter_list.append(BinancePairConverterPlugin(Keyword.HISTORICAL_PRICE_HIGH.value))
LOGGER.info("No pair converter plugins found in configuration file: using Historic_Crypto/high as default")
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change the message to "using default configuration".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to butt in here, but should I add the CCXT converter to the default config like this as well?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries! Yes, I think the CCXT converter should definitely be added to the default configuration. Currently we only have one converter (HistoricCrypto), so the default configuration is quite meager. We do need to make the default configuration better and adding CCXT is a step in the right direction.

@eprbell
Copy link
Owner

eprbell commented Jul 24, 2022

Actually this is an interesting idea: would it make sense to add variants (subclasses) of your CCXT converter that use a fixed exchange (one for Binance, one for Kraken, etc.)? The binance subclass would be exactly equivalent to the pair converter in this PR. Additionally, if we added a CCXT converter subclass for Coinbase, it should be equivalent to the HistoricCrypto plugin. Does that sound correct or am I missing something?

@macanudo527
Copy link
Collaborator

macanudo527 commented Jul 24, 2022

The CCXT converter plugin reads in the exchange name from each transaction that is fed to it then pulls data from that exchange per transaction.

So in theory, if you transacted on Binance, kraken and Coinbase it would pull pricing data from each exchange all in one plugin.

I don't know what the use case would be for a converter 'locked' to one exchange. As we discussed before, there are price fluctuations between exchanges so it is best to pull pricing data from the specific exchange the crypto asset was transacted on.

And any exchange that doesn't have USD base pairs would need an additional exchange for that conversion (USDT -> USD).

If done right, the CCXT converter should cover the vast majority of cases except for fringe exchanges with apis not supported by CCXT.

I propose though that the user can define a default exchange or assign alternative exchanges for ones not available via CCXT. So, you could say pull Binance pricing for Nexo for example since nexo has no pricing.

@macanudo527
Copy link
Collaborator

Thinking on this more, what we could do is create a USD base pair version that only works on exchanges that have USD base pairs and for users that need USD denominated prices.

We could rip out the routing / graphing and speed things up.

Then, have a more robust version that handles everything.

@eprbell
Copy link
Owner

eprbell commented Jul 24, 2022

I understand the behavior of the CCXT plugin, reading the exchange from each transaction and that is great as it is. The "locked" behavior would offer an alternative/additional model. Each of the locked plugins would behave like the existing HistoricCrypto plugin (which ignores the exchange hint). The user could then arrange the locked plugins in a static configuration, defining a custom priority chain (e.g. try binance first, then coinbase, then kraken, etc). I am not sure if some users would prefer this, but it's a different model, with different operational semantics.

@eprbell
Copy link
Owner

eprbell commented Jul 24, 2022

Let me paint a scenario and ask a question:

  • Bob is a US user (using the US country plugin)
  • he uses exchanges that have USD-based pairs
  • he uses the CCXT plugin (as it is written today in your PR)

In the above situation, does the routing code get executed or not? I was under the impression that it would not be executed, but correct me if I'm wrong. If I'm right, would we still need a router-less variant of the CCXT exchange?

@macanudo527
Copy link
Collaborator

No it doesn't. If the market exists for the conversion on the exchange:

Conversion BTC -> USD
Market BTC/USD

Then, the routing isn't used. As is, the graph gets loaded but I'm going to push that down so it lazy loads. I guess it wouldn't offer too much of an efficiency boost.

@eprbell
Copy link
Owner

eprbell commented Jul 24, 2022

No it doesn't. If the market exists for the conversion on the exchange:

Conversion BTC -> USD Market BTC/USD

Then, the routing isn't used. As is, the graph gets loaded but I'm going to push that down so it lazy loads. I guess it wouldn't offer too much of an efficiency boost.

Ok, that's what I thought: sounds good.

@macanudo527
Copy link
Collaborator

I understand the behavior of the CCXT plugin, reading the exchange from each transaction and that is great as it is. The "locked" behavior would offer an alternative/additional model. Each of the locked plugins would behave like the existing HistoricCrypto plugin (which ignores the exchange hint). The user could then arrange the locked plugins in a static configuration, defining a custom priority chain (e.g. try binance first, then coinbase, then kraken, etc). I am not sure if some users would prefer this, but it's a different model, with different operational semantics.

Ok, I got you now. I was misunderstanding this in thinking you wanted it for efficiency reasons. For priority reasons would probably be useful for some users that have assets on hardware wallets or some other place that doesn't have readily available pricing data. I opened an issue #61 so that this doesn't get lost.


# we do know the spot price, but nexo seems to do some aggressive rounding
spot_price=Keyword.UNKNOWN.value,
crypto_sent=Keyword.UNKNOWN.value,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@eprbell this line actually caused me issues with rp2. Should I use 0.0 here instead? That seemed to solve the issue when I was trying to get rp2 to generate.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the issue: could you paste a stack trace?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the error message on the rp2 side:

Exception: Encountered an unresolved DaLI transaction (read DaLI's documentation / FAQ to learn how to resolve this issue): 

When I filled in the crypto_sent value with 0.0 the issue was fixed. Any ideas here? In this case we are just receiving crypto, so it makes sense that the sent would be zero?

@eprbell
Copy link
Owner

eprbell commented Jul 25, 2022

I understand the behavior of the CCXT plugin, reading the exchange from each transaction and that is great as it is. The "locked" behavior would offer an alternative/additional model. Each of the locked plugins would behave like the existing HistoricCrypto plugin (which ignores the exchange hint). The user could then arrange the locked plugins in a static configuration, defining a custom priority chain (e.g. try binance first, then coinbase, then kraken, etc). I am not sure if some users would prefer this, but it's a different model, with different operational semantics.

Ok, I got you now. I was misunderstanding this in thinking you wanted it for efficiency reasons. For priority reasons would probably be useful for some users that have assets on hardware wallets or some other place that doesn't have readily available pricing data. I opened an issue #61 so that this doesn't get lost.

Yes, that's a good example. E.g. if the user moves coins from Trezor to Ledger the "locked" instances of CCXT would be useful.

@iloveitaly
Copy link
Contributor Author

I've handled most of the comments on the nexo plugin. Using the CCTX-based pair makes sense instead of my hacky converter. Will take a look at that next!

Copy link
Owner

@eprbell eprbell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking better and better! As for the locked CCXT converter variants, I think they will be trivial subclasses of the CCXT converter: @macanudo527 should be able to tackle them when his CCXT converter is finished. So for now I think we can just remove the binance converter from this PR. But if you'd like to contribute more code there is plenty to do. Feel free to check open issues in DaLI and RP2, look at anything that interests you in the code, or just ask here for suggestions.

Also remember to add a small section for Nexo in the docs: you can use the Ledger section as boilerplate.

IntraTransaction(**(common_params | {
'crypto_received':amount,

# most likely, it's you, but we can't say for sure
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is a little unclear: by "you" do you mean the user? A deposit could potentially come anywhere, including other users. I would suggest just removing the comment altogether.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cleared this up. I think we should keep the comment—for new devs coming to the codebase it makes it easier to understand why we are using unknown, otherwise it seems like an error/omission.

@iloveitaly
Copy link
Contributor Author

@eprbell I removed the binance pair converter with the idea that we'll get the other PR merged and can leverage that. Addressed most of the PR comments. Thanks for the thorough review!

Copy link
Owner

@eprbell eprbell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost ready to merge. Here's what needs to be done:

  • I added a few minor comments/nits to the latest version of the code;
  • add a small section for Nexo in the docs: you can use the Ledger section as boilerplate;
  • fix static analysis errors.

@iloveitaly iloveitaly changed the title Nexo plugin, binance historical data Nexo plugin Aug 1, 2022
@iloveitaly
Copy link
Contributor Author

I believe I fixed all of the linting errors! Will take care of doc updates later this week. Appreciate you working with me on this!

@eprbell
Copy link
Owner

eprbell commented Aug 1, 2022

Thank you for submitting PRs! I will review the code for this and the other PRs in the next few days, however I wanted to share an important point that applies to all data loader plugins (so it applies #60, #65 and #66). It has to do with the unique_id field, which requires special attention. This field is used by the DaLI transaction resolver to match incomplete transaction and join them. E.g. a transfer from Coinbase to Trezor is initially represented as 2 transactions:

  • an IntraTransaction from Coinbase to UNKNOWN, generated by the Coinbase plugin (which doesn't know that the destination is a Trezor)
  • an IntraTransaction from UNKNOWN to Trezor, generated by the Trezor plugin (which doesn't know that the source is Coinbase)

These two transactions both share the same unique_id (which is typically the transaction hash and is filled in by the respective plugins): they are fed to the transaction resolver, which uses unique_id to match them up and to join them into a single transaction without UNKNOWN fields.

This is why unique_id is so important: if an incorrect value is used, the transaction resolver is unable to match and join incomplete transactions.

The reason I point this out is that sometimes in CSV files from exchanges the transaction id is empty, other times (confusingly) it's an internal id of the exchange (which cannot be used by the transaction resolver) instead of the hash. In such cases (where transaction hash is not available) DaLI plugins must set the unique_id field to UNKNOWN.

Some examples of this:

  • BlockFI CSVs don't contain the hash or any other transaction id
  • Coinbase CSVs contain an internal transaction id, but it's not the hash

Check the following for more details:

There is a way to tell if you're using the right unique_id in your plugins: e.g. if you have a transfer from Coinbase to Nexo or viceversa, make sure the final generated ODS file contains only one transaction for this (with "from" fields set to Coinbase, "to" fields set to Nexo and no UNKNOWN source or destination).

This applies to all CSV and REST plugins, but it's especially relevant to CSV plugins (because REST APIs often provide hash information): make sure all these new plugins use either the transaction hash or UNKNOWN as the unique_id.

@iloveitaly
Copy link
Contributor Author

@eprbell Wow, thank you so much for the information! Your PR reviews are the most helpful I've ever gotten from an open source project, very impressed.

This makes sense: in other words, even if there is a unique ID in a CSV we don't want to include it unless it's referenced in another data source (i.e. external id or foreign key). I'll adjust the PRs!

@iloveitaly
Copy link
Contributor Author

Addressed comments + added documentation! Let me know if anything else is needed here.

@eprbell
Copy link
Owner

eprbell commented Aug 7, 2022

Hey, anytime! And thank you for submitting code and helping out with the project: two way street and all that :-D.

Correct, we don't want to include unique_id unless it's either the transaction hash (which is always understood by all exchanges and wallets) or in some special cases the internal id (e.g. transactions from Coinbase to Coinbase Pro don't use the transaction hash, but an internal Coinbase-only id which both CB and CB Pro understand).

Copy link
Owner

@eprbell eprbell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's my latest review.

Copy link
Owner

@eprbell eprbell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ship it!

@eprbell eprbell merged commit 49e3cb5 into eprbell:main Aug 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants