# Darts (Time Series)
# Overview
First, some general terminology:
* **Endogenous Variables**: This is the **target** time series (that we wish to predict). In sklearn this would be the `y` (which, due to the autoregressive nature of the problem, is also used to generate `X`).
* **Exogenous Variables**: This would be our **features** in traditional modeling, i.e. our `X`. These are variables that have predictive power for whatever our problem is, but our model cannot predict them.
* **Covariates[^1]**: External data that can be used to help improve forecasts. In the context of forecasting models, the **target** is the series to be forecasted/predicted, and the covariates themselves are not predicted. As with exogenous variables, covariates are yet another way of describing *features*.
* **Past Covariates**: Covariates known only into the past (e.g. measurements, such as *what the actual temperature was 2 days prior to inference time*)
* **Future Covariates**: Covariates known into the future (e.g. weather forecasts).
* **Static Covariates**: Covariates that are constant over time (e.g. the county that a given node is located in)
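The autoregressive point under *endogenous variables* (the target `y` also generates `X`) can be sketched in plain NumPy. This is an illustration of the idea, not Darts' internal tabularization code:

```python
import numpy as np

def make_lagged_features(y, n_lags):
    """Build an autoregressive design matrix: each row of X holds the
    n_lags values that precede the corresponding target entry."""
    X = np.column_stack([y[i : len(y) - n_lags + i] for i in range(n_lags)])
    target = y[n_lags:]
    return X, target

y = np.arange(10.0)  # toy endogenous series: 0, 1, ..., 9
X, target = make_lagged_features(y, n_lags=3)
# X[0] is [0., 1., 2.] and its associated target is 3.0
```

So the same series plays both roles: its lagged values are the features, its current value is the label.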
### How does Darts do probabilistic prediction?
This is described nicely in [G. Grosch, F. Lässig - Darts: Unifying time series forecasting models from ARIMA to Deep Learning - YouTube](https://www.youtube.com/watch?v=thg10qDqpRE).
> It does not output a time series, but rather it outputs *parameters*, $\theta$, of a given probability distribution. Using these parameters we can obtain an arbitrary number of sample predictions.
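A minimal sketch of that idea in pure NumPy (not the Darts internals): suppose the model emits $\theta = (\mu, \sigma)$ of a Gaussian for each forecast step; an arbitrary number of sample forecasts can then be drawn from those parameters.

```python
import numpy as np

rng = np.random.default_rng(42)

# Pretend the model emitted Gaussian parameters for 3 forecast steps.
theta = {"mu": np.array([10.0, 11.0, 12.0]),
         "sigma": np.array([1.0, 1.5, 2.0])}

# Draw as many sample forecasts as we like from the parameterised
# distribution; the empirical mean approaches mu as n_samples grows.
n_samples = 500
samples = rng.normal(theta["mu"], theta["sigma"], size=(n_samples, 3))
```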

### Misc Notes
* When performing quantile regression with LGBM, a model is trained for *each* quantile that we wish to estimate! [darts/lgbm.py at master · unit8co/darts · GitHub](https://github.com/unit8co/darts/blob/master/darts/models/forecasting/lgbm.py#L200)
* If we fit a separate model for each of, say, 10 quantiles (via [pinball loss](Quantile%20Regression.md)), we start to get useful information about the full distribution (with 100 evenly spaced quantiles that becomes a decent approximation, under certain conditions)
* When we call `predict`, each of our 10 fitted models generates a prediction under the hood (the target value at its associated quantile). Then, for each sample we wish to generate (say we want 300 samples, see [here](https://github.com/unit8co/darts/blob/f067f27103ad9327e67938bb59e3e8c098562f78/darts/models/forecasting/regression_model.py#L552)), a random number between 0 and 1 is drawn, and we linearly interpolate (take a weighted average) between the nearest known quantiles ([darts.utils.likelihood_models — darts documentation](https://unit8co.github.io/darts/_modules/darts/utils/likelihood_models.html#QuantileRegression))
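That sampling step can be sketched with `np.interp` (an illustration of the mechanism, with toy quantile predictions, not the actual Darts code):

```python
import numpy as np

rng = np.random.default_rng(0)

# Quantile levels we trained separate models for, and each model's
# prediction for a single timestep (toy values).
quantiles = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
predictions = np.array([2.0, 4.0, 5.0, 6.0, 8.0])

def sample_from_quantiles(n_samples):
    """Draw u ~ U(0, 1) and linearly interpolate between the nearest
    known quantiles; u outside [0.1, 0.9] clamps to the edge values."""
    u = rng.uniform(0.0, 1.0, size=n_samples)
    return np.interp(u, quantiles, predictions)

samples = sample_from_quantiles(300)
```

Every sample therefore lies between the lowest and highest fitted quantile predictions, and the sample median sits near the 0.5-quantile model's output.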
### Terminology
* `components` are the individual dimensions (columns) of a `TimeSeries`, i.e. roughly the features
### What other feature engineering could we do?
* diff method
* ratio method
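Those two ideas in pandas terms (a sketch of the transforms themselves, not anything Darts provides out of the box):

```python
import pandas as pd

s = pd.Series([100.0, 110.0, 99.0, 108.9])

diff_feature = s.diff()         # absolute change vs. the previous step
ratio_feature = s / s.shift(1)  # multiplicative change vs. the previous step
# diff_feature[1] is 10.0; ratio_feature[1] is 1.1 (first entries are NaN)
```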
# Notes
* sub-estimators should maintain a universal interface, as in sktime
	* could subclass
	* could have an adapter/shim to get things into Darts' format
* Train and inference functions?
* Who is calling distributed estimator?
---
Date: 20230519
Links to:
Tags:
References:
* [G. Grosch, F. Lässig - Darts: Unifying time series forecasting models from ARIMA to Deep Learning - YouTube](https://www.youtube.com/watch?v=thg10qDqpRE)
[^1]: [Covariates — darts documentation](https://unit8co.github.io/darts/userguide/covariates.html)