Can't train PatchTST on different (than X) dimensions for target #713

Open
strakehyr opened this issue Mar 23, 2023 · 3 comments
Labels
enhancement New feature or request ideas New ideas to enhance tsai

Comments

@strakehyr

Hi @oguiza and thanks again for implementing a SOTA model.

I seem to have run into a limitation in PatchTST: I can't train it when y has a different number of variables than X. With TSTPlus, by contrast, we can have several X covariates and a different number of y series (with different dimensions). I might just be using the wrong approach, as your PatchTST example uses a sliding window over the same variables for X and y.

The TSTPlus case:

X.shape, y.shape
arch_config = dict(
    n_layers=3,  
    ks = 4,
    n_heads=4,  
    d_model=16,  
    d_ff=128,  
    dropout=0.3,
)
learn = TSForecaster(X, y, splits=splits, batch_size=16, path="models", pipelines=[exp_pipe],
                     arch="TSTPlus", arch_config=arch_config, metrics=[mse, mae], cbs=ShowGraph())


n_epochs = 100
lr_max = 0.0025
lr_max = learn.lr_find().valley
((7416, 168, 4), (7416, 24, 3))
Epoch 1/1 : |█████---------------| 28.57% [92/322 00:01<00:03 1.1506]

This doesn't seem to be the case for the PatchTST:

arch_config = dict(
    n_layers=3,
    n_heads=4,
    d_model=16,
    d_ff=128,
    attn_dropout=0.0,
    dropout=0.3,
    patch_len=1,
)
learn = TSForecaster(X, y, splits=splits, batch_size=16, path="models", pipelines=[exp_pipe],
                     arch="PatchTST", arch_config=arch_config, metrics=[mse, mae], cbs=ShowGraph())
n_epochs = 100
lr_max = 0.0025
lr_max = learn.lr_find().valley
RuntimeError: The size of tensor a (8064) must match the size of tensor b (1152) at non-singleton dimension 0            
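
For reference, the two sizes in the error can be reproduced from the shapes above. This is only a sketch of the arithmetic under two assumptions that the traceback itself does not confirm: that the arrays follow tsai's [samples, variables, steps] layout, and that PatchTST returns one forecast series per input variable over the target horizon.

```python
from math import prod

batch_size = 16
x_shape = (7416, 168, 4)  # reported X.shape
y_shape = (7416, 24, 3)   # reported y.shape

# Assumption: axis 1 is the variable axis and axis 2 is the time axis.
n_vars_in, n_vars_out, horizon = x_shape[1], y_shape[1], y_shape[2]

# Assumption: the model forecasts every input variable over the target horizon.
pred_batch_shape = (batch_size, n_vars_in, horizon)
target_batch_shape = (batch_size, n_vars_out, horizon)

pred_numel = prod(pred_batch_shape)
target_numel = prod(target_batch_shape)
print(pred_numel, target_numel)  # 8064 1152
```

Under those assumptions the flattened prediction has 8064 elements and the flattened target has 1152, which matches the sizes of tensor a and tensor b in the message.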
@oguiza
Contributor

oguiza commented Mar 23, 2023

Thanks for raising this @strakehyr.
You are absolutely right. This is a limitation of the current PatchTST model.
I'd like to develop a PatchTSTPlus model that can be used in scenarios like the one you describe above.
This is the process I've normally followed in the library: the standard version replicates the model published in the paper/code as closely as possible, and a Plus version then adds functionality. Forecasting any number of variables is one of those scenarios. Classification and regression are others.
I'd like to work on this soon, but I need to find the time or resources. Would you be interested in creating a PR? If you are, I could give you some direction on what needs to be done.
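
To make the requirement concrete, here is a minimal sketch of one possible way to go from a forecast with one series per input variable to an arbitrary number of target variables. It is not how PatchTST or a future PatchTSTPlus is implemented. The function name `mix_variables` and the random weights are hypothetical, and in a real model the projection would be learned.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, c_in, c_out, horizon = 16, 168, 24, 3

# Stand-in for a channel-independent forecast: one series per input variable.
preds = rng.standard_normal((batch_size, c_in, horizon))

# Hypothetical projection weights of shape (c_out, c_in); learned in a real model.
w = rng.standard_normal((c_out, c_in)) / np.sqrt(c_in)


def mix_variables(preds, w):
    """Map [batch, c_in, horizon] forecasts to [batch, c_out, horizon]."""
    # Each output variable is a weighted sum of the input-variable forecasts,
    # computed independently at every horizon step.
    return np.einsum("oc,bch->boh", w, preds)


out = mix_variables(preds, w)
print(out.shape)  # (16, 24, 3)
```

With an output of shape (16, 24, 3), the prediction and a target batch of shape (16, 24, 3) have the same number of elements, so an element-wise loss can be computed.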

@oguiza oguiza added enhancement New feature or request ideas New ideas to enhance tsai labels Mar 23, 2023
@vrodriguezf
Contributor

I am interested in this too. I might have some time to work on it in about two weeks.

@strakehyr
Author

I would be happy to help, but I don't currently have the time.
